Linux - BrixIT Blog

Using a Blackmagic Design camera as wildlife camera

Martijn Braam — Fri, 17 Oct 2025 16:08:51 -0000

The sane solution to capturing wildlife going through your garden is to use a dedicated wildlife camera. In my experience these things are pretty much always trash. The devices I've seen are all basically cheap dash-cam boards that happen to have motion detection on them and produce the image quality of a 2001 webcam.

It's also a bit wasteful to get a new device to capture video when I already have several that produce way better quality; they just lack the features for it. For this specifically I want to use the Blackmagic Design Pocket Cinema Camera 4k, or bmpcc4k for short because that's a ridiculous name.

With recent firmware updates this camera has gained a few interesting features for doing this: the first is a USB webcam mode and the second is the REST API for controlling the camera.

To add motion detection to this camera I connect it to a laptop with a USB cable and run motion on the laptop, a tool that is normally used to turn webcams and network cameras into security cameras. The trick to making this work is disabling all the recording methods in the software and using the shell hooks to send API commands to the camera when an event is detected.

The config file I'm using with motion is:

daemon off
setup_mode off
log_level 6

# Find video device with lsplug -rd
video_device /dev/video2

# The bmpcc4k always outputs 1080p on the webcam
width 1920
height 1080
framerate 30

# Show the motion view in the web ui
stream_preview_method 3

# Detection threshold for the motion detection
threshold 1500
minimum_motion_frames 1
despeckle_filter EedDl

# The event ends after 4 seconds of no motion
event_gap 4

# Don't store recordings, the camera does that
picture_output off
movie_output off

# Send commands to the API to start and stop recording
on_event_start curl --request POST http://172.26.209.29/control/api/v1/transports/0/record
on_event_end curl --request POST http://172.26.209.29/control/api/v1/transports/0/stop

# Enable the webinterface for monitoring
webcontrol_port 8080
webcontrol_localhost off
stream_port 8081
stream_localhost off
stream_maxrate 15

The IP address used to communicate with the camera here is the one for the USB network interface created by the camera. In my case the IP address is the same after re-plugging it but it might be different for your camera. The DHCP server in the camera gives your computer a /30 address (in my case my laptop got 172.26.209.30/30) and since this subnet size only allows for two hosts, the only other one is the camera. In practice just subtract 1 from the IP address your computer received from the camera.
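The address arithmetic can also be done mechanically. This is a minimal sketch using Python's standard ipaddress module; the function names camera_address and transport_url are made up for this example, and it only assumes what is described above: the laptop gets a /30 lease and the camera is the other host in it. Picking "the other host" instead of subtracting 1 also covers the case where the camera ends up on the higher address, which I have not seen happen but can't rule out.

```python
import ipaddress


def camera_address(laptop_cidr: str) -> str:
    """Return the other usable host in the /30 handed out by the camera."""
    iface = ipaddress.ip_interface(laptop_cidr)
    if iface.network.prefixlen != 30:
        raise ValueError("expected a /30 lease from the camera")
    # hosts() skips the network and broadcast addresses, leaving two hosts.
    peers = [host for host in iface.network.hosts() if host != iface.ip]
    return str(peers[0])


def transport_url(camera_ip: str, action: str) -> str:
    """Build the record/stop URL in the same shape as the motion config."""
    if action not in ("record", "stop"):
        raise ValueError("action must be 'record' or 'stop'")
    return f"http://{camera_ip}/control/api/v1/transports/0/{action}"


if __name__ == "__main__":
    ip = camera_address("172.26.209.30/30")
    print(transport_url(ip, "record"))
```

The printed URL can be pasted into the on_event_start line of the motion config.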

So far I haven't been able to capture the squirrel, but at least the setup is working as it did capture some birds.

Don't pick weird subnets for embedded networks, use VRFs

Martijn Braam — Thu, 21 Aug 2025 10:55:29 -0000

So what is an embedded network? I found it pretty hard to find a good name for this but I've come across them many times and created some myself. A good example here is a portable video rack. You drag a rack case around with video and network equipment and you need to connect it to the network of the venue to stream to the internet. The devices in the rack need to communicate with each other but you don't want to reconfigure their addresses every time you move to another venue because it happens to use another subnet.

The solution to this is easy! Just add a small router in the rack so you have a consistent subnet inside the rack and the NAT isolates you from the changing IP addresses outside your small network. Your rack has 10.0.0.0/24 addresses because they are easy to remember and the router gets an IP from the venue using DHCP.

This works perfectly fine until the public network in the venue also is on a 10.0.0.0/24 network. The router suddenly has the same subnet on both interfaces and addresses in your rack start conflicting with other hardware in the venue.
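To make the failure mode concrete, here is a small sketch that checks whether the address a venue hands out collides with the rack subnet. It uses Python's standard ipaddress module; the conflicts helper and the example venue addresses are made up for illustration.

```python
import ipaddress

# The subnet used inside the rack, as in the example above.
RACK = ipaddress.ip_network("10.0.0.0/24")


def conflicts(venue_cidr: str, rack: ipaddress.IPv4Network = RACK) -> bool:
    """True if the network behind the venue lease overlaps the rack subnet."""
    venue = ipaddress.ip_interface(venue_cidr).network
    return venue.overlaps(rack)


if __name__ == "__main__":
    for lease in ("192.168.1.50/24", "10.0.0.33/24", "10.20.30.40/8"):
        print(lease, "conflicts" if conflicts(lease) else "ok")
```

Note that a venue using a large 10.0.0.0/8 network conflicts just as much as one using the exact same /24.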

This is the point where I often see people picking weird subnets for portable equipment. "What are the chances the venue has 172.16.42.0/24, or 10.11.12.0/24?" And sure, this works, until you get a conflict on those because humans are simply not that great at picking random numbers. It is not actually necessary to rely on random chance for network separation; you just need router features beyond those of common consumer routers.

The IPv6 solution

The most official solution for this is IPv6 of course. If your private network has IPv6 internally you can just address every device by its link local address. Due to having a router splitting the network segment between your rack and the public network you know that all the link local addresses are always your own devices.

You don't even need to have a DHCP server in your local net anymore. Every IPv6 device just gets its own fe80:: address and you'd use network discovery protocols to find the address of the device to talk to. For example avahi can be used to have name resolution for these addresses.

In this case the router should send out router advertisements so the devices still get internet connectivity without configuring any addresses. This basically causes the devices in the rack to set the link local address of the router as the gateway address.

The massive problem with this solution is that support for IPv6 on devices that are not general-purpose computers is usually non-existent. I have a portable rack right here with a Behringer X-Air audio mixer in it and this is just modern enough to allow you to use DHCP to get an address, no IPv6 support in here whatsoever. If I grab a random video mixer here it will most likely also not support IPv6, some of them don't even do DHCP yet.

Technically the same solution is also possible on an IPv4 network. You've probably encountered it before: getting a 169.254.0.0/16 address on your computer when you selected DHCP but the network did not have a DHCP server running. This is an APIPA address and is the IPv4 equivalent of an fe80:: IPv6 address. Sadly this doesn't work that well for this purpose, since with APIPA networking there is no way to have a gateway so you'll never have internet connectivity. Also, if your machine does get a valid IP address so you can connect to the internet, the APIPA address will normally be dropped on that interface since it's no longer needed. And just like IPv6 support in embedded hardware, it's most likely not implemented.

A neat trick if you are dealing with an IPv6 network though for finding the addresses of the hosts in your segment:

$ ping -c 2 -I enp3s0 ff02::1
... responses from multiple devices here ...
$ ip -6 neigh show dev enp3s0
fe80::cc55:13ff:fe9b:82a2 lladdr ce:55:13:9b:82:a2 REACHABLE 
fe80::df8c:9c93:6850:7132 lladdr b0:35:9f:5a:b2:49 REACHABLE 
fe80::ac2b:55ff:fe62:a34 lladdr ae:2b:55:63:0a:34 REACHABLE 
fe80::a0b8:78ff:fe5c:8229 lladdr a2:b8:78:5a:82:29 REACHABLE 
fe80::5660:9ff:fee1:cb4b lladdr 54:60:09:e1:cb:4b REACHABLE 
fe80::f61e:57ff:fe61:9a58 lladdr f4:1e:57:91:9a:58 router REACHABLE

This sends a multicast ping to all nodes on your local segment (IPv6 has no broadcast) and then you can read out the list of neighbors known on that interface, which also conveniently tells you which device is advertising routes. The ff02::1 address is one of the "notable IPv6 multicast addresses" for which there's a convenient list on Wikipedia.
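Most of the fe80:: addresses in that listing can be predicted from the MAC address next to them, because many devices derive the link local address with the modified EUI-64 scheme: flip the universal/local bit of the first octet and insert ff:fe in the middle. A minimal sketch of that derivation follows; the function name mac_to_link_local is made up for this example. Not every device does this, and the b0:35:9f:5a:b2:49 entry above clearly uses some other method, so treat the result as a guess to try rather than a guarantee.

```python
import ipaddress


def mac_to_link_local(mac: str) -> str:
    """Derive the modified EUI-64 link local address for a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    eui64 = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80::/64
    return str(ipaddress.IPv6Address(prefix + eui64))


if __name__ == "__main__":
    print(mac_to_link_local("ce:55:13:9b:82:a2"))
```

For the first neighbor in the listing this gives fe80::cc55:13ff:fe9b:82a2, matching what the kernel reported.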

A more generic solution

So instead of having more exotic addressing inside your embedded network there's also a solution that contains all the weirdness to only the router. It is possible to configure a router so that both networks it connects to have the same subnet by having separate routing tables for the interfaces.

This means your internal network can be 10.0.0.0/24 and the venue network can be 10.0.0.0/24 and it all just works. The video mixer in the rack can have the 10.0.0.4 address and there can be a 10.0.0.4 address in the venue network and nothing will conflict. This comes with a tradeoff of course and in this case is that you no longer can reach devices on the venue network, which shouldn't be a problem if you're only connected there for internet connectivity.

The magic solution here is VRFs. Normally the router has a single routing table that defines where traffic is supposed to go based on the destination address. The routing table is the main issue in this situation because having the same subnets on the inside and outside would mean you just have two routes in the table saying 10.0.0.0/24 should go to both the LAN interface and the WAN interface:

$ ip route
default via 10.0.0.1 dev enp1s0 proto dhcp src 10.0.0.33 metric 100
10.0.0.0/24 dev enp1s0 proto kernel scope link src 10.0.0.33 metric 100
10.0.0.0/24 dev enp2s0 proto kernel scope link src 10.0.0.254 metric 100

This is basically telling the router that the same 10.0.0.0/24 network is reachable over both interfaces. The only way to tell the kernel that these are actually separate is by having a VRF here.

With a VRF you can have a completely isolated routing table which you can link to your network interfaces. Traffic from those interfaces is now matched against that table instead of the main one for the routing decisions.
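As a toy model of that idea, the sketch below keeps one routing table per VRF and picks the table based on the interface a packet arrived on. It is only an illustration of the lookup logic and not how any real router implements it; the interface names ether1 and bridge-internal are borrowed from the test setup described further down, and the table contents are simplified assumptions.

```python
import ipaddress

# One routing table per VRF, each a list of (prefix, outgoing interface).
TABLES = {
    "main": [("0.0.0.0/0", "ether1"), ("10.0.0.0/24", "ether1")],
    "rack": [("10.0.0.0/24", "bridge-internal")],
}

# Which VRF each ingress interface is linked to.
INTERFACE_VRF = {"ether1": "main", "bridge-internal": "rack"}


def lookup(in_interface: str, destination: str) -> str:
    """Return the outgoing interface using the table of the ingress VRF."""
    table = TABLES[INTERFACE_VRF[in_interface]]
    dst = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), out)
        for prefix, out in table
        if dst in ipaddress.ip_network(prefix)
    ]
    if not matches:
        raise LookupError(f"no route to {destination} in this VRF")
    # The longest matching prefix wins, as in a normal routing table.
    return max(matches, key=lambda match: match[0].prefixlen)[1]


if __name__ == "__main__":
    print(lookup("bridge-internal", "10.0.0.4"))
    print(lookup("ether1", "10.0.0.4"))
```

The same destination 10.0.0.4 resolves to a different interface depending on where the packet came in. The rack table in this model has no default route yet, so internet traffic from the rack has nowhere to go until one is added.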

For my test setup I've used a Mikrotik device since it allows me to configure VRFs as I didn't have a Linux box with enough interfaces on it to make an easy test setup. But this also shows that this setup isn't so esoteric that you need to have a full PC doing the routing for it. I'm using a Mikrotik hAP mini which is a $15 3-port router which is probably the cheapest device available that just has this feature available. For testing I have the WAN port (ether1) connected to my home network and the other two ports are connected to laptops which are supposed to communicate together but still be isolated from my home network.

The basic config is already set on this device making it do NAT between the two LAN ports and the WAN port so the only thing I need to do is break it by setting the internal subnet to the same one as my home network and then configuring the VRF related things to make it actually route properly.

# bridge-internal has the two ports for the internal devices
/interface bridge
add name=bridge-internal
/interface bridge port
add bridge=bridge-internal interface=ether2
add bridge=bridge-internal interface=ether3

# Give the router an address on the internal network
/ip address
add address=10.0.0.1/24 interface=bridge-internal

# Get an address for the WAN interface by DHCP
/ip dhcp-client
add interface=ether1

# Create the rack vrf for the bridge
/ip vrf
add interfaces=bridge-internal name=rack

# In the rack VRF tell it the default gateway is on the main VRF at the
# 10.0.0.254 address. The gateway IP _can_ be the same as the 10.0.0.1 one
# used for this router but is kept separate here to make the config easier
# to understand
/ip route
add gateway=10.0.0.254@main routing-table=rack

# The iptables rule for regular 'ol NAT for all traffic going out the WAN port
/ip firewall nat
add action=masquerade chain=srcnat out-interface=ether1

# Use connection tracking to mark traffic from the internal VRF so the return
# traffic can be delivered in the same VRF. The first rule gives the
# connection the "rack" mark in the connection tracking table. The second
# rule matches traffic from the WAN port that match a connection that has the
# "rack" mark and then set the routing table to use to "rack" which is
# the VRF. This is in the prerouting chain so these rules are executed
# before anything with the routing tables are done.
/ip firewall mangle
add action=mark-connection chain=prerouting connection-state=new in-interface=bridge-internal new-connection-mark=rack
add action=mark-routing chain=prerouting connection-mark=rack in-interface=ether1 new-routing-mark=rack

And that's all that's needed. Traffic that's handled by the rack VRF gets delivered on the interfaces linked to that VRF. The traffic for the main routing table goes out the WAN port.

To make traffic destined for the internet cross the barrier between routing tables a route is added in the rack VRF that sends traffic destined to the internet to the default gateway address in the main routing table. This runs it through the firewall rules and applies the standard NAT rule that rewrites the source address to the one of the router. There is also a connection mark added in the connection tracker that tells iptables that this connection originated from the rack VRF.

When a packet returns from the internet the iptables connection tracker will match it up to the connection that has the connection-mark set on it. This will trigger the other mangle rule that will make sure that this packet is processed by the rack VRF so it gets sent back to the device in the rack that started the connection.
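The interplay between the two mangle rules and the NAT rule can be modelled in a few lines. This is a heavily simplified sketch and not how the connection tracker is implemented: the ConnTracker class and its method names are made up, masquerading is assumed to keep the source port unchanged, and 192.0.2.10 stands in for some server on the internet.

```python
from typing import Dict, Optional, Tuple

Flow = Tuple[str, int, str, int]  # (src ip, src port, dst ip, dst port)


class ConnTracker:
    """Toy model of connection marks plus masquerading."""

    def __init__(self) -> None:
        self.marks: Dict[Flow, str] = {}     # original flow -> connection mark
        self.replies: Dict[Flow, Flow] = {}  # expected reply -> original flow

    def outbound(self, flow: Flow, in_interface: str, wan_ip: str) -> Flow:
        """New connection: mark it if it came from the rack, then NAT it."""
        if in_interface == "bridge-internal":
            self.marks[flow] = "rack"  # first mangle rule
        src_ip, src_port, dst_ip, dst_port = flow
        natted = (wan_ip, src_port, dst_ip, dst_port)  # srcnat masquerade
        # Remember what the reply will look like when it arrives on the WAN.
        self.replies[(dst_ip, dst_port, wan_ip, src_port)] = flow
        return natted

    def inbound(self, flow: Flow, in_interface: str) -> Optional[str]:
        """Return packet: pick the routing table, or None if untracked."""
        original = self.replies.get(flow)
        if original is None:
            return None
        if in_interface == "ether1" and self.marks.get(original) == "rack":
            return "rack"  # second mangle rule
        return "main"


if __name__ == "__main__":
    tracker = ConnTracker()
    tracker.outbound(("10.0.0.4", 40000, "192.0.2.10", 443),
                     "bridge-internal", "10.0.0.33")
    print(tracker.inbound(("192.0.2.10", 443, "10.0.0.33", 40000), "ether1"))
```

The important part is that the decision for the return packet is based on the tracked connection and not on the destination address, which is what makes it work even though both sides use 10.0.0.0/24.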

This all together allows the devices in the rack to have internet connectivity just like it's a regular NATted connection. It is impossible to reach other devices in my home network through it though because the routing table for the rack will just send traffic destined for 10.0.0.0/24 back out the same interface instead of the WAN port. The same thing happens when a device on my home network tries to use the router as a gateway to reach a device in the rack.

These few rules configured on the router will allow you to use your favorite subnet inside the rack without any worry about conflicts. I've seen the regular NAT setups in this situation for portable live production gear, for industrial monitoring gear and have once set this up for an ancient CNC machine that did not allow changing the subnet address.

With a few translations this should also apply to the VRF commands in Linux routers and I assume VRFs are also exposed on other major router brands.

Megapixels 2.0: Small fixes and GTK breakage

Martijn Braam — Fri, 28 Mar 2025 20:52:26 -0000

Looking back it seems like making an alpha release of Megapixels 2.0 was a great choice. The various components that make up Megapixels have been through the packaging steps at Mobian for example, which brought to light some issues that needed fixing to improve the packaging. A lot more eyes have hit the code in this release than the random stuff thrown to git and I'm really happy I've received a lot of improvements from a bunch of developers.

When I started working on Megapixels it was my first C codebase, and when I started working on libmegapixels/libdng those were the first times figuring out how to do C libraries. In the years I've been working on this codebase I've learned a whole lot, mostly around cleaner code, and seeing merge requests that fix up minor issues in the code is a great reference for figuring out what the idiomatic way is of writing specific pieces of code.

Of course quite a few of these are minor memory safety violations and there have been a few un-free()'d resources around within Megapixels, but in the end running free on a few bytes before quitting the app or letting Linux release that memory doesn't make that much of a difference. Most of the effort has been going into making sure the main image processing loop doesn't leak memory anywhere. If that part leaks memory then it will start adding up really fast :)

GTK throwing a wrench into the development process

Of course Linux development can't ever run smoothly... there's always something new and exciting to break everything.

In the case of Megapixels it has been the NGL backend for GTK4. The 4.17 release made in February dropped the GL backend in favor of the NGL and Vulkan renderers. Which is great if you're on the latest and greatest Macbooks.

The issue is that GTK now also dropped support for GLES 2.0, which means that a lot of older devices are no longer GPU accelerated in GTK4. For Megapixels it's an even bigger issue since the debayering depends on GPU acceleration so it won't run at all if GTK4 doesn't have an OpenGL context anymore.

This hardware doesn't even have to be terribly old. For example here's some of the hardware supported by Megapixels 2.0:

Device	GPU	OpenGL	OpenGL ES
PinePhone	Mali 400	OpenGL 2.0	OpenGL ES 2.1
Samsung Galaxy SIII	Mali 400	OpenGL 2.0	OpenGL ES 2.1
PinePhone Pro	Mali T860	OpenGL 3.1	OpenGL ES 3.1
Librem 5	Vivante GC7000Lite	OpenGL 2.0	OpenGL ES 2.x

What they all have in common is that they don't really support the latest and greatest OpenGL versions. It's not very easy to get any hard docs on what OpenGL requirements GTK has now, but it seems like it's at least OpenGL 3.3 and there are still references to OpenGL ES 3.0 in the codebase. Which means that for the devices I've been targeting for Megapixels the support simply isn't there anymore.

So far there's been a workaround for this by putting Megapixels in a flatpak with a runtime that doesn't have the latest version of GTK4 in it. This is obviously not a long term solution but at least there's some workaround for now. Many thanks to Andrey Skvortsov for creating a flatpak package for Megapixels.

There's a build available now on my flatpak repository. You should be able to get your favorite graphical package manager frontend to install it with this link: https://flatpak.brixit.nl/megapixels2.flatpakref

The future

I'm not entirely sure what a good solution for this mess is. My current feeling is that it's best to not rely on GTK4 anymore because even if somehow a workaround is figured out to make this work, there's always the next GTK issue coming up.

Switching to another framework also isn't great, especially since Megapixels has just been through rewrite hell already; it'd be 2026 before we'd have a working camera app again. In theory it should be slightly easier now to make megapixels-qt with libmegapixels, but that only abstracts the device usage. The main magic of Megapixels is the threading mess and OpenGL debayer code that gives it the realtime performance for photography and that would have to be recreated on another platform.

Maybe someone has a great idea or a solution for this, I'd love to hear it.

BodgeOS pt.4: A working browser

Martijn Braam — Sat, 25 Jan 2025 11:35:26 -0000

After the previous post about bringing up Sway I spent a bit of time packaging random system components you'd expect for a desktop system. Mainly building GTK so I can have some actual applications running. This is mainly pretty relaxing work. Find a component to build, figure out the dependencies, package them one by one by figuring out the required commands and making a tiny build script.

Occasionally the zen of packaging gets disturbed by having to figure out why some build system does something weird, like really wanting to put files in /usr/lib64 instead of /usr/lib while in the previous build it did not do this.

It turns out this is because some build systems perform auto detection of the host system to decide things like "what folder is a good place for the libraries". In this case it was cmake that has autoconfig behavior for this if you don't explicitly define that libs go in the lib folder. At some point installing a broken rebuild of glibc on the host system had changed lib64 back from a symlink to a regular folder again and all builds after that were broken.
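A quick way to spot this kind of breakage before it poisons a batch of builds is to check what the lib64 paths on the build host actually are. This is a small sketch for that; the describe_libdir helper is made up for this example and the two paths it checks are just the usual candidates on an x86_64 system. For CMake projects that use GNUInstallDirs the explicit define is normally done by passing -DCMAKE_INSTALL_LIBDIR=lib, which sidesteps the detection entirely.

```python
import os


def describe_libdir(path: str) -> str:
    """Report whether a path is a symlink, a real directory or missing."""
    if os.path.islink(path):
        return f"symlink -> {os.readlink(path)}"
    if os.path.isdir(path):
        return "directory"
    return "missing"


if __name__ == "__main__":
    for candidate in ("/usr/lib64", "/lib64"):
        print(candidate, describe_libdir(candidate))
```

On a host where lib64 is expected to be a symlink, any line reporting "directory" is a hint that something replaced the link.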

The path to Firefox is paved with many dependencies

For my goal of running Firefox I have to deal with several dependencies:

GTK 3 and the components to make that work like Pango and Cairo.
A whole assortment of audio, video and image codecs, these can probably be skipped by manipulating the Firefox buildsystem, but it is probably easier to package all of these instead.
Nodejs is also a dependency for building Firefox since the javascript engine in Firefox itself can't be used to run the javascript components in the build system.
WASI for having WebAssembly support.
Rust for large chunks of the Firefox codebase.
Extra libraries like the Netscape Portable Runtime (nspr) and the libraries for other system features like Alsa and Pulse.

Luckily parts of this have already been packaged as part of getting my Sway desktop working. Many dependencies for GTK 3, WASI and Rust are already packaged for either Mesa or Swaybar, but I had to finally figure out how to bring up Rust in my distro.

This is quite hard in theory because rustc is written in Rust so I need older versions of Rust to build the current version of Rust, and older versions for that again and again until you reach the point where rustc is written in C again. Instead of doing all that I opted to go for the easy road. I curl | sh'ed a functional Rust compiler into my host system with the instructions for rustup.rs and then used that compiler to build my packaged Rust. This was all surprisingly smooth and easy.

Getting NodeJS functional was a bit more painful. This is mainly because it takes forever to build which means it takes forever to fix things while packaging. The same issue with the Firefox build itself actually. The worst thing I've encountered while packaging things is things that take a lot of time to build and don't have the configure script check all the dependencies it needs. I might have been annoyed with autoconf wasting a lot of time at the start of every build of every small package with checks that seem to be completely useless, but the time wasted there does not compare at all to the time wasted in building Firefox and having it crash an hour into the build complaining it's missing a dependency.

Desktop audio

One of the dependencies for Firefox is also the various audio libraries, which in turn means I have to bring up audio on my distro. I spent a moment to figure out whether it would be a PulseAudio or PipeWire system but ended up just picking PipeWire. Most of the issues I've seen with PipeWire seem to stem from bodged migrations from Pulse systems anyway and that is not an issue I would have to deal with. Unfortunately it turns out that for my nice, clean and modern PipeWire system I would still have to build PulseAudio first. While PipeWire provides the compatibility layer with applications that use PulseAudio as audio backend (like Firefox) I still need to provide those applications with libpulse first at build time.

My package of PulseAudio only provides the library and not the daemon so I have as few conflicting sound systems as possible in the distribution. Of course Alsa is also packaged but that is required anyway by PipeWire to access the hardware. As for Jack, I have simply decided to leave the Jack parts of PipeWire out.

Building Firefox

So once I had all the dependencies out of the way I started the painful process of building Firefox. This part contained a lot of failed builds and many many many hours of wasted build time due to the build failing at the finish line.

I spent a lot of time trying to figure out why I had syntax errors in the Rust files for the CSS engine. For some reason the CSS engine for Firefox (The component is called "Style" so it's absolutely impossible to find anything for it in a search engine) has massive .rs files generated by Jinja2 templates and somewhere it's not producing syntactically valid Rust files anymore. I spent weeks swapping out various dependency versions of various parts to figure out why these files are invalid until I gave up.

Then a week later I decided to try again but this time Firefox 134.0 was released so I changed the package to use the newer release, this instantly fixed all the Rust related build issues... Of course I still had a few more build issues to figure out but in the end I was presented with this nice message from the Firefox build system:

This is probably referring to the 68 minute build time, but I'd like to think the Firefox build system is self aware and knows of my days of debugging.

Increasing the difficulty and building for ARM

So at this point I felt like things were going too easy, so one day I decided it would be funny to rebuild BodgeOS for ARMv8. Due to the way BodgeOS is built this means building the distribution from LFS again since I have no support for cross-building in this system, simply because that's a lot of complication to maintain in the build system for something that is only done very rarely.

To build the ARM version of BodgeOS I wanted to build it natively on an ARM system and it turns out that for some reason I have a lot of random ARM systems around :D. I decided to grab the LX2K board I have since it's probably the fastest ARM64 system I have here (I haven't really checked the benchmarks how it compares with the RK3588 systems) but it most certainly is the most sane one of all of them. This is the only ARM SystemReady certified hardware I have, I don't think I can explain it any better than the marketing blurb on the manufacturer's website:

The SystemReady ES description from solid-run.com

This means I could just unpack the generic Alpine Linux ARM64 rootfs on a random 2.5" SSD I had around, plug it in and boot it. Even easier is that I already had done that before and the SSD I used last time was still in there so I continued on directly with building from the Alpine 3.15 system I had on there.

The LX2K board in a 2U rack case powered by a picoPSU

From here on out the steps are incredibly similar to bringing up the x86_64 version of BodgeOS. I used the automated JHALFS system to get my clean LFS system to start the whole system from, surprisingly building this on ARM worked practically perfectly the first time while nothing in JHALFS suggests it even is tested on ARM systems.

From there on I copied my temporary helper scripts from my LFS installation on my Thinkpad that I used to build the first iteration of x86_64 BodgeOS and hacked together abuild inside LFS to run the build scripts. The rest was actually a lot easier since I didn't have to figure out any of the packaging again, 99% of the package build scripts simply worked on ARM64 or were simply missing some build dependency declarations.

What also surprised me is that this ARM board actually outperformed my relatively modern Thinkpad in build speed. The Thinkpad I've been using for this is an X280 with the i5-8250U CPU in it. What also helps for a few builds is that the LX2K board currently has twice the amount of RAM in it that the Thinkpad has (16GB vs 8GB) which means I didn't have to pass in -j4 for a few builds that were very memory heavy. Running with only 4 threads is quite painful on this board anyway because its speed comes from having 16 cores in it, and the cooling to actually run that at full speed.

The initial LFS build on the LX2K board

Continuing on

I have gotten to a minimal working installation on the LX2K for the ARM64 build and I've been packaging random tools I need on the x86_64 version now. To continue I should probably make an actual build system that does automated builds from the git repositories on both platforms instead of me just manually triggering abuild for every package and rsync'ing the file over to the mirror.

One of my goals other than Firefox was building Kicad but this is more painful than getting Firefox running due to the list of dependencies it requires. The main one being that I'm actually going to have to build Xorg to build Kicad.

BodgeOS pt.3: Graphical desktop

Martijn Braam — Sun, 29 Dec 2024 20:11:18 -0000

In the previous post I figured out all the internal weirdness of Linux booting to get BodgeOS running on actual hardware. The next goal was very clear: getting to a graphical environment. At the start of this month I had the goal set to running a web browser before 2024 ends but I've now slightly adjusted my goals down to being able to type this blog post in a terminal on my new OS.

So what does it take to get graphics in Linux? Well, the first component is very clear from experience: Mesa. This is the component that provides all the userspace components for the graphics drivers. I started with checking out both the LFS mesa build instructions and the Alpine and ArchLinux mesa packages. This is not a very nice package to build due to the large dependencies it has. This one project contains all the graphical hardware related code for any GPU Linux will run on and due to that it depends on several programming languages and compilers.

I have tried stripping down this package as much as possible: no X11 support, only intel graphics, only EGL for 3D acceleration, no extra components, no software rasterisation. This makes Mesa relatively easy to build. With only the i915 gallium driver for intel graphics I don't have to bring up any of the Vulkan, rust or llvm dependencies to get basic graphics.

The desktop

So I had to pick a graphical environment as goal to run. There are many choices for this and even more opinions on what the best one is. I picked Sway here since it's a Wayland based environment so I don't have to go figure out all the X11 stuff. It's also very simple to build compared to something like Gnome or KDE Plasma. I guess there will be someone that has figured out that some random ancient window manager can be built with even fewer dependencies but this is the smallest one from the desktops I've actually used before :)

The dependency chain for Sway is pretty simple: wlroots abstracts away all the Wayland stuffs and makes it actually communicate with Mesa. Then it has some extra dependencies to render text and simple graphics to draw the bars and window decorations.

So I started figuring out the minimum dependencies for every component in the dependencies of wlroots and Sway and packaging all the things that are needed. This included fixing up the Python udev bindings from my systemd package, packaging the Wayland protocols, a bunch of Xorg keyboard stuff because it seems like keymaps are still used from x* packages and finally seatd to provide a way to get a session for the desktop.

Sway

For Sway the dependencies got a lot more annoying to compile since it depends on Pango and Cairo for rendering and those build systems were just a massive pain to deal with. It seems like the higher you get in the stack of a Linux system the more bullshit is added to build systems to make things "easier". My particular pain point in the Sway dependencies is glib and gobject-introspection, which is not sufficiently documented and seems to work on magic.

Font rendering

Along with Pango I also had to bring up the whole font system in Linux. This involves Pango, Cairo, HarfBuzz, Glib and several obscure libraries for font processing. These packages are fun because they contain circular dependencies so I had to build them a few extra times.
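The "build them a few extra times" dance can be sketched with a dependency graph. The example below uses graphlib from the Python standard library (3.9 and newer) to detect the cycle and then breaks it by building one package without an optional dependency first and rebuilding it at the end. The graph is illustrative and not my actual package list: it uses the well known FreeType and HarfBuzz loop as the cycle, and the other edges are simplified assumptions.

```python
from graphlib import CycleError, TopologicalSorter

# package -> set of packages it needs at build time (illustrative only)
DEPS = {
    "glib": set(),
    "freetype": {"harfbuzz"},          # optional dependency in practice
    "harfbuzz": {"freetype", "glib"},
    "cairo": {"freetype", "glib"},
    "pango": {"cairo", "harfbuzz", "glib"},
}


def build_order(deps):
    """Return a valid build order, raising CycleError on circular deps."""
    return list(TopologicalSorter(deps).static_order())


def build_order_with_bootstrap(deps, bootstrap):
    """Break cycles by dropping optional deps for a first pass.

    bootstrap maps a package to the dependencies it can be built without.
    Those packages get rebuilt at the end with everything available.
    """
    first_pass = {
        pkg: set(needs) - bootstrap.get(pkg, set())
        for pkg, needs in deps.items()
    }
    order = build_order(first_pass)
    return order + [pkg for pkg in order if pkg in bootstrap]


if __name__ == "__main__":
    try:
        build_order(DEPS)
    except CycleError as error:
        print("cycle:", error.args[1])
    print(build_order_with_bootstrap(DEPS, {"freetype": {"harfbuzz"}}))
```

The second call yields an order where freetype shows up twice: once early without HarfBuzz support and once more at the very end.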

First attempt at booting

After getting the 41 packages built that are required to get to a very minimal Sway experience I generated a new rootfs and tried booting it on a laptop.

This started the hunt for optional dependencies that were not optional for my usecase. The first one was that seatd could not actually make a session for me because I had zero backends compiled in. This was a relatively simple fix of just enabling the builtin backend in seatd to get at least some session.

Next came the graphical stack issues...

It first took a bit of time figuring out which of these errors was the fatal one I should be looking at, it was not one of the red ones in this case. The important error here was the "iris: driver missing" line which is from Mesa. I had initially assumed that i915 was the hardware backend I needed for my laptops since it's the name I've always come across. Apparently my laptops are old but not old enough to require i915 graphics, instead I needed to enable the iris backend.

Enabling the flag for iris in Mesa is very simple, but the hard part is the additional dependencies this adds to Mesa. Iris requires libclc which is the library for OpenCL. This depends on SPIR-V and LLVM which means packaging another massive project.

LLVM was by far the biggest time sink for a single package I've had so far. This package takes absolutely forever to compile and I was building this on an X280 with 8GB of RAM. Since this laptop has 8 CPU threads I have built everything with -j8 so far, which works fine except for LLVM where I had to drop to -j4 to not run out of memory while building. I had the same issues with Clang as well and together I've spent 3 days waiting on either one of them to build to hit the next issue that needed slight adjustments in the flags.
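The -j8 versus -j4 choice is really just taking the minimum of what the CPU and the RAM allow. A tiny sketch of that rule of thumb is below; the pick_jobs function is made up for this example and the 2 GB per compile job is an assumption picked so that it reproduces the -j4 I ended up with on 8GB, not a measured number for LLVM.

```python
def pick_jobs(cpu_threads: int, ram_gb: float, gb_per_job: float) -> int:
    """Pick a make -j value limited by both CPU threads and memory."""
    if cpu_threads < 1 or gb_per_job <= 0:
        raise ValueError("need at least one thread and a positive job size")
    by_memory = int(ram_gb // gb_per_job)
    # Never go below one job, even if memory is too tight on paper.
    return max(1, min(cpu_threads, by_memory))


if __name__ == "__main__":
    # Assumed 2 GB per job for a memory heavy build on the 8GB laptop.
    print(f"-j{pick_jobs(8, 8, 2.0)}")
```

For regular packages the per-job memory is small enough that the thread count is the only limit, which is why -j8 works for everything else.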

With LLVM working I managed to build all the packages required for SPIR-V and libclc so I could finally build the iris backend in Mesa. Since I now had a few extra dependencies packaged I also could enable llvmpipe as software rasterizer and osmesa, the off-screen mesa renderer.

Sway starts

With my graphics drivers fixed I finally got Sway to run. This was a very unexciting start though since the only thing it actually rendered was a black screen with my cursor. To make this more annoying to debug it also did not allow me to switch back to a TTY with ctrl+alt+F{1,2,3,4} anymore to see any of the debug logs. This forced me to build the thing I had been postponing: OpenSSH.

By launching Sway through an SSH session I noticed that the first thing I was missing was the swaybg binary, which apparently is a separate package. That explains why my background was completely black. This was packaged and built in a few minutes which fixed 90% of the screen area. The next mystery was the missing bars.

Surprisingly with all the logging turned to max in Sway I still got no error message whatsoever about the bars not showing up. Even more surprising is that if I reloaded the config a few times there occasionally were some graphical artifacts where the bars were supposed to be.

After trying a few things and guessing even more things I figured out why it did not show up: I have the entire font rendering pipeline working but I haven't packaged a single font.... So that was an easy fix.

To complete the minimal working environment I also built the foot package to have a terminal available that did not have too many extra dependencies to work.

BodgeOS Sway!

There's also several more low-level things I had to figure out on the way, like my installed system not having any locales available. This was quickly fixed by importing the locale-gen script from ArchLinux to generate the locales I need and fixing up my glibc package to put the locale files in the right location.

Branding

Now that I have the bare minimum I could focus on more cleanup work and small features. One of those is making the default wallpaper for my distro. I ended up doing the same thing I always do when I get annoyed with Inkscape not doing what I want: rendering the graphics directly using Python instead.

I made a small python script that uses Pillow to render a wallpaper at the requested resolution.

This was inspired by the KiCad PCB editor I had open. I thought it was topical since the distro name was also inspired by my electronics projects :D

Continuing on

So this most likely completes my BodgeOS project for this year. I'm now up to 238 APKBUILD scripts in the repository which build ~900 packages for the distribution.

I'll probably have to package a lot more for my next goal, which is getting Firefox to run. This includes some things I've been avoiding like figuring out how to bring up the rust ecosystem and packaging GTK. While the current packages might've been hard to figure out, the rust ecosystem seems to actively resist packaging efforts to make it even harder. Maybe I should get more of Python packaged first so I can use my own utilities for working with APKBUILD files.

Since this is also the last blog post of the year, happy new year everyone!

Megapixels 2.0 alpha release

Martijn Braam — Tue, 24 Dec 2024 13:16:57 -0000

It's been quite a while since I wrote a Megapixels update post. Since my last post libmegapixels has had a lot more testing on hardware other than the PINE64 devices and the Librem 5 which I originally wrote it for. This obviously found a few flaws in my library code for edge cases I hadn't had to deal with before but overall the fundamental ideas behind the library seem to work.

I have now removed the last device-specific workaround from the libmegapixels code and the device support is now purely config files with a few flags to turn on quirks present in a few drivers like not having ioctls implemented correctly.

I once again stood before the software release dilemma: should I push a release that's not perfect or keep waiting and waiting to release until every last bug has been ironed out. Currently when running Megapixels 2.0 on the original PinePhone it's not a perfect drop-in replacement with all the features which is why I wanted to hold off on a release. But there's a few other devices that now already have 100% camera functionality on the development branch and for those devices a release would be great.

Megapixels 2.0.0-alpha1

As a compromise I have tagged an alpha release now from the development branch. This way the issues that will come up when running Megapixels on one of the many combinations of distributions and devices can be ironed out. Since Megapixels is now also split up into the megapixels, libdng and libmegapixels projects, the packaging can also be figured out now. The two libraries also have a new 0.2.0 tag that marks the minimum version required for running the alpha release.

With this release it also means that all the library apis are now somewhat stable, but more importantly I'm now pretty confident that the config file format won't need any intrusive changes anymore so files for other devices can now be created without risking a lot of breakage down the line.

This format now also finally has some proper documentation over at https://libme.gapixels.me/config.html because "copy another file and hope for the best" is simply not a great developer experience.

Megapixels 2.0 running on the Samsung Galaxy SIII (ported by @jack_kekzoz)

Many thanks

I've not built this release alone of course. I'd like to thank @k.vos, @pavelm, @pastalian, @jack_kekzoz, @barni2000 and @Luigi311 for their contributions to all the various parts of this codebase. I'd also like to thank the people that have supported my patreon/liberapay to sponsor me working on this :)

The release

The most important link of course, the Megapixels tag:

https://gitlab.com/megapixels-org/Megapixels/-/tags/2.0.0-alpha1

The libraries releases are:

Documentation available at:

BodgeOS pt.2: Running on real hardware

Martijn Braam — Tue, 17 Dec 2024 00:30:04 -0000

In the previous part of this series I created a base Linux distribution from a running LFS system. This version only ran as a container, which has several benefits that make building the distribution a lot easier. For a simple container I didn't have to have:

A service manager (systemd)
Something to make it bootable on x86_64 systems (grub, syslinux, systemd-boot)
A kernel
An initramfs to get my filesystem mounted
File-system utilities since there's a folder instead of a filesystem.

A few of these are pretty easy to get running. I already have all the dependencies to build a kernel so I generated a kernel from the linux-lts package in Alpine Linux.

To make things easier for myself I just limited the distribution to run on UEFI x86_64 systems for now. This means I don't have to mess with grub ever again and I can just dump systemd-boot into my /boot folder to get a functional system. I had to build this anyway since I had to build systemd to have an init system for my distribution.

The Initramfs

The thing that took by far the longest is messing with the initramfs to make my test system boot. The initramfs generator is certainly one of the parts that have the most distribution-specific "flavor". Everyone invents their own solution for it, like mkinitcpio for ArchLinux, mkinitfs for Alpine and initramfs-tools for Debian as a few examples.

I did the only logical thing and reinvented the wheel here. I'm even planning to reinvent it even further! Like the above solutions my current initramfs generator is a collection of shell scripts. The initramfs is a pretty simple system after all: it has to load some kernel modules, find the rootfs, mount it and then execute the init in the real system.

For a very minimal system the only required thing is the busybox binary. It provides the shell script interpreter required to run the messy shell script that brings up the system and also provides all the base utilities. Due to my previous experiences with BusyBox modprobe in postmarketOS I decided to also put the real modprobe binary in the initramfs to have things loading correctly. To complete it I also added blkid instead of relying on the BusyBox implementation here to have support for partition labels in udev so no custom partition-label-searching code is required.

Getting binaries in the initramfs is super easy. The process for generating an initramfs is:

Create an empty working directory
Copy the files you need from the regular rootfs into the working directory, for example /usr/bin/busybox to /tmp/initfs-build/usr/bin/busybox
Add in a script that functions as pid 1 in the initramfs and starts execution of the whole system
Run the cpio command against the /tmp/initfs-build directory to create an archive of this temporary rootfs and run that through gzip to generate initramfs.gz
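
As a rough illustration of those four steps, here is a minimal Python sketch. It is not the BodgeOS mkinitfs, which is a collection of shell scripts. The function names and the default init path are assumptions made for the example, and pack() assumes a cpio binary that understands -o -H newc is available on the build machine.

```python
import gzip
import shutil
import subprocess
from pathlib import Path


def stage_files(rootfs, workdir, files):
    """Step 2: copy each absolute path from rootfs to the same place in workdir."""
    for name in files:
        relative = name.lstrip("/")
        target = workdir / relative
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(rootfs / relative, target)


def write_init(workdir, script, path="init"):
    """Step 3: install the script that runs as pid 1 (the path is an assumption)."""
    init = workdir / path
    init.parent.mkdir(parents=True, exist_ok=True)
    init.write_text(script)
    init.chmod(0o755)


def pack(workdir, output):
    """Step 4: archive workdir with cpio in newc format and gzip the result."""
    # cpio -o reads the list of names to archive from stdin, parents first
    names = "\n".join(
        str(p.relative_to(workdir)) for p in sorted(workdir.rglob("*"))
    )
    archive = subprocess.run(
        ["cpio", "-o", "-H", "newc"],
        input=names.encode(), cwd=workdir, capture_output=True, check=True,
    )
    output.write_bytes(gzip.compress(archive.stdout))
```

Step 1 is left to the caller: create an empty directory such as /tmp/initfs-build, call stage_files() and write_init() on it, then pack() it into initramfs.gz.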

Step 2 is fairly simple since I just need to copy the binaries from the host system, but those binaries also have dependencies that need to be copied to make the executable actually work. Normally this is handled by the lddtree utility but I didn't feel like packaging that. It is a shell script that does a complicated task which is never a good thing and it depends on python and calling various ELF binary debugging utilities.

Instead of using lddtree I brought up Hare on my distribution and wrote a replacement utility for it called bindeps. This is just a single binary that loads the ELF file(s) and spits out the dependencies without calling any other tools. This is significantly faster than lddtree, which was always the slowest part of generating the initramfs for postmarketOS.

The output format is also optimized to be easily parseable in the mkinitfs shell script.

$ lddtree /usr/sbin/blkid /usr/sbin/modprobe 
/usr/sbin/blkid (interpreter => /lib64/ld-linux-x86-64.so.2)
    libblkid.so.1 => /lib/x86_64-linux-gnu/libblkid.so.1
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6
/usr/sbin/modprobe (interpreter => /lib64/ld-linux-x86-64.so.2)
    libzstd.so.1 => /lib/x86_64-linux-gnu/libzstd.so.1
    liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5
    libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3
    libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6

$ bindeps /usr/bin/blkid /usr/bin/modprobe 
/usr/lib/ld-linux-x86-64.so.2
/usr/lib/libblkid.so.1.1.0
/usr/lib/libc.so.6
/usr/lib/libzstd.so.1.5.6
/usr/lib/liblzma.so.5.6.3
/usr/lib/libz.so.1.3.1
/usr/lib/libcrypto.so.3

The bindeps utility seems to be roughly 100x faster in the few testcases I've used it in and it outputs in a format that needs no further string-mangling to be used in a shell script. In BodgeOS mkinitfs it's used like this:

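binaries="/bin/modprobe /bin/busybox /bin/blkid"
# $binaries is left unquoted on purpose so it splits into separate paths
for bin in $binaries ; do
    install -Dm0755 "$bin" "$workdir/$bin"
done

# bindeps prints one library path per line
bindeps $binaries | while read -r lib ; do
    install -Dm0755 "$lib" "$workdir/$lib"
done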

The next part is the kernel modules. Kernel modules are also ELF binaries just like the binaries I just copied over but they sadly don't contain any dependency metadata. This metadata is stored in a separate file called modules.dep that has to be parsed separately. I did not bother with this and copied the solution from the initramfs generator example from LFS, which just copies hardcoded folders of modules into the initramfs and hopes it works.

The file format for modules.dep is trivial so I really want to just integrate support for that into bindeps in the future.
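
To show how little there is to parse, here is a minimal Python sketch (bindeps itself is written in Hare, so this is only an illustration). Each line of modules.dep is a module path, a colon, and a space-separated list of the modules it depends on, with paths relative to the kernel's module directory. The function names are made up for this example.

```python
def parse_modules_dep(text):
    """Map every module path to the list of module paths it depends on."""
    deps = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        module, _, rest = line.partition(":")
        deps[module.strip()] = rest.split()
    return deps


def files_for(modules, deps):
    """Return the requested modules plus everything they pull in, deduplicated."""
    wanted = []
    queue = list(modules)
    while queue:
        module = queue.pop(0)
        if module in wanted:
            continue
        wanted.append(module)
        # follow dependencies of dependencies as well, to be safe
        queue.extend(deps.get(module, []))
    return wanted
```

The list returned by files_for() could then be copied into the working directory the same way the libraries are, instead of copying hardcoded folders.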

Debugging the boot

It's surprisingly painful to debug a non-booting Linux system that fails in the initramfs. I wasted several hours figuring out why the kernel threw errors at the spot where the initramfs should start executing. This ended up being the /sbin/init file having the wrong shebang line at the start, so it was not loadable. The kernel has no proper error message that conveys any of this.

After I got the initramfs to actually load and start, a lot of time was wasted on executables missing the interpreter. In the example above this is the /lib64/ld-linux-x86-64.so.2 line. The issue here ended up being that I was just missing the /lib64 symlink in my initramfs. This was very hard to debug in a system without debug utilities because nothing could execute.

After all that I spent even more time figuring out why I had no kernel log lines on my screen. After much annoyance this turned out to be caused by missing options in the linux-lts kernel config I took from Alpine Linux. So instead of fixing that I took the kernel config from ArchLinux and rebuilt the linux-lts package. This fixed my kernel log output issue but also added a new one... The keyboard in my laptop wasn't working in the initramfs.

I never did figure out which module I was missing for that because I fixed the rest of the initramfs script instead so it just continues on to the real rootfs where all the modules are available.

After all that I did manage to get to a login prompt though!

Cleaning up

After booting this up I realized it would be really handy if I actually had a /etc/passwd file in my system and some more of the bare essentials so I could actually log in and use the system.

This mainly involved adding missing dependencies in packages and packaging a few more files in /etc to make it a functional system. After the first boot the journal had a neat list of system binaries from util-linux that systemd depends on but not explicitly, so I added those dependencies to my systemd packaging.

I also had to fix the issue that my newly-installed system did not trust the BodgeOS repository, I was missing the keys package that installs the repository key in /etc/apk/keys for me. In this process I noticed that the key I built the system with was called `-randomdigits.pub` instead of being prefixed with a name. This is pretty annoying because this name is embedded in all the compiled packages and I didn't want to ship a file with that name in my keys package.

There seemed to be a nifty solution though: the abuild-sign tool appends a key to a tar archive, which is normally used to sign the APKINDEX.tar.gz file that contains the package list in the repository. I decided to run abuild-sign *.apk in my main repository after adjusting the abuild signing settings with a correct key name.

Apparently this breaks .apk files. After inspection they now had two keys in them, and neither my development LFS install nor my test BodgeOS install wanted to have anything to do with the packages anymore.

In the end I had to throw away my built packages and rebuild everything again from the APKBUILD files I had. Luckily this distribution is not that big yet so a full rebuild only took about 2.5x the duration of Dark Side of the Moon.

Next steps

Now that I have a basic system that can boot I continued with packaging more libraries and utilities you'd expect in a regular Linux distribution. One of the things I thought would be very neat is having curl available, but that has a surprising amount of dependencies. Some tools like git will be useful too before I can call this system self-hosting.

I also want to remove all the shell scripts from the initramfs generation. None of the tasks in the initramfs are really BodgeOS specific. Most of the complications and bugs in this initramfs implementation (and the one in postmarketOS) exist because the utilities it depends on are not really intended to do this stuff, and system bootup just has a lot of race conditions that shell scripts are not great at handling.

My current plan to fix that is to just replace the entire initramfs with a single statically linked binary. All this logic is way neater to implement in a good programming language.

Conjuring a Linux distribution out of thin air

Martijn Braam — Sat, 07 Dec 2024 23:08:27 -0000

I decided I had to get something with slightly more CPU power than my Thinkpad x230 for a few tasks so I got a refurbished x280. Aside from the worse keyboard the laptop is pretty nice and light. It shipped with Windows of course so the first thing I did was install Ubuntu on the thing to wipe that off and verify all the hardware is working decently.

I was wondering if I should leave Ubuntu on the thing, it works pretty well and it's still possible to get rid of all the Snap stuff, it's not my main machine anyway. The issue I ran into quickly though is some software is pretty outdated, like I don't want to use Kicad 7 anymore...

Picking distros once again

You'd think after using Linux for decades I would know what distro I'd put on a new machine. All the options I could think of though had annoying trade-offs I didn't want to deal with once again.

The three main distributions I have running on hardware I manage are Alpine Linux, Archlinux and Debian. I like Alpine a lot but it is quite annoying when you deal with closed-source software. Since this is my go-to laptop to take with me to outages and repairs I need it to handle random software thrown on it relatively easily.

ArchLinux satisfies that requirement pretty well but my main issue with it is pacman. If you don't religiously run upgrades every hour on the thing it will just break because for some reason key management is not figured out yet there. The installation is also quite big usually due to packages not being split.

Debian fixes the stability issues but comes with the trade off that software is usually much much older, this also leaks into Ubuntu that's running on the laptop now. It is also internally a lot more complicated due to the way it automatically sets up stuff while installing which I don't usually need.

There is another solution though. Just build my own!

Artisanal home-grown Linux

Creating a new Linux distribution is one of those things that sounds much harder than it actually is. I just haven't done it before. I did build a small Debian derivative distro before just to avoid re-doing all the config for all machines but that's just adding an extra repository to an existing distribution. Of course I've also worked extensively on postmarketOS and while the scope of that is a lot larger it still is only a repository with additions on Alpine Linux.

Some of you might be familiar with this graphic of Linux distributions:

There are many many many derivative distributions here, and way fewer distributions that are built up from nothing. And that's exactly the part that I want to figure out. How do I make a distribution from scratch?

I have once, a long time ago, built a working Linux machine from sources using the great Linux From Scratch project. I would recommend that anyone that's really into Linux internals do that at least once. Just like Archlinux teaches you how the distribution installer works (before they added an installer) the LFS book will teach you how to bootstrap a separate userspace from another Linux distribution and how to do the gcc/glibc dependency loop build.

So my plan for the distribution is: create a super barebones system using the LFS book, package up that installation using abuild and apk from Alpine Linux. This way I can basically make my own systemd/glibc distribution that is mostly like Archlinux but uses the packaging tools and methodology from Alpine Linux.

The bootstrap build

So to make my distribution I first have to build the Linux installation that will build the packages for the distribution. To get through this part relatively quickly I used the automated LFS installer called ALFS. This basically does all the steps from the LFS book but very quickly. My intention is to replace all of these packages anyway so this part was not super important. It does all the required setup though to validate the GCC I'm using to build my distribution is sane and tested.

In the ALFS installer I picked the systemd option since I didn't want to deal with openrc again and ended up with a nice functional rootfs to work in. I immediately encountered a few things that were critical and missing though. There was no wget or curl. I fixed this by grabbing the Busybox binary from my host Ubuntu distribution and putting that inside with only a wget symlink to it.

This wget was not functional in my chroot though since it could not connect to https servers. Annoyingly all the solutions you'd find for these errors come down to passing --no-check-certificate to wget to get your files, which is unacceptable when building a distribution. After a lot of debugging with openssl configs I ended up copying over all the certificates from the host Ubuntu system again and pointed wget to the certificate bundle with the wgetrc file.

The very next thing I needed is abuild. This is the tool from Alpine Linux that's used to build the packages. Luckily it's a few small C programs and a large shell script so it's very easy to install in the temporary system. I also added apk.static to the system to be able to install the built packages.

Building packages

Now that I have my temporary system running I could start writing APKBUILD files for all the packages in my LFS system. I started with the very simplest one of course that only provides /etc/services and /etc/protocols. No compiling and dependencies needed.

For this package I made a script that builds the current APKBUILD using the abuild subcommands and generates a neat local repository so apk can install the files. So I ran that and now /etc/services and /etc/protocols are exactly the same files but managed by apk and in a package.

The reason I had to use the abuild subcommands to run separate stages instead of just running abuild itself to do the whole thing is that one of the first steps is installing the build dependencies. In this case I'm in the weird setup where I have all the build dependencies installed through LFS but apk doesn't know about that so I simply skip that step.

And that's how I rebuilt a lot of the LFS packages once again, but this time through my half broken abuild installation. One of the things that abuild was not happy about is the lack of the scanelf utility which it uses after building a package to check which .so files the binaries in the package depend on. Due to this a lot of dependencies between packages are simply missing. The scanelf utility has enough build dependencies though that I could not have that as one of the first few packages so most of the packages are broken in this stage.

When I finally built and installed scanelf I ran into another issue. The packages I built after this failed at the last step because scanelf found the .so files required for the package but the package metadata for all the packages I made before it apparently lacked the information about the included .so files. At this point I had to build all those packages for a fourth time (twice in LFS and now twice in abuild) to make dependencies work.

After getting through practically everything in the base system I ended up with around 333 packages in my local repository.

Most of these APKBUILD files are a combination of the metadata header copied from the Alpine Linux APKBUILD, so I have a neat description and the correct license data, followed by the build steps and flags from LFS and sometimes the install step from Archlinux.

This means that for example the xz build steps in LFS are:

$ ./configure --prefix=/usr    \
              --disable-static \
              --docdir=/usr/share/doc/xz-5.6.3
$ make
$ make check
$ make install

And that combined with the Alpine Linux metadata headers and adjustments for packaging becomes:

pkgname=xz
pkgver=5.6.3
pkgrel=0
pkgdesc="Library and CLI tools for XZ and LZMA compressed files"
url="https://tukaani.org/xz/"
arch="all"
license="GPL-2.0-or-later AND 0BSD AND Public-Domain AND LGPL-2.1-or-later"
subpackages="$pkgname-doc $pkgname-libs $pkgname-dev"
source="https://github.com/tukaani-project/xz/releases/download/v$pkgver/xz-$pkgver.tar.xz"

build() {
        ./configure \
                --prefix=/usr \
                --disable-static \
                --docdir=/usr/share/doc/xz-$pkgver
        make
}

check() {
        make check
}

package() {
        make DESTDIR="$pkgdir" install
}

Getting the first install to run

Now the hard part, finding all the issues that prevent this new installation from starting. The first thing I tried was to use apk.static on my host system to generate a new chroot from the repository I created, just like you'd install an Alpine chroot.

Unfortunately this did not work and I had to fork apk-tools and make my own adjusted version. This is mainly because apk-tools hardcodes paths in it which conflict with my usrmerge setup. So I now have an apk.static build from my fork that does not try to create /var/ for the database before the baselayout package can create the actual filesystem hierarchy with a symlink at that spot.

With that fixed apk.static was able to finish creating an installation from my repository, but I could not chroot into it for some reason. All the binaries were broken and returned "Not found" when trying to execute them. I managed to actually enter the chroot by throwing my trusty busybox binary in there but did not get any more information out of that installation.

After a bunch of testing, debugging, and more testing, I found out the reason was that I don't have /lib64 in my installation. It seems like it's required specifically because x86_64 binaries specify /lib64/ld-linux-x86-64.so.2 as loader. The fix for that is quite easy by just having the glibc package place a symlink at that spot to the real ld-linux.so.
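
The loader path is stored in the binary itself, in a program header of type PT_INTERP. As a minimal sketch of how to look it up without any debug utilities, the Python below reads that header from a little-endian 64-bit ELF file. The function name is made up for this example, and other ELF classes and byte orders are deliberately not handled.

```python
import struct

PT_INTERP = 3  # program header type that holds the loader path


def elf_interpreter(data):
    """Return the interpreter path of a little-endian ELF64 file, or None."""
    if data[:4] != b"\x7fELF" or data[4] != 2 or data[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    # ELF64 file header: e_ident, then type, machine, version, entry,
    # phoff, shoff, flags, ehsize, phentsize, phnum, shentsize, shnum, shstrndx
    header = struct.unpack_from("<16sHHIQQQIHHHHHH", data, 0)
    e_phoff, e_phentsize, e_phnum = header[5], header[9], header[10]
    for index in range(e_phnum):
        # ELF64 program header: type, flags, offset, vaddr, paddr,
        # filesz, memsz, align
        p_type, _, p_offset, _, _, p_filesz, _, _ = struct.unpack_from(
            "<IIQQQQQQ", data, e_phoff + index * e_phentsize
        )
        if p_type == PT_INTERP:
            raw = data[p_offset:p_offset + p_filesz]
            return raw.rstrip(b"\0").decode()
    return None  # statically linked binaries have no interpreter
```

Running this over a dynamically linked binary from the new installation, with the file contents read in binary mode, shows which loader path has to exist inside the chroot before anything can execute.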

Beyond the first run

There's a lot of things that need to be fixed up for this to be a good distribution. All the packages will need to be rebuilt again from the distribution installed from these first generation packages to leave behind the last parts of LFS in there. There's also the system setup that needs to happen to make it bootable and maintainable. For example things like the keys package that installs and updates the repository keys and adding the logo to neofetch (after packaging neofetch).

I've also rsync'd the repository for the distribution to a webserver so it can actually be added to installations. I've been using this now to create test chroots using my locally built patched apk-tools.

The repository itself also needs quite a bit of work, gcc shouldn't've been pulled in here for these packages and a bunch of the large packages need to be split up to remove the uncommon parts. Currently the glibc package is already 10x larger than a base Alpine Linux installation, luckily apk is a very fast package manager and everything still installs super quickly.

So why?

It doesn't really make sense to do this. I wanted to have done it anyway because that's how you learn. This took about 3 days of fiddling with build scripts between other work since it's mostly waiting on builds to finish.

At the very least this distribution created this blog post, which is surprisingly one of the very few pieces of information available for bootstrapping a distribution.

Even with this tiny base of packages this is already a quite usable OS since I have a kernel, a service manager and python. That's all you need to build some embedded stuff, if this had an ARM64 port bootstrapped or something. For desktop work there's still a mountain of work to be done to package everything required to launch Firefox. Getting the basic graphics stack up and running should be relatively straightforward by bringing up Sway with its dependencies.

In the end it probably would've been easier to just add a ppa for Kicad to my Ubuntu installation :)

Building a timeseries database for fun

Martijn Braam — Mon, 28 Oct 2024 22:50:55 -0000

Everyone that has tried to make some nice charts in Grafana has probably come across timeseries databases, for example InfluxDB or Prometheus. I've deployed a few instances of these things for various tasks and I've always been annoyed by how they work. But this is a consequence of having great performance right?

The thing is... most of the data series I'm working with don't need that level of performance. If you're just logging the power delivered by a solar inverter to a raspberry pi then you don't need a datastore for 1000 datapoints per second. My experience with timeseries is not that performance is my issue, but that the queries I want to do, which seem very simple, are practically impossible, especially when combined with Grafana.

Something like having a daily total of a measurement as a bar graph to have some long-term history with keeping the bars aligned to the day boundary instead of 24 hour offsets based on my current time. Or being able to actually query the data from a single month to get a total instead of querying chunks of 30.5 days.

But most importantly, writing software is fun and I want to write something that does this for me. Not everything has to scale to every usecase from a single raspberry pi to a list of fortune 500 company logos on your homepage.

A prototype

Since I don't care about high performance and I want to prototype quickly I started with a Python Flask application. This is mainly because I already wrote a bunch of Python glue before to pump the data from my MQTT broker into InfluxDB or Prometheus so I can just directly integrate that.

I decided that as storage backend just using a SQLite database will be fine and to integrate with Grafana I'll just implement the relevant parts of the Prometheus API and query language.

To complete it I made a small web UI for configuring and monitoring the whole thing. Mainly to make it easy to adjust the MQTT topic mapping without editing config files and restarting the server.

I've honestly probably spent way too much time writing random javascript for the MQTT configuration window. I had already written a MQTT library for Flask that allows using the Flask route syntax to extract data from the topic so I reused that backend. To make that work nicely I also wrote a simple parser for the syntax in Javascript to visualize the parsing while you type and give you dropdowns for selecting the values.

This is not at all related to the dataseries part but at least it allows me to easily get a bunch of data into my test environment while writing the rest of the code.

The database

For storing the data I'm using the sqlite3 module in Python. I dynamically generate a schema based on the data that's coming in with one table per measurement.

There's two kinds of devices on my MQTT broker, some send the data as a JSON blob and some just send single values to various topics.

Example data from the MQTT broker

The JSON blobs are still considered a single measurement and all the top-level values get stored in separate columns. Later in the querying stage the specific column is selected.

My worst case is a bunch of ESP devices that measure various unrelated things and output JSON to the topic shown above. I have a single ingestion rule in my database that grabs devices/hoofdweg/ and dumps it in a table that has the columns for the various sensors, which ends up with a schema like this:

A timestamp is stored, no consideration is made for timezones since in practically all cases a house isn't located right on a timezone boundary. The tags are stored in separate columns with a tag_ prefix and the fields are stored in columns with a field_ prefix. The maximum granularity of data is also a single second since I don't store the timestamp as a float.
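
A minimal sketch of how such a dynamically generated schema could work with the sqlite3 module is shown below. This is not the code from the project: the store() function, the series_ table prefix handling and the choice to skip nested JSON values are assumptions made for the example.

```python
import json
import re
import sqlite3
import time


def _ident(name):
    """Reduce a name to characters that are safe in an SQL identifier."""
    return re.sub(r"\W", "_", name)


def store(db, measurement, tags, payload, now=None):
    """Store one JSON blob as a row, adding columns that do not exist yet."""
    fields = json.loads(payload)
    columns = {f"tag_{_ident(k)}": v for k, v in tags.items()}
    # Only top-level scalar values become columns (assumption of this sketch)
    columns.update(
        {f"field_{_ident(k)}": v for k, v in fields.items()
         if not isinstance(v, (dict, list))}
    )
    table = f"series_{_ident(measurement)}"
    db.execute(f"CREATE TABLE IF NOT EXISTS {table} (instant INTEGER NOT NULL)")
    existing = {row[1] for row in db.execute(f"PRAGMA table_info({table})")}
    for name in columns:
        if name not in existing:
            db.execute(f"ALTER TABLE {table} ADD COLUMN {name}")
    names = ["instant", *columns]
    marks = ", ".join("?" for _ in names)
    # Whole seconds only, matching the single-second granularity
    instant = int(time.time() if now is None else now)
    db.execute(
        f"INSERT INTO {table} ({', '.join(names)}) VALUES ({marks})",
        [instant, *columns.values()],
    )
```

Calling store(db, "sensors", {"sensor": "solar"}, '{"voltage": 231.5}') creates series_sensors on first use, and a later blob with an extra key simply adds another field_ column.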

A lot of the queries I do don't need every single datapoint though but instead I just need hourly, daily or monthly data. For that a second table is created with the same structure but with aggregated data:

This contains a row for every hour with the min(), max() and avg() of every field. It also contains a row for every day and one for every month. This makes it possible to throw away the data that has single-second granularity after a preconfigured amount of time and keep the aggregated data way longer. For querying you explicitly select which table you want the data from.
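
The hourly part of that aggregation can be sketched in a few lines of Python and SQL. The table layout here is a simplified assumption (a single voltage field, no date or hour columns), and the scale value 0 for hourly rows is just the convention picked for this sketch. Daily and monthly rows would need calendar-aware bucketing and are left out.

```python
import sqlite3

HOURLY = 0  # scale value used for hourly rows in this sketch


def create_tables(db):
    """Create a simplified realtime table and its reduced counterpart."""
    db.execute(
        "CREATE TABLE series_sensors "
        "(instant INTEGER, tag_sensor TEXT, field_voltage REAL)"
    )
    db.execute(
        "CREATE TABLE reduced_sensors (instant INTEGER, scale INTEGER, "
        "tag_sensor TEXT, min_voltage REAL, max_voltage REAL, avg_voltage REAL)"
    )


def reduce_hourly(db):
    """Rebuild the hourly rows, bucketed on exact hour boundaries."""
    db.execute("DELETE FROM reduced_sensors WHERE scale = ?", (HOURLY,))
    db.execute(
        """
        INSERT INTO reduced_sensors
            (instant, scale, tag_sensor, min_voltage, max_voltage, avg_voltage)
        SELECT instant - instant % 3600, ?, tag_sensor,
               min(field_voltage), max(field_voltage), avg(field_voltage)
        FROM series_sensors
        GROUP BY instant - instant % 3600, tag_sensor
        """,
        (HOURLY,),
    )
```

Because the bucket is computed from the stored timestamp itself, a row always covers the same fixed hour no matter when it is queried.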

The querying

To make the Grafana UI not complain too much I kept the language syntax the same as Prometheus but simply implemented less of the features because I don't use most of them. The supported features right now are:

Simple math queries like 1+1, this can only do addition queries and is only here to satisfy the Grafana connection tester.
Selecting a single measurement from the database and filtering on tags using the braces like my_sensors{sensor="solar"}
Selecting a time granularity with brackets like example_sensor[1h]. This only supports 1h, 1d and 1M and selects which rows are queried
The aggregate functions like max(my_sensors[1h]) which makes it select the columns from the reduced table with the max_ prefix for querying when using the reduced table. For selecting the realtime data it will use the SQLite max() function.

This is also just about enough to make the graphical query builder in Grafana work for most cases. The other setting used for the queries is the step value that Grafana calculates and passes to the Prometheus API. For the reduced table this is completely ignored and for the realtime table this is converted to SQL to do aggregation across rows.

As an example the query avg(sensors{sensor="solar", _col="voltage"}) gets translated to:

SELECT
  instant,
  tag_sensor,
  avg(field_voltage) as field_voltage
FROM series_sensors
WHERE instant BETWEEN ? AND ? -- Grafana time range
  AND tag_sensor = ? -- solar
GROUP BY instant/30 -- 30 is the step value from Grafana

To get nice aligned hourly data for a bar chart the query simply changes to avg(sensors{sensor="solar", _col="voltage"}[1h]) which generates this query:

SELECT
  instant,
  date,
  hour,
  tag_sensor,
  avg_voltage
FROM reduced_sensors
WHERE instant BETWEEN ? AND ? -- Grafana time range
  AND tag_sensor = ? -- solar
  AND scale = 0 -- hourly
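To illustrate the translation step, here is a small sketch that turns the two example queries into SQL strings shaped like the ones above. It is not the actual Miniseries parser: the regular expressions, the `translate` function and the `SCALES` table are assumptions, and only the hourly scale is included because that is the only value this post shows.

```python
import re

QUERY = re.compile(
    r'^(?P<func>\w+)\(\s*(?P<name>\w+)\{(?P<labels>[^}]*)\}\s*'
    r'(?:\[(?P<scale>\w+)\])?\s*\)$'
)
LABEL = re.compile(r'(\w+)\s*=\s*"([^"]*)"')
SCALES = {"1h": 0}  # only the hourly value appears in the post
FUNCTIONS = {"min", "max", "avg"}


def translate(query, start, end, step=30):
    """Return (sql, params) for a query like avg(sensors{sensor="solar", _col="voltage"})."""
    match = QUERY.match(query.strip())
    if match is None or match["func"] not in FUNCTIONS:
        raise ValueError(f"unsupported query: {query!r}")
    func, name, scale = match["func"], match["name"], match["scale"]
    if scale is not None and scale not in SCALES:
        raise ValueError(f"unsupported scale: {scale!r}")

    labels = dict(LABEL.findall(match["labels"]))
    column = labels.pop("_col")  # the field to select, the rest are tag filters
    tags = sorted(labels)
    tag_columns = [f"tag_{tag}" for tag in tags]
    where = ["instant BETWEEN ? AND ?"] + [f"{c} = ?" for c in tag_columns]
    params = [start, end] + [labels[tag] for tag in tags]

    if scale is None:
        # Realtime table: aggregate across rows using the Grafana step value
        select = ["instant", *tag_columns, f"{func}(field_{column}) AS field_{column}"]
        table = f"series_{name}"
        tail = f" GROUP BY instant/{int(step)}"
    else:
        # Reduced table: pick the pregenerated column, the step is ignored
        select = ["instant", "date", "hour", *tag_columns, f"{func}_{column}"]
        table = f"reduced_{name}"
        where.append("scale = ?")
        params.append(SCALES[scale])
        tail = ""
    sql = f"SELECT {', '.join(select)} FROM {table} WHERE {' AND '.join(where)}{tail}"
    return sql, params
```

The values always go through bound parameters; only names matched by \w+ end up in the SQL text itself.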

This reduced data is generated as a background task in the server, which makes sure that the row with the aggregate of a single hour selects the datapoints that fit exactly in that hour, not shifted by the local time when querying like I now have issues with in Grafana:

The query running against the old Prometheus database

The bars in this chart don't align with the dates because this screenshot wasn't made at midnight. The data in the bars is also only technically correct when viewing the Grafana dashboard at midnight since on other hours it selects data from other days as well. If I view this at 13:00 then I get the data from 13:00 the day before to today which is a bit annoying in most cases and useless in the case of this chart because the daily_total metric in my solar inverter is reset at night and I pick the highest value.

For monthly bars this issue gets worse because it's apparently impossible to accurately get monthly data from the timeseries databases I've used. Because I'm pregenerating this data instead of using magic intervals this also Just Works(tm) in my implementation.
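The difference with sliding intervals is that a pregenerated row always starts on a calendar boundary. A small sketch of computing that boundary for a timestamp; the function name `bucket_start` is made up and UTC is used as the default here purely as an assumption for the example.

```python
from datetime import datetime, timezone


def bucket_start(instant, scale, tz=timezone.utc):
    """Return the unix timestamp of the start of the hour, day or month containing instant."""
    moment = datetime.fromtimestamp(instant, tz=tz)
    if scale == "hour":
        start = moment.replace(minute=0, second=0, microsecond=0)
    elif scale == "day":
        start = moment.replace(hour=0, minute=0, second=0, microsecond=0)
    elif scale == "month":
        # Months have different lengths, so this cannot be done with a fixed interval
        start = moment.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    else:
        raise ValueError(f"unknown scale: {scale!r}")
    return int(start.timestamp())
```

Every datapoint maps to exactly one hourly, one daily and one monthly bucket, no matter at what time of day the dashboard is opened.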

The same sort of query on the Miniseries backend, hourly instead because I don't have enough demo data yet.

Is this better?

It is certainly in the prototype stage and has not had enough testing to find weird edge cases. It does provide all the features I need to recreate my existing home automation dashboard though, and performance is absolutely fine. The next step here is to implement a feature to lie to Grafana about the date of the data to actually use the heatmap chart to show data from multiple days as multiple rows.

Once the kinks are worked out in this prototype it's probably a good idea to rewrite it into something like Go for example because while a lot of the data processing is done in SQLite the first bottleneck will probably be the single-threaded nature of the webserver and the MQTT ingestion code.

The source code is online at https://git.sr.ht/~martijnbraam/miniseries

Making a Linux-managed network switch

Martijn Braam — Wed, 03 Jul 2024 14:10:04 -0000

Network switches are simple devices, packets go in, packets go out. Luckily people have figured out how to make it complicated instead and invented managed switches.

Usually this is done by adding a web-interface for configuring the settings and seeing things like port status. If you have more expensive switches then you'd even get access to some alternate interfaces like telnet and serial console ports.

There is a whole second category of managed switches though that people don't initially think of. These are the network switches that are inside consumer routers. These routers are little Linux devices that have a switch chip inside of them, one or more ports are internally connected to the CPU and the rest are on the outside as physical ports.

Mikrotik RB2011 block diagram from mikrotik.com

Here is an example of such a device that actually has this documented. I always thought that the configuration of these switch connected ports was just a nice abstraction by the webinterface but I was surprised to learn that with the DSA and switchdev subsystem in Linux these ports are actually fully functioning "local" network ports. Due to this practically only being available inside integrated routers it's pretty hard to play around with unfortunately.

What is shown as a single line on this diagram is actually the connection of the SoC of the router and the switch over the SGMII bus (or maybe RGMII in this case) and a management bus that's either SMI or MDIO. Network switches have a lot of these fun acronyms that even with the full name written out make little sense unless you know how all of this fits together.

Controlling your standard off-the-shelf switch using this system simply isn't possible because the required connections of the switch chip aren't exposed for this. So there's only one option left...

Making my own gigabit network switch

Making my own network switch can't be that hard right? Those things are available for the price of a cup of coffee and are most likely highly integrated to reach that price point. Since I don't see any homemade switches around on the internet I guess the chips for those must be pretty hard to get...

Nope, very easy to get. There's even a datasheet available for these. So I created a new KiCad project and started creating some footprints and symbols.

I'm glad there's any amount of datasheet available for this chip since that's not usually the case for Realtek devices, but it's still pretty minimal. I resorted to finding any devices that have schematics available for similar Realtek chips to find out how to integrate it and looking at a lot of documentation for how to implement ethernet in a design at all.

The implementation for the chip initially looked very complicated, there's about 7 different power nets it requires and there are several pretty badly documented communication interfaces. After going through other implementations it seems like the easiest way to power it is to just connect all the nets with overlapping voltage ranges together and you're left with only needing a 3.3V and 1.1V regulator.

The extra communication busses are for all the extra ports I don't seem to need. The switch chip I selected is the RTL8367S which is a very widely used 5-port gigabit switch chip, but it's actually not a 5-port chip. It's a 7 port switch chip where 5 ports have an integrated PHY and two are for CPU connections.

CPU connection block diagram from the RTL8367S datasheet

My plan is different though, while there are these CPU ports available there is actually nothing in the Linux switchdev subsystem that requires the CPU connection to be to those ports. Instead I'll be connecting to port 0 on the switch with a network cable and as far as the switchdev driver knows there's no ethernet PHY in between.

The next hurdle is the configuration of the switch chip, there's several configuration systems available and the datasheet does not really describe what is the minimum required setup to actually get it to function as a regular dumb switch. To sum up the configuration options of the chip:

There's 8 pins on the chip that are read when it's starting up. These pins are shared with the led pins for the ports so that makes designing pretty annoying. Switching the setting from pull-up to pull-down also requires the led to be connected in the different orientation.
There's an i2c bus that can be connected to an eeprom chip. The pins for this are shared with the SMI bus that I require to make this chip talk to Linux though. There is pin configuration to select from one of two eeprom size ranges but does not further specify what this setting actually changes.
There's a SPI bus that supports connecting a NOR flash chip to it. This can store either configuration registers or firmware for the embedded 8051 core depending on the configuration of the bootup pins. The SPI bus pins are also shared with one of the CPU network ports.
There is a serial port available but from what I guess it probably does nothing at all unless there's firmware loaded in the 8051.

My solution to figuring this out is to just order a board and solder connections differently until it works. I've added a footprint for a flash chip that I ended up not needing and for all the configuration pins I added solder jumpers. I left out all the leds since making that configurable would be pretty hard.

The next step is figuring out how to do ethernet properly. There has been a lot of documentation written about this and they all make it sound like gigabit ethernet requires perfect precision engineering, impedance managed boards and a blessing from the ethernet gods themselves to work. This does not seem to match up with the reality that these switches are very very cheaply constructed and seem to work just fine. So I decided to open up a switch to check how many of these coupling capacitors and impedance matching planes are actually used in a real design. The answer seems to be that it doesn't matter that much.

This is the design I have ended up with now but it is not what is on my test PCB. I got it almost right the first time though :D

The important part seems to be matching the skew within each pair, while matching the length of the 4 network pairs to each other is completely useless. This is mainly because network cables don't have the same twisting rate for the 4 pairs and so the lengths of these are already significantly different inside the cable.

The pairs between the transformer and the RJ45 jack have their own ground plane that's coupled to the main ground through a capacitor. The pairs after the transformer are just on the main board ground fill.

What I did wrong on my initial board revision was forgetting the capacitor that connects the center taps of the transformer on the switch side to ground. The center taps were shorted straight to ground instead, making the signals on that side referenced to board ground. This makes ethernet very much not work anymore so I had to manually cut tiny traces on the board to disconnect that short to ground. In my test setup the capacitor just doesn't exist and all the center taps float. This seems to work just fine but the final design does have that capacitor added.

Cut ground traces on the ethernet transformer

The end result is this slightly weird gigabit switch. It has 4 ports facing one direction and one facing backwards and it is powered over a 2.54mm pinheader. I have also added a footprint for a USB Type-C connector to have an easy way to power it without bringing out the DuPont wires.

Connecting it to Linux

For my test setup I've picked the PINE64 A64-lts board since it has the connectors roughly in the spots where I want them. It not being an x86 platform is also pretty important because configuration requires a device tree change, can't do that on a platform that doesn't use device trees.

The first required thing was rebuilding the kernel for the board since most kernels simply don't have these kernel modules enabled. For this I enabled these options:

CONFIG_NET_DSA for the Distributed Switch Architecture system
CONFIG_NET_DSA_TAG_RTL8_4 for having port tagging for this Realtek switch chip
CONFIG_NET_SWITCHDEV the driver system for network switches
CONFIG_NET_DSA_REALTEK, CONFIG_NET_DSA_REALTEK_SMI, CONFIG_NET_DSA_REALTEK_RTL8365MB for the actual switch chip driver
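Before booting a rebuilt kernel it can be handy to check that all of these options actually ended up in the config. A small sketch that checks the text of a kernel .config file for the options listed above; the function name `missing_options` is made up for this example.

```python
REQUIRED = [
    "CONFIG_NET_DSA",
    "CONFIG_NET_DSA_TAG_RTL8_4",
    "CONFIG_NET_SWITCHDEV",
    "CONFIG_NET_DSA_REALTEK",
    "CONFIG_NET_DSA_REALTEK_SMI",
    "CONFIG_NET_DSA_REALTEK_RTL8365MB",
]


def missing_options(config_text, required=REQUIRED):
    """Return the required options that are not built in (=y) or a module (=m)."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options show up as "# CONFIG_FOO is not set" comments
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return [option for option in required if option not in enabled]


if __name__ == "__main__":
    import sys

    # Usage: python3 check_config.py /path/to/linux/.config
    with open(sys.argv[1]) as handle:
        for option in missing_options(handle.read()):
            print("missing:", option)
```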

Then the more complicated part was figuring out how to actually get this all loaded. In theory it is possible to create a device tree overlay for this and get it loaded by U-Boot. I decided to not do that and patch the device tree for the A64-lts board instead since I'm rebuilding the kernel anyway. The device tree change I ended up with is this:

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts
index 596a25907..10c1a5187 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts
@@ -18,8 +18,78 @@ led {
 			gpios = <&r_pio 0 7 GPIO_ACTIVE_LOW>; /* PL7 */
 		};
 	};
+
+switch {
+	compatible = "realtek,rtl8365rb";
+	mdc-gpios = <&pio 2 5 GPIO_ACTIVE_HIGH>; // PC5
+	mdio-gpios = <&pio 2 7 GPIO_ACTIVE_HIGH>; // PC7
+	reset-gpios = <&pio 8 5 GPIO_ACTIVE_LOW>; // PH5
+	realtek,disable-leds;
+
+	mdio {
+		compatible = "realtek,smi-mdio";
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		ethphy0: ethernet-phy@0 {
+			reg = <0>;
+		};
+
+		ethphy1: ethernet-phy@1 {
+			reg = <1>;
+		};
+
+		ethphy2: ethernet-phy@2 {
+			reg = <2>;
+		};
+
+		ethphy3: ethernet-phy@3 {
+			reg = <3>;
+		};
+
+		ethphy4: ethernet-phy@4 {
+			reg = <4>;
+		};
+	};
+
+	ports {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		port@0 {
+			reg = <0>;
+			label = "cpu";
+			ethernet = <&emac>;
+		};
+
+		port@1 {
+			reg = <1>;
+			label = "lan1";
+			phy-handle = <&ethphy1>;
+		};
+
+		port@2 {
+			reg = <2>;
+			label = "lan2";
+			phy-handle = <&ethphy2>;
+		};
+
+		port@3 {
+			reg = <3>;
+			label = "lan3";
+			phy-handle = <&ethphy3>;
+		};
+
+		port@4 {
+			reg = <4>;
+			label = "lan4";
+			phy-handle = <&ethphy4>;
+		};
+	};
+};
 };

It loads the driver for the switch with the realtek,rtl8365rb compatible string. This driver supports a whole range of Realtek switch chips including the RTL8367S I've used in this design. I've removed the CPU ports from the documentation example and just added the definitions of the 5 regular switch ports.

The important part is in port@0, this is the port that is facing backwards on my switch and is connected to the A64-lts, I've linked it up to &emac which is a reference to the ethernet port of the computer. The rest of the ports are linked up to their respective PHYs in the switch chip.

In the top of the code there's also 3 GPIOs defined, these link up to SDA/SCL and Reset on the switch PCB to make the communication work. After booting up the system the result is this:

1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST> mtu 1508 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
3: lan1@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
4: lan2@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
5: lan3@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
6: lan4@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff

I have the eth0 device here like normal and then I have the 4 interfaces for the ports on the switch I defined in the device tree. To make it actually do something the interfaces need to be brought online first:

$ ip link set eth0 up
$ ip link set lan1 up
$ ip link set lan2 up
$ ip link set lan3 up
$ ip link set lan4 up
$ ip link
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1508 qdisc mq state UP qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
3: lan1@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
4: lan2@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
5: lan3@eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
6: lan4@eth0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state LOWERLAYERDOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff

Now the switch is up you can see I have a cable plugged into the third port. This system hooks into a lot of the Linux networking so it Just Works(tm) with a lot of tooling. Some examples:

Add a few of the lan ports into a standard Linux bridge and the switchdev system will bridge those ports together in the switch chip so Linux doesn't have to forward that traffic.
Things like ethtool lan3 just work to get information about the link, and ethtool -S lan3 returns all the standard statistics, which include packets that have been fully handled by the switch.
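As a sketch of the bridge example, this builds the standard iproute2 commands for putting a few of the lan ports into one bridge. The bridge name br0 and the helper names are made up for this example, and actually running the commands requires root on the device.

```python
import subprocess


def bridge_commands(bridge, ports):
    """Return the ip commands that create a bridge and enslave the given ports."""
    commands = [["ip", "link", "add", "name", bridge, "type", "bridge"]]
    for port in ports:
        commands.append(["ip", "link", "set", port, "master", bridge])
    commands.append(["ip", "link", "set", bridge, "up"])
    return commands


def apply(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    apply(bridge_commands("br0", ["lan1", "lan2", "lan3"]))
```

With the ports in the same bridge the traffic between them should stay inside the switch chip, as described above.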

Limitations

There's a few things that make this not very nice to work with. First of all the requirement of either building a custom network switch or tearing open an existing one and finding the right connections.

It's not really possible to use this system on regular computers/servers since you need device trees to configure the kernel for this and most computers don't have kernel-controlled GPIO pins available to hook up a switch.

As far as I can find there's also no way to use this with a network port on the computer side that's not fixed, USB network interfaces don't have a device tree node handle to refer to to set the conduit port.

There is a chance some of these limitations are possible to work around, maybe there's some weird USB device that exposes pins on the GPIO subsystem, maybe there's a way to load switchdev without being on an ARM device but that would certainly take a bit more documentation...

Megapixels contributions

Martijn Braam — Sat, 11 May 2024 14:45:17 -0000

I've been working on the code that has become libmegapixels for a bit more than a year now. It has taken several thrown-away codebases to come to a general architecture I was happy with and it has been quite a task to split off media pipeline tasks from the original Megapixels codebase.

After staring at this code for many months I thought I've made libmegapixels a nearly perfect little library. That's the problem with working on a codebase without anyone else looking at it.

About two weeks ago libmegapixels and the general Megapixels 2.x codebase had its first contact with external contributors and that has put a spotlight on all the low hanging fruit in documentation and codebase issues. A great example of that is this commit:

diff --git a/src/parse.c b/src/parse.c
index bfea3ec..93072d0 100644
--- a/src/parse.c
+++ b/src/parse.c
@@ -403,6 +403,8 @@ libmegapixels_load_file(libmegapixels_devconfig *config, const char *file)
 	config_init(&cfg);
 	if (!config_read_file(&cfg, file)) {
 		fprintf(stderr, "Could not read %s\n", file);
+		fprintf(stderr, "%s:%d - %s\n",
+			config_error_file(&cfg), config_error_line(&cfg), config_error_text(&cfg));
 		config_destroy(&cfg);
 		return 0;
 	}

A simple patch that massively improves the usability for people writing libmegapixels config files: Actually printing the parsing errors from libconfig when a file could not be read. Because I generally run libmegapixels through the IDE and have all the syntax highlighting etc set up for the files I simply haven't triggered this codepath enough to actually implement this part.

These last two weeks there have also been some significantly more complicated fixes, like tracing segfault issues in Megapixels 2.x, which helps a lot with getting the new codebase ready for daily use, and figuring out some API issues in libmegapixels like not correctly setting camera indexes in the returned data. Also the config files have now been updated to work with the latest versions of the PinePhone Pro kernel instead of the year old build I've been developing against.

Video recording

I've been saying for a long time that video recording on the PinePhone won't be possible, especially not to the level of support on Android and iOS due to hardware limitations. The only real hope for proper video recording would be that someone gets H.264 hardware encoding to work on the A64 processor.

I can happily report that I was wrong. Pavel Machek has made significant progress in PinePhone video recording with a few large contributions that implement the UI bits to add video recording and a new second postprocessing pipeline for running external video encoding scripts, just like Megapixels already lets you write your own custom scripts for processing the raw pictures into JPEGs.

Video recording is a complicated issue though, mainly due to the sheer amount of data that needs to be processed to make it work smoothly. On the maximum resolution of the sensor in the PinePhone the framerate isn't high enough for recording normal videos (unless you enjoy 15fps video files) but on lower resolutions the pipeline can run at normal video framerates. The maximum framerates from the sensor for this are 1080p at 30fps and 720p at 60fps.

For 720p60 the bandwidth of the raw sensor data is 442 Mbps and for 1080p30 this is 497 Mbps. This is a third of the expected bandwidth because the raw sensor data is essentially a greyscale image where every pixel has a different color filter in front. This is too much data to write out to the eMMC or SD card to process later and the PinePhone also struggles already to encode 720p30 video live without even running a desktop environment.
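The numbers above can be reproduced with a bit of arithmetic. This sketch assumes 1280x720 and 1920x1080 frames with 8 bits per raw pixel, which is the value that matches the quoted figures; the function name is made up.

```python
def raw_bandwidth_mbps(width, height, fps, bits_per_pixel=8):
    """Bandwidth in megabits per second for uncompressed frames."""
    return width * height * fps * bits_per_pixel / 1_000_000


if __name__ == "__main__":
    for label, (width, height, fps) in {
        "720p60": (1280, 720, 60),
        "1080p30": (1920, 1080, 30),
    }.items():
        raw = raw_bandwidth_mbps(width, height, fps)
        # Full RGB would need three 8-bit values per pixel instead of one
        rgb = raw_bandwidth_mbps(width, height, fps, bits_per_pixel=24)
        print(f"{label}: raw {raw:.1f} Mbps, RGB {rgb:.1f} Mbps")
```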

There are two implementations of video recording right now. One that saves the raw DNG frames to a tmpfs since RAM is the only thing that can keep up with the data rate. This should give you roughly 30 seconds of video recording capabilities and after that recording time it will take a while to actually encode the video.

Pavel has posted an example of this video recording on his mastodon.

The second way is putting the sensor in a YUV mode instead of raw data. This gives worse picture quality on the sensor in the PinePhone but the data format matches more closely to the way frames are stored in video files so the expensive debayer step can be skipped while video recording. This together with encoding H.264 video with the ultrafast preset should make it just about possible to record real-time encoded video on the PinePhone.

Many thanks

It's great to see contributions to Megapixels 2 and libmegapixels. It's a big step towards getting the Megapixels 2.x codebase production ready and it's simply a lot more fun to work on a project together with other people.

It's great to have contributors working on the UI code, the camera support fixes for devices and the many bugfixes to the internals. It's also very helpful to actually have issues created by people building and testing the code on other distributions. This already ironed out a few issues in the build system.

There have also been some nice contributions to the Megapixels 1.x codebase, all of those should by now already have been merged into your favorite PinePhone distribution :)

The last few Megapixels update blogposts have all been around Megapixels 2.x and the supporting libraries so none of the improvements are immediately usable by actual PinePhone{,Pro} and Librem 5 users until there is an actual release. It will take a bunch more polish until feature parity with Megapixels 1.x is reached.

Bootstrapping Alpine Linux without root

Martijn Braam — Wed, 20 Mar 2024 23:50:30 -0000

Creating a chroot in Linux is pretty easy: put a rootfs in a folder and run the sudo chroot /my/folder command. But what if you don't want to use superuser privileges for this?

This is not super simple to fix, not only does the chroot command itself require root permissions but the steps for creating the rootfs in the first place and mounting the required filesystems like /proc and /sys require root as well.

In pmbootstrap the process for creating an installable image for a phone requires setting up multiple chroots and executing many commands in those chroots. If you have the password timeout disabled in sudo you will notice that you will have to enter your password tens to hundreds of times depending on the operation you're doing. An example of this is shown in the long running "pmbootstrap requires sudo" issue on Gitlab. In this example sudo was called 240 times!

Now it is possible with a lot of refactoring to move batches of superuser-requiring commands into scripts and elevate the permissions of that with a single sudo call but to get this down to a single sudo call per pmbootstrap command would be really hard.

Another approach

So instead of building a chroot the "traditional" way what are the alternatives?

The magic trick to get this working are user namespaces. From the Linux documentation:

User namespaces isolate security-related identifiers and attributes, in particular, user IDs and group IDs (see credentials(7)), the root directory, keys (see keyrings(7)), and capabilities (see capabilities(7)). A process's user and group IDs can be different inside and outside a user namespace. In particular, a process can have a normal unprivileged user ID outside a user namespace while at the same time having a user ID of 0 inside the namespace; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace.

It basically allows running commands in a namespace where you have UID 0 on the inside without requiring to elevate any of the commands. This does have a lot of limitations though which I somehow all manage to hit with this.

One of the tools that makes it relatively easy to work with the various namespaces in Linux is unshare. Conveniently this is also part of util-linux so it's a pretty clean dependency to have.

Building a rootfs

There's enough examples of using unshare to create a chroot without sudo but those all assume you already have a rootfs somewhere to chroot into. Creating the rootfs itself has a few difficulties already though.

Since I'm building an Alpine Linux rootfs the utility I'm going to use is apk.static. This is a statically compiled version of the package manager in Alpine which allows building a new installation from an online repository. This is similar to debootstrap for example if you're more used to Debian than Alpine.

There's a wiki page on running Alpine Linux in a chroot that documents the steps required for setting up a chroot the traditional way with this. The initial commands to acquire the apk.static binary don't require superuser at all, but after that the problems start:

$ ./apk.static -X ${mirror}/latest-stable/main -U --allow-untrusted -p ${chroot_dir} --initdb add alpine-base

This creates the Alpine installation in ${chroot_dir}. This requires superuser privileges to set the correct permissions on the files of this new rootfs. After this there's two options of populating /dev inside this rootfs which both are problematic:

$ mount -o bind /dev ${chroot_dir}/dev
mounting requires superuser privileges and this exposes all your hardware in the chroot

$ mknod -m 666 ${chroot_dir}/dev/full c 1 7
$ mknod -m 644 ${chroot_dir}/dev/random c 1 8
... etcetera, the mknod command also requires superuser privileges

The steps after this have similar issues, most of them for mount reasons or chown reasons.

There are a few namespace options from unshare used to work around these issues. The command used to run apk.static in my test implementation is this:

$ unshare \
    --user \
    --map-users=100000,0,10000 \
    --map-groups=100000,0,10000 \
    --setuid 0 \
    --setgid 0 \
    --wd "${chroot_dir}" \
    ./apk-tools-static -X...etc

This will use unshare to create a new userns and change the uid/gid inside that to 0. This effectively grants root privileges inside this namespace. But that's not enough.

If chown is used inside the namespace it will still fail because my unprivileged user still can't change the permissions of those files. The solution to that is the uid remapping with --map-users and --map-groups. In the example above it sets up the namespace so files created with uid 0 will generate files with the uid 100000 on the actual filesystem. uid 1 becomes 100001 and this continues on for 10000 uids.

This again does not completely solve the issue though because my unprivileged user still can't chown those files, doesn't matter if it's chowning to uid 0 or 100000. To give my unprivileged user this permission the /etc/subuid and /etc/subgid files on the host system have to be modified to add a rule. This sadly requires root privileges once to set up this privilege. To make the command above work I had to add this line to those two files:

martijn:100000:10000

This grants the user with the name martijn the permission to use 10.000 uids starting at uid 100.000 for the purpose of userns mapping.
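To make the mapping concrete, here is a small sketch that reads a subuid-style line and computes which uid on the host a uid inside the namespace ends up as. The function names are made up for this example.

```python
def parse_subid(text, user):
    """Return the (start, count) ranges granted to user in /etc/subuid style text."""
    ranges = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, start, count = line.split(":")
        if name == user:
            ranges.append((int(start), int(count)))
    return ranges


def host_id(inner_id, start, count):
    """Translate an id inside the namespace to the id stored on the host filesystem."""
    if not 0 <= inner_id < count:
        raise ValueError(f"id {inner_id} is outside the mapped range")
    return start + inner_id


if __name__ == "__main__":
    start, count = parse_subid("martijn:100000:10000", "martijn")[0]
    print(host_id(0, start, count))  # root inside the namespace
    print(host_id(1, start, count))
```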

The result of this is that the apk.static command will seem to Just Work(tm) and the resulting files in ${chroot_dir} will have all the right permissions but only offset by 100.000.

One more catch

There is one more complication with remapped uids and unshare that I've skipped over in the above example to make it clearer, but the command inside the namespace most likely cannot start.

If you remap the uid with unshare you get more freedom inside the namespace, but it limits your privileges outside the namespace even further. It's most likely that the unshare command above was run somewhere in your own home directory. After changing your uid to 0 inside the namespace your privilege to the outside world will be as if you're uid 100.000 and that uid most likely does not have privileges. If any of the folders in the path to the executable you want unshare to run for you inside the namespace don't have the read and execute bit set for the "other" group in the unix permissions then the command will simply fail with "Permission denied".

The workaround used in my test implementation is to just first copy the executable over to /tmp and hope you at least still have permissions to read there.
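Whether the copy is needed can be checked up front by looking at the "other" permission bits along the path. This is only a sketch of that idea with a made-up function name, it looks at plain unix mode bits and ignores things like ACLs.

```python
import os
import stat


def blocked_components(executable):
    """List the parts of the path that a remapped uid could not get through."""
    path = os.path.abspath(executable)
    parents = []
    current = path
    while True:
        parent = os.path.dirname(current)
        if parent == current:
            break
        parents.append(parent)
        current = parent

    problems = []
    for directory in reversed(parents):
        # Traversing a directory needs the execute (search) bit for "other"
        if not os.stat(directory).st_mode & stat.S_IXOTH:
            problems.append(directory)

    # The executable itself should be readable and executable for "other"
    needed = stat.S_IROTH | stat.S_IXOTH
    if os.stat(path).st_mode & needed != needed:
        problems.append(path)
    return problems
```

If this returns anything for the apk.static binary then copying it to /tmp first is the easy way out.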

Completing the rootfs

So after all that the first command from the Alpine guide is done. Now there's only the problems left for mounting filesystems and creating files.

While /etc/subuid does give permission to use a range of uids as an unprivileged user with a user namespace it does not give you permissions to create those files outside the namespace. So the way those files are created is basically the complicated version of echo "value" | sudo tee /root/file:

$ echo "nameserver a.b.c.d" | unshare \
    --user \
    --map-users=100000,0,10000 \
    --map-groups=100000,0,10000 \
    --setuid 0 \
    --setgid 0 \
    --wd "${chroot_dir}" \
    sh -c 'cat > etc/resolv.conf'

This does set-up and tear down the entire namespace for every file change or creation which is a bit inefficient, but inefficient is still better than impossible. Changing file permissions is done in a similar way.
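Wrapping this in a helper keeps the repetition manageable. A sketch of what that could look like in Python, using only the unshare options shown above; the function names are made up, the id range matches the subuid line from earlier, and the path is relative to the rootfs because the working directory is set with --wd.

```python
import subprocess


def unshare_command(chroot_dir, command, id_base=100000, count=10000):
    """Build the unshare invocation that runs command as uid 0 inside a userns."""
    return [
        "unshare",
        "--user",
        f"--map-users={id_base},0,{count}",
        f"--map-groups={id_base},0,{count}",
        "--setuid", "0",
        "--setgid", "0",
        "--wd", chroot_dir,
        "--",
        *command,
    ]


def write_file(chroot_dir, relative_path, content):
    """Write content to a file in the rootfs through a short-lived namespace."""
    # The path is passed as $1 so it does not need shell escaping
    command = unshare_command(
        chroot_dir, ["sh", "-c", 'cat > "$1"', "sh", relative_path]
    )
    subprocess.run(command, input=content.encode(), check=True)
```

A call like write_file(chroot_dir, "etc/resolv.conf", "nameserver a.b.c.d\n") would then replace the pipeline above.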

To fix the mounting issue there's the mount namespace functionality in Linux. This allows creating new mounts inside the namespace as long as you still have permissions on the source file as your unprivileged user. This effectively means you can't use this to mount random block devices but it works great for things like /proc and loop mounts.

There is a --mount-proc parameter that will tell unshare to set-up a mount namespace and then mount /proc inside the namespace at the right place so that's what I'm using. But I still need other things mounted. This mounting is done as a small inline shell script right before executing the commands inside the chroot:

$ unshare \
    --user \
    --fork \
    --pid \
    --mount \
    --mount-proc \
    --map-users=10000,0,10000 \
    --map-groups=10000,0,10000 \
    --setuid 0 \
    --setgid 0 \
    --wd "${chroot_dir}" \
    -- \
    sh -c " \
        mount -t proc none proc ; \
        touch dev/zero ; \
        mount -o rw,bind /dev/zero dev/zero ;\
        touch dev/null ; \
        mount -o rw,bind /dev/null dev/null ;\
        ...
        chroot . bin/sh \
        "

The mounts are created right after setting up the namespaces but before the chroot is started so the host filesystem can still be accessed. The working directory is set to the root of the rootfs using the --wd parameter of unshare and then bind mounts are made from /dev/zero to dev/zero to create those devices inside the rootfs.

This combines the two impossible options to make it work. mknod can still not work inside namespaces because it is a bit of a security risk. mount'ing /dev gives access to way too many devices that are not needed but the mount namespace does allow bind-mounting the existing device nodes one by one and allows me to filter them.

Then finally... the chroot command to complete the journey. This has to refer to the rootfs with a relative path and this also depends on the working directory being set by unshare since host paths are breaking with uid remapping.

What's next?

So this creates a full chroot without superuser privileges (after the initial setup) and this whole setup even works perfectly with having cross-architecture chroots in combination with binfmt_misc.

Compared to pmbootstrap this codebase does very little and there's more problems to solve. For one all the filesystem manipulation has to be figured out to copy the contents of the chroot into a filesystem image that can be flashed. This is further complicated by the mangling of the uids in the host filesystem so it has to be remapped while writing into the filesystem again.

Flashing the image to a fastboot capable device should be pretty easy without root privileges, it only requires a udev rule that is usually already installed by the android-tools package on various Linux distributions. For the PinePhone flashing happens on a mass-storage device and as far as I know it will be impossible to write to that without requiring actual superuser privileges.

The code for this is in the ~martijnbraam/ambootstrap repository, hopefully in some time I get this to actually write a plain Alpine Linux image to a phone :D

The dilemma of tagging library releases

Martijn Braam — Sun, 14 Jan 2024 16:11:17 -0000

I've been working on the libmegapixels library for quite a bit now. The base of the library is pretty solid which is configuring a V4L2 pipeline so you can get camera frames on modern ARM platforms. Most of the work on the library side is figuring out the AWB/AE/AF code and how that will fit together with applications.

Due to the AAA code not working yet and the API not being fully defined on how those parts will fit together I've been holding off on tagging an actual release on the libmegapixels library.

A lot of my projects, especially libraries, are written in Python so I've long enjoyed the luxury of APIs being duck-typed and having the possibility of adding optional arguments to methods in the future. Sadly in C libraries I can't get away with never defining the types for arguments that might change in the future or adding optional arguments.

My original plan was to tag a release on libmegapixels together with the first 2.x release of Megapixels since these pieces of software are intended to fit together but after thinking about it some more (and some convincing from other people interested in the libmegapixels release) I've decided to tag a 0.1 release.

In an ideal world I can just release code when it's fully done and tested. In this case the long time it takes to get everything ready for use will mean that potential contributors to the code will also be held back from experimenting with the codebase. Especially since a large part of libmegapixels is the config files it ships for specific hardware configurations. If I didn't make any releases then at some point users/developers would be forced to just ship random git commits which is a way worse situation to be in for bug tracking.

With this 0.1 release I want to make it possible to start writing config files for various phones and platforms to test camera pipelines. Hopefully this will also mean any issues with the configuration file format that people might hit will be figured out before I have to tag a "final" 1.x release.

The release

So the initial tagged release of libmegapixels:

located at https://gitlab.com/megapixels-org/libmegapixels/-/tags/0.1.0
Build instructions at https://libme.gapixels.me/building.html
Comes with absolutely no guarantee of stability for the C api of the library
Most likely the config file format is stable but might have small tweaks before the 1.x release

Hopefully this will allow people to start experimenting with the codebase and generate some feedback on it so I'm not just developing this for months and completely overfitting it to the three devices I'm testing on.

I'm planning to make a similar release for libdng soon. That library is also mostly stable but I need to fix up the last parts of the API to allow reading and writing all the required metadata.

The MNT keyboard reviewed

Martijn Braam — Tue, 19 Dec 2023 23:59:01 -0000

MNT Research is one of those few companies that actually releases open source hardware. Instead of just getting a schematic with your hardware (which is great even by itself) there's the full sources for that schematic, the Kicad parts libraries, the sources for the firmware and even documentation on how to use that code.

I received my MNT Standalone Keyboard V3 a few days ago so I've been typing on it now for a bit. This is all happening while I'm recovering from covid so I hope if I read back this post in a few days it is actually somewhat coherent :)

This being a more niche product sadly does make it a bit on the expensive side. But I must say this is by far the most solid keyboard I've owned. My main keyboard on my desktop is a Das Keyboard 4 ultimate. It's a nice keyboard but it doesn't compare to the full machined aluminium frame on the MNT keyboard.

The whole keyboard is mounted on what's basically a 4mm slab of aluminium which has a nice MNT logo machined on it on the bottom

This makes the keyboard feel incredibly solid, even with the rest of the frame taken off it's practically impossible to even bend the keyboard. The second half of the frame is the top edge that screws on the base plate with 8 screws.

This is another very carefully designed aluminium part. In the close-up above you can see the opening for the USB-C connection for the keyboard and the internal cutouts for the display daughterboard with the screw mounting.

The electronics

This keyboard is based around an Atmega32U4 microcontroller. This is the same keyboard PCB as what's shipped in the MNT Reform laptop so there are two connectors on this board. The USB-C connector is what's exposed on the standalone keyboard and the laptop presumably uses the USB header that's beside it.

Beside the USB header is one of the dip switches. SW36 is labeled "STANDALONE" here. This switches the board to use USB power instead of the 3.3V supplied by the laptop mainboard. The ribbon connector is the connection to the OLED display board.

On the left side of the display board there is an empty footprint for the standard Atmega programming header and a serial port that's used to connect to the laptop mainboard. Additionally there's a reset button and SW84 which has the confusing label "RG".

Thanks to the schematics being available in the manual it's easy to find that this is the switch to enable programming. The rest of the interesting parts is hidden somewhere below the display board or on the bottom side of the PCB possibly. I have not taken the keyboard further apart for this review since all the information I'd ever want is already available in the schematics. The keyboard matrix itself is read out by the Atmega directly which provides the full keyboard functionality and the OLED display is on a small daughterboard to slightly raise it towards the front bezel.

Firmware

Since this is one of the 8-bit Atmel parts it's very easy to build firmware using the gcc-avr compiler packaged in various distributions. All the source files are stored in the firmware repository for the various MNT products.

Checking the version of the firmware is pretty easy. With the circle key on the top-right corner of the keyboard the menu on the display opens. You can use the arrow keys to browse to the "System Status" option or just press the "s" key on the keyboard.

Which shows the hardware revision this firmware was built for and the version that was specified when building:

It seems like the "g" at the start of the commit hash was accidental here and it refers to commit 7e73483 in the firmware repository. This seems to be the newest tag when the keyboard was shipped so that makes sense.

So let's change something! The key in the bottom left corner of the keyboard is the Hyper key instead of Ctrl as you'd expect from most keyboards. The Ctrl key is moved in place of the Caps Lock button on normal keyboard layouts which is great for a lot of uses. I never use Hyper though so I want to change that key to be my second Ctrl key.

The readme specifies that the keyboard layout is defined in the various matrix_* files so after reading around a bit it seems like I have to edit matrix_3.h for my keyboard.

Reading the manual again I realized that doing this makes me lose access to the media keys since those are defined as "Hyper+F*" for the various media actions. To fix that I changed the right control button into the Hyper key, this is the button with the three dots on it. My resulting code change:

diff --git a/reform2-keyboard-fw/matrix_3.h b/reform2-keyboard-fw/matrix_3.h
index bb72f6d..f9db133 100644
--- a/reform2-keyboard-fw/matrix_3.h
+++ b/reform2-keyboard-fw/matrix_3.h
@@ -25,7 +25,7 @@
 
 // Sixth row
 #define MATRIX3_DEFAULT_ROW_6 \
-  HID_KEYBOARD_SC_EXECUTE,\
+  HID_KEYBOARD_SC_LEFT_CONTROL,\
   HID_KEYBOARD_SC_LEFT_GUI,\
   HID_KEYBOARD_SC_LEFT_ALT,\
   KEY_SPACE,\
@@ -33,7 +33,7 @@
   KEY_SPACE,\
   KEY_SPACE,\
   HID_KEYBOARD_SC_RIGHT_ALT,\
-  HID_KEYBOARD_SC_RIGHT_CONTROL,\
+  HID_KEYBOARD_SC_EXECUTE,\
   HID_KEYBOARD_SC_LEFT_ARROW,\
   HID_KEYBOARD_SC_DOWN_ARROW,\
   HID_KEYBOARD_SC_RIGHT_ARROW

Now to build this there's a simple Makefile. Since I've already programmed Atmega parts on this machine I already have the compiler installed making this very quick and easy.

I ended up compiling with the following command:

$ make REFORM_KBD_OPTIONS="-DKBD_VARIANT_3 -DKBD_MODE_STANDALONE -DKBD_FW_VERSION=\\\"Martijn\\\""

This is straight from the readme with an additional define to set the firmware version to "Martijn". After building this I got the keyboard.hex file that can be flashed.

The flashing is as simple as running the flash.sh script. This will instruct you to press "Circle + X" to enter flashing mode and then run the necessary commands to flash the keyboard. After running this I noticed that the delete key on the keyboard was no longer a delete key. It turns out I don't have VARIANT_3 but instead VARIANT_3_US. A quick rebuild and reflash also fixes that.

The brightness differences on the display are a camera artifact

Tadaa! My own name in the firmware version field. It's super easy to mess with this firmware.

The keyboard itself

Well the keyboard works just fine as a keyboard. Typing on this keyboard takes a few minutes to get used to compared to my normal keyboard since all the keys are slightly closer together. The split spacebar is also annoying me a bit. It turns out that the left split in the spacebar is exactly the spot where I normally hit the spacebar with my thumb.

The switches are nice and clicky (but silent, I have the version with brown switches in them). Overall the keyboard just does what it needs to. The standard layout is quite unusual but everything can be changed with open firmware so I'm confident I can get to a layout I'm 100% happy with.

Conclusion

This is an extremely solid and very compact keyboard I can easily throw in my backpack. It being a USB-C keyboard makes it fit neatly with all my other random cables I usually take with me.

It might be slightly more expensive than similar keyboards, but I don't know of similar keyboards with a case this rugged and the display functionality (I forgot to mention you can use HID reports from the host to write custom content to the display from your computer). The openness of this product makes the extra cost certainly worth it for me.

I'll probably be messing with the firmware for this keyboard a bit more while I use it. There's some small things to fix like the device reporting the name "LUFA Keyboard Demo Application" in Linux instead of a neater "MNT Keyboard" or something.

Looking closer at the syslog

Martijn Braam — Mon, 11 Dec 2023 15:43:17 -0000

The syslog protocol, it's one of the ancient protocols in the Unix world. For a long time the logging was handled by daemons like syslog-ng and rsyslog, this has now been taken over by journald on a lot of systems. But have you ever wondered how your log messages even end up in /var/log in the first place?

I started looking into syslog implementations when building a replacement for the use of busybox syslogd in postmarketOS. In postmarketOS this daemon is configured to just send syslog messages to an in-memory buffer for logging and never store anything on disk in /var/log. This is mainly to make sure there's no unneeded writes to the flash storage in a lot of the old phones that are supported by postmarketOS. There's a few downsides to this logging implementation though:

No persistent logging of system messages across reboots. Having this would make it easy to check if certain log messages were present on earlier boots when debugging.
Completely unstructured logs while people are pretty much used to journald logging with nice filters

As a replacement I wrote logbookd. It's a tiny syslog daemon that supports disk and memory logging and provides some nice filtering options to be closer to journald. The bulk of this work is handled by doing structured logging into SQLite.

So how does the syslog work

The way the syslog works is incredibly simple. The syslog daemon opens a unix domain socket at /dev/log. Applications connect to this socket and write log messages in the syslog format and the syslog daemon takes care of filtering those and putting them in the various files in /var/log.
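Both sides of that socket fit in a few lines of Python. This is just a sketch to show the mechanism: it assumes a datagram socket (some daemons use a stream socket instead) and it binds a path in a temporary directory, since binding the real /dev/log needs root and would get in the way of the running syslog daemon. The function names are made up for the example.

```python
import os
import socket
import tempfile


def open_log_socket(path):
    """Daemon side: bind a unix datagram socket at the given path."""
    if os.path.exists(path):
        os.unlink(path)
    server = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    server.bind(path)
    return server


def send_log(path, line):
    """Application side: connect to the socket and write one log line."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        client.connect(path)
        client.send(line.encode("utf-8"))
    finally:
        client.close()


def demo():
    """Send one message through a private socket and return what arrived."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "log")
        server = open_log_socket(path)
        try:
            send_log(path, "<13>Dec 11 15:21:50 laptop test: Hi")
            return server.recv(8192).decode("utf-8")
        finally:
            server.close()


if __name__ == "__main__":
    print(demo())
```

Everything after the recv() call, like parsing and deciding where the line is stored, is up to the daemon.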

The complication of this is that there is no real syslog protocol. There are two standards for it though. There is RFC 3164 and RFC 5424 which both describe the syslog protocol. The 3164 document was only created in 2001 and describes what various implementations are doing in the wild. It's RFC 5424 that actually nails down a specific format.

I wrote a parser for the 5424 format initially since that's the newest standard and it's by far the easiest to parse. An RFC 5424 log message looks like this:

<13>1 2023-12-11T14:56:59.0189+01:00 laptop test - - [timeQuality tzKnown="1"] Hi

The first part here between the angular brackets is the PRI value. It encodes the logging facility and severity as one number. The least significant 3 bits encode the severity on a scale of 0-7 and the other bits encode one of the 24 facilities (numbered 0-23) that are defined. Some examples of the facilities are:

0 for kernel messages
1 for generic userspace messages
2 for the mail system

Most of the other numbers are for older services like UUCP and FTP and for some numbered user-defined codes. In the example above the 13 means facility 1 (user) and severity 5 (notice).
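Splitting the PRI value is just a shift and a mask. A small sketch, with the severity names taken from the RFC and function names picked for this example:

```python
# Severity names in the order defined by the syslog RFCs (0 through 7).
SEVERITIES = [
    "emerg", "alert", "crit", "err",
    "warning", "notice", "info", "debug",
]


def decode_pri(pri):
    """Split a PRI value into (facility, severity)."""
    return pri >> 3, pri & 0x07


def encode_pri(facility, severity):
    """Combine a facility and severity back into a PRI value."""
    return (facility << 3) | severity


if __name__ == "__main__":
    facility, severity = decode_pri(13)
    print(facility, SEVERITIES[severity])
```

For the example message this gives facility 1 with severity 5, which is "notice".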

The other parts of this message are in order:

The protocol version number which is set to 1 here.
A timestamp with timezone for the log message
The laptop is the hostname for the message, this will be set to - when NULL
The test part is the application name. This can also be - for NULL
The next field is called PROCID and is set to - for NULL in my case. According to the standard it might be used for the pid but is mostly implementation defined.
The second null is the MSGID, it can define a message type from the specific service, it will also be null in most cases.
The next part is [timeQuality tzKnown="1"] which is the STRUCTURED-DATA field. It can contain any implementation defined data. This is a subset of the structured data created by the logger tool used to create test messages. This field can also be just - for no structured data.
Finally the actual log message. That's just Hi in this case.

Writing a parser for this format is relatively straightforward. In the logbookd implementation there's a column for every one of these fields in the logging table and the message is split up according to these rules.
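To give an idea of how little is needed, here is a regex based sketch in Python. This is not the logbookd parser; the field names are chosen for this example and it assumes the structured data does not contain escaped closing brackets, which the RFC does allow.

```python
import re

# One named group per field, in the order described above.
RFC5424 = re.compile(
    r"<(?P<pri>\d{1,3})>(?P<version>\d{1,2}) "
    r"(?P<timestamp>\S+) (?P<hostname>\S+) (?P<app>\S+) "
    r"(?P<procid>\S+) (?P<msgid>\S+) "
    r"(?P<sd>-|(?:\[[^\]]*\])+)"
    r"(?: (?P<msg>.*))?",
    re.DOTALL,
)


def parse_rfc5424(line):
    """Return a dict with the message fields, or None if it doesn't match."""
    match = RFC5424.fullmatch(line)
    if match is None:
        return None
    fields = match.groupdict()
    pri = int(fields.pop("pri"))
    fields["facility"] = pri >> 3
    fields["severity"] = pri & 0x07
    fields["version"] = int(fields["version"])
    # A single dash is the NULL value for these fields.
    for key in ("timestamp", "hostname", "app", "procid", "msgid", "sd"):
        if fields[key] == "-":
            fields[key] = None
    return fields


if __name__ == "__main__":
    print(parse_rfc5424(
        '<13>1 2023-12-11T14:56:59.0189+01:00 laptop test - - '
        '[timeQuality tzKnown="1"] Hi'
    ))
```

Returning None for anything that doesn't match makes it easy to fall back to another parser for messages in a different format.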

There is a fatal flaw in the RFC 5424 specification though: nobody is using it. None of the log messages on my running systems are actually in this format. It looks like practically all software uses RFC 3164, which is a fancy way of saying they do whatever they want.

RFC 3164

So this is actually the true specification for syslog messages being used in the wild. Let's look at one of these messages:

<13>Dec 11 15:21:50 laptop test: Hi

It's a lot simpler! But not actually. This is a pretty minimal message. The initial part is the same as the RFC 5424 message, the PRI is luckily parsed the exact same way. There is no version indicator though and it does not use an ISO timestamp format.

The more problematic issue with this format though is that it does support a lot more data but it's pretty badly defined. Even all the parts shown in the example above are optional. The most minimal syslog message that is still up to this spec is Hi.

It's also somewhat valid to send messages with a badly formatted timestamp and it's up to the syslogd to fix up the timestamp in the message. This also makes it very easy to make it actually parse parts of the timestamp as the hostname since this is all badly defined and space separated.

Since there is no official field for the pid of a process this is usually appended to the application name in square brackets.

The logbookd implementation is mostly based on the way these old messages are parsed in rsyslog and tries to not guess parts. This means only the timestamp, app, hostname and message fields are filled in.
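A lenient parser for this format ends up with almost everything optional. The sketch below is a simplified regex version in Python and not the rsyslog or logbookd logic; the group names and the exact timestamp pattern are assumptions for this example.

```python
import re

# Every part except the message itself is optional.
RFC3164 = re.compile(
    r"(?:<(?P<pri>\d{1,3})>)?"
    r"(?:(?P<timestamp>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) )?"
    r"(?:(?P<app>[^\s:\[]+)(?:\[(?P<pid>\d+)\])?: )?"
    r"(?P<msg>.*)",
    re.DOTALL,
)


def parse_rfc3164(line):
    """Return a dict with whatever fields could be found in the line."""
    # The pattern can match any string, so fullmatch never returns None.
    fields = RFC3164.fullmatch(line).groupdict()
    if fields["pri"] is not None:
        pri = int(fields["pri"])
        fields["facility"] = pri >> 3
        fields["severity"] = pri & 0x07
    return fields


if __name__ == "__main__":
    print(parse_rfc3164("<13>Dec 11 15:21:50 laptop test: Hi"))
    print(parse_rfc3164("Hi"))
```

The ambiguity is easy to trigger: for a message without a hostname like <13>Dec 11 15:21:50 test: Hi this sketch reports test: as the hostname and finds no application name at all.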

Kernel logging

Not all logging in the system comes from userspace. On Linux there's also the kernel log ringbuffer that can be read from /dev/kmsg. Reading from this file will return all the log messages in the kernel ringbuffer and also makes it possible to stream new log messages with further reads. The log messages from the kernel are in a similar but different format than the syslog socket:

6,1004,5150172365,-;hid-generic 0003: hiddev96,hidraw2: USB HID device on usb-0000:00:14
 SUBSYSTEM=hid
 DEVICE=+hid:0003

The first field in the kernel message is again the PRI. This follows the same numbering as the syslog RFCs but it's not in angular brackets this time. In this case it's facility 0 (the kernel) and severity 6 (info). The second field is the KSEQ. This is a number that counts up for every log message since boot. The logbookd implementation uses this to de-duplicate the kernel log messages after opening the file since it will always return the old kernel log messages first.

After that comes the timestamp. Instead of a string that needs parsing this is a plain integer, the number of microseconds since boot on the kernel's monotonic clock, so it's way easier to deal with. The field after the timestamp is - indicating NULL, this is the flags field.

After the semicolon the actual kernel log message starts. This is the message as is rendered in the dmesg utility. After the log message there's a newline but the log line doesn't end there! The structured data is defined as indented continuation lines after the message itself and this contains some easier machine-parsable data that is usually hidden in dmesg.
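Taking one of these records apart can be sketched like this in Python. The function and key names are made up for the example and the timestamp is kept as the raw integer; this is not the code from logbookd.

```python
def parse_kmsg(record):
    """Split one /dev/kmsg record into its header fields and key/values."""
    lines = record.rstrip("\n").split("\n")
    # Everything before the first semicolon is the comma separated header.
    header, _, message = lines[0].partition(";")
    parts = header.split(",")
    pri = int(parts[0])
    data = {}
    # The continuation lines start with a space and hold KEY=value pairs.
    for line in lines[1:]:
        if line.startswith(" "):
            key, _, value = line[1:].partition("=")
            data[key] = value
    return {
        "facility": pri >> 3,
        "severity": pri & 0x07,
        "seq": int(parts[1]),
        "timestamp": int(parts[2]),
        "flags": parts[3] if len(parts) > 3 else None,
        "message": message,
        "data": data,
    }


if __name__ == "__main__":
    example = (
        "6,1004,5150172365,-;hid-generic 0003: hiddev96,hidraw2: "
        "USB HID device on usb-0000:00:14\n"
        " SUBSYSTEM=hid\n"
        " DEVICE=+hid:0003\n"
    )
    print(parse_kmsg(example))
```

Keeping track of the highest seq value seen so far is enough to skip records that were already processed.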

Systemd journald

So everything changed when journald was introduced. Figuring out how this all works involves diving into the systemd source code. Systemd provides several unix sockets related to logging in /var/run/systemd/journal:

dev-log this is symlinked to /dev/log and receives syslog formatted lines and writes it to the journal
stdout is a socket that receives logs from systemd units. This is what the systemd-cat command connects to. It writes a header on connection to give the application metadata and then the stdout or stderr is just connected straight to this socket.
socket receives the log messages in the binary journald format

There are a few other fancy things that journald does. It is possible to filter your log messages with the --boot argument. If no argument is supplied it will only show messages from the current boot. If you specify a negative number it's possible to get only log messages from a specific previous bootup.

The way this is done is by reading from /proc/sys/kernel/random/boot_id. This is a value generated by the kernel on bootup. It is a UUID generated from random data. These are also the values you see when you run journalctl --list-boots. The BOOT_ID value shown there is this UUID with the dashes removed.

My logbookd implementation also reads the boot_id on startup and stores it with the logs, this allows filtering in the exact same way with the logread -b parameter.
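Reading the boot id and turning it into the dash-less form only takes a few lines. A small Python sketch, with function names invented for the example:

```python
import uuid

BOOT_ID_PATH = "/proc/sys/kernel/random/boot_id"


def normalize_boot_id(raw):
    """Turn the UUID text from the kernel into 32 hex characters."""
    return uuid.UUID(raw.strip()).hex


def read_boot_id(path=BOOT_ID_PATH):
    """Read the boot id of the running system (Linux only)."""
    with open(path) as handle:
        return normalize_boot_id(handle.read())


if __name__ == "__main__":
    print(read_boot_id())
```

Going through uuid.UUID instead of just stripping the dashes also validates that the value really is a UUID.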

Logging to a database

So the main departure journald and logbookd make from the older syslog daemons is that they don't log to plain-text files. Journald has a custom database format the messages are stored in and logbookd stores messages in an SQLite database.

Structured logging to a database has a few nice upsides. The main one is being able to do way more detailed filtering than what is reasonably possible with grep. It's a lot easier to filter on a specific date and time range in a database and due to database indexes this is still fast.

One of the other main reasons for using SQLite in logbookd is that the implementation in postmarketOS was configured to only log to memory. Using SQLite as logging back-end meant that it's easily possible to replicate this by writing to an in-memory database which is already supported by SQLite.

The final thing added to logbookd is the middle ground between in-memory and on-disk logging: the reduced writes mode. In this mode the syslog is written to an in-memory database but when receiving a SIGINT, SIGTERM or SIGUSR1 signal the logbookd daemon will open the on-disk database and let SQLite do a database migration. This means that SQLite will append the new loglines to the disk without rewriting all the existing logs there. On bootup this database is restored again so the logging system behaves as if it's configured to do normal on-disk logging.
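The idea behind this mode can be sketched with the sqlite3 module in Python. This is not how logbookd itself does it: the table layout is invented for the example, and the copy is done with ATTACH plus an INSERT ... SELECT, which is one of several ways to append rows to an on-disk database.

```python
import sqlite3

# Hypothetical schema, the {} is filled with an optional database prefix.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS {}log "
    "(boot_id TEXT, time TEXT, app TEXT, msg TEXT)"
)


def open_memory_log():
    """Create the in-memory database that receives all log lines."""
    db = sqlite3.connect(":memory:")
    db.execute(SCHEMA.format(""))
    return db


def flush_to_disk(db, path):
    """Append the in-memory rows to the database at path, then clear them."""
    db.commit()  # ATTACH is not allowed inside an open transaction
    db.execute("ATTACH DATABASE ? AS disk", (path,))
    try:
        db.execute(SCHEMA.format("disk."))
        cursor = db.execute("INSERT INTO disk.log SELECT * FROM main.log")
        flushed = cursor.rowcount
        db.execute("DELETE FROM main.log")
        db.commit()
    finally:
        db.execute("DETACH DATABASE disk")
    return flushed


if __name__ == "__main__":
    log = open_memory_log()
    log.execute(
        "INSERT INTO log VALUES (?, ?, ?, ?)",
        ("05c3f2833bae4b2a8431210dd63310e0", "Dec 11 15:21:50", "test", "Hi"),
    )
    print(flush_to_disk(log, "logbook-example.db"), "rows written")
```

In a daemon the flush_to_disk call would be triggered from the signal handlers, so the flash storage only sees a write on shutdown or on request.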

You can use this now

If you're running postmarketOS edge and you have updated to the latest version your installation should've migrated to the logbookd logging daemon. The logread utility implements the common options the busybox logread command already had. For normal use this means that there's not much difference except that the log output from the logread utility is now colored and contains kernel logs.

Some examples of the new things that are now possible:

$ logread -b list
ID   BOOT ID                                FIRST ENTRY       LAST ENTRY
 0   05c3f283-3bae-4b2a-8431-210dd63310e0   Dec 11 16:33:59   Dec 11 16:34:05
-1   f3ea2fa1-6f9e-4e82-bd0b-201091fcb5b4   Dec 07 18:21:06   Dec 07 18:25:50

$ logread -b 0 -n 2
[Dec 11 16:35:09] daemon dleyna-renderer-service[18060]: Client :1.166 lost
[Dec 11 16:35:10] daemon dleyna-renderer-service[18060]: dLeyna: Exit

$ logread -b -1 -n 1
[Dec 07 18:25:18] kern kernel: perf: interrupt took too long (2531 > 2500), lowering kernel.perf_event_max_sample_rate to 79000

$ logread -b -1 -u logbookd -n 1
[Dec 07 18:25:37] syslog logbookd: Ready to process log messages

Being able to see interleaved kernel and userspace messages also makes certain scenarios a lot easier to debug.

Hopefully this makes a few things easier to debug. There's a bunch of software that also logs directly into /var/log in separate files, this has not been replaced by logbookd and is also not directly query-able by this new system. For the rest of the log messages enjoy the new colors :)