Linux - BrixIT Blog https://blog.brixit.nl/tag/linux/page/1 Mon, 28 Oct 2024 22:50:55 -0000 60 Building a timeseries database for fun https://blog.brixit.nl/building-a-timeseries-db-for-fun/ 103 Linux Martijn Braam Mon, 28 Oct 2024 22:50:55 -0000<p>Everyone who has tried to make some nice charts in Grafana has probably come across timeseries databases, for example InfluxDB or Prometheus. I've deployed a few instances of these things for various tasks and I've always been annoyed by how they work. But this is a consequence of having great performance, right?</p> <p>The thing is... most of the dataseries I'm working with don't need that level of performance. If you're just logging the power delivered by a solar inverter to a Raspberry Pi then you don't need a datastore for 1000 datapoints per second. In my experience with timeseries databases performance is not the issue; the issue is that the queries I want to do, which seem very simple, are practically impossible, especially when combined with Grafana.</p> <p>One example is having a daily total of a measurement as a bar graph for some long-term history, with the bars aligned to the day boundary instead of to 24-hour offsets based on my current time. Another is being able to actually query the data from a single month to get a total instead of querying chunks of 30.5 days.</p> <p>But most importantly, writing software is fun and I want to write something that does this for me. Not everything has to scale to every use case from a single Raspberry Pi to a list of Fortune 500 company logos on your homepage.</p> <h2>A prototype</h2> <p>Since I don't care about high performance and I want to prototype quickly I started with a Python Flask application. 
This is mainly because I already wrote a bunch of Python glue before to pump the data from my MQTT broker into InfluxDB or Prometheus so I can just directly integrate that.</p> <p>I decided that as a storage backend just using a SQLite database will be fine and to integrate with Grafana I'll just implement the relevant parts of the Prometheus API and query language.</p> <p>To complete it I made a small web UI for configuring and monitoring the whole thing, mainly to make it easy to adjust the MQTT topic mapping without editing config files and restarting the server.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1730152571/image.png" class="kg-image"></figure> <p>I've honestly probably spent way too much time writing random JavaScript for the MQTT configuration window. I had already written an MQTT library for Flask that allows using the Flask route syntax to extract data from the topic so I reused that backend. To make that work nicely I also wrote a simple parser for the syntax in JavaScript to visualize the parsing while you type and give you dropdowns for selecting the values.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1730152874/image.png" class="kg-image"></figure> <p>This is not at all related to the dataseries part but at least it allows me to easily get a bunch of data into my test environment while writing the rest of the code.</p> <h2>The database</h2> <p>For storing the data I'm using the <code>sqlite3</code> module in Python. 
I dynamically generate a schema based on the data that's coming in, with one table per measurement.</p> <p>There are two kinds of devices on my MQTT broker: some send the data as a JSON blob and some just send single values to various topics.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1730153124/image.png" class="kg-image"><figcaption>Example data from the MQTT broker</figcaption></figure> <p>The JSON blobs are still considered a single measurement and all the top-level values get stored in separate columns. Later in the querying stage the specific column is selected.</p> <p>My worst case is a bunch of ESP devices that measure various unrelated things and output JSON to the topic shown above. I have a single ingestion rule in my database that grabs <code>devices/hoofdweg/&lt;sensor&gt;</code> and dumps it in a table that has the columns for the various sensors, which ends up with a schema like this:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1730153341/image.png" class="kg-image"></figure> <p>A timestamp is stored without any consideration for timezones, since in practically all cases a house isn't located right on a timezone boundary. The tags are stored in separate columns with a <code>tag_</code> prefix and the fields are stored in columns with a <code>field_</code> prefix. The finest granularity of the data is a single second since I don't store the timestamp as a float.</p> <p>A lot of the queries I do don't need every single datapoint though; instead I just need hourly, daily or monthly data. 
For that a second table is created with the same structure but with aggregated data:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1730153646/image.png" class="kg-image"></figure> <p>This contains a row for every hour with the <code>min()</code>, <code>max()</code> and <code>avg()</code> of every field; it also contains a row for every day and one for every month. This makes it possible to throw away the data that has single-second granularity after a preconfigured amount of time and keep the aggregated data way longer. For querying you explicitly select which table you want the data from.</p> <h2>The querying</h2> <p>To make the Grafana UI not complain too much I kept the language syntax the same as Prometheus but simply implemented fewer of the features because I don't use most of them. The supported features right now are:</p> <ul><li>Simple math queries like <code>1+1</code>. This can only do addition and is only here to satisfy the Grafana connection tester.</li> <li>Selecting a single measurement from the database and filtering on tags using the braces like <code>my_sensors{sensor=&quot;solar&quot;}</code></li> <li>Selecting a time granularity with brackets like <code>example_sensor[1h]</code>. This only supports <code>1h</code>, <code>1d</code> and <code>1M</code> and selects which rows are queried</li> <li>The aggregate functions like <code>max(my_sensors[1h])</code>. When querying the reduced table this selects the columns with the <code>max_</code> prefix; when selecting the realtime data it uses the SQLite <code>max()</code> function.</li> </ul> <p>This is also just about enough to make the graphical query builder in Grafana work for most cases. The other setting used for the queries is the <code>step</code> value that Grafana calculates and passes to the Prometheus API. 
For the reduced table this is completely ignored and for the realtime table this is converted to SQL to do aggregation across rows.</p> <p>As an example the query <code>avg(sensors{sensor="solar", _col="voltage"})</code> gets translated to:</p> <div class="highlight"><pre><span></span><span class="k">SELECT</span><span class="w"></span>
<span class="w">  </span><span class="n">instant</span><span class="p">,</span><span class="w"></span>
<span class="w">  </span><span class="n">tag_sensor</span><span class="p">,</span><span class="w"></span>
<span class="w">  </span><span class="k">avg</span><span class="p">(</span><span class="n">field_voltage</span><span class="p">)</span><span class="w"> </span><span class="k">as</span><span class="w"> </span><span class="n">field_voltage</span><span class="w"></span>
<span class="k">FROM</span><span class="w"> </span><span class="n">series_sensors</span><span class="w"></span>
<span class="k">WHERE</span><span class="w"> </span><span class="n">instant</span><span class="w"> </span><span class="k">BETWEEN</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="k">AND</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- Grafana time range</span>
<span class="w">  </span><span class="k">AND</span><span class="w"> </span><span class="n">tag_sensor</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="o">?</span><span class="w"> </span><span class="c1">-- solar</span>
<span class="k">GROUP</span><span class="w"> </span><span class="k">BY</span><span class="w"> </span><span class="n">instant</span><span class="o">/</span><span class="mi">30</span><span class="w"> </span><span class="c1">-- 30 is the step value from Grafana</span>
</pre></div> <p>To get nice aligned hourly data for a bar chart the query simply changes to <code>avg(sensors{sensor="solar", _col="voltage"}[1h])</code> which generates this query:</p> 
<pre><code>SELECT instant, date, hour, tag_sensor, avg_voltage
FROM reduced_sensors
WHERE instant BETWEEN ? AND ? -- Grafana time range
  AND tag_sensor = ? -- solar
  AND scale = 0 -- hourly</code></pre> <p>This reduced data is generated as a background task in the server, which makes sure that the row with the aggregate of a single hour covers the datapoints that fit exactly in that hour, not shifted by the local time when querying like I now have issues with in Grafana:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1730154759/image.png" class="kg-image"><figcaption>The query running against the old Prometheus database</figcaption></figure> <p>The bars in this chart don't align with the dates because this screenshot wasn't made at midnight. The data in the bars is also only technically correct when viewing the Grafana dashboard at midnight since at other hours it selects data from other days as well. If I view this at 13:00 then I get the data from 13:00 the day before to today which is a bit annoying in most cases and useless in the case of this chart because the <code>daily_total</code> metric in my solar inverter is reset at night and I pick the highest value.</p> <p>For monthly bars this issue gets worse because it's apparently impossible to accurately get monthly data from the timeseries databases I've used. Because I'm pregenerating this data instead of using magic intervals this also Just Works(tm) in my implementation.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1730155171/image.png" class="kg-image"><figcaption>The same sort of query on the Miniseries backend, hourly instead because I don&#x27;t have enough demo data yet.</figcaption></figure> <h2>Is this better?</h2> <p>It is certainly in the prototype stage and has not had enough testing to find weird edge cases. 
It does, though, provide all the features I need to recreate my existing home automation dashboard, and performance is absolutely fine. The next step here is to implement a feature to lie to Grafana about the date of the data to actually use the heatmap chart to show data from multiple days as multiple rows.</p> <p>Once the kinks are worked out in this prototype it's probably a good idea to rewrite it in something like Go, because while a lot of the data processing is done in SQLite the first bottleneck will probably be the single-threaded nature of the webserver and the MQTT ingestion code.</p> <p>The source code is online at <a href="https://git.sr.ht/~martijnbraam/miniseries">https://git.sr.ht/~martijnbraam/miniseries</a></p> Making a Linux-managed network switch https://blog.brixit.nl/making-a-linux-managed-network-switch/ 102 Linux Martijn Braam Wed, 03 Jul 2024 14:10:04 -0000<p>Network switches are simple devices: packets go in, packets go out. Luckily people have figured out how to make it complicated instead and invented managed switches.</p> <p>Usually this is done by adding a web interface for configuring the settings and seeing things like port status. If you have more expensive switches then you'd even get access to some alternate interfaces like telnet and serial console ports.</p> <p>There is a whole second category of managed switches though that people don't initially think of. These are the network switches that are inside consumer routers. 
These routers are little Linux devices that have a switch chip inside of them; one or more ports are internally connected to the CPU and the rest are on the outside as physical ports.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1719959978/RB2011UiAS-160620170256_160656.png" class="kg-image"><figcaption>Mikrotik RB2011 block diagram from mikrotik.com</figcaption></figure> <p>Here is an example of such a device that actually has this documented. I always thought that the configuration of these switch-connected ports was just a nice abstraction by the web interface but I was surprised to learn that with the DSA and switchdev subsystems in Linux these ports are actually fully functioning "local" network ports. Because this is practically only available inside integrated routers, it's unfortunately pretty hard to play around with.</p> <p>What is shown as a single line on this diagram is actually the connection between the SoC of the router and the switch over the SGMII bus (or maybe RGMII in this case) and a management bus that's either SMI or MDIO. Network switches have a lot of these fun acronyms that even with the full name written out make little sense unless you know how all of this fits together.</p> <p>Controlling your standard off-the-shelf switch using this system simply isn't possible because the required connections of the switch chip aren't exposed for this. So there's only one option left...</p> <h2>Making my own gigabit network switch</h2> <p>Making my own network switch can't be <i>that</i> hard, right? Those things are available for the price of a cup of coffee and are most likely highly integrated to reach that price point. 
Since I don't see any homemade switches around on the internet I guess the chips for those must be pretty hard to get...</p> <figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1719960715/image.png" class="kg-image"></figure> <p>Nope, very easy to get. There's even a datasheet available for these. So I created a new KiCad project and started creating some footprints and symbols.</p> <p>I'm glad there's any amount of datasheet available for this chip since that's not usually the case for Realtek devices, but it's still pretty minimal. I resorted to finding any devices that have schematics available for similar Realtek chips to find out how to integrate it, and to reading a lot of documentation on how to implement ethernet in a design at all.</p> <p>The implementation for the chip initially looked very complicated: there are about 7 different power nets it requires and there are several pretty badly documented communication interfaces. After going through other implementations it seems like the easiest way to power it is to just connect all the nets with overlapping voltage ranges together, and you're left with only needing a 3.3V and a 1.1V regulator.</p> <p>The extra communication buses are for all the extra ports I don't seem to need. The switch chip I selected is the RTL8367S which is a very widely used 5-port gigabit switch chip, but it's actually not a 5-port chip. It's a 7-port switch chip where 5 ports have an integrated PHY and two are for CPU connections.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1719961532/image.png" class="kg-image"><figcaption>CPU connection block diagram from the RTL8367S datasheet</figcaption></figure> <p>My plan is different though: while there are these CPU ports available there is actually nothing in the Linux switchdev subsystem that requires the CPU connection to be to those ports. 
Instead I'll be connecting to port 0 on the switch with a network cable and as far as the switchdev driver knows there's no ethernet PHY in between.</p> <p>The next hurdle is the configuration of the switch chip. There are several configuration systems available and the datasheet does not really describe what the minimum required setup is to actually get it to function as a regular dumb switch. To sum up the configuration options of the chip:</p> <ul><li>There are 8 pins on the chip that are read when it&#x27;s starting up. These pins are shared with the LED pins for the ports so that makes designing pretty annoying. Switching the setting from pull-up to pull-down also requires the LED to be connected in the opposite orientation.</li> <li>There&#x27;s an I2C bus that can be connected to an EEPROM chip. The pins for this are shared with the SMI bus that I require to make this chip talk to Linux though. There is a pin configuration to select one of two EEPROM size ranges but the datasheet does not further specify what this setting actually changes.</li> <li>There&#x27;s an SPI bus that supports connecting a NOR flash chip to it. This can store either configuration registers or firmware for the embedded 8051 core depending on the configuration of the bootup pins. The SPI bus pins are also shared with one of the CPU network ports.</li> <li>There is a serial port available but my guess is that it probably does nothing at all unless there&#x27;s firmware loaded in the 8051.</li> </ul> <p>My solution for figuring this out is to just order a board and solder connections differently until it works. I've added a footprint for a flash chip that I ended up not needing and for all the configuration pins I added solder jumpers. I left out all the LEDs since making that configurable would be pretty hard.</p> <p>The next step is figuring out how to do ethernet properly. 
There has been a lot of documentation written about this and it all makes it sound like gigabit ethernet requires perfect precision engineering, impedance managed boards and a blessing from the ethernet gods themselves to work. This does not seem to match up with the reality that these switches are very very cheaply constructed and seem to work just fine. So I decided to open up a switch to check how many of these coupling capacitors and impedance matching planes are actually used in a real design. The answer seems to be that it doesn't matter that much.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1719962591/image.png" class="kg-image"></figure> <p>This is the design I have ended up with now but it is not what is on my test PCB. I got it almost right the first time though :D</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1719962813/image.png" class="kg-image"></figure> <p>The important part seems to be matching the skew within each pair, but matching the length of the 4 network pairs to each other is completely useless. This is mainly because network cables don't have the same twisting rate for the 4 pairs and so the lengths of these are already significantly different inside the cable.</p> <p>The pairs between the transformer and the RJ45 jack have their own ground plane that's coupled to the main ground through a capacitor. The pairs after the transformer are just on the main board ground fill.</p> <p>What I did wrong on my initial board revision was forgetting the capacitor that connects the center taps of the transformer on the switch side to ground, making the signals on that side referenced to board ground. This makes ethernet very much not work anymore so I had to manually cut tiny traces on the board to disconnect that short to ground. In my test setup the capacitor just doesn't exist and all the center taps float. 
This seems to work just fine but the final design does have that capacitor added.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1720003020/fixed.JPG" class="kg-image"><figcaption>Cut ground traces on the ethernet transformer</figcaption></figure> <p>The end result is this slightly weird gigabit switch. It has 4 ports facing one direction and one facing backwards and it is powered over a 2.54mm pin header. I have also added a footprint for a USB Type-C connector to have an easy way to power it without bringing out the DuPont wires.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1720007603/IMG_20240626_221246.jpg" class="kg-image"></figure> <h2>Connecting it to Linux</h2> <p>For my test setup I've picked the PINE64 A64-lts board since it has the connectors roughly in the spots where I want them. It not being an x86 platform is also pretty important because configuration requires a device tree change, which can't be done on a platform that doesn't use device trees.</p> <p>The first required thing was rebuilding the kernel for the board since most kernels simply don't have these kernel modules enabled. For this I enabled these options:</p> <ul><li><code>CONFIG_NET_DSA</code> for the Distributed Switch Architecture system</li> <li><code>CONFIG_NET_DSA_TAG_RTL8_4</code> for having port tagging for this Realtek switch chip</li> <li><code>CONFIG_NET_SWITCHDEV</code> the driver system for network switches</li> <li><code>CONFIG_NET_DSA_REALTEK</code>, <code>CONFIG_NET_DSA_REALTEK_SMI</code>, <code>CONFIG_NET_DSA_REALTEK_RTL8365MB</code> for the actual switch chip driver</li> </ul> <p>Then the more complicated part was figuring out how to actually get this all loaded. In theory it is possible to create a device tree overlay for this and get it loaded by U-Boot. 
I decided to not do that and patch the device tree for the A64-lts board instead since I'm rebuilding the kernel anyway. The device tree change I ended up with is this:</p> <pre><code>diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts
index 596a25907..10c1a5187 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64-lts.dts
@@ -18,8 +18,78 @@
         led {
             gpios = &lt;&amp;r_pio 0 7 GPIO_ACTIVE_LOW&gt;; /* PL7 */
         };
     };
+
+switch {
+    compatible = &quot;realtek,rtl8365rb&quot;;
+    mdc-gpios = &lt;&amp;pio 2 5 GPIO_ACTIVE_HIGH&gt;; // PC5
+    mdio-gpios = &lt;&amp;pio 2 7 GPIO_ACTIVE_HIGH&gt;; // PC7
+    reset-gpios = &lt;&amp;pio 8 5 GPIO_ACTIVE_LOW&gt;; // PH5
+    realtek,disable-leds;
+
+    mdio {
+        compatible = &quot;realtek,smi-mdio&quot;;
+        #address-cells = &lt;1&gt;;
+        #size-cells = &lt;0&gt;;
+
+        ethphy0: ethernet-phy@0 {
+            reg = &lt;0&gt;;
+        };
+
+        ethphy1: ethernet-phy@1 {
+            reg = &lt;1&gt;;
+        };
+
+        ethphy2: ethernet-phy@2 {
+            reg = &lt;2&gt;;
+        };
+
+        ethphy3: ethernet-phy@3 {
+            reg = &lt;3&gt;;
+        };
+
+        ethphy4: ethernet-phy@4 {
+            reg = &lt;4&gt;;
+        };
+    };
+
+    ports {
+        #address-cells = &lt;1&gt;;
+        #size-cells = &lt;0&gt;;
+
+        port@0 {
+            reg = &lt;0&gt;;
+            label = &quot;cpu&quot;;
+            ethernet = &lt;&amp;emac&gt;;
+        };
+
+        port@1 {
+            reg = &lt;1&gt;;
+            label = &quot;lan1&quot;;
+            phy-handle = &lt;&amp;ethphy1&gt;;
+        };
+
+        port@2 {
+            reg = &lt;2&gt;;
+            label = &quot;lan2&quot;;
+            phy-handle = &lt;&amp;ethphy2&gt;;
+        };
+
+        port@3 {
+            reg = &lt;3&gt;;
+            label = &quot;lan3&quot;;
+            phy-handle = &lt;&amp;ethphy3&gt;;
+        };
+
+        port@4 {
+            reg = &lt;4&gt;;
+            label = &quot;lan4&quot;;
+            phy-handle = &lt;&amp;ethphy4&gt;;
+        };
+    };
+};
 };
</code></pre> <p>It loads the driver for the switch through the <code>realtek,rtl8365rb</code> compatible string. This driver supports a whole range of Realtek switch chips including the RTL8367S I've used in this design. 
I've removed the CPU ports from the documentation example and just added the definitions of the 5 regular switch ports.</p> <p>The important part is in <code>port@0</code>: this is the port that is facing backwards on my switch and is connected to the A64-lts. I've linked it up to <code>&amp;emac</code> which is a reference to the ethernet port of the computer. The rest of the ports are linked up to their respective PHYs in the switch chip.</p> <p>At the top of the code there are also 3 GPIOs defined; these link up to SDA/SCL and Reset on the switch PCB to make the communication work. After booting up the system the result is this:</p> <pre><code>1: lo: &lt;LOOPBACK&gt; mtu 65536 qdisc noop state DOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: &lt;BROADCAST,MULTICAST&gt; mtu 1508 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
3 lan1@eth0: &lt;BROADCAST,MULTICAST,M-DOWN&gt; mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
4 lan2@eth0: &lt;BROADCAST,MULTICAST,M-DOWN&gt; mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
5 lan3@eth0: &lt;BROADCAST,MULTICAST,M-DOWN&gt; mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
6 lan4@eth0: &lt;BROADCAST,MULTICAST,M-DOWN&gt; mtu 1500 qdisc noop state DOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff</code></pre> <p>I have the <code>eth0</code> device here like normal and then I have the 4 interfaces for the ports on the switch I defined in the device tree. 
To make it actually do something the interfaces need to be brought online first:</p> <pre><code>$ ip link set eth0 up
$ ip link set lan1 up
$ ip link set lan2 up
$ ip link set lan3 up
$ ip link set lan4 up
$ ip link
1: lo: &lt;LOOPBACK&gt; mtu 65536 qdisc noop state DOWN qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: eth0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1508 qdisc mq state UP qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
3: lan1@eth0: &lt;NO-CARRIER,BROADCAST,MULTICAST,UP&gt; mtu 1500 qdisc noqueue state LOWERLAYERDOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
4: lan2@eth0: &lt;NO-CARRIER,BROADCAST,MULTICAST,UP&gt; mtu 1500 qdisc noqueue state LOWERLAYERDOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
5: lan3@eth0: &lt;BROADCAST,MULTICAST,UP,LOWER_UP&gt; mtu 1500 qdisc noqueue state UP qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff
6: lan4@eth0: &lt;NO-CARRIER,BROADCAST,MULTICAST,UP&gt; mtu 1500 qdisc noqueue state LOWERLAYERDOWN qlen 1000
    link/ether 02:ba:6f:0c:21:c4 brd ff:ff:ff:ff:ff:ff</code></pre> <p>Now that the switch is up you can see I have a cable plugged into the third port. This system hooks into a lot of the Linux networking so it Just Works(tm) with a lot of tooling. Some examples:</p> <ul><li>Add a few of the lan ports into a standard Linux bridge and the switchdev system will bridge those ports together in the switch chip so Linux doesn&#x27;t have to forward that traffic.</li> <li>Things like <code>ethtool lan3</code> just work to get information about the link, and <code>ethtool -S lan3</code> returns all the standard statistics, which include packets that have been fully handled by the switch.</li> </ul> <h2>Limitations</h2> <p>There are a few things that make this not very nice to work with. First of all there's the requirement of either building a custom network switch or tearing open an existing one and finding the right connections. 
</p> <p>It's not really possible to use this system on regular computers/servers since you need device trees to configure the kernel for this and most computers don't have kernel-controlled GPIO pins available to hook up a switch.</p> <p>As far as I can find there's also no way to use this with a network port on the computer side that's not fixed; USB network interfaces don't have a device tree node handle to refer to for setting the conduit port.</p> <p>There is a chance some of these limitations are possible to work around: maybe there's some weird USB device that exposes pins on the GPIO subsystem, maybe there's a way to load switchdev without being on an ARM device, but that would certainly take a bit more documentation...</p> Megapixels contributions https://blog.brixit.nl/megapixels-contributions/ 99 Megapixels Martijn Braam Sat, 11 May 2024 14:45:17 -0000<p>I've been working on the code that has become libmegapixels for a bit more than a year now. It has taken several thrown-away codebases to come to a general architecture I was happy with and it has been quite a task to split off media pipeline tasks from the original Megapixels codebase.</p> <p>After staring at this code for many months I thought I had made libmegapixels a nearly perfect little library. That's the problem with working on a codebase without anyone else looking at it.</p> <p>About two weeks ago libmegapixels and the general Megapixels 2.x codebase had their first contact with external contributors and that has put a spotlight on all the low hanging fruit in documentation and codebase issues. 
A great example of that is this commit:</p> <div class="highlight"><pre><span></span><span class="gh">diff --git a/src/parse.c b/src/parse.c</span><span class="w"></span>
<span class="gh">index bfea3ec..93072d0 100644</span><span class="w"></span>
<span class="gd">--- a/src/parse.c</span><span class="w"></span>
<span class="gi">+++ b/src/parse.c</span><span class="w"></span>
<span class="gu">@@ -403,6 +403,8 @@ libmegapixels_load_file(libmegapixels_devconfig *config, const char *file)</span><span class="w"></span>
<span class="w"> </span>    config_init(&amp;cfg);<span class="w"></span>
<span class="w"> </span>    if (!config_read_file(&amp;cfg, file)) {<span class="w"></span>
<span class="w"> </span>        fprintf(stderr, &quot;Could not read %s\n&quot;, file);<span class="w"></span>
<span class="gi">+        fprintf(stderr, &quot;%s:%d - %s\n&quot;,</span><span class="w"></span>
<span class="gi">+                config_error_file(&amp;cfg), config_error_line(&amp;cfg), config_error_text(&amp;cfg));</span><span class="w"></span>
<span class="w"> </span>        config_destroy(&amp;cfg);<span class="w"></span>
<span class="w"> </span>        return 0;<span class="w"></span>
<span class="w"> </span>    }<span class="w"></span>
</pre></div> <p>A simple patch that massively improves the usability for people writing libmegapixels config files: actually printing the parsing errors from libconfig when a file could not be read. Because I generally run libmegapixels through the IDE and have all the syntax highlighting etc. set up for the files I simply haven't triggered this codepath enough to actually implement this part.</p> <p>These last two weeks there have also been some significantly more complicated fixes like tracing segfault issues in Megapixels 2.x, which helps a lot with getting the new codebase ready for daily use, and figuring out some API issues in libmegapixels like not correctly setting camera indexes in the returned data. 
Also the config files have now been updated to work with the latest versions of the PinePhone Pro kernel instead of the year-old build I've been developing against.</p> <h2>Video recording</h2> <p>I've been saying for a long time that video recording on the PinePhone won't be possible, especially not to the level of support on Android and iOS, due to hardware limitations. The only real hope for proper video recording would be that someone gets H.264 hardware encoding to work on the A64 processor.</p> <p>I can happily report that I was wrong. Pavel Machek has made significant progress in PinePhone video recording with a few large contributions that implement the UI bits to add video recording, along with a new second postprocessing pipeline for running external video encoding scripts, just like Megapixels already lets you write your own custom scripts for processing the raw pictures into JPEGs.</p> <p>Video recording is a complicated issue though, mainly due to the sheer amount of data that needs to be processed to make it work smoothly. At the maximum resolution of the sensor in the PinePhone the framerate isn't high enough for recording normal videos (unless you enjoy 15fps video files) but at lower resolutions the pipeline can run at normal video framerates. The maximum framerates from the sensor for this are 1080p at 30fps and 720p at 60fps.</p> <p>For 720p60 the bandwidth of the raw sensor data is 442 Mbps and for 1080p30 this is 497 Mbps, which works out to 8 bits per pixel. This is a third of the expected bandwidth because the raw sensor data is essentially a greyscale image where every pixel has a different color filter in front. This is too much data to write out to the eMMC or SD card to process later and the PinePhone also already struggles to encode 720p30 video live without even running a desktop environment.</p> <p>There are two implementations of video recording right now. The first saves the raw DNG frames to a tmpfs since RAM is the only thing that can keep up with the data rate. 
This should give you roughly 30 seconds of video recording capabilities and after that recording time it will take a while to actually encode the video.</p> <p>Pavel has posted an <a href="https://social.kernel.org/notice/AhFxeCMdslrRIhQjE8">example of this video recording</a> on his Mastodon account.</p> <p>The second way is putting the sensor in a YUV mode instead of outputting raw data. This gives worse picture quality on the sensor in the PinePhone but the data format more closely matches the way frames are stored in video files, so the expensive debayer step can be skipped while video recording. This together with encoding H.264 video with the ultrafast preset should make it just about possible to record real-time encoded video on the PinePhone.</p> <h2>Many thanks</h2> <p>It's great to see contributions to Megapixels 2 and libmegapixels. It's a big step towards getting the Megapixels 2.x codebase production ready and it's simply a lot more fun to work on a project together with other people.</p> <p>It's great to have contributors working on the UI code, the camera support fixes for devices and the many bugfixes to the internals. It's also very helpful to actually have issues created by people building and testing the code on other distributions. This already ironed out a few issues in the build system.</p> <p>There have also been some nice contributions to the Megapixels 1.x codebase, all of those should by now already have been merged into your favorite PinePhone distribution :)</p> <p>The last few Megapixels update blogposts have all been around Megapixels 2.x and the supporting libraries so none of the improvements are immediately usable by actual PinePhone{,Pro} and Librem 5 users until there is an actual release. 
It will take a bunch more polish until feature parity with Megapixels 1.x is reached.</p> Bootstrapping Alpine Linux without roothttps://blog.brixit.nl/bootstrapping-alpine-linux-without-root/98LinuxMartijn BraamWed, 20 Mar 2024 23:50:30 -0000<p>Creating a chroot in Linux is pretty easy: put a rootfs in a folder and run the <code>sudo chroot /my/folder</code> command. But what if you don't want to use superuser privileges for this?</p> <p>This is not super simple to fix: not only does the <code>chroot</code> command itself require root permissions, but the steps for creating the rootfs in the first place and mounting the required filesystems like /proc and /sys require root as well.</p> <p>In pmbootstrap the process for creating an installable image for a phone requires setting up multiple chroots and executing many commands in those chroots. If you have the password timeout disabled in sudo you will notice that you will have to enter your password tens to hundreds of times depending on the operation you're doing. An example of this is shown in the long running "<a href="https://gitlab.com/postmarketOS/pmbootstrap/-/issues/2052#note_966447872">pmbootstrap requires sudo</a>" issue on GitLab. In this example sudo was called 240 times!</p> <p>Now it is possible with a lot of refactoring to move batches of superuser-requiring commands into scripts and elevate the permissions of that with a single sudo call but to get this down to a single sudo call per pmbootstrap command would be really hard.</p> <h2>Another approach</h2> <p>So instead of building a chroot the "traditional" way what are the alternatives?</p> <p>The magic trick to get this working is user namespaces. 
From the Linux documentation:</p> <blockquote>User namespaces isolate security-related identifiers and attributes, in particular, user IDs and group IDs (see <a href="https://man7.org/linux/man-pages/man7/credentials.7.html">credentials(7)</a>), the root directory, keys (see <a href="https://man7.org/linux/man-pages/man7/keyrings.7.html">keyrings(7)</a>), and capabilities (see <a href="https://man7.org/linux/man-pages/man7/capabilities.7.html">capabilities(7)</a>). A process's user and group IDs can be different inside and outside a user namespace. In particular, a process can have a normal unprivileged user ID outside a user namespace while at the same time having a user ID of 0 inside the namespace; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace. </blockquote> <p>It basically allows running commands in a namespace where you have UID 0 on the inside without requiring to elevate any of the commands. This does have a lot of limitations though which I somehow all manage to hit with this.</p> <p>One of the tools that makes it relatively easy to work with the various namespaces in Linux is <code>unshare</code>. Conveniently this is also part of <code>util-linux</code> so it's a pretty clean dependency to have.</p> <h2>Building a rootfs</h2> <p>There's enough examples of using <code>unshare</code> to create a chroot without sudo but those all assume you already have a rootfs somewhere to chroot into. Creating the rootfs itself has a few difficulties already though.</p> <p>Since I'm building an Alpine Linux rootfs the utility I'm going to use is <code>apk.static</code>. This is a statically compiled version of the package manager in Alpine which allows building a new installation from an online repository. 
This is similar to <code>debootstrap</code> for example if you're more used to Debian than Alpine.</p> <p>There's a wiki page on running <a href="https://wiki.alpinelinux.org/wiki/Alpine_Linux_in_a_chroot">Alpine Linux in a chroot</a> that documents the steps required for setting up a chroot the traditional way with this. The initial commands to acquire the <code>apk.static</code> binary don't require superuser at all, but after that the problems start:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>./apk.static -X <span class="si">${</span><span class="nv">mirror</span><span class="si">}</span>/latest-stable/main -U --allow-untrusted -p <span class="si">${</span><span class="nv">chroot_dir</span><span class="si">}</span> --initdb add alpine-base </pre></div> <p>This creates the Alpine installation in <code>${chroot_dir}</code>. This requires superuser privileges to set the correct permissions on the files of this new rootfs. After this there are two options for populating /dev inside this rootfs, which are both problematic:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>mount -o <span class="nb">bind</span> /dev <span class="si">${</span><span class="nv">chroot_dir</span><span class="si">}</span>/dev <span class="go">mounting requires superuser privileges and this exposes all your hardware in the chroot</span> <span class="gp">$ </span>mknod -m <span class="m">666</span> <span class="si">${</span><span class="nv">chroot_dir</span><span class="si">}</span>/dev/full c <span class="m">1</span> <span class="m">7</span> <span class="gp">$ </span>mknod -m <span class="m">644</span> <span class="si">${</span><span class="nv">chroot_dir</span><span class="si">}</span>/dev/random c <span class="m">1</span> <span class="m">8</span> <span class="go">... 
etcetera, the mknod command also requires superuser privileges</span> </pre></div> <p>The steps after this have similar issues, most of them for <code>mount</code> reasons or <code>chown</code> reasons.</p> <p>There are a few namespace options from <code>unshare</code> that can be used to work around these issues. The command used to run <code>apk.static</code> in my test implementation is this:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>unshare <span class="se">\</span> --user <span class="se">\</span> --map-users<span class="o">=</span><span class="m">100000</span>,0,10000 <span class="se">\</span> --map-groups<span class="o">=</span><span class="m">100000</span>,0,10000 <span class="se">\</span> --setuid <span class="m">0</span> <span class="se">\</span> --setgid <span class="m">0</span> <span class="se">\</span> --wd <span class="s2">&quot;</span><span class="si">${</span><span class="nv">chroot_dir</span><span class="si">}</span><span class="s2">&quot;</span> <span class="se">\</span> ./apk-tools-static -X...etc </pre></div> <p>This will use <code>unshare</code> to create a new userns and change the uid/gid inside that to 0. This effectively grants root privileges inside this namespace. But that's not enough.</p> <p>If <code>chown</code> is used inside the namespace it will still fail because my unprivileged user still can't change the ownership of those files. The solution to that is the uid remapping with <code>--map-users</code> and <code>--map-groups</code>. In the example above it sets up the namespace so files created with uid 0 will generate files with the uid 100000 on the actual filesystem. uid 1 becomes 100001 and this continues on for 10000 uids. </p> <p>This again does not completely solve the issue though because my unprivileged user still can't chown those files, doesn't matter if it's chowning to uid 0 or 100000. 
To give my unprivileged user this permission the <code>/etc/subuid</code> and <code>/etc/subgid</code> files on the host system have to be modified to add a rule. This sadly requires root privileges <i>once</i> to set up this privilege. To make the command above work I had to add this line to those two files:</p> <pre><code>martijn:100000:10000</code></pre> <p>This grants the user with the name <code>martijn</code> the permission to use 10.000 uids starting at uid 100.000 for the purpose of userns mapping.</p> <p>The result of this is that the <code>apk.static</code> command will seem to Just Work(tm) and the resulting files in <code>${chroot_dir}</code> will have all the right permissions but only offset by 100.000.</p> <h2>One more catch</h2> <p>There is one more complication with remapped uids and <code>unshare</code> that I've skipped over in the above example to make it clearer, but the command inside the namespace most likely cannot start.</p> <p>If you remap the uid with <code>unshare</code> you get more freedom inside the namespace, but it limits your privileges outside the namespace even further. It's most likely that the <code>unshare</code> command above was run somewhere in your own home directory. After changing your uid to 0 inside the namespace your privilege to the outside world will be as if you're uid 100.000 and that uid most likely does not have privileges. If any of the folders in the path to the executable you want <code>unshare</code> to run for you inside the namespace don't have the read and execute bit set for the "other" group in the unix permissions then the command will simply fail with "Permission denied".</p> <p>The workaround used in my test implementation is to just first copy the executable over to <code>/tmp</code> and hope you at least still have permissions to read there.</p> <h2>Completing the rootfs</h2> <p>So after all that the first command from the Alpine guide is done. 
Now only the problems of mounting filesystems and creating files are left.</p> <p>While <code>/etc/subuid</code> does give permission to use a range of uids as an unprivileged user with a user namespace it does not give you permissions to create those files outside the namespace. So the way those files are created is basically the complicated version of <code>echo "value" | sudo tee /root/file</code>: </p> <div class="highlight"><pre><span></span><span class="gp">$ </span><span class="nb">echo</span> <span class="s2">&quot;nameserver a.b.c.d&quot;</span> <span class="p">|</span> unshare <span class="se">\</span> --user <span class="se">\</span> --map-users<span class="o">=</span><span class="m">100000</span>,0,10000 <span class="se">\</span> --map-groups<span class="o">=</span><span class="m">100000</span>,0,10000 <span class="se">\</span> --setuid <span class="m">0</span> <span class="se">\</span> --setgid <span class="m">0</span> <span class="se">\</span> --wd <span class="s2">&quot;</span><span class="si">${</span><span class="nv">chroot_dir</span><span class="si">}</span><span class="s2">&quot;</span> <span class="se">\</span> sh -c <span class="s1">&#39;cat &gt; /etc/resolv.conf&#39;</span> </pre></div> <p>This does set up and tear down the entire namespace for every file change or creation which is a bit inefficient, but inefficient is still better than impossible. Changing file permissions is done in a similar way.</p> <p>To fix the mounting issue there's the mount namespace functionality in Linux. This allows creating new mounts inside the namespace as long as you still have permissions on the source file as your unprivileged user. 
This effectively means you can't use this to mount random block devices but it works great for things like <code>/proc</code> and loop mounts.</p> <p>There is a <code>--mount-proc</code> parameter that will tell <code>unshare</code> to set up a mount namespace and then mount <code>/proc</code> inside the namespace at the right place so that's what I'm using. But I still need other things mounted. This mounting is done as a small inline shell script right before executing the commands inside the chroot:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>unshare <span class="se">\</span> --user <span class="se">\</span> --fork <span class="se">\</span> --pid <span class="se">\</span> --mount <span class="se">\</span> --mount-proc <span class="se">\</span> --map-users<span class="o">=</span><span class="m">100000</span>,0,10000 <span class="se">\</span> --map-groups<span class="o">=</span><span class="m">100000</span>,0,10000 <span class="se">\</span> --setuid <span class="m">0</span> <span class="se">\</span> --setgid <span class="m">0</span> <span class="se">\</span> --wd <span class="s2">&quot;</span><span class="si">${</span><span class="nv">chroot_dir</span><span class="si">}</span><span class="s2">&quot;</span> <span class="se">\</span> -- <span class="se">\</span> sh -c <span class="s2">&quot; \</span> <span class="s2"> mount -t proc none proc ; \</span> <span class="s2"> touch dev/zero ; \</span> <span class="s2"> mount -o rw,bind /dev/zero dev/zero ;\</span> <span class="s2"> touch dev/null ; \</span> <span class="s2"> mount -o rw,bind /dev/null dev/null ;\</span> <span class="s2"> ...</span> <span class="go"> chroot . bin/sh \</span> <span class="go"> &quot;</span> </pre></div> <p>The mounts are created after setting up the namespaces but before the chroot is started so the host filesystem can still be accessed. 
The working directory is set to the root of the rootfs using the <code>--wd</code> parameter of <code>unshare</code> and then bind mounts are made from <code>/dev/zero</code> to <code>dev/zero</code> to create those devices inside the rootfs.</p> <p>This combines the two impossible options to make it work. <code>mknod</code> can still not work inside namespaces because it is a bit of a security risk. <code>mount</code>'ing /dev gives access to way too many devices that are not needed but the mount namespace does allow bind-mounting the existing device nodes one by one and allows me to filter them.</p> <p>Then finally... the <code>chroot</code> command to complete the journey. This has to refer to the rootfs with a relative path and this also depends on the working directory being set by <code>unshare</code> since host paths break with uid remapping.</p> <h2>What's next?</h2> <p>So this creates a full chroot without superuser privileges (after the initial setup) and this whole setup even works perfectly with having cross-architecture chroots in combination with <code>binfmt_misc</code>. </p> <p>Compared to <code>pmbootstrap</code> this codebase does very little and there are more problems to solve. For one all the filesystem manipulation has to be figured out to copy the contents of the chroot into a filesystem image that can be flashed. This is further complicated by the mangling of the uids in the host filesystem so it has to be remapped while writing into the filesystem again.</p> <p>Flashing the image to a fastboot capable device should be pretty easy without root privileges, it only requires a udev rule that is usually already installed by the android-tools package on various Linux distributions. 
For the PinePhone flashing happens on a mass-storage device and as far as I know it will be impossible to write to that without requiring actual superuser privileges.</p> <p>The code for this is in the <a href="https://git.sr.ht/~martijnbraam/ambootstrap">~martijnbraam/ambootstrap</a> repository, hopefully in some time I get this to actually write a plain Alpine Linux image to a phone :D</p> The dilemma of tagging library releaseshttps://blog.brixit.nl/the-dilemma-of-tagging-library-releases/94MegapixelsMartijn BraamSun, 14 Jan 2024 16:11:17 -0000<p>I've been working on the libmegapixels library for quite a bit now. The base of the library is pretty solid which is configuring a V4L2 pipeline so you can get camera frames on modern ARM platforms. Most of the work on the library side is figuring out the AWB/AE/AF code and how that will fit together with applications.</p> <p>Due to the AAA code not working yet and the API not being fully defined on how those parts will fit together I've been holding off on tagging an actual release on the libmegapixels library.</p> <p>A lot of my projects, especially libraries, are written in Python so I've long enjoyed the luxury of APIs being duck-typed and having the possibility of adding optional arguments to methods in the future. Sadly in C libraries I can't get away with never defining the types for arguments that might change in the future or adding optional arguments.</p> <p>My original plan was to tag a release on libmegapixels together with the first 2.x release of Megapixels since these pieces of software are intended to fit together but after thinking about it some more (and some convincing from other people interested in the libmegapixels release) I've decided to tag a 0.1 release.</p> <p>In an ideal world I can just release code when it's fully done and tested. 
In this case the long time it takes to get everything ready for use will mean that potential contributors to the code will also be held back from experimenting with the codebase. Especially since a large part of libmegapixels is the config files it ships for specific hardware configurations. If I wouldn't make any releases then at some point users/developers will be forced to just ship random git commits which is a way worse situation to be in for bug tracking.</p> <p>With this 0.1 release I want to make it possible to start writing config files for various phones and platforms to test camera pipelines. Hopefully this will also mean any issues with the configuration file format that people might hit will be figured out before I have to tag a "final" 1.x release.</p> <h2>The release</h2> <p>So the initial tagged release of <code>libmegapixels</code>:</p> <ul><li>located at <a href="https://gitlab.com/megapixels-org/libmegapixels/-/tags/0.1.0">https://gitlab.com/megapixels-org/libmegapixels/-/tags/0.1.0</a></li> <li>Build instructions at <a href="https://libme.gapixels.me/building.html">https://libme.gapixels.me/building.html</a></li> <li>Comes with absolutely no guarantee of stability for the C api of the library</li> <li>Most likely the config file format is stable but might have small tweaks before the 1.x release</li> </ul> <p>Hopefully this will allow people to start experimenting with the codebase and generate some feedback on it so I'm not just developing this for months and completely overfitting it to the three devices I'm testing on.</p> <p>I'm planning to make a similar release for <code>libdng</code> soon. 
That library is also mostly stable but I need to fix up the last parts of the API to allow reading and writing all the required metadata.</p> The MNT keyboard reviewedhttps://blog.brixit.nl/the-mnt-keyboard-reviewed/92LinuxMartijn BraamTue, 19 Dec 2023 23:59:01 -0000<p>MNT Research is one of those few companies that actually releases open source hardware. Instead of just getting a <a href="https://mntre.com/documentation/reform-keyboard-v3-manual.pdf">schematic</a> with your hardware (which is great even by itself) there's the full <a href="https://source.mnt.re/reform/reform/-/tree/master/reform2-keyboard3-pcb">sources for that schematic</a>, the KiCad parts libraries, the sources for the firmware and even documentation on how to use that code.</p> <p>I received my <a href="https://shop.mntre.com/products/mnt-reform-keyboard-30">MNT Standalone Keyboard V3</a> a few days ago so I've been typing on it now for a bit. This is all happening while I'm recovering from covid so I hope if I read back this post in a few days it is actually somewhat coherent :)</p> <p>This being a more niche product sadly does make it a bit on the expensive side. But I must say this is by far the most solid keyboard I've owned. My main keyboard on my desktop is a Das Keyboard 4 Ultimate. It's a nice keyboard but it doesn't compare to the fully machined aluminium frame on the MNT keyboard.</p> <p>The whole keyboard is mounted on what's basically a 4mm slab of aluminium which has a nice MNT logo machined on it on the bottom.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703022648/20231219_0011.jpg" class="kg-image"></figure> <p>This makes the keyboard feel incredibly solid, even with the rest of the frame taken off it's practically impossible to even bend the keyboard. 
The second half of the frame is the top edge that screws on the base plate with 8 screws.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703022719/20231219_0012.jpg" class="kg-image"></figure> <p>This is another very carefully designed aluminium part. In the close-up above you can see the opening for the USB-C connection for the keyboard and the internal cutouts for the display daughterboard with the screw mounting.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703027123/20231219_0021.jpg" class="kg-image"></figure> <h2>The electronics</h2> <p>This keyboard is based around an Atmega32U4 microcontroller. This is the same keyboard PCB as what's shipped in the MNT Reform laptop so there are two connectors on this board. The USB-C connector is what's exposed on the standalone keyboard and the laptop presumably uses the USB header that's beside it.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703023510/20231219_0015.jpg" class="kg-image"></figure> <p>Beside the USB header is one of the dip switches. SW36 is labeled "STANDALONE" here. This switches the board to use USB power instead of the 3.3V supplied by the laptop mainboard. The ribbon connector is the connection to the OLED display board.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703026316/20231219_0016.jpg" class="kg-image"></figure> <p>On the left side of the display board there is an empty footprint for the standard Atmega programming header and a serial port that's used to connect to the laptop mainboard. 
Additionally there's a reset button and SW84 which has the confusing label "RG".</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703026421/image.png" class="kg-image"></figure> <p>Thanks to the schematics being available in the manual it's easy to find that this is the switch to enable programming. The rest of the interesting parts are hidden somewhere below the display board or possibly on the bottom side of the PCB. I have not taken the keyboard further apart for this review since all the information I'd ever want is already available in the schematics. The keyboard matrix itself is read out by the Atmega directly which provides the full keyboard functionality and the OLED display is on a small daughterboard to raise it slightly towards the front bezel.</p> <h2>Firmware</h2> <p>Since this is one of the 8-bit Atmel parts it's very easy to build firmware using the gcc-avr compiler packaged in various distributions. All the source files are stored in the <a href="https://source.mnt.re/reform/reform/-/tree/master/reform2-keyboard-fw">firmware repository</a> for the various MNT products.</p> <p>Checking the version of the firmware is pretty easy. With the circle key on the top-right corner of the keyboard the menu on the display opens. 
You can use the arrow keys to browse to the "System Status" option or just press the "s" key on the keyboard.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703027504/20231220_0008.jpg" class="kg-image"></figure> <p>Which shows the hardware revision this firmware was built for and the version that was specified when building:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703027858/20231220_0018.jpg" class="kg-image"></figure> <p>It seems like the "g" at the start of the commit hash was accidental here and it refers to commit <code>7e73483</code> in the firmware repository. This seems to be the newest tag when the keyboard was shipped so that makes sense.</p> <p>So let's change something! The key in the bottom left corner of the keyboard is the Hyper key instead of Ctrl as you'd expect from most keyboards. The Ctrl key is moved in place of the Caps Lock button on normal keyboard layouts which is great for a lot of uses. I never use Hyper though so I want to change that key to be my second Ctrl key.</p> <p>The readme specifies that the keyboard layout is defined in the various <code>matrix_*</code> files so after reading around a bit it seems like I have to edit <code>matrix_3.h</code> for my keyboard.</p> <p>Reading the manual again I realized that doing this makes me lose access to the media keys since those are defined as "Hyper+F*" for the various media actions. To fix that I changed the right control button into the Hyper key, this is the button with the three dots on it. 
My resulting code change:</p> <pre><code>diff --git a/reform2-keyboard-fw/matrix_3.h b/reform2-keyboard-fw/matrix_3.h index bb72f6d..f9db133 100644 --- a/reform2-keyboard-fw/matrix_3.h +++ b/reform2-keyboard-fw/matrix_3.h @@ -25,7 +25,7 @@ // Sixth row #define MATRIX3_DEFAULT_ROW_6 \ - HID_KEYBOARD_SC_EXECUTE,\ + HID_KEYBOARD_SC_LEFT_CONTROL,\ HID_KEYBOARD_SC_LEFT_GUI,\ HID_KEYBOARD_SC_LEFT_ALT,\ KEY_SPACE,\ @@ -33,7 +33,7 @@ KEY_SPACE,\ KEY_SPACE,\ HID_KEYBOARD_SC_RIGHT_ALT,\ - HID_KEYBOARD_SC_RIGHT_CONTROL,\ + HID_KEYBOARD_SC_EXECUTE,\ HID_KEYBOARD_SC_LEFT_ARROW,\ HID_KEYBOARD_SC_DOWN_ARROW,\ HID_KEYBOARD_SC_RIGHT_ARROW </code></pre> <p>Now to build this there's a simple Makefile. Since I've already programmed Atmega parts on this machine I already have the compiler installed making this very quick and easy.</p> <p>I ended up compiling with the following command:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>make <span class="nv">REFORM_KBD_OPTIONS</span><span class="o">=</span><span class="s2">&quot;-DKBD_VARIANT_3 -DKBD_MODE_STANDALONE -DKBD_FW_VERSION=\\\&quot;Martijn\\\&quot;&quot;</span> </pre></div> <p>This is straight from the readme with an additional define to set the firmware version to "Martijn". After building this I got the <code>keyboard.hex</code> file that can be flashed.</p> <p>The flashing is as simple as running the <code>flash.sh</code> script. This will instruct you to press "Circle + X" to enter flashing mode and then run the necessary commands to flash the keyboard. After running this I noticed that the delete key on the keyboard was no longer a delete key. It turns out I don't have <code>VARIANT_3</code> but instead <code>VARIANT_3_US</code>. 
A quick rebuild and reflash also fixes that.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1703029671/20231220_0021.jpg" class="kg-image"><figcaption>The brightness differences on the display are a camera artifact</figcaption></figure> <p>Tadaa! My own name in the firmware version field. It's super easy to mess with this firmware.</p> <h2>The keyboard itself</h2> <p>Well the keyboard works just fine as a keyboard. Typing on this keyboard takes a few minutes to get used to compared to my normal keyboard since all the keys are slightly closer together. The split spacebar is also annoying me a bit. It turns out that the left split in the spacebar is <i>exactly</i> the spot where I normally hit the spacebar with my thumb.</p> <p>The switches are nice and clicky (but silent, I have the version with brown switches in them). Overall the keyboard just does what it needs to. The standard layout is quite unusual but everything can be changed with open firmware so I'm confident I can get to a layout I'm 100% happy with.</p> <h2>Conclusion</h2> <p>This is an extremely solid and very compact keyboard I can easily throw in my backpack. It being a USB-C keyboard makes it fit neatly with all my other random cables I usually take with me.</p> <p>It might be slightly more expensive than similar keyboards, but I don't know of similar keyboards with a case this rugged and the display functionality (I forgot to mention you can use HID reports from the host to write custom content to the display from your computer). The openness of this product makes the extra cost certainly worth it for me.</p> <p>I'll probably be messing with the firmware for this keyboard a bit more while I use it. 
There are some small things to fix like the device reporting the name "LUFA Keyboard Demo Application" in Linux instead of a neater "MNT Keyboard" or something.</p> Looking closer at the sysloghttps://blog.brixit.nl/looking-closer-at-the-syslog/91LinuxMartijn BraamMon, 11 Dec 2023 15:43:17 -0000<p>The syslog protocol, it's one of the ancient protocols in the Unix world. For a long time the logging was handled by daemons like syslog-ng and rsyslog, this has now been taken over by journald on a lot of systems. But have you ever wondered how your log messages even end up in <code>/var/log</code> in the first place?</p> <p>I've started looking into syslog implementations when building a replacement for the use of busybox syslogd in postmarketOS. In postmarketOS this daemon is configured to just send syslog messages to an in-memory buffer for logging and never store anything on disk in <code>/var/log</code>. This is mainly to make sure there are no unneeded writes to the flash storage in a lot of the old phones that are supported by postmarketOS. There's a few downsides to this logging implementation though:</p> <ul><li>No persistent logging of system messages across reboots. Persistent logs would make it easy to check whether certain log messages were present on earlier boots when debugging.</li> <li>Completely unstructured logs while people are pretty much used to journald logging with nice filters</li> </ul> <p>As a replacement I wrote <code>logbookd</code>. It's a tiny syslog daemon that supports disk and memory logging and provides some nice filtering options to be closer to journald. The bulk of this work is handled by doing structured logging into SQLite.</p> <h2>So how does the syslog work</h2> <p>The way the syslog works is incredibly simple. The syslog daemon opens a Unix domain socket at <code>/dev/log</code>. 
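</p> <p>To make that concrete, here is a minimal Python sketch of both ends of such a socket. This is not how logbookd is implemented: the <code>send_syslog</code> and <code>demo</code> names are made up for this example, a temporary socket stands in for <code>/dev/log</code> so nothing touches the real one, and it assumes a datagram socket, which is a common setup for <code>/dev/log</code> on Linux.</p>

```python
import os
import socket
import tempfile


def send_syslog(path, pri, tag, text):
    """Send one short syslog-style message (no timestamp) to a datagram socket."""
    payload = f"<{pri}>{tag}: {text}".encode()
    client = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
    try:
        client.sendto(payload, path)
    finally:
        client.close()


def demo():
    """Play the daemon and an application against a throwaway socket."""
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for /dev/log so the sketch runs without a real daemon.
        path = os.path.join(tmp, "log")
        daemon = socket.socket(socket.AF_UNIX, socket.SOCK_DGRAM)
        daemon.bind(path)
        try:
            send_syslog(path, 13, "test", "Hi")
            # The daemon side just reads whatever datagram arrives next.
            return daemon.recv(4096).decode()
        finally:
            daemon.close()


print(demo())
```

<p>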
Applications connect to this socket and write log messages in the syslog format and the syslog daemon takes care of filtering those out and putting them in the various files in /var/log.</p> <p>The complication of this is that there is no real syslog protocol. There are two standards for it though. There is <a href="https://datatracker.ietf.org/doc/html/rfc3164">RFC 3164</a> and <a href="https://datatracker.ietf.org/doc/html/rfc5424">RFC 5424</a> which both describe the syslog protocol. The 3164 document was only created in 2001 and describes what various implementations are doing in the wild. It's RFC 5424 that actually nails down a specific format.</p> <p>I wrote a parser for the 5424 format initially since that's the newest standard and it's by far the easiest to parse. An RFC 5424 log message looks like this:</p> <pre><code>&lt;13&gt;1 2023-12-11T14:56:59.0189+01:00 laptop test - - [timeQuality tzKnown=&quot;1&quot;] Hi</code></pre> <p>The first part here between the angle brackets is the PRI value. It encodes the logging facility and severity as one number. The least significant 3 bits encode the severity on a scale of 0-7 and the other bits encode one of the 24 facilities that are defined (numbered 0-23). Some examples of the facilities are:</p> <ul><li><code>0</code> for kernel messages</li> <li><code>1</code> for generic userspace messages</li> <li><code>2</code> for the mail system</li> </ul> <p>Most of the other numbers are for older services like UUCP and FTP and for some numbered user-defined codes. In the example above the 13 means facility 1 (user) and severity 5 (notice).</p> <p>The other parts of this message are in order:</p> <ul><li>The protocol version number which is set to <code>1</code> here.</li> <li>A timestamp with timezone for the log message</li> <li>The <code>laptop</code> is the hostname for the message, this will be set to <code>-</code> when NULL</li> <li>The <code>test</code> part is the application name. 
This can also be <code>-</code> for NULL</li> <li>The next field is called PROCID and is set to <code>-</code> for NULL in my case. According to the standard it might be used for the pid but is mostly implementation defined.</li> <li>The second null is the MSGID, it can define a message type from the specific service, it will also be null in most cases.</li> <li>The next part is <code>[timeQuality tzKnown=&quot;1&quot;]</code> which is the STRUCTURED-DATA field. It can contain any implementation defined data. This is a subset of the structured data created by the <code>logger</code> tool used to create test messages. This field can also be just <code>-</code> for no structured data.</li> <li>Finally the actual log message. That&#x27;s just <code>Hi</code> in this case.</li> </ul> <p>Writing a parser for this format is relatively straightforward. In the logbookd implementation there's a column for every one of these fields in the logging table and the message is split up according to these rules.</p> <p>There is a fatal flaw in the RFC 5424 specification though: nobody is using it. None of the log messages on my running systems are actually in this format. It looks like practically all software uses RFC 3164, which is a fancy way of saying they do whatever they want.</p> <h2>RFC 3164</h2> <p>So this is actually the true specification for syslog messages being used in the wild. Let's look at one of these messages:</p> <pre><code>&lt;13&gt;Dec 11 15:21:50 laptop test: Hi</code></pre> <p>It's a lot simpler! But not actually. This is a pretty minimal message. The initial part is the same as the RFC 5424 message, the PRI is luckily parsed the exact same way. There is no version indicator though and it does not use an ISO timestamp format.</p> <p>The more problematic issue with this format though is that it does support a lot more data but it's pretty badly defined. Even all the parts shown in the example above are optional. 
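</p> <p>Since even the PRI can be missing, a parser has to fall back to a default for it. The snippet below is only a sketch of that idea and not the logbookd code: the <code>split_pri</code> name is made up for this example, and the fallback value of 13 (facility user, severity notice) is the default RFC 3164 describes for messages that arrive without a PRI.</p>

```python
import re

# A PRI is one to three ASCII digits between angle brackets at the start
# of the line.
_PRI_RE = re.compile(r"<(\d{1,3})>", re.ASCII)


def split_pri(line, default_pri=13):
    """Split a raw syslog line into (facility, severity, rest).

    Hypothetical helper for illustration, not taken from logbookd.
    """
    match = _PRI_RE.match(line)
    if match:
        pri = int(match.group(1))
        rest = line[match.end():]
    else:
        # No PRI at all: keep the whole line and assume the default.
        pri = default_pri
        rest = line
    # The low 3 bits are the severity, the remaining bits the facility.
    return pri >> 3, pri & 0x07, rest
```

<p>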
The most minimal syslog message that is still up to this spec is <code>Hi</code>.</p> <p>It's also somewhat valid to send messages with a badly formatted timestamp and it's up to the syslogd to fix up the timestamp in the message. This also makes it very easy for a parser to mistake parts of the timestamp for the hostname since this is all badly defined and space separated.</p> <p>Since there is no official field for the pid of a process this is usually appended to the application name in square brackets.</p> <p>The logbookd implementation is mostly based on the way these old messages are parsed in rsyslog and tries to not guess parts. This means only the timestamp, app, hostname and message fields are filled in.</p> <h2>Kernel logging</h2> <p>Not all logging in the system comes from userspace. On Linux there's also the kernel log ringbuffer that can be read from <code>/dev/kmsg</code>. Reading from this file will return all the log messages in the kernel ringbuffer and also makes it possible to stream new log messages with further reads. The log messages from the kernel are in a similar but different format than the syslog socket:</p> <pre><code>6,1004,5150172365,-;hid-generic 0003: hiddev96,hidraw2: USB HID device on usb-0000:00:14
 SUBSYSTEM=hid
 DEVICE=+hid:0003</code></pre> <p>The first field in the kernel message is again the PRI. This follows the same numbering as the syslog RFCs but it's not in angle brackets this time. In this case it's facility 0 (the kernel) and severity 6 (info). The second field is the KSEQ. This is a number that counts up for every log message since boot. The logbookd implementation uses this to de-duplicate the kernel log messages after opening the file since it will always return the old kernel log messages first.</p> <p>After that comes the timestamp. Instead of a string that needs parsing this is a plain integer, the number of microseconds since boot, so it's way easier to deal with. 
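</p> <p>The fields covered so far could be split off a record roughly like this. It's a sketch under assumptions and not how logbookd does it: <code>parse_kmsg_header</code> is a made-up name, the timestamp is returned as the raw integer the kernel reports, and anything in the header after the timestamp is kept unparsed.</p>

```python
def parse_kmsg_header(record):
    """Parse the comma separated prefix of one /dev/kmsg record.

    Hypothetical helper for illustration, not taken from logbookd.
    """
    # Everything before the first semicolon is the header, the rest is
    # the human readable message.
    header, _, text = record.partition(";")
    fields = header.split(",")
    pri = int(fields[0])
    return {
        "facility": pri >> 3,        # same encoding as the syslog PRI
        "severity": pri & 0x07,
        "seq": int(fields[1]),       # counts up for every message since boot
        "timestamp": int(fields[2]), # raw value as reported by the kernel
        "extra": fields[3:],         # remaining header fields, left unparsed
        "text": text,
    }


def is_new(entry, last_seq):
    """De-duplicate on the sequence number: only accept newer records."""
    return last_seq is None or entry["seq"] > last_seq
```

<p>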
The field after the timestamp is <code>-</code> indicating NULL, this is the flags field.</p> <p>After the semicolon the actual kernel log message starts. This is the message as it is rendered in the <code>dmesg</code> utility. After the log message there's a newline but the log line doesn't end there! The structured data is defined as indented continuation lines after the message itself and this contains some easier machine-parsable data that is usually hidden in <code>dmesg</code>. </p> <h2>Systemd journald</h2> <p>So everything changed when journald was introduced. Figuring out how this all works involves diving into the systemd source code. Systemd provides several unix sockets related to logging in <code>/var/run/systemd/journal</code>:</p> <ul><li><code>dev-log</code> this is symlinked to <code>/dev/log</code> and receives syslog formatted lines and writes them to the journal</li> <li><code>stdout</code> is a socket that receives logs from systemd units. This is what the <code>systemd-cat</code> command connects to. It writes a header on connection to give the application metadata and then the stdout or stderr is just connected straight to this socket.</li> <li><code>socket</code> receives the log messages in the binary journald format</li> </ul> <p>There are a few other fancy things that journald does. It is possible to filter your log messages with the <code>--boot</code> argument. If no argument is supplied it will only show messages from the current boot. If you specify a negative number it's possible to get only log messages from a specific previous bootup.</p> <p>The way this is done is by reading from <code>/proc/sys/kernel/random/boot_id</code>. This is a value generated by the kernel on bootup. It is a UUID generated from random data. These are also the values you see when you run <code>journalctl --list-boots</code>. 
The BOOT_ID value shown there is this UUID with the dashes removed.</p> <p>My logbookd implementation also reads the boot_id on startup and stores it with the logs, this allows filtering in the exact same way with the logread <code>-b</code> parameter.</p> <h2>Logging to a database</h2> <p>So the main departure journald and logbookd make from the older syslog daemons is that they don't log to plain-text files. Journald has a custom database format the messages are stored in and logbookd stores messages in an SQLite database.</p> <p>Structured logging to a database has a few nice upsides. The main one is being able to do way more detailed filtering than what is reasonably possible with grep. It's a lot easier to filter on a specific date and time range in a database and due to database indexes this is still fast.</p> <p>One of the other main reasons for using SQLite in logbookd is that the implementation in postmarketOS was configured to only log to memory. Using SQLite as logging back-end meant that it's easily possible to replicate this by writing to an in-memory database which is already supported by SQLite.</p> <p>The final thing added to logbookd is the middle ground between in-memory and on-disk logging: the reduced writes mode. In this mode the syslog is written to an in-memory database but when receiving a SIGINT, SIGTERM or SIGUSR1 signal the logbookd daemon will open the on-disk database and let SQLite do a database migration. This means that SQLite will append the new log lines to the disk without rewriting all the existing logs there. On bootup this database is restored again so the logging system behaves as if it's configured to do normal on-disk logging.</p> <h2>You can use this now</h2> <p>If you're running postmarketOS edge and you have updated to the latest version your installation should've migrated to the logbookd logging daemon. The <code>logread</code> utility implements the common options the busybox logread command already had. 
For normal use this means that there's not much difference except that the log output from the logread utility is now colored and contains kernel logs.</p> <p>Some examples of the new things that are now possible:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>logread -b list
<span class="go">ID BOOT ID FIRST ENTRY LAST ENTRY</span>
<span class="go"> 0 05c3f283-3bae-4b2a-8431-210dd63310e0 Dec 11 16:33:59 Dec 11 16:34:05</span>
<span class="go">-1 f3ea2fa1-6f9e-4e82-bd0b-201091fcb5b4 Dec 07 18:21:06 Dec 07 18:25:50</span>
<span class="gp">$ </span>logread -b <span class="m">0</span> -n <span class="m">2</span>
<span class="go">[Dec 11 16:35:09] daemon dleyna-renderer-service[18060]: Client :1.166 lost</span>
<span class="go">[Dec 11 16:35:10] daemon dleyna-renderer-service[18060]: dLeyna: Exit</span>
<span class="gp">$ </span>logread -b <span class="m">1</span> -n <span class="m">1</span>
<span class="go">[Dec 07 18:25:18] kern kernel: perf: interrupt took too long (2531 &gt; 2500), lowering kernel.perf_event_max_sample_rate to 79000</span>
<span class="gp">$ </span>logread -b -1 -u logbookd -n <span class="m">1</span>
<span class="go">[Dec 07 18:25:37] syslog logbookd: Ready to process log messages</span>
</pre></div> <p>Being able to see interleaved kernel and userspace messages also makes certain scenarios a lot easier to debug.</p> <p>Hopefully this makes a few things easier to debug. There's a bunch of software that also logs directly into <code>/var/log</code> in separate files; this has not been replaced by logbookd and is also not directly queryable by this new system. 
For the rest of the log messages enjoy the new colors :)</p> Megapixels 2.0: DNG exportinghttps://blog.brixit.nl/megapixels-2-0-dng-exporting/89MegapixelsMartijn BraamSat, 18 Nov 2023 14:17:38 -0000<p>It seems overkill to make a whole separate library dedicated to replacing 177 lines of code in Megapixels that touch libtiff, but this small section of code causes significant issues for distribution packaging and compatibility with external photo editing software. Most importantly the adjusted version in Millipixels for the Librem 5 does not output DNG files that are close enough to the Adobe specifications to be loaded into the calibration software.</p> <p>Making this a separate library would make it easier to test. In the Adobe DNG SDK there is a test utility that can verify if a TIFF file is up to DNG spec and it can (with a lot of complications) be built for Linux.</p> <h2>The spec</h2> <p>The first thing after copying over the code block from Megapixels to a separate project is reading the Adobe DNG specification.</p> <p>When I wrote the original export code in Megapixels it was based around some example code I found on Github for using Libtiff that I can no longer find and it results in something that's close enough to a valid DNG file for the <code>dcraw</code> utility. This is also a DNG 1.0 file that is generated.</p> <p>I spent the next day reading the <a href="https://www.kronometric.org/phot/processing/DNG/dng_spec_1.4.0.0.pdf">DNG 1.4 specification</a> from Adobe to understand what a valid DNG file is absolutely minimally required to have. 
These are my notes from that:</p> <div class="highlight"><pre><span></span><span class="gu">## Inside a DNG file</span>

<span class="k">*</span> SubIFDType 0 is the original raw data
<span class="k">*</span> SubIFDType 1 is the thumbnail data
<span class="k">*</span> The recommendation is to store the thumbnail as the first IFD
<span class="k">*</span> TIFF metadata goes in the first IFD
<span class="k">*</span> EXIF tags are preferred
<span class="k">*</span> Camera profiles are stored in the first IFD

<span class="gu">## Required tags</span>

<span class="k">*</span> DNGVersion
<span class="k">*</span> UniqueCameraModel
</pre></div> <h2>Validation</h2> <p>I also spent a long time building the official Adobe DNG SDK. This is mostly useless for developing any open source software due to licensing but it does provide a nice <code>dng_validate</code> utility that can be used to actually test the DNG files. Building this utility is pretty horrifying since it requires some specific versions of dependencies and some patches to work on modern compilers.</p> <p>The libdng codebase now has the <a href="https://gitlab.com/megapixels-org/libdng/-/blob/master/adobe_dng_sdk.sh">adobe_dng_sdk.sh</a> script that will build the required libraries and the validation binary.</p> <p>With the Megapixels code adjusted with the info from the documentation above I fed some random noise as data to the library to generate a DNG file and ran it through the validator.</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>dng_validate out.dng
<span class="go">Validating &quot;out.dng&quot;...</span>
<span class="go">*** Warning: This file has Chained IFDs, which will be ignored by DNG readers ***</span>
<span class="go">*** Error: Unable to find main image IFD ***</span>
</pre></div> <p>Well that's not a great start... 
There's also a <code>-v</code> option to get some more verbose info:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>dng_validate -v out.dng
<span class="go">Validating &quot;out.dng&quot;...</span>
<span class="go">Uses little-endian byte order</span>
<span class="go">Magic number = 42</span>
<span class="go">IFD 0: Offset = 308, Entries = 10</span>
<span class="go">NewSubFileType: Preview Image</span>
<span class="go">ImageWidth: 20</span>
<span class="go">ImageLength: 15</span>
<span class="go">BitsPerSample: 8</span>
<span class="go">Compression: Uncompressed</span>
<span class="go">PhotometricInterpretation: RGB</span>
<span class="go">StripOffsets: Offset = 8</span>
<span class="go">StripByteCounts: Count = 300</span>
<span class="go">DNGVersion: 1.4.0.0</span>
<span class="go">UniqueCameraModel: &quot;LibDNG&quot;</span>
<span class="go">NextIFD = 10042</span>
<span class="go">Chained IFD 1: Offset = 10042, Entries = 6</span>
<span class="go">NewSubFileType: Main Image</span>
<span class="go">ImageWidth: 320</span>
<span class="go">ImageLength: 240</span>
<span class="go">Compression: Uncompressed</span>
<span class="go">StripOffsets: Offset = 441</span>
<span class="go">StripByteCounts: Count = 9600</span>
<span class="go">NextIFD = 0</span>
<span class="go">*** Warning: This file has Chained IFDs, which will be ignored by DNG readers ***</span>
<span class="go">*** Error: Unable to find main image IFD ***</span>
</pre></div> <p>Let's have a look at what the DNG spec says about this:</p> <blockquote>DNG recommends the use of SubIFD trees, as described in the TIFF-EP specification. SubIFD chains are not supported.<br><br>The highest-resolution and quality IFD should use NewSubFileType equal to 0. Reduced resolution (or quality) thumbnails or previews, if any, should use NewSubFileType equal to 1 (for a primary preview) or 10001.H (for an alternate preview). 
<br><br>DNG recommends, but does not require, that the first IFD contain a low-resolution thumbnail, as described in the TIFF-EP specification.</blockquote> <p>So I have the right tags and the right IFDs but I need to make an IFD tree instead of chain in libtiff. I have no idea how IFD trees work so up to the next specification!</p> <p>It seems like TIFF trees are defined in the Adobe PageMaker 6 tech notes from 1995. That document describes that the NextIFD tag that libtiff used for me is used primarily for defining multi-page documents, not multiple encodings of the same document like what happens here with a thumbnail and the raw data. You know this is a 1995 spec because it gives a fax as an example of a multi-page document.</p> <p>In the examples provided in that specification the first image is the main image and the NextIFD tag is just replaced by a subIFD tag. In case of DNG the main image is the thumbnail for compatibility with software that can't read the raw camera data.</p> <p>Switching over to a SubIFD tag is surprisingly simple, just badly documented. 
Libtiff will create the NextIFD tag automatically for you but if you create an empty SubIFD tag then libtiff will fill in the offset for the next IFD for you when closing the file:</p> <div class="highlight"><pre><span></span><span class="n">TIFF</span><span class="w"> </span><span class="o">*</span><span class="n">tif</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">TIFFOpen</span><span class="p">(</span><span class="n">path</span><span class="p">,</span><span class="w"> </span><span class="s">&quot;w&quot;</span><span class="p">);</span><span class="w"></span>
<span class="c1">// Set the tags for IFD 0 like normal here</span>
<span class="n">TIFFSetField</span><span class="p">(</span><span class="n">tif</span><span class="p">,</span><span class="w"> </span><span class="n">TIFFTAG_SUBFILETYPE</span><span class="p">,</span><span class="w"> </span><span class="n">DNG_SUBFILETYPE_THUMBNAIL</span><span class="p">);</span><span class="w"></span>
<span class="c1">// Create a NULL reference for one SubIFD</span>
<span class="kt">uint64_t</span><span class="w"> </span><span class="n">offsets</span><span class="p">[]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="p">{</span><span class="w"> </span><span class="mf">0L</span><span class="w"> </span><span class="p">};</span><span class="w"></span>
<span class="n">TIFFSetField</span><span class="p">(</span><span class="n">tif</span><span class="p">,</span><span class="w"> </span><span class="n">TIFFTAG_SUBIFD</span><span class="p">,</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w"> </span><span class="o">&amp;</span><span class="n">offsets</span><span class="p">);</span><span class="w"></span>
<span class="c1">// Write the thumbnail image data here</span>
<span class="c1">// Close the first IFD</span>
<span class="n">TIFFWriteDirectory</span><span class="p">(</span><span
class="n">tif</span><span class="p">);</span><span class="w"></span>
<span class="c1">// Start IFD1 describing the raw data</span>
<span class="n">TIFFSetField</span><span class="p">(</span><span class="n">tif</span><span class="p">,</span><span class="w"> </span><span class="n">TIFFTAG_SUBFILETYPE</span><span class="p">,</span><span class="w"> </span><span class="n">DNG_SUBFILETYPE_ORIGINAL</span><span class="p">);</span><span class="w"></span>
<span class="c1">// write raw data and close the directory again</span>
<span class="n">TIFFWriteDirectory</span><span class="p">(</span><span class="n">tif</span><span class="p">);</span><span class="w"></span>
<span class="c1">// Close the tiff, this will cause libtiff to patch up the references</span>
<span class="n">TIFFClose</span><span class="p">(</span><span class="n">tif</span><span class="p">);</span><span class="w"></span>
</pre></div> <p>So with the code updated the validation tool neatly shows the new SubIFD tags and finds actual errors in my DNG file data now:</p> <pre><code>Uses little-endian byte order
Magic number = 42
IFD 0: Offset = 308, Entries = 11
NewSubFileType: Preview Image
ImageWidth: 20
ImageLength: 15
BitsPerSample: 8
Compression: Uncompressed
PhotometricInterpretation: RGB
StripOffsets: Offset = 8
StripByteCounts: Count = 300
SubIFDs: IFD = 10054
DNGVersion: 1.4.0.0
UniqueCameraModel: &quot;LibDNG&quot;
NextIFD = 0
SubIFD 1: Offset = 10054, Entries = 6
NewSubFileType: Main Image
ImageWidth: 320
ImageLength: 240
Compression: Uncompressed
StripOffsets: Offset = 453
StripByteCounts: Count = 9600
NextIFD = 0
*** Error: Missing or invalid SamplesPerPixel (IFD 0) ***
*** Error: Missing or invalid PhotometricInterpretation (SubIFD 1) ***</code></pre> <p>Ah, so these two tags are actually required but not described as such in the DNG specification since these are TIFF tags instead of DNG tags (while it does explicitly list other required TIFF data).</p> <p>Patching up these errors is easy, just slightly 
annoying since the validation tool seemingly gives only a single error per IFD, which requires iterating on the code a bit more. After a whole lot of iterating on the exporting code I managed to get the first valid DNG file:</p> <pre><code>Raw image read time: 0.000 sec
Linearization time: 0.002 sec
Interpolate time: 0.006 sec
Validation complete</code></pre> <p>Now the next step is adding all the plumbing to make this usable as a library and making an actually nice command line utility.</p> <h2>First actual test</h2> <p>Now that I have written the first iterations of libmegapixels and libdng it should be possible to actually load a picture in some editing software. So let's try some end-to-end testing with this.</p> <p>With the <code>megapixels-getframe</code> utility from libmegapixels I can get a frame from the sensor (in this case the rear camera of the Librem 5) and then feed that raw data to the <code>makedng</code> utility from libdng.</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>getframe -o test.raw
<span class="go">Using config: /usr/share/megapixels/config/purism,librem5.conf</span>
<span class="go">received frame</span>
<span class="go">received frame</span>
<span class="go">received frame</span>
<span class="go">received frame</span>
<span class="go">received frame</span>
<span class="go">Stored frame to: test.raw</span>
<span class="go">Format: 4208x3120</span>
<span class="go">Pixfmt: GRBG</span>
<span class="gp">$ </span>makedng -w <span class="m">4208</span> -h <span class="m">3120</span> -p GRBG test.raw test.dng
<span class="go">Reading test.raw...</span>
<span class="go">Writing test.dng...</span>
</pre></div> <p>No errors and the file passes the DNG validation, let's load it into RawTherapee :)</p> <figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1700184535/image.png" class="kg-image"><figcaption>The first frame loaded into 
RawTherapee</figcaption></figure> <p>I had to boost the exposure a bit since the <code>megapixels-getframe</code> tool does not actually control any of the sensor parameters like the exposure time so the resulting picture is very dark. There's also no white balance or autofocus happening so the colors look horrible. </p> <p>But... </p> <figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1700184873/compare-checker.jpg" class="kg-image"></figure> <p>The colors are correct! The interpretation of the CFA pattern of the sensor and the orientation of the data is all correct.</p> <h2>Integration testing</h2> <p>The nice thing about having the separate library is that testing it becomes a lot easier than testing a GTK4 application. I have added the first simple end-to-end test to the codebase now that feeds some data to makedng and checks if the result is a valid DNG file using the official Adobe tool.</p> <div class="highlight"><pre><span></span><span class="ch">#!/bin/bash</span>
<span class="nb">set</span> -e

<span class="k">if</span> <span class="o">[</span> <span class="nv">$#</span> -ne <span class="m">1</span> <span class="o">]</span><span class="p">;</span> <span class="k">then</span>
    <span class="nb">echo</span> <span class="s2">&quot;Missing tool argument&quot;</span>
    <span class="nb">exit</span> <span class="m">1</span>
<span class="k">fi</span>
<span class="nv">makedng</span><span class="o">=</span><span class="s2">&quot;</span><span class="nv">$1</span><span class="s2">&quot;</span>
<span class="nb">echo</span> <span class="s2">&quot;Running tests with &#39;</span><span class="nv">$makedng</span><span class="s2">&#39;&quot;</span>

<span class="c1"># This testsuite runs raw data through the makedng utility and validates the</span>
<span class="c1"># result using the dng_validate tool from the Adobe DNG SDK. This tool needs</span>
<span class="c1"># to be manually installed for these tests to run.</span>

<span class="c1"># Create test raw data</span>
mkdir -p scratch
magick -size 1280x720 gradient: -colorspace RGB scratch/data.rgb

<span class="c1"># Generate DNG</span>
<span class="nv">$makedng</span> -w <span class="m">1280</span> -h <span class="m">720</span> -p RG10 scratch/data.rgb scratch/RG10.dng

<span class="c1"># Validate DNG</span>
dng_validate scratch/RG10.dng
</pre></div> <p>This is launched from ctest in my cmake files for now since I'm developing most of this stuff using CLion which only properly supports cmake projects. This is why a lot of my C projects have both meson and cmake files to build them but only the meson project file has install commands in it.</p> <p>For more advanced testing it would be neat to have raw sensor dumps of several sensors in different formats which are all pictures of a colorchecker like the picture above. Then have some (probably opencv) utility that can validate that a colorchecker is present in the picture with the right colors.</p> <p>There also needs to be a non-Adobe-proprietary validation tool that can be easily run as a testsuite for distribution packaging so at build time it's possible to validate that the combination of libdng and the distribution version of libtiff can produce sane output. This has caused several issues in Megapixels before after all.</p> <h2>Overall architecture</h2> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1700232871/path4862-1-4.png" class="kg-image"><figcaption>I&#x27;ve spent too much time drawing this</figcaption></figure> <p>With the addition of libdng the architecture for Megapixels 2.0 starts to look like this. 
Megapixels no longer has any pipeline manipulation code, that is all handled by the library which after configuration just passes the file descriptor for the sensor node to Megapixels to handle the realtime control of the sensor parameters.</p> <p>The libdng code replaces the plain libtiff exporting done in Megapixels and generates the DNG files that will be read by postprocessd. Postprocessd reads the dng files with the help of the dcraw library which already has custom DNG reading code that does not use libtiff.</p> <p>The next step now is to flesh out the library public interface for libdng so it can do all the DNG metadata that Megapixels requires and then to hook it up to Megapixels to actually use it.</p> <hr> <h3>Funding update</h3> <p>Since my <a href="https://blog.brixit.nl/adding-hardware-to-libmegapixels/">previous post</a> about the libmegapixels developments and the <a href="https://blog.brixit.nl/megapixels-2-0/">Megapixels 2.0 post</a> I wrote before that I've almost doubled the funding for actually working on all the FOSS contributions. I'm immensely thankful for all the new patrons and it also made me notice that the <a href="https://blog.brixit.nl/donations/">donations</a> page on this site was no longer being regenerated. That is fixed now.</p> <p>I'm also still trying to figure out if I can add some perks for patrons to all of this but practically all options just amount to making things slightly worse for non-patrons. I hope just making the FOSS ecosystem better one line of code at a time is enough :)</p> Adding hardware to libmegapixelshttps://blog.brixit.nl/adding-hardware-to-libmegapixels/88MegapixelsMartijn BraamMon, 13 Nov 2023 17:59:48 -0000<p>Since in the last post I only showed off the libmegapixels config format and made some claims about configurability without demonstrating it, 
I thought that it might be a good idea to actually demonstrate and document it.</p> <p>As an example device I will use my Xiaomi Mi Note 2 with a broken display, shown above, also known in postmarketOS under the codename <a href="https://wiki.postmarketos.org/wiki/Xiaomi_Mi_Note_2_(xiaomi-scorpio)">xiaomi-scorpio</a>. I picked this device as demo since I have already used this hardware in Megapixels 1.x so I know the kernel side of it is functional. I have not run any libmegapixels code on this device before writing this blogpost so I'm writing it as I go along debugging it. Hopefully this device does not require any ioctl that has not been needed by the existing supported devices.</p> <p>What makes it possible to get camera output from this phone is two things:</p> <ul><li>The camera subsystem in this device is supported pretty well in the kernel, in this case it&#x27;s a Qualcomm device which has a somewhat universal driver for this</li> <li>The sensor in this phone has a proper driver</li> </ul> <p>The existing devices that I used to develop libmegapixels are based around the Rockchip, NXP and Allwinner platforms so this will be an interesting test if my theory works.</p> <h2>The config file name</h2> <p>Just like Megapixels 1.x the config file is based around the "compatible" name of the device. This is defined in the device tree passed to Linux by the bootloader. Since this is a nice mainline Linux device this info can be found in the kernel source: <a href="https://github.com/torvalds/linux/blob/b85ea95d086471afb4ad062012a4d73cd328fa86/arch/arm64/boot/dts/qcom/msm8996pro-xiaomi-scorpio.dts#L17">https://github.com/torvalds/linux/blob/b85ea95d086471afb4ad062012a4d73cd328fa86/arch/arm64/boot/dts/qcom/msm8996pro-xiaomi-scorpio.dts#L17</a></p> <pre><code>compatible = &quot;xiaomi,scorpio&quot;, &quot;qcom,msm8996pro&quot;, &quot;qcom,msm8996&quot;;</code></pre> <p>This device tree specifies three names for this device ranking from more specific to less specific. 
<code>xiaomi,scorpio</code> is the exact hardware name, <code>qcom,msm8996pro</code> is the variant of the SoC and the <code>qcom,msm8996</code> name is the inexact name of the SoC. Since this configuration defines both the SoC pipeline and the configuration for the specific sensor module the only sane option here is <code>xiaomi,scorpio</code> since that describes that exact hardware configuration. Other <code>msm8996</code> devices might be using a completely different sensor.</p> <p>The most specific option is not always the best option, in the case of the PinePhone for example the compatible is:</p> <pre><code>&quot;pine64,pinephone-1.1&quot;, &quot;pine64,pinephone&quot;, &quot;allwinner,sun50i-a64&quot;;</code></pre> <p>In this hardware the camera system for the 1.0, 1.1 and 1.2 revisions is identical so the config file just uses the <code>pine64,pinephone</code> name.</p> <p>Knowing this the config file name will be <code>xiaomi,scorpio.conf</code> and can be placed in three locations: <code>/usr/share/megapixels/config</code>, <code>/etc/megapixels/config</code> and just the plain filename in your current working directory.</p> <p>Now that we know what the config path is the hard part starts: figuring out what to put in this config file.</p> <h2>The media pipeline</h2> <p>The next step is figuring out the media pipeline for this device. If the kernel has support for the hardware in the device it should create one or more <code>/dev/media</code> files. In the case of the Scorpio there's only a single one for the camera pipeline but there might be additional ones for stuff like hardware accelerated video encoding or decoding. </p> <p>You can get the contents of the media pipelines with the <code>media-ctl</code> utility from <code>v4l-utils</code>. Use <code>media-ctl -p</code> to print the pipeline and you can use the <code>-d</code> option to choose another file than <code>/dev/media0</code> if needed. 
For the Scorpio the pipeline contents are:</p> <pre><code>Media controller API version 6.1.14

Media device information
------------------------
driver          qcom-camss
model           Qualcomm Camera Subsystem
serial
bus info        platform:a34000.camss
hw revision     0x0
driver version  6.1.14

Device topology
- entity 1: msm_csiphy0 (2 pads, 5 links)
            type V4L2 subdev subtype Unknown flags 0
            device node name /dev/v4l-subdev0
        pad0: Sink
                [fmt:UYVY8_2X8/1920x1080 field:none colorspace:srgb]
                &lt;- &quot;imx318 3-001a&quot;:0 [ENABLED,IMMUTABLE]
        pad1: Source
                [fmt:UYVY8_2X8/1920x1080 field:none colorspace:srgb]
                -&gt; &quot;msm_csid0&quot;:0 []
                -&gt; &quot;msm_csid1&quot;:0 []
                -&gt; &quot;msm_csid2&quot;:0 []
                -&gt; &quot;msm_csid3&quot;:0 []

[ Removed A LOT of entities here for brevity ]

- entity 226: imx318 3-001a (1 pad, 1 link)
              type V4L2 subdev subtype Sensor flags 0
              device node name /dev/v4l-subdev19
        pad0: Source
                [fmt:SRGGB10_1X10/5488x4112@1/30 field:none colorspace:raw xfer:none]
                -&gt; &quot;msm_csiphy0&quot;:0 [ENABLED,IMMUTABLE]

- entity 228: ak7375 3-000c (0 pad, 0 link)
              type V4L2 subdev subtype Lens flags 0
              device node name /dev/v4l-subdev20
</code></pre> <p>The header shows that this is a media device for the <code>qcom-camss</code> system, which handles cameras on Qualcomm devices. 
There is also a node for the <code>imx318</code> sensor which further confirms that this is the right media pipeline.</p> <p>Analyzing the pipeline in this format is pretty hard when there are more than two nodes though, which is why there is a neat option in media-ctl to output the media graph as an actual graph using Graphviz.</p> <pre><code>$ apk add graphviz
$ media-ctl -d 0 --print-dot | dot -Tpng &gt; pipeline.png</code></pre> <p>Which produces this image:</p> <figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1699888898/pipeline.png" class="kg-image"></figure> <p>In a bunch of cases you can copy most of the configuration of this graph from another device that uses the same SoC but since this is the first Qualcomm device I'm adding I have to figure out the whole pipeline.</p> <p>The only part that's really specific to the Xiaomi Scorpio is the top two nodes. The <code>imx318</code> is the actual camera module in the phone, connected over MIPI to the SoC. The <code>ak7375</code> is listed as a "Motor driver". This means that it is the chip handling the lens movements for autofocus. There are no connections to this node since this device does not handle any graphical data, the entity only exists so you can set v4l control values on it to move the focus manually.</p> <p>All the boxes in the graph are called entities and correspond with the <code>entity</code> blocks in the <code>media-ctl -p</code> output. The boxes are yellow if they are entities with the type <code>V4L</code>, these are the nodes that will show up as <code>/dev/video</code> nodes to actually get the image data out of this pipeline.</p> <p>The lines between the boxes are called links, the dotted lines are disabled links and solid lines are enabled links. On this hardware a lot of the links are created by the kernel driver and are hardcoded. 
These links show up in the text output as <code>IMMUTABLE</code> and mostly describe fixed hardware paths for the image data.</p> <p>The goal of configuring this pipeline is to get the image data from the IMX sensor all the way down to one of the /dev/video nodes and figuring out the purpose of the entities in between. If you are lucky there is actual documentation for this. In this case I have found documentation at <a href="https://www.kernel.org/doc/html/v4.14/media/v4l-drivers/qcom_camss.html">https://www.kernel.org/doc/html/v4.14/media/v4l-drivers/qcom_camss.html</a> which is for the v4.14 kernel but for some reason is removed on later releases.</p> <p>This documentation has neat explanations for these entities:</p> <ul><li>2 CSIPHY modules. They handle the Physical layer of the CSI2 receivers. A separate camera sensor can be connected to each of the CSIPHY module;</li> <li>2 CSID (CSI Decoder) modules. They handle the Protocol and Application layer of the CSI2 receivers. A CSID can decode data stream from any of the CSIPHY. Each CSID also contains a TG (Test Generator) block which can generate artificial input data for test purposes;</li> <li>ISPIF (ISP Interface) module. Handles the routing of the data streams from the CSIDs to the inputs of the VFE;</li> <li>VFE (Video Front End) module. Contains a pipeline of image processing hardware blocks. The VFE has different input interfaces. The PIX (Pixel) input interface feeds the input data to the image processing pipeline. The image processing pipeline contains also a scale and crop module at the end. Three RDI (Raw Dump Interface) input interfaces bypass the image processing pipeline. 
The VFE also contains the AXI bus interface which writes the output data to memory.</li> </ul> <p>This documentation is not for this exact SoC so the number of entities of each type is different.</p> <p>Configuring the pipeline and connecting it all up is now just a lot of trial and error. In the case of the Scorpio it has already been trial-and-error'd so there is an existing config file for the old Megapixels at <a href="https://gitlab.com/postmarketOS/megapixels/-/blob/master/config/xiaomi,scorpio.ini?ref_type=heads">https://gitlab.com/postmarketOS/megapixels/-/blob/master/config/xiaomi,scorpio.ini</a></p> <p>In this old pipeline description format the path is just enabling the links between the first <code>csiphy</code>, <code>csid</code>, <code>ispif</code> and <code>vfe</code> entities. Since this release of Megapixels did not really support further configuration it just tried to then set the resolution and pixel format for the sensors on all entities after it and hoped it worked. On an unknown platform just picking the left-most path will pretty likely bring up a valid pipeline, the duplicated entities are mostly useful for cases where you are using multiple cameras at once.</p> <h2>Initial config file</h2> <p>The first thing I did was to create a minimal config file for the Scorpio that had the minimal pipeline to stream unmodified data from the sensor to userspace.</p> <pre><code>Version = 1; Make: &quot;Xiaomi&quot;; Model: &quot;Scorpio&quot;; Rear: { SensorDriver: &quot;imx318&quot;; BridgeDriver: &quot;qcom-camss&quot;; Modes: ( { Width: 3840; Height: 2160; Rate: 30; Format: &quot;RGGB10&quot;; Rotate: 90; Pipeline: ( {Type: &quot;Link&quot;, From: &quot;imx318&quot;, FromPad: 0, To: &quot;msm_csiphy0&quot;, ToPad: 0}, {Type: &quot;Link&quot;, From: &quot;msm_csiphy0&quot;, FromPad: 1, To: &quot;msm_csid0&quot;, ToPad: 0}, {Type: &quot;Link&quot;, From: &quot;msm_csid0&quot;, FromPad: 1, To: &quot;msm_ispif0&quot;, ToPad: 0}, {Type: &quot;Link&quot;, From: 
&quot;msm_ispif0&quot;, FromPad: 1, To: &quot;msm_vfe0_rdi0&quot;, ToPad: 0}, {Type: &quot;Mode&quot;, Entity: &quot;imx318&quot;}, {Type: &quot;Mode&quot;, Entity: &quot;msm_csiphy0&quot;}, {Type: &quot;Mode&quot;, Entity: &quot;msm_csid0&quot;}, {Type: &quot;Mode&quot;, Entity: &quot;msm_ispif0&quot;}, ); }, ); }; </code></pre> <p>This can be tested with the <code>megapixels-getframe</code> command.</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>./megapixels-getframe <span class="go">Using config: /etc/megapixels/config/xiaomi,scorpio.conf</span> <span class="go">[libmegapixels] Could not link 226 -&gt; 1 [imx318 -&gt; msm_csiphy0] </span> <span class="go">[libmegapixels] Capture driver changed pixfmt to UYVY</span> <span class="go">Could not select mode</span> </pre></div> <p>This command tries to output as much debugging info as possible, but the reality is that you'll most likely need to look at the kernel source to figure out what is happening and what arbitrary constraints exist.</p> <p>So the iterating and figuring out errors starts. First the most problematic line is the <code>UYVY</code> format one. This most likely means that the pipeline pixelformat I selected was not correct and to fix that the kernel helpfully selects a completely different one. <code>getframe</code> will detect this and show this happening. In this case the RGGB10 format is wrong and it should have been RGGB10p. The kernel implementation is a bit inconsistent about which format it actually is while MIPI only allows one of these two in the spec. Changing that removes that error.</p> <p>The other interesting error is the link that could not be created. If you look closely at the Graphviz output you'll see that this link is already enabled by the kernel and in the text output it is also <code>IMMUTABLE</code>. 
This config line can be dropped because this is not configurable.</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>./megapixels-getframe <span class="go">Using config: /etc/megapixels/config/xiaomi,scorpio.conf</span> <span class="go">VIDIOC_STREAMON failed: Broken pipe</span> </pre></div> <p>Progress! At least somewhat. The mode setting commands succeed but now the pipeline can not actually be started. This is because some drivers only validate options when starting the pipeline instead of when you're actually setting modes. This is one of the most annoying errors to fix because there's no feedback whatsoever on <i>what</i> or <i>where</i> the config issue is.</p> <p>My suggestion for this is to first run <code>media-ctl -p</code> again and see the current state of the pipeline. This output shows the format for the pads of the pipeline so you can find a connection that might be invalid by comparing those. My pipeline state at this point is:</p> <ul><li><code>imx318</code>: <code>SRGGB10_1X10/3840x2160@1/30</code></li> <li><code>csiphy0</code>: <code>SRGGB10_1X10/3840x2160</code></li> <li><code>csid0</code>: <code>SRGGB10_1X10/3840x2160</code></li> <li><code>ispif0</code>: <code>SRGGB10_1X10/3840x2160</code></li> <li><code>vfe0_rdi0</code>: <code>UYVY8_2X8/1920x1080</code></li> </ul> <p>AHA! the last node is not configured correctly. It's always the last one you look at. It turns out the issue was that I'm simply missing a mode command in my config file that sets the mode on that entity so it's left at the pipeline defaults. 
Let's test the pipeline with that config added:</p> <div class="highlight"><pre><span></span><span class="gp">$ </span>./megapixels-getframe <span class="go">Using config: /etc/megapixels/config/xiaomi,scorpio.conf</span> <span class="go">received frame</span> <span class="go">received frame</span> <span class="go">received frame</span> <span class="go">received frame</span> <span class="go">received frame</span> </pre></div> <p>The pipeline is streaming! This is the bare minimum configuration needed to make Megapixels 2.0 use this camera. For reference, after all the changes above the config file is:</p> <pre><code>Version = 1; Make: &quot;Xiaomi&quot;; Model: &quot;Scorpio&quot;; Rear: { SensorDriver: &quot;imx318&quot;; BridgeDriver: &quot;qcom-camss&quot;; Modes: ( { Width: 3840; Height: 2160; Rate: 30; Format: &quot;RGGB10p&quot;; Rotate: 90; Pipeline: ( {Type: &quot;Link&quot;, From: &quot;msm_csiphy0&quot;, FromPad: 1, To: &quot;msm_csid0&quot;, ToPad: 0}, {Type: &quot;Link&quot;, From: &quot;msm_csid0&quot;, FromPad: 1, To: &quot;msm_ispif0&quot;, ToPad: 0}, {Type: &quot;Link&quot;, From: &quot;msm_ispif0&quot;, FromPad: 1, To: &quot;msm_vfe0_rdi0&quot;, ToPad: 0}, {Type: &quot;Mode&quot;, Entity: &quot;imx318&quot;}, {Type: &quot;Mode&quot;, Entity: &quot;msm_csiphy0&quot;}, {Type: &quot;Mode&quot;, Entity: &quot;msm_csid0&quot;}, {Type: &quot;Mode&quot;, Entity: &quot;msm_ispif0&quot;}, {Type: &quot;Mode&quot;, Entity: &quot;msm_vfe0_rdi0&quot;}, ); }, ); };</code></pre> <h2>Camera metadata</h2> <p>The config file not only stores information about the media pipeline but can also store information about the optical path. Every mode can define the focal length for example because changing the cropping on the sensor will give you digital zoom and thus a longer focal length. 
With modern phones with 10 cameras on the back it is also possible to define all of them as the "rear" camera and have multiple modes with multiple focal lengths so camera apps can switch the pipeline for zooming once zooming is implemented in the UI.</p> <p>Finding out the values for this optical path is basically just using search engines to find datasheets and specs. Sometimes the pictures generated by Android have the correct information for this in the metadata as well.</p> <p>This information is also mostly absent from sensor datasheets since those only describe the sensor itself. You either need to find this info from the camera module itself (which is the sensor plus the lens) or the specifications for the phone.</p> <p>From spec listings and review sites I've found that the focal length for the rear camera is 4.06mm and the aperture is f/2.0. This can be added to the mode section:</p> <pre><code>Width: 3840; Height: 2160; Rate: 30; Format: &quot;RGGB10p&quot;; Rotate: 90; FocalLength: 4.06; FNumber: 2.0;</code></pre> <h2>Reference for pipeline commands</h2> <p>Since this is now practically the main reference for writing config files until I get documentation generation up and running for libmegapixels I will put the complete documentation for the various commands here.</p> <p>While parsing the config file there are four values stored as state: <code>width</code>, <code>height</code>, <code>format</code> and <code>rate</code>. The values for these default to the ones set in the mode and they are updated whenever you define one of these values explicitly in a command. 
This prevents having to write the same resolution values repeatedly on every line but it still allows having entities in the pipeline that scale the resolution.</p> <h3>Link</h3> <pre><code>{
    Type: &quot;Link&quot;,
    From: &quot;msm_csiphy0&quot;, # Source entity name, required
    FromPad: 1, # Source pad, defaults to 0
    To: &quot;msm_csid0&quot;, # Target entity name, required
    ToPad: 0 # Target pad, defaults to 0
}</code></pre> <p>Translates to a <code>MEDIA_IOC_SETUP_LINK</code> ioctl on the media device.</p> <h3>Mode</h3> <pre><code>{
    Type: &quot;Mode&quot;,
    Entity: &quot;imx318&quot;, # Entity name, required
    Width: 1280, # Horizontal resolution, defaults to previous in pipeline
    Height: 720, # Vertical resolution, defaults to previous in pipeline
    Pad: 0, # Pad to set the mode on, defaults to 0
    Format: &quot;RGGB10p&quot; # Pixelformat for the mode, defaults to previous in pipeline
}</code></pre> <p>Translates to a <code>VIDIOC_SUBDEV_S_FMT</code> ioctl on the entity.</p> <h3>Rate</h3> <pre><code>{
    Type: &quot;Rate&quot;,
    Entity: &quot;imx318&quot;, # Entity name, required
    Rate: 30 # FPS, defaults to previous in pipeline
}</code></pre> <p>Translates to a <code>VIDIOC_SUBDEV_S_FRAME_INTERVAL</code> ioctl on the entity.</p> <h3>Crop</h3> <pre><code>{
    Type: &quot;Crop&quot;,
    Entity: &quot;imx318&quot;, # Entity name, required
    Width: 1280, # Area width, defaults to previous width in pipeline
    Height: 720, # Area height, defaults to previous height in pipeline
    Top: 0, # The vertical offset, defaults to 0
    Left: 0, # The horizontal offset, defaults to 0
    Pad: 0 # Pad to set the crop on, defaults to 0
}</code></pre> <p>Translates to a <code>VIDIOC_SUBDEV_S_CROP</code> ioctl on the entity.</p> <h2>The future of libmegapixels</h2> <p>It has been quite a bit of work to create libmegapixels and it has been a mountain of work to rework Megapixels to integrate it. The first 90% of this is done but the trick is always in getting the second 90% finished. 
In the <a href="https://blog.brixit.nl/megapixels-2-0/">Megapixels 2.0</a> post I already mentioned this has burned me out. On the other hand it's a shame to let this work go to waste.</p> <p>There are a few parts of autofocus, autoexposure and autowhitebalance that are very complicated and math heavy, and I can't figure them out. The loop between libmegapixels and Megapixels exists to pass around the values but I can't stop the system from oscillating and can't get it to settle on good values. There seems to be no good public information available on how to implement this in any case.</p> <p>Another difficult part is sensor calibration. I have the hardware and software to create calibration profiles but this system expects the input pictures to come from... working cameras. The system completely lacks proper sensor linearisation which makes setting a proper whitebalance not really possible. You might have noticed the specific teal tint that gives away that a picture is taken on a Librem 5 for example. If that teal tint is corrected for manually then the midtones will look correct but highlights will become too yellow. Maybe there's a way to calibrate this properly or maybe this just takes someone messing with the curves manually for a long while to get correct.</p> <p>There also needs to be an alternative to writing DNG files with libtiff so for my own sanity it is required to write libdng. The last few minor releases of libtiff have all been messing with the tiff tags relating to DNG files which has caused taking pictures to not work for a lot of people. The only way around this seems to be to stop using libtiff like all the Linux photography software has already done. This is not a terribly hard thing to implement, it just has been prioritized below getting color correct so far and I have not had the time to work on it.</p> <p>There are also still segfaults and crashes relating to the GPU debayer code in Megapixels for most of the pixel formats. 
This is very hard to debug due to the involvement of the GPU in the equation.</p> <h3>How can you help</h3> <p>If you know how to progress with any of this I gladly accept any patches for this to push it forward.</p> <p>The harder part of this section is... money. I love working on photography stuff, I can't believe the Megapixels implementation has even gotten this far but it basically takes me hyperfocusing for weeks for 12 hours per day on random camera code to get to this point, and that is not really sustainable. It's great to work on this for some days and make progress, it's really painful to work for weeks on that one 30 line code block and make no progress whatsoever. At some point my dream is that I can actually live off doing open source work but so far that has still been a distant dream.</p> <p>I've had the <a href="https://blog.brixit.nl/donations/">donations</a> page now for some years and I'm incredibly happy that people are supporting me to work on this at all. It's just forever stuck on receiving enough money that you feel a responsibility to produce progress but not nearly enough to actually fund that progress. So in practice only extra pressure.</p> <p>So I hate asking for money, but it would certainly help towards the dream of being an actual full time FOSS developer :)</p> Megapixels 2.0https://blog.brixit.nl/megapixels-2-0/87LinuxMartijn BraamThu, 09 Nov 2023 18:33:39 -0000<p>The Megapixels camera application has long been the most performant camera application on the original PinePhone. I have not gotten the Megapixels application to that point alone. There have been several other contributors that have helped slowly improve the performance and features of this application. Especially Benjamin has leaped it forward massively with the threaded processing code and GPU accelerated preview.</p> <p>All this code has made Megapixels very fast on the PinePhone but also has made it quite a lot harder to port the application to other hardware. 
The code is very much overfitted for the PinePhone hardware.</p> <h2>Finding a better design</h2> <p>To address the elephant in the room, yes libcamera exists and promises to abstract this all away. I just disagree with the design tradeoffs taken with libcamera and I think that any competition would only improve the ecosystem. It can't be that libcamera got this exactly right on the first try, right?</p> <p>Instead of libcamera's approach of writing abstraction code in C++ for every platform I have decided to pick the method that libalsa uses for the audio abstraction in userspace.</p> <p>Alsa UCM config files are selected by soundcard name and contain a set of instructions to bring the audio pipeline into the correct state for your current usecase. All the hardware specific things are not described in code but instead in plain text configuration files. I think this scales way better since it massively lowers the skill floor needed to actually mess with the system to get hardware working.</p> <p>The first iteration of Megapixels has already somewhat done this. There's a config file that is picked based on the hardware model that describes the names of the device nodes in /dev so those paths don't have to be hardcoded and it describes the resolution and mode to configure. 
It also describes a few details about the optical path to later produce correct EXIF info for the pictures.</p> <div class="highlight"><pre><span></span><span class="k">[device]</span><span class="w"></span> <span class="na">make</span><span class="o">=</span><span class="s">PINE64</span><span class="w"></span> <span class="na">model</span><span class="o">=</span><span class="s">PinePhone</span><span class="w"></span> <span class="k">[rear]</span><span class="w"></span> <span class="na">driver</span><span class="o">=</span><span class="s">ov5640</span><span class="w"></span> <span class="na">media-driver</span><span class="o">=</span><span class="s">sun6i-csi</span><span class="w"></span> <span class="na">capture-width</span><span class="o">=</span><span class="s">2592</span><span class="w"></span> <span class="na">capture-height</span><span class="o">=</span><span class="s">1944</span><span class="w"></span> <span class="na">capture-rate</span><span class="o">=</span><span class="s">15</span><span class="w"></span> <span class="na">capture-fmt</span><span class="o">=</span><span class="s">BGGR8</span><span class="w"></span> <span class="na">preview-width</span><span class="o">=</span><span class="s">1280</span><span class="w"></span> <span class="na">preview-height</span><span class="o">=</span><span class="s">720</span><span class="w"></span> <span class="na">preview-rate</span><span class="o">=</span><span class="s">30</span><span class="w"></span> <span class="na">preview-fmt</span><span class="o">=</span><span class="s">BGGR8</span><span class="w"></span> <span class="na">rotate</span><span class="o">=</span><span class="s">270</span><span class="w"></span> <span class="na">colormatrix</span><span class="o">=</span><span class="s">1.384,-0.3203,-0.0124,-0.2728,1.049,0.1556,-0.0506,0.2577,0.8050</span><span class="w"></span> <span class="na">forwardmatrix</span><span class="o">=</span><span 
class="s">0.7331,0.1294,0.1018,0.3039,0.6698,0.0263,0.0002,0.0556,0.7693</span><span class="w"></span> <span class="na">blacklevel</span><span class="o">=</span><span class="s">3</span><span class="w"></span> <span class="na">whitelevel</span><span class="o">=</span><span class="s">255</span><span class="w"></span> <span class="na">focallength</span><span class="o">=</span><span class="s">3.33</span><span class="w"></span> <span class="na">cropfactor</span><span class="o">=</span><span class="s">10.81</span><span class="w"></span> <span class="na">fnumber</span><span class="o">=</span><span class="s">3.0</span><span class="w"></span> <span class="na">iso-min</span><span class="o">=</span><span class="s">100</span><span class="w"></span> <span class="na">iso-max</span><span class="o">=</span><span class="s">64000</span><span class="w"></span> <span class="na">flash-path</span><span class="o">=</span><span class="s">/sys/class/leds/white:flash</span><span class="w"></span> <span class="k">[front]</span><span class="w"></span> <span class="na">...</span><span class="w"></span> </pre></div> <p>This works great for the PinePhone but it has a significant drawback. Most mobile cameras require an elaborate graph of media nodes to be configured before video works. The PinePhone is the exception in that the media graph only has an input and output node so Megapixels just hardcodes that part of the hardware setup. This makes the config file practically useless for all other phones and this is also one of the reasons why different devices have different forks to make Megapixels work.</p> <p>So a config file that only works for a single configuration is pretty useless. Instead of making this an .ini file I've switched the design over to libconfig so I don't have to create a whole new parser and it allows for nested configuration blocks. 
The config file I have been using on the PinePhone with the new codebase is this:</p> <div class="highlight"><pre><span></span><span class="k">Version</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="m">1</span><span class="err">;</span><span class="w"></span> <span class="k">Make</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;PINE64&quot;</span><span class="err">;</span><span class="w"></span> <span class="k">Model</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;PinePhone&quot;</span><span class="err">;</span><span class="w"></span> <span class="k">Rear</span><span class="err">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span> <span class="w"> </span><span class="k">SensorDriver</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;ov5640&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">BridgeDriver</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;sun6i-csi&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">FlashPath</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;/sys/class/leds/white:flash&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">IsoMin</span><span class="err">:</span><span class="w"> </span><span class="m">100</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">IsoMax</span><span class="err">:</span><span class="w"> </span><span class="m">64000</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Modes</span><span class="err">:</span><span class="w"> </span><span class="p">(</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="w"></span> <span class="w"> 
</span><span class="k">Width</span><span class="err">:</span><span class="w"> </span><span class="m">2592</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Height</span><span class="err">:</span><span class="w"> </span><span class="m">1944</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Rate</span><span class="err">:</span><span class="w"> </span><span class="m">15</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Format</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;BGGR8&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Rotate</span><span class="err">:</span><span class="w"> </span><span class="m">270</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">FocalLength</span><span class="err">:</span><span class="w"> </span><span class="m">3</span><span class="k">.</span><span class="m">33</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">FNumber</span><span class="err">:</span><span class="w"> </span><span class="m">3</span><span class="k">.</span><span class="m">0</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Pipeline</span><span class="err">:</span><span class="w"> </span><span class="p">(</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Link&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">From</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;ov5640&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">FromPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span 
class="p">,</span><span class="w"> </span><span class="k">To</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;sun6i-csi&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">ToPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;ov5640&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Width</span><span class="err">:</span><span class="w"> </span><span class="m">2592</span><span class="p">,</span><span class="w"> </span><span class="k">Height</span><span class="err">:</span><span class="w"> </span><span class="m">1944</span><span class="p">,</span><span class="w"> </span><span class="k">Format</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;BGGR8&quot;</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">)</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="w"></span> <span class="w"> </span><span class="k">Width</span><span class="err">:</span><span class="w"> </span><span class="m">1280</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Height</span><span class="err">:</span><span class="w"> </span><span class="m">720</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Rate</span><span class="err">:</span><span class="w"> </span><span class="m">30</span><span class="err">;</span><span class="w"></span> 
<span class="w"> </span><span class="k">Format</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;BGGR8&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Rotate</span><span class="err">:</span><span class="w"> </span><span class="m">270</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">FocalLength</span><span class="err">:</span><span class="w"> </span><span class="m">3</span><span class="k">.</span><span class="m">33</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">FNumber</span><span class="err">:</span><span class="w"> </span><span class="m">3</span><span class="k">.</span><span class="m">0</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Pipeline</span><span class="err">:</span><span class="w"> </span><span class="p">(</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Link&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">From</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;ov5640&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">FromPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="k">To</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;sun6i-csi&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">ToPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span 
class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;ov5640&quot;</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">)</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="p">}</span><span class="w"></span> <span class="w"> </span><span class="p">)</span><span class="err">;</span><span class="w"></span> <span class="p">}</span><span class="err">;</span><span class="w"></span> <span class="k">Front</span><span class="err">:</span><span class="w"> </span><span class="p">{</span><span class="w"></span> <span class="w"> </span><span class="k">SensorDriver</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;gc2145&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">BridgeDriver</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;sun6i-csi&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">FlashDisplay</span><span class="err">:</span><span class="w"> </span><span class="k">true</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Modes</span><span class="err">:</span><span class="w"> </span><span class="p">(</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="w"></span> <span class="w"> </span><span class="k">Width</span><span class="err">:</span><span class="w"> </span><span class="m">1280</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Height</span><span class="err">:</span><span class="w"> </span><span class="m">960</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Rate</span><span class="err">:</span><span class="w"> </span><span 
class="m">60</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Format</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;BGGR8&quot;</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Rotate</span><span class="err">:</span><span class="w"> </span><span class="m">90</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Mirror</span><span class="err">:</span><span class="w"> </span><span class="k">true</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="k">Pipeline</span><span class="err">:</span><span class="w"> </span><span class="p">(</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Link&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">From</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;gc2145&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">FromPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="k">To</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;sun6i-csi&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">ToPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;gc2145&quot;</span><span class="p">},</span><span 
class="w"></span> <span class="w"> </span><span class="p">)</span><span class="err">;</span><span class="w"></span> <span class="w"> </span><span class="p">}</span><span class="w"></span> <span class="w"> </span><span class="p">)</span><span class="err">;</span><span class="w"></span> </pre></div> <p>Instead of having a hardcoded preview mode and main mode for every sensor it's now possible to make many different resolution configs. This config recreates the 2 existing modes, and Megapixels now automatically picks the faster mode for the preview and uses the higher resolution modes for the actual picture. </p> <p>Every mode now also has a <code>Pipeline</code> block that describes the media graph as a series of commands. Every line translates to one ioctl called on the right device node, just like ALSA UCM files describe the audio setup as a series of amixer commands. Megapixels no longer has the implicit PinePhone pipeline so here it describes the one link it has to make between the sensor node and the csi node and it tells Megapixels to set the correct mode on the sensor node.</p> <p>This simple example of the PinePhone does not really show off most of the config options so let's look at a more complicated example:</p> <div class="highlight"><pre><span></span><span class="k">Pipeline</span><span class="err">:</span><span class="w"> </span><span class="p">(</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Link&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">From</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;imx258&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">FromPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span class="p">,</span><span class="w"> </span><span class="k">To</span><span class="err">:</span><span class="w"> </span><span 
class="s2">&quot;rkisp1_csi&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">ToPad</span><span class="err">:</span><span class="w"> </span><span class="m">0</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;imx258&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Format</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;RGGB10P&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Width</span><span class="err">:</span><span class="w"> </span><span class="m">1048</span><span class="p">,</span><span class="w"> </span><span class="k">Height</span><span class="err">:</span><span class="w"> </span><span class="m">780</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_csi&quot;</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_isp&quot;</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> 
</span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_isp&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Pad</span><span class="err">:</span><span class="w"> </span><span class="m">2</span><span class="p">,</span><span class="w"> </span><span class="k">Format</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;RGGB8&quot;</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Crop&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_isp&quot;</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Crop&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_isp&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Pad</span><span class="err">:</span><span class="w"> </span><span class="m">2</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_resizer_mainpath&quot;</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span 
class="err">:</span><span class="w"> </span><span class="s2">&quot;Mode&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_resizer_mainpath&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Pad</span><span class="err">:</span><span class="w"> </span><span class="m">1</span><span class="p">},</span><span class="w"></span> <span class="w"> </span><span class="p">{</span><span class="k">Type</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;Crop&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Entity</span><span class="err">:</span><span class="w"> </span><span class="s2">&quot;rkisp1_resizer_mainpath&quot;</span><span class="p">,</span><span class="w"> </span><span class="k">Width</span><span class="err">:</span><span class="w"> </span><span class="m">1048</span><span class="p">,</span><span class="w"> </span><span class="k">Height</span><span class="err">:</span><span class="w"> </span><span class="m">768</span><span class="p">},</span><span class="w"></span> <span class="p">)</span><span class="err">;</span><span class="w"></span> </pre></div> <p>This is the preview pipeline for the PinePhone Pro. Most of the Links are already hardcoded by the kernel itself so here it only creates the link from the rear camera sensor to the csi and all the other commands are for configuring the various entities in the graph.</p> <p>The <code>Mode</code> commands are basically doing the <code>VIDIOC_SUBDEV_S_FMT</code> ioctl on the device node found by the entity name. To make configuring modes on the pipeline not extremely verbose it implicitly takes the resolution, pixelformat and framerate from the main information set by the configuration block itself. 
Since several entities can convert the frames into another format or size it automatically cascades the new mode to the lines below it.</p> <p>In the example above the 5th command sets the format to <code>RGGB8</code> which means that the mode commands below it for <code>rkisp1_resizer_mainpath</code> also will use this mode but the <code>rkisp1_csi</code> mode command above it will still be operating in <code>RGGB10P</code> mode.</p> <h2>Splitting off device management code</h2> <p>Testing changes in Megapixels is pretty hard. To develop the Megapixels code I'm building it on the phone and launching it over SSH with a bunch of environment variables set so the GTK window shows up on the phone and I get realtime logs on my computer. If there's anything that's going on after the immediate setup code it is quite hard to debug because it's in one of the three threads that process the image data.</p> <p>I implemented the new pipeline configuration in a new empty project that builds a shared library and a few command line utilities that help test a few specific things. This codebase is <code>libmegapixels</code> and with it I have split off all hardware access from Megapixels itself, making both these codebases a lot easier to understand.</p> <p>It has been a lot easier to debug complex camera pipelines using the command line utilities and only working on the library code. It should also make it a lot easier to make Megapixels-like applications that don't use GTK4, so they integrate better with other environments. 
One of the test applications for libmegapixels is <code>getframe</code> which is now all you need to get a raw frame from the sensor.</p> <p>Since this codebase is now split into multiple parts I have put it into a separate GitLab organisation at <a href="https://gitlab.com/megapixels-org">https://gitlab.com/megapixels-org</a> which hopefully keeps this a bit organized.</p> <p>This is also the codebase used for <a href="https://fosstodon.org/@martijnbraam/110775163438234897">https://fosstodon.org/@martijnbraam/110775163438234897</a> which shows off libmegapixels and Megapixels 2.0 running on the Librem 5.</p> <h2>Burnout</h2> <p>So now the worst part of this blog post. No, you can't use this stuff yet :(</p> <p>I've been working on this code for months, and now I've not been working on this code for months. I have completely burned out on all of this.</p> <p>The libmegapixels code is in a pretty good state but the Megapixels rewrite is still a large mess:</p> <ul><li>Saving pictures doesn&#x27;t really work and I intended to split that off to create libdng</li> <li>The QR code support is not hooked up at all at the moment</li> <li>Several pixelformats don&#x27;t work correctly in the GPU decoder and I can&#x27;t find out why</li> <li>Librem 5 and PinePhone Pro really need auto-exposure, auto-focus and auto-whitebalance to produce anything remotely looking like a picture. I have ported the auto-exposure from Millipixels which works reasonably well for this but got stuck on AWB and have not attempted autofocus yet.</li> </ul> <p>The mountain of work that's left to do to make this a superset of the functionality of Megapixels 1.x and the expectations surrounding it have made this pretty hard to work on. 
On the original Megapixels releases nothing mattered because any application that could show a single frame of the camera was already a 100% improvement over the current state.</p> <p>Another issue is that whatever I do or figure out it will always instantly be put down with "Why are you not using libcamera" and "libcamera probably fixes this". </p> <p>One thing people really need to understand is that an application not using libcamera does <i>not</i> mean other software on the system can't support libcamera. If Firefox can use libcamera to do videocalls that's great, but that's not the usecase Megapixels is going for anyway.</p> <p>What also doesn't help is receiving bugreports for the PinePhone Pro while Megapixels does not support the PinePhone Pro. There's a patchset added on top to make it launch on the PinePhone Pro but there's a reason this patchset is not in Megapixels. The product of the Megapixels source with the ppp.patch added on top probably shouldn't've been distributed as Megapixels...</p> <p>What also doesn't help is that if Megapixels 2.0 were finished and released it would also create a whole new wave of criticism and comparisons to libcamera. I would have to support Megapixels for the people complaining that it's not enough... You could've not had a camera application at all...</p> <p>It also doesn't help that the libcamera developers are also the v4l2 subsystem maintainers in the kernel. During development of libmegapixels I tried sending a simple patch for an issue I noticed that would massively improve the ease of debugging PinePhone Pro cameras. I sent this 3 character patch upstream to the v4l2 mailing lists and it got a Reviewed-by in a few days.</p> <p>Then after 2 whole months of radio silence it got rejected by the lead developer of libcamera on debatable grounds. Now this is only a very small patch so I'm merely disappointed. 
If I had put more work into the kernel side improving some sensor drivers I might have been mad but at this point I'm just not feeling like contributing to the camera ecosystem anymore. </p> <hr> <p><b>Edit:</b> I've been convinced to actually try to do this full-time and push the codebase forward enough to make it usable. This is continued at <a href="https://blog.brixit.nl/adding-hardware-to-libmegapixels/">https://blog.brixit.nl/adding-hardware-to-libmegapixels/</a></p> Making an USB Ethernet adapter work [SR9700]https://blog.brixit.nl/making-a-usb-ethernet-adapter-work-sr9700/86LinuxMartijn BraamSat, 28 Oct 2023 17:41:51 -0000<p>I just needed a simple USB to Ethernet adapter for testing. It does not need to be fast, it does not need to reach USB 3.0 speeds or gigabit speeds. I own two other USB ethernet adapters that have various reliability issues so I got a random cheap one from eBay.</p> <p>The adapter I ended up getting was the SR-QF9700 that does not have an actual brand on it. When plugging it into my laptop there was a slight issue though. It shows up as a CD drive...</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1698514364/image.png" class="kg-image"></figure> <p>This is one of those annoying features where USB dongles will present a CD drive with drivers to Windows to make them work, ignoring that this is just an RNDIS device that has worked without special drivers <i>for decades</i>. This wouldn't be problematic if it exposed both the ethernet interface and the driver CD at the same time so I could just ignore it.</p> <p>In theory this can be made to work by hacks such as <code>usb_modeswitch</code> but I've seen several forum posts online that mention it not working on this hardware. 
Even if that did solve the issue, it would still make this device useless for me because I don't want to mess with special software on every device I plug it into.</p> <p>I was ready to add this adapter to the e-waste problem but then I thought about how this would be handled from the manufacturing side. The chip is generic but those driver installers are usually branded. There needs to be somewhere to actually store the drivers on the device...</p> <h2>Taking it apart</h2> <p>This thing is very easy to open. There are no screws or clips in it at all, it's all held together by the sticker with the model number.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1698514385/20231028_0014.jpg" class="kg-image"></figure> <p>Inside the adapter is a simple PCB that has a crystal and the USB ethernet module. This particular adapter contains a CoreChips SR9700 single-chip ethernet controller. Doing anything interesting with this would be hard: there's not really any public documentation for this chip except for the pin descriptions. The SR9900 datasheet mentions some internal one-time programmable memory, but it being one-time programmable does not really help me.</p> <p>So let's look at the back of the PCB:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1698514403/20231028_0006.jpg" class="kg-image"></figure> <p>Another chip! This is a 4 Mbit SPI NOR flash chip. 
This must be the chip actually storing the Windows drivers.</p> <p>So what is the behavior of the ethernet controller chip if this chip is absent?</p> <p>SPI flash pinouts are pretty standardized, but I looked up the datasheet for this specific chip to be sure:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1698514414/image.png" class="kg-image"></figure> <p>There are many ways to make a SPI flash chip temporarily not function. I decided to jam a screwdriver between the <code>CS</code> and <code>SO</code> pins to make the chip unable to respond to SPI communication. With the screwdriver in place I plugged in the USB cable and behold:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1698514424/image.png" class="kg-image"></figure> <p>Ethernet! Well that was easy. Now to make this permanent I shorted together the same two pins with my soldering iron since removing the whole chip is way harder.</p> <p>So instead of messing with any device/computer I want to plug this ethernet adapter into I now have a normal USB ethernet adapter with just one extra solder connection :)</p> Developers are lazy, thus Flatpakhttps://blog.brixit.nl/developers-are-lazy-thus-flatpak/76LinuxMartijn BraamSat, 03 Jun 2023 15:58:47 -0000<p>In the last decade I have seen a very slow but steady shift to solutions for packaging software that try to isolate the software from host systems to supposedly make things easier. My first experience with this was Docker, now Flatpak is the thing for desktop applications.</p> <h2>The promise of Flatpak</h2> <p>So the thing Flatpak is supposed to fix for me as a developer is that I don't need to care about distributions anymore. I can bolt on whatever dependencies I want to my app and it's dealt with. I also don't need to worry about having software in distributions, if it's in Flatpak it's everywhere. 
Flatpak gives me that unified base to work on and everything will be perfect. World hunger will be solved. Finally peace on earth.</p> <p>Sadly there's reality. The reality is that to get away from the evil distributions the Flatpak creators have made... another distribution. It is not a particularly good distribution, it doesn't have a decent package manager. It doesn't have a system that makes it easy to do packaging. The developer interface is painfully shoehorned into GitHub workflows and it adds all the downsides of containerisation.</p> <h3>Flatpak is a distribution</h3> <p>While the developers like to pretend real hard that Flatpak is not a distribution, it's still suspiciously close to one. It lacks a kernel and a few services and it lacks the standard Linux base directory specification but it's still a distribution you need to target. Instead of providing separate packages with a package manager it provides a runtime that comes with a bunch of dependencies. Conveniently it also provides multiple runtimes to make sure there's not actually a single base to work on. Because sometimes you need GNOME libraries, sometimes you need KDE libraries. Since there's no package manager those will be in separate runtimes.</p> <p>While Flatpak breaks most expectations of a distribution it's still a collection of software and libraries built together to make a system to run software in, thus it's a distribution. A really weird one.</p> <h3>No built in package manager</h3> <p>If you need a dependency that's not in the runtime there's no package manager to pull in that dependency. The solution is to also package the dependencies you need yourself and let the flatpak tooling build this into the flatpak of your application. So now instead of just being the developer for your application you're also the maintainer of all the dependencies in this semi-distribution you're shipping under the guise of an application. 
And one thing is for sure, I don't trust application developers to maintain dependencies.</p> <p>This gets really nuts when looking at some software that deals with multimedia. Let's look at the Audacity flatpak. It builds these as dependencies:</p> <ul><li>wxwidgets</li> <li>ffmpeg</li> <li>sqlite</li> <li>chrpath</li> <li>portaudio</li> <li>portmidi</li> </ul> <p>So let's look at how well dependencies are managed here. Since we're now almost exactly half a year into 2023 I'll look at the updates for the last 6 months and compare them to the same dependencies in Alpine Linux.</p> <ul><li>audacity has been updated 4 times in the flatpak. It has been updated 5 times on Alpine.</li> <li>ffmpeg has been updated to 6.0 in both the flatpak and Alpine, but the ffmpeg package has had 9 updates because of codecs that have been updated.</li> <li>sqlite hasn&#x27;t been updated in the flatpak and has been updated 4 times in Alpine</li> <li>wxwidgets hasn&#x27;t been updated in the flatpak and has been updated 2 times in Alpine</li> <li>chrpath hasn&#x27;t had updates</li> <li>portaudio hasn&#x27;t had updates in the flatpak or Alpine.</li> <li>portmidi hasn&#x27;t had updates</li> </ul> <p>This is just a random package I picked and in Alpine its dependencies already had a lot more maintenance than in the flatpak. It most likely doesn't scale to have all developers keep track of all the dependencies of all their software.</p> <h3>The idea of isolation</h3> <p>One of the big pros that's always mentioned with Flatpak is that the applications run in a sandbox. The idea is that this sandbox will shield you from all the evil applications can do so it's totally safe to trust random developers to push random Flatpaks. First of all this sandbox has the same issue any permission system that exists also has. 
It needs to tell the user about the specific holes that have been poked in the sandbox to make the application work, in a way that lets end users <i>understand</i> what the security implications of those permissions are.</p> <p>For example here's GNOME Software ready to install the flatpak for Edge:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1685803130/image.png" class="kg-image"></figure> <p>I find the permission handling implemented here very interesting. There's absolutely no warning whatsoever about the bypassed security in this Flatpak until you scroll down. The install button will immediately install it without warning about all the bypassed sandboxing features.</p> <p>So if you <i>do scroll down there's more details right? Sure there is!</i></p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1685803278/image.png" class="kg-image"></figure> <p>There's a nice red triangle with the word Unsafe! Phew, everyone is fine now. So this uses a legacy windowing system which probably means it uses X11 which is not secure and breaks the sandbox. Well if that's the only security issue then it <i>might</i> be acceptable? Let's click that button.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1685803447/image.png" class="kg-image"></figure> <p>Well yeah... let's hide that from users. Of course the browser needs to write to /etc. This is all unimportant to end users.</p> <p>The even worse news is that since this is proprietary software it's not really possible to audit what this would do, and even if it's audited it's ridiculously easy to push a new more evil version to Flathub since practically only the first version of the app you push is thoroughly looked at by the Flathub maintainers.</p> <p>Even if there weren't so many holes in the sandbox. 
This does not stop applications from doing more evil things that are not directly related to filesystem and daemon access. You want analytics on your users? Just request the internet permission and send off all the tracking data you want.</p> <h2>So what about traditional distributions</h2> <p>I've heard many arguments for Flatpaks from users and developers but in the end I can't really say the pros outweigh the cons.</p> <p>I think it's very important that developers do not have the permissions to push whatever code they want to everyone under the guise of a secure system. And that's <i>my opinion as a software developer</i>.</p> <p>Software packaged by distributions has at least some degree of scrutiny and it often results in at least making sure build flags are set to disable user tracking and similar features.</p> <p>I also believe software in general is better if it's made with the expectation that it will run outside of Flatpak. It's not that hard to make sure you don't depend on bleeding edge versions of libraries when that's not needed. It's not that hard to have optional dependencies in software. It's not that hard to actually follow XDG specifications instead of hardcoding paths.</p> <h2>But packaging for distributions is hard</h2> <p>That's the best thing! Developers are not supposed to be the ones packaging software so it's not hard at all. It's not your task to get your software in all the distributions, if your software is useful to people it tends to get pulled in. I have software that's packaged in Alpine Linux, ALT Linux, Archlinux AUR, Debian, Devuan, Fedora, Gentoo, Kali, LiGurOS, Nix, OpenMandriva, postmarketOS, Raspbian, Rosa, Trisquel, Ubuntu and Void. 
I did not have to package most of this.</p> <p>The most I notice from other distributions packaging my software is patches from maintainers that improve the software, usually in dealing with some edge case I forgot with a hardcoded path somewhere.</p> <p>The most time I've ever spent on distribution packaging was actually on the few pieces of software I've managed to push to Flathub. Dealing with differences between distributions is easy, dealing with differences between running inside and outside Flatpak is hard.</p> <h2>But Flatpaks are easier for end users</h2> <p>I've run into enough issues as an end user of flatpaks. A package being on Flathub does not mean that it will be installable for an end user. I've run into this by installing packages on the Pinebook Pro which generated some rather confusing error messages about the repositories missing. It turns out that the AArch64 architecture was missing for those flatpaks so the software was just not available. Linux distributions generally try to enable as many architectures as possible when packaging, not just x86_64.</p> <p>A second issue I've had on my Pinebook Pro is that it has a 64GB rootfs. Using too many flatpaks is just very wasteful of space. In theory you have a runtime that has your major dependencies and then a few megabytes of stuff in your application flatpak. In practice I have nearly a unique platform per flatpak installed because the flatpaks depend on different versions of that platform or just on different platforms.</p> <p>Another issue is with end users of some of my Flatpaks. Flatpak does not deal well with software that communicates with actual hardware. A bunch of my software uses libusb to communicate with specific devices as a replacement for some Windows applications and Android apps I would otherwise need. The issue end users will run into is that they first need to install the udev rules in their distribution to make sure Flatpak can access those USB devices. 
For the distribution packaged version of my software it Just Works(tm)</p> <h2>Flatpak does have its uses</h2> <p>I wouldn't say Flatpak is completely useless. For certain usecases it is great to have available. I think Flatpak makes the most sense when closed source software needs to be distributed.</p> <p>I would like to see this be more strict though. I wouldn't want to have flatpaks with holes in the sandbox with a proprietary license for example. Which is exactly what the Edge flatpak is.</p> <p>It's quite sad that Flatpak madness has gone so deep into the GNOME ecosystem that it's now impossible to run the nice GNOME Builder IDE without having your application in a flatpak. (EDIT: Turns out that using Builder without Flatpak is possible again)</p> <p>I don't think having every app on a Linux machine be a Flatpak is anything I'd want. If I wanted to give developers that much power to push updates to anywhere in my system without accountability I'd just go run Windows.</p> Alpine Linux is pretty neathttps://blog.brixit.nl/alpine-linux-is-pretty-neat/68LinuxMartijn BraamWed, 01 Feb 2023 15:44:57 -0000<p>I've used various Linux distributions in the past, starting with a Knoppix live CD a long time ago. For a long time I was an Ubuntu user (with compiz-fusion of course), then I used Arch Linux for years thinking it was the perfect distribution. Due to postmarketOS I found out about Alpine Linux and now after using that for some years I think I should write a post about it.</p> <h2>Installing</h2> <p>Ubuntu has the easy graphical installer of course. Installing Arch Linux for the first time is quite an experience. I believe Arch has since added a setup wizard but I have not tried it.</p> <p>Installing Alpine Linux is done by booting a live CD into a shell and installing from there just like Arch but it provides the <code>setup-alpine</code> shell script that runs you through the installation steps. 
It's about as easy as using the Ubuntu installer if you can look past the fact that it's text on a black screen.</p> <p>A minimal Alpine installation is quite small; that, combined with the fast package manager, makes the install process really quick.</p> <h2>Package management</h2> <p>The package management is always one of the big differentiators between distributions. Alpine has its own package manager called APK, the Alpine Package Keeper. While it's now confused with the Android .apk format it predates Android by two years.</p> <p>The package management is pretty similar to Arch Linux in some aspects. The APKBUILD package format is very similar to the PKGBUILD files in Arch and the packages support similar features. The bigger difference is the packaging mentality: Arch Linux prefers to never split packages, just one .pkg.tar.zst file that contains all the features of the application and all the docs and development headers. Alpine splits out all these things to subpackages and the build system warns when the main package contains any documentation or development files.</p> <p>For a minimal example of this let's compare the tiff library. In Alpine Linux this is split into 5 packages:</p> <ul><li><code>tiff</code>, the main package that contains libtiff.so.6 [460 kB]</li> <li><code>tiff-dev</code>, the development headers [144 kB]</li> <li><code>libtiffxx</code>, the C++ bindings [28 kB]</li> <li><code>tiff-doc</code>, the documentation files [5.21 MB]</li> <li><code>tiff-tools</code>, command line tools like ppm2tiff [544 kB]</li> </ul> <p>In Arch Linux this is a single package called <code>libtiff</code> that's 6.2 MB. 
Most Linux users never need the library documentation, which takes up the most space in this example.</p> <p>The end result is that my Arch Linux installations are using around 10x the disk space my Alpine installations use if I ignore the home directories.</p> <p>Another difference is that Alpine provides stable releases on top of the rolling <code>edge</code> release branch. This improves reliability a lot for my machines. You wouldn't normally put Arch Linux on a production server but I found Alpine to be almost perfect for that usecase. Things like the <code>/etc/apk/world</code> file make managing machines easier. It's basically the <code>requirements.txt</code> file for your Linux installation and you don't even need to use any extra configuration management tools to get that functionality.</p> <p>There are also some downsides to <code>apk</code> though. One thing I'm missing is optional packages, and when things go wrong it has some of the most useless error messages I've encountered in software: <code>temporary error (try again later)</code>. Throwing away the original error and showing "user friendly" messages usually does not improve the situation.</p> <h2>Glibc or not to glibc</h2> <p>One of the main "issues" that get raised with Alpine is that it does not use glibc. Alpine Linux is a musl-libc based distribution. In practice I don't have many problems with this since most of my software is just packaged in the distribution so I wouldn't ever see that it's a musl distribution.</p> <p>Issues appear mostly when trying to run proprietary software on top of Alpine or software that's so hard to build that you're in practice just getting the prebuilds. The solution to proprietary software is... don't use proprietary software :)</p> <p>For the cases where that's not possible there's always either Flatpak or making a chroot with a glibc distribution in it.</p> <h2>Systemd</h2> <p>Besides not using glibc there's also no systemd in Alpine. 
This is one of the things I miss the most actually. I don't enjoy the enormous amount of different "best practices" for specifying init scripts and the bad documentation surrounding it. So far my best solution for creating idiomatic init scripts for Alpine is just submitting something to the repository and waiting until someone complains about style issues.</p> <p>Besides that I'm pretty happy with the tools OpenRC provides for managing services. The <code>rc-update</code> tool gives a nice concise overview of enabled boot services and the <code>service</code> tool just does what I expect. It seems like software is starting to depend on systemd restarting it instead of fixing memory leaks which causes me some issues sometimes.</p> <h2>Conclusion</h2> <p>Alpine Linux is neato. I try to use it everywhere I can.</p> Configuring Blackmagic Design converters in Linuxhttps://blog.brixit.nl/configuring-blackmagic-design-converters-in-linux/62fbceed87c35a5ee6af3810LinuxMartijn BraamThu, 18 Aug 2022 23:35:25 -0000<p>Blackmagic Design makes a lot of video converters in various sizes and feature sets. These are relatively cheap boxes that convert between HDMI, SDI and optical fiber for video signals. There are also some neat converters that can do audio embedding and de-embedding.</p> <p>Normally these converters are configured using a Windows or OSX utility. Just plug in a converter with a USB cable and change one of the few settings. This is pretty nice except there's no utility for Linux.</p> <h2>Making a Linux utility</h2> <p>I already maintain <a href="https://openswitcher.org/">pyatem/openswitcher</a> which is a reverse engineered protocol implementation for the ATEM video switcher devices from Blackmagic Design. I decided to also take a look at the USB protocol for the converters to add this to the pyatem Python library.</p> <p>I started by looking at the BiDirectional SDI/HDMI 12G micro converter. 
To look at the traffic I installed <a href="https://desowin.org/usbpcap/">USBPCap</a> on a Windows machine. This is not the most convenient tool but it's the quickest way to get USB logs.</p> <p>With USBPCap running I started the official software and changed every setting once and then ended the capture. With this captured traffic I could start to look at how the protocol works using Wireshark.</p> <figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1670072517/image.png" class="kg-image"><figcaption>Looking at a USB dump using Wireshark</figcaption></figure> <p>The protocol is relatively simple: it only uses USB control transfers, and retrieving/writing settings is done by sending a setting name to the device and then reading or writing the value.</p> <p>The more difficult part of this design is that you need to know in advance which settings exist on which converter. There does not seem to be a way to enumerate a list of the settings from the device.</p> <p>Using this information I wrote the <code>pyatem.converter</code> module that implements the USB protocol using libusb and also added a definition that describes all the settings for the BiDirectional converter I used.</p> <p>With the library implemented I created a simple GTK3 application that presented the settings in a similar way to the official software:</p> <figure class="kg-card kg-image-card kg-width-wide"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1670072517/image-1.png" class="kg-image"><figcaption>Converter Setup on Linux</figcaption></figure> <h2>Supporting more hardware</h2> <p>A setup utility that only works with one model of converter is not super useful. 
It also creates the risk of overfitting the library to some specific protocol quirk that one model has, and this is exactly what happened in this case.</p> <p>I ordered the HDMI to SDI and SDI to HDMI 3G converters since these are cheap and really common. They are also an older generation of converters. For those not familiar with this hardware: the 12G converters are the 4K capable ones and the 3G converters top out at 1080p30. The number refers to the link speed.</p> <p>I created another packet dump using Windows to find which fields are supported and to my surprise the protocol for these converters is completely different.</p> <p>The protocol for these 3G converters does not work by sending a setting name and then a value. Instead the settings are numbers and the operation is done in a single USB transaction. This is possible since a numeric setting can be set in the wValue field of a read or write operation to signify what you want to read or write.</p> <p>Due to this I had to refactor the whole library to support multiple protocol backends. This would also make it possible to support more kinds of Blackmagic USB protocols for their different products like the ATEM Mini hardware and the monitors. These are normally configured using separate setup utilities.</p> <h2>Going mobile</h2> <p>Since I also work on postmarketOS I tried running my Linux app on a PinePhone. The only change I had to make was removing the "Blackmagic Design" name from the converter name because the product names are so ridiculously long. After that the Linux desktop setup utility I made Just Works(tm) on Linux phones:</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1670072518/PXL_20220810_004521410.jpg" class="kg-image"><figcaption>Converter Setup on postmarketOS</figcaption></figure> <p>After sharing this picture there seemed to be a bit of interest in having a mobile app for configuring the converters. 
This makes a lot of sense: the converters are usually spread out all over the place and dragging a laptop around to configure them is annoying.</p> <p>While it's nice that this works on postmarketOS supported phones, the majority of people still have Android and iOS phones. Since I already know how the protocol works and doing a simple Android app shouldn't be too hard... I decided to learn Android development.</p> <h2>The Android app</h2> <p>I had tried to do Android development before and always got stuck very early in the process, which made me never continue it. It seems like Kotlin is a great improvement over the Java issues I was having though. I managed to get a proof of concept up and running that read the settings of a single converter within a day and continued developing. I also signed up for a Google Play developer account and went through the verification steps. It turns out dealing with the Google Play Console is definitely the hardest and most time consuming part of development.</p> <p>It took about a week to get everything on the Google side verified and reviewed. Luckily I did this with my first hello world demo and didn't wait until I had finished the rest of the app, otherwise I would've lost a lot more time on this. In the week of waiting I fixed up the multi-protocol setup and refactored the code of the app to be actually nice instead of one large MainActivity.kt file with hardcoded values.</p> <figure class="kg-card kg-image-card"><img src="https://blog.brixit.nl/image/w1000//static/files/blog.brixit.nl/1670072518/P1080471.JPG" class="kg-image"><figcaption>The Converter Configuration Android app</figcaption></figure> <p>My plan with the Android app is to make it paid and keep the open source Linux app and libraries free. This makes me able to fund the development of the open source side through the Android app and give a wider audience the ability to actually use the app. 
The nice thing is that the Play store already takes care of all the payment and licensing for this. I can just keep using the funds from the Android app to get more hardware to reverse engineer and add to the app.</p> <p>The Android app is now in open beta. The beta can be joined through <a href="https://play.google.com/apps/testing/nl.brixit.setup">https://play.google.com/apps/testing/nl.brixit.setup</a>. Currently the app only supports the three converters mentioned above but I hope to provide an update soon to at least get the UpDownCross converter working.</p> <p>I'm not planning to make an iOS app for this though, that would require me to buy into the Apple ecosystem with multiple devices to even get started and learn a whole new software stack. I'm also not entirely sure if apps controlling non-class-compliant USB hardware are even possible in the Apple mobile ecosystem.</p> Why I left PINE64https://blog.brixit.nl/why-i-left-pine64/62f92ad587c35a5ee6af37d0PhonesMartijn BraamWed, 17 Aug 2022 09:47:04 -0000<p>Linux hardware projects are made or broken by their community support. PINE64 has made some brilliant moves to build up a mobile Linux community, and has also made some major mistakes. This is my view on how PINE64 made the PinePhone a success, and then broke that again through their treatment of the community.</p> <p>I want to start by pointing out that this is <i>me</i> leaving PINE64 and not the projects I'm involved in like postmarketOS. These opinions are my own yadda yadda...</p> <h2>Community Editions and the PinePhone's early life</h2> <p>The original PinePhone was brought up on the existing Linux Mobile projects like Ubuntu Touch, postmarketOS, and Maemo Leste, and also spawned new Linux distributions like Mobian and Danctnix ARM. 
This grew until there were 25 different projects working on the PinePhone — an apparently thriving community.</p> <p>Following the initial set of Developer Editions, intended for community hackers to build the software with, the first consumer-targeted PinePhone devices were the Community Editions. A batch of PinePhones was produced with a special branded back cover for five community projects: UBPorts, postmarketOS, Mobian, Manjaro, and KDE Plasma Mobile. Every Community Edition phone sold also sent $10 to their respective Linux distribution projects.</p> <p>Working together through these Community Editions, Linux distributions built a software ecosystem that works pretty well on the PinePhone.</p> <h2>The end of community editions</h2> <p>In February 2021, PINE64 <a href="https://www.pine64.org/2021/02/02/the-end-of-community-editions/" rel="nofollow noopener">announced the end of the community editions</a>. At this moment, PINE64's focus shifted from supporting a diverse ecosystem of distributions and software projects around the PinePhone to just supporting Manjaro Linux alone.</p> <p>The fact that a useful software ecosystem for the PinePhone exists at all is thanks to the diverse strategy employed by PINE64 in supporting many distributions working together on each of the many facets of the software required. Much of the original hardware bring-up was done by Ubuntu Touch. Mobian developers built the telephony stack via their eg25-manager project. And in my role for the postmarketOS distribution, I developed the camera stack.</p> <p>Manjaro Linux has been largely absent from these efforts. The people working on many of the Linux distributions represented in the community editions tend to work not just on packaging software, but on building software as well. This is not the case for Manjaro, which focuses almost entirely on packaging existing software. 
Supporting Manjaro has historically done very little to facilitate the development of the software stack which is necessary for these devices to work. In some cases the Manjaro involvement actually causes extra workload for the developers by shipping known broken versions of software and pointing to the developers for support. This is why <a href="https://dont-ship.it/">https://dont-ship.it/</a> was started.</p> <p>Regardless, Manjaro is now the sole project endorsed and financially supported by PINE64, at least for the Linux-capable devices. As a consequence it has a disproportionate level of influence in how PINE64 develops its products and manages the ecosystem.</p> <h2>The last straw</h2> <p>With community members' influence in PINE64 diminished in favor of a Manjaro mono-culture, what was once a vibrant ecosystem has been reduced to a bunch of burnt-out and maligned developers abandoning the project. The development channels are no longer the great collaboration between various distributions developing PinePhone components and there are now only a small number of unpaid developers working on anything important. Many of PINE64's new devices, such as the PinePhone Pro, PineNote, and others, have few to no developers working on the software, a potential death blow for PINE64's model of relying on the community to build the software.</p> <p>Everyone has had a different "last straw". For me, it was the SPI flash situation.</p> <p>There is a substantial change to booting between the PinePhone and PinePhone Pro. Previously, each distribution could develop a self-contained eMMC or microSD card image, including a compatible bootloader and kernel distribution. Installation is as simple as flashing a microSD card with the desired distribution and popping it in.</p> <p>On the PinePhone Pro, the hardware works differently: it prefers to load the bootloader from the eMMC instead of the microSD. 
This means that when the PinePhone Pro ships from the factory with Manjaro on the eMMC, it will always boot the Manjaro U-Boot, even when booting from a microSD card. We no longer have any control over the bootloader for these devices.</p> <p>There is a solution, however. The hardware can have an SPI flash chip that gives a bit of storage to put U-Boot in and that storage is always preferred over the eMMC and microSD storage. The problem with this is that all the distributions need to agree on a U-Boot build to put in there, and agree to never overwrite it with a distribution-specific version.</p> <p>The solution to this is Tow-Boot: a distribution of U-Boot that can be put in that flash chip. With this the U-Boot firmware can just be treated like system firmware and be updated through fwupd independent of what distributions ship. This would work not only for the PinePhone Pro, but would also enable things like installing your preferred Linux distribution on a Pinebook Pro by popping in a flash drive with a UEFI installer, much like you can on any other laptop.</p> <p>Negotiating this solution was hell. Manjaro is incentivized not to agree to this, since it cedes their sole control over the bootloader, and PINE64 listens to Manjaro before anyone else. Furthermore, PINE64 does not actually want to add SPI flash chips to their hardware. Apparently, there have been some issues with people using SPI flash as RW storage on the A64-LTS boards, which would be a support issue.</p> <p>After months of discussions between the community, Manjaro, and PINE64 leadership, we finally were able to convince them to ship the PinePhone Pro with an SPI flash chip with Tow-Boot installed on it.</p> <p>But the Pinebook Pro has a similar boot configuration, and thus a similar problem. Some time after the PinePhone Pro was shipped, it was time for a new Pinebook Pro batch, and this discussion started again. 
The same arguments were reiterated by all sides all over again, and the discussion went nowhere. PINE64 representatives went so far as to say, quote, "people who want [an SPI chip] can just solder one on". This batch of Pinebook Pros has ended up shipping without Tow-Boot flashed.</p> <h2>So I left</h2> <p>This is the moment I left. I left all the official channels, stepped down as PINE64 moderator, and left the private developer chat rooms. PINE64 cares only about Manjaro, and Manjaro does not care about working with any other distributions. This is no longer a community that listens to software developers. As a representative of postmarketOS, there is no further reason for me to be directly involved with PINE64 if the only opinions that matter are those of Manjaro.</p> <p>Like many others, I have become burnt out on this ecosystem. So I quit. I am no longer getting random requests to make Manjaro's factory software work. No longer am I enduring the stress and frustration of these meaningless discussions behind the scenes, and after not being in the PINE64 community for some weeks I can definitely say I'm way less stressed out.</p> <p>Now I can just focus on making postmarketOS work better. On the PINE64 hardware, and all the many other devices supported by postmarketOS.</p> <p>I hope that future vendors will make better choices, and listen to the actual whole community. Maybe even help with the software development side.</p>