Mobile Linux camera pt6

The processing with postprocessd has been working pretty well for me on the PinePhone. After I released it, I had someone test it with the DNG files from a Librem 5 to see how it deals with completely different input.

To my surprise the answer was: not well. With the same postprocessing applied to both phones, the Librem 5 pictures turn out way too dark and contrasty. The postprocessd code is supposed to be generic and has no PinePhone-specific code in it.

Fast forward a while and I now have a Librem 5, so I can do more camera development. The first thing to do is the sensor calibration process I did with the PinePhone in part 4 of this blog series. This involves taking some pictures of a proper calibration target, which in my case is an X-rite ColorChecker Passport, and feeding those into some calibration software.

Because aligning color charts and getting all the file format conversions right with the DCamProf calibration suite from RawTherapee is quite annoying, I got the paid graphical utility from the developers. By analyzing the pictures the software generates a lot of calibration data. Currently only a small part of that is used by Megapixels: the ColorMatrix and ForwardMatrix.

These are 3x3 matrices that do the colorspace conversion for the sensor. I originally just added these two to Megapixels because these have the least amount of values so they can fit in the camera config file and they have a reasonable impact on image quality.
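As a rough illustration of what such a matrix does, here is a minimal Python sketch that multiplies a 3x3 matrix with a single three-component pixel. The matrix values are made up for the example and are not from any real calibration; the function name is also just for illustration.

```python
# Hypothetical 3x3 matrix; real values come from the calibration profile.
# Every row sums to 1.0, so neutral gray input stays neutral gray.
EXAMPLE_MATRIX = [
    [1.5, -0.3, -0.2],
    [-0.2, 1.4, -0.2],
    [0.0, -0.4, 1.4],
]


def apply_matrix(matrix, pixel):
    """Multiply a 3x3 matrix with a 3-component pixel (row by column)."""
    return [sum(m * p for m, p in zip(row, pixel)) for row in matrix]


gray = apply_matrix(EXAMPLE_MATRIX, [0.5, 0.5, 0.5])
red = apply_matrix(EXAMPLE_MATRIX, [1.0, 0.0, 0.0])
print(gray)
print(red)
```

Only nine numbers per matrix are needed, which is why these fit comfortably in a small config file.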

The file contains two more important things though: the ToneCurve, which converts the brightness data from the sensor to linear space, and the HueSatMap, which contains three correction curves in a 3-dimensional space of hue, saturation and brightness. The HueSatMap is obviously the largest chunk of data.
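To give an idea of how a tone curve is applied, here is a small Python sketch that treats the curve as a sorted list of (input, output) points and interpolates linearly between them. The curve points below are invented for the example, and real raw processors may interpolate differently (with a spline, for instance).

```python
from bisect import bisect_right


def apply_tone_curve(points, x):
    """Piecewise-linear lookup of x in a sorted list of (input, output) points."""
    xs = [p[0] for p in points]
    if x <= xs[0]:
        return points[0][1]
    if x >= xs[-1]:
        return points[-1][1]
    i = bisect_right(xs, x)
    (x0, y0), (x1, y1) = points[i - 1], points[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


# Hypothetical curve, not taken from a real profile.
CURVE = [(0.0, 0.0), (0.25, 0.1), (0.5, 0.3), (1.0, 1.0)]

for value in (0.0, 0.25, 0.75, 1.0):
    print(value, apply_tone_curve(CURVE, value))
```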

What is a raw photo?

The whole purpose of Megapixels and postprocessd is to take the raw sensor data and postprocess it with a lot of CPU power after taking the picture, to produce the best picture possible. This processing is built on top of existing open source photo processing libraries like libraw.

The expectation this software has for "raw" image data is that it's high bit depth linear-light sensor data that has not been debayered yet. The data from the Librem 5 is exactly this; the PinePhone sensor data is weirder.

Unlike most phones that have the camera connected over MIPI-CSI which gives a nice high speed serial connection to push image data, the PinePhone is connected over a parallel bus.

Rear camera connection from the PinePhone 1.2 schematic

This parallel bus provides hsync/vsync/clock and 8 data lines for the image data. The ov5640 sensor itself has a 10-bit interface though:

The D[9:0] is the 10 image data lines from the sensor

Since only 8 of the 10 lines are available in the flatflex from the sensor module that contains the ov5640, the camera has to be configured to output 8-bit data. I made the assumption that the sensor just truncates two bits from the image data, but from the big difference in the brightness response I suspect that the image data is no longer linear in this case. It might actually be outputting an image that's not debayered but does have an sRGB gamma curve.
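The difference between those two possibilities is easy to show with numbers. The Python sketch below compares plain truncation of a 10-bit value with encoding the same value through the standard sRGB transfer curve. This only illustrates the two hypotheses; it says nothing about what the ov5640 actually does internally.

```python
def truncate_10_to_8(value10):
    """Drop the two least significant bits; the response stays linear."""
    return value10 >> 2


def srgb_encode_10_to_8(value10):
    """Map a 10-bit linear value to 8 bits through the sRGB transfer curve."""
    linear = value10 / 1023.0
    if linear <= 0.0031308:
        encoded = 12.92 * linear
    else:
        encoded = 1.055 * linear ** (1 / 2.4) - 0.055
    return round(encoded * 255)


for value10 in (0, 64, 256, 512, 1023):
    print(value10, truncate_10_to_8(value10), srgb_encode_10_to_8(value10))
```

Both mappings agree at black and white, but the gamma-encoded version lifts the midtones a lot, which is the kind of brightness difference that would show up in the final picture.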

This is not really a case that raw image libraries deal with, and it would not traditionally be labelled "raw sensor data", but it's what we have. Instead of making assumptions again, let's just look at the data.

I have pictures of the colorchecker for both cameras and the colorchecker contains a strip of grayscale patches. With this it's possible to make a very rough estimation of the gamma curve of the picture. I cropped out that strip of patches from both calibration pictures and put them in the same image but with different colors. I also made sure to rescale the data to hit 0% and 100% with the darkest and brightest patch.

Waveform for the neutral patches, green is the PinePhone and pink is the Librem 5

The result clearly shows that the data from the PinePhone is not linear. It also shows that the Librem 5 data is not linear either, but in the opposite direction.
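The same rough estimate can also be put into a number instead of a waveform. The Python sketch below normalizes the patch values to the 0..1 range and fits a single gamma exponent in log-log space. The reference values are placeholders rather than the real ColorChecker reflectances, and the "measured" data is synthetic, so this only demonstrates the method.

```python
import math


def normalize(values):
    """Rescale so the darkest patch is 0.0 and the brightest is 1.0."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


def estimate_gamma(measured, reference):
    """Least-squares fit of measured = reference ** gamma in log-log space."""
    pairs = [
        (math.log(r), math.log(m))
        for m, r in zip(measured, reference)
        if m > 0 and r > 0
    ]
    return sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)


# Placeholder linear reference values for six neutral patches.
reference = [0.0, 0.1, 0.2, 0.4, 0.7, 1.0]
# Synthetic measurement: the reference with a 1/2.2 gamma applied.
measured = normalize([r ** (1 / 2.2) for r in reference])

print(estimate_gamma(measured, reference))
```

A result below 1.0 means the midtones are brighter than linear, a result above 1.0 means they are darker.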

These issues can be fixed though, with the tone curve calibration that's missing from the current Megapixels pictures.

postprocessd is not generic after all

So what happened is that I saw the output of postprocessd while developing it and saw that my resulting pictures were way too bright. I thought I must've had a gamma issue and added a gamma correction to the code.

With this code added it looks way better for the PinePhone, but way worse for the Librem 5. This is all a side effect of developing it with the input of only one camera. The correct solution is to drop this gamma correction and have the libraw step before it correct the raw data according to the tone curve that's stored in the file.

Storing more metadata

The issue with adding more calibration metadata to the files is that it doesn't really fit in the camera ini file. I have debated just adding a quick hack to it and making a setting that generates a specific gamma curve to add as the tone curve. That would fix my current issue, but to fix it once and for all it's way better to include all the curves generated by the calibration software.

So what is the output of this software? Lumariver Profiler outputs .dcp files which are "Adobe Digital Negative Camera Profile" files. I have used the profile inspection output that turns this binary file into readable json and extracted the matrices before. It would be way easier to just include the .dcp file alongside the camera configuration files to store the calibration data.

I have not been able to find any official file format specification for this DCP file, but I saw something very familiar after throwing the file in a hex editor: the file starts with II. This is the byte order mark of a little-endian TIFF file. The field directly after it is not 42 (0x2A) though, which makes this an invalid TIFF file. It turns out that a DCP file is just a TIFF file with a modified header that does not have any image data in it. This makes the Megapixels implementation pretty easy: read the TIFF tags from the DCP and save them in the DNG (which is also TIFF).
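For reference, the header check looks roughly like this in Python. A TIFF-style header is 8 bytes: a two-byte byte order mark, a 16-bit version field (42 in a regular TIFF) and a 32-bit offset to the first tag directory. The function name is made up for this sketch, and the value 0x4352 used for the DCP-like example is an assumption on my part that should be checked against a real file.

```python
import struct


def read_tiff_like_header(data):
    """Return (byte_order, magic, first_ifd_offset) from an 8-byte header."""
    if len(data) < 8:
        raise ValueError("header too short")
    mark = data[:2]
    if mark == b"II":
        order = "<"  # little-endian
    elif mark == b"MM":
        order = ">"  # big-endian
    else:
        raise ValueError("not a TIFF-style byte order mark")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    return order, magic, ifd_offset


# Synthetic headers built in memory, not read from real files.
tiff_header = b"II" + struct.pack("<HI", 42, 8)
dcp_like_header = b"II" + struct.pack("<HI", 0x4352, 8)  # assumed DCP magic

print(read_tiff_like_header(tiff_header))
print(read_tiff_like_header(dcp_like_header))
```

A strict TIFF library rejects the second header because the version field is not 42, even though everything after it has the familiar layout.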

In practice this was not that easy. Mainly because I'm using libtiff and DCP is almost a TIFF file. Using libtiff for DNG files works pretty well since DNG is a superset of the TIFF specification. The only thing I have to do is add a few unknown TIFF tags to the libtiff library at runtime to use it. DCP is a subset of the TIFF specification instead and it is missing some of the tags that are required by the TIFF specification. There's also no way in libtiff to ignore the invalid version number in the header.

So I wrote my own TIFF parser for this. TIFF parsers are quite hard to write since there's an enormous number of possibilities for storing things in TIFF files. Since DCP is a smaller subset of TIFF it's quite reasonable to parse it manually instead. A parser for the DCP metadata is around 160 lines of plain C, so that is now embedded in Megapixels. The code searches for a .dcp file associated with a specific sensor and then embeds the calibration data into the generated DNG files. If the matrices are also defined in the camera ini files then those are overridden by the ones from the DCP file.
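To show why a restricted parser stays small, here is a Python sketch of the core loop: walking one TIFF tag directory (IFD) and pulling out the raw bytes for each tag. This is not the C code from Megapixels, just an illustration of the same idea on a tiny synthetic file built in memory. Tag 50721 is ColorMatrix1 from the DNG specification; tag 65000, the header magic and the helper names are assumptions made for the example.

```python
import struct

# Byte sizes of the TIFF field types 1..12 (BYTE, ASCII, SHORT, LONG, RATIONAL, ...).
TYPE_SIZES = {1: 1, 2: 1, 3: 2, 4: 4, 5: 8, 6: 1, 7: 1, 8: 2, 9: 4, 10: 8, 11: 4, 12: 8}


def read_ifd_entries(data, order, ifd_offset):
    """Return {tag: (type, count, raw_bytes)} for one IFD."""
    (entry_count,) = struct.unpack_from(order + "H", data, ifd_offset)
    entries = {}
    for i in range(entry_count):
        base = ifd_offset + 2 + i * 12  # every entry is 12 bytes
        tag, ftype, count = struct.unpack_from(order + "HHI", data, base)
        size = TYPE_SIZES[ftype] * count
        if size <= 4:
            start = base + 8  # small values are stored inline
        else:
            (start,) = struct.unpack_from(order + "I", data, base + 8)
        entries[tag] = (ftype, count, data[start:start + size])
    return entries


def decode_srationals(raw, order, count):
    """Decode SRATIONAL values (pairs of signed 32-bit ints) to floats."""
    ints = struct.unpack(order + "%di" % (count * 2), raw)
    return [ints[i] / ints[i + 1] for i in range(0, len(ints), 2)]


# Build a tiny synthetic file: header, one IFD with two tags, then the matrix data.
order = "<"
identity = [1, 0, 0, 0, 1, 0, 0, 0, 1]
payload = b"".join(struct.pack(order + "ii", n, 1) for n in identity)
ifd_offset = 8
payload_offset = ifd_offset + 2 + 2 * 12 + 4
blob = b"II" + struct.pack(order + "HI", 0x4352, ifd_offset)  # assumed magic
blob += struct.pack(order + "H", 2)
blob += struct.pack(order + "HHII", 50721, 10, 9, payload_offset)  # ColorMatrix1
blob += struct.pack(order + "HHIHH", 65000, 3, 1, 7, 0)  # hypothetical inline SHORT
blob += struct.pack(order + "I", 0)  # no further IFDs
blob += payload

entries = read_ifd_entries(blob, order, ifd_offset)
ftype, count, raw = entries[50721]
matrix = decode_srationals(raw, order, count)
print(matrix)
```

A full TIFF reader also has to handle multiple IFDs, strips, tiles and compression, none of which matter when the file only carries metadata tags.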

Results

The new calibration work is now in megapixels#30 and still needs to go through the testing and release process. There's also an upcoming release of postprocessd that removes the gamma correction.

For the Librem 5 there's millipixels#88, which adds correct color matrices for now, until Millipixels has the DCP code added.