Megapixels 2.0: DNG exporting

It seems overkill to make a whole separate library dedicated to replacing 177 lines of code in Megapixels that touch libtiff, but this small section of code causes significant issues for distribution packaging and compatibility with external photo editing software. Most importantly, the adjusted version in Millipixels for the Librem 5 does not output DNG files that are close enough to the Adobe specification to be loaded into the calibration software.

Making this a separate library would make it easier to test. The Adobe DNG SDK includes a test utility that can verify whether a TIFF file is up to the DNG spec, and it can (with a lot of complications) be built for Linux.

The spec

The first thing to do after copying the code block from Megapixels over to a separate project is reading the Adobe DNG specification.

When I wrote the original export code in Megapixels, it was based on some example code for using libtiff that I found on GitHub and can no longer find. It results in something that's close enough to a valid DNG file for the dcraw utility. The file it generates is also a DNG 1.0 file.

I spent the next day reading the DNG 1.4 specification from Adobe to understand what a valid DNG file is absolutely minimally required to have. These are my notes from that:

## Inside a DNG file

* NewSubFileType 0 is the original raw data
* NewSubFileType 1 is the thumbnail data
* The recommendation is to store the thumbnail as the first IFD
* TIFF metadata goes in the first IFD
* EXIF tags are preferred
* Camera profiles are stored in the first IFD

## Required tags

* DNGVersion
* UniqueCameraModel

Validation

I also spent a long time building the official Adobe DNG SDK. It is mostly useless for developing open source software due to its licensing, but it does provide a nice dng_validate utility that can be used to actually test the DNG files. Building this utility is pretty horrifying since it requires specific versions of its dependencies and some patches to work with modern compilers.

The libdng codebase now has the adobe_dng_sdk.sh script that will build the required libraries and the validation binary.

With the Megapixels code adjusted using the info from the documentation above, I fed some random noise as data to the library to generate a DNG file and ran it through the validator.

$ dng_validate out.dng
Validating "out.dng"...
*** Warning: This file has Chained IFDs, which will be ignored by DNG readers ***
*** Error: Unable to find main image IFD ***

Well, that's not a great start... There's also a -v option to get some more verbose info:

$ dng_validate -v out.dng
Validating "out.dng"...

Uses little-endian byte order
Magic number = 42

IFD 0: Offset = 308, Entries = 10

NewSubFileType: Preview Image
ImageWidth: 20
ImageLength: 15
BitsPerSample: 8
Compression: Uncompressed
PhotometricInterpretation: RGB
StripOffsets: Offset = 8
StripByteCounts: Count = 300
DNGVersion: 1.4.0.0
UniqueCameraModel: "LibDNG"
NextIFD = 10042

Chained IFD 1: Offset = 10042, Entries = 6

NewSubFileType: Main Image
ImageWidth: 320
ImageLength: 240
Compression: Uncompressed
StripOffsets: Offset = 441
StripByteCounts: Count = 9600
NextIFD = 0

*** Warning: This file has Chained IFDs, which will be ignored by DNG readers ***
*** Error: Unable to find main image IFD ***

Let's have a look at what the DNG spec says about this:

DNG recommends the use of SubIFD trees, as described in the TIFF-EP specification. SubIFD chains are not supported.

The highest-resolution and quality IFD should use NewSubFileType equal to 0. Reduced resolution (or quality) thumbnails or previews, if any, should use NewSubFileType equal to 1 (for a primary preview) or 10001.H (for an alternate preview).

DNG recommends, but does not require, that the first IFD contain a low-resolution thumbnail, as described in the TIFF-EP specification.

So I have the right tags and the right IFDs, but I need to make an IFD tree instead of a chain in libtiff. I have no idea how IFD trees work, so on to the next specification!

It seems like TIFF trees are defined in the Adobe PageMaker 6 tech notes from 1995. That document describes that the NextIFD offset that libtiff wrote for me is used primarily for defining multi-page documents, not multiple encodings of the same document like what happens here with a thumbnail and the raw data. You know this is a 1995 spec because it gives a fax as an example of a multi-page document.

In the examples provided in that specification, the first image is the main image and the NextIFD link is simply replaced by a SubIFD tag. In the case of DNG, the first image is the thumbnail instead, for compatibility with software that can't read the raw camera data.

Switching over to a SubIFD tag is surprisingly simple, just badly documented. Libtiff will create the NextIFD link automatically for you, but if you create an empty SubIFD tag then libtiff will fill in the offset for the next IFD for you when closing the file:

#include <stdint.h>
#include <tiffio.h>

TIFF *tif = TIFFOpen(path, "w");

// Set the tags for IFD 0 like normal here
TIFFSetField(tif, TIFFTAG_SUBFILETYPE, DNG_SUBFILETYPE_THUMBNAIL);

// Create a NULL reference for one SubIFD
uint64_t offsets[] = { 0 };
TIFFSetField(tif, TIFFTAG_SUBIFD, 1, offsets);

// Write the thumbnail image data here

// Close the first IFD
TIFFWriteDirectory(tif);

// Start IFD1 describing the raw data
TIFFSetField(tif, TIFFTAG_SUBFILETYPE, DNG_SUBFILETYPE_ORIGINAL);
// Write raw data and close the directory again
TIFFWriteDirectory(tif);

// Close the tiff, this will cause libtiff to patch up the references
TIFFClose(tif);

So with the code updated, the validation tool neatly shows the new SubIFD tags and now finds actual errors in my DNG file data:

Uses little-endian byte order
Magic number = 42

IFD 0: Offset = 308, Entries = 11

NewSubFileType: Preview Image
ImageWidth: 20
ImageLength: 15
BitsPerSample: 8
Compression: Uncompressed
PhotometricInterpretation: RGB
StripOffsets: Offset = 8
StripByteCounts: Count = 300
SubIFDs: IFD = 10054
DNGVersion: 1.4.0.0
UniqueCameraModel: "LibDNG"
NextIFD = 0

SubIFD 1: Offset = 10054, Entries = 6

NewSubFileType: Main Image
ImageWidth: 320
ImageLength: 240
Compression: Uncompressed
StripOffsets: Offset = 453
StripByteCounts: Count = 9600
NextIFD = 0

*** Error: Missing or invalid SamplesPerPixel (IFD 0) ***
*** Error: Missing or invalid PhotometricInterpretation (SubIFD 1) ***

Ah, so these two tags are actually required but not described as such in the DNG specification, since they are TIFF tags instead of DNG tags (even though it does explicitly list other required TIFF data).
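The byte counts in the validator dump are consistent with what missing tags default to in TIFF 6.0: both SamplesPerPixel and BitsPerSample are 1 when absent. This is my reading of the numbers rather than something the validator reports. A quick sketch of the arithmetic, assuming uncompressed data in a single strip; the function name `strip_bytes` is made up for the example:

```c
#include <stdint.h>

/* Bytes needed for one uncompressed strip that holds the whole image.
 * Every row is padded up to a whole number of bytes. */
uint64_t strip_bytes(uint32_t width, uint32_t height,
                     uint32_t samples_per_pixel, uint32_t bits_per_sample)
{
    uint64_t bits_per_row = (uint64_t)width * samples_per_pixel * bits_per_sample;
    uint64_t bytes_per_row = (bits_per_row + 7) / 8;
    return bytes_per_row * height;
}
```

`strip_bytes(20, 15, 1, 8)` gives the 300 bytes reported for the thumbnail, where an actual 8-bit RGB thumbnail with three samples per pixel would need 900. `strip_bytes(320, 240, 1, 1)` gives the 9600 bytes reported for the raw data IFD, which has no BitsPerSample entry in the dump.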

Patching up these errors is easy, just slightly annoying since the validation tool seemingly reports only a single error per IFD, so it takes a few more rounds of fixing and re-running. After a whole lot of iterating on the export code I managed to get the first valid DNG file:

Raw image read time: 0.000 sec
Linearization time: 0.002 sec
Interpolate time: 0.006 sec
Validation complete

Now the next step is adding all the plumbing to make this usable as a library and making an actually nice command line utility.

First actual test

Now that I have written the first iterations of libmegapixels and libdng, it should be possible to actually load a picture in some editing software. So let's try some end-to-end testing with this.

With the megapixels-getframe utility from libmegapixels I can get a frame from the sensor (in this case the rear camera of the Librem 5) and then feed that raw data to the makedng utility from libdng.

$ getframe -o test.raw
Using config: /usr/share/megapixels/config/purism,librem5.conf
received frame
received frame
received frame
received frame
received frame
Stored frame to: test.raw
Format: 4208x3120
Pixfmt: GRBG
$ makedng -w 4208 -h 3120 -p GRBG test.raw test.dng
Reading test.raw...
Writing test.dng...

No errors and the file passes the DNG validation, let's load it into RawTherapee :)

I had to boost the exposure a bit since the megapixels-getframe tool does not actually control any of the sensor parameters like the exposure time, so the resulting picture is very dark. There's also no white balance or autofocus happening, so the colors look horrible.

But...

The colors are correct! The interpretation of the CFA pattern of the sensor and the orientation of the data are both correct.
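For reference, the CFA layout is described in the file by the CFAPattern tag, which encodes each color as a number: 0 for red, 1 for green and 2 for blue. Below is a small sketch of turning a four-letter pattern name, like the GRBG passed to makedng above, into those bytes. The function name `cfa_from_name` is hypothetical and this is not necessarily how libdng does it internally. It only understands plain four-letter Bayer names, not pixel format names with a bit depth in them.

```c
#include <stdint.h>
#include <string.h>

/* Convert a 2x2 Bayer pattern name such as "GRBG" into CFAPattern bytes,
 * in row order: top-left, top-right, bottom-left, bottom-right.
 * Returns 0 on success, -1 for anything that is not four of R, G or B. */
int cfa_from_name(const char *name, uint8_t out[4])
{
    if (name == NULL || strlen(name) != 4)
        return -1;
    for (int i = 0; i < 4; i++) {
        switch (name[i]) {
        case 'R': out[i] = 0; break;
        case 'G': out[i] = 1; break;
        case 'B': out[i] = 2; break;
        default:  return -1;
        }
    }
    return 0;
}
```

With "GRBG" this fills `out` with 1, 0, 2, 1. If the pattern were interpreted with the wrong starting color, red and blue or green would swap places after demosaicing, which is why correct colors are a good sign here.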

Integration testing

The nice thing about having a separate library is that testing it becomes a lot easier than testing a GTK4 application. I have now added the first simple end-to-end test to the codebase, which feeds some data to makedng and checks whether the result is a valid DNG file using the official Adobe tool.

#!/bin/bash
set -e

if [ $# -ne 1 ]; then
  echo "Missing tool argument"
  exit 1
fi
makedng="$1"
echo "Running tests with '$makedng'"

# This testsuite runs raw data through the makedng utility and validates the
# result using the dng_validate tool from the Adobe DNG SDK. This tool needs
# to be manually installed for these tests to run.

# Create test raw data
mkdir -p scratch
magick -size 1280x720 gradient: -colorspace RGB scratch/data.rgb

# Generate DNG
"$makedng" -w 1280 -h 720 -p RG10 scratch/data.rgb scratch/RG10.dng

# Validate DNG
dng_validate scratch/RG10.dng

This is launched from ctest in my CMake files for now since I'm developing most of this stuff using CLion, which only properly supports CMake projects. This is why a lot of my C projects have both Meson and CMake files to build them, but only the Meson project file has install commands in it.

For more advanced testing it would be neat to have raw sensor dumps from several sensors in different formats that are all pictures of a colorchecker like the picture above. Then some utility (probably OpenCV-based) could validate that a colorchecker is present in the picture with the right colors.

There also needs to be a validation tool that is not Adobe-proprietary and can easily be run as a test suite for distribution packaging, so at build time it's possible to validate that the combination of libdng and the distribution's version of libtiff produces sane output. This has caused several issues in Megapixels before, after all.

Overall architecture

With the addition of libdng, the architecture for Megapixels 2.0 starts to look like this. Megapixels no longer has any pipeline manipulation code; that is all handled by the library, which after configuration just passes the file descriptor for the sensor node to Megapixels to handle the realtime control of the sensor parameters.

The libdng code replaces the plain libtiff exporting done in Megapixels and generates the DNG files that will be read by postprocessd. Postprocessd reads the DNG files with the help of the dcraw library, which already has custom DNG reading code that does not use libtiff.

The next step now is to flesh out the public interface of libdng so it can write all the DNG metadata that Megapixels requires, and then to hook it up to Megapixels to actually use it.

Funding update

Since my previous post about the libmegapixels developments, and the Megapixels 2.0 post I wrote before that, I've almost doubled the funding for actually working on all the FOSS contributions. I'm immensely thankful for all the new patrons. It also made me notice that the donations page on this site was no longer being regenerated; that is fixed now.

I'm also still trying to figure out if I can add some perks for patrons to all of this, but practically all options just amount to making things slightly worse for non-patrons. I hope just making the FOSS ecosystem better one line of code at a time is enough :)