Creating a chroot in Linux is pretty easy: put a rootfs in a folder and run the sudo chroot /my/folder command. But what if you don't want to use superuser privileges for this?

This is not super simple to fix. Not only does the chroot command itself require root permissions, but the steps for creating the rootfs in the first place and mounting the required filesystems like /proc and /sys require root as well.

In pmbootstrap the process for creating an installable image for a phone requires setting up multiple chroots and executing many commands in those chroots. If you have the password timeout disabled in sudo you will notice that you have to enter your password tens to hundreds of times, depending on the operation you're doing. An example of this is shown in the long-running "pmbootstrap requires sudo" issue on GitLab. In this example sudo was called 240 times!

Now it is possible, with a lot of refactoring, to move batches of superuser-requiring commands into scripts and elevate the permissions of those with a single sudo call, but getting this down to a single sudo call per pmbootstrap command would be really hard.

Another approach

So instead of building a chroot the "traditional" way what are the alternatives?

The magic trick to get this working is user namespaces. From the Linux documentation (the user_namespaces(7) man page):

User namespaces isolate security-related identifiers and attributes, in particular, user IDs and group IDs (see credentials(7)), the root directory, keys (see keyrings(7)), and capabilities (see capabilities(7)). A process's user and group IDs can be different inside and outside a user namespace. In particular, a process can have a normal unprivileged user ID outside a user namespace while at the same time having a user ID of 0 inside the namespace; in other words, the process has full privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace.

It basically allows running commands in a namespace where you have UID 0 on the inside without having to elevate any of the commands. This does come with a lot of limitations though, all of which I somehow manage to hit with this.

One of the tools that makes it relatively easy to work with the various namespaces in Linux is unshare. Conveniently this is also part of util-linux so it's a pretty clean dependency to have.
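A quick way to see that effect is to compare the uid outside and inside a fresh user namespace. This is only a sketch: uid_inside_userns is a hypothetical helper name, and it assumes the kernel allows unprivileged user namespaces, which some distributions disable.

```shell
# Hypothetical helper: print the uid seen inside a new user namespace in
# which the calling user is mapped to root (unshare --map-root-user).
# Prints nothing if the namespace cannot be created on this system.
uid_inside_userns() {
    unshare --user --map-root-user id -u 2>/dev/null
}

echo "uid outside: $(id -u)"
# Expected to show 0 when unprivileged user namespaces are available.
echo "uid inside:  $(uid_inside_userns)"
```

No sudo is involved here, and the uid inside only counts within that namespace.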

Building a rootfs

There are plenty of examples of using unshare to create a chroot without sudo, but those all assume you already have a rootfs somewhere to chroot into. Creating the rootfs itself already has a few difficulties though.

Since I'm building an Alpine Linux rootfs the utility I'm going to use is apk.static. This is a statically compiled version of the package manager in Alpine which allows building a new installation from an online repository. This is similar to debootstrap, for example, if you're more used to Debian than Alpine.

There's a wiki page on running Alpine Linux in a chroot that documents the steps required for setting up a chroot the traditional way with this. The initial commands to acquire the apk.static binary don't require superuser at all, but after that the problems start:

$ ./apk.static -X ${mirror}/latest-stable/main -U --allow-untrusted -p ${chroot_dir} --initdb add alpine-base

This creates the Alpine installation in ${chroot_dir}. This requires superuser privileges to set the correct permissions on the files of this new rootfs. After this there are two options for populating /dev inside this rootfs, both of which are problematic:

$ mount -o bind /dev ${chroot_dir}/dev
mounting requires superuser privileges and this exposes all your hardware in the chroot

$ mknod -m 666 ${chroot_dir}/dev/full c 1 7
$ mknod -m 644 ${chroot_dir}/dev/random c 1 8
... etcetera, the mknod command also requires superuser privileges

The steps after this have similar issues, most of them for mount reasons or chown reasons.

There are a few namespace options in unshare that work around these issues. The command used to run apk.static in my test implementation is this:

$ unshare \
    --user \
    --map-users=100000,0,10000 \
    --map-groups=100000,0,10000 \
    --setuid 0 \
    --setgid 0 \
    --wd "${chroot_dir}" \
    ./apk-tools-static -X...etc

This will use unshare to create a new userns and change the uid/gid inside that to 0. This effectively grants root privileges inside this namespace. But that's not enough.

If chown is used inside the namespace it will still fail, because my unprivileged user still can't change the ownership of those files. The solution to that is the uid remapping with --map-users and --map-groups. In the example above it sets up the namespace so files created with uid 0 end up with uid 100000 on the actual filesystem. uid 1 becomes 100001, and this continues on for 10000 uids.
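The arithmetic behind that mapping can be sketched as a small shell function. ns_to_host_uid is a hypothetical helper, not something unshare provides, and its defaults assume the offset of 100000 and the count of 10000 described above.

```shell
# Hypothetical helper: translate a uid inside the namespace to the uid that
# ends up on the host filesystem.
# Arguments: inner_uid [outer_start] [inner_start] [count]
ns_to_host_uid() (
    uid=$1
    outer=${2:-100000}   # first host uid of the mapped range (assumed)
    inner=${3:-0}        # first uid inside the namespace
    count=${4:-10000}    # number of uids in the range
    if [ "$uid" -lt "$inner" ] || [ "$uid" -ge $((inner + count)) ]; then
        echo "uid $uid is outside the mapped range" >&2
        exit 1
    fi
    echo $((outer + uid - inner))
)

ns_to_host_uid 0   # prints 100000
ns_to_host_uid 1   # prints 100001
```

Any uid outside the mapped range has no counterpart on the host, which is why the helper returns an error for it.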

This again does not completely solve the issue, because my unprivileged user still can't chown those files, no matter whether it's chowning to uid 0 or 100000. To give my unprivileged user this permission, the /etc/subuid and /etc/subgid files on the host system have to be modified to add a rule. Sadly, setting this up requires root privileges once. To make the command above work I had to add this line to those two files:

martijn:100000:10000

This grants the user with the name martijn the permission to use 10000 uids starting at uid 100000 for the purpose of userns mapping.
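To check whether such a rule already exists before running unshare, something like the following could be used. It's a sketch: subid_range is a made-up helper name, and it only looks at the first matching entry, while the real files may contain several ranges per user.

```shell
# Hypothetical helper: print "start count" for the first entry of a user in a
# subuid-style file (name:start:count). Returns non-zero if there is none.
# Arguments: user [file]
subid_range() (
    user=$1
    file=${2:-/etc/subuid}
    awk -F: -v u="$user" '
        $1 == u { print $2, $3; found = 1; exit }
        END     { exit !found }
    ' "$file"
)

# Example: report the range for the current user, if any.
subid_range "$(id -un)" || echo "no subuid range configured for $(id -un)"
```

The same function works for /etc/subgid by passing that path as the second argument.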

The result of this is that the apk.static command will seem to Just Work(tm) and the resulting files in ${chroot_dir} will have all the right permissions, only offset by 100000.

One more catch

There is one more complication with remapped uids and unshare that I've skipped over in the above example to keep it clear: the command inside the namespace most likely cannot start.

If you remap the uid with unshare you get more freedom inside the namespace, but it limits your privileges outside the namespace even further. It's most likely that the unshare command above was run somewhere in your own home directory. After changing your uid to 0 inside the namespace, your privileges towards the outside world will be those of uid 100000, and that uid most likely does not have any. If any of the folders in the path to the executable you want unshare to run inside the namespace doesn't have the read and execute bits set for "other" in the unix permissions, the command will simply fail with "Permission denied".

The workaround used in my test implementation is to just first copy the executable over to /tmp and hope you at least still have permissions to read there.
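Whether the copy is needed at all can be checked up front. The following is a rough sketch with a hypothetical helper name, other_can_exec. It only inspects the classic "other" mode bits via stat -c %a and ignores ACLs and other access controls. Strictly speaking, traversing a directory only needs the execute bit, so that is all it checks for the folders.

```shell
# Hypothetical helper: succeed if "other" can read and execute the file and
# can traverse every directory on the way to it.
other_can_exec() (
    path=$(realpath -- "$1") || exit 1
    mode=$(stat -c %a -- "$path") || exit 1
    last=${mode#"${mode%?}"}              # last octal digit = "other" bits
    [ $((last & 5)) -eq 5 ] || exit 1     # r (4) and x (1) on the file
    dir=$(dirname -- "$path")
    while :; do
        mode=$(stat -c %a -- "$dir") || exit 1
        last=${mode#"${mode%?}"}
        [ $((last & 1)) -eq 1 ] || exit 1 # x (1) on each directory
        [ "$dir" = / ] && break
        dir=$(dirname -- "$dir")
    done
    exit 0
)

# Example: fall back to a copy in /tmp only when it is needed.
# other_can_exec ./apk.static || cp ./apk.static /tmp/
```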

Completing the rootfs

So after all that, the first command from the Alpine guide is done. Now only the problems of mounting filesystems and creating files are left.

While /etc/subuid does give permission to use a range of uids as an unprivileged user with a user namespace it does not give you permissions to create those files outside the namespace. So the way those files are created is basically the complicated version of echo "value" | sudo tee /root/file:

$ echo "nameserver a.b.c.d" | unshare \
    --user \
    --map-users=100000,0,10000 \
    --map-groups=100000,0,10000 \
    --setuid 0 \
    --setgid 0 \
    --wd "${chroot_dir}" \
    sh -c 'cat > etc/resolv.conf'

This does set up and tear down the entire namespace for every file change or creation, which is a bit inefficient, but inefficient is still better than impossible. Changing file permissions is done in a similar way.
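Since the same unshare invocation comes back for every one of these operations, it can be wrapped in a helper. This is a sketch and not the code from my test implementation: ns_run, ns_write and the NS_DRY_RUN variable are names made up for this example, and the mapping assumes the 100000 offset from the /etc/subuid line above.

```shell
# Hypothetical helper: run a command as uid 0 inside a fresh user namespace
# with the rootfs as working directory. With NS_DRY_RUN=1 the unshare command
# line is printed instead of executed.
# The comma form of --map-users is the one used in this post; newer
# util-linux releases also document an inner:outer:count form.
ns_run() (
    chroot_dir=$1
    shift
    set -- unshare --user \
        --map-users=100000,0,10000 \
        --map-groups=100000,0,10000 \
        --setuid 0 --setgid 0 \
        --wd "$chroot_dir" -- "$@"
    if [ "${NS_DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        exec "$@"
    fi
)

# Hypothetical helper: write stdin to a path relative to the rootfs.
ns_write() {
    ns_run "$1" sh -c 'cat > "$1"' sh "$2"
}

# Example usage (not executed here):
# echo "nameserver a.b.c.d" | ns_write "${chroot_dir}" etc/resolv.conf
# ns_run "${chroot_dir}" chmod 644 etc/resolv.conf
```

Every call still creates a new namespace, so this only tidies up the calling code. It does not make it faster.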

To fix the mounting issue there's the mount namespace functionality in Linux. This allows creating new mounts inside the namespace as long as you still have permissions on the source file as your unprivileged user. This effectively means you can't use this to mount random block devices but it works great for things like /proc and loop mounts.

There is a --mount-proc parameter that will tell unshare to set up a mount namespace and then mount /proc inside the namespace at the right place, so that's what I'm using. But I still need other things mounted. This mounting is done as a small inline shell script right before executing the commands inside the chroot:

$ unshare \
    --user \
    --fork \
    --pid \
    --mount \
    --mount-proc \
    --map-users=100000,0,10000 \
    --map-groups=100000,0,10000 \
    --setuid 0 \
    --setgid 0 \
    --wd "${chroot_dir}" \
    -- \
    sh -c " \
        mount -t proc none proc ; \
        touch dev/zero ; \
        mount -o rw,bind /dev/zero dev/zero ; \
        touch dev/null ; \
        mount -o rw,bind /dev/null dev/null ; \
        ...
        chroot . bin/sh \
        "

The mounts are created right after setting up the namespaces but before the chroot is started, so the host filesystem can still be accessed. The working directory is set to the root of the rootfs using the --wd parameter of unshare and then bind mounts are made from /dev/zero to dev/zero, and so on, to create those devices inside the rootfs.

This combines the two impossible options to make it work. mknod still does not work inside user namespaces because it is a bit of a security risk. Mounting all of /dev gives access to way too many devices that are not needed, but the mount namespace does allow bind-mounting the existing device nodes one by one, which lets me filter them.
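That filtering can be expressed as a short generator for the mount commands. A minimal sketch, assuming the working directory is the rootfs as in the command above. dev_bind_script is a hypothetical name, and the list of devices in the usage comment is only an example.

```shell
# Hypothetical helper: emit the commands that bind-mount the named host
# device nodes onto placeholder files inside the rootfs (relative paths).
dev_bind_script() (
    for name in "$@"; do
        printf 'touch dev/%s\n' "$name"
        printf 'mount -o rw,bind /dev/%s dev/%s\n' "$name" "$name"
    done
)

# Example usage inside the unshare call (not executed here):
# sh -c "$(dev_bind_script zero null full random urandom); chroot . bin/sh"
dev_bind_script null
```

Only the devices named on the command line end up in the chroot. Everything else in the host's /dev stays out of reach.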

Then finally... the chroot command to complete the journey. This has to refer to the rootfs with a relative path, and this also depends on the working directory being set by unshare, since host paths break with uid remapping.

What's next?

So this creates a full chroot without superuser privileges (after the initial setup) and this whole setup even works perfectly with having cross-architecture chroots in combination with binfmt_misc.

Compared to pmbootstrap this codebase does very little and there are more problems to solve. For one, all the filesystem manipulation has to be figured out to copy the contents of the chroot into a filesystem image that can be flashed. This is further complicated by the mangling of the uids in the host filesystem, so they have to be remapped again while writing into the filesystem image.

Flashing the image to a fastboot-capable device should be pretty easy without root privileges; it only requires a udev rule that is usually already installed by the android-tools package on various Linux distributions. For the PinePhone, flashing happens on a mass-storage device and as far as I know it will be impossible to write to that without actual superuser privileges.

The code for this is in the ~martijnbraam/ambootstrap repository, hopefully in some time I get this to actually write a plain Alpine Linux image to a phone :D