I've been spending too much time trying to get WebKit built in BodgeOS, so I decided to distract myself by messing with the kernel package instead.
Currently BodgeOS has a single kernel package called linux-lts, which takes up 200MB when installed. This is a kernel with a relatively large number of kconfig options enabled. It already uses the normal splitting options from the abuild system that split off all the documentation and development headers, but what if I can push that a bit further?
My initial idea was to split off the kernel modules for stuff you'd generally never use into a separate package, but then the issue becomes deciding what should go in there. And if I need anything at all from that package, I once again have to pull in everything...
The logical solution to this is to split it into the smallest possible chunks instead: one package for every module in the kernel. This would most likely have quite a bit of overhead, but luckily the apk package manager from Alpine Linux is pretty fast, so installing many small packages shouldn't be an issue.
Dependencies
It's not super hard to make a lot of split packages for the kernel. The more annoying part is that, to make them easy to install, those packages should also carry the correct dependencies between modules. If you run lsmod on your current Linux system you'll see that there are quite a few dependencies between loaded kernel modules:

So how does modprobe know what modules to load? The magic is in the depmod command, which is run during the package step of the kernel package anyway. This command analyses the module files and writes a plain text file describing the dependencies between them. This file is stored on the system in /lib/modules/$kernelversion/modules.dep and basically stores the same information as the screenshot above, but for all the modules for the kernel and with file paths instead of module names.
kernel/drivers/net/wireless/intel/iwlwifi/dvm/iwldvm.ko.zst: kernel/drivers/net/wireless/intel/iwlwifi/iwlwifi.ko.zst kernel/net/mac80211/mac80211.ko.zst kernel/lib/crypto/libarc4.ko.zst kernel/net/wireless/cfg80211.ko.zst kernel/net/rfkill/rfkill.ko.zst
kernel/drivers/net/wireless/intel/iwlwifi/mvm/iwlmvm.ko.zst: kernel/drivers/net/wireless/intel/iwlwifi/iwlwifi.ko.zst kernel/net/mac80211/mac80211.ko.zst kernel/lib/crypto/libarc4.ko.zst kernel/net/wireless/cfg80211.ko.zst kernel/drivers/ptp/ptp.ko.zst kernel/drivers/pps/pps_core.ko.zst kernel/net/rfkill/rfkill.ko.zst
Or with the paths removed for readability:
iwldvm.ko.zst: iwlwifi.ko.zst mac80211.ko.zst libarc4.ko.zst cfg80211.ko.zst
iwlmvm.ko.zst: iwlwifi.ko.zst mac80211.ko.zst libarc4.ko.zst ...
This shouldn't be too hard to parse for package dependencies...
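As a rough illustration of how little parsing is needed, here is a minimal sketch that turns modules.dep lines into a name-based dependency map. The function names are made up for this example and are not taken from the actual script; it also assumes every module file name contains ".ko" followed by an optional compression suffix.

```python
import os


def module_name(path):
    """Strip the directory and the .ko/.ko.zst style suffix from a module path."""
    return os.path.basename(path).split(".ko", 1)[0]


def parse_modules_dep(text):
    """Return {module name: [names of the modules it depends on]}."""
    deps = {}
    for line in text.splitlines():
        target, sep, rest = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        deps[module_name(target)] = [module_name(p) for p in rest.split()]
    return deps


# The iwldvm line from the modules.dep excerpt above.
sample = (
    "kernel/drivers/net/wireless/intel/iwlwifi/dvm/iwldvm.ko.zst: "
    "kernel/drivers/net/wireless/intel/iwlwifi/iwlwifi.ko.zst "
    "kernel/net/mac80211/mac80211.ko.zst "
    "kernel/lib/crypto/libarc4.ko.zst "
    "kernel/net/wireless/cfg80211.ko.zst "
    "kernel/net/rfkill/rfkill.ko.zst\n"
)

print(parse_modules_dep(sample))
# {'iwldvm': ['iwlwifi', 'mac80211', 'libarc4', 'cfg80211', 'rfkill']}
```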
Adding Python to the kernel build
I'm sure it's possible to make an extremely convoluted shell script one-liner that calls 5 utilities to generate the data I need from just this file. But I don't enjoy writing obfuscated code and would rather have something slightly more maintainable.
That's why I decided to make a small Python script that parses this file and generates exactly the data needed for abuild to create the split packages for me.
This 80 line Python script does 3 things. First, it has an option to generate the names of the packages that should be split off:
$ ./subpkg.py --subpackages modules.dep "linux-lts-mod-{}:_splitmod"
linux-lts-mod-power:_splitmod linux-lts-mod-iwlwifi:_splitmod....
This causes abuild to call the _splitmod function in the build script for every module generated by the kernel build. Then in this function a bit more information is required to create the package:
_splitmod() {
	# Module name is whatever follows "$pkgname-mod-" in the subpackage name
	local _mod=${subpkgname#$pkgname-mod-}
	local _prefix=usr/lib/modules/$pkgver-$pkgrel-lts
	local _moddir="$pkgdir"/$_prefix
	# Depend on the main kernel package plus the packages of all required modules
	depends="$pkgname=$pkgver-r$pkgrel $(python3 $srcdir/subpkg.py --deps ${_mod} ${_moddir}/modules.dep $pkgname-mod-{}=$pkgver-r$pkgrel)"
	amove ${_prefix}/$(python3 $srcdir/subpkg.py --path ${_mod} ${_moddir}/modules.dep a)
}
First subpkg.py is called with the --deps $modulename argument to generate a list of package dependencies for that module, using the same placeholder system as the first command.
The second call converts the module name back to its filesystem path so the file can be added to the package. Together this makes an absolutely massive number of tiny packages that all depend on each other in the correct way.
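To make the three modes a bit more concrete, here is a hedged sketch of what they boil down to. This is not the actual subpkg.py: the function names, the two sample lines, and the 1.0-r0 version string are all invented for illustration, and the "{}" placeholder is simply filled in with str.format.

```python
import os


def _name(path):
    """Module name from a path such as kernel/net/rfkill/rfkill.ko.zst."""
    return os.path.basename(path).split(".ko", 1)[0]


def load(text):
    """Map module name -> (file path, [dependency names]) from modules.dep text."""
    table = {}
    for line in text.splitlines():
        target, sep, rest = line.partition(":")
        if not sep:
            continue  # skip blank or malformed lines
        table[_name(target)] = (target.strip(), [_name(p) for p in rest.split()])
    return table


def subpackages(table, template):
    """Roughly what --subpackages prints: one template expansion per module."""
    return " ".join(template.format(name) for name in table)


def deps_for(table, module, template):
    """Roughly what --deps prints: the dependencies as package names."""
    return " ".join(template.format(dep) for dep in table[module][1])


def path_for(table, module):
    """Roughly what --path prints: the module file relative to the module dir."""
    return table[module][0]


# Illustrative input only, not copied from a real modules.dep.
sample = (
    "kernel/net/rfkill/rfkill.ko.zst:\n"
    "kernel/net/wireless/cfg80211.ko.zst: kernel/net/rfkill/rfkill.ko.zst\n"
)
table = load(sample)
print(subpackages(table, "linux-lts-mod-{}:_splitmod"))
print(deps_for(table, "cfg80211", "linux-lts-mod-{}=1.0-r0"))
print(path_for(table, "cfg80211"))
```

A module without dependencies yields an empty string from deps_for, so its package would depend only on the main kernel package.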

The result is that the main linux-lts package is reduced to 23MB installed size (14MB compressed). That is still way too big, but that's mostly down to the configuration of the kernel.
All together there are exactly 6200 split packages generated, totalling 143MB. That brings the total for the kernel plus modules up to 157MB, while the original unsplit package was 140MB. So there is 17MB of overhead for the metadata and wrapping of 6200 packages, but unless I install all the kernel modules my install will most likely be smaller in the end.
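For a sense of scale, the overhead can be broken down per package. This is just back-of-the-envelope arithmetic on the numbers above, assuming 1MB means 1024KiB:

```python
packages = 6200
split_total_mb = 157  # kernel plus all split module packages
unsplit_mb = 140      # original single package

overhead_mb = split_total_mb - unsplit_mb
per_package_kib = overhead_mb * 1024 / packages

print(overhead_mb, round(per_package_kib, 1))
# 17 2.8
```

So each split package costs somewhere under 3KiB of extra metadata and wrapping on average.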
Improving this
Splitting the package up adds about 30 minutes to the build process. Most of that is the overhead of creating a split package in the abuild build process, but launching a new Python interpreter twice for every subpackage in the splitting step also contributes. This can probably be made a bit faster by not writing it in Python, but the extra 30 minutes is a one-time cost per build and comes on top of the much longer time spent building the kernel package itself, which I sadly did not note down on my first build without ccache. It was probably somewhere around 1.5 to 2 hours on this machine.
To make this actually usable I'd also need a utility that helps select which kernel modules I need for my hardware, and a linux-lts-all package that does an initial install with all the modules so I can boot my system. Then it should be as easy as parsing the output of lsmod and installing the minimum set of packages to make my laptop run. This is quite similar to the process that's normally used to create a minimal kernel kconfig for booting your own hardware with only the modules it needs.
The other big blob of data currently shipped with the kernel is the linux-firmware package. It is already split into vendor-specific files, but it might be possible to match up the right files in the firmware package with the kernel modules that can actually load them, so installing the modules pulls in the minimum set of firmware files that are needed.
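A minimal sketch of that lsmod step could look like the following. None of this exists yet: the function names are hypothetical and the sizes in the sample output are made up. One detail to watch is that lsmod always reports module names with underscores, while the module file names (and so the package names derived from them) may contain hyphens, so both sides are normalised before matching.

```python
def loaded_modules(lsmod_text):
    """Return the module names from lsmod output, skipping the header line."""
    lines = lsmod_text.splitlines()[1:]
    return [line.split()[0] for line in lines if line.strip()]


def packages_for(lsmod_text, known_modules, template="linux-lts-mod-{}"):
    """Map loaded modules to package names, ignoring modules without a package."""
    by_kernel_name = {m.replace("-", "_"): m for m in known_modules}
    return sorted(
        template.format(by_kernel_name[m])
        for m in loaded_modules(lsmod_text)
        if m in by_kernel_name
    )


# Made-up lsmod output for illustration.
lsmod_output = (
    "Module                  Size  Used by\n"
    "iwlmvm                397312  0\n"
    "snd_hda_intel          61440  1\n"
    "not_packaged            4096  0\n"
)
known = ["iwlmvm", "snd-hda-intel", "rfkill"]

print(packages_for(lsmod_output, known))
# ['linux-lts-mod-iwlmvm', 'linux-lts-mod-snd-hda-intel']
```

Since the split packages already carry their dependencies, only the modules that are actually loaded would need to be passed to apk; anything they depend on should get pulled in automatically.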
But the minimal proof of concept works!
