Upgrade Guide: v0.5.1 to v0.5.2
v0.5.2 changes how the read-only rootfs erofs is produced. Through
v0.5.1, every host that ran shed create invoked mkfs.erofs
locally to flatten the OCI layer tarballs into a single erofs file
cached at {images_dir}/cache/sha256/<manifest-digest>.erofs. That
coupled the on-disk erofs format to whatever erofs-utils version
the host distro happened to ship. On Ubuntu noble (as well as
Pop!_OS 24.04 and the mini2/mini3 deployment targets), the shipped
erofs-utils 1.7.1 has a writer bug that emits per-inode
big-pcluster headers without the matching superblock feature flag.
No kernel can mount the resulting filesystems. In practice, shed create
boots the VM, the kernel rejects /dev/vdb with
erofs: per-inode big pcluster without sb feature ... err [-117],
userspace can't read /workspace, and the agent's 9P mount times out.
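To judge whether a v0.5.1 host is exposed to this bug, it can help to compare the installed erofs-utils version against 1.7.1. The following is a minimal sketch, not part of shed: is_known_bad_erofs_utils is a hypothetical helper that flags only the 1.7.1 release named in this guide (other versions are not assessed), and the dpkg-query lookup assumes a Debian-family host.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the version string is the 1.7.1 release
# that this guide identifies as having the writer bug. Distro suffixes such
# as "-1build1" are accepted; "1.7.10" is not treated as a match.
is_known_bad_erofs_utils() {
    case "$1" in
        1.7.1|1.7.1[-+~]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Only query dpkg when it is available (assumption: Debian/Ubuntu host).
if command -v dpkg-query >/dev/null 2>&1; then
    installed=$(dpkg-query -W -f='${Version}' erofs-utils 2>/dev/null || true)
    if [ -n "$installed" ] && is_known_bad_erofs_utils "$installed"; then
        echo "erofs-utils $installed matches the known-bad 1.7.1 release"
    fi
fi
```

A match only means the host ships the affected writer; from v0.5.2 onward the host's erofs-utils is no longer involved in building the rootfs.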
v0.5.2 fixes this by moving the mkfs.erofs invocation to image
publish time, inside the new ghcr.io/charliek/shed-build-tools:vX.Y.Z
container image that pins a known-good erofs-utils version (1.9.1 at
the time of writing). The resulting erofs ships as a content-addressed
OCI blob with the new io.shed.rootfs.erofs.digest annotation. Hosts
just download the blob and mount it directly — no local mkfs.erofs
required.
Breaking changes for users
There is no on-host fallback in v0.5.2+. Pre-v0.5.2 images
(those built before this change, including everything published
through v0.5.1) lack the new annotation and fail to boot with:
image manifest <digest> lacks io.shed.rootfs.erofs.digest annotation
(built with pre-v0.5.2 tooling). Re-pull against current images:
shed image rm <digest> && shed-server pull-images.
This is intentional: silently falling back to the broken local
mkfs.erofs would re-introduce the bug we're escaping.
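As a pre-flight check, you can look for the annotation in a manifest before trying to boot from it. The sketch below assumes you have already saved the image manifest JSON to a local file by some means; how shed stores manifests on disk is not covered here, and has_erofs_annotation is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the manifest JSON file mentions the
# io.shed.rootfs.erofs.digest annotation key. This is a plain-text match,
# not a JSON parse, so treat it as a quick check rather than validation.
has_erofs_annotation() {
    grep -q '"io\.shed\.rootfs\.erofs\.digest"' "$1"
}

# Usage (manifest.json is a placeholder path):
# has_erofs_annotation manifest.json || echo "pre-v0.5.2 image: re-pull required"
```

A text match keeps the check free of extra dependencies; a stricter version would parse the JSON and confirm the key sits under the manifest's annotations.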
Required upgrade steps
Every host running shed-server must wipe its local image cache and re-pull against the v0.5.2 published images.
1. Stop existing sheds
The new image format isn't compatible with the cached lowers v0.5.1 created. Sheds created against v0.5.1 images must be deleted before the upgrade; the only way to keep them is to revert to v0.5.1 temporarily.
shed list # see what's running
shed stop <name> # for each running shed
shed delete <name> --force # for each shed to be replaced
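With several sheds to replace, the two commands above can be wrapped in a small loop. This is a sketch only: retire_sheds and the DRY_RUN variable are hypothetical, and the shed names are passed explicitly because the output format of shed list is not described in this guide.

```shell
#!/bin/sh
# Hypothetical wrapper around "shed stop" and "shed delete --force".
# With DRY_RUN=1 it only prints the commands, so the list can be reviewed
# before anything is deleted.
retire_sheds() {
    for name in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "shed stop $name"
            echo "shed delete $name --force"
        else
            shed stop "$name"
            shed delete "$name" --force
        fi
    done
}

# Example (shed names are placeholders):
# DRY_RUN=1 retire_sheds api-dev scratch
```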
2. Upgrade the shed-server package
3. Wipe the legacy image cache
# Removes every tag (manifest digests remain until prune)
shed image ls -q | xargs -r -n1 shed image rm
shed image prune --force
If your images_dir still contains a legacy cache/ directory
(holdover from v0.5.1's locally-materialized erofs files), it's safe
to remove:
sudo rm -rf /var/lib/shed/firecracker/images/cache
# Or, for VZ: rm -rf ~/Library/Application\ Support/shed/vz/cache
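Because rm -rf is unforgiving when a path is mistyped or a variable is empty, a guarded version of the removal may be preferable in scripts. The helper below is a hypothetical sketch: it takes the images_dir as an argument and only touches the cache/ directory beneath it.

```shell
#!/bin/sh
# Hypothetical helper: remove the legacy cache/ directory under a given
# images_dir, refusing an empty argument and doing nothing when the
# directory is already gone.
remove_legacy_cache() {
    images_dir=$1
    if [ -z "$images_dir" ]; then
        echo "usage: remove_legacy_cache <images_dir>" >&2
        return 2
    fi
    if [ ! -d "$images_dir/cache" ]; then
        echo "no legacy cache under $images_dir"
        return 0
    fi
    rm -rf -- "$images_dir/cache"
    echo "removed $images_dir/cache"
}

# Example (Firecracker path from this guide; it needs root there):
# remove_legacy_cache /var/lib/shed/firecracker/images
```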
4. Re-pull the v0.5.2 images
shed-server pull-images
This downloads the new manifests, each carrying both the layer
tarballs (unchanged) and the prebuilt rootfs erofs blob (new).
On-disk usage is comparable to v0.5.1's cached layout (we trade
a derived cache/ file for a content-addressed blob), with the
upside that the blob is shared across every tag pointing at the
same manifest digest.
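Content-addressed means the blob's digest is the SHA-256 of its bytes, so a downloaded blob can be checked against the digest carried in the annotation. The sketch below illustrates that check under stated assumptions: verify_blob_digest is a hypothetical helper, and the blob path is an argument because this guide does not say where shed stores blobs on disk.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the file's SHA-256 matches an OCI-style
# digest string of the form "sha256:<64 hex chars>".
verify_blob_digest() {
    file=$1
    want=${2#sha256:}
    if command -v sha256sum >/dev/null 2>&1; then
        got=$(sha256sum < "$file" | awk '{print $1}')
    else
        # macOS ships shasum rather than sha256sum.
        got=$(shasum -a 256 < "$file" | awk '{print $1}')
    fi
    [ "$got" = "$want" ]
}

# Usage (both arguments are placeholders):
# verify_blob_digest /path/to/blob "sha256:<digest from the annotation>"
```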
5. Create a new shed
What's new architecturally
- ghcr.io/charliek/shed-build-tools image. Pinned mkfs.erofs (plus dump.erofs, fsck.erofs) tagged in lockstep with shed releases. See Build Tools.
- New manifest annotation io.shed.rootfs.erofs.digest carries the prebuilt erofs blob's content digest. Sits alongside the existing io.shed.kernel.digest and io.shed.initrd.digest annotations (same loose-blob pattern).
- No more on-host mkfs.erofs dependency. Hosts running shed-server no longer need erofs-utils installed. Existing installations can apt remove erofs-utils safely (the new image pipeline only needs it for the publishing side, which runs inside the build-tools container).
- Image build flag --build-tools-version. shed image build pins the build-tools image used to mint the erofs. Defaults match the shed CLI's own version for clean releases; local dev should pass --build-tools-version dev against a make build-tools image.
Why no backwards compatibility
A backwards-compatible path would require keeping the local
mkfs.erofs flow as a fallback for pre-v0.5.2 images. That:
- Re-introduces the writer-bug surface area we're escaping
  (since the v0.5.1 images would still be flattened with whatever
  mkfs.erofs the host happens to have).
- Keeps ~200 lines of cache.go and the docker-export fallback alive indefinitely.
- Makes the failure mode ambiguous when something goes wrong: was it the new path or the old path?
The cleaner approach: hard-fail on the missing annotation, document the precise upgrade command, and accept a one-time inconvenience in exchange for a permanent simplification.
Rollback
If v0.5.2 doesn't work in your environment:
sudo apt install --reinstall shed-server=0.5.1
shed image rm <every v0.5.2 tag>
shed-server pull-images # re-pulls v0.5.1 images
Note that v0.5.1 itself was broken end-to-end on Ubuntu noble (the
bug this guide describes), so "rollback" is only useful if you have
a known-working environment whose erofs-utils is not the affected
1.7.1 (e.g., a hand-built 1.8.x).