Upgrade Guide: v0.5.1 to v0.5.2

v0.5.2 changes how the read-only rootfs erofs is produced. Through v0.5.1, every host that ran shed create invoked mkfs.erofs locally to flatten the OCI layer tarballs into a single erofs file cached at {images_dir}/cache/sha256/<manifest-digest>.erofs. That coupled the on-disk erofs format to whatever erofs-utils version the host distro happened to ship. On Ubuntu noble (and Pop!_OS 24.04, and the mini2/mini3 deployment targets), the shipped erofs-utils 1.7.1 has a writer bug: it emits per-inode big-pcluster headers without the matching superblock feature flag. No kernel can mount the resulting filesystems. shed create boots the VM, the kernel rejects /dev/vdb with erofs: per-inode big pcluster without sb feature ... err [-117], userspace can't read /workspace, and the agent's 9P mount times out.
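
To check whether a host carries the affected package, a small helper like the following may be useful. It is only a sketch: the function name is hypothetical, and it assumes the Debian/Ubuntu package version string begins with the upstream version (1.7.1, as named above).

```shell
# Hypothetical helper: succeeds if the given erofs-utils package version
# is the 1.7.1 release described above as affected.
is_affected_erofs_utils() {
  case "$1" in
    1.7.1|1.7.1[-+~]*) return 0 ;;   # e.g. "1.7.1" or "1.7.1-<distro revision>"
    *) return 1 ;;                   # anything else, including an empty string
  esac
}

# Usage on a Debian/Ubuntu host (ver is empty if the package is not installed):
#   ver="$(dpkg-query -W -f='${Version}' erofs-utils 2>/dev/null)"
#   is_affected_erofs_utils "$ver" && echo "erofs-utils $ver has the writer bug"
```

From v0.5.2 on, the result no longer matters for running sheds, since hosts never invoke mkfs.erofs; the check is mainly useful for confirming why a v0.5.1 host was failing.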

v0.5.2 fixes this by moving the mkfs.erofs invocation to image publish time, inside the new ghcr.io/charliek/shed-build-tools:vX.Y.Z container image that pins a known-good erofs-utils version (1.9.1 at the time of writing). The resulting erofs ships as a content-addressed OCI blob with the new io.shed.rootfs.erofs.digest annotation. Hosts just download the blob and mount it directly — no local mkfs.erofs required.
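
A quick way to tell whether an image was published with the new tooling is to look for the annotation key in its manifest. The sketch below assumes you have already saved the image manifest JSON to a local file with whichever registry tool you prefer; the function name and the file path are hypothetical.

```shell
# Hypothetical helper: succeeds if the manifest JSON file mentions the
# io.shed.rootfs.erofs.digest annotation key. This is a plain-text match
# for a quick check; it does not validate the digest value itself.
has_erofs_annotation() {
  grep -q '"io\.shed\.rootfs\.erofs\.digest"' "$1"
}

# Usage (manifest.json is a file you saved yourself):
#   has_erofs_annotation manifest.json && echo "built with v0.5.2+ tooling"
```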

Breaking changes for users

There is no on-host fallback in v0.5.2+. Pre-v0.5.2 images (those built before this change, including everything published through v0.5.1) lack the new annotation and fail to boot with:

image manifest <digest> lacks io.shed.rootfs.erofs.digest annotation
(built with pre-v0.5.2 tooling). Re-pull against current images:
  shed image rm <digest> && shed-server pull-images.

This is intentional: silently falling back to the broken local mkfs.erofs would re-introduce the bug we're escaping.

Required upgrade steps

Every host running shed-server must wipe its local image cache and re-pull against the v0.5.2 published images.

1. Stop existing sheds

The new image format isn't compatible with the cached lowers that v0.5.1 created. Sheds created against v0.5.1 images must be deleted before the upgrade; to keep them, you would have to revert to v0.5.1 temporarily.

shed list                       # see what's running
shed stop <name>                # for each running shed
shed delete <name> --force      # for each shed to be replaced
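
With more than a couple of sheds, the per-shed commands above can be wrapped in a loop. This is a minimal sketch: the function name is hypothetical, and the shed names are passed in by hand (taken from your reading of shed list) because the output format of shed list is not assumed here.

```shell
# Hypothetical helper: stop and then delete each shed named on the command line.
# Uses only the commands shown above.
replace_sheds() {
  for name in "$@"; do
    # A failed stop is reported but not fatal; the shed may already be stopped.
    shed stop "$name" || echo "warning: could not stop $name (already stopped?)" >&2
    # A failed delete aborts the loop so nothing is silently left behind.
    shed delete "$name" --force || return 1
  done
}

# Usage (example names): replace_sheds dev-box scratch
```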

2. Upgrade the shed-server package

sudo apt update
sudo apt install --only-upgrade shed-server

3. Wipe the legacy image cache

# Removes every tag (manifest digests remain until prune)
shed image ls -q | xargs -r -n1 shed image rm
shed image prune --force

If your images_dir still contains a legacy cache/ directory (holdover from v0.5.1's locally-materialized erofs files), it's safe to remove:

sudo rm -rf /var/lib/shed/firecracker/images/cache
# Or, for VZ: rm -rf ~/Library/Application\ Support/shed/vz/cache
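
If you script this step across several hosts, a guarded variant avoids running rm -rf against an unset or wrong path. The following is a sketch only; the function name is hypothetical and the images_dir value is whatever your installation uses.

```shell
# Hypothetical helper: remove <images_dir>/cache only if it exists.
# An empty or missing argument aborts, so this can never expand to "/cache".
remove_legacy_cache() {
  images_dir="${1:?usage: remove_legacy_cache <images_dir>}"
  if [ -d "$images_dir/cache" ]; then
    rm -rf -- "$images_dir/cache"
    echo "removed $images_dir/cache"
  else
    echo "no legacy cache under $images_dir"
  fi
}

# Usage: remove_legacy_cache /var/lib/shed/firecracker/images
```

Because sudo cannot call a shell function directly, run this from a root shell on hosts where the images directory is root-owned.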

4. Re-pull the v0.5.2 images

sudo shed-server pull-images

This downloads the new manifests, each carrying both the layer tarballs (unchanged) and the prebuilt rootfs erofs blob (new). On-disk usage is comparable to v0.5.1's cached layout (we trade a derived cache/ file for a content-addressed blob), with the upside that the blob is shared across every tag pointing at the same manifest digest.

5. Create a new shed

shed create test --local-dir "$(mktemp -d)"
shed exec test -- ls /workspace
shed delete test --force
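
The same smoke test can be wrapped so that the test shed and its temporary directory are cleaned up even when a step fails. This is a sketch under the assumption that the three commands above behave as shown; the function name and the default shed name are hypothetical.

```shell
# Hypothetical wrapper around the three commands above. It always attempts to
# delete the test shed and the temp directory, and returns non-zero if any
# step failed.
smoke_test_shed() {
  name="${1:-test}"
  dir="$(mktemp -d)" || return 1
  status=0
  # Only run the exec check if the create succeeded.
  shed create "$name" --local-dir "$dir" \
    && shed exec "$name" -- ls /workspace \
    || status=1
  # Clean up regardless of the outcome above.
  shed delete "$name" --force || status=1
  rm -rf -- "$dir"
  return "$status"
}

# Usage: smoke_test_shed upgrade-check && echo "v0.5.2 images boot"
```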

What's new architecturally

  • ghcr.io/charliek/shed-build-tools image. Pinned mkfs.erofs (plus dump.erofs, fsck.erofs) tagged in lockstep with shed releases. See Build Tools.
  • New manifest annotation io.shed.rootfs.erofs.digest carries the prebuilt erofs blob's content digest. Sits alongside the existing io.shed.kernel.digest and io.shed.initrd.digest annotations (same loose-blob pattern).
  • No more on-host mkfs.erofs dependency. Hosts running shed-server no longer need erofs-utils installed. Existing installations can apt remove erofs-utils safely (the new image pipeline only needs it for the publishing side, which runs inside the build-tools container).
  • Image build flag --build-tools-version. shed image build pins the build-tools image used to mint the erofs. The default matches the shed CLI's own version for clean releases; for local development, pass --build-tools-version dev against an image built with make build-tools.

Why no backwards compatibility

A backwards-compatible path would require keeping the local mkfs.erofs flow as a fallback for pre-v0.5.2 images. That:

  1. Re-introduces the writer-bug surface area we're escaping (since the v0.5.1 images would still be flattened with whatever mkfs.erofs the host happens to have).
  2. Keeps ~200 lines of cache.go and the docker-export fallback alive indefinitely.
  3. Makes the failure mode ambiguous when something goes wrong — was it the new path or the old path?

Cleaner: hard-fail on the missing annotation, document the precise upgrade command, accept the one-time inconvenience for a permanent simplification.

Rollback

If v0.5.2 doesn't work in your environment:

sudo apt install --reinstall shed-server=0.5.1
shed image rm <every v0.5.2 tag>
sudo shed-server pull-images   # re-pulls v0.5.1 images

Note that v0.5.1 itself was broken end-to-end on Ubuntu noble (the bug this guide describes), so "rollback" is only useful if you have a known-working environment whose erofs-utils is not the affected 1.7.1 (e.g., a hand-built 1.8.x).