
E2E Test Catalogue

This page describes every end-to-end and integration test in cli/tests/e2e/ in plain language. Each entry states the precondition and the expected outcome, followed by a reference to the exact test function for traceability.

Test tiers

| Tier | Where it runs | What it needs |
| --- | --- | --- |
| Tier 1 | Local temp directory | The compiled aibox binary only |
| Tier 1 + Mock | Local temp directory | Binary + mock docker/podman scripts on PATH |
| Tier 2 | Remote SSH companion | aibox-e2e-testrunner container reachable (feature flag e2e) |

Tier 1 tests run automatically with cargo test. Tier 2 tests require the companion container and the --features e2e flag. The expensive visual matrix tests are opt-in and are not included in the default Tier 2 command.

From inside the aibox devcontainer, the companion is a remote SSH target, not a local Docker/Podman dependency. Check it with:

ssh -i /workspace/.aibox-e2e-runner-home/.ssh/id_ed25519 testuser@aibox-e2e-testrunner 'echo ok'

Missing docker or podman in the main devcontainer does not mean Tier 2 cannot talk to the companion. The tests deploy the current aibox binary and addons with SCP, then use the companion's own runtime for lifecycle checks.
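As a minimal sketch, the reachability probe above could be assembled from Rust test code roughly as follows. The helper names `companion_probe` and `probe_succeeded` are hypothetical and are not taken from the aibox test suite; only the key path, SSH target, and `echo ok` command come from this page.

```rust
use std::process::Command;

/// Hypothetical helper: builds (but does not run) the SSH probe command.
fn companion_probe(key_path: &str, target: &str) -> Command {
    let mut cmd = Command::new("ssh");
    cmd.arg("-i").arg(key_path).arg(target).arg("echo ok");
    cmd
}

/// Hypothetical helper: the probe counts as successful when the trimmed
/// standard output is exactly "ok".
fn probe_succeeded(stdout: &str) -> bool {
    stdout.trim() == "ok"
}
```

A caller would run `companion_probe(...).output()` and pass the captured standard output to `probe_succeeded`. Building the command separately from running it keeps the argument list checkable without a live companion.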

aibox commands inside the devcontainer

In normal dogfood use, the workspace container is the processkit/runtime side of the project: run pk-doctor there. aibox doctor is a host-side diagnostic and should not be used from inside the container to judge the live dogfood project. It can still run inside a container when that container is intentionally simulating a host environment for CLI development.

The aibox repository has deliberate exceptions because it develops the aibox CLI. Those exceptions simulate host-user behavior in controlled test projects:

  • Tier 1 tests run aibox init, aibox apply, aibox doctor, and related commands in temporary directories, usually without starting containers.
  • Tier 1 mock tests put fake docker/podman scripts on PATH to verify host runtime behavior without requiring a real runtime in the main devcontainer.
  • Tier 2 tests deploy the current binary to aibox-e2e-testrunner over SSH and may run aibox apply, aibox up, or aibox doctor there because the companion owns the nested container runtime for the test.
  • Release Phase 0 runs ./scripts/maintain.sh release-doctors, which invokes aibox doctor as an explicit host-context simulation.

Use these exceptions only in aibox CLI development/release harnesses. They are not general dogfood escape hatches.
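The mock-runtime approach used by the Tier 1 mock tests can be sketched as below: write a fake `docker` script into a scratch directory and put that directory first on PATH for the child process. This is an illustrative sketch for Unix hosts only. The helper names and the script body are assumptions, not the harness's actual code.

```rust
use std::env;
use std::ffi::OsString;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: writes a fake runtime script named `name` into `dir`
/// and marks it executable (Unix only).
#[cfg(unix)]
fn write_mock_runtime(dir: &Path, name: &str, script_body: &str) -> io::Result<PathBuf> {
    use std::os::unix::fs::PermissionsExt;
    let path = dir.join(name);
    fs::write(&path, format!("#!/bin/sh\n{}\n", script_body))?;
    fs::set_permissions(&path, fs::Permissions::from_mode(0o755))?;
    Ok(path)
}

/// Hypothetical helper: returns a PATH value with `dir` placed ahead of the
/// entries already on the current PATH.
fn path_with_mock_first(dir: &Path) -> Result<OsString, env::JoinPathsError> {
    let current = env::var_os("PATH").unwrap_or_default();
    let mut entries = vec![dir.to_path_buf()];
    entries.extend(env::split_paths(&current));
    env::join_paths(entries)
}
```

A test would then pass the returned value to `Command::env("PATH", ...)` when spawning the binary under test, so the fake script shadows any real runtime without changing the environment of the test process itself.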

The container-side release command runs Tier 2 as part of Phase 1 with cargo test --features e2e --test e2e, so the SSH companion, generated runtime probes, and non-ignored asciinema checks are release gates. The default Tier 2 suite intentionally performs only one full generated container build/start/probe. File-generation contracts use --no-container, and ./scripts/maintain.sh test-e2e prunes the companion's nested container storage once before and once after the suite, so the Podman vfs cache stays bounded without discarding layers between every test.

The release process also has a host-side generated-runtime smoke: ./scripts/maintain.sh release-runtime-smoke X.Y.Z. It is not an SSH companion test; it runs on the macOS host during release-host, creates a fresh downstream-style project, runs aibox init and aibox apply --standardize-config, starts the generated container, probes tmux-native status output and the diagnostics sidecar, and writes logs under dist/release-smoke/vX.Y.Z/. The default AIBOX_RELEASE_SMOKE_TIER=addons includes git-ui (lazygit) probes. Use minimal only for a quicker non-addon pass, or full to include preview addons and force --no-cache.

Opt-in visual E2E

Use these commands when the release diff touches generated runtime visuals or when the periodic full visual sweep is due:

| Command | Covers |
| --- | --- |
| ./scripts/maintain.sh test-e2e-visual-status | all generated layouts across all themes, tmux status/key rows, and theme RGB signatures |
| ./scripts/maintain.sh test-e2e-visual-tabs | tmux window traversal, Yazi surface, Vim, shell, lazygit, and every enabled AI harness |
| ./scripts/maintain.sh test-e2e-visual-yazi | Yazi preview plugins, optional preview tools, git symbols, and preview modes |
| ./scripts/maintain.sh test-e2e-visual | all visual tiers |
| ./scripts/maintain.sh test-e2e-doc-captures | all visual tiers plus .cast, .screen.txt, tmux log, and metadata artifacts under docs-site/static/img/e2e/ |

Set AIBOX_E2E_VISUAL_ARTIFACT_DIR to write documentation capture artifacts elsewhere. The artifacts are intended as source material for current-release website screenshots and screencasts.


Lifecycle — lifecycle.rs

Companion is reachable If the SSH connection to aibox-e2e-testrunner is attempted, then the host must respond with ok, confirming the companion container is up and reachable before any other Tier 2 test runs. The test also asserts that the companion image has the expected tmux and Yazi tools for the visual/runtime tests; stale companion images fail here with a rebuild hint. [lifecycle.rs · companion_is_reachable]

Init then apply produces valid project If aibox init is run followed by aibox apply, then aibox.toml, .devcontainer/Dockerfile, .devcontainer/docker-compose.yml, and CLAUDE.md must all exist in the workspace. [lifecycle.rs · lifecycle_init_apply]
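A file-existence contract like this one can be sketched as a helper that reports which expected paths are absent, so a failing assertion names the missing files. The constant and function names are hypothetical; only the four paths come from the entry above.

```rust
use std::path::{Path, PathBuf};

/// Files the entry above expects, relative to the workspace root.
const EXPECTED: [&str; 4] = [
    "aibox.toml",
    ".devcontainer/Dockerfile",
    ".devcontainer/docker-compose.yml",
    "CLAUDE.md",
];

/// Hypothetical helper: returns every expected path that is not a regular
/// file under `root`, in the order given.
fn missing_files(root: &Path, expected: &[&str]) -> Vec<PathBuf> {
    expected
        .iter()
        .map(|rel| root.join(rel))
        .filter(|p| !p.is_file())
        .collect()
}
```

A test would assert that `missing_files(workspace, &EXPECTED)` is empty after running aibox init and aibox apply.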

Generated container starts If a fresh project is initialized, applied, and the generated Compose service is started on the companion runtime, then the running container must expose /etc/aibox-version, tmux, Yazi, and valid aibox-status --plugin-json output. [lifecycle.rs · lifecycle_apply_starts_generated_container]

CLAUDE.md user content is preserved on apply If a user edits CLAUDE.md after aibox init and then runs aibox apply, then the edited content must still be present — aibox must not overwrite user-modified files. [lifecycle.rs · claudemd_preserved_on_sync]

Generated files are overwritten on apply If a generated file (e.g. .devcontainer/Dockerfile) is manually tampered with and aibox apply is run, then the file must contain regenerated content and the tampered content must be gone. [lifecycle.rs · generated_files_overwritten_on_sync]

Status reports missing when no container exists If aibox get runtime is run in a project with no running container, then the output must contain missing or equivalent wording. [lifecycle.rs · status_without_container_shows_missing]

Managed package writes the slim project skeleton If aibox init --harness claude is run, then the slim project skeleton must exist: aibox.toml, an empty context/ directory, AGENTS.md, and the thin provider pointer files for enabled harnesses. The single-file context tracks (BACKLOG.md, DECISIONS.md, STANDUPS.md) are not scaffolded by init — the corresponding processkit skills create entities in place on first use. [lifecycle.rs · init_with_managed_preset_creates_context_files]

Software package selection is recorded in aibox.toml If aibox init --context software is run, then the same slim project skeleton must exist. Concrete processkit content lands under context/skills/ only after aibox apply with a real [processkit].version pinned. [lifecycle.rs · init_with_software_preset_creates_code_files]


Addon management — addon.rs

Addon add writes to aibox.toml If aibox set addon python is run in an initialized project, then aibox.toml must contain an [addons.python] section afterwards. [addon.rs · set_addon_modifies_toml]

Addon remove cleans aibox.toml If a project is initialized with the python addon and then aibox delete addon python is run, then the [addons.python] section must no longer appear in aibox.toml. [addon.rs · delete_addon_cleans_toml]

Addon content appears in generated Dockerfile after apply If a project is initialized with the python addon and aibox apply is run, then .devcontainer/Dockerfile must contain Python-related content (install commands or references to uv). [addon.rs · addon_rebuild_includes_tools_in_dockerfile]

Addon list shows available addons If aibox get addon is run in an initialized project, then the output must list known addons such as python. [addon.rs · addon_list_shows_available]


Reset and backup — reset.rs

Reset with backup removes files and creates backup directory If aibox reset project --yes is run in an initialized project, then aibox.toml must be deleted and .aibox/backup/ must be created containing the backed-up files. [reset.rs · reset_creates_backup]

Reset with --no-backup removes all files without creating a backup If aibox reset project --no-backup --yes is run, then aibox.toml and .devcontainer/ must be deleted and .aibox/backup/ must not be created. [reset.rs · reset_no_backup_deletes_all]


Doctor diagnostics — doctor.rs

Doctor without a config reports an error If aibox doctor is run in a directory that has no aibox.toml, then the output must mention the missing config or config error and the command must still exit 0 (doctor is always non-fatal). [doctor.rs · doctor_reports_missing_files]

Doctor after init reports healthy checks If aibox doctor is run immediately after a successful aibox init, then the output must contain at least one passing check indicator (ok or a similar marker). [doctor.rs · doctor_after_init_reports_healthy]


Version upgrade flows — version_upgrade.rs

Generated Dockerfile contains version label If aibox init is run, then the generated .devcontainer/Dockerfile must contain a LABEL aibox.version line so the built image carries a machine-readable version stamp. [version_upgrade.rs · dockerfile_contains_aibox_version_label]

Generated Dockerfile writes version to /etc/aibox-version If aibox init is run, then the generated .devcontainer/Dockerfile must contain a RUN statement that writes to /etc/aibox-version inside the image, making the build version queryable from within a running container. [version_upgrade.rs · dockerfile_contains_etc_aibox_version_write]

Up fails when container image version mismatches config If an existing container was built from image v0.0.1 (mock label) and aibox.toml pins the current version, then aibox up must exit non-zero and output a message containing mismatch and a suggestion to run aibox apply. [version_upgrade.rs · start_fails_on_image_version_mismatch]

Up succeeds when container image version matches config If an existing container reports the same image version as the one pinned in aibox.toml, then aibox up must not produce a version mismatch error. [version_upgrade.rs · start_does_not_error_when_versions_match]
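The two version-matching entries above describe a single comparison with two outcomes. A minimal sketch follows, assuming a plain string comparison between the image label and the pinned version. The function name and the exact message text are illustrative; the catalogue only requires that the message contain mismatch and suggest aibox apply.

```rust
/// Hypothetical check: compares the version label read from the container
/// image with the version pinned in aibox.toml. The message wording here is
/// an assumption, not the CLI's actual output.
fn check_image_version(image_label: &str, pinned: &str) -> Result<(), String> {
    if image_label == pinned {
        Ok(())
    } else {
        Err(format!(
            "version mismatch: container image is {}, aibox.toml pins {}; run `aibox apply` to rebuild",
            image_label, pinned
        ))
    }
}
```

In this sketch, aibox up would turn an `Err` into a non-zero exit, while aibox doctor would print the same text as a warning and still exit 0.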

Update -y exits zero without hanging If aibox self update -y is run (the global --yes flag), then the command must exit 0 regardless of registry availability — confirming the flag is correctly wired to cmd_update and does not block on an interactive prompt. [version_upgrade.rs · update_yes_flag_exits_zero]

Update --dry-run does not mention .aibox-version If aibox self update --dry-run is run, then the output must not contain the phrase Would update .aibox-version — that write was removed in BACK-060 because the image version is now tracked exclusively in aibox.toml. [version_upgrade.rs · update_dry_run_does_not_mention_aibox_version_file]

Doctor warns when running container has a stale image label If the running container reports aibox.version=0.0.1 (mock label) but aibox.toml pins the current version, then aibox doctor must emit a warning containing mismatch while still exiting 0. [version_upgrade.rs · doctor_warns_on_container_version_mismatch]

Doctor warns when .aibox-version is outdated If .aibox-version is overwritten with 0.0.1 (an old CLI version) and aibox doctor is run, then the output must contain CLI version mismatch and suggest running aibox apply to update generated files. [version_upgrade.rs · doctor_warns_on_cli_version_file_mismatch]


Migration — migration.rs

Apply absorbs legacy .aibox-version into aibox.lock If .aibox-version is overwritten with 0.1.0 (an old version) and aibox apply is run, then .aibox-version must be removed and aibox.lock must contain the current [aibox].cli_version sync state. [migration.rs · apply_absorbs_legacy_version_file_into_lock]


Update command — update.rs

Update exits zero when registry returns an error If aibox self update is run in a project where the GHCR registry is unreachable or returns a non-2xx response, then the command must still exit 0 — the error must be treated as a warning, not a hard failure. [update.rs · update_runs_without_crashing_in_derived_project]

Update --check exits zero If aibox self update --check is run in an initialized project, then the command must exit 0 and print output containing either Current CLI version: or Checking for updates, regardless of whether the registry is reachable. [update.rs · update_check_exits_cleanly]


Appearance — appearance.rs

All themes render without error and without leftover placeholders If aibox init is run for each of the seven supported themes (gruvbox-dark, catppuccin-mocha, catppuccin-latte, dracula, tokyo-night, nord, projectious), then the seeded config files must contain no unreplaced template placeholders such as AIBOX_THEME or AIBOX_VIM_COLORSCHEME. [appearance.rs · all_themes_render_without_error]
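The leftover-placeholder check amounts to a substring scan over each seeded config file. The sketch below uses only the two placeholder names quoted above; the real template set may contain more, and the helper name is hypothetical.

```rust
/// Placeholder names quoted in the entry above; the real list may be longer.
const PLACEHOLDERS: [&str; 2] = ["AIBOX_THEME", "AIBOX_VIM_COLORSCHEME"];

/// Hypothetical helper: returns the placeholders that still occur verbatim
/// in `content`, in the order they are listed.
fn leftover_placeholders<'a>(content: &str, placeholders: &[&'a str]) -> Vec<&'a str> {
    placeholders
        .iter()
        .copied()
        .filter(|p| content.contains(*p))
        .collect()
}
```

Returning the offending names rather than a boolean lets the assertion message say which placeholder survived in which theme.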

Gruvbox theme sets the correct vim colorscheme and tmux theme If aibox init --theme gruvbox-dark is run, then vimrc must contain gruvbox or retrobox as the colorscheme and tmux.conf must reference gruvbox-dark. [appearance.rs · theme_gruvbox_renders_correctly]

Catppuccin-mocha theme is reflected in tmux config If aibox init --theme catppuccin-mocha is run, then tmux.conf must reference catppuccin-mocha. [appearance.rs · theme_catppuccin_mocha_renders]

Changing the theme updates all themed tool configs If a project is initialized with gruvbox-dark and the theme is changed to dracula via aibox apply, then tmux.conf must contain dracula and no longer gruvbox-dark, and vimrc, yazi/theme.toml, and starship.toml must be updated. Lazygit config is optional and is checked only when the git-ui addon enables it. [appearance.rs · theme_change_auto_applies_untouched_runtime_files]

Each theme produces matching configs across all tools If aibox init is run for each of five themes with known vim colorscheme names, then vimrc must contain the exact colorscheme <name> line, tmux.conf must reference the theme name, yazi and starship configs must be non-empty, and lazygit config must be non-empty when present. [appearance.rs · theme_alignment_all_tools_match_selected_theme]

Yazi keymap includes the open-in-editor binding If aibox init is run, then yazi/keymap.toml must contain an "e" key binding that invokes open-in-editor. [appearance.rs · yazi_keymap_includes_edit_in_pane_binding]

All prompt presets produce a non-empty starship config If aibox init is run for each prompt preset (default, plain, minimal, nerd-font, pastel, powerline-pastel, bracketed, arrow), then starship.toml must exist and be non-empty. [appearance.rs · all_prompts_render_without_error]

Default prompt includes directory and git_branch modules If aibox init --prompt default is run, then starship.toml must contain both directory and git_branch module sections. [appearance.rs · prompt_default_generates_starship]

Plain prompt uses ASCII-only symbols If aibox init --prompt plain is run, then starship.toml must not contain Nerd Font glyph characters (e.g. no \ue0b0 powerline arrow). [appearance.rs · prompt_plain_no_nerd_font]
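One way to detect Nerd Font glyphs such as \ue0b0 is to look for characters in the Unicode Basic Multilingual Plane Private Use Area (U+E000 to U+F8FF). This is a sketch under that assumption. The actual test may check a narrower or different set of code points, and the helper name is hypothetical.

```rust
/// Hypothetical helper: returns the first character in the BMP Private Use
/// Area (U+E000..=U+F8FF), which is where glyphs like U+E0B0 live.
fn first_private_use_char(text: &str) -> Option<char> {
    text.chars()
        .find(|c| ('\u{E000}'..='\u{F8FF}').contains(c))
}
```

A plain-prompt assertion would then require `first_private_use_char(&starship_toml)` to be `None`.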


Config coverage — config_coverage.rs

Container name appears in docker-compose.yml If aibox.toml specifies a container name and aibox apply is run, then docker-compose.yml must contain that name. [config_coverage.rs · container_name_in_compose]

Container hostname appears in docker-compose.yml If aibox.toml specifies a hostname and aibox apply is run, then docker-compose.yml must contain that hostname. [config_coverage.rs · container_hostname_in_compose]

Port mappings appear in docker-compose.yml If aibox.toml defines ports (e.g. "8080:80") and aibox apply is run, then docker-compose.yml must contain those port entries. [config_coverage.rs · container_ports_in_compose]

Extra packages appear in the generated Dockerfile If aibox.toml lists extra packages and aibox apply is run, then .devcontainer/Dockerfile must contain those package names in an apt install block. [config_coverage.rs · container_extra_packages_in_dockerfile]

Environment variables appear in docker-compose.yml If aibox.toml defines environment variables and aibox apply is run, then docker-compose.yml must contain those key-value pairs. [config_coverage.rs · container_environment_in_compose]

Extra volumes appear in docker-compose.yml If aibox.toml defines extra volume mounts and aibox apply is run, then docker-compose.yml must contain those source and target paths. [config_coverage.rs · container_extra_volumes_in_compose]

Claude AI provider adds volume mount If aibox.toml lists claude as an AI provider and aibox apply is run, then docker-compose.yml must contain a volume mount for the .claude config directory. [config_coverage.rs · ai_claude_provider_volume_mount]

Aider AI provider adds volume mount If aibox.toml lists aider as an AI provider and aibox apply is run, then docker-compose.yml must contain a volume mount for the .aider config directory. [config_coverage.rs · ai_aider_provider_volume_mount]

Multiple AI providers each add their own volume mounts If aibox.toml lists both claude and gemini as providers and aibox apply is run, then docker-compose.yml must contain volume mounts for both .claude and .gemini. [config_coverage.rs · ai_multiple_providers_volume_mounts]

Audio enabled adds PulseAudio mounts and socket If aibox.toml enables audio and aibox apply is run, then docker-compose.yml must contain audio-related volume mounts or socket references. [config_coverage.rs · audio_enabled_adds_mounts]

Audio disabled produces no audio mounts If aibox.toml has audio disabled (the default) and aibox apply is run, then docker-compose.yml must not contain audio-related content. [config_coverage.rs · audio_disabled_no_mounts]

Python addon adds install commands to Dockerfile If aibox.toml includes the python addon and aibox apply is run, then .devcontainer/Dockerfile must contain Python install instructions. [config_coverage.rs · addon_python_in_dockerfile]

Rust addon adds rustup install to Dockerfile If aibox.toml includes the rust addon and aibox apply is run, then .devcontainer/Dockerfile must contain rustup installation instructions. [config_coverage.rs · addon_rust_in_dockerfile]

Multiple addons each contribute to the Dockerfile If aibox.toml includes both the python and rust addons and aibox apply is run, then .devcontainer/Dockerfile must contain install content for both. [config_coverage.rs · addon_multiple_in_dockerfile]

Minimal package creates slim project skeleton If aibox init --context minimal is run, then aibox.toml, aibox.lock, an empty context/ directory, and a thin CLAUDE.md pointer must exist. The single-file context tracks (BACKLOG.md, DECISIONS.md, STANDUPS.md) are not created at init time — the corresponding processkit skills create them in place on first use.

Managed package is the recommended default If aibox init --harness claude is run, then the slim project skeleton must exist. With a real [processkit].version pinned, aibox apply then installs the full processkit skill catalogue under context/skills/ and the immutable upstream snapshot under context/templates/processkit/<version>/.

Product / research / software packages The five processkit packages (minimal, managed, software, research, product) are declarative metadata in [processkit.context].packages. In v0.16.0 they do not change which files land on disk — every project gets every processkit skill — but they tell agents which subset to prefer.


File preview — preview.rs

svg.yazi plugin is seeded into .aibox-home after init If aibox init is run, then .aibox-home/.config/yazi/plugins/svg.yazi/init.lua must exist. [preview.rs · svg_yazi_plugin_seeded]

eps.yazi plugin is seeded into .aibox-home after init If aibox init is run, then .aibox-home/.config/yazi/plugins/eps.yazi/init.lua must exist. [preview.rs · eps_yazi_plugin_seeded]

svg.yazi plugin invokes resvg for conversion If svg.yazi/init.lua is read after init, then its content must reference resvg as the SVG-to-PNG conversion tool. [preview.rs · svg_yazi_plugin_uses_resvg]

eps.yazi plugin invokes ghostscript for conversion If eps.yazi/init.lua is read after init, then its content must reference gs (ghostscript) as the EPS-to-PNG conversion tool. [preview.rs · eps_yazi_plugin_uses_ghostscript]

yazi.toml has a [plugin] section with prepend_previewers If aibox init is run, then yazi.toml must contain a [plugin] section that defines prepend_previewers. [preview.rs · yazi_toml_has_plugin_section]

yazi.toml routes *.svg to the svg previewer If aibox init is run, then yazi.toml must contain a prepend_previewers entry matching *.svg with run = "svg". [preview.rs · yazi_toml_svg_previewer_entry]

yazi.toml routes *.eps to the eps previewer If aibox init is run, then yazi.toml must contain a prepend_previewers entry matching *.eps with run = "eps". [preview.rs · yazi_toml_eps_previewer_entry]

SVG and EPS entries appear before built-in image entries If aibox init is run, then the *.svg and *.eps entries in prepend_previewers must appear at a lower byte offset than the *.jpg entry, ensuring first-match semantics dispatch SVG/EPS to the custom plugins rather than the built-in image previewer. [preview.rs · yazi_toml_svg_and_eps_precede_builtin_previewers]
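The byte-offset ordering rule can be sketched with `str::find`, which returns the byte index of the first match. The helper name is hypothetical, and the sketch treats a missing entry as a failure, which is an assumption about how the real test behaves.

```rust
/// Hypothetical helper: true when both the *.svg and *.eps patterns first
/// appear at a lower byte offset than the *.jpg pattern.
fn custom_previewers_come_first(yazi_toml: &str) -> bool {
    match (
        yazi_toml.find("*.svg"),
        yazi_toml.find("*.eps"),
        yazi_toml.find("*.jpg"),
    ) {
        (Some(svg), Some(eps), Some(jpg)) => svg < jpg && eps < jpg,
        // Assumption: any missing entry counts as a failed check.
        _ => false,
    }
}
```

Comparing raw offsets avoids parsing TOML. It relies on the previewer entries being written in dispatch order, which is what first-match semantics require anyway.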

sample.svg fixture is valid XML If tests/e2e/fixtures/sample.svg is read, then its content must start with <svg or <?xml, confirming the fixture file is intact. [preview.rs · fixture_sample_svg_is_valid_xml]

sample.eps fixture has a valid EPS header If tests/e2e/fixtures/sample.eps is read, then its content must start with %!PS-Adobe or contain %%BoundingBox, confirming the fixture file is intact. [preview.rs · fixture_sample_eps_has_eps_header]
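The two fixture checks above reduce to simple prefix and substring tests. A minimal sketch, with hypothetical function names, that follows the conditions exactly as stated:

```rust
/// Hypothetical helper: the SVG fixture must start with "<svg" or "<?xml".
fn looks_like_svg(content: &str) -> bool {
    content.starts_with("<svg") || content.starts_with("<?xml")
}

/// Hypothetical helper: the EPS fixture must start with "%!PS-Adobe" or
/// contain a "%%BoundingBox" comment anywhere in the file.
fn looks_like_eps(content: &str) -> bool {
    content.starts_with("%!PS-Adobe") || content.contains("%%BoundingBox")
}
```

These are integrity checks, not validators: they catch a truncated or replaced fixture but do not confirm that the file is well-formed XML or PostScript.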


Generated runtime — runtime_generated.rs

Generated runtime tools are usable If a fresh project is initialized with git-ui and shell status enabled, then aibox apply --no-container --standardize-config must generate Yazi config that parses with the pinned Yazi binary, lazygit state directories that permit startup, and an aibox-status --plugin-json payload with required fields. [runtime_generated.rs · generated_runtime_yazi_lazygit_and_status_are_usable]

Generated tmux status renders If the generated dev layout is launched under asciinema with tmux status enabled, then the cast must show key/status row text and runtime status output. [runtime_generated.rs · generated_runtime_tmux_status_renders_key_and_status_rows]


Visual matrix — visual_matrix.rs

These tests are ignored by default and run only through the explicit visual E2E commands above.

Generated layouts render across all themes If each generated layout is launched for each supported theme, then the recording must include the theme RGB signature and tmux status/key row text. [visual_matrix.rs · visual_generated_layouts_render_across_all_themes]

Generated tools and harness windows render when enabled If all harnesses and visual runtime addons are enabled, then window traversal must show the expected Yazi surface, Vim, shell, lazygit, and every harness marker. [visual_matrix.rs · visual_generated_tools_and_harness_windows_render_when_enabled]

Yazi previews, git symbols, and optional plugins render If the Yazi preview addons are enabled, then generated Yazi config must parse, preview plugins must be installed, git symbols must be configured, and directory, Markdown, CSV, TSV, and SQLite previews must render their markers. [visual_matrix.rs · visual_yazi_previews_git_symbols_and_optional_plugins_render]


Smoke tests — smoke.rs

These tests validate that the container runtime inside the Tier 2 companion is functional end to end (Tier 2 only).

Container runtime is available on the companion If the companion container is queried for its selected runtime, then the command must succeed and the output must contain either docker or podman. [smoke.rs · runtime_available_on_companion]

Container runtime can pull and run a container If the selected runtime runs alpine echo hello-e2e on the companion, then the image pull, container creation, and command execution must succeed, and the output must contain hello-e2e. [smoke.rs · runtime_can_pull_and_run_container]