Use this checklist before tagging or publishing a context-eval release.
Run the same quality gates expected in CI:
python -m pytest
context-eval validate-config --config examples/basic/context-eval.yaml
powershell -ExecutionPolicy Bypass -File scripts\validate-skills.ps1 -SkipExternal
python scripts/check-release-state.py
python -m build --outdir C:\tmp\context-eval-dist
python scripts/inspect-package-artifacts.py C:\tmp\context-eval-dist
python scripts/install-smoke-artifacts.py --dist-dir C:\tmp\context-eval-dist
python scripts/build-windows-portable.py --dist-dir C:\tmp\context-eval-dist --frontend-dist frontend\dist --output-dir C:\tmp\context-eval-dist
git diff --check
Run ruff check . when the dev dependencies are installed.
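The gates above can be chained so that the first failure stops the run. The following is a minimal sketch, not a script shipped in the repository: run_gates and its injectable runner parameter are hypothetical names, and GATES lists only a platform-neutral subset of the commands above.

```python
import subprocess
import sys

# A subset of the commands from the checklist above, in the same order.
GATES = [
    [sys.executable, "-m", "pytest"],
    ["context-eval", "validate-config", "--config", "examples/basic/context-eval.yaml"],
    [sys.executable, "scripts/check-release-state.py"],
    ["git", "diff", "--check"],
]


def run_gates(gates, runner=subprocess.run):
    """Run each gate in order; return the first failing command, or None."""
    for command in gates:
        result = runner(command)
        if result.returncode != 0:
            # Stop here so later gates never run against a failing tree.
            return command
    return None
```

Calling run_gates(GATES) from the repository root returns None when every gate exits with status 0. The runner parameter exists only so the ordering logic can be exercised without launching real processes.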
The release preparation entrypoint is:
python scripts/prepare-release.py --dist-dir C:\tmp\context-eval-dist
This command checks CHANGELOG.md, runs the release-state check before package builds, builds wheel and sdist artifacts, inspects artifacts before publish, and runs the release candidate install smoke. It is a preparation gate only. It does not create Git tags, and it does not upload or publish packages.
The manual publish checkpoint remains after this command succeeds: confirm the reviewed commit, confirm CI, create the Git tag intentionally, and publish the already inspected artifacts with the selected package index tooling.
Run the install smoke after package artifact inspection:
python scripts/install-smoke-artifacts.py --dist-dir C:\tmp\context-eval-dist
The smoke installs the built wheel into a temporary Python environment, then
runs the installed context-eval console script against a local fixture repository,
a fake local agent, temporary local config files, and local run
artifacts. It runs validate-config, run, report, CSV/JSON export, and
ui, then verifies the generated artifacts are parseable and self-contained.
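The parseability part of that verification can be pictured as follows. This is a sketch under assumptions, not the smoke script's actual code: is_parseable is a hypothetical helper, it handles only .json and .csv files, and it does not attempt the self-contained check.

```python
import csv
import json
from pathlib import Path


def is_parseable(path):
    """Return True if a .json or .csv export can be read back without errors."""
    path = Path(path)
    try:
        with path.open(newline="", encoding="utf-8") as handle:
            if path.suffix == ".json":
                json.load(handle)
            elif path.suffix == ".csv":
                # Require at least one row so an empty file does not pass.
                if next(csv.reader(handle), None) is None:
                    return False
            else:
                return False
    except (OSError, ValueError, csv.Error):
        # Missing files, bad encodings, and malformed JSON all land here.
        return False
    return True
```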
It also runs the installed context-eval-app launcher startup preflight:
context-eval-app --workspace <temp> --config <temp>/context-eval.yaml --no-browser --port 0 --check-startup
That preflight verifies the installed launcher entry point, workspace/config resolution, loopback startup settings, and local launcher log path without opening a browser or blocking in the server loop.
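The loopback part of such a preflight can be illustrated with a plain socket. This sketch assumes that --port 0 asks the operating system for a free port, which is the usual socket convention but is not stated here; bind_loopback and check_startup are hypothetical names, not the launcher's functions.

```python
import socket


def bind_loopback(port=0):
    """Bind a TCP socket to 127.0.0.1; port 0 lets the OS choose a free port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.bind(("127.0.0.1", port))
    return sock


def check_startup(port=0):
    """Bind, read back the address, and close without entering a serve loop."""
    sock = bind_loopback(port)
    try:
        host, bound_port = sock.getsockname()
    finally:
        sock.close()
    return host, bound_port
```

Because the socket is closed before returning, the check confirms that a loopback address can be bound without leaving a server running.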
The smoke does not call hosted services, does not install or run a real external coding agent, does not create Git tags, and does not upload or publish packages. It is still a release candidate gate only; the publish boundary remains a manual checkpoint.
Build the Windows portable local app archive after package artifact inspection, install smoke, and frontend validation:
python scripts/build-windows-portable.py --dist-dir C:\tmp\context-eval-dist --frontend-dist frontend\dist --output-dir C:\tmp\context-eval-dist
The output is context-eval-windows-x64-<version>.zip. It contains the built
context-eval wheel, dependency wheelhouse, frontend/dist, Start Context
Eval.cmd, scripts/start-context-eval.ps1, a package-local workspace, and a
README for users.
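A quick way to confirm that an archive carries the pieces named above is to compare its member list against the expected entries. The sketch below is illustrative only: missing_entries is a hypothetical helper, the entry names are copied from the description, and the assumption that they sit at the archive root is not confirmed here.

```python
import zipfile

# Entry names taken from the description above. Their position at the
# archive root is an assumption.
REQUIRED_ENTRIES = [
    "Start Context Eval.cmd",
    "scripts/start-context-eval.ps1",
    "frontend/dist/",
]


def missing_entries(archive, required=REQUIRED_ENTRIES):
    """Return required entries that no archive member equals or sits under."""
    with zipfile.ZipFile(archive) as bundle:
        names = bundle.namelist()
    missing = []
    for entry in required:
        if entry.endswith("/"):
            # Directory entry: any member under that prefix counts.
            found = any(name.startswith(entry) for name in names)
        else:
            found = entry in names
        if not found:
            missing.append(entry)
    return missing
```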
Acceptance for the zip is intentionally simple: unzip it on Windows with Python
3.11 or newer installed, then double-click Start Context Eval.cmd. The script
creates or reuses a private .venv, installs only from the bundled wheelhouse,
passes --frontend-dist frontend\dist to the local app launcher, starts the
loopback local app, and opens the browser. The default builder downloads
Windows dependency wheels for Python 3.11, 3.12, and 3.13; use repeated
--python-version flags only when intentionally narrowing a candidate package.
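The startup sequence described above can be sketched as the list of commands a launcher might run. This is not the content of scripts/start-context-eval.ps1: launcher_commands is a hypothetical helper, the wheelhouse folder name and the context-eval distribution name are assumptions, and only the pip flags --no-index and --find-links are standard pip options.

```python
from pathlib import PureWindowsPath


def launcher_commands(package_root, venv_exists, python="python"):
    """Return the commands a launcher like the one described might run."""
    root = PureWindowsPath(package_root)
    scripts = root / ".venv" / "Scripts"  # standard venv layout on Windows
    commands = []
    if not venv_exists:
        # Create the private environment only when it is not already there.
        commands.append([python, "-m", "venv", str(root / ".venv")])
    commands.append([
        str(scripts / "python.exe"), "-m", "pip", "install",
        "--no-index",  # never contact a package index
        "--find-links", str(root / "wheelhouse"),  # hypothetical folder name
        "context-eval",  # assumed distribution name
    ])
    commands.append([
        str(scripts / "context-eval-app.exe"),
        "--frontend-dist", str(root / "frontend" / "dist"),
    ])
    return commands
```

Combining --no-index with --find-links is what keeps the install offline: pip resolves every requirement from the given folder and fails instead of falling back to a remote index.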
The portable package does not install coding agents, does not install target repository dependencies, does not call hosted context-eval services, does not create Git tags, and does not publish packages. The release flow still stops at the manual tag and publish checkpoint.
The package supports Python 3.11 or newer through requires-python = ">=3.11".
CI gates Python 3.11 and Python 3.12 on pull requests, and the runtime test
matrix is gated on Ubuntu and Windows. macOS is not a release-blocking CI platform yet.
Vendored skill validation is release-blocking on Windows because it depends on the PowerShell validation script. Other local development hosts may work when Python and shell prerequisites are available, but they are not part of the current release gate.
Inspect package configuration before release:
Run python scripts/check-release-state.py before building; it checks hidden local release blockers that git status --short does not show:
  .context-eval/
  build/
  dist/
  *.egg-info/
  .codex/config.toml
  .venv/
Run python scripts/inspect-package-artifacts.py C:\tmp\context-eval-dist; it inspects both the wheel and sdist artifacts. Paths it checks in each artifact:
  context_eval/
  context_eval/reports/templates/
  .context-eval/
  .agents/
  .codex/skills/
  openspec/
  scripts/
project.license as an SPDX string, currently license = "MIT", not the table form license = { text = "MIT" }.
.codex/config.toml must not be committed; use .codex/config.example.toml only.
CHANGELOG.md.
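To see why git status --short can miss these paths, it helps to look for them on disk directly. The sketch below is not scripts/check-release-state.py: find_hidden_paths is a hypothetical helper that only reports which of the paths named above exist under a repository root, and whether their presence should fail a release is left to the real script.

```python
from pathlib import Path

# Patterns taken from the text above, relative to the repository root.
HIDDEN_PATTERNS = [
    ".context-eval",
    "build",
    "dist",
    "*.egg-info",
    ".codex/config.toml",
    ".venv",
]


def find_hidden_paths(repo_root, patterns=HIDDEN_PATTERNS):
    """Return sorted relative paths under repo_root that match any pattern."""
    root = Path(repo_root)
    found = set()
    for pattern in patterns:
        for match in root.glob(pattern):
            found.add(match.relative_to(root).as_posix())
    return sorted(found)
```

An empty result means none of the listed paths exist in the working tree. Any entry returned is something ignore rules would normally keep out of git status output.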