context-eval

Release Checklist

Use this checklist before tagging or publishing a context-eval release.

Local Verification

Run the same quality gates expected in CI:

python -m pytest
context-eval validate-config --config examples/basic/context-eval.yaml
powershell -ExecutionPolicy Bypass -File scripts\validate-skills.ps1 -SkipExternal
python scripts/check-release-state.py
python -m build --outdir C:\tmp\context-eval-dist
python scripts/inspect-package-artifacts.py C:\tmp\context-eval-dist
python scripts/install-smoke-artifacts.py --dist-dir C:\tmp\context-eval-dist
python scripts/build-windows-portable.py --dist-dir C:\tmp\context-eval-dist --frontend-dist frontend\dist --output-dir C:\tmp\context-eval-dist
git diff --check

Run ruff check . when the dev dependencies are installed.

Automated Preparation Command

The release preparation entrypoint is:

python scripts/prepare-release.py --dist-dir C:\tmp\context-eval-dist

This command checks CHANGELOG.md, runs the release-state check before any package build, builds the wheel and sdist artifacts, inspects them, and runs the release candidate install smoke. It is a preparation gate only: it does not create Git tags, and it does not upload or publish packages.

The manual publish checkpoint remains after this command succeeds: confirm the reviewed commit, confirm CI, create the Git tag intentionally, and publish the already inspected artifacts with the selected package index tooling.
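The ordering described above (each gate runs only if the previous one passed, and nothing is tagged or published) can be sketched as a small fail-fast runner. This is an illustrative sketch only, not the contents of scripts/prepare-release.py; the function name run_gates and the gate names are hypothetical.

```python
import subprocess
from typing import Sequence


def run_gates(gates: Sequence[tuple[str, Sequence[str]]]) -> list[str]:
    """Run each gate command in order and stop at the first failure.

    Each gate is a (name, argv) pair. Returns the names of the gates that
    passed; raises RuntimeError naming the first gate that exits non-zero.
    """
    passed: list[str] = []
    for name, command in gates:
        # check=False so the failure can be reported with the gate name.
        result = subprocess.run(list(command), check=False)
        if result.returncode != 0:
            raise RuntimeError(
                f"gate {name!r} failed with exit code {result.returncode}"
            )
        passed.append(name)
    return passed
```

A caller could pass pairs such as ("release-state", ["python", "scripts/check-release-state.py"]) followed by the build, inspect, and install smoke commands listed under Local Verification. Tagging and publishing stay outside the runner, which matches the manual checkpoint.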

Release Candidate Install Smoke

Run the install smoke after package artifact inspection:

python scripts/install-smoke-artifacts.py --dist-dir C:\tmp\context-eval-dist

The smoke installs the built wheel into a temporary Python environment, then runs the installed context-eval console script against a local fixture repository, a fake local agent, temporary local config files, and local run artifacts. It runs validate-config, run, report, CSV/JSON export, and ui, then verifies the generated artifacts are parseable and self-contained. It also runs the installed context-eval-app launcher startup preflight:

context-eval-app --workspace <temp> --config <temp>/context-eval.yaml --no-browser --port 0 --check-startup

That preflight verifies the installed launcher entry point, workspace/config resolution, loopback startup settings, and local launcher log path without opening a browser or blocking in the server loop.

The smoke does not call hosted services, does not install or run a real external coding agent, does not create Git tags, and does not upload or publish packages. It is still a release candidate gate only; the publish boundary remains a manual checkpoint.
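As a rough illustration of what "parseable" can mean for the CSV/JSON exports, the sketch below loads a file with the standard library and returns a record count. It is not the check used by scripts/install-smoke-artifacts.py; the function name count_records and the assumption that exports use .json and .csv suffixes are hypothetical.

```python
import csv
import json
from pathlib import Path


def count_records(path: Path) -> int:
    """Parse a JSON or CSV export and return its number of top-level records.

    Raises json.JSONDecodeError for malformed JSON and ValueError for an
    unsupported file suffix.
    """
    suffix = path.suffix.lower()
    if suffix == ".json":
        data = json.loads(path.read_text(encoding="utf-8"))
        # A list or object counts its items; a bare scalar counts as one record.
        return len(data) if isinstance(data, (list, dict)) else 1
    if suffix == ".csv":
        with path.open(newline="", encoding="utf-8") as handle:
            # DictReader treats the first row as the header, so it is not counted.
            return len(list(csv.DictReader(handle)))
    raise ValueError(f"unsupported export type: {path.name}")
```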

Windows Portable Package

Build the Windows portable local app archive after package artifact inspection, install smoke, and frontend validation:

python scripts/build-windows-portable.py --dist-dir C:\tmp\context-eval-dist --frontend-dist frontend\dist --output-dir C:\tmp\context-eval-dist

The output is context-eval-windows-x64-<version>.zip. It contains the built context-eval wheel, dependency wheelhouse, frontend/dist, Start Context Eval.cmd, scripts/start-context-eval.ps1, a package-local workspace, and a README for users.
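A quick way to sanity-check the archive contents is to list its entries and look for the files named above. The sketch below is illustrative: it assumes the entries sit at the archive root with forward-slash paths, which the builder may not do, and the name missing_entries is hypothetical. The wheel, wheelhouse, workspace, and README are left out because their exact names are not stated here.

```python
import zipfile
from pathlib import Path
from typing import Sequence

# Entries taken from the description above; a trailing "/" marks a directory
# that must contain at least one file. Root-level placement is an assumption.
REQUIRED_ENTRIES = (
    "Start Context Eval.cmd",
    "scripts/start-context-eval.ps1",
    "frontend/dist/",
)


def missing_entries(
    zip_path: Path, required: Sequence[str] = REQUIRED_ENTRIES
) -> list[str]:
    """Return the required entries that are absent from the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
    missing: list[str] = []
    for entry in required:
        if entry.endswith("/"):
            # Directory: satisfied when any file lives beneath it.
            found = any(
                name.startswith(entry) and name != entry for name in names
            )
        else:
            found = entry in names
        if not found:
            missing.append(entry)
    return missing
```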

Acceptance for the zip is intentionally simple: unzip it on Windows with Python 3.11 or newer installed, then double-click Start Context Eval.cmd. The script creates or reuses a private .venv, installs only from the bundled wheelhouse, passes --frontend-dist frontend\dist to the local app launcher, starts the loopback local app, and opens the browser. The default builder downloads Windows dependency wheels for Python 3.11, 3.12, and 3.13; use repeated --python-version flags only when intentionally narrowing a candidate package.
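"Installs only from the bundled wheelhouse" corresponds to pip's offline mode: --no-index disables the package index and --find-links points pip at a local directory of wheels. The real launcher is the PowerShell script in the archive; the Python sketch below only illustrates the shape of such a command. The function name and the example paths in the usage note are hypothetical.

```python
from pathlib import Path


def offline_install_command(
    venv_python: Path, wheelhouse: Path, package: str = "context-eval"
) -> list[str]:
    """Build a pip command that resolves packages only from a local wheelhouse."""
    return [
        str(venv_python),
        "-m",
        "pip",
        "install",
        "--no-index",  # never contact a package index
        "--find-links",
        str(wheelhouse),  # resolve wheels from this directory only
        package,
    ]
```

For example, offline_install_command(Path(".venv/Scripts/python.exe"), Path("wheelhouse")) yields an argument list that could be passed to subprocess.run. The wheelhouse directory name is an assumption.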

The portable package does not install coding agents, does not install target repository dependencies, does not call hosted context-eval services, does not create Git tags, and does not publish packages. The release flow still stops at the manual tag and publish checkpoint.

Supported Runtime And Platforms

The package supports Python 3.11 or newer through requires-python = ">=3.11". CI gates pull requests on Python 3.11 and Python 3.12, and runs the runtime test matrix on Ubuntu and Windows. macOS is not a release-blocking CI platform yet.

Vendored skill validation is release-blocking on Windows because it depends on the PowerShell validation script. Other local development hosts may work when Python and shell prerequisites are available, but they are not part of the current release gate.

Packaging Scope

Inspect package configuration before release.

Release Steps

  1. Update CHANGELOG.md.
  2. Confirm the working tree is clean.
  3. Run the local verification commands above.
  4. Confirm CI passes on the release branch or pull request.
  5. Tag the release from the reviewed commit.
  6. Build, inspect, and install-smoke artifacts before publishing.
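Step 2 can be checked mechanically with git status --porcelain, which prints one line per changed or untracked path and nothing when the tree is clean. The helper names below are hypothetical, and this sketch is not part of the repository's scripts.

```python
import subprocess


def is_clean(porcelain_output: str) -> bool:
    """Return True when `git status --porcelain` output lists no paths."""
    return not any(line.strip() for line in porcelain_output.splitlines())


def working_tree_is_clean(repo_dir: str = ".") -> bool:
    """Run git in repo_dir and report whether the working tree is clean."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # a git failure (e.g. not a repository) raises instead of passing
    )
    return is_clean(result.stdout)
```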