Spectral Docs — AI Control Plane

Prerequisites

Python 3.10 or higher
PostgreSQL 17
Anthropic API key (for Claude-powered evaluation)

Installation

Clone the repository

```bash
git clone https://github.com/your-org/spectral.git
cd spectral
```

Install Python dependencies

```bash
pip install fastapi uvicorn psycopg2-binary anthropic
```

Start PostgreSQL 17

```bash
sudo pg_ctlcluster 17 main start
```

Create the Spectral database

```bash
createdb spectral
```

Set your Anthropic API key

```bash
export ANTHROPIC_API_KEY=sk-ant-...
```

Start the FastAPI server (runs on port 8000)

```bash
cd spectral && python api_server.py
```

Open the dashboard in your browser

```
http://localhost:8000/index.html
```

Your First Workflow

Follow these API calls in order to set up your first AI optimization workflow:

Create a workflow

POST /api/workflows

curl -X POST http://localhost:8000/api/workflows \
  -H "Content-Type: application/json" \
  -d '{"name": "Prior Authorization", "agent_name": "pa_agent",
       "description": "Automate prior auth decisions"}'

Upload traces — use JSON ingestion or CSV upload

POST /api/ingest-traces

curl -X POST http://localhost:8000/api/ingest-traces \
  -H "Content-Type: application/json" \
  -d '{"workflow_id": 1, "traces": [
    {"input": "Patient needs MRI...", "output": "Approved", "latency_ms": 340}
  ]}'

Create evaluation rubrics for your agents

POST /api/rubrics

curl -X POST http://localhost:8000/api/rubrics \
  -H "Content-Type: application/json" \
  -d '{"agent_name": "pa_agent", "name": "PA Rubric v1",
       "dimensions": [{"name": "accuracy", "weight": 0.5},
                      {"name": "completeness", "weight": 0.3},
                      {"name": "safety", "weight": 0.2}]}'

Run your first Spectral Scan — the autonomous optimization loop

POST /api/spectral-scan

curl -X POST http://localhost:8000/api/spectral-scan \
  -H "Content-Type: application/json" \
  -d '{"workflow_id": 1, "sample_size": 50}'

Poll for results using the returned scan_id
GET /api/scans/{scan_id}
```
curl http://localhost:8000/api/scans/1
```
The scan returns a verdict field: GO (safe to promote) or NO-GO (regression detected).
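
The polling step can be scripted. The following is a minimal sketch, not Spectral's own client. It assumes the scan detail JSON carries the `verdict` field described above and that the field is missing or null while the scan is still running (the source does not say). `fetch_scan` and `wait_for_verdict` are hypothetical names.

```python
import json
import time
import urllib.request
from typing import Callable, Optional

BASE_URL = "http://localhost:8000"  # server address from the Installation steps


def fetch_scan(scan_id: int) -> dict:
    """GET /api/scans/{scan_id} and decode the JSON body."""
    with urllib.request.urlopen(f"{BASE_URL}/api/scans/{scan_id}") as resp:
        return json.load(resp)


def wait_for_verdict(
    scan_id: int,
    fetch: Callable[[int], dict] = fetch_scan,
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll until the scan reports GO or NO-GO; return None if it never does.

    Assumption: "verdict" is absent or null until the scan finishes.
    """
    for _ in range(max_polls):
        verdict = fetch(scan_id).get("verdict")
        if verdict in ("GO", "NO-GO"):
            return verdict
        sleep(interval_s)
    return None
```

Against a running server, `wait_for_verdict(1)` would block until scan 1 finishes or the poll limit is reached. The `fetch` and `sleep` parameters exist so the loop can be exercised without a server.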

How a Scan Works

Every Spectral Scan runs a closed-loop, autonomous evaluation and optimization cycle:

Observe → Evaluate → Diagnose → Optimize → Tournament → Gate

Observe: Run agent on sampled test cases, collect traces
Evaluate: LLM-as-judge scores traces against rubrics
Diagnose: Cluster failures by root cause pattern
Optimize: Generate prompt/config mutations
Tournament: A/B test champion vs. challengers
Gate: GO/NO-GO decision via holdout validation

The scan engine runs entirely autonomously. After triggering POST /api/spectral-scan, Spectral will observe, evaluate, diagnose, generate candidate patches, run tournaments, and issue a GO/NO-GO verdict — no manual steps required.
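
To make the phase order concrete, here is a toy sketch of the loop as a chain of handlers that pass one state dictionary along. It is an illustration only and not the code in scan_engine.py. The function names, state keys, and stub handlers are all hypothetical.

```python
from typing import Callable

# Phase names in the order described above.
PHASES = ("observe", "evaluate", "diagnose", "optimize", "tournament", "gate")

Handler = Callable[[dict], dict]


def run_scan(handlers: dict[str, Handler], state: dict) -> dict:
    """Run every phase in order, threading one state dict through all of them."""
    missing = [name for name in PHASES if name not in handlers]
    if missing:
        raise KeyError(f"no handler for phases: {missing}")
    for name in PHASES:
        # Shallow copy so a handler cannot mutate the caller's top-level dict.
        state = handlers[name](dict(state))
        state["completed"] = [*state.get("completed", []), name]
    return state


def make_stub(note: str) -> Handler:
    """Stub phase that only records a note; a real phase would do the work."""
    return lambda state: {**state, "notes": [*state.get("notes", []), note]}


handlers = {name: make_stub(f"{name} done") for name in PHASES}
result = run_scan(handlers, {"workflow_id": 1, "sample_size": 50})
print(result["completed"])
```

Keeping each phase behind the same signature is one way to let a single phase be swapped or tested on its own.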

Key Concepts

Workflow

A named AI agent pipeline (e.g., "Prior Authorization"). The top-level container for cases, configs, and scans.

Case

A single test input with optional expected output. Cases belong to eval pools: working, holdout, adversarial, or business_critical.

Trace

A recorded agent execution: input, output, latency, and token usage. The raw signal that gets evaluated.

Config

A versioned set of agent hyperparameters (system prompt, model, temperature). Configs are mutated and tested.

Rubric

An evaluation specification with weighted dimensions (accuracy, safety, completeness). Used by LLM-as-judge.

Scan

One complete run of the Observe→Gate loop. Produces a champion config and a GO/NO-GO verdict.

Holdout

A reserved set of cases never seen during optimization. Used for unbiased final validation in the Gate step.

GO/NO-GO Gate

The binary promotion decision. GO means the challenger outperforms the champion on the holdout set without regression.
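
As an illustration of how weighted rubric dimensions and the gate could fit together, the sketch below combines per-dimension scores into one number and applies a simple GO/NO-GO rule. The weighted-average aggregation, the 0 to 1 score scale, the `tolerance` parameter, and both function names are assumptions. The source does not specify how Spectral aggregates scores or defines a regression.

```python
def weighted_score(scores: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-dimension scores (assumed aggregation)."""
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


def gate(champion: dict[str, float], challenger: dict[str, float],
         weights: dict[str, float], tolerance: float = 0.0) -> str:
    """Return "GO" if the challenger beats the champion overall on holdout
    scores and no single dimension drops by more than `tolerance`."""
    regressed = any(challenger[d] < champion[d] - tolerance for d in weights)
    improved = weighted_score(challenger, weights) > weighted_score(champion, weights)
    return "GO" if improved and not regressed else "NO-GO"


# Weights from the "PA Rubric v1" example; the scores are made-up holdout averages.
weights = {"accuracy": 0.5, "completeness": 0.3, "safety": 0.2}
champion = {"accuracy": 0.80, "completeness": 0.70, "safety": 0.90}
better = {"accuracy": 0.85, "completeness": 0.75, "safety": 0.90}
unsafe = {"accuracy": 0.95, "completeness": 0.80, "safety": 0.60}
print(gate(champion, better, weights))  # GO
print(gate(champion, unsafe, weights))  # NO-GO
```

The second challenger has the higher overall score but is still rejected, because its safety score falls below the champion's.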

109 total endpoints

Workflows

5 endpoints

GET /api/workflows List all workflows with case/config/scan counts.

Function: list_workflows

Example

curl http://localhost:8000/api/workflows

POST /api/workflows Create a new workflow.

Function: create_workflow

Example

curl -X POST http://localhost:8000/api/workflows \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/workflows/{workflow_id}/economics Get the business economics parameters for a workflow.

Function: get_workflow_economics

Get the business economics parameters for a workflow. Returns the workflow-specific cost model used in report generation. Falls back to system defaults if not configured.

Example

curl http://localhost:8000/api/workflows/{workflow_id}/economics

PUT /api/workflows/{workflow_id}/economics Set or update the business economics parameters for a workflow.

Function: update_workflow_economics

Set or update the business economics parameters for a workflow. Partial updates are supported: omit any field to keep its current value. All values fall back to system defaults if not previously configured.

Example

curl -X PUT http://localhost:8000/api/workflows/{workflow_id}/economics \
  -H "Content-Type: application/json" \
  -d '{...}'
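
The fallback behaviour described for the two economics endpoints can be pictured as a layered merge: system defaults first, then the stored workflow values, then whichever fields appear in the PUT body. This is a sketch of that idea only. The field names (`cost_per_error_usd`, `monthly_volume`) and the default values are invented for illustration and are not Spectral's actual cost model.

```python
# Hypothetical defaults; the real parameter names are not documented here.
SYSTEM_DEFAULTS = {"cost_per_error_usd": 100.0, "monthly_volume": 1000}


def apply_partial_update(stored: dict, patch: dict) -> dict:
    """Overlay only the fields present in the PUT body onto the stored values."""
    unknown = set(patch) - set(SYSTEM_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown economics fields: {sorted(unknown)}")
    return {**stored, **patch}


def effective_economics(stored: dict) -> dict:
    """Values a GET would return: stored values win, defaults fill the gaps."""
    return {**SYSTEM_DEFAULTS, **stored}


stored = apply_partial_update({}, {"monthly_volume": 5000})
print(effective_economics(stored))
```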

GET /api/workflows/{workflow_id}/sample-recommendation Recommend sample size for a workflow based on historical failure rates.

Function: workflow_sample_recommendation

Example

curl http://localhost:8000/api/workflows/{workflow_id}/sample-recommendation
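
One common way to turn a failure rate into a sample size is the normal-approximation formula for estimating a proportion, n = z² · p(1 − p) / e². The sketch below uses it purely as an illustration. The source does not say which method `workflow_sample_recommendation` uses, and the function name, the default margin, and the floor of 30 cases are assumptions.

```python
import math


def recommend_sample_size(failure_rate: float, margin: float = 0.05,
                          z: float = 1.96, minimum: int = 30) -> int:
    """Cases needed to estimate `failure_rate` within +/- `margin` (z=1.96 is ~95%)."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    n = (z ** 2) * failure_rate * (1.0 - failure_rate) / (margin ** 2)
    # Never go below the floor, even when the historical failure rate is near zero.
    return max(minimum, math.ceil(n))
```

With a 10% historical failure rate and the defaults, `recommend_sample_size(0.10)` returns 139.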

Ingestion

6 endpoints

POST /api/cases/bulk-pool Set eval_pool for multiple cases at once.

Function: bulk_pool_assignment

Example

curl -X POST http://localhost:8000/api/cases/bulk-pool \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/ingest-cases Import test cases from a CSV file.

Function: ingest_cases

Import test cases from a CSV file. Supports flexible column mapping:
Required: input_text (or input or patient_summary)
Optional: external_case_id, workflow_id, expected_output, ...

Example

curl -X POST http://localhost:8000/api/ingest-cases \
  -H "Content-Type: application/json" \
  -d '{...}'
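
The flexible column mapping can be sketched with the standard `csv` module. The alias priority (`input_text`, then `input`, then `patient_summary`) and the skipping of blank cells are assumptions, and `parse_cases` is a hypothetical helper rather than the server's `ingest_cases` function. Only the optional columns named above are handled.

```python
import csv
import io

INPUT_ALIASES = ("input_text", "input", "patient_summary")
OPTIONAL_COLUMNS = ("external_case_id", "workflow_id", "expected_output")


def parse_cases(csv_text: str) -> list[dict]:
    """Turn CSV text into case dicts keyed by the canonical column names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = reader.fieldnames or []
    # Pick the first alias that is present in the header row.
    input_col = next((c for c in INPUT_ALIASES if c in headers), None)
    if input_col is None:
        raise ValueError(f"CSV needs one of these columns: {INPUT_ALIASES}")
    cases = []
    for row in reader:
        case = {"input_text": row[input_col]}
        for col in OPTIONAL_COLUMNS:
            if row.get(col):  # skip missing columns and empty cells
                case[col] = row[col]
        cases.append(case)
    return cases


rows = parse_cases("patient_summary,expected_output\nPatient needs MRI,Approved\nKnee pain,\n")
print(rows)
```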

POST /api/ingest-csv Ingest traces from a CSV file upload (multipart) or raw CSV body.

Function: ingest_csv

Example

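# Raw CSV body variant. traces.csv is a placeholder file name and the Content-Type value is an assumption.
curl -X POST http://localhost:8000/api/ingest-csv \
  -H "Content-Type: text/csv" \
  --data-binary @traces.csv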

POST /api/ingest-jsonl Ingest traces from a JSONL body.

Function: ingest_jsonl

Example

curl -X POST http://localhost:8000/api/ingest-jsonl \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/ingest-traces Ingest traces from an external pipeline into Spectral.

Function: ingest_traces

Example

curl -X POST http://localhost:8000/api/ingest-traces \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/webhooks/traces Webhook endpoint for external systems to push trace data.

Function: webhook_traces

Example

curl -X POST http://localhost:8000/api/webhooks/traces \
  -H "Content-Type: application/json" \
  -d '{...}'

Cases & Configs

8 endpoints

GET /api/cases List all test cases. Optional ?workflow_id=X filter.

Function: list_cases

Example

curl http://localhost:8000/api/cases

GET /api/cases/pool-stats Return case counts per eval_pool.

Function: get_pool_stats

Return case counts per eval_pool. Optional ?workflow_id=X filter.

Example

curl http://localhost:8000/api/cases/pool-stats

POST /api/cases/{case_id}/pool Set the eval_pool for a case.

Function: set_case_pool

Set the eval_pool for a case. Valid values: working, holdout, adversarial, business_critical.

Example

curl -X POST http://localhost:8000/api/cases/{case_id}/pool \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/configs List all configurations. Optional ?workflow_id=X filter.

Function: list_configs

Example

curl http://localhost:8000/api/configs

POST /api/configs Create a new configuration.

Function: create_config

Example

curl -X POST http://localhost:8000/api/configs \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/configs/{config_id} Get a single configuration with full details.

Function: get_config

Example

curl http://localhost:8000/api/configs/{config_id}

GET /api/traces List traces with optional filters.

Function: list_traces

Example

curl http://localhost:8000/api/traces

GET /api/traces/{trace_id} Single trace with its eval_results.

Function: get_trace

Example

curl http://localhost:8000/api/traces/{trace_id}

Rubrics

9 endpoints

GET /api/rubric-library Return the built-in rubric template library.

Function: get_rubric_library

Example

curl http://localhost:8000/api/rubric-library

GET /api/rubrics List all rubrics, optionally filtered by agent.

Function: list_rubrics

Example

curl http://localhost:8000/api/rubrics

POST /api/rubrics Create a new rubric with dimensions.

Function: create_rubric

Example

curl -X POST http://localhost:8000/api/rubrics \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/rubrics/active Get all active rubrics (one per agent).

Function: get_active_rubrics

Example

curl http://localhost:8000/api/rubrics/active

GET /api/rubrics/agent/{agent_name} Get the active rubric for a specific agent.

Function: get_agent_rubric

Example

curl http://localhost:8000/api/rubrics/agent/{agent_name}

DELETE /api/rubrics/{rubric_id} Delete a rubric.

Function: delete_rubric

Example

curl -X DELETE http://localhost:8000/api/rubrics/{rubric_id}

GET /api/rubrics/{rubric_id} Get a single rubric with dimensions.

Function: get_rubric

Example

curl http://localhost:8000/api/rubrics/{rubric_id}

PUT /api/rubrics/{rubric_id} Update a rubric and optionally its dimensions.

Function: update_rubric

Example

curl -X PUT http://localhost:8000/api/rubrics/{rubric_id} \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/rubrics/{rubric_id}/activate Set a rubric as the active rubric for its agent.

Function: activate_rubric

Example

curl -X POST http://localhost:8000/api/rubrics/{rubric_id}/activate \
  -H "Content-Type: application/json" \
  -d '{...}'

Evaluation

5 endpoints

GET /api/eval-agent-report Return eval agent health metrics derived from eval_traces.

Function: eval_agent_report

Example

curl http://localhost:8000/api/eval-agent-report

POST /api/eval-consistency-check Cross-evaluator consensus check — re-score a sample of traces with strict/lenient variants.

Function: eval_consistency_check

Example

curl -X POST http://localhost:8000/api/eval-consistency-check \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/eval-results List eval results with optional config filter.

Function: list_eval_results

Example

curl http://localhost:8000/api/eval-results

POST /api/evaluate Run LLM-as-judge evaluation on traces using per-agent rubrics.

Function: evaluate_traces

Example

curl -X POST http://localhost:8000/api/evaluate \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/run Execute the 3-agent pipeline on cases using LLM with full hyperparameter support.

Function: run_pipeline

Example

curl -X POST http://localhost:8000/api/run \
  -H "Content-Type: application/json" \
  -d '{...}'

Scans

4 endpoints

POST /api/eval-scan Start an evaluate-only scan for externally ingested traces. No agent re-execution.

Function: start_eval_scan

Example

curl -X POST http://localhost:8000/api/eval-scan \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/scans List all scans. Optional ?workflow_id=X filter.

Function: list_scans

Example

curl http://localhost:8000/api/scans

GET /api/scans/{scan_id} Get scan detail with full results.

Function: get_scan

Example

curl http://localhost:8000/api/scans/{scan_id}

POST /api/spectral-scan Start a Spectral Scan: the full autonomous Observe→Evaluate→Diagnose→Optimize→Tournament→Gate loop.

Function: start_spectral_scan

Example

curl -X POST http://localhost:8000/api/spectral-scan \
  -H "Content-Type: application/json" \
  -d '{...}'

Reports

2 endpoints

GET /api/scan-report/{scan_id} Enhanced scan report with dynamically generated finding narrative.

Function: get_scan_report

Example

curl http://localhost:8000/api/scan-report/{scan_id}

GET /api/scan-report/{scan_id}/pdf Generate and return a PDF scan report with Spectral branding.

Function: get_scan_report_pdf

Example

curl http://localhost:8000/api/scan-report/{scan_id}/pdf

Failure Analysis

3 endpoints

POST /api/analyze Run LLM failure analysis on eval results for a config.

Function: analyze_failures

Example

curl -X POST http://localhost:8000/api/analyze \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/compositional/{config_id} Gap 5: Returns multiplicative system reliability breakdown.

Function: get_compositional_reliability

Example

curl http://localhost:8000/api/compositional/{config_id}
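
A multiplicative breakdown treats the pipeline as a chain in which every stage must succeed, so end-to-end reliability is the product of the per-stage pass rates. The sketch below assumes the stages fail independently. The stage names and rates are made up, and the actual response shape of this endpoint is not documented here.

```python
import math


def system_reliability(stage_pass_rates: dict[str, float]) -> float:
    """End-to-end reliability of a serial pipeline, assuming independent stages."""
    return math.prod(stage_pass_rates.values())


def weakest_stage(stage_pass_rates: dict[str, float]) -> str:
    """Name of the stage with the lowest pass rate."""
    return min(stage_pass_rates, key=stage_pass_rates.get)


# Hypothetical three-stage pipeline.
rates = {"intake": 0.98, "decision": 0.90, "summary": 0.95}
print(round(system_reliability(rates), 4))  # 0.8379
print(weakest_stage(rates))                 # decision
```

Three stages that each look healthy in isolation still combine to under 84% here, which is why the per-stage breakdown matters.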

GET /api/failures/{config_id} Get failure clusters for a config.

Function: get_failures

Example

curl http://localhost:8000/api/failures/{config_id}

Experiments

3 endpoints

GET /api/experiments List experiments.

Function: list_experiments

Example

curl http://localhost:8000/api/experiments

POST /api/experiments Create an experiment and record initial results.

Function: create_experiment

Example

curl -X POST http://localhost:8000/api/experiments \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/experiments/{experiment_id} Experiment detail with leaderboard.

Function: get_experiment

Example

curl http://localhost:8000/api/experiments/{experiment_id}

Promotion

8 endpoints

POST /api/promote-auto Auto-promote the patch config from a GO scan to champion.

Function: promote_auto

Example

curl -X POST http://localhost:8000/api/promote-auto \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/promote/{config_id} Promote a config to champion.

Function: promote_config

Example

curl -X POST http://localhost:8000/api/promote/{config_id} \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/promotion Champion vs challenger comparison data.

Function: get_promotion

Example

curl http://localhost:8000/api/promotion

POST /api/v2/promotion-stages Create a promotion stage pipeline for a candidate config.

Function: create_v2_promotion_stages

Example

curl -X POST http://localhost:8000/api/v2/promotion-stages \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/v2/promotion-stages/{candidate_id} Get promotion stage status for a candidate config.

Function: get_v2_promotion_stages

Example

curl http://localhost:8000/api/v2/promotion-stages/{candidate_id}

POST /api/v2/promotion-stages/{candidate_id}/advance Advance to the next promotion stage.

Function: advance_v2_promotion_stage

Example

curl -X POST http://localhost:8000/api/v2/promotion-stages/{candidate_id}/advance \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/v2/promotion-stages/{candidate_id}/approve Approve the current promotion stage (requires human sign-off).

Function: approve_v2_promotion_stage

Example

curl -X POST http://localhost:8000/api/v2/promotion-stages/{candidate_id}/approve \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/v2/promotion-stages/{candidate_id}/rollback Rollback all promotion stages for a candidate.

Function: rollback_v2_promotion_stage

Example

curl -X POST http://localhost:8000/api/v2/promotion-stages/{candidate_id}/rollback \
  -H "Content-Type: application/json" \
  -d '{...}'

Dashboard

3 endpoints

GET /api/control-plane Cross-workflow summary for the Control Plane screen.

Function: get_control_plane

Example

curl http://localhost:8000/api/control-plane

GET /api/dashboard Returns KPIs computed from the database.

Function: get_dashboard

Example

curl http://localhost:8000/api/dashboard

GET /api/workflow-overview Workflow-level KPIs, activity feed, and accuracy trend for Prior Authorization.

Function: get_workflow_overview

Example

curl http://localhost:8000/api/workflow-overview

Meta-Improvement

22 endpoints

GET /api/meta/convergence Champion score trajectory over scans (convergence analysis).

Function: get_convergence_data

Example

curl http://localhost:8000/api/meta/convergence

GET /api/meta/dashboard Full meta-improvement system dashboard.

Function: meta_dashboard

Example

curl http://localhost:8000/api/meta/dashboard

GET /api/meta/experiments List meta-improvement experiments.

Function: list_meta_experiments

Example

curl http://localhost:8000/api/meta/experiments

GET /api/meta/improvement-log Return recent events from the meta_improvement_log table.

Function: get_improvement_log

Example

curl http://localhost:8000/api/meta/improvement-log

POST /api/meta/rubric-audit Layer 1: Audit rubrics against outcome data and propose mutations.

Function: meta_rubric_audit

Example

curl -X POST http://localhost:8000/api/meta/rubric-audit \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/meta/rubric-mutations List all rubric mutations with optional filters.

Function: list_rubric_mutations

Example

curl http://localhost:8000/api/meta/rubric-mutations

POST /api/meta/rubric-mutations/{mutation_id}/apply Apply a specific proposed rubric mutation.

Function: apply_rubric_mutation

Example

curl -X POST http://localhost:8000/api/meta/rubric-mutations/{mutation_id}/apply \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/meta/rubric-mutations/{mutation_id}/reject Reject a proposed rubric mutation.

Function: reject_rubric_mutation

Example

curl -X POST http://localhost:8000/api/meta/rubric-mutations/{mutation_id}/reject \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/meta/strategies List optimization strategies with ELO ratings.

Function: list_strategies

Example

curl http://localhost:8000/api/meta/strategies

POST /api/meta/strategies/{strategy_name}/record Record a win/loss for a strategy to update ELO ratings.

Function: record_strategy

Example

curl -X POST http://localhost:8000/api/meta/strategies/{strategy_name}/record \
  -H "Content-Type: application/json" \
  -d '{...}'
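
For reference, the standard Elo update looks like the sketch below. Whether Spectral uses this exact formula, a K-factor of 32, or rates strategies against each other rather than a fixed baseline is not stated in the source. Treat those choices and the function names as assumptions.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Probability that `rating` beats `opponent` under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))


def update_elo(winner: float, loser: float, k: float = 32.0) -> tuple[float, float]:
    """Return the new (winner, loser) ratings after one decisive result."""
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain


print(update_elo(1500.0, 1500.0))  # (1516.0, 1484.0)
```

An upset moves ratings further than an expected win, because the gain scales with how unlikely the result was.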

POST /api/meta/tune Layer 3: Recommend optimal meta-hyperparameters (sample_size, num_candidates, promotion_threshold).

Function: meta_tune

Example

curl -X POST http://localhost:8000/api/meta/tune \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/v2/intervention-memory List past intervention records.

Function: list_intervention_memory

Example

curl http://localhost:8000/api/v2/intervention-memory

GET /api/v2/intervention-memory/effectiveness Summarize intervention effectiveness.

Function: get_intervention_effectiveness

Example

curl http://localhost:8000/api/v2/intervention-memory/effectiveness

GET /api/v2/intervention-memory/similar Retrieve historically similar interventions for a cluster.

Function: get_similar_interventions

Example

curl http://localhost:8000/api/v2/intervention-memory/similar

GET /api/v2/mutations List mutation types from the type registry.

Function: list_v2_mutations

Example

curl http://localhost:8000/api/v2/mutations

GET /api/v2/mutations/recommend Recommend mutation types based on recent failure attributions.

Function: recommend_mutations

Example

curl http://localhost:8000/api/v2/mutations/recommend

GET /api/v2/mutations/{mutation_id} Get a specific mutation type by ID.

Function: get_v2_mutation

Example

curl http://localhost:8000/api/v2/mutations/{mutation_id}

GET /api/v2/objective-functions List all objective functions. Optional ?workflow_id=X filter.

Function: list_objective_functions

Example

curl http://localhost:8000/api/v2/objective-functions

POST /api/v2/objective-functions Create a new objective function for a workflow.

Function: create_objective_function

Example

curl -X POST http://localhost:8000/api/v2/objective-functions \
  -H "Content-Type: application/json" \
  -d '{...}'

DELETE /api/v2/objective-functions/{obj_id} Delete an objective function.

Function: delete_objective_function

Example

curl -X DELETE http://localhost:8000/api/v2/objective-functions/{obj_id}

GET /api/v2/objective-functions/{obj_id} Get a single objective function by ID.

Function: get_objective_function

Example

curl http://localhost:8000/api/v2/objective-functions/{obj_id}

PUT /api/v2/objective-functions/{obj_id} Update an existing objective function.

Function: update_objective_function

Example

curl -X PUT http://localhost:8000/api/v2/objective-functions/{obj_id} \
  -H "Content-Type: application/json" \
  -d '{...}'

Outcomes

6 endpoints

GET /api/outcomes List outcome signals with optional type filter.

Function: list_outcome_signals

Example

curl http://localhost:8000/api/outcomes

POST /api/outcomes Ingest a single real-world outcome signal and correlate to traces.

Function: ingest_outcome_signal

Example

curl -X POST http://localhost:8000/api/outcomes \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/outcomes/batch Batch ingest outcome signals.

Function: ingest_outcome_batch

Example

curl -X POST http://localhost:8000/api/outcomes/batch \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/outcomes/correlation Compute correlation between rubric eval scores and real-world outcomes.

Function: outcome_rubric_correlation

Example

curl http://localhost:8000/api/outcomes/correlation

GET /api/v2/outcome-correlations Compute enhanced outcome correlations for v2.

Function: v2_outcome_correlations

Example

curl http://localhost:8000/api/v2/outcome-correlations

POST /api/v2/outcomes/enhanced Ingest enhanced outcome signal with business metrics.

Function: ingest_enhanced_outcome

Example

curl -X POST http://localhost:8000/api/v2/outcomes/enhanced \
  -H "Content-Type: application/json" \
  -d '{...}'

Safety

4 endpoints

GET /api/intervention-log Gap 3: Returns the cross-workflow intervention history.

Function: get_intervention_log

Example

curl http://localhost:8000/api/intervention-log

GET /api/safety-report/{scan_id} Gap 8: Returns safety check summary for all traces in a scan.

Function: get_safety_report

Example

curl http://localhost:8000/api/safety-report/{scan_id}

POST /api/v2/anti-deception/check Run anti-deception suite for a candidate config and scan.

Function: anti_deception_check

Example

curl -X POST http://localhost:8000/api/v2/anti-deception/check \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/v2/anti-deception/drift Detect score drift in recent scans.

Function: anti_deception_drift

Example

curl http://localhost:8000/api/v2/anti-deception/drift

Scheduling

5 endpoints

GET /api/scan-schedules List all scan schedules.

Function: list_scan_schedules

Example

curl http://localhost:8000/api/scan-schedules

POST /api/scan-schedules Create a scheduled scan.

Function: create_scan_schedule

Example

curl -X POST http://localhost:8000/api/scan-schedules \
  -H "Content-Type: application/json" \
  -d '{...}'

DELETE /api/scan-schedules/{schedule_id} Delete a scan schedule.

Function: delete_scan_schedule

Example

curl -X DELETE http://localhost:8000/api/scan-schedules/{schedule_id}

POST /api/scan-schedules/{schedule_id}/toggle Enable/disable a scan schedule.

Function: toggle_scan_schedule

Example

curl -X POST http://localhost:8000/api/scan-schedules/{schedule_id}/toggle \
  -H "Content-Type: application/json" \
  -d '{...}'

POST /api/scan-trigger-on-ingest Auto-trigger a scan if >= 50 new traces since last scan.

Function: scan_trigger_on_ingest

Example

curl -X POST http://localhost:8000/api/scan-trigger-on-ingest \
  -H "Content-Type: application/json" \
  -d '{...}'

Settings & Notifications

6 endpoints

GET /api/digest/weekly Weekly summary of scan activity, verdicts, and top failure clusters.

Function: weekly_digest

Example

curl http://localhost:8000/api/digest/weekly

GET /api/notifications List recent scan notifications.

Function: list_notifications

Example

curl http://localhost:8000/api/notifications

GET /api/settings Get all settings key-value pairs.

Function: get_settings

Example

curl http://localhost:8000/api/settings

POST /api/settings Upsert one or more settings. Body: {key: value, ...}.

Function: upsert_settings

Example

curl -X POST http://localhost:8000/api/settings \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/settings/webhook Get current webhook URL.

Function: get_webhook_url

Example

curl http://localhost:8000/api/settings/webhook

POST /api/settings/webhook Configure webhook URL for scan notifications.

Function: set_webhook_url

Example

curl -X POST http://localhost:8000/api/settings/webhook \
  -H "Content-Type: application/json" \
  -d '{...}'

Validation

3 endpoints

POST /api/validate/run Kick off a new validation analysis (analyzes recent scans for proof pack patterns).

Function: run_validation

Example

curl -X POST http://localhost:8000/api/validate/run \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/validate/status Return status of all validation runs.

Function: get_validation_status

Example

curl http://localhost:8000/api/validate/status

GET /api/validate/{run_id} Get status/results of a specific validation run.

Function: get_validation_run

Example

curl http://localhost:8000/api/validate/{run_id}

Integration

2 endpoints

POST /api/generate-api-key Generate a new API key stub for the demo.

Function: generate_api_key

Example

curl -X POST http://localhost:8000/api/generate-api-key \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/integration-status Returns integration health metrics.

Function: integration_status

Example

curl http://localhost:8000/api/integration-status

Autonomy

5 endpoints

GET /api/v2/autonomy List all autonomy settings.

Function: list_autonomy_settings

Example

curl http://localhost:8000/api/v2/autonomy

POST /api/v2/autonomy Create autonomy settings for a workflow.

Function: create_autonomy_settings

Example

curl -X POST http://localhost:8000/api/v2/autonomy \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/v2/autonomy/{workflow_id} Get autonomy settings for a workflow.

Function: get_autonomy_settings

Example

curl http://localhost:8000/api/v2/autonomy/{workflow_id}

PUT /api/v2/autonomy/{workflow_id} Update autonomy settings for a workflow.

Function: update_autonomy_settings

Example

curl -X PUT http://localhost:8000/api/v2/autonomy/{workflow_id} \
  -H "Content-Type: application/json" \
  -d '{...}'

GET /api/v2/workflow-graph Return workflow graph structure.

Function: get_workflow_graph

Example

curl http://localhost:8000/api/v2/workflow-graph

System Diagram

Browser Dashboard: index.html · Chart.js · Vanilla JS
HTTP / REST
FastAPI Server: api_server.py · port 8000
scan_engine.py: Autonomous Observe→Gate loop
eval.py: LLM-as-judge evaluation
llm.py: Retry wrapper for Claude API
anti_deception.py: Safety checks & drift detection
PostgreSQL 17: 31 tables · All persistent state
Claude (Anthropic): LLM eval · Mutation gen · Diagnosis
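
The diagram describes llm.py as a retry wrapper for the Claude API. A generic retry helper with exponential backoff might look like the sketch below. It is not the contents of llm.py: the function name, the attempt count, the delay schedule, and the choice to retry on every exception are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 3, base_delay_s: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on any exception with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay_s * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise ValueError("attempts must be at least 1")
```

In practice a wrapper like this would usually retry only on transient errors such as timeouts or rate limits, and the API call itself would be passed in as `call`.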

Database Tables

Spectral stores all state across the optimization lifecycle in 31 PostgreSQL tables. The 21 listed here are:

workflows

Named agent pipelines with metadata and economics parameters

test_cases

Input/output pairs with eval_pool assignment (working, holdout, adversarial)

traces

Raw agent executions — input, output, latency, token usage

configs

Versioned agent hyperparameter sets including system prompts and model settings

rubrics

Evaluation criteria per agent, with weighted dimensions

rubric_dimensions

Individual scoring dimensions within a rubric

eval_results

LLM-judge scores per trace and dimension, with reasoning

scans

Scan runs with status, verdict (GO/NO-GO), and champion config

scan_configs

Configurations evaluated within a scan (champion + challengers)

failure_clusters

Clustered failure patterns from diagnosis phase

experiments

Tournament A/B experiments with leaderboard results

promotions

Config promotion history and champion lineage

rubric_mutations

Proposed rubric changes from meta-improvement layer

strategy_registry

Optimization strategies with ELO ratings for selection

intervention_memory

Historical intervention records for pattern matching

outcome_signals

Real-world outcomes correlated back to eval scores

scan_schedules

Cron-style scheduled scan triggers

notifications

Scan completion alerts and webhook delivery records

autonomy_settings

Per-workflow automation thresholds and mode settings

objective_functions

Custom optimization targets blending rubric scores and business metrics

meta_improvement_log

Audit trail of all meta-level system changes
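
The `test_cases` table assigns every case to an eval pool. How Spectral makes that assignment is not described here. One common approach, sketched below, is a deterministic hash-based split so a case never moves between pools across runs. The function name, the 20% holdout fraction, and the assumption that adversarial cases are curated by hand are all illustrative.

```python
import hashlib


def assign_pool(case_id: str, holdout_fraction: float = 0.2) -> str:
    """Map a case id to 'working' or 'holdout' in a stable, repeatable way."""
    digest = hashlib.sha256(case_id.encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to roughly [0, 1).
    bucket = int.from_bytes(digest[:8], "big") / 2**64
    return "holdout" if bucket < holdout_fraction else "working"


# Assumption: adversarial cases are added manually, so only two pools appear here.
pools = {cid: assign_pool(cid) for cid in ("case-1", "case-2", "case-3")}
print(pools)
```

Because the split depends only on the case id, rerunning it never leaks a holdout case into the working pool.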

Key Services

scan_engine.py

The core autonomous loop. Orchestrates all phases: sampling test cases, dispatching eval, clustering failures, invoking the LLM to generate config mutations, running tournament experiments, and issuing the final GO/NO-GO verdict against the holdout set. This is the primary differentiator of the Spectral platform.

eval.py

LLM-as-judge evaluation engine. Scores agent traces against multi-dimensional rubrics using Claude. Supports consistency checking (strict/lenient variants), compositional reliability analysis, and eval agent health monitoring to detect evaluator drift.

llm.py

Resilient wrapper around the Anthropic Claude API. Implements exponential backoff with jitter, request deduplication, token budgeting, and structured output parsing. All LLM calls in Spectral go through this module.
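
Exponential backoff with jitter can be sketched in a few lines. This is not the code in `llm.py`. It is a minimal illustration of the technique using the "full jitter" variant, with hypothetical parameter names and defaults. It retries on any exception for brevity; a real wrapper would retry only on transient errors such as rate limits and timeouts.

```python
import random
import time


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on any exception with capped exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # Full jitter: wait a random time up to the capped exponential delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, ceiling))


attempts = []


def flaky():
    """Stand-in for an API call that fails twice, then succeeds."""
    attempts.append(1)
    if len(attempts) < 3:
        raise RuntimeError("transient failure")
    return "ok"


delays = []
result = call_with_backoff(flaky, sleep=delays.append)  # record delays instead of sleeping
print(result, len(attempts))
```

Injecting `sleep` keeps the example fast to run and makes the delay schedule easy to inspect.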

anti_deception.py

Safety layer that runs on every scan. Detects prompt injection, evaluator gaming, and score drift. Cross-validates eval results with consistency checks. Flags suspicious patterns in the intervention log for human review before promotion.
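
How `anti_deception.py` measures score drift is not specified here. One simple approach is to compare the mean score of a recent window against a baseline window and flag shifts beyond a tolerance. The function name and the 0.1 tolerance below are assumptions made for illustration.

```python
from statistics import mean


def detect_score_drift(baseline, recent, tolerance=0.1):
    """Return (drifted, shift) where shift = mean(recent) - mean(baseline)."""
    if not baseline or not recent:
        raise ValueError("both score windows must be non-empty")
    shift = mean(recent) - mean(baseline)
    return abs(shift) > tolerance, shift


baseline_scores = [0.70, 0.72, 0.68, 0.71]
recent_scores = [0.90, 0.88, 0.91, 0.89]
drifted, shift = detect_score_drift(baseline_scores, recent_scores)
print(drifted, round(shift, 3))
```

A sudden jump like this one could be a genuine improvement or a sign of evaluator gaming, which is why a flag of this kind would go to human review rather than trigger an automatic decision.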

The Scan Loop

The full Spectral Scan runs six sequential phases. Each phase is autonomous — no human input is required between steps:

Phase 01

Observe

Sample n cases from the working pool. Execute the agent pipeline on each input using the current champion config. Record traces — input, output, latency, tokens.

Phase 02

Evaluate

Run Claude as LLM-judge on every trace, scoring each rubric dimension independently. Aggregate weighted scores into a single quality score per trace.
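
The aggregation step can be read as a weighted average. The sketch below uses the dimension weights from the rubric example in the quickstart. The function name and the 0 to 1 score scale are assumptions.

```python
def weighted_quality_score(dimension_scores, weights):
    """Weighted average of per-dimension scores, normalized by total weight."""
    missing = set(weights) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted_sum = sum(dimension_scores[name] * w for name, w in weights.items())
    return weighted_sum / total_weight


weights = {"accuracy": 0.5, "completeness": 0.3, "safety": 0.2}
scores = {"accuracy": 0.8, "completeness": 0.6, "safety": 1.0}  # assumed 0-1 scale
quality = weighted_quality_score(scores, weights)
print(round(quality, 2))
```

Dividing by the total weight keeps the result on the same scale as the inputs even if a rubric's weights do not sum to exactly 1.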

Phase 03

Diagnose

Cluster failing traces by failure mode (e.g., "hallucinated drug dosage", "ignored patient history"). Use the LLM to write a root-cause narrative per cluster.
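
The grouping step can be sketched as follows, assuming each failing trace already carries a failure-mode label (in practice the label would come from the LLM or an embedding-based clusterer). The dictionary keys and the 0.6 pass threshold are hypothetical.

```python
from collections import defaultdict


def cluster_failures(traces, pass_threshold=0.6):
    """Group traces scoring below pass_threshold by their failure_mode label."""
    clusters = defaultdict(list)
    for trace in traces:
        if trace["score"] < pass_threshold:
            clusters[trace["failure_mode"]].append(trace["id"])
    # Largest clusters first, so the most common failure is handled first.
    return dict(sorted(clusters.items(), key=lambda kv: len(kv[1]), reverse=True))


traces = [
    {"id": 1, "score": 0.2, "failure_mode": "hallucinated drug dosage"},
    {"id": 2, "score": 0.9, "failure_mode": None},
    {"id": 3, "score": 0.4, "failure_mode": "ignored patient history"},
    {"id": 4, "score": 0.3, "failure_mode": "hallucinated drug dosage"},
]
clusters = cluster_failures(traces)
print(clusters)
```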

Phase 04

Optimize

For each failure cluster, generate 3–5 config mutations (prompt edits, temperature changes, few-shot additions). Mutations are guided by the strategy registry ELO ratings.
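
Exactly how ratings feed into mutation selection is not described here. The sketch below shows the standard Elo update, under the assumption that a strategy "wins" when its mutation outscores a mutation produced by another strategy. The strategy names, the K-factor of 32, and the starting rating of 1500 are illustrative.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(winner_rating, loser_rating, k=32.0):
    """Return the new (winner, loser) ratings after one decisive comparison."""
    delta = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + delta, loser_rating - delta


ratings = {"prompt_edit": 1500.0, "few_shot_addition": 1500.0}
# Assumption: the prompt-edit mutation outscored the few-shot mutation.
ratings["prompt_edit"], ratings["few_shot_addition"] = update_elo(
    ratings["prompt_edit"], ratings["few_shot_addition"]
)
preferred = max(ratings, key=ratings.get)  # strategy to try first next time
print(ratings, preferred)
```

Between equally rated strategies the winner gains half the K-factor, so a single result shifts the ordering without locking it in.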

Phase 05

Tournament

A/B test all candidate configs against the champion on the working set. Score each challenger. The top-scoring challenger becomes the promotion candidate.
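
Selecting the promotion candidate reduces to ranking challengers by score. A minimal sketch, assuming each config is summarized by the mean of its per-case quality scores on the working set (the function and config names are hypothetical):

```python
from statistics import mean


def run_tournament(champion_scores, challenger_scores):
    """Return (candidate, leaderboard); candidate is the top-scoring challenger."""
    means = {name: mean(scores) for name, scores in challenger_scores.items()}
    if not means:
        raise ValueError("at least one challenger is required")
    candidate = max(means, key=means.get)
    # The champion appears on the leaderboard for reference only.
    rows = {**means, "champion": mean(champion_scores)}
    leaderboard = sorted(rows.items(), key=lambda row: row[1], reverse=True)
    return candidate, leaderboard


candidate, leaderboard = run_tournament(
    champion_scores=[0.70, 0.75, 0.80],
    challenger_scores={
        "challenger_a": [0.80, 0.85, 0.90],
        "challenger_b": [0.60, 0.70, 0.80],
    },
)
print(candidate, [name for name, _ in leaderboard])
```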

Phase 06

Gate

Re-run the promotion candidate on the holdout set (cases never seen during optimization). If it beats the champion by at least the promotion threshold, the verdict is GO and the config is eligible for promotion; otherwise the verdict is NO-GO.
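
The gate decision can be sketched as a threshold test on holdout scores. The function name and the 0.02 threshold are assumptions; a production gate might also apply a significance test before issuing GO.

```python
from statistics import mean


def gate_verdict(champion_holdout, candidate_holdout, promotion_threshold=0.02):
    """Return (verdict, improvement) from per-case holdout scores."""
    improvement = mean(candidate_holdout) - mean(champion_holdout)
    verdict = "GO" if improvement >= promotion_threshold else "NO-GO"
    return verdict, improvement


verdict, improvement = gate_verdict(
    champion_holdout=[0.70, 0.72, 0.74],
    candidate_holdout=[0.78, 0.80, 0.82],
)
print(verdict, round(improvement, 2))
```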

A GO verdict means the challenger config outscored the current champion on unseen holdout data by at least the promotion threshold. Use POST /api/promote-auto or the Promotion Gate screen to deploy it.

The anti-deception checks in anti_deception.py run alongside every phase. Any detected score drift or evaluator gaming will block promotion and log an intervention event regardless of the GO/NO-GO verdict.
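
The interaction between the verdict and the safety checks can be expressed as a small decision function. This is a sketch of the rule stated above, not the actual implementation; the function name, flag names, and return shape are hypothetical.

```python
def promotion_decision(verdict, deception_flags):
    """Allow promotion only for a GO verdict with no anti-deception flags."""
    blocked_by_safety = bool(deception_flags)
    return {
        "promote": verdict == "GO" and not blocked_by_safety,
        # A safety flag is logged as an intervention whatever the verdict.
        "log_intervention": blocked_by_safety,
        "reasons": list(deception_flags),
    }


decision = promotion_decision("GO", ["score_drift"])
print(decision)
```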
