Stuart, a personal learning agent that grows with the student. One command, offline-first.
Claw-STU is a command-line personal learning agent for students. Stuart runs an adaptive teach-assess-adapt loop — it picks a learning modality based on what's been working for this specific student, presents a short block, asks a check-for-understanding question, and steps the complexity tier up or down based on the answer. Then it does it again.
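The adapt step of that loop can be sketched as a tiny state machine. This is an illustrative sketch only: the tier names, `LearnerState`, and `step_tier` are assumptions, not the real Claw-STU API.

```python
from dataclasses import dataclass

# Hypothetical complexity tiers; Claw-STU's real tier model may differ.
TIERS = ["intro", "core", "stretch"]

@dataclass
class LearnerState:
    tier_index: int = 0  # start at the lowest tier

def step_tier(state: LearnerState, answer_correct: bool) -> LearnerState:
    """Step complexity up on a correct check, down on a miss, clamped at the ends."""
    delta = 1 if answer_correct else -1
    new_index = min(max(state.tier_index + delta, 0), len(TIERS) - 1)
    return LearnerState(tier_index=new_index)
```

The loop then repeats: pick a modality, present a block at `TIERS[state.tier_index]`, check understanding, and step again.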
Stuart is explicitly not a tutor, a friend, a therapist, or an authority figure. It is a cognitive tool. It never claims to feel emotions, never praises innate ability, and never replaces a teacher or a guardian. The boundary is documented in SOUL.md and enforced at every entry point by an inbound safety gate that runs before anything else.
Works with multiple AI providers (Anthropic, OpenAI, Ollama, OpenRouter) and a deterministic Echo provider as the guaranteed fallback floor. The session loop never stalls when a provider is unreachable. Requires Python 3.11+. Works on macOS, Windows, and Linux.
First run creates `~/.claw-stu/` with `0700` permissions, loads defaults from `AppConfig`, and picks up API keys from `secrets.json` or environment variables (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `OPENROUTER_API_KEY`, `OLLAMA_BASE_URL`). Missing keys fall through the chain, ending at Echo.
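The key fallthrough might look roughly like this. Only the env var names come from the text; `available_providers` and the secrets dict shape are assumptions for illustration.

```python
import os

# Env var per provider, as listed above.
ENV_KEYS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
}

def available_providers(secrets: dict[str, str]) -> list[str]:
    """Providers with a key in secrets.json or the environment; Echo always last."""
    found = [
        name for name, env in ENV_KEYS.items()
        if secrets.get(name) or os.environ.get(env)
    ]
    return found + ["echo"]  # Echo is the guaranteed floor
```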
`mypy --strict` clean across every phase. Ruff clean. `filterwarnings = ["error"]` clean. The full test suite runs in under 2 seconds. Every phase has its own AST-enforced layering guard.
Stuart is a deterministic state machine with an optional LLM boundary. The session runner, safety gate, and memory layer are all pure Python. Only the content-generation seam talks to a provider, and even that seam has a guaranteed Echo fallback floor so the loop never stalls.
Stuart tracks whether each learner is approaching, meeting, or exceeding the grade-level standard. The estimate updates after every observed check.
Each task kind (`SOCRATIC_DIALOGUE`, `BLOCK_GENERATION`, `CHECK_GENERATION`, `RUBRIC_EVALUATION`, `PATHWAY_PLANNING`, `CONTENT_CLASSIFY`, `DREAM_CONSOLIDATION`) is routed to the provider / model best suited for it: Ollama for local latency-sensitive tasks, Anthropic Haiku for accuracy-critical rubric evaluation, and GLM 4.5 Air on OpenRouter for prose blocks.
Context assembly is centralized in `build_learner_context()` — it pulls the learner's compiled-truth brain page, the concept's HAPP framing, the last three session pages, and any flagged misconceptions. The second time a student asks about the same topic, Stuart already knows what they understood last week.
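The resulting context bundle might be shaped roughly like this. The four ingredients come from the text; the field types and the dict-backed store are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class LearnerContext:
    brain_page: str                 # compiled-truth summary of the learner
    happ_framing: str               # the concept's HAPP framing
    recent_sessions: list[str] = field(default_factory=list)  # last three session pages
    misconceptions: list[str] = field(default_factory=list)   # flagged misconceptions

def build_learner_context(store: dict) -> LearnerContext:
    """Assemble the per-call context from a simple dict-backed store (sketch)."""
    return LearnerContext(
        brain_page=store.get("brain_page", ""),
        happ_framing=store.get("happ_framing", ""),
        recent_sessions=store.get("sessions", [])[-3:],
        misconceptions=store.get("misconceptions", []),
    )
```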
Closing a session writes a `SessionPage`, updates the `LearnerPage` compiled truth, and adds KG triples. Overnight, the scheduler runs a dream cycle that rewrites compiled truths, detects concept gaps, and re-indexes embeddings.
Every layer is enforced by `tests/test_hierarchy.py`, an AST-based import-DAG guard that runs on every commit. Violations fail CI before merge.
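A guard in that spirit can be sketched with the standard-library `ast` module. The layer map below is an assumption for illustration, not the project's real hierarchy.

```python
import ast

# Hypothetical layer heights: a file may only import from layers at or below its own.
LAYER = {"safety": 0, "memory": 1, "session": 2}

def imported_modules(source: str) -> set[str]:
    """Top-level module names imported by a source file."""
    names: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module.split(".")[0])
    return names

def violates_layering(layer: str, source: str) -> bool:
    """True if the file imports from a layer above its own."""
    ceiling = LAYER[layer]
    return any(LAYER.get(m, -1) > ceiling for m in imported_modules(source))
```

A test would walk the package, call `violates_layering` per file, and fail on any hit.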
Safety is the lowest layer in the stack — no student text reaches any other layer without being checked. Every student-text entry point (`/sessions/{id}/calibration-answer`, `/check-answer`, `/socratic`, `/learners/{id}/capture`) runs through `InboundSafetyGate.scan(text)` before any other logic.
The gate returns one of three decisions; the strictest is `CRISIS_PAUSE`. When it fires, the evaluator and the orchestrator are never called. Escalation resources are returned. A single structured event `{event: crisis_detected, kind, session_id_hash, learner_id_hash}` is logged — no raw text, no PII, no brain-page entry. The omission is deliberate: SOUL.md §5 says Stuart surfaces human resources and steps out of the teach loop. Preserving a paper trail of a specific crisis message would create a PII retention hazard the project refuses to accept.
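The event shape can be sketched as follows. The field names come from the text; the hashing scheme is an assumption.

```python
import hashlib

def crisis_event(kind: str, session_id: str, learner_id: str) -> dict[str, str]:
    """Build the single structured log event: no raw text, no PII."""
    def digest(s: str) -> str:
        # Illustrative: truncated SHA-256 so IDs are correlatable but not reversible.
        return hashlib.sha256(s.encode()).hexdigest()[:16]
    return {
        "event": "crisis_detected",
        "kind": kind,
        "session_id_hash": digest(session_id),
        "learner_id_hash": digest(learner_id),
    }
```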
A paused session can only be closed (producing a summary that acknowledges the pause) or explicitly unpaused by an administrator. `next_directive` has an explicit `CRISIS_PAUSE` branch at the top of the dispatch, so a paused session cannot be accidentally routed back into the teach loop. This is covered by `test_crisis_paused_session_refuses_next_directive`.
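The short-circuit can be sketched as below. The directive names and the session dict are illustrative stand-ins, not the real dispatch.

```python
def next_directive(session: dict) -> str:
    # Crisis branch sits at the top of the dispatch: a paused session
    # can never fall through into the teach loop.
    if session.get("status") == "CRISIS_PAUSE":
        return "REFUSE_PAUSED"
    if session.get("phase") == "calibration":
        return "PRESENT_CALIBRATION"
    return "PRESENT_BLOCK"
```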
Claw-STU and Claw-ED are independent projects built by the same team. They compose: a teacher uses Claw-ED to generate lessons in their voice, and Claw-STU delivers them adaptively to each student with memory-backed personalization.
Both are MIT-licensed, Python, multi-provider, and offline-first.
Claw-STU is BYOK (bring your own key). Each `TaskKind` is routed to a provider / model picked for that job. The fallback chain is `ollama → openai → anthropic → openrouter`, ending at a deterministic `EchoProvider` floor so the session loop never stalls.
| Task | Default provider | Default model |
|---|---|---|
| `SOCRATIC_DIALOGUE` | Ollama (local) | `llama3.2` — free + instant |
| `BLOCK_GENERATION` | OpenRouter | `z-ai/glm-4.5-air` — cheap prose |
| `CHECK_GENERATION` | OpenRouter | `z-ai/glm-4.5-air` |
| `RUBRIC_EVALUATION` | Anthropic | `claude-haiku-4-5` — accuracy-critical |
| `PATHWAY_PLANNING` | OpenRouter | `z-ai/glm-4.5-air` |
| `CONTENT_CLASSIFY` | Ollama (local) | `llama3.2` — never network |
| `DREAM_CONSOLIDATION` | OpenRouter | `z-ai/glm-4.5-air` — overnight batch |
Missing API keys don't crash the router — they just fall through the chain. For a fully offline deployment, install Ollama with llama3.2 and leave the other keys unset. Stuart still works; it just uses Ollama for every task kind instead of the default split.
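That fallthrough can be sketched in a few lines. The chain order comes from the text; `pick_provider` and the `configured` set are illustrative, not the real router API.

```python
# Chain order from the docs above, with Echo as the deterministic floor.
FALLBACK_CHAIN = ["ollama", "openai", "anthropic", "openrouter", "echo"]

def pick_provider(preferred: str, configured: set[str]) -> str:
    """Try the task's preferred provider, then fall through the chain to Echo."""
    if preferred in configured:
        return preferred
    for name in FALLBACK_CHAIN:
        if name in configured:
            return name
    return "echo"  # always available, never stalls the loop
```

With only Ollama installed, every task kind resolves to `ollama`; with nothing configured, everything resolves to `echo`.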
Reachability is not probed at construction time. The router only knows presence or absence of an API key. Real network health checks land behind `clawstu doctor --ping`.
```
clawstu                                           # start a learning session
clawstu learn "photosynthesis"                    # learn a specific topic
clawstu resume <learner_id>                       # warm-start from last session
clawstu ask "What is a primary source?"           # one-shot Socratic question
clawstu wiki <concept>                            # per-student concept notes
clawstu progress                                  # learner dashboard (ZPD, modality)
clawstu history                                   # past sessions
clawstu review                                    # concepts due for review
clawstu setup                                     # pick your AI provider
clawstu serve                                     # web UI at localhost:8000
clawstu doctor [--ping]                           # self-diagnosis
clawstu profile export <id> --out profile.tar.gz  # portable profile
clawstu profile import profile.tar.gz             # restore a profile
clawstu scheduler run-once --task <name>          # run a proactive task
```
`doctor` is a pure static config dump by default. It never touches the network. Pass `--ping` to opt in to real reachability checks. This guarantee is enforced by `test_doctor_without_ping_does_not_make_network_calls`, which monkey-patches `httpx.Client.post` to raise on any invocation.
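A self-contained sketch of that test strategy, with stand-ins for `httpx` and the real `doctor` command (both `FakeHTTPClient` and this `doctor` are illustrative, not Claw-STU code):

```python
class FakeHTTPClient:
    """Stand-in for httpx.Client in this sketch."""
    def post(self, url: str) -> str:
        return "pong"

def doctor(client: FakeHTTPClient, ping: bool) -> str:
    """Static config dump by default; touches the client only when ping=True."""
    if ping:
        return client.post("http://localhost:11434/api/tags")
    return "config: ok"

# The monkey-patch: any post() raises, proving the static path never calls it.
def _refuse(self: FakeHTTPClient, url: str) -> str:
    raise AssertionError("network call without --ping")

FakeHTTPClient.post = _refuse
assert doctor(FakeHTTPClient(), ping=False) == "config: ok"
```

In the real suite, pytest's `monkeypatch.setattr` applies the same trick to `httpx.Client.post` and restores it after the test.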
Everything is also reachable over HTTP via the FastAPI app. All
student-text routes go through InboundSafetyGate first.
```
POST /sessions                          # onboard a learner
GET  /sessions/{session_id}             # current state
POST /sessions/{id}/calibration-answer  # submit calibration answer
POST /sessions/{id}/finish-calibration  # transition to teach loop
POST /sessions/{id}/next                # next directive
POST /sessions/{id}/check-answer        # submit check answer
POST /sessions/{id}/socratic            # free-form dialogue
POST /sessions/{id}/close               # close + write to brain
GET  /learners/{id}/wiki/{concept}      # per-student concept wiki
POST /learners/{id}/resume              # warm-start (pre-gen'd)
GET  /learners/{id}/queue               # scheduler queue for learner
POST /learners/{id}/capture             # student-shared source
GET  /admin/scheduler                   # scheduler status
GET  /admin/health                      # process health
```
Learner routes are gated by a shared-secret bearer token when `STU_LEARNER_AUTH_TOKEN` is set in the environment. Without the env var, auth is a no-op — single-household dev mode. Per-learner JWTs are a post-MVP concern.
Claw-STU is built in public and ships in 7 phases. Contributions are welcome.
```
git clone https://github.com/SirhanMacx/Claw-STU.git
cd Claw-STU
pip install -e ".[dev]"
pytest                    # 373 tests, under 2s
mypy clawstu              # strict, 81 files
ruff check clawstu tests  # clean
```
* prepare_next_session (Phase 6 writes a placeholder artifact)
* refresh_zpd