Live at chat.bcat.app
A smart cat
that thinks with you.
bcat is an AI chat agent that researches, writes, runs code, remembers, and reasons — privately. Everything stays on your side. No cloud ears, no data sold, no surveillance.
Private by design
Your conversations never leave your control. No telemetry, no third-party model calls, no cloud processing — ever.
Real research
Searches the web, reads pages, finds academic papers, browses Wikipedia, and digs into technical Q&A — all in one conversation.
Self-improving
bcat reflects on its own answers and stores high-confidence reasoning patterns — getting sharper the more you use it.
Continuously self-correcting
A nightly self-test that finds its own mistakes
Every night, bcat runs a battery of probes against itself, grades its answers with a second model, and only keeps the lessons that actually fix the failure. The reasoning patterns that survive are added to its working memory — so next time, it doesn't make the same mistake.
- 01 Probe: A fixed battery of tricky questions covers math, logic, false premises, prompt-injection robustness, hedging, and conciseness.
- 02 Grade: A second model strictly grades each answer PASS · WEAK · FAIL. PASSes are left alone; only failures move forward.
- 03 Extract a fix: For each failure, the judge proposes one behavioral pattern that would have prevented it, scored 0–10. Low-quality or duplicate fixes are dropped.
- 04 Verify or roll back: The probe is re-run with the new pattern injected. If it doesn't actually fix the failure, the pattern is rolled back. No false lessons stick.
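The four steps above can be sketched as a single loop. This is a minimal illustration, not bcat's actual code: the `Fix` type, the `answer`, `grade`, and `propose_fix` callables, and the `min_score` threshold of 7 are all hypothetical names chosen for the example. It also assumes that WEAK counts as a failure, which the text does not state.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Fix:
    pattern: str  # one behavioral pattern proposed by the judge
    score: int    # judge's quality score, 0-10


def nightly_self_test(
    probes: List[str],
    answer: Callable[[str, List[str]], str],   # (probe, patterns) -> answer
    grade: Callable[[str, str], str],          # (probe, answer) -> PASS | WEAK | FAIL
    propose_fix: Callable[[str, str], Fix],    # (probe, failed answer) -> Fix
    memory: List[str],                         # persisted reasoning patterns
    min_score: int = 7,                        # assumed threshold, not from the source
) -> List[str]:
    """Run probe -> grade -> extract -> verify, returning the patterns kept."""
    kept = []
    for probe in probes:
        first_try = answer(probe, memory)
        if grade(probe, first_try) == "PASS":
            continue  # passes are left alone
        fix = propose_fix(probe, first_try)
        if fix.score < min_score or fix.pattern in memory:
            continue  # drop low-quality or duplicate fixes
        memory.append(fix.pattern)  # inject the candidate pattern
        if grade(probe, answer(probe, memory)) == "PASS":
            kept.append(fix.pattern)  # verified: the lesson sticks
        else:
            memory.pop()  # roll back: the pattern did not fix the failure
    return kept
```

The rollback happens against the same probe that failed, so a pattern is only persisted when it demonstrably changes the outcome.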
Why it matters → Most chat AIs are static — the model that shipped is the model you talk to forever. bcat keeps a private journal of what tripped it up and how to avoid it next time, verified end-to-end before anything is added. Your conversations make bcat sharper without ever leaving your account.
Everything you need in one conversation
bcat routes your question to the right capability automatically — no prompts to memorize, no plugins to manage.
Web research
Live web search, full-page reading, Wikipedia, Hacker News, Wayback Machine, and more — cited and summarised.
Code & math
Writes, explains, and runs code in a safe sandbox. Solves equations and proofs with a dedicated math specialist.
Persistent memory
Tell bcat something once and it remembers. Ask it to recall, search, or forget — your memory is yours to control.
Image understanding
Upload a screenshot, photo, diagram, or document and ask anything about it. bcat describes, analyzes, and answers.
Academic papers
Searches arXiv, PubMed, and OpenAlex for peer-reviewed research — summarised in plain language, with citations.
Smart routing
Hard questions get handed to domain specialists — biology, security, science, ancient history — automatically, without you having to ask.
Translation
Translates text across dozens of languages and explains nuance, idiom, and context — not just literal word substitution.
Live facts
Current weather, country data, time zones, definitions — real answers from live sources, not stale training data.
Developer tools
Searches GitHub repos and Stack Exchange threads, explains error messages, and helps debug — without leaving the chat.
Under the hood
How a question finds the right brain
bcat doesn't pick a model — it picks a specialist. Every prompt passes through a three-layer router that escalates only when the cheap layers can't decide. If a specialist needs help, it can hand off to another specialist mid-answer.
- L2 Regex routing: Pattern-matches the obvious cases in roughly a millisecond: code, math, translation, ancient-language hints. No model call, no token cost.
- L3 Semantic embedding: kNN over a curated bank of exemplars. Catches the cases where the regex abstains but the meaning is unambiguous.
- L4 LLM classifier: A tiny qwen3:1.7b classifier returns strict JSON { "specialist": "..." } when the lower layers both abstain.
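A rough sketch of how such an escalating router could look is shown here. Everything in it is illustrative: the regex rules, the exemplar bank, the similarity threshold, and the `general` fallback are assumptions, the bag-of-words `embed` function is a toy stand-in for a real embedding model, and `classify` is a hypothetical callable standing in for the qwen3:1.7b call.

```python
import json
import math
import re
from collections import Counter
from typing import Callable, Optional, Tuple

SPECIALISTS = {"code", "math", "translation", "biology", "security", "science"}

# Regex layer: hypothetical rules, not bcat's real patterns.
REGEX_RULES = [
    (re.compile(r"\bdef \w+\(|\btraceback\b|\bsegfault\b", re.I), "code"),
    (re.compile(r"\b(integral|derivative|solve)\b|\d\s*[-+*/^=]\s*\d", re.I), "math"),
    (re.compile(r"\btranslate\b", re.I), "translation"),
]

# Embedding layer: a tiny, made-up exemplar bank.
EXEMPLARS = [
    ("how does dna replication work in a cell", "biology"),
    ("how do i harden an ssh server against attack", "security"),
    ("what elements form inside a dying star", "science"),
]


def regex_layer(prompt: str) -> Optional[str]:
    hits = {name for pattern, name in REGEX_RULES if pattern.search(prompt)}
    return hits.pop() if len(hits) == 1 else None  # abstain on no hit or conflict


def embed(text: str) -> Counter:
    # Toy bag-of-words vector; a real system would call an embedding model.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def embedding_layer(prompt: str, k: int = 3, min_sim: float = 0.5) -> Optional[str]:
    query = embed(prompt)
    nearest = sorted(((cosine(query, embed(t)), s) for t, s in EXEMPLARS), reverse=True)[:k]
    votes = Counter(s for sim, s in nearest if sim >= min_sim)
    if not votes:
        return None  # nothing close enough: abstain
    (top, count), *rest = votes.most_common(2)
    return None if rest and rest[0][1] == count else top  # abstain on a tie


def llm_layer(prompt: str, classify: Callable[[str], str], default: str = "general") -> str:
    try:
        name = json.loads(classify(prompt))["specialist"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return default  # malformed output falls back to the default brain
    return name if isinstance(name, str) and name in SPECIALISTS else default


def route(prompt: str, classify: Callable[[str], str]) -> Tuple[str, str]:
    """Return (specialist, layer that decided), escalating only on abstention."""
    hit = regex_layer(prompt)
    if hit:
        return hit, "regex"
    hit = embedding_layer(prompt)
    if hit:
        return hit, "embedding"
    return llm_layer(prompt, classify), "llm"
```

Each layer returns `None` to abstain, so the model call is only paid for when both cheaper layers decline to answer.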
Inter-specialist handoff → Any specialist can ask any other specialist a sub-question. Handoffs are depth-limited, whitelisted, and inlined into the final reply as a labeled blockquote so you see the full chain of reasoning.
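One way to picture a depth-limited, whitelisted handoff is the sketch below. The `WHITELIST` contents, the `MAX_DEPTH` value of 2, the label format, and the `run`/`ask` names are assumptions made for illustration; the source does not specify any of them.

```python
from typing import Callable, Dict, Set

MAX_DEPTH = 2  # assumed limit on nested handoffs
# Assumed whitelist: which specialist may consult which.
WHITELIST: Dict[str, Set[str]] = {
    "science": {"math"},
    "biology": {"science"},
}

Ask = Callable[[str, str], str]         # (target, sub_question) -> quoted reply
Specialist = Callable[[str, Ask], str]  # (question, ask) -> answer


def run(name: str, question: str, specialists: Dict[str, Specialist], depth: int = 0) -> str:
    """Run one specialist, giving it an `ask` hook for sub-questions."""

    def ask(target: str, sub_question: str) -> str:
        # Refuse handoffs that are too deep or not whitelisted for this caller.
        if depth + 1 > MAX_DEPTH or target not in WHITELIST.get(name, set()):
            return ""
        reply = run(target, sub_question, specialists, depth + 1)
        body = "\n".join(f"> {line}" for line in reply.splitlines())
        # Inline the reply as a labeled blockquote so the chain stays visible.
        return f"> [handoff: {name} -> {target}]\n{body}"

    return specialists[name](question, ask)
```

Because the quoted reply is returned to the calling specialist as text, nested handoffs appear as nested blockquotes in the final answer.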
The full roster
Eleven local Ollama models, each picked for what it actually does best. Nothing leaves the box — no third-party model APIs, no cloud inference, no telemetry.
Default conversational brain. Tool-use, multi-step reasoning, vision, CoT scaffolding. Multimodal — handles image uploads itself.
Tiny, fast classifier. Strict JSON output when L2 regex and L3 embeddings abstain.
Grades nightly probes and proposed reasoning patterns 0–10. Only patterns above threshold (and that survive verify) get persisted.
Software dev, code review, debugging, architecture, ML/fine-tuning. Loaded on demand for [ACTION: code].
Derivations, proofs, calculus, linear algebra, statistics, classical & quantum mechanics.
Molecular biology, genetics, physiology, pharmacology, clinical reasoning. Pairs with PubMed for evidence-grounded answers.
Offensive & defensive security: CVE analysis, exploit reasoning, hardening, SOC / incident response, secure code review.
Chemistry, astrophysics, quantum mechanics, and interdisciplinary questions that don't fit math or bio cleanly.
Modern languages with formality + glossary controls (30-day cache), and philology of dead and historic languages — Latin, Ancient Greek, Sanskrit, Old Norse, Akkadian, and more.
Robotics
Optional ROS2 bridge to a connected robot: read camera, list topics, drive Twist commands, and run scripted poses. Admin-gated, single-publish, magnitude-clamped — robotics features are off by default.
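As an illustration of what "admin-gated" and "magnitude-clamped" could mean in practice, here is a small sketch that needs no ROS2 installation. The `Twist2D` type, the limit values, and the `safe_twist` function are hypothetical; a real bridge would work with the ROS2 `geometry_msgs/Twist` message and its own configured limits.

```python
from dataclasses import dataclass

MAX_LINEAR = 0.3   # m/s, assumed limit
MAX_ANGULAR = 1.0  # rad/s, assumed limit


@dataclass(frozen=True)
class Twist2D:
    linear_x: float   # forward velocity
    angular_z: float  # yaw rate


def clamp(value: float, limit: float) -> float:
    """Clip value into the range [-limit, limit]."""
    return max(-limit, min(limit, value))


def safe_twist(cmd: Twist2D, robotics_enabled: bool = False, is_admin: bool = False) -> Twist2D:
    """Return a clamped command, refusing unless robotics is on and the caller is an admin."""
    if not (robotics_enabled and is_admin):
        raise PermissionError("robotics is off by default and admin-gated")
    return Twist2D(clamp(cmd.linear_x, MAX_LINEAR), clamp(cmd.angular_z, MAX_ANGULAR))
```

Both flags default to `False`, mirroring the off-by-default stance described above.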
Smart-routing JSON API
The same routing brain, exposed as a clean JSON API for your apps. Bearer-token auth, scoped permissions, per-token rate limits. Not a public service — every call is audited.
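Per-token rate limiting can be implemented in several ways, and the source does not say which one bcat uses. The sketch below shows a token bucket, one common choice, keyed by bearer token. The class name, parameters, and injectable clock are all assumptions for the example.

```python
import time
from typing import Callable, Dict, Tuple


class PerTokenLimiter:
    """Token-bucket limiter with one bucket per bearer token (illustrative only)."""

    def __init__(self, rate_per_sec: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate_per_sec  # credits refilled per second
        self.burst = burst        # maximum credits a bucket can hold
        self.clock = clock        # injectable for testing
        self._buckets: Dict[str, Tuple[float, float]] = {}  # token -> (credits, last_seen)

    def allow(self, bearer_token: str) -> bool:
        now = self.clock()
        credits, last = self._buckets.get(bearer_token, (float(self.burst), now))
        # Refill in proportion to elapsed time, capped at the burst size.
        credits = min(float(self.burst), credits + (now - last) * self.rate)
        if credits < 1.0:
            self._buckets[bearer_token] = (credits, now)
            return False  # over the limit for this token
        self._buckets[bearer_token] = (credits - 1.0, now)
        return True
```

Keeping a separate bucket per token means one noisy client cannot exhaust another client's allowance.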
Read the API docs
Built lean
The whole stack, in plain text.
Ready to think with a smart cat?
Open the chat and try it. Sign in with Google to save conversation history and unlock persistent memory — everything else works immediately.