Live at chat.bcat.app

A smart cat that thinks with you.

bcat is an AI chat agent that researches, writes, runs code, remembers, and reasons — privately. Everything stays on your side. No cloud ears, no data sold, no surveillance.

Private by design

Your conversations never leave your control. No telemetry, no third-party model calls, no cloud processing — ever.

Real research

Searches the web, reads pages, finds academic papers, browses Wikipedia, and digs into technical Q&A — all in one conversation.

Self-improving

bcat reflects on its own answers and stores high-confidence reasoning patterns — getting sharper the more you use it.

Continuously self-correcting

A nightly self-test that finds its own mistakes

Every night, bcat runs a battery of probes against itself, grades its answers with a second model, and only keeps the lessons that actually fix the failure. The reasoning patterns that survive are added to its working memory — so next time, it doesn't make the same mistake.

  1. Probe

    A fixed battery of tricky questions covers math, logic, false premises, prompt-injection robustness, hedging, and conciseness.

  2. Grade

    A second model strictly grades each answer as PASS, WEAK, or FAIL. PASSes are left alone; only failures move forward.

  3. Extract a fix

    For each failure, the judge proposes one behavioral pattern that would have prevented it, scored 0–10. Low-quality or duplicate fixes are dropped.

  4. Verify or roll back

    The probe is re-run with the new pattern injected. If it doesn't actually fix the failure, the pattern is rolled back. No false lessons stick.
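For the curious, the four steps above can be sketched in a few lines of Python. This is a minimal illustration, not bcat's actual code: `nightly_self_test`, `PatternStore`, the three callables, and the `min_score=7` threshold are all hypothetical names and values, and the sketch assumes WEAK and FAIL grades are treated alike.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical interfaces; the page does not publish bcat's internals.
Answerer = Callable[[str, List[str]], str]        # (question, active patterns) -> answer
Grader = Callable[[str, str], str]                # (question, answer) -> "PASS" | "WEAK" | "FAIL"
Proposer = Callable[[str, str], Tuple[str, int]]  # (question, bad answer) -> (pattern, score 0-10)


@dataclass
class PatternStore:
    """Working memory of reasoning patterns that survived verification."""
    patterns: List[str] = field(default_factory=list)


def nightly_self_test(probes: List[str], answer: Answerer, grade: Grader,
                      propose: Proposer, store: PatternStore,
                      min_score: int = 7) -> List[str]:
    """Run probe -> grade -> extract -> verify and return the patterns kept."""
    kept: List[str] = []
    for question in probes:
        first_try = answer(question, store.patterns)
        if grade(question, first_try) == "PASS":
            continue  # passes are left alone
        pattern, score = propose(question, first_try)
        if score < min_score or pattern in store.patterns:
            continue  # drop low-quality or duplicate fixes
        # Verify: re-run the same probe with the candidate pattern injected.
        retry = answer(question, store.patterns + [pattern])
        if grade(question, retry) == "PASS":
            store.patterns.append(pattern)
            kept.append(pattern)
        # Otherwise the candidate is never persisted, which is the roll back.
    return kept
```

Because the candidate pattern is only appended after the re-run passes, a fix that does not change the outcome leaves the store exactly as it was.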

Why it matters → Most chat AIs are static — the model that shipped is the model you talk to forever. bcat keeps a private journal of what tripped it up and how to avoid it next time, verified end-to-end before anything is added. Your conversations make bcat sharper without ever leaving your account.

Everything you need in one conversation

bcat routes your question to the right capability automatically — no prompts to memorize, no plugins to manage.

Web research

Live web search, full-page reading, Wikipedia, Hacker News, Wayback Machine, and more — cited and summarised.

Code & math

Writes, explains, and runs code in a safe sandbox. Solves equations and proofs with a dedicated math specialist.

Persistent memory

Tell bcat something once and it remembers. Ask it to recall, search, or forget — your memory is yours to control.

Image understanding

Upload a screenshot, photo, diagram, or document and ask anything about it. bcat describes, analyzes, and answers.

Academic papers

Searches arXiv, PubMed, and OpenAlex for peer-reviewed research — summarised in plain language, with citations.

Smart routing

Hard questions get handed to domain specialists — biology, security, science, ancient history — automatically, without you having to ask.

Translation

Translates text across dozens of languages and explains nuance, idiom, and context — not just literal word substitution.

Live facts

Current weather, country data, time zones, definitions — real answers from live sources, not stale training data.

Developer tools

Searches GitHub repos and Stack Exchange threads, explains error messages, and helps debug — without leaving the chat.

30+ agent tools
7 specialists
3 router layers
100% local Ollama

Under the hood

How a question finds the right brain

bcat doesn't pick a model — it picks a specialist. Every prompt passes through a three-layer router that escalates only when the cheap layers can't decide. If a specialist needs help, it can hand off to another specialist mid-answer.

  1. L2

    Regex routing

    Pattern-matches the obvious cases in roughly a millisecond: code, math, translation, ancient-language hints. No model call, no token cost.

  2. L3

    Semantic embedding

    k-nearest-neighbor (kNN) lookup over a curated bank of exemplar embeddings. Catches the cases where the regex layer abstains but the meaning is unambiguous.

  3. L4

    LLM classifier

    A tiny qwen3:1.7b classifier returns strict JSON { "specialist": "..." } when the lower layers both abstain.
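A rough sketch of how such an escalating router could be wired together, for readers who think in code. Everything concrete here is an assumption made for illustration: the regex rules, the specialist names, the `k=3` and `min_sim=0.8` settings, and the `embed` and `classify` callables (stand-ins for an embedding model and the small LLM classifier) are hypothetical, not bcat's real configuration.

```python
import json
import math
import re
from typing import Callable, List, Optional, Sequence, Set, Tuple

# Hypothetical rules and specialist names, for illustration only.
REGEX_RULES = [
    (re.compile(r"\bdef \w+\(|\bTraceback\b|\bimport \w+"), "code"),
    (re.compile(r"\b(integral|derivative|eigenvalue|prove that)\b", re.I), "math"),
    (re.compile(r"\btranslate\b.*\b(to|into)\b", re.I), "translate"),
]
Bank = List[Tuple[Sequence[float], str]]  # (exemplar embedding, specialist label)


def route_regex(prompt: str) -> Optional[str]:
    """Layer 1 of 3: cheap pattern match, no model call."""
    for pattern, specialist in REGEX_RULES:
        if pattern.search(prompt):
            return specialist
    return None


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def route_knn(vector: Sequence[float], bank: Bank,
              k: int = 3, min_sim: float = 0.8) -> Optional[str]:
    """Layer 2 of 3: majority vote over the k nearest exemplars; abstain if the best match is weak."""
    scored = sorted(((_cosine(vector, v), label) for v, label in bank), reverse=True)[:k]
    if not scored or scored[0][0] < min_sim:
        return None
    votes: dict = {}
    for _, label in scored:
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


def route_llm(prompt: str, classify: Callable[[str], str], allowed: Set[str]) -> Optional[str]:
    """Layer 3 of 3: parse the classifier's strict JSON reply; abstain on anything malformed."""
    try:
        reply = json.loads(classify(prompt))
    except (json.JSONDecodeError, TypeError):
        return None
    if not isinstance(reply, dict):
        return None
    name = reply.get("specialist")
    return name if isinstance(name, str) and name in allowed else None


def route(prompt: str, embed: Callable[[str], Sequence[float]], bank: Bank,
          classify: Callable[[str], str], allowed: Set[str], default: str = "main") -> str:
    """Escalate only when the cheaper layer abstains (returns None)."""
    return (route_regex(prompt)
            or route_knn(embed(prompt), bank)
            or route_llm(prompt, classify, allowed)
            or default)
```

The short-circuiting `or` chain is what keeps the cheap path cheap: if the regex layer answers, neither the embedding call nor the classifier call is ever made.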

Inter-specialist handoff → Any specialist can ask any other specialist a sub-question. Depth-limited, whitelisted, and inlined into the final reply as a labeled blockquote so you see the full chain of reasoning.
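One way a depth-limited, whitelisted handoff could look, as a minimal sketch. The names `ask_specialist`, `ALLOWED_TARGETS`, and `MAX_DEPTH`, the limit of 2, and the exact blockquote label format are assumptions for illustration; the page does not publish the real values.

```python
from typing import Callable, Dict

# Hypothetical whitelist and depth limit.
ALLOWED_TARGETS = {"code", "math", "bio", "security", "science", "translate"}
MAX_DEPTH = 2

# A specialist takes (question, current depth) and returns plain-text output.
Specialist = Callable[[str, int], str]


def ask_specialist(target: str, question: str,
                   specialists: Dict[str, Specialist], depth: int = 0) -> str:
    """Return the target's answer as a labeled blockquote, or a refusal note."""
    if depth >= MAX_DEPTH:
        return f"> [{target}] (handoff refused: depth limit reached)"
    if target not in ALLOWED_TARGETS or target not in specialists:
        return f"> [{target}] (handoff refused: not whitelisted)"
    # Pass depth + 1 down so nested handoffs count against the same limit.
    answer = specialists[target](question, depth + 1)
    quoted = "\n".join(f"> {line}" for line in answer.splitlines())
    return f"> [{target}]\n{quoted}"
```

Quoting every line of the sub-answer under a `[target]` label is what lets the final reply show who said what.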

The full roster

Eleven local Ollama models, each picked for what it actually does best. Nothing leaves the box — no third-party model APIs, no cloud inference, no telemetry.

★ Main agent
gemma4:e2b

Default conversational brain. Tool use, multi-step reasoning, vision, chain-of-thought (CoT) scaffolding. Multimodal: handles image uploads itself.

⚡ Router (L4)
qwen3:1.7b

Tiny, fast classifier. Strict JSON output when L2 regex and L3 embeddings abstain.

⚖ Self-learning judge
qwen3:8b · mistral:7b

Grades nightly probes (PASS · WEAK · FAIL) and scores proposed reasoning patterns 0–10. Only patterns that score above the threshold and survive verification are persisted.

</> Code specialist
deepseek-coder-v2:16b

Software dev, code review, debugging, architecture, ML/fine-tuning. Loaded on demand for [ACTION: code].

∑ Math specialist
mathstral:7b

Derivations, proofs, calculus, linear algebra, statistics, classical & quantum mechanics.

⚕ Bio / biomedical
meditron:7b

Molecular biology, genetics, physiology, pharmacology, clinical reasoning. Pairs with PubMed for evidence-grounded answers.

⚠ Security specialist
whiterabbitneo-v3:7b

Offensive & defensive security: CVE analysis, exploit reasoning, hardening, SOC / incident response, secure code review.

⚛ Science fallback
mistral:7b

Chemistry, astrophysics, quantum mechanics, and interdisciplinary questions that don't fit math or bio cleanly.

🌐 Translate / ancient
aya-expanse:8b

Modern languages with formality + glossary controls (30-day cache), and philology of dead and historic languages — Latin, Ancient Greek, Sanskrit, Old Norse, Akkadian, and more.

Robotics

Optional ROS2 bridge to a connected robot: read camera, list topics, drive Twist commands, and run scripted poses. Admin-gated, single-publish, magnitude-clamped — robotics features are off by default.
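To make "magnitude-clamped" concrete, here is a small sketch of clamping a velocity command before it is published. It deliberately avoids any ROS2 import: `Twist2D` is a hypothetical stand-in for the two fields of a Twist message that a ground robot typically uses, and the limits of 0.5 m/s and 1.0 rad/s are assumed values, not bcat's.

```python
import math
from dataclasses import dataclass

# Assumed safety limits, for illustration only.
MAX_LINEAR = 0.5   # metres per second
MAX_ANGULAR = 1.0  # radians per second


@dataclass(frozen=True)
class Twist2D:
    """Hypothetical stand-in for the linear.x and angular.z fields of a Twist."""
    linear_x: float
    angular_z: float


def _clamp(value: float, limit: float) -> float:
    if not math.isfinite(value):
        raise ValueError("velocity must be a finite number")
    return max(-limit, min(limit, value))


def clamp_twist(cmd: Twist2D) -> Twist2D:
    """Return a copy of cmd with each component limited to its maximum magnitude."""
    return Twist2D(_clamp(cmd.linear_x, MAX_LINEAR), _clamp(cmd.angular_z, MAX_ANGULAR))
```

For example, `clamp_twist(Twist2D(2.0, -3.0))` yields `Twist2D(linear_x=0.5, angular_z=-1.0)`, while a command already inside the limits passes through unchanged.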

Smart-routing JSON API

The same routing brain, exposed as a clean JSON API for your apps. Bearer-token auth, scoped permissions, per-token rate limits. Not a public service — every call is audited.
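The three access controls named above (bearer token, scope, per-token rate limit) could be checked in roughly this order. This is a sketch under stated assumptions, not the documented API: `ApiToken`, `authorize`, the scope name used in the example, and the 60-second sliding window are all hypothetical, and a real service would also store hashed tokens and write an audit record.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, List, Optional, Tuple


@dataclass
class ApiToken:
    """Hypothetical token record: granted scopes plus a sliding-window counter."""
    scopes: FrozenSet[str]
    limit_per_minute: int
    recent: List[float] = field(default_factory=list)  # timestamps of accepted calls


def authorize(header: Optional[str], scope: str, tokens: Dict[str, ApiToken],
              now: Optional[float] = None) -> Tuple[int, str]:
    """Return an HTTP-style (status, reason) pair for one request."""
    now = time.monotonic() if now is None else now
    if not header or not header.startswith("Bearer "):
        return 401, "missing bearer token"
    token = tokens.get(header[len("Bearer "):])
    if token is None:
        return 401, "unknown token"
    if scope not in token.scopes:
        return 403, "scope not granted"
    # Keep only calls from the last 60 seconds, then check the per-token budget.
    token.recent = [t for t in token.recent if now - t < 60.0]
    if len(token.recent) >= token.limit_per_minute:
        return 429, "rate limit exceeded"
    token.recent.append(now)
    return 200, "ok"
```

With a token limited to two calls per minute, the third call inside the same minute gets a 429, and the budget frees up again once the earlier calls age out of the window.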

Read the API docs

Built lean

The whole stack, in plain text.

Python, Flask, Gunicorn, Ollama, PostgreSQL 18, Google OAuth + Step-Up, PQC TLS 1.3 (X25519MLKEM768), 3-Layer Router, Inter-Specialist ask, SSE Streaming, Continuous Self-Learning, Nightly Verify-or-Roll-Back, Docker Code Sandbox, ROS2 / rosbridge, Brave + SearXNG, arXiv · PubMed · OpenAlex, Smart-Routing API

Ready to think with a smart cat?

Open the chat and try it. Sign in with Google to save conversation history and unlock persistent memory — everything else works immediately.