Case study · 2025 · EdTech · AI

An AI math tutor that streams formulas, generates videos, and listens

Mathmatika is a Synara product — streaming Claude explanations with KaTeX-rendered formulas, on-demand Manim animation videos with a self-healing codegen pipeline, RAG over uploaded PDFs using in-memory BM25Plus (not a vector database), and a realtime voice mentor on LiveKit + Deepgram. One LLM provider end-to-end. One FastAPI backend. One typed contract.

Client: Synara product
Category: AI tutoring · Generative video · Realtime voice
Year: 2025
Engagement: Built and operated by Synara · advanced prototype

Mathmatika AI math tutoring platform — streaming explanations, KaTeX formulas, on-demand Manim animation videos, and a realtime voice mentor.

01 — At a glance

Provider count: 1 LLM. Anthropic Claude end-to-end. No multi-provider routing complexity.
Vector databases: 0. In-memory BM25Plus for PDF RAG. Whole stack fits on a laptop.
Self-healing retries: ≤ 2. Manim codegen validates → renders → retries on stderr → falls back.

Summary

Mathmatika is an AI math and DSA learning platform built at Synara. Students upload PDFs, ask questions, and get back streaming explanations with KaTeX-rendered formulas, AI-generated Manim animation videos for the harder concepts, and a realtime voice mentor that listens, transcribes, and responds. The whole product runs on a single LLM provider (Anthropic Claude), a single FastAPI backend, and a single typed contract between backend and frontend. The non-obvious technical bets — BM25Plus over a vector database, Manim self-healing codegen with template fallback, an 80-character KaTeX-aware streaming batcher — are what make the product feel native instead of stitched.

The problem

AI tutors answer in prose, and math does not live in prose.

AI tutoring products have a generic shape. A chat surface. An LLM behind it. Some retrieval over a textbook corpus. The user asks a math question. The LLM answers in prose. The user closes the tab because prose is the wrong medium for math.

Math is a visual subject. The notation is the substance. A derivative explained as words is not an explanation of the derivative; it's a description of one. A concept like a Fourier transform, an eigenvector, a recurrence relation lives in pictures and in expressions, not in sentences.

“Prose is the wrong medium for math. The notation is the substance. We wanted the visual layer to be first-class.”

We wanted Mathmatika to ship the visual layer as a first-class part of the answer. That meant three things. First, streaming math notation that actually rendered — KaTeX without flickering, formulas without breaking, expressions arriving in coherent chunks. Second, generated animation videos for the concepts where the picture is the explanation — and a codegen pipeline reliable enough that the videos actually rendered most of the time. Third, a voice mentor for the conversations that wanted to be conversations, not chat threads.

And we wanted to ship all of it on a stack we could actually operate. One LLM provider. One backend framework. One typed contract. Anything else and the platform would be a distributed-systems demo wearing an EdTech costume.

Mathmatika streaming an explanation: KaTeX-rendered formulas appearing mid-response without flicker, citations anchored to the source PDF on the right.

Math arrives in coherent chunks. Formulas render the moment they're complete. Citations stay anchored.

02 — Why us

Why Synara owns this

Mathmatika is a Synara product, not a client engagement. We built it because the existing AI math tutors did not respect the visual medium and we wanted to find out whether a different stack could. The technical bets — streaming-aware rendering, self-healing codegen, in-memory retrieval — are the kind of thing we'd otherwise be recommending to a client building an EdTech product. Imagika and Mathmatika are working reference implementations of the architectures we believe in.

Synara is also the right shape of team for this product specifically. Streaming UX is a design problem first and an engineering problem second; the way KaTeX should *feel* arriving from a stream is a craft decision, not an algorithmic one. The voice mentor is a real-time-audio problem with a sensory-design layer on top. We do both registers in the same team.

03 — What we built

Streaming math, generated video, and a voice mentor on one stack.

Mathmatika rendered on a MacBook — streaming math explanation on the left, a generated Manim animation playing alongside, voice mentor dock at the bottom.

Streaming explanations with KaTeX-rendered formulas

Claude's responses stream into the UI in 80-character batches through a KaTeX-aware batcher, so a partial LaTeX expression never reaches the renderer mid-token. Math appears in coherent chunks, never flickers, never throws. The streaming feels native to the medium rather than fighting it.
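The batching idea can be sketched in a few lines. This is a minimal illustration, not Mathmatika's actual batcher: the names `KatexAwareBatcher` and `ends_mid_expression` are hypothetical, the delimiter set (`$`, `$$`, `\(`, `\[`) is an assumption, and it is written in Python for readability even though the real batcher sits in the frontend. It holds text back until at least 80 characters have arrived and the buffer does not end inside an open math expression.

```python
from typing import Optional


def ends_mid_expression(text: str) -> bool:
    """Return True if text ends inside an unclosed math span or on a dangling backslash."""
    closer: Optional[str] = None  # delimiter that would close the open math span
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\":
            if i + 1 == len(text):
                return True  # the command may continue in the next chunk
            pair = text[i:i + 2]
            if closer is None and pair in (r"\(", r"\["):
                closer = r"\)" if pair == r"\(" else r"\]"
            elif closer == pair:
                closer = None
            i += 2  # skip the escaped character (covers \$ and \\ too)
            continue
        if ch == "$":
            if closer == "$":
                delim = "$"
            else:
                delim = "$$" if text.startswith("$$", i) else "$"
            if closer is None:
                closer = delim
            elif closer == delim:
                closer = None
            i += len(delim)
            continue
        i += 1
    return closer is not None


class KatexAwareBatcher:
    """Buffers streamed text and releases it only at math-safe boundaries."""

    def __init__(self, min_chars: int = 80) -> None:
        self.min_chars = min_chars
        self._buffer = ""

    def feed(self, chunk: str) -> Optional[str]:
        """Add a streamed chunk; return a batch when one is safe to render."""
        self._buffer += chunk
        if len(self._buffer) >= self.min_chars and not ends_mid_expression(self._buffer):
            return self._take()
        return None

    def flush(self) -> str:
        """Release whatever is left when the stream ends."""
        return self._take()

    def _take(self) -> str:
        out, self._buffer = self._buffer, ""
        return out
```

A caller would pass every streamed chunk to `feed`, hand any returned batch to the renderer, and call `flush` once when the stream closes. The sketch treats every unescaped `$` as a math delimiter, so a literal currency sign would need escaping upstream.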

Manim animation videos with self-healing codegen

For concepts where the picture is the explanation, Claude writes Manim Python. The pipeline validates, renders via subprocess, parses stderr on failure, asks Claude to fix the code, retries up to twice, and falls back to a curated template if the retries don't land. SHA-256 cache keyed by concept + quality so a video is only ever rendered once.
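The control flow of that loop can be sketched as follows. This is an illustration under stated assumptions, not the production pipeline: `render_with_self_healing`, `RenderResult`, and the four injected callables (`generate`, `fix`, `render`, `fallback`) are hypothetical names, and validation is reduced to a syntax check with the standard-library `ast` module. In a real deployment `generate` and `fix` would call Claude, and `render` would run Manim in a subprocess and capture its stderr.

```python
import ast
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RenderResult:
    ok: bool
    video_path: Optional[str] = None
    stderr: str = ""


def is_valid_python(code: str) -> bool:
    """Cheap validation step: does the generated code at least parse?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def render_with_self_healing(
    concept: str,
    generate: Callable[[str], str],        # concept -> Manim source (LLM call)
    fix: Callable[[str, str], str],        # (source, error text) -> repaired source
    render: Callable[[str], RenderResult], # source -> render outcome
    fallback: Callable[[str], str],        # concept -> path of a curated template video
    max_retries: int = 2,
) -> str:
    """Try the generated code, repair it up to max_retries times, then fall back."""
    code = generate(concept)
    for attempt in range(max_retries + 1):  # one initial attempt plus the retries
        if is_valid_python(code):
            result = render(code)
            if result.ok and result.video_path is not None:
                return result.video_path
            error = result.stderr
        else:
            error = "SyntaxError: generated code did not parse"
        if attempt < max_retries:
            code = fix(code, error)  # feed the failure back to the model
    return fallback(concept)
```

Injecting the four steps as callables keeps the retry policy testable without an LLM or a Manim install; the cap on retries bounds both latency and token spend per video.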

Realtime voice mentor on LiveKit + Deepgram

A live conversation surface backed by LiveKit for transport, Deepgram for STT, OpenAI TTS for the voice, and Silero VAD for turn detection. The voice mentor uses the same Claude context as the chat — same retrieved evidence, same lesson topology — so the medium is interchangeable mid-session.

RAG over PDFs with in-memory BM25Plus

Students upload textbook PDFs and ask questions against them. We chose in-memory BM25Plus with sentence-transformers embeddings over a vector database because the retrieval quality is already strong for textbook-style content and the operational complexity of a second data store wasn't worth the latency win. Whole stack fits on a laptop.
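To make the in-memory choice concrete, here is a hand-rolled BM25+ scorer in pure Python. It is a sketch, not the code Mathmatika runs: the class name `BM25PlusIndex`, the regex tokenizer, and the parameter defaults (`k1=1.5`, `b=0.75`, `delta=1.0`) are assumptions chosen for illustration, and the embedding side of retrieval is left out.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and split on non-alphanumerics (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


class BM25PlusIndex:
    """In-memory BM25+ over a list of paragraphs."""

    def __init__(self, paragraphs: list[str], k1: float = 1.5, b: float = 0.75, delta: float = 1.0) -> None:
        if not paragraphs:
            raise ValueError("at least one paragraph is required")
        self.k1, self.b, self.delta = k1, b, delta
        self.docs = [tokenize(p) for p in paragraphs]
        self.term_freqs = [Counter(doc) for doc in self.docs]
        self.avg_len = sum(len(doc) for doc in self.docs) / len(self.docs)
        doc_freq = Counter(term for doc in self.docs for term in set(doc))
        n = len(self.docs)
        # BM25+ style IDF: always positive, so common terms never subtract from a score.
        self.idf = {term: math.log((n + 1) / df) for term, df in doc_freq.items()}

    def scores(self, query: str) -> list[float]:
        terms = [t for t in tokenize(query) if t in self.idf]
        out = []
        for doc, tf in zip(self.docs, self.term_freqs):
            length_norm = self.k1 * (1 - self.b + self.b * len(doc) / self.avg_len)
            score = 0.0
            for term in terms:
                f = tf[term]
                # delta is the BM25+ lower bound added to the usual BM25 term weight.
                score += self.idf[term] * (self.delta + f * (self.k1 + 1) / (f + length_norm))
            out.append(score)
        return out

    def top_k(self, query: str, k: int = 3) -> list[int]:
        """Return the indices of the k best-scoring paragraphs."""
        ranked = sorted(enumerate(self.scores(query)), key=lambda pair: pair[1], reverse=True)
        return [index for index, _ in ranked[:k]]
```

Everything lives in ordinary Python lists and dicts, which is the point of the design: a single textbook's worth of paragraphs can be indexed at upload time with no external service to deploy or keep in sync.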

Source attribution on every retrieved answer

Every RAG-grounded response cites the page and paragraph it drew from. Citations are clickable and open the source PDF at the cited paragraph. Students see what the answer was built from — not just the answer.
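One way to make that attribution possible is to record the page and paragraph position on every chunk at ingestion time, so whatever retrieval returns already carries its own citation. The sketch below assumes the PDF text has already been extracted into one string per page and that paragraphs are separated by blank lines; `Chunk`, `segment`, and `cite` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    page: int       # 1-based page number in the source PDF
    paragraph: int  # 1-based paragraph position within that page


def segment(pages: list[str]) -> list[Chunk]:
    """Split extracted page text into paragraph chunks that remember where they came from."""
    chunks = []
    for page_no, page_text in enumerate(pages, start=1):
        paragraphs = [p.strip() for p in page_text.split("\n\n") if p.strip()]
        for para_no, paragraph in enumerate(paragraphs, start=1):
            chunks.append(Chunk(text=paragraph, page=page_no, paragraph=para_no))
    return chunks


def cite(chunk: Chunk) -> str:
    """Human-readable citation label for a retrieved chunk."""
    return f"p. {chunk.page}, para. {chunk.paragraph}"
```

Because the location travels with the chunk, the UI only has to turn `page` and `paragraph` into a link target; no second lookup is needed after retrieval.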

Type-safe FastAPI ↔ Next.js contract

The OpenAPI schema is emitted by FastAPI and consumed by the Next.js frontend via openapi-typescript + openapi-fetch + TanStack Query. The frontend has zero hand-written API client code; the backend has zero hand-maintained types. Breaking the contract is a build error, not a Tuesday-afternoon support ticket.

Manim self-healing pipeline diagram — Claude code generation, subprocess render, stderr parse, retry loop, cached final video.

Codegen → validate → render → on-error retry → fallback template. SHA-256 cache so a video is only rendered once.
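The cache key described above can be sketched with the standard-library `hashlib`. The function names, the whitespace and case normalization, and the `|` separator are assumptions made for illustration; the source only states that the key is a SHA-256 over concept and quality.

```python
import hashlib
from typing import Callable


def video_cache_key(concept: str, quality: str) -> str:
    """SHA-256 hex digest over the normalized concept and the quality setting."""
    normalized = " ".join(concept.lower().split())  # assumption: case and spacing are ignored
    payload = f"{normalized}|{quality}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def get_or_render(cache: dict, concept: str, quality: str,
                  render: Callable[[str, str], str]) -> str:
    """Return the cached video path, rendering only on the first request for a key."""
    key = video_cache_key(concept, quality)
    if key not in cache:
        cache[key] = render(concept, quality)
    return cache[key]
```

A plain dict stands in for the cache here; a file-system or object-store lookup keyed by the same digest would follow the same shape.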

04 — Architecture

In plain English, why each choice.

Frontend: Next.js 15 + React 19 on Vercel. KaTeX for math rendering, with a streaming batcher between the SSE stream and the renderer to handle partial LaTeX expressions. TanStack Query for server data; openapi-typescript for the API client types. Framer Motion for the small entrance and feedback animations that make the product feel alive rather than clinical.
Backend: FastAPI on Python 3.12+, deployed via Docker to a GCP VM. Async-first throughout. The backend exposes a typed OpenAPI schema consumed end-to-end by the frontend. Streaming responses use SSE; the voice mentor uses LiveKit for transport.
AI orchestration: Anthropic Claude is the only LLM provider — extended thinking enabled where the question warrants it. Multi-agent research uses parallel Exa searches with a lead-agent synthesis step for questions that need live web evidence. No multi-provider routing layer; one model, used well, is cheaper to operate than three.
RAG layer: PDF ingestion segments text into paragraphs and indexes them in memory twice: with BM25Plus (for keyword precision) and with sentence-transformer embeddings (for semantic recall). Retrieval combines both signals. The choice avoids a separate vector database: the whole stack runs on Postgres for application data and Python memory for retrieval state.
Manim pipeline: Claude generates Manim Python from a structured request. The pipeline validates syntax, attempts to render via subprocess, parses stderr for failures, and asks Claude to fix detected errors. Up to two retries. On final failure, falls back to a curated template. Generated videos cached by SHA-256 of (concept + quality).
Voice mentor: LiveKit for room and audio transport. Deepgram for streaming STT. OpenAI TTS for the spoken response. Silero VAD, running in the agent worker, for turn detection. The voice mentor shares the same Claude context as the chat, so switching media mid-session is supported without losing state.
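The RAG layer item above says retrieval combines the keyword and embedding signals but does not say how. One common option, shown here purely as an assumption and not as Mathmatika's method, is reciprocal rank fusion: each retriever contributes `1 / (k + rank)` for every paragraph it returns, and the sums decide the final order. The function name and the conventional constant `k = 60` are illustrative choices.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[int]], k: int = 60) -> list[int]:
    """Fuse several ranked lists of paragraph ids into one ranking.

    Each input list is ordered best-first. Ties are broken by the lower id
    so the output is deterministic.
    """
    scores: dict[int, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results: paragraph ids ranked by BM25+ and by embedding similarity.
bm25_ranking = [2, 0, 1]
embedding_ranking = [2, 1, 3]
fused = reciprocal_rank_fusion([bm25_ranking, embedding_ranking])
```

Rank-based fusion avoids having to put BM25 scores and cosine similarities on a common scale, which is why it is a frequent default when two retrievers disagree about units.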

05 — Outcomes

What changed after launch.

Operational simplicity: 1 LLM provider. Anthropic Claude across chat, voice, RAG, Manim codegen.
Streaming UX: Native math. KaTeX renders cleanly, with no flicker and no thrown errors.
Video generation: Self-healing. Manim codegen validates, retries, falls back. Reliable enough to ship.
Retrieval stack: 0 vector DBs. In-memory BM25Plus + embeddings. Fits on a laptop.

Methodology: Outcomes describe the deployed architecture and shipped product surfaces. Stability claims are observations from operating the advanced prototype with internal and pilot users; the streaming and Manim pipelines have been exercised against real student questions from those users.

“The first AI tutor where the formulas actually rendered and the animations actually played. The medium finally fits the subject.”

Pilot student · Undergraduate · math + CS

06 — What’s next

Hardening the Manim render pipeline against the long tail of edge cases. Expanding the RAG layer to handle full course corpora, not just single textbooks. A study-plan layer that uses the same graph primitive as Orno — concepts as nodes, dependencies as edges — so the tutor can recommend the next-best topic, not just answer the question in front of it.

Stack

Next.js 15
React 19
KaTeX
TanStack Query
openapi-typescript
openapi-fetch
Framer Motion
FastAPI
Python 3.12
Anthropic Claude
Exa
BM25Plus
sentence-transformers
Manim
LiveKit
Deepgram
OpenAI TTS
Silero VAD
Docker
GCP
Vercel

Building an AI learning product where the medium matters?

Generic chat over a textbook is the wrong product. Subjects with their own grammar — math, music, chemistry, formal logic, programming — need streaming, visualization, voice, and retrieval that respect the medium. We can scope a Mathmatika-shaped product for any subject where the notation is the substance.

Start a conversation