How ZimRoots Works — System Architecture

System Architecture

Two interfaces, one backend, one AI brain. No custom routing logic anywhere.

Interface

WhatsApp User

Phone number = identity. Auto-authenticated.

Interface

Web Browser

Session ID + auth token. Needs OTP.

WhatsApp: POST /webhook (Twilio/Meta) • Web: POST /api/chat (JSON)

Backend (single file)

Express Server — chatbot/src/server.js

2,600 lines. converse() function handles both channels identically.

In-memory sessions (Map, 30-min timeout) • Message persistence • Image processing • Rate limiting

↓↓↓ anthropic.messages.create() with tools[] ↓↓↓

AI Brain

Claude Sonnet 4 — Agentic Tool-Use Loop

System prompt (spec.md ~7,500 words) + 18 tools. Claude decides what to do. Loop continues until text reply.

$3/M input tokens • $15/M output tokens • Every call logged to api_costs

↓ tool calls ↓ ↓ tool calls ↓ ↓ vision API ↓

Database

Supabase (Postgres + pgvector)

users, businesses, manufacturers, products, events, customers, services, jobs, messages, flags, api_costs

External APIs

Serper (Google) • BulkSMS • Resend

Web search, SMS verification, email verification

Backend Files

server.js (logic) + spec.md (personality)

Tools Available to Claude

search, register, verify, create, edit, remove, flag, image, scan, website, web_search, fetch, instructions, admin

Entity Types

Business, Manufacturer, Product, Event, Customer, Service, Job, User

API Call Sites

Conversation, tool-use loop, scan_image (vision), website_gen
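The in-memory session store listed in the backend box above could look roughly like the following. This is a minimal sketch, not the code in server.js: the names `getSession`, `sweepExpired`, and `SESSION_TTL_MS`, the session fields, and the sliding (rather than fixed) expiry are all assumptions. Only the Map and the 30-minute timeout come from the description.

```javascript
// Hypothetical sketch of a Map-backed session store with an idle timeout.
const SESSION_TTL_MS = 30 * 60 * 1000; // 30-minute timeout from the overview
const sessions = new Map();

// Return the live session for this ID, or start a fresh one if missing or expired.
function getSession(sessionId, channel, now = Date.now()) {
  const existing = sessions.get(sessionId);
  if (existing && now - existing.lastSeen < SESSION_TTL_MS) {
    existing.lastSeen = now; // assumed sliding expiry: activity keeps the session alive
    return existing;
  }
  const fresh = { id: sessionId, channel, messages: [], lastSeen: now };
  sessions.set(sessionId, fresh);
  return fresh;
}

// Drop idle sessions so the Map does not grow without bound.
function sweepExpired(now = Date.now()) {
  for (const [id, session] of sessions) {
    if (now - session.lastSeen >= SESSION_TTL_MS) sessions.delete(id);
  }
}
```

Because the store lives in process memory, a restart or redeploy would drop every active session. The messages table is what preserves history across restarts.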

How the System Prompt is Built

Every Claude API call sends the same base prompt + dynamic auth context. Here's exactly what Claude sees.

system prompt — assembled at each API call ~7,500 words • ~9,000 tokens

Part 1: spec.md (static)

# ZimRoots — AI Directory Assistant
You are ZimRoots, an AI-powered directory assistant for Harare's informal economy...

Personality rules, entity type definitions, anti-fraud warning text, two-tier verification flow, 6 conversation modes (Reader, Poster, Editor, Reporter, Image, Community Contributor), field requirements per entity type, [OPTIONS] button syntax, 10 hard rules for Claude's behavior.

Part 2: Session context (injected dynamically)

[Current user session: +263771234567]
[Auth: logged in as Chipo Mwale, phone: +263771234567, verified]

Varies per user state: "not logged in", "WhatsApp user ... not yet registered", "logged in as NAME, verified"

Part 3: Channel tag (WhatsApp only)

[Channel: WhatsApp — the user is already authenticated by their phone number. Never ask for phone number, never ask for OTP/verification. If they are not registered yet, just ask for their name.]

Prompt Assembly Code

// server.js line 1900
const systemPrompt = spec
  + `\n\n[Current user session: ${sessionId}]\n${authContext}`
  + (session.channel === 'whatsapp'
    ? `\n[Channel: WhatsApp — ...]`
    : '');
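The `authContext` string used in that snippet is not shown in the source. As a hedged illustration, the three auth states listed under Part 2 could be produced by a helper like the one below. `buildAuthContext` is a hypothetical name, and the `not logged in` and `unverified` wordings are assumptions. The other two strings are copied from the examples in this document.

```javascript
// Hypothetical helper: the real server.js may build authContext differently.
function buildAuthContext(session, user) {
  if (user) {
    // 'unverified' is an assumed label; the source only shows the verified case.
    const status = user.phone_verified ? 'verified' : 'unverified';
    return `[Auth: logged in as ${user.meta.name}, phone: ${user.phone}, ${status}]`;
  }
  if (session.channel === 'whatsapp') {
    // Wording copied from the registration workflow example in this document.
    return `[Auth: WhatsApp user ${session.registeredPhone} — authenticated but not yet registered. `
      + 'Ask for their name only, then call register_user.]';
  }
  return '[Auth: not logged in]'; // exact wording assumed; the source only names this state
}
```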
  

Why this matters for cost: The system prompt is ~9,000 tokens, and it is resent with every API call: the initial conversation call and every tool-use loop iteration. A 5-turn conversation with 3 tool calls makes 8 API calls × 9K tokens = 72K input tokens for the prompt alone. At $3/M, that is about $0.22 in uncached prompt tokens.

Database Tables

Mockups of the real Supabase tables, showing actual column types with sample data.

● users — registered people

id	phone	phone_verified	id_verified	banned	meta (jsonb)	created_at
a1b2c3d4-...	+263771234567	true	false	false	{"name":"Chipo Mwale","auth_token":"...","phone_verified_via":"sms"}	2026-04-08 10:30
e5f6g7h8-...	+263772555111	true	true	false	{"name":"Tendai M.","id_image_url":"https://...","verified_via":"whatsapp"}	2026-04-07 14:15

● businesses — named business entities

id	user_id	name	description	category	status	meta (jsonb)
b3c4d5e6-...	a1b2c3d4-...	Chipo's Vegetable Stand	Fresh vegetables: tomatoes, onions, cabbage, carrots	Produce/Fresh Foods	verified	{"location":"Mbare","contact":"+263771234567","hours":"6am-6pm","payment_methods":"Cash, EcoCash","images":["https://..."],"delivery_areas":"Mbare","slug":"chipos-vegetable-stand"}
f8g9h0i1-...	e5f6g7h8-...	Tendai's Welding Works	Custom gates, burglar bars, window frames	Metal Fabrication	verified	{"location":"Glen View","contact":"+263772555111","verified_owner":true,"images":["https://...","https://..."]}

The columns + meta pattern: Only fields used in SQL queries (WHERE, ORDER BY, JOIN, ilike search) are real columns. Everything else — location, hours, contact, images, payment methods — lives in the meta JSONB column. Adding a new field = just put it in meta. No migration needed. Claude reads the whole row anyway.
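The listing workflow later in this document names a `splitEntityInput()` step that applies this pattern. Its body is not shown, so the following is only a sketch of how such a split could work. The `REAL_COLUMNS` list is an assumption taken from the businesses example, and the handling of `entity_type` is a guess.

```javascript
// Assumed column list, taken from the businesses example; other tables may differ.
const REAL_COLUMNS = new Set(['name', 'description', 'category']);

// Sketch only: route queryable fields to columns and everything else to meta.
function splitEntityInput(input) {
  const columns = {};
  const meta = {};
  for (const [key, value] of Object.entries(input)) {
    if (key === 'entity_type') continue; // assumed to select the table, not stored as a field
    (REAL_COLUMNS.has(key) ? columns : meta)[key] = value;
  }
  return { columns, meta };
}
```

With this shape, adding a new field such as `delivery_areas` needs no code change: it falls through to `meta` automatically.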

● messages — full conversation history

id	session_id	role	content	image_url	meta (jsonb)	created_at
msg-001	+263771234567	user	I want to register my business selling vegetables in Mbare	null	{}	10:30:01
msg-002	+263771234567	assistant	Welcome to ZimRoots! / Mauya kuZimRoots! I'd love to help you list your vegetable business...	null	{"tool_calls":[{"name":"register_user","input":{"phone":"+263771234567","name":"Chipo"}}]}	10:30:03

● api_costs — every LLM call, with cost

session_id	user_phone	phase	model	input	output	cost_usd	content	tool_calls	created_at
+263771234567	+263771234567	conversation	claude-sonnet-4-20250514	9,847	186	$0.0323	null (tool_use, no text)	[{"name":"register_user"}]	10:30:02
+263771234567	+263771234567	tool_call	claude-sonnet-4-20250514	10,234	312	$0.0354	Welcome to ZimRoots! / Mauya kuZimRoots!...	null	10:30:03
+263771234567	+263771234567	website_gen	claude-sonnet-4-20250514	1,247	6,892	$0.1071	<!DOCTYPE html><html lang="en">...	null	10:35:12

The Agentic Loop

Claude doesn't just answer — it acts. The server runs a loop: call Claude, execute tools, call Claude again, until it produces a final text reply.

// The core loop (server.js, simplified)

// Cacheable system prompt: static spec cached (90% discount on repeat calls)
const cachedSystem = [{ type: 'text', text: systemPrompt,
  cache_control: { type: 'ephemeral' } }];  // <-- $0.30/M vs $3/M
// A cache breakpoint on the last tool covers the whole tools array
const cachedTools = tools.map((t, i) =>
  i === tools.length - 1 ? { ...t, cache_control: { type: 'ephemeral' } } : t);

// Build the request fresh on every call so it always carries the latest history
const buildRequest = () => ({
  model: 'claude-sonnet-4-20250514',
  max_tokens: 1024,
  system: cachedSystem,            // spec.md + auth context (~9K tokens, CACHED)
  messages: session.messages,      // conversation history (max 30 msgs)
  tools: cachedTools,              // 18 tool definitions (last one CACHED)
});

let response = await anthropic.messages.create(buildRequest());
logApiCost(sessionId, userPhone, response, 'conversation');  // $$$

const MAX_TOOL_ROUNDS = 8;  // Safety cap: prevents runaway costs
let toolRound = 0;

while (response.stop_reason === 'tool_use' && toolRound < MAX_TOOL_ROUNDS) {
  toolRound++;
  // 1. Record Claude's turn, then execute every tool call it contains
  session.messages.push({ role: 'assistant', content: response.content });
  const toolNames = [];
  const toolResults = [];
  for (const block of response.content) {
    if (block.type === 'tool_use') {
      toolNames.push(block.name);
      const result = await handleTool(block.name, block.input /* , ...remaining args elided */);  // Execute!
      // Each result must reference the id of the tool_use block it answers
      toolResults.push({ type: 'tool_result', tool_use_id: block.id,
        content: JSON.stringify(result) });
    }
  }

  // 2. Feed tool results back to Claude
  session.messages.push({ role: 'user', content: toolResults });

  // 3. Call Claude again with updated context (prompt hits cache!)
  response = await anthropic.messages.create(buildRequest());
  logApiCost(sessionId, userPhone, response, 'tool_call', toolNames);  // $$$
}
  

Cost with caching: The first API call pays full price for the ~9K token prompt ($0.027). Subsequent calls in the loop hit the cache at 90% discount ($0.003 each). A 3-round conversation goes from ~$0.09 to ~$0.03 in prompt costs. The MAX_TOOL_ROUNDS = 8 cap prevents unbounded loops — normal flows use 1-3 rounds.
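The arithmetic behind those figures can be checked with a few lines. This is a simplified sketch using only the two prices quoted above. `promptCostUsd` is a hypothetical helper, and it ignores any cache-write surcharge on the first call and assumes every later call lands inside the cache TTL.

```javascript
const INPUT_USD_PER_MTOK = 3.0;        // base input price quoted in this document
const CACHED_READ_USD_PER_MTOK = 0.3;  // cached-read price quoted in this document

// Prompt-only cost for `calls` API calls that each resend `promptTokens` tokens.
function promptCostUsd(promptTokens, calls, { cached }) {
  if (calls <= 0) return 0;
  const full = (promptTokens * INPUT_USD_PER_MTOK) / 1e6;
  if (!cached) return full * calls;
  const read = (promptTokens * CACHED_READ_USD_PER_MTOK) / 1e6;
  // Simplification: first call billed at the base rate, later calls at the cached-read rate.
  return full + read * (calls - 1);
}

const uncached = promptCostUsd(9000, 3, { cached: false }); // about 0.081
const withCache = promptCostUsd(9000, 3, { cached: true }); // about 0.032
```

Under these assumptions, three calls come to roughly $0.08 uncached and $0.03 cached, which is in line with the rounded figures quoted above.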

Workflow: User Registration (WhatsApp)

Trace every step when a new WhatsApp user sends their first message.

User sends: "Hi, I want to list my vegetable business"

Arrives via Twilio/Meta webhook. Phone: +263771234567

Server

converse("+263771234567", message, null, "whatsapp")

Session auto-created. WhatsApp auto-auth kicks in:
session.registeredPhone = "+263771234567"
session.authenticated = true
Lookup user in DB → not found (new user). Will need name.

Server

Build system prompt

spec.md +

[Auth: WhatsApp user +263771234567 — authenticated but not yet registered. Ask for their name only, then call register_user.]

+ [Channel: WhatsApp]

Claude API Call #1

Initial conversation call

Model: claude-sonnet-4-20250514 | Max tokens: 1024
Messages: [{role: "user", content: "Hi, I want to list my vegetable business"}]
Tools: all 18 available

~9,900 input • ~180 output • ~$0.032

Cost Logged

api_costs INSERT

phase: "conversation", model: "claude-sonnet-4-20250514", tokens logged, cost calculated

Claude Responds

stop_reason: "end_turn" (no tool calls)

Claude asks for the user's name (as instructed by auth context). No tools needed yet.

User

Hi, I want to list my vegetable business

ZimRoots

Mhoro! / Welcome to ZimRoots! I'd love to help you get your vegetable business listed in our directory.

First, what's your name so I can set up your account?

User sends: "Chipo Mwale"

Claude API Call #2

Claude sees name, calls register_user tool

stop_reason: "tool_use"

~10,200 input • ~120 output • ~$0.032

Tool: register_user

handleTool("register_user", {phone: "+263771234567", name: "Chipo Mwale"})

Database

INSERT INTO users

{
  "id": "a1b2c3d4-e5f6-...",
  "phone": "+263771234567",
  "phone_verified": true,  // WhatsApp = auto-verified
  "meta": { "name": "Chipo Mwale" }
}

Tool Result Returned to Claude

{
  "user": { "id": "a1b2c3d4-...", "phone": "+263771234567", "name": "Chipo Mwale" },
  "message": "Registered Chipo Mwale (+263771234567).",
  "needs_verification": false  // WhatsApp: already verified
}

Claude API Call #3

Tool-use continuation — Claude sees registration result

stop_reason: "end_turn" (produces greeting + anti-fraud warning)

~10,600 input • ~280 output • ~$0.036

Cost Logged

api_costs INSERT

phase: "tool_call", tool_names: ["register_user"]

Messages Persisted

2 rows inserted into messages table (user msg + assistant reply). Fire-and-forget.

User

Chipo Mwale

register_user({phone: "+263771234567", name: "Chipo Mwale"})

ZimRoots

Welcome, Chipo! You're all set up.

Before we create your listing, I need to be upfront: ZimRoots takes accuracy very seriously. Every listing you submit is tied to your verified phone number. If we find false or misleading information — fake businesses, wrong addresses, impersonating someone else's business — your account will be permanently banned with no option to appeal.

Are you happy to proceed?

Yes, I agree No, I'd rather not

Total API Calls (this interaction)

2 conversation calls (the second ends in tool_use) + 1 tool_call continuation

Estimated Cost

$0.10

~30,700 input tokens + ~580 output tokens

DB Operations

1 user lookup, 1 INSERT user, 4 INSERT messages (2 per turn), 3 INSERT api_costs (one per API call)

Workflow: Create a Business Listing

After registration + anti-fraud warning. Claude collects info conversationally, then creates the entity.

User

"Yes, I agree" (anti-fraud confirmation)

Claude

Asks: "What's the name of your business?"

No tool call — just conversation. Claude knows to collect: name, description, category, location, contact.

~$0.034

User

"I sell tomatoes, onions, cabbage at Mbare Market. Open 6am-6pm daily."

User provides multiple fields at once. Claude parses all of them.

Claude API Call

Claude calls create_entity with extracted data

stop_reason: "tool_use"

Tool: create_entity

handleTool("create_entity", {...})

// Claude sends this input:
{
  "entity_type": "businesses",
  "name": "Chipo's Vegetable Stand",
  "description": "Fresh vegetables: tomatoes, onions, cabbage, carrots",
  "category": "Produce/Fresh Foods",
  "location": "Mbare Market",
  "contact": "+263771234567",
  "hours": "6am-6pm daily"
}

Server: splitEntityInput()

Separates columns from meta

// Real columns (used in SQL queries):
{ "name": "Chipo's Vegetable Stand", "description": "...", "category": "Produce/Fresh Foods" }

// Everything else → meta JSONB:
{ "location": "Mbare Market", "contact": "+263771234567", "hours": "6am-6pm daily" }

Database

INSERT INTO businesses

{
  "id": "b3c4d5e6-f7g8-...",
  "user_id": "a1b2c3d4-...",
  "name": "Chipo's Vegetable Stand",
  "description": "Fresh vegetables: tomatoes, onions, cabbage, carrots",
  "category": "Produce/Fresh Foods",
  "status": "verified",  // phone_verified = true → live immediately
  "embedding": null,  // not generated yet
  "meta": {
    "location": "Mbare Market",
    "contact": "+263771234567",
    "hours": "6am-6pm daily",
    "verified_owner": false,  // no ID photo yet
    "images": []
  }
}

Tool Result → Claude

{
  "success": true,
  "entity": { "id": "b3c4d5e6-...", "name": "Chipo's Vegetable Stand", "status": "verified" },
  "nudges": ["id_verification_available"],
  "message": "It's now live in the directory!"
}

Claude Continuation

Sees result, offers enrichment + photo + website

~$0.038

User

I sell tomatoes, onions, cabbage at Mbare Market. Open 6am-6pm daily.

create_entity({entity_type: "businesses", name: "Chipo's Vegetable Stand", ...})

ZimRoots

Your listing "Chipo's Vegetable Stand" is now live in the ZimRoots directory!

Want to make it stand out? I can ask a few quick questions to add more details, or you can upload a photo of your stand.

Add more details Upload a photo Create a free website I'm done for now

Workflow: Search the Directory

How a search query flows from user to database and back. No authentication required.

User

"Where can I buy fresh vegetables in Mbare?"

Claude API Call

Claude calls search_directory tool

stop_reason: "tool_use"

// Tool call:
{
  "name": "search_directory",
  "input": {
    "query": "fresh vegetables",
    "entity_type": "businesses",
    "location": "Mbare"
  }
}

Server: searchDirectory()

1. Semantic search (pgvector) → 1b. ilike fallback → 2. Suburb expansion → 3. Location filter

Step 1: Semantic Search (primary)

Generate embedding for query via OpenAI text-embedding-3-small, then call search_all_entities pgvector RPC.

-- Postgres RPC: search_all_entities(query_embedding, match_count)
SELECT * FROM (
  (SELECT 'business' AS entity_type, id, name, description, category,
    1 - (embedding <=> query_embedding) AS similarity  -- <=> is cosine distance
    FROM businesses WHERE embedding IS NOT NULL AND status = 'verified')
  UNION ALL
  -- ... 6 more entity tables ...
) AS results  -- subqueries in FROM need an alias on Postgres versions before 16
ORDER BY similarity DESC
LIMIT match_count;  -- e.g. 10
      

Cosine similarity: 1.0 = identical, 0.0 = unrelated. Results above ~0.3 are relevant.
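For reference, the similarity score that pgvector derives from its cosine distance operator (`1 - distance`) is the standard cosine similarity. The standalone function below is only an illustration of that formula and is not part of server.js. Note that the mathematical range is -1 to 1, although text embeddings rarely produce strongly negative values in practice.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('vectors must have the same length');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // convention chosen here for zero vectors
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```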

Step 1b: ilike Fallback

If semantic search returns nothing (no embeddings yet, or OpenAI API error), falls back to keyword matching:

SELECT * FROM businesses
WHERE status = 'verified'
  AND (name ILIKE '%vegetable%' OR description ILIKE '%vegetable%' ...)
      

Step 2: getNearbySuburbs("Mbare")

Reads nearby.json (148 Harare suburbs, 3km radius adjacency map)

// Returns: ["Mbare", "Mbare South", "National Sports Stadium", "Ardbennie", "Southerton"]

Step 3: Location filter

Server-side: filter results where meta.location contains any of the expanded suburbs ("Mbare", "Mbare South", etc.)
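Steps 2 and 3 can be sketched together as below. The shape of nearby.json (suburb name mapped to an array of neighbours), the `filterByLocation` name, and the case-insensitive substring match are assumptions. Only `getNearbySuburbs` and the Mbare neighbour list come from this document.

```javascript
// Assumed shape for nearby.json: suburb name -> neighbouring suburbs within 3 km.
const nearby = {
  Mbare: ['Mbare South', 'National Sports Stadium', 'Ardbennie', 'Southerton'],
};

// The searched suburb always comes first, followed by its neighbours (if any).
function getNearbySuburbs(suburb) {
  return [suburb, ...(nearby[suburb] ?? [])];
}

// Keep results whose meta.location mentions the suburb or one of its neighbours.
function filterByLocation(results, suburb) {
  const suburbs = getNearbySuburbs(suburb).map((s) => s.toLowerCase());
  return results.filter((r) => {
    const location = (r.meta?.location ?? '').toLowerCase();
    return suburbs.some((s) => location.includes(s));
  });
}
```

A substring match means "Mbare Market" passes the "Mbare" filter, which matches the sample listing used throughout this walkthrough.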

Why semantic search matters: In Harare's informal economy, people describe the same service in very different words. "I fix cars", "motor vehicle mechanic", and "panel beater" share almost no keywords, so keyword matching misses most of these connections, while embedding similarity can still link them. Real test result: "someone to fix my car" → Willowvale Auto Mechanics Hub (sim: 0.418). ilike would have returned nothing.

Tool Result → Claude

{
  "results": [
    {
      "id": "b3c4d5e6-...",
      "name": "Chipo's Vegetable Stand",
      "description": "Fresh vegetables: tomatoes, onions, cabbage, carrots",
      "category": "Produce/Fresh Foods",
      "entity_type": "businesses",
      "location": "Mbare Market",
      "contact": "+263771234567",
      "hours": "6am-6pm daily"
    }
  ],
  "searched_suburbs": ["Mbare", "Mbare South", "Ardbennie", ...],
  "total": 1
}

Claude Formats Response

Presents results in a readable format

User

Where can I buy fresh vegetables in Mbare?

search_directory({query: "fresh vegetables", location: "Mbare"})

ZimRoots

I found 1 result in the Mbare area:

Chipo's Vegetable Stand
Fresh vegetables: tomatoes, onions, cabbage, carrots
Location: Mbare Market
Hours: 6am-6pm daily
Contact: +263771234567

I also searched nearby areas: Mbare South, Ardbennie, Southerton.

Search again See all produce

Workflow: Website Generation

The most expensive operation. A separate Claude API call with 8,192 max tokens generates a full HTML page.

User

"Create a free website for my business"

Claude API Call #1 (Conversation)

Claude calls generate_website tool

~$0.034

Tool: generate_website

Server fetches entity data, builds generation prompt

Database Reads

1. SELECT business by id
2. SELECT related products (WHERE business_id = ...)
3. SELECT related events (WHERE business_id = ...)
4. Check slug uniqueness

Claude API Call #2 (Website Generation)

SEPARATE API call — different from the conversation loop

Model: claude-sonnet-4-20250514
Max tokens: 8,192 (8x normal)
No system prompt. No tools. Just a single user message with the entity data + design requirements.
Response: complete, self-contained HTML document.

~1,200 input • ~6,500 output • ~$0.101

Cost Logged (website_gen)

api_costs INSERT with entity_id and entity_type

phase: "website_gen" — the most expensive phase per call due to high output tokens

Database Write

UPDATE businesses SET slug, website_html

slug: "chipos-vegetable-stand"
website_html: "..." (~15-25KB of HTML)

Claude API Call #3 (Continuation)

Claude receives tool result, presents URL to user

~$0.038

Website generation is the cost outlier. At ~$0.10 per generation (mostly output tokens), it costs about three times as much as a typical conversation turn. In exchange it produces a real, deployable HTML page that would take a human designer hours, so the return is high, but the spend is worth monitoring.

Total API Calls

1 conversation + 1 website_gen + 1 continuation

Total Cost

~$0.17

Website gen dominates (~60% of cost)

Output Size

~20KB

Complete HTML with embedded CSS, responsive, branded
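The slug written in the Database Write step ("chipos-vegetable-stand") and the uniqueness check from the Database Reads list could be implemented along these lines. This is a guess at the approach, not the code in server.js: `slugify`, `uniqueSlug`, the numeric suffix scheme, and the synchronous `isTaken` callback are all hypothetical. A real check would query Supabase and be asynchronous.

```javascript
// Turn a display name into a URL-safe slug.
function slugify(name) {
  return name
    .toLowerCase()
    .replace(/['’]/g, '')         // drop apostrophes so "Chipo's" becomes "chipos"
    .replace(/[^a-z0-9]+/g, '-')  // collapse every other non-alphanumeric run into one hyphen
    .replace(/^-+|-+$/g, '');     // trim leading and trailing hyphens
}

// Append -2, -3, ... until the slug is free. isTaken stands in for a database lookup.
function uniqueSlug(name, isTaken) {
  const base = slugify(name);
  let candidate = base;
  for (let n = 2; isTaken(candidate); n++) candidate = `${base}-${n}`;
  return candidate;
}
```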

Performance Optimizations

Three architectural improvements that reduce costs and improve search quality.

Prompt Caching

-78%

Input token costs on repeat calls. System prompt + tools cached via Anthropic API (cache_control: ephemeral). Cached tokens: $0.30/M vs $3/M. 5-min TTL covers any active conversation.

Loop Safety Cap

8 max

Tool-use rounds per message. Normal: 1-3. Prevents runaway costs from stuck tool cycles. If hit, logs a warning and returns whatever text Claude last produced.

Semantic Search

pgvector

OpenAI text-embedding-3-small (1536 dims). Embeddings generated fire-and-forget on create/edit. search_all_entities RPC for cosine similarity. ilike as fallback.

Semantic Search: Before vs After

● Real test results from the ZimRoots directory

User Query	Semantic Search (Top Result)	Similarity	ilike Would Find
"someone to fix my car"	Willowvale Auto Mechanics Hub	0.418	Nothing
"where can I get my hair done"	African Crown Hair Salon	0.455	Nothing
"live music this weekend"	Jamtree Easter Music Festival	0.425	Nothing
"handmade crafts and art"	Patch Moekoe / National Handicraft Centre	0.476	Only if "craft" in name
"fresh vegetables market"	Maasdorp Farmers Market	0.515	Only exact keyword matches

Embedding Generation Flow

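// Fire-and-forget: called after createEntity() and editEntity()
function embedEntity(table, entityId, name, description, category, meta) {
  const parts = [name, description, category, meta?.location,
                 meta?.what_they_make, meta?.hours].filter(Boolean);
  const text = parts.join(' — ');

  // Not awaited by the caller, so it never blocks the response
  getEmbedding(text)
    // Return the query: supabase-js builders only send the request once awaited or chained
    .then(embedding => supabase.from(table).update({ embedding }).eq('id', entityId))
    .then(({ error }) => { if (error) console.error('embedEntity update failed:', error.message); })
    // Catch embedding API failures so they never surface as unhandled rejections
    .catch(err => console.error('embedEntity failed:', err.message));
}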
  

Cost: Embeddings are nearly free. OpenAI text-embedding-3-small costs $0.02/M tokens. The entire 68-entity backfill cost less than $0.01. Each new entity costs ~$0.00001 to embed. The real cost is the search query embedding (~$0.00001 per search). Compare to $0.03-0.10 per Claude API call — embeddings are a rounding error.

Cost Tracking Architecture

How every dollar is tracked from API call to database to dashboard.

What's Logged Per API Call

Identity

session_id + user_phone

Who triggered the call (phone for WhatsApp, session ID for web)

Tokens

input + output

From response.usage. Cache tokens tracked in meta.

Cost

Pre-calculated USD

(input * $3/M) + (output * $15/M). Frozen at insert time.

Content

Text + tool calls

What Claude actually said/did. Full audit trail.
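The pre-calculated cost field follows directly from the formula above. As a hedged sketch (the `callCostUsd` name is hypothetical, and cache-discounted tokens are deliberately left out), the calculation from an Anthropic `response.usage` object might look like this:

```javascript
// Claude Sonnet 4 rates quoted in this document, in USD per million tokens.
const PRICE_USD_PER_MTOK = { input: 3, output: 15 };

// Cost of one API call from its usage block (input_tokens / output_tokens).
function callCostUsd(usage) {
  return (usage.input_tokens * PRICE_USD_PER_MTOK.input
        + usage.output_tokens * PRICE_USD_PER_MTOK.output) / 1e6;
}

// Website generation example from this document: ~1,200 input, ~6,500 output tokens.
const websiteGenCost = callCostUsd({ input_tokens: 1200, output_tokens: 6500 }); // about 0.1011
```

Storing the computed value at insert time means later price changes do not rewrite historical spend.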

The 4 Instrumented Call Sites

● API call sites in server.js

Phase	Trigger	Max Tokens	Has System Prompt?	Has Tools?	Typical Cost
conversation	Every user message (initial call)	1,024	Yes (spec + auth + channel)	Yes (all 18)	$0.030-0.040
tool_call	After tool execution (loop continuation)	1,024	Yes (same prompt)	Yes (all 18)	$0.032-0.045
scan_image	scan_business_image tool (vision)	1,024	No	No	$0.008-0.015
website_gen	generate_website tool (HTML generation)	8,192	No	No	$0.080-0.120

Estimated Cost Per Workflow

● Cost estimates by user action

User Action	API Calls	Est. Input Tokens	Est. Output Tokens	Est. Cost
Simple greeting	1	~9,800	~200	$0.032
Search for a business	2-3	~20,000	~500	$0.068
Register (WhatsApp)	2-3	~20,000	~500	$0.068
Register (Web + OTP)	6-8	~60,000	~1,200	$0.20
Create business listing	3-5	~35,000	~800	$0.12
Enrich listing (deep discovery)	4-6	~45,000	~1,000	$0.15
Scan business card (vision)	1 (+ loop)	~1,500	~300	$0.009
Generate website	1 (+ loop)	~1,200	~6,500	$0.101
Full onboarding (register + list + enrich + website)	12-18	~120,000	~9,000	~$0.50 uncached, ~$0.25 with caching

The economics (with caching): A full user onboarding costs roughly $0.25 with prompt caching (down from ~$0.50 without). At scale, 100 full onboardings/day = ~$25/day = ~$750/month. The /costs dashboard shows exact spending, and the cache_read_input_tokens metric in each cost row confirms caching is working.

Dashboard

The costs dashboard is live at /costs on the Railway deployment. It queries the api_costs table and shows:

Total spend over configurable period (7/30/90/365 days)
Daily spend bar chart
Breakdown by phase (conversation vs tool_call vs scan_image vs website_gen)
Top users by cost (who's using the most API?)
Recent 20 calls with content preview
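The phase and per-user breakdowns in that list are simple group-and-sum aggregations over api_costs rows. The sketch below shows one way to compute them in application code. `totalBy` is a hypothetical helper, it assumes `cost_usd` arrives as a number or numeric string, and the real dashboard may well aggregate in SQL instead.

```javascript
// Sum cost_usd per distinct value of `key`, highest spend first.
function totalBy(rows, key) {
  const totals = new Map();
  for (const row of rows) {
    totals.set(row[key], (totals.get(row[key]) ?? 0) + Number(row.cost_usd));
  }
  return [...totals.entries()].sort((a, b) => b[1] - a[1]);
}

// Illustrative rows only, not real data.
const rows = [
  { phase: 'conversation', user_phone: '+263771234567', cost_usd: 0.03 },
  { phase: 'website_gen', user_phone: '+263771234567', cost_usd: 0.1 },
  { phase: 'conversation', user_phone: '+263772555111', cost_usd: 0.03 },
];
const byPhase = totalBy(rows, 'phase');     // website_gen first, then conversation
const byUser = totalBy(rows, 'user_phone'); // +263771234567 first
```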

ZimRoots System Walkthrough — Last updated 2026-04-08
