Transparency

This entire portfolio runs at $0 / month.

Live RAG against my CV. Streaming AI chat. Vector search. Cross-encoder reranking. Three LLM choices. Static deploys with previews. All of it on free tiers — and not in a hacky way. Here's the receipt.

Monthly cost: $0

Services composed: 8

End-to-end RAG latency: <2s

The stack

Service	Role	Tier	Cost / mo	Notes
Next.js 14	App framework (App Router, Server Components)	Open source	$0
Netlify	Hosting · CDN · build · preview deploys	Starter (free)	$0	100 GB bandwidth + 300 build min / mo; well within limits.
GitHub	Source control · CI trigger	Free	$0
Pinecone	Vector database for RAG retrieval	Starter (free)	$0	1 serverless index, ~36 CV chunks, < 0.1% of free quota.
Google Gemini	Embeddings (gemini-embedding-001, 3072 dim)	Free tier	$0	60 RPM, 1,500 RPD. Portfolio traffic is nowhere near.
Groq	LLM inference (Llama 3.3 70B + 3.1 8B + Gemma 2)	Free tier	$0	Sub-second latency on Llama 3.3 70B. Ridiculous quality-to-cost ratio.
Cohere	Cross-encoder reranking (rerank-v3.5)	Trial tier	$0	1,000 calls / month free; enough for portfolio traffic for years.
Vercel Analytics + Speed Insights	RUM · Core Web Vitals	Hobby (free)	$0
Total			$0
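
To make "well within limits" checkable, here is a rough headroom calculation in TypeScript. The two limits are the ones quoted in the table (1,500 embedding requests per day, 1,000 rerank calls per month). Everything else is an assumption made for illustration: the names `Quota`, `utilization` and `maxChatsPerDay` are hypothetical, each chat is assumed to cost exactly one embedding call and one rerank call, and a month is taken as 30 days. The per-minute limit is not modelled.

```typescript
// Limits are the ones quoted in the table; the traffic model is an assumption.
type Period = "day" | "month";

interface Quota {
  service: string;
  limit: number; // calls allowed per period
  period: Period;
}

const QUOTAS: Quota[] = [
  { service: "Gemini embeddings", limit: 1500, period: "day" },
  { service: "Cohere rerank", limit: 1000, period: "month" },
];

const DAYS_PER_MONTH = 30; // simplifying assumption

// Fraction of a quota consumed, assuming one call to the service per chat.
function utilization(chatsPerDay: number, quota: Quota): number {
  const callsPerPeriod =
    quota.period === "day" ? chatsPerDay : chatsPerDay * DAYS_PER_MONTH;
  return callsPerPeriod / quota.limit;
}

// Largest whole number of chats per day that stays inside every quota.
function maxChatsPerDay(quotas: Quota[]): number {
  const perDayLimits = quotas.map((q) =>
    q.period === "day" ? q.limit : q.limit / DAYS_PER_MONTH,
  );
  return Math.floor(Math.min(...perDayLimits));
}

for (const q of QUOTAS) {
  const pct = (utilization(10, q) * 100).toFixed(1);
  console.log(`${q.service}: ${pct}% of the per-${q.period} limit at 10 chats/day`);
}
console.log(`Ceiling under these assumptions: ${maxChatsPerDay(QUOTAS)} chats/day`);
```

Under this simple model the monthly rerank allowance is the first limit a traffic spike would reach, which is why the reranker is the piece that needs a fallback.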

Engineering principles behind it

Best-of-tier for each job

Groq for fast LLM inference. Cohere for reranking. Pinecone for vectors. Gemini for embeddings. Each is, in my judgement, the best free tier in its category, and they compose cleanly.
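
One way to picture "compose cleanly" is a pipeline where each provider sits behind a small function type. This is a minimal sketch, not the site's actual code: `Passage`, `RagDeps` and `answerQuestion` are hypothetical names, the prompt wording is invented, and the real provider SDK calls are deliberately left out so that nothing here claims to be a vendor's API.

```typescript
// All names here are hypothetical; real SDK calls would live behind these types.
interface Passage {
  id: string;
  text: string;
}

interface RagDeps {
  embed: (text: string) => Promise<number[]>; // e.g. backed by Gemini
  search: (vector: number[], topK: number) => Promise<Passage[]>; // e.g. Pinecone
  rerank: (query: string, passages: Passage[]) => Promise<Passage[]>; // e.g. Cohere
  generate: (prompt: string) => Promise<string>; // e.g. Groq
}

// Embed the question, fetch candidates, rerank them, then ask the LLM.
async function answerQuestion(
  question: string,
  deps: RagDeps,
  topK = 10, // candidates fetched from the vector store (assumed value)
  keep = 3, // passages kept after reranking (assumed value)
): Promise<string> {
  const vector = await deps.embed(question);
  const candidates = await deps.search(vector, topK);
  const best = (await deps.rerank(question, candidates)).slice(0, keep);
  const context = best.map((p) => p.text).join("\n---\n");
  const prompt = `Answer using only this context:\n${context}\n\nQuestion: ${question}`;
  return deps.generate(prompt);
}
```

Because every stage is injected, swapping one provider means replacing one function, and the pipeline can be exercised in tests with stubs instead of live API keys.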

Graceful degradation built-in

If Cohere is not configured, the API falls back to pure vector retrieval. If any other key is missing, only the feature that depends on it degrades; the site never breaks.
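
The fallback can be sketched roughly as follows. This is an illustration under assumptions rather than the site's implementation: `Chunk`, `Reranker` and `retrieveWithFallback` are hypothetical names, and treating a failed rerank call (not just a missing one) as a reason to fall back is my extension of the rule described above.

```typescript
// Hypothetical types; none of these names come from the actual codebase.
interface Chunk {
  id: string;
  text: string;
  score: number; // vector similarity score from the retriever
}

type Reranker = (query: string, chunks: Chunk[]) => Promise<Chunk[]>;

// Uses the reranker when one is supplied and succeeds;
// otherwise returns the candidates in vector-similarity order.
async function retrieveWithFallback(
  query: string,
  candidates: Chunk[],
  topK: number,
  rerank?: Reranker, // undefined when no reranker key is configured
): Promise<{ chunks: Chunk[]; reranked: boolean }> {
  const byVectorScore = [...candidates].sort((a, b) => b.score - a.score);
  if (!rerank) {
    return { chunks: byVectorScore.slice(0, topK), reranked: false };
  }
  try {
    const ranked = await rerank(query, byVectorScore);
    return { chunks: ranked.slice(0, topK), reranked: true };
  } catch {
    // Assumed behaviour: any reranker error degrades to vector order.
    return { chunks: byVectorScore.slice(0, topK), reranked: false };
  }
}
```

Returning a `reranked` flag alongside the chunks lets the caller log or display which path served the request without changing the response shape.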

Even above $0, the cost stays negligible

If traffic ever outgrows one of these free tiers, the next step up is $20–50 / month, still cheaper than a single hour of a human analyst.

Show, don't tell

This page exists because the cleanest way to demonstrate engineering judgement is to expose your trade-offs publicly. Transparency is a signal.

Want this kind of cost discipline applied to your team's AI infra? Let's talk →