02/2023 – Present · SCAD

Intelligent Conversational Analytics Platform

Teaching SQL to 200+ people who can't write SQL — through plain English (and Arabic).

Architecture

[Diagram: nl_to_sql.flow]

Before

Data analysts were a bottleneck. Every report request waited 3-5 days in their queue. Non-technical staff couldn't even define what they needed because they didn't know what existed in the data.

After

200+ staff now query 8 databases in plain language, getting answers in seconds. The analytics backlog dropped 70%. Analysts moved up the stack to harder problems.

Challenge

Non-technical users needed SQL database access without coding knowledge.

Approach

Built a natural-language-to-SQL query system with a conversational interface and automatic error correction.

How it was built

1 · Schema audit · Weeks 1–3
Catalogued all 8 production databases — 240 tables, 3,200 columns, many cryptically named in mixed Arabic-English transliterations. Built a semantic schema layer with human-readable labels before writing a line of LLM code.
2 · Few-shot SQL generation · Weeks 4–6
Curated 80 question→SQL examples spanning the most common query patterns. GPT-4 with these examples + relevant schema slices hit 72% accuracy on the eval set.
3 · Execution-aware repair · Weeks 7–9
Added a repair loop — when generated SQL throws an error, the error message goes back to the model with the original schema for a corrected attempt. Lifted accuracy to 85%.
4 · Safety + access control · Weeks 10–11
Read-only DB users per role, query whitelisting, row-level filters. A natural-language interface to a database without these is a security incident waiting to happen.
5 · Conversational UX · Weeks 12–14
Multi-turn refinement, result explanation, follow-up suggestions. The chat UI is what made non-technical users actually adopt it. The model was already good enough.
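
To make step 2 concrete, here is a minimal sketch of how a few-shot prompt could be assembled from curated question→SQL pairs plus a schema slice. The `Example` type, `build_prompt`, the prompt wording, and the sample table are all assumptions for illustration, not the production code (which runs on Semantic Kernel and .NET).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    question: str
    sql: str


def build_prompt(question: str, schema_slice: str, examples: list[Example]) -> str:
    """Assemble a few-shot prompt: schema first, worked examples next, new question last."""
    parts = [
        "You translate questions into a single read-only SQL query.",
        "Schema:",
        schema_slice.strip(),
        "",
    ]
    for ex in examples:
        parts += [f"Q: {ex.question}", f"SQL: {ex.sql}", ""]
    # The prompt ends on an open "SQL:" so the model completes the query.
    parts += [f"Q: {question}", "SQL:"]
    return "\n".join(parts)


# Hypothetical example pair and schema slice, for illustration only.
EXAMPLES = [Example("How many employees are there?", "SELECT COUNT(*) FROM employees;")]
SCHEMA = "employees(id, dept_id, hire_date)"

prompt = build_prompt("How many employees joined in 2023?", SCHEMA, EXAMPLES)
print(prompt)
```

Keeping the examples as data rather than baked into a prompt string makes it easy to grow the curated set without touching the assembly logic.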

Key architecture decisions

Schema-aware context injection over fine-tuning

Why · Fine-tuning would lock us to one schema version. Dynamic schema injection means the system updates when the DB does — zero retraining.
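
A rough sketch of what dynamic schema injection can look like: pick only the columns whose human-readable label overlaps with the question, and hand that slice to the model at request time. The `Column` type, the keyword-overlap matching, and the catalog entries (including the transliterated-style column names) are hypothetical; the real selection logic may differ.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    table: str
    name: str   # physical column name as stored in the database
    label: str  # human-readable label from the semantic schema layer


def _tokens(text: str) -> set[str]:
    """Lower-cased word tokens; \\w also matches Arabic letters."""
    return set(re.findall(r"\w+", text.lower()))


def schema_slice(question: str, catalog: list[Column]) -> str:
    """Return only the columns whose label shares at least one word with the question."""
    q = _tokens(question)
    hits = [c for c in catalog if q & _tokens(c.label)]
    return "\n".join(f"{c.table}.{c.name}  -- {c.label}" for c in hits)


# Hypothetical catalog entries. Because the catalog is plain data, reloading it
# after a schema change updates the prompt context without any retraining.
CATALOG = [
    Column("emp_mst", "dt_tayeen", "hire date"),
    Column("emp_mst", "qism_id", "department id"),
    Column("inv_hdr", "amt_tot", "invoice total amount"),
]

print(schema_slice("Which department has the most staff?", CATALOG))
```

Plain keyword overlap is the simplest possible matcher; an embedding-based lookup would slot into `schema_slice` the same way.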

Read-only DB user with row-level security

Why · Defence in depth. Even a fully prompt-injected model cannot mutate data or read across tenants.
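
The read-only half of this is easy to demonstrate. The sketch below uses SQLite's `PRAGMA query_only` as a stand-in for a read-only SQL Server login, so the database itself rejects writes no matter what SQL the model produces. The table, rows, and helper names are made up, and row-level security is not modelled here.

```python
import sqlite3

# SQLite stands in for SQL Server; the table and rows are invented for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (tenant TEXT, amount INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", [("a", 10), ("b", 20)])
conn.commit()

# From here on the connection rejects every write, whatever SQL arrives.
conn.execute("PRAGMA query_only = ON")


def run_generated_sql(sql: str) -> list[tuple]:
    """Execute model-generated SQL on the read-only connection."""
    return conn.execute(sql).fetchall()


def try_sql(sql: str) -> str:
    """Run SQL and report whether the database accepted or refused it."""
    try:
        return f"ok: {run_generated_sql(sql)}"
    except sqlite3.Error as exc:
        return f"blocked: {exc}"


print(try_sql("SELECT SUM(amount) FROM sales"))
print(try_sql("DELETE FROM sales"))  # refused by the database, not by prompt rules
```

The point of the design is that enforcement lives in the database, not in the prompt or in string checks on the generated SQL.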

Execution-aware repair loop

Why · Generated SQL fails for predictable reasons (typos, ambiguous joins). Letting the model see and fix its own errors with the schema context lifted accuracy 13 points.
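
A minimal sketch of the repair loop, assuming a `generate` callable that wraps the LLM call and accepts the previous SQL and the database error on retries. `answer`, `fake_model`, the retry limit of 3, and the SQLite test table are illustrative assumptions.

```python
import sqlite3
from typing import Callable, Optional

# generate(question, schema, previous_sql, error) -> SQL text
Generate = Callable[[str, str, Optional[str], Optional[str]], str]


def answer(conn: sqlite3.Connection, question: str, schema: str,
           generate: Generate, max_attempts: int = 3):
    """Run generated SQL; on a database error, feed the error back and retry."""
    sql, error = None, None
    for _ in range(max_attempts):
        sql = generate(question, schema, sql, error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)  # e.g. "no such column: dept"
    raise RuntimeError(f"no valid SQL after {max_attempts} attempts: {error}")


def fake_model(question, schema, previous_sql, error):
    """Stand-in for the LLM: misspells a column first, fixes it once it sees an error."""
    if error is None:
        return "SELECT COUNT(*) FROM staff WHERE dept = 'IT'"
    return "SELECT COUNT(*) FROM staff WHERE department = 'IT'"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (name TEXT, department TEXT)")
conn.executemany("INSERT INTO staff VALUES (?, ?)", [("Aisha", "IT"), ("Omar", "HR")])

sql, rows = answer(conn, "How many people work in IT?", "staff(name, department)", fake_model)
print(sql, rows)
```

Capping the attempts matters: a query that still fails after a few repairs is better surfaced to the user as a clarification request than retried indefinitely.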

Impact

Enabled 200+ non-technical staff to query databases using plain English
Handles 15K+ queries monthly across 8 different databases
Reduced analytics request backlog by 70%
85% query accuracy with automatic error correction

200+ users · 15K+/mo queries · -70% backlog · 85% accuracy

What I'd tell someone building this

01 · Schema design is more important than prompt design. Bad column names break the model long before bad prompts do.
02 · The repair loop is more powerful than picking a bigger model.
03 · Users want explanations as much as answers. "Here's the SQL I ran" builds trust.
04 · Arabic column data needs explicit transliteration handling — don't assume the model will guess right.

“He didn't ship until the numbers said it was ready. The platform changed how 200 people work — and the team that maintains it after Mazhar's involvement hasn't had to call him once in eight months.”

— Head of Analytics · SCAD

Tech stack

GPT-4 · Semantic Kernel · Azure OpenAI · .NET Core Web API · Angular · SQL Server
