Move beyond "vibes-based" evaluation. My architecture is validated by rigorous, production-grade metrics: Precision@k, NDCG, and Sub-millisecond Latencies.
Redesigned a naive RAG pipeline into a high-performance semantic search architecture, integrating a cross-encoder re-ranker and semantic caching to drastically reduce TTFT (Time To First Token) and eliminate hallucinations in financial querying.
Built a HIPAA-compliant, cross-platform voice conversational agent using Vapi and React Native, allowing elderly patients to fluidly schedule appointments and query post-op instructions with < 800ms conversational latency.