01. The Business Challenge
Navigating the healthcare system is often an overwhelming experience that leads to delayed interventions. Patients frequently lack the medical literacy required to translate their physical symptoms into specific medical conditions, or to determine the correct specialist to see (e.g., knowing that excessive thirst and fatigue call for an endocrinologist).
Friction in Accessing Care
There was a critical need for an intuitive system that empowers users to seek early care, bridging the knowledge gap without bypassing the necessity of professional medical consultation.
Risk Mitigation
Dual-engine clinical accuracy
2.2M+ Providers
CMS Data Catalog integration
Custom KB
Disease-symptom mapping
02. Strategic Architecture
To address this friction, we built a multi-stage pipeline that combines a custom knowledge base with deterministic database matching, so that core results are grounded in curated medical data rather than free-form generation.
NLP Tokenization
Translates plain English descriptions into clinical markers, bypassing rigid drop-down menus.
Diagnostic Triangulation
Queries a structured disease database for the top two matching conditions, while an LLM independently proposes a third, limiting exposure to generative hallucinations.
Intelligent Routing
Maps each condition to a medical Field ID to recommend local doctors in that specialty.
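The three stages above can be sketched roughly as follows. This is a minimal illustration, not the production code: the knowledge base entries, the synonym map, the Field ID values (`ENDO`, `NEURO`), and the function names `extract_markers` and `top_conditions` are all hypothetical, and simple phrase matching stands in for the real NLP step.

```python
# Hypothetical knowledge base: condition -> (symptom markers, specialty Field ID).
KNOWLEDGE_BASE = {
    "Type 2 Diabetes": ({"thirst", "fatigue", "frequent urination"}, "ENDO"),
    "Hypothyroidism": ({"fatigue", "weight gain", "cold intolerance"}, "ENDO"),
    "Migraine": ({"headache", "nausea", "light sensitivity"}, "NEURO"),
}

# Hypothetical map from everyday phrases to clinical markers.
SYNONYMS = {
    "thirsty": "thirst",
    "tired": "fatigue",
    "exhausted": "fatigue",
    "peeing a lot": "frequent urination",
    "head hurts": "headache",
}


def extract_markers(text):
    """Stage 1: turn a plain-English description into a set of clinical markers."""
    text = text.lower()
    found = {marker for phrase, marker in SYNONYMS.items() if phrase in text}
    for symptoms, _field_id in KNOWLEDGE_BASE.values():
        found.update(s for s in symptoms if s in text)
    return found


def top_conditions(markers, k=2):
    """Stages 2 and 3: rank conditions by symptom overlap and attach the Field ID."""
    scored = []
    for name, (symptoms, field_id) in KNOWLEDGE_BASE.items():
        overlap = len(markers & symptoms)
        if overlap:
            scored.append((overlap, name, field_id))
    # Highest overlap first; ties broken alphabetically so results are deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [(name, field_id) for _, name, field_id in scored[:k]]


markers = extract_markers("I'm always thirsty and tired, and peeing a lot")
matches = top_conditions(markers)
```

Because the ranking is a plain set overlap with a fixed tie-break, the same description always yields the same two conditions, which is the property the deterministic matching stage relies on.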
03. Infrastructure & Trade-Offs
Custom Knowledge Base vs. RAG
We initially explored Retrieval-Augmented Generation (RAG) backed by a vector store such as ChromaDB. However, we strategically pivoted to building a custom, highly structured disease-symptom mapping system. This removed vector-retrieval latency and reduced the hallucination risk that comes with approximate retrieval, keeping the context fed to the LLM within strict clinical boundaries.
Deployment Resource Management
Deployed the MVP on PythonAnywhere, which required significant backend optimization to fit within a strict 100 MB database upload limit while processing millions of CMS provider records.
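One way to shrink a provider dataset toward a size limit like this is to keep only the columns the router needs and store each specialty name once. The sketch below assumes SQLite and hypothetical input field names (`npi`, `last_name`, `specialty`, `zip_code`); the project's actual schema and optimization steps are not described here.

```python
import sqlite3


def build_compact_db(rows, path=":memory:"):
    """Load provider rows into a slimmed-down SQLite database.

    `rows` is an iterable of dicts; any keys beyond the four used below are dropped.
    """
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        CREATE TABLE specialty (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE provider (
            npi INTEGER PRIMARY KEY,
            last_name TEXT,
            specialty_id INTEGER REFERENCES specialty(id),
            zip5 TEXT
        );
        CREATE INDEX idx_provider_lookup ON provider (specialty_id, zip5);
        """
    )
    for row in rows:
        # Store each specialty string once and reference it by a small integer.
        conn.execute(
            "INSERT OR IGNORE INTO specialty (name) VALUES (?)", (row["specialty"],)
        )
        (spec_id,) = conn.execute(
            "SELECT id FROM specialty WHERE name = ?", (row["specialty"],)
        ).fetchone()
        # Keep only the 5-digit ZIP prefix, which is enough for local matching.
        conn.execute(
            "INSERT OR REPLACE INTO provider VALUES (?, ?, ?, ?)",
            (int(row["npi"]), row["last_name"], spec_id, row["zip_code"][:5]),
        )
    conn.commit()
    return conn


sample_rows = [
    {"npi": "1000000001", "last_name": "Rivera", "specialty": "Endocrinology",
     "zip_code": "941101234", "fax": "unused"},
    {"npi": "1000000002", "last_name": "Chen", "specialty": "Endocrinology",
     "zip_code": "94110", "fax": "unused"},
]
conn = build_compact_db(sample_rows)
```

Normalizing repeated strings and trimming unused columns are generic size reductions; whether they are sufficient for a given upload limit depends on the real data.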
Risk Mitigation over Generative Freedom
Deliberately restricted the LLM to acting as a third-party validator ("Diagnostic Triangulation") against our structured medical database. This trade-off prioritized clinical safety over conversational fluidity.
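The restriction described above could look something like the following sketch. It assumes one particular guardrail, namely that the LLM's suggestion is accepted only when it names a condition already present in the knowledge base; the function name `triangulate` and the `llm_suggest` callable (a stand-in for the real model call) are hypothetical.

```python
def triangulate(db_top_two, llm_suggest, known_conditions):
    """Combine two database matches with at most one vetted LLM suggestion."""
    results = [{"condition": c, "source": "database"} for c in db_top_two[:2]]

    # Normalize the free-text suggestion before comparing it to known names.
    suggestion = (llm_suggest() or "").strip().lower()
    canonical = {c.lower(): c for c in known_conditions}.get(suggestion)

    # Assumed guardrail: discard anything outside the knowledge base or already listed.
    if canonical and canonical not in db_top_two:
        results.append({"condition": canonical, "source": "llm"})
    return results


known = ["Type 2 Diabetes", "Hypothyroidism", "Migraine", "Anemia"]
db_matches = ["Type 2 Diabetes", "Hypothyroidism"]

accepted = triangulate(db_matches, lambda: " anemia ", known)
rejected = triangulate(db_matches, lambda: "Dragon Pox", known)
```

Tagging each entry with its source keeps the database-backed results distinguishable from the generative one when the list is shown to the user.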
04. Future Scalability
Automated Data Expansion
Outlined an automated ingestion process utilizing LLMs to continuously map emerging diseases, symptoms, and medical specializations to keep the core triage database clinically current.
Feedback Loops
Designed architecture for a dual-rating system allowing users to evaluate both the recommended clinical providers and the accuracy of the platform's initial diagnostic assessment.
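As a rough illustration of the dual-rating idea, each feedback record could carry two independent scores. The `Feedback` class, its field names, and the 1 to 5 scale are assumptions made for this sketch, not the designed schema.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class Feedback:
    session_id: str
    provider_rating: int    # how useful the recommended provider was (assumed 1-5)
    assessment_rating: int  # how accurate the initial assessment felt (assumed 1-5)

    def __post_init__(self):
        for value in (self.provider_rating, self.assessment_rating):
            if not 1 <= value <= 5:
                raise ValueError("ratings must be between 1 and 5")


def summarize(feedback):
    """Average the two ratings separately so each signal can be tracked on its own."""
    return {
        "provider_avg": mean(f.provider_rating for f in feedback),
        "assessment_avg": mean(f.assessment_rating for f in feedback),
    }


summary = summarize([Feedback("a1", 5, 3), Feedback("b2", 3, 4)])
```

Keeping the two scores separate lets a poor provider match be distinguished from a poor diagnostic assessment.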