From writing business rules on enterprise stacks to orchestrating autonomous AI agents under regulatory constraints — each phase built the intuition the next one needed.
Data Scientist and applied ML researcher with 15 years building production AI systems at the intersection of large language models, graph analytics, and enterprise compliance. Currently architecting multi-agent LLM orchestration and knowledge graph infrastructure for regulatory workflows in financial services — and independently researching in-context learning theory and model compression. I write about the gap between ML research and production reality: what the papers don't tell you, what breaks at scale, and what's worth building from scratch.
Below: the arc that got me here — not a career history, but a sequence of transformations where each identity was prerequisite to the next.
Distributed agent system on AWS coordinating specialized LLMs across Neo4j knowledge graphs and Bedrock inference for enterprise compliance workflows.
AI-enabled system that ingested, curated, and refined data quality rules across disparate source systems — translating heterogeneous rule definitions into standardized, configurable Great Expectations YAML for enterprise-wide validation.
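A rule standardized into this form might look like the following sketch. The expectation names are real Great Expectations expectation types; the suite name, column, and thresholds are illustrative, not the actual enterprise configs:

```yaml
# Illustrative only: a source-system rule ("balance must be present and
# non-negative") translated into a Great Expectations-style suite.
expectation_suite_name: deposits.core_quality
expectations:
  - expectation_type: expect_column_values_to_not_be_null
    kwargs:
      column: account_balance
  - expectation_type: expect_column_values_to_be_between
    kwargs:
      column: account_balance
      min_value: 0
      mostly: 0.999   # tolerate rare documented exceptions
```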
Neo4j-based fraud detection using community detection (Louvain, k-core, label propagation) and structural entropy measures across transaction networks.
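In production this ran as Neo4j GDS procedure calls; the same idea can be sketched offline with networkx. The toy graph, node names, and seed below are illustrative stand-ins for the real transaction network:

```python
import networkx as nx
from networkx.algorithms.community import louvain_communities

# Toy "transaction network": two dense account clusters joined by one edge.
# In the real system, nodes are accounts and edges are transactions in Neo4j.
G = nx.Graph()
G.add_edges_from((f"a{i}", f"a{j}") for i in range(5) for j in range(i + 1, 5))
G.add_edges_from((f"b{i}", f"b{j}") for i in range(5) for j in range(i + 1, 5))
G.add_edge("a0", "b0")  # single bridge between the two dense groups

# Louvain maximizes modularity, so the dense groups fall out as communities.
communities = louvain_communities(G, seed=42)

# k-core pruning keeps only structurally embedded nodes
# (every node in the core has degree >= k within the core).
core = nx.k_core(G, k=3)
```

The fraud signal comes from anomalies against this structure: accounts bridging otherwise-disjoint communities, or transacting far outside their core.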
Decomposed federal deposit insurance mandates into 110 structured IT controls using an 8-pass LLM extraction pipeline — each control anchored to its source regulatory provision, enabling automated gap analysis and audit-ready compliance traceability.
Mined and curated 12+ security event types from raw, previously untapped enterprise logs; the work evolved into a no-code/low-code detection platform that centralized event observability org-wide through YAML-driven pattern configuration.
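The pattern-driven core can be sketched in a few lines. The event names, fields, and regexes here are illustrative, and the dict stands in for a parsed YAML file, not the real detection configs:

```python
import re

# Stands in for a parsed YAML config: each entry is a named detection pattern.
PATTERNS = {
    "failed_login": {
        "regex": r"auth failure .* user=(?P<user>\S+)",
        "severity": "medium",
    },
    "privilege_escalation": {
        "regex": r"sudo: .* COMMAND=(?P<cmd>.+)",
        "severity": "high",
    },
}

def detect(log_lines):
    """Scan raw log lines against every configured pattern; yield events."""
    compiled = {name: (re.compile(p["regex"]), p["severity"])
                for name, p in PATTERNS.items()}
    for line in log_lines:
        for name, (rx, severity) in compiled.items():
            m = rx.search(line)
            if m:
                yield {"event": name, "severity": severity, **m.groupdict()}

events = list(detect([
    "Jan 12 host sshd: auth failure from 10.0.0.7 user=svc_batch",
    "Jan 12 host sudo: alice : COMMAND=/bin/systemctl restart auditd",
]))
```

Adding a new event type is a config change, not a code change, which is what made the platform usable beyond the security team.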
Multi-pass relationship extraction system using the Claude API with pronoun resolution and iterative refinement across unstructured text corpora.
Formalizing ICL success conditions through an information-theoretic lens — mutual information between prompt features and output correctness, tractable bounds, and empirical predictions.
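One plausible way to state the object of study (my notation here, a sketch rather than a settled result): with prompt features $F$ and a binary correctness indicator $C$, the quantity of interest is

```latex
% Mutual information between prompt features F and output correctness C:
I(F;C) = H(C) - H(C \mid F)
       = \sum_{f} p(f) \sum_{c \in \{0,1\}} p(c \mid f) \,
         \log \frac{p(c \mid f)}{p(c)}
```

High $I(F;C)$ means success is predictable from prompt structure alone, which is what makes tractable bounds and testable empirical predictions possible.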
Empirical study of how memorization concentrates across transformer layers — identifying which layers disproportionately contribute to verbatim recall, using GPT-2 on an A100 GPU.
Hands-on exploration of post-training quantization on 7B-parameter models — per-layer error accumulation, accuracy/compression tradeoffs, and what GPTQ actually looks like from scratch.
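The baseline that per-layer error is measured against can be sketched with no model at all: symmetric per-channel int8 quantization of a weight matrix, dequantize, compare. This is a NumPy stand-in for round-to-nearest; GPTQ proper additionally uses second-order (Hessian) information to compensate the rounding error, which this sketch omits:

```python
import numpy as np

def quantize_per_channel(W, bits=8):
    """Symmetric per-output-channel quantization of a weight matrix."""
    qmax = 2 ** (bits - 1) - 1                    # 127 for int8
    scale = np.abs(W).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)      # guard all-zero rows
    Wq = np.clip(np.round(W / scale), -qmax - 1, qmax).astype(np.int8)
    return Wq, scale

def dequantize(Wq, scale):
    return Wq.astype(np.float32) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)  # toy "layer" weights
Wq, scale = quantize_per_channel(W)
err = np.abs(W - dequantize(Wq, scale)).mean()    # per-layer reconstruction error
```

Summing `err` layer by layer across a 7B model is exactly where the accumulation story (and the case for error compensation) shows up.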
Open to research collaborations, technical discussions, and consulting on LLM systems, graph analytics, or compliance AI in financial services.