A collection of AI agents, RAG systems, and ML pipelines I've built. Click any project to see the full story.
Multi-tenant CRM with automatic Row-Level Security at the ORM layer. Each company's data is completely isolated via JWT-injected tenant filters.
Building a CRM that could serve multiple companies on the same database without any data leakage, while keeping the codebase clean and maintainable.
Instead of adding WHERE company_id = ? to every query manually, I implemented Row-Level Security at the SQLAlchemy ORM level using event listeners. Every query automatically gets filtered by the company_id extracted from the JWT token.
@event.listens_for(db.session, 'do_orm_execute') decorator that injects with_loader_criteria on all queries
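In essence, the listener looks something like this (a sketch; TenantScoped and current_company_id are illustrative stand-ins, not the production code):

from sqlalchemy import event
from sqlalchemy.orm import Session, with_loader_criteria

class TenantScoped:
    """Assumed mixin: every tenant-owned model carries a company_id column."""

def current_company_id() -> int:
    """Assumed helper: returns the company_id claim from the verified JWT."""
    ...

@event.listens_for(Session, "do_orm_execute")
def add_tenant_filter(execute_state):
    # Only touch SELECTs issued by application code, not attribute lazy-loads
    if execute_state.is_select and not execute_state.is_column_load:
        company_id = current_company_id()
        execute_state.statement = execute_state.statement.options(
            with_loader_criteria(
                TenantScoped,
                lambda cls: cls.company_id == company_id,
                include_aliases=True,
            )
        )

Because the criteria is attached inside the ORM, it also applies to relationship loads and joins, which is exactly where hand-written WHERE clauses tend to get forgotten.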
ReAct-pattern agentic loop using Gemini 2.0 Flash with native function calling. The agent autonomously decides when to search the web, read emails, or fetch news.
Create an AI agent that can autonomously perform real-world tasks—not just chat, but actually take actions like searching the web, reading emails, and aggregating news.
Implemented the ReAct (Reasoning + Acting) pattern where the model thinks about what tool to use, executes it, observes the result, and decides the next step—all in a loop until the task is complete.
types.FunctionDeclaration schemas that expose each tool to the model
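The loop, roughly, looks like this (a sketch using the google-genai SDK; the single search_web tool stands in for the real web, email, and news tools):

from google import genai
from google.genai import types

client = genai.Client()  # assumes GEMINI_API_KEY is set in the environment

def search_web(query: str) -> str:
    """Stand-in tool; the real agent also has email and news tools."""
    return f"(top search results for: {query})"

search_decl = types.FunctionDeclaration(
    name="search_web",
    description="Search the web for current information.",
    parameters={"type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"]},
)
config = types.GenerateContentConfig(
    tools=[types.Tool(function_declarations=[search_decl])])

contents = [types.Content(role="user",
                          parts=[types.Part(text="What happened in AI this week?")])]
while True:
    resp = client.models.generate_content(
        model="gemini-2.0-flash", contents=contents, config=config)
    part = resp.candidates[0].content.parts[0]
    if part.function_call is None:        # no tool call requested: final answer
        print(part.text)
        break
    # Act on the model's decision, then feed the observation back into the loop
    result = search_web(**dict(part.function_call.args))
    contents.append(resp.candidates[0].content)
    contents.append(types.Content(
        role="user",
        parts=[types.Part.from_function_response(
            name=part.function_call.name, response={"result": result})]))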
RAG system supporting PDFs, images, audio, and spreadsheets. Uses Gemini Vision for OCR and SentenceTransformers for semantic retrieval.
Build a document search system that can answer questions across any file type—not just text files, but scanned documents, photos of whiteboards, meeting recordings, and spreadsheets.
Created a unified pipeline where every file gets converted to text (using the right tool for each format), then embedded into a vector space for semantic search. The AI answers based strictly on uploaded content.
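The core of that pipeline, sketched (the extractor functions and the embedding model name are placeholders; the real extractors call a PDF parser, Gemini Vision OCR, local speech-to-text, and a spreadsheet reader):

from pathlib import Path
from sentence_transformers import SentenceTransformer

# Placeholder extractors standing in for the per-format tools
def extract_pdf(p: Path) -> str:   return "pdf text"
def extract_image(p: Path) -> str: return "ocr text"
def extract_audio(p: Path) -> str: return "transcript"
def extract_sheet(p: Path) -> str: return "rows as text"

EXTRACTORS = {".pdf": extract_pdf, ".png": extract_image, ".jpg": extract_image,
              ".mp3": extract_audio, ".wav": extract_audio, ".xlsx": extract_sheet}

embedder = SentenceTransformer("all-MiniLM-L6-v2")

def index_file(path: str):
    p = Path(path)
    extract = EXTRACTORS.get(p.suffix.lower(), lambda f: f.read_text())
    text = extract(p)
    chunks = [text[i:i + 800] for i in range(0, len(text), 800)]  # naive fixed-size chunks
    vectors = embedder.encode(chunks, normalize_embeddings=True)
    return list(zip(chunks, vectors))  # goes into the vector store for semantic search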
DQN agent trained in a custom Gymnasium environment to learn optimal server scaling policies. Outperforms threshold-based autoscalers on the cost-latency tradeoff.
Traditional autoscalers use simple thresholds (if CPU > 80%, add server). But this leads to cold-start latency and wasted idle resources. Can an AI learn a smarter policy?
Built a custom Gymnasium environment simulating realistic traffic patterns (sinusoidal + random flux). Trained a DQN agent to minimize a reward function balancing utilization efficiency vs. server costs.
ServerEnv(gym.Env) with observation space [CPU Load, Normalized Server Count] and action space [Scale Down, Hold, Scale Up]
Trained policy weights saved as a .pth checkpoint for deployment
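A stripped-down sketch of what that environment can look like (the traffic model and reward weights here are illustrative, not the tuned values):

import numpy as np
import gymnasium as gym
from gymnasium import spaces

class ServerEnv(gym.Env):
    ACTIONS = (-1, 0, 1)  # Scale Down, Hold, Scale Up

    def __init__(self, max_servers: int = 20):
        super().__init__()
        self.max_servers = max_servers
        self.observation_space = spaces.Box(0.0, 1.0, shape=(2,), dtype=np.float32)
        self.action_space = spaces.Discrete(3)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t, self.servers = 0, 2
        return self._obs(0.5), {}

    def step(self, action):
        self.servers = int(np.clip(self.servers + self.ACTIONS[action], 1, self.max_servers))
        self.t += 1
        # Sinusoidal daily pattern plus random flux, roughly as in the simulator
        traffic = 0.5 + 0.4 * np.sin(2 * np.pi * self.t / 288) + self.np_random.normal(0, 0.05)
        cpu = float(np.clip(traffic * 10 / self.servers, 0.0, 1.0))
        # Reward trades off utilization (target ~70% CPU) against per-server cost
        reward = -abs(cpu - 0.7) - 0.05 * self.servers
        return self._obs(cpu), reward, False, self.t >= 288, {}

    def _obs(self, cpu):
        return np.array([cpu, self.servers / self.max_servers], dtype=np.float32)

The DQN only ever sees the two-value observation and three discrete actions, which keeps training fast while still capturing the cost-latency tradeoff.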
100% offline voice assistant running Vosk STT, Ministral 3B (GGUF), and Kokoro TTS locally. No internet, no cloud, complete privacy.
Build a voice assistant that works entirely offline—no API calls, no cloud dependencies. Must run on consumer hardware with acceptable latency.
Carefully selected lightweight models for each stage of the pipeline: Vosk for fast local STT, a quantized GGUF model via llama.cpp for reasoning, and Kokoro for natural-sounding TTS. Optimized the loop to minimize latency.
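The core loop, sketched (model paths are placeholders, audio capture/playback is omitted, and speak() stands in for the Kokoro TTS call):

import json
from vosk import Model, KaldiRecognizer
from llama_cpp import Llama

stt = Model("models/vosk-small-en")                              # illustrative path
rec = KaldiRecognizer(stt, 16000)
llm = Llama(model_path="models/ministral-3b.Q4_K_M.gguf", n_ctx=4096)

def speak(text: str):
    """Placeholder: Kokoro synthesizes the reply and plays it through the speakers."""
    ...

def handle_chunk(pcm_bytes: bytes):
    """Feed 16 kHz mono PCM; when an utterance completes, answer and speak."""
    if not rec.AcceptWaveform(pcm_bytes):
        return
    text = json.loads(rec.Result()).get("text", "")
    if not text:
        return
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": text}], max_tokens=200)
    speak(out["choices"][0]["message"]["content"])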
Background service that polls Gmail, generates executive-style AI summaries, and delivers briefings to Slack or WhatsApp.
Email overload is real. I wanted a system that would monitor my inbox, intelligently summarize what matters, and push a briefing to my preferred channel—without me checking email constantly.
Built a polling service that reads unread emails (including attachments), uses Gemini to generate human-like summaries, and forwards them to Slack/WhatsApp webhooks.
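One polling cycle, in outline (a sketch; fetch_unread() stands in for the Gmail call and the webhook URL is a placeholder):

import time
import requests
from google import genai

client = genai.Client()
SLACK_WEBHOOK = "https://hooks.slack.com/services/..."   # placeholder

def fetch_unread() -> list[str]:
    """Stand-in for the Gmail fetch that returns unread bodies plus attachment text."""
    return []

def briefing_cycle():
    emails = fetch_unread()
    if not emails:
        return
    prompt = ("Summarize these emails as a short executive briefing, "
              "grouped by urgency:\n\n" + "\n---\n".join(emails))
    summary = client.models.generate_content(
        model="gemini-2.0-flash", contents=prompt).text
    requests.post(SLACK_WEBHOOK, json={"text": summary}, timeout=10)

while True:
    briefing_cycle()
    time.sleep(300)   # poll every 5 minutes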
SQL product catalog with FAISS vector embeddings for natural language queries. Ask about products in plain English, get structured answers.
Traditional product databases require exact queries. I wanted to enable natural language questions like "What's a good laptop under $1000 for video editing?" against a structured catalog.
Embedded all product data (name, specs, features, reviews) using SentenceTransformers, stored in FAISS for fast similarity search. The AI retrieves relevant products and generates contextual responses.
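The retrieval side, sketched (the catalog rows and embedding model name are illustrative):

import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

embedder = SentenceTransformer("all-MiniLM-L6-v2")

# Each product row is flattened into one text blob: name, specs, features, reviews
products = [
    "UltraBook 14 | 32GB RAM, RTX 4060 | great for video editing | $999",
    "BudgetBook 15 | 8GB RAM, integrated GPU | fine for browsing | $449",
]
vectors = embedder.encode(products, normalize_embeddings=True)

index = faiss.IndexFlatIP(vectors.shape[1])   # inner product == cosine on normalized vectors
index.add(np.asarray(vectors, dtype="float32"))

def retrieve(question: str, k: int = 3):
    q = embedder.encode([question], normalize_embeddings=True)
    _, ids = index.search(np.asarray(q, dtype="float32"), k)
    # The retrieved rows are then passed to the LLM to generate the answer
    return [products[i] for i in ids[0] if i != -1]

print(retrieve("good laptop under $1000 for video editing?"))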
AI-powered resume scorer with human-in-the-loop feedback. Uses Pinecone to learn each user's preferences over time.
Generic resume scorers don't account for individual hiring manager preferences. I wanted a system that learns from feedback—when you disagree with a score, it remembers why.
Built a feedback loop where user corrections get embedded and stored in Pinecone. On subsequent evaluations, the system retrieves similar past feedback to adjust its scoring criteria.
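The feedback loop, sketched (index name, metadata fields, and the embedding model are illustrative; the Pinecone index is assumed to already exist with the matching dimension):

from pinecone import Pinecone
from sentence_transformers import SentenceTransformer

pc = Pinecone(api_key="...")                # placeholder key
index = pc.Index("resume-feedback")         # assumed pre-created index
embedder = SentenceTransformer("all-MiniLM-L6-v2")

def store_feedback(feedback_id: str, resume_text: str, correction: str):
    """Called when the user disagrees with a score and explains why."""
    vec = embedder.encode(resume_text).tolist()
    index.upsert(vectors=[{"id": feedback_id, "values": vec,
                           "metadata": {"correction": correction}}])

def recall_feedback(resume_text: str, k: int = 3) -> list[str]:
    """Retrieve corrections on similar resumes to steer the next evaluation."""
    vec = embedder.encode(resume_text).tolist()
    result = index.query(vector=vec, top_k=k, include_metadata=True)
    return [m.metadata["correction"] for m in result.matches]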
Real-time web search augmented responses. Fast mode (6s) uses snippets, deep mode (20s) scrapes full pages for comprehensive answers.
LLM knowledge has a cutoff date. For questions about current events, prices, or news, I needed a system that can search the web and synthesize real-time information.
Built a pipeline that converts user queries into optimized search terms, fetches top results via Google Custom Search API, and feeds the content to Gemini for synthesis.
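Fast mode in a nutshell (a sketch; credentials are placeholders and the query-rewriting step is simplified away):

import requests
from google import genai

client = genai.Client()
CSE_KEY, CSE_ID = "...", "..."   # Google Custom Search credentials (placeholders)

def fast_answer(question: str) -> str:
    # 1. Fetch top results; fast mode uses only the snippets (deep mode scrapes full pages)
    r = requests.get("https://www.googleapis.com/customsearch/v1",
                     params={"key": CSE_KEY, "cx": CSE_ID, "q": question, "num": 6},
                     timeout=10)
    items = r.json().get("items", [])
    snippets = "\n".join(f"- {i['title']}: {i.get('snippet', '')}" for i in items)
    # 2. Synthesize an answer grounded only in the fetched snippets
    prompt = f"Answer using only these search results:\n{snippets}\n\nQuestion: {question}"
    return client.models.generate_content(model="gemini-2.0-flash", contents=prompt).text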
LangChain tool agent that dynamically constructs API calls from JSON schema. Give it a bank's API spec and it figures out how to query it.
Banks have complex APIs with many endpoints. I wanted an agent that could take a natural language query like "What's my account balance?" and automatically figure out which API to call.
Feed the agent a structured JSON schema of all bank endpoints. It reads the schema, constructs the right URL, makes the API call, and synthesizes the response—all autonomously.
create_agent() with a custom tool for API requests
make_api_request() with rate limiting and error handling
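Roughly how the tool side can look (a sketch; the schema file, base URL, and retry policy are illustrative):

import json
import time
import requests
from langchain_core.tools import tool

API_SCHEMA = json.load(open("bank_api_schema.json"))   # endpoint spec also fed to the agent

@tool
def make_api_request(endpoint: str, params: str = "{}") -> str:
    """Call a bank API endpoint from the schema. `params` is a JSON object of query parameters."""
    url = API_SCHEMA["base_url"] + endpoint
    for attempt in range(3):                            # simple retry / rate-limit handling
        resp = requests.get(url, params=json.loads(params), timeout=10)
        if resp.status_code == 429:
            time.sleep(2 ** attempt)
            continue
        if resp.ok:
            return resp.text
        return f"Error {resp.status_code}: {resp.text[:200]}"
    return "Rate limited; giving up."

The agent receives this tool plus the schema in its prompt, so for a question like "What's my account balance?" it picks the endpoint and parameters on its own.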
Web search powered news aggregation with automatic summarization. Get caught up on any topic in seconds.
News is scattered across many sources. I wanted a simple way to ask "What's happening with [topic]?" and get a synthesized summary from multiple sources.
Combined web search with article extraction. The system searches for recent news, extracts key content from top results, and generates a cohesive summary.
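The extraction step, sketched (search_news() stands in for the web search call; real article parsing needs more care than pulling every paragraph tag):

import requests
from bs4 import BeautifulSoup

def search_news(topic: str) -> list[str]:
    """Stand-in: returns URLs of recent articles from the search step."""
    return []

def collect_articles(topic: str, max_chars: int = 3000) -> list[str]:
    texts = []
    for url in search_news(topic)[:5]:
        html = requests.get(url, timeout=10).text
        soup = BeautifulSoup(html, "html.parser")
        paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
        texts.append(" ".join(paragraphs)[:max_chars])
    return texts   # then handed to the LLM for one cohesive summary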
Interactive Q&A over any local codebase. Point it at a project folder and ask questions about the code in natural language.
Understanding a new codebase takes time. I wanted a tool where I could load any project and immediately start asking "What does this function do?" or "How are these modules connected?"
Scan all code files in a project, store their content, and feed it as context to Gemini. Maintain chat history for follow-up questions that build on previous answers.
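The loading-and-asking loop, sketched (the file filters and character budget are simplifications of the real context management):

from pathlib import Path
from google import genai

client = genai.Client()
CODE_EXTS = {".py", ".js", ".ts", ".go", ".java"}

def load_codebase(root: str, budget: int = 200_000) -> str:
    """Concatenate source files (path header + contents) up to a rough character budget."""
    parts, used = [], 0
    for f in sorted(Path(root).rglob("*")):
        if f.is_file() and f.suffix in CODE_EXTS and used < budget:
            text = f.read_text(errors="ignore")
            parts.append(f"### {f}\n{text}")
            used += len(text)
    return "\n\n".join(parts)

def ask(codebase: str, history: list[str], question: str) -> str:
    history.append(f"User: {question}")
    prompt = f"Codebase:\n{codebase}\n\nConversation so far:\n" + "\n".join(history)
    answer = client.models.generate_content(model="gemini-2.0-flash", contents=prompt).text
    history.append(f"Assistant: {answer}")   # kept so follow-ups build on earlier answers
    return answer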