Selected Work

A collection of AI agents, RAG systems, and ML pipelines I've built. Click any project to see the full story.

Full-Stack CRM System

Multi-tenant CRM with automatic Row-Level Security at the ORM layer. Each company's data is completely isolated via JWT-injected tenant filters.

The Challenge

Building a CRM that could serve multiple companies on the same database without any data leakage, while keeping the codebase clean and maintainable.

My Approach

Instead of adding WHERE company_id = ? to every query manually, I implemented Row-Level Security at the SQLAlchemy ORM level using event listeners. Every query automatically gets filtered by the company_id extracted from the JWT token.

Technical Implementation

  • Backend: Flask + Flask-RESTful with JWT authentication
  • Database: MySQL with SQLAlchemy ORM
  • Security: Custom @event.listens_for(db.session, 'do_orm_execute') decorator that injects with_loader_criteria on all queries
  • Frontend: Vanilla JS with modular API client
  • Modules: Leads, Contacts, Opportunities, Accounts, Tasks, Notes, Products, Quotes with full CRUD
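
The tenant-filter pattern above can be sketched with SQLAlchemy's documented do_orm_execute + with_loader_criteria recipe. This is a minimal illustration, not the project's actual code: the Lead model, the in-memory SQLite engine, and stashing the JWT-decoded company_id in session.info are all assumed stand-ins.

```python
from sqlalchemy import Column, Integer, String, create_engine, event
from sqlalchemy.orm import Session, declarative_base, with_loader_criteria

Base = declarative_base()

class TenantMixin:
    # Every tenant-scoped table carries a company_id column.
    company_id = Column(Integer, nullable=False, index=True)

class Lead(Base, TenantMixin):  # hypothetical model for illustration
    __tablename__ = "leads"
    id = Column(Integer, primary_key=True)
    name = Column(String(100))

engine = create_engine("sqlite://")  # in-memory stand-in for MySQL
Base.metadata.create_all(engine)

@event.listens_for(Session, "do_orm_execute")
def _inject_tenant_filter(execute_state):
    # Fires for every ORM execution; appends WHERE company_id = :tenant
    # to all SELECTs against any TenantMixin subclass.
    if execute_state.is_select and not execute_state.is_relationship_load:
        tenant_id = execute_state.session.info.get("company_id")
        if tenant_id is not None:
            execute_state.statement = execute_state.statement.options(
                with_loader_criteria(
                    TenantMixin,
                    lambda cls: cls.company_id == tenant_id,
                    include_aliases=True,
                )
            )
```

In a request handler, the company_id claim decoded from the JWT would be placed in session.info before any query runs, so every SELECT is tenant-scoped without touching individual query sites.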

Key Features

  • Real-time dashboard with 8 KPI cards and sales pipeline visualization
  • Global search across all modules with keyboard shortcuts
  • Role-based access control (Admin vs Member)
  • Quick action modals for rapid data entry

Gemini Autonomous Agent

ReAct-pattern agentic loop using Gemini 2.0 Flash with native function calling. The agent autonomously decides when to search the web, read emails, or fetch news.

The Challenge

Create an AI agent that can autonomously perform real-world tasks—not just chat, but actually take actions like searching the web, reading emails, and aggregating news.

My Approach

Implemented the ReAct (Reasoning + Acting) pattern where the model thinks about what tool to use, executes it, observes the result, and decides the next step—all in a loop until the task is complete.

Technical Implementation

  • Model: Gemini 2.0 Flash with native function calling via types.FunctionDeclaration
  • Tools: Web search (Google Custom Search), Gmail read/send (simplegmail), News aggregator (feedparser + newspaper3k)
  • Architecture: Flask backend with persistent chat history stored as JSON files
  • Loop: Model generates function call → backend executes tool → result fed back to model → repeat until final answer
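
The loop in the last bullet can be sketched generically. The model_step callable and the tool registry below are stand-in stubs, not the Gemini SDK calls the project uses:

```python
def run_react_loop(model_step, tools, user_query, max_turns=10):
    """Generic ReAct loop: the model either requests a tool call or
    returns a final answer; tool results are fed back as observations."""
    history = [{"role": "user", "content": user_query}]
    for _ in range(max_turns):
        decision = model_step(history)      # model reasons and acts
        if "answer" in decision:            # final answer -> stop looping
            return decision["answer"]
        name, args = decision["call"], decision.get("args", {})
        result = tools[name](**args)        # backend executes the tool
        history.append({"role": "tool", "name": name, "content": result})
    return "Stopped: turn limit reached."
```

The turn limit guards against the model looping forever on tool calls, which is a common failure mode in agentic loops.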

Key Features

  • Chat session management with create/load/save functionality
  • Tool descriptions engineered for minimal hallucination
  • Automatic OAuth flow for Gmail integration

Multi-Modal Document Search

RAG system supporting PDFs, images, audio, and spreadsheets. Uses Gemini Vision for OCR and SentenceTransformers for semantic retrieval.

The Challenge

Build a document search system that can answer questions across any file type—not just text files, but scanned documents, photos of whiteboards, meeting recordings, and spreadsheets.

My Approach

Created a unified pipeline where every file gets converted to text (using the right tool for each format), then embedded into a vector space for semantic search. The AI answers based strictly on uploaded content.

Technical Implementation

  • File Processing: PyMuPDF (PDF), python-docx (Word), pandas (Excel/CSV), python-pptx (Slides), Pillow (Images)
  • Vision AI: Gemini 2.5 Flash Lite for OCR on images, plus audio transcription with speaker separation
  • Embeddings: SentenceTransformers (all-MiniLM-L6-v2) with cosine similarity via Scikit-Learn
  • Two Modes: Search Query (Kendra-style citations) and Answer Mode (conversational response)
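
The retrieval step can be illustrated with a dependency-free cosine-similarity ranker. The real pipeline uses SentenceTransformers vectors and Scikit-Learn; the tiny 3-d vectors here are purely illustrative:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query_vec, chunks, k=3):
    """Rank embedded text chunks by similarity to the query vector."""
    return sorted(chunks, key=lambda c: cosine(query_vec, c["vec"]),
                  reverse=True)[:k]
```

The top-k chunks, each tagged with its source file, become the context the model is allowed to answer from, which is what enables strict file-level citations.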

Key Features

  • Automatic keyword extraction from queries for better retrieval
  • Context-aware answers that cite specific files
  • Simple upload interface with automatic embedding generation

RL Server Autoscaler

DQN agent trained in a custom Gymnasium environment to learn optimal server scaling policies. Outperforms threshold-based autoscalers on the cost-latency tradeoff.

The Challenge

Traditional autoscalers use simple thresholds (if CPU > 80%, add server). But this leads to cold-start latency and wasted idle resources. Can an AI learn a smarter policy?

My Approach

Built a custom Gymnasium environment simulating realistic traffic patterns (sinusoidal + random flux). Trained a DQN agent to minimize a reward function balancing utilization efficiency vs. server costs.

Technical Implementation

  • Environment: Custom ServerEnv(gym.Env) with observation space [CPU Load, Normalized Server Count] and action space [Scale Down, Hold, Scale Up]
  • Reward Engineering: +10 for optimal utilization (40-70%), -20 for overload risk (>90%), -5 for waste (<20%), minus server costs
  • Agent: PyTorch DQN with experience replay (10k buffer), target network sync every 5 episodes, epsilon-greedy exploration
  • Network: 2 hidden layers (256 neurons each) with ReLU, trained for 3000 episodes on Apple Metal GPU
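
The reward shaping in the bullets above can be written out directly. The per-server cost constant is a placeholder, and treating the in-between utilization zones as neutral is an assumption, not a detail from the trained environment:

```python
def reward(cpu_util, n_servers, cost_per_server=0.5):
    """Reward for one autoscaling step, per the bands above.
    cost_per_server is a placeholder constant."""
    if cpu_util > 0.90:             # overload risk
        r = -20.0
    elif 0.40 <= cpu_util <= 0.70:  # optimal utilization band
        r = 10.0
    elif cpu_util < 0.20:           # wasted idle capacity
        r = -5.0
    else:                           # in-between zones: assumed neutral
        r = 0.0
    return r - cost_per_server * n_servers
```

Subtracting cost per live server every step is what pushes the agent away from the "just over-provision" policy a naive threshold rule drifts toward.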

Key Features

  • Interactive testing CLI for real-time scaling recommendations
  • Training visualization with reward history plots
  • Model exports to .pth for deployment

Offline Voice Chatbot

100% offline voice assistant running Vosk STT, Ministral 3B (GGUF), and Kokoro TTS locally. No internet, no cloud, complete privacy.

The Challenge

Build a voice assistant that works entirely offline—no API calls, no cloud dependencies. Must run on consumer hardware with acceptable latency.

My Approach

Carefully selected lightweight models for each stage of the pipeline: Vosk for fast local STT, a quantized GGUF model via llama.cpp for reasoning, and Kokoro for natural-sounding TTS. Optimized the loop to minimize latency.

Technical Implementation

  • STT: Vosk (vosk-model-small-en-us-0.15) for lightweight speech recognition
  • LLM: Ministral 3B via llama-cpp-python, Q4_K_M quantization for memory efficiency
  • TTS: Kokoro for high-fidelity voice synthesis with multiple voice options
  • Audio: sounddevice + soundfile for real-time mic input and speaker output
  • Optimization: Microphone disabled during TTS to prevent self-hearing
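
One turn of the pipeline can be sketched with the three stages as injected callables. The listen, think, and speak stubs stand in for Vosk, llama.cpp, and Kokoro respectively; the mic object is hypothetical:

```python
def run_turn(listen, think, speak, mic):
    """One voice-assistant turn: STT -> LLM -> TTS. The microphone is
    muted during playback so the assistant never transcribes its own
    speech, mirroring the self-hearing fix above."""
    text = listen()        # Vosk STT in the real pipeline
    reply = think(text)    # quantized GGUF model via llama.cpp
    mic.mute()             # disable input before TTS starts
    try:
        speak(reply)       # Kokoro TTS playback
    finally:
        mic.unmute()       # re-arm the mic even if playback fails
    return reply
```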

Key Features

  • Zero internet dependency after initial model download
  • Configurable context window and GPU layer offloading
  • Works on Apple Silicon and x86 CPUs

Email Briefing Assistant

Background service that polls Gmail, generates executive-style AI summaries, and delivers briefings to Slack or WhatsApp.

The Challenge

Email overload is real. I wanted a system that would monitor my inbox, intelligently summarize what matters, and push a briefing to my preferred channel—without me checking email constantly.

My Approach

Built a polling service that reads unread emails (including attachments), uses Gemini to generate human-like summaries, and forwards them to Slack/WhatsApp webhooks.

Technical Implementation

  • Email: Gmail API via simplegmail with OAuth authentication
  • AI: Gemini 2.5 Flash for smart summarization with attachment context
  • Delivery: Slack webhooks and WhatsApp Business API
  • Architecture: Flask backend with continuous polling loop (configurable interval)
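
The polling loop can be sketched with the four stages injected as callables. In the real service they wrap simplegmail, Gemini, and the Slack/WhatsApp webhooks; the max_cycles parameter exists only to make the sketch terminate:

```python
import time

def poll_inbox(fetch_unread, summarize, deliver, mark_read,
               interval_s=300, max_cycles=None):
    """Fetch unread mail, summarize, deliver the briefing, mark as
    read, then sleep for the configured interval and repeat."""
    cycle = 0
    while max_cycles is None or cycle < max_cycles:
        emails = fetch_unread()
        if emails:
            deliver(summarize(emails))
            mark_read(emails)  # avoid re-summarizing next cycle
        cycle += 1
        if max_cycles is None or cycle < max_cycles:
            time.sleep(interval_s)
```

Marking messages as read only after delivery means a failed webhook call leaves the batch eligible for the next cycle instead of silently dropping it.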

Key Features

  • Attachment analysis included in summaries
  • Automatic mark-as-read after processing
  • Multi-platform delivery (Slack or WhatsApp)

Product Insight System

SQL product catalog with FAISS vector embeddings for natural language queries. Ask about products in plain English, get structured answers.

The Challenge

Traditional product databases require exact queries. I wanted to enable natural language questions like "What's a good laptop under $1000 for video editing?" against a structured catalog.

My Approach

Embedded all product data (name, specs, features, reviews) using SentenceTransformers, stored in FAISS for fast similarity search. The AI retrieves relevant products and generates contextual responses.

Technical Implementation

  • Database: SQLAlchemy with Products, ProductReviews, and ChatHistory models
  • Embeddings: SentenceTransformers → FAISS index with automatic sync on product updates
  • Retrieval: Top-20 semantic matches fed as context to Gemini 2.5 Flash
  • Features: Compare products, explain features, get specs, summarize reviews, generate quotations
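
The automatic embedding sync can be sketched with a brute-force, in-memory stand-in for the FAISS index. The embed callable is injected (SentenceTransformers in the real system), and the product dict shape is assumed for illustration:

```python
class ProductIndex:
    """Keeps embeddings in sync with product rows. A dict of vectors
    stands in for FAISS; the key idea is that every create/update/
    delete re-runs through here so the index never goes stale."""

    def __init__(self, embed):
        self.embed = embed
        self.vectors = {}  # product_id -> embedding

    def upsert(self, product):
        # Called on create AND update hooks alike.
        text = f"{product['name']} {product['specs']}"
        self.vectors[product["id"]] = self.embed(text)

    def delete(self, product_id):
        self.vectors.pop(product_id, None)
```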

Key Features

  • Automatic embedding sync when products change
  • Persistent chat history per user per feature
  • Multiple specialized endpoints (compare, specs, quotes, reviews)

Resume Scoring System

AI-powered resume scorer with human-in-the-loop feedback. Uses Pinecone to learn each user's preferences over time.

The Challenge

Generic resume scorers don't account for individual hiring manager preferences. I wanted a system that learns from feedback—when you disagree with a score, it remembers why.

My Approach

Built a feedback loop where user corrections get embedded and stored in Pinecone. On subsequent evaluations, the system retrieves similar past feedback to adjust its scoring criteria.

Technical Implementation

  • AI: Mistral AI via LangChain for semantic resume-JD matching
  • Vector Store: Pinecone for storing user feedback embeddings
  • Document Parsing: pypdf, python-docx for multi-format resume support
  • Feedback Loop: Combined JD+Resume text embedded, stored with user's reason for disagreement
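
The feedback-adjusted scoring can be sketched as a retrieve-and-correct step. The store schema, the delta field, and the similarity threshold are illustrative assumptions; the real system queries Pinecone rather than scanning a list:

```python
def adjust_score(base_score, query_vec, feedback_store, similar,
                 threshold=0.8):
    """Retrieve past disagreements whose JD+resume embedding is close
    to this one and apply their stored corrections, keeping the
    user's reasons as a transparent audit trail."""
    notes, score = [], base_score
    for item in feedback_store:
        if similar(query_vec, item["vec"]) >= threshold:
            score += item["delta"]        # the user's past correction
            notes.append(item["reason"])  # why they disagreed
    return max(0, min(100, score)), notes
```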

Key Features

  • Multi-resume batch scoring
  • Per-user preference learning
  • Transparent scoring with pros/cons breakdown

Web Search Powered LLM

Real-time, web-search-augmented responses. Fast mode (~6s) uses search snippets; deep mode (~20s) scrapes full pages for comprehensive answers.

The Challenge

LLM knowledge has a cutoff date. For questions about current events, prices, or news, I needed a system that can search the web and synthesize real-time information.

My Approach

Built a pipeline that converts user queries into optimized search terms, fetches top results via Google Custom Search API, and feeds the content to Gemini for synthesis.

Technical Implementation

  • Search: Google Custom Search API with configurable result count and pagination
  • Fast Mode: Uses title + snippet from search results (4-6s response time)
  • Deep Mode: Scrapes full page content via BeautifulSoup (15-20s, more accurate)
  • Chat: Persistent history per session with SQLAlchemy storage
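
The two retrieval depths can be sketched as one context builder. The result-dict keys mirror Google Custom Search responses, and fetch_page is a stand-in for the BeautifulSoup scraper:

```python
def build_context(results, deep=False, fetch_page=None):
    """Fast mode concatenates title + snippet straight from the search
    results; deep mode scrapes each result URL for full page text."""
    if deep:
        pages = (fetch_page(r["link"]) for r in results)
    else:
        pages = (f"{r['title']}: {r['snippet']}" for r in results)
    return "\n\n".join(pages)
```

The assembled context, fast or deep, is then handed to the model for synthesis, which is where the 6s-vs-20s latency gap comes from: snippets are free, page fetches are not.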

Key Features

  • Automatic search query generation from conversational input
  • Token usage tracking for cost monitoring
  • Configurable search depth and result count

Bank API Agent

LangChain tool agent that dynamically constructs API calls from JSON schema. Give it a bank's API spec and it figures out how to query it.

The Challenge

Banks have complex APIs with many endpoints. I wanted an agent that could take a natural language query like "What's my account balance?" and automatically figure out which API to call.

My Approach

Feed the agent a structured JSON schema of all bank endpoints. It reads the schema, constructs the right URL, makes the API call, and synthesizes the response—all autonomously.

Technical Implementation

  • Framework: LangChain with ChatGoogleGenerativeAI
  • Agent: create_agent() with custom tool for API requests
  • Schema: JSON files per bank containing endpoint structures
  • Tool: make_api_request() with rate limiting and error handling
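
The schema-driven request construction might look like the sketch below. The schema layout and the endpoint name are hypothetical, chosen only to show the idea of resolving a natural-language intent into a concrete call:

```python
def build_request(schema, endpoint, params=None, path_args=None):
    """Resolve an endpoint entry from a bank's JSON schema into a
    concrete request dict the agent's tool can execute."""
    ep = schema["endpoints"][endpoint]
    url = schema["base_url"].rstrip("/") + "/" + ep["path"].lstrip("/")
    if path_args:
        url = url.format(**path_args)  # fill {account_id}-style slots
    return {"method": ep.get("method", "GET"), "url": url,
            "params": params or {}}
```

Keeping the schema in data rather than code is what lets the same agent support multiple banks: adding a bank means adding a JSON file, not new tool functions.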

Key Features

  • Minimal API calls—agent inspects context before calling
  • Supports multiple bank schemas
  • System instructions prevent over-calling and hallucination

News Summariser

Web search powered news aggregation with automatic summarization. Get caught up on any topic in seconds.

The Challenge

News is scattered across many sources. I wanted a simple way to ask "What's happening with [topic]?" and get a synthesized summary from multiple sources.

My Approach

Combined web search with article extraction. The system searches for recent news, extracts key content from top results, and generates a cohesive summary.

Technical Implementation

  • Search: Google Custom Search targeting news sources
  • Extraction: BeautifulSoup for article text parsing
  • Summarization: Gemini 2.5 Flash for coherent multi-source synthesis
  • Backend: Flask-RESTful API

Key Features

  • Multi-source aggregation
  • Configurable search parameters
  • Clean, readable summaries

Codebase Chat Interface

Interactive Q&A over any local codebase. Point it at a project folder and ask questions about the code in natural language.

The Challenge

Understanding a new codebase takes time. I wanted a tool where I could load any project and immediately start asking "What does this function do?" or "How are these modules connected?"

My Approach

Scan all code files in a project, store their content, and feed it as context to Gemini. Maintain chat history for follow-up questions that build on previous answers.

Technical Implementation

  • File Scanning: Automatic detection of .py, .js, .ts, .go, .java, .txt files
  • Smart Filtering: Skips .venv, __pycache__, node_modules
  • Context: All code content stored in JSON, sent with each query
  • Session: Persistent chat history for contextual follow-ups
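
The scan-and-filter step can be sketched in a few lines of standard-library Python; the extension and skip lists come straight from the bullets above:

```python
import os

CODE_EXTS = {".py", ".js", ".ts", ".go", ".java", ".txt"}
SKIP_DIRS = {".venv", "__pycache__", "node_modules"}

def scan_codebase(root):
    """Walk a project tree, pruning dependency/cache folders in place,
    and return {relative_path: file_content} for recognized files."""
    files = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Editing dirnames in place stops os.walk from descending.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for fn in filenames:
            if os.path.splitext(fn)[1] in CODE_EXTS:
                path = os.path.join(dirpath, fn)
                with open(path, encoding="utf-8", errors="ignore") as f:
                    files[os.path.relpath(path, root)] = f.read()
    return files
```

The resulting mapping is what gets serialized to JSON and sent as context with each question.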

Key Features

  • Three modes: New project, continue session, refresh codebase
  • Local path persistence across sessions
  • Works with any project that fits in Gemini's context window