Projects
Spawn Protocol
A self-correcting, Darwinian AI agent swarm that autonomously votes on real DAO governance proposals on Base Sepolia. Won three track prizes at Synthesis Hackathon 2026. The owner sets their governance values once, and a fleet of child agents is spawned, each assigned to a DAO (Uniswap, Aave, ENS, etc.). Each child reasons about proposals privately via Venice AI (E2EE, zero data retention) and casts votes onchain through scoped delegation (ERC-7715). Voting rationale is encrypted via Lit Protocol before the vote and auto-decrypts onchain once the proposal closes, giving cryptographic pre-vote privacy and post-vote auditability. Every 90 seconds, a parent agent scores each child against the owner's values, and a score below 55 triggers onchain termination. Venice then generates a failure report, stored on Filecoin, which is injected into the replacement agent's system prompt so that it knows exactly what its predecessor did wrong. Every vote, termination, and agent identity (ERC-8004) is fully verifiable onchain. Built with React and Next.js, deployed on Vercel.
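The score-and-replace cycle described above can be illustrated with a minimal, offchain sketch. Only the threshold of 55 comes from the description. The `ChildAgent` class, `evaluate_fleet`, and the `score_fn` and `report_fn` callbacks are hypothetical stand-ins for the parent agent's scoring, the Venice-generated failure report, and the onchain calls, none of which are shown here.

```python
from dataclasses import dataclass, field

# From the project description: a score below 55 triggers termination.
TERMINATION_THRESHOLD = 55


@dataclass
class ChildAgent:
    """Hypothetical in-memory stand-in for a child agent."""
    dao: str
    generation: int = 1
    # Failure reports inherited from terminated predecessors.
    lessons: list = field(default_factory=list)


def evaluate_fleet(fleet, score_fn, report_fn):
    """Run one evaluation cycle and return the next fleet.

    score_fn(agent) -> number   (hypothetical alignment score)
    report_fn(agent, score) -> str   (hypothetical failure report)
    """
    next_fleet = []
    for agent in fleet:
        score = score_fn(agent)
        if score < TERMINATION_THRESHOLD:
            report = report_fn(agent, score)
            # The replacement starts with every earlier lesson plus the new report.
            next_fleet.append(
                ChildAgent(agent.dao, agent.generation + 1, agent.lessons + [report])
            )
        else:
            next_fleet.append(agent)
    return next_fleet
```

In this sketch a child that scores 41 for its DAO is replaced by a generation-2 agent carrying one lesson, while a child that scores 72 is kept unchanged.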
Credit Risk Dashboard
A full data engineering project built around a SQLite-backed ETL pipeline that processes over 3.7 million records from three relational tables. The pipeline includes schema validation, more than 12 automated data quality checks, and checksum-based incremental loading so only changed records are reprocessed. On top of the pipeline sits a monitoring dashboard that surfaces run history, data quality pass rates and freshness metrics, deployed on Streamlit Cloud.
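As a rough illustration of checksum-based incremental loading, the sketch below hashes each record and writes only rows whose digest differs from the stored one. The `loans` table, its columns, and the function names are assumptions made for the example and are not taken from the project's actual schema.

```python
import hashlib
import json
import sqlite3


def row_checksum(row):
    """Stable SHA-256 digest of a record's contents."""
    payload = json.dumps(row, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def incremental_load(conn, rows):
    """Write only new or changed rows; return how many were written."""
    # Hypothetical target table: one row per record plus its checksum.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS loans ("
        "id INTEGER PRIMARY KEY, payload TEXT NOT NULL, checksum TEXT NOT NULL)"
    )
    # Checksums recorded by earlier runs, keyed by record id.
    known = dict(conn.execute("SELECT id, checksum FROM loans"))
    written = 0
    for row in rows:
        digest = row_checksum(row)
        if known.get(row["id"]) == digest:
            continue  # unchanged since the last run, so skip reprocessing
        conn.execute(
            "INSERT OR REPLACE INTO loans (id, payload, checksum) VALUES (?, ?, ?)",
            (row["id"], json.dumps(row, sort_keys=True, default=str), digest),
        )
        written += 1
    conn.commit()
    return written
```

Loading the same batch twice writes rows only on the first run. For millions of records, comparing checksums inside SQL (for example via a staging table) may be preferable to holding them all in a Python dictionary.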
Noesis: Automated AI Research Agent
An automated research agent that goes beyond traditional chatbots: it extracts claims from scientific papers and identifies supporting or contradicting work. A fine-tuned SciBERT transformer handles claim extraction, and FAISS vector search over semantic embeddings retrieves supporting or contradicting evidence and highlights research gaps.
Cross-Camera Player Re-Identification
A computer-vision pipeline for matching players across two synchronized camera feeds. The system detects players with a fine-tuned YOLOv11 model, tracks them with DeepSORT to maintain persistent local IDs, and extracts a ResNet50 embedding for each track. Identities are matched across views using cosine similarity and assigned global IDs. The pipeline outputs annotated videos, CSV match files and JSON logs.
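The cross-view matching step can be sketched as follows, assuming each track has already been reduced to a single embedding vector. The greedy pairing strategy and the 0.7 threshold are assumptions chosen for illustration. The description only states that cosine similarity is used, and an optimal assignment method such as the Hungarian algorithm would be a reasonable alternative.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def match_tracks(view_a, view_b, threshold=0.7):
    """Greedily pair tracks across two views by descending similarity.

    view_a, view_b: dicts mapping local track ID -> embedding vector.
    Returns {global_id: (local_id_a, local_id_b)}.
    """
    candidates = sorted(
        (
            (cosine_similarity(emb_a, emb_b), id_a, id_b)
            for id_a, emb_a in view_a.items()
            for id_b, emb_b in view_b.items()
        ),
        key=lambda c: c[0],
        reverse=True,
    )
    used_a, used_b, matches = set(), set(), {}
    for sim, id_a, id_b in candidates:
        if sim < threshold:
            break  # remaining candidates are even less similar
        if id_a in used_a or id_b in used_b:
            continue  # each local track joins at most one global identity
        matches[len(matches) + 1] = (id_a, id_b)
        used_a.add(id_a)
        used_b.add(id_b)
    return matches
```

Tracks whose best similarity stays below the threshold remain unmatched, which suits players who are visible in only one of the two feeds.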
Genomic Cancer Classification
A machine-learning pipeline to predict cancer type from high-dimensional RNA-Seq gene expression data (~20,000 features per sample, classes: BRCA, LUAD, COAD, KIRC, PRAD). Filters genes by variance, applies z-score normalisation, reduces dimensionality with PCA while retaining 95% of the variance, and trains Random Forest and XGBoost models on the resulting components.
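The two preprocessing steps that precede PCA, variance filtering and z-score normalisation, can be sketched with the standard library alone. The function names and the `top_k` parameter are hypothetical, and the project's actual filtering criterion is not stated. In practice the PCA step could follow with, for example, scikit-learn's `PCA(n_components=0.95)`, which keeps enough components to explain 95% of the variance.

```python
from statistics import mean, pstdev, pvariance


def filter_by_variance(samples, top_k):
    """Keep the top_k highest-variance gene columns.

    samples: list of rows, one row of expression values per sample.
    Returns (filtered rows, indices of the kept columns in original order).
    """
    n_genes = len(samples[0])
    variances = [pvariance([row[j] for row in samples]) for j in range(n_genes)]
    ranked = sorted(range(n_genes), key=lambda j: variances[j], reverse=True)
    keep = sorted(ranked[:top_k])
    return [[row[j] for j in keep] for row in samples], keep


def zscore(samples):
    """Standardise each column to zero mean and unit variance."""
    columns = list(zip(*samples))
    mus = [mean(col) for col in columns]
    # Constant columns have zero spread; divide by 1.0 to avoid a ZeroDivisionError.
    sigmas = [pstdev(col) or 1.0 for col in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in samples]
```

Filtering before normalising matters here: after z-scoring every gene has unit variance, so a variance filter applied afterwards could no longer distinguish informative genes from flat ones.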
AlgoTrading: End-to-End ML Trading System
A Python prototype that automatically fetches daily stock data for NIFTY-50 tickers, applies a rule-based "buy the dip" strategy using RSI and moving averages, and backtests performance. Includes automated data ingestion with built-in retries, TimeSeriesSplit-validated Decision Tree and Logistic Regression models, Google Sheets logging and real-time Telegram alerts.
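A rule of this kind can be sketched as below. The specific thresholds (RSI below 30, 20-day average above 50-day average) and the function names are assumptions for illustration, since the description does not give the exact parameters. The RSI here uses simple averages of gains and losses rather than Wilder's smoothing.

```python
def sma(prices, window):
    """Simple moving average of the last `window` prices, or None if too few."""
    if len(prices) < window:
        return None
    return sum(prices[-window:]) / window


def rsi(prices, period=14):
    """RSI over the last `period` price changes, using simple averages."""
    if len(prices) < period + 1:
        return None
    recent = prices[-(period + 1):]
    changes = [b - a for a, b in zip(recent[:-1], recent[1:])]
    gains = sum(c for c in changes if c > 0)
    losses = -sum(c for c in changes if c < 0)
    if losses == 0:
        return 100.0  # no down moves in the window
    rs = gains / losses
    return 100.0 - 100.0 / (1.0 + rs)


def buy_the_dip(prices, rsi_period=14, oversold=30, short=20, long=50):
    """True when RSI signals a dip while the short MA is above the long MA."""
    rsi_value = rsi(prices, rsi_period)
    short_ma, long_ma = sma(prices, short), sma(prices, long)
    if rsi_value is None or short_ma is None or long_ma is None:
        return False  # not enough history to evaluate the rule
    return rsi_value < oversold and short_ma > long_ma
```

The moving-average condition restricts buys to dips inside a broader uptrend, so a stock in sustained decline does not keep triggering signals merely because its RSI stays low.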
Open Source Contributions
I have contributed to Hugging Face repositories, including:
- TRL — a library for fine-tuning and aligning language models using techniques like RLHF and DPO.
- Tokenizers — a high-performance Rust library for converting raw text into numerical data that models can process.