Available for Senior DS · MLE · Applied Scientist roles

Aman
Jain.


I build ML systems that move metrics at scale — ranking, personalization, and pricing for 1M+ monthly users at Cars24, and India's first SEBI-compliant agentic trading platform at KotiLabs. 4+ years across marketplaces, fintech, and AI products.

B.Tech + M.Tech CSE · LNMIT
Published · JURISIN 2022 · Springer LNAI
user intent: Voice / Chat Input
L2 · intent router: Mastra AI · LLM
L5 · compliance: OPA / Rego
L3 · strategy: JSON DSL Engine
L3 · backtester: VectorBT · 500+ inst.
storage: ClickHouse · TimescaleDB
L4 · execution: Zerodha · Groww · Upstox
memory · RAG: pgvector · HNSW
status: SEBI Apr 2026 ✓

Production ML.
Measurable outcomes.

+150%
Click Recall@50 improvement
12% → 30% · User-level personalization vs 25-cohort baseline
Cars24 · N1 Personalization
+21%
V2Bi uplift on default ranking
Top-5 deciles · OOT validated + live A/B confirmed
Cars24 · Default Sort V2
400M+
Logo comparisons per pipeline run
AWS S3 + Hive · SLA-bound to Fortune 500 clients
6sense · Logo Comparison
~70%
FSP predictions within ±5% of sale price
LLM-structured inspection features · 60% inventory coverage
Cars24 · Pricing Model
500+
Instruments screened per run
>10× throughput vs sequential · Memory-bounded vectorized execution
KotiLabs · Batch Screener
0%
LLM inference cost reduction
Multi-layered RAG memory (pgvector + PostgreSQL) + fine-tuned Llama-3-8B intent router with Redis semantic cache
KotiLabs · Trading Platform

Where I've
built things.

July 2025 – Present
Bengaluru
KotiLabs
FOUNDING ENGINEER — AAGMAN AI
  • Architected a 5-layer neuro-symbolic trading platform (0→1), SEBI April 2026 compliant: LLM planning (Mastra AI) + deterministic OPA/Rego policy enforcement, designed so that hallucinated LLM output cannot cause a regulatory breach
  • Designed a JSON DSL compiler as a safe LLM output format — AI selects building blocks, interpreter executes them; eliminates arbitrary code execution and prompt injection risk in financial execution
  • Built VectorBT-backed backtesting engine with golden test suites verifying exact trade counts, timestamps, and metrics across all executions — deterministic by design
  • Engineered vectorized batch screener processing 500+ instruments per run at >10× sequential throughput; real-time ClickHouse + TimescaleDB ingestion pipeline with multi-broker adapters (Zerodha, Groww, Angel One, Upstox)
  • Designed HNSW-indexed RAG memory (pgvector) with hybrid BM25 + dense retrieval; intent router (distilled Llama-3 8B + Redis semantic cache) reduces full LLM calls by >90%
  • Mentored 2 engineers on deterministic execution design, OPA policy authoring, and CI/golden-test methodology
Python · OPA/Rego · ClickHouse · TimescaleDB · VectorBT · pgvector · Mastra AI · FastAPI · SEBI Compliance
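The JSON DSL idea described above (the model selects building blocks, an interpreter executes them) can be illustrated with a minimal sketch. The block names (`gt`, `lt`, `and`, `or`), the `feature` node, and the feature names are hypothetical and chosen only for illustration; they are not the platform's actual DSL.

```python
import json
from typing import Any, Callable, Dict

# Hypothetical whitelist of building blocks (assumption, not the real DSL).
ALLOWED_OPS: Dict[str, Callable[..., Any]] = {
    "gt": lambda a, b: a > b,
    "lt": lambda a, b: a < b,
    "and": lambda *args: all(args),
    "or": lambda *args: any(args),
}


def evaluate(node: Any, features: Dict[str, float]) -> Any:
    """Recursively evaluate a parsed JSON expression against a feature dictionary."""
    if isinstance(node, (int, float, bool)):
        return node  # literal value
    if isinstance(node, dict) and set(node) == {"feature"}:
        name = node["feature"]
        if not isinstance(name, str) or name not in features:
            raise ValueError(f"unknown feature: {name!r}")
        return features[name]
    if isinstance(node, dict) and set(node) == {"op", "args"}:
        op, args = node["op"], node["args"]
        if not isinstance(op, str) or op not in ALLOWED_OPS:
            raise ValueError(f"operation not allowed: {op!r}")
        if not isinstance(args, list):
            raise ValueError("args must be a list")
        return ALLOWED_OPS[op](*(evaluate(arg, features) for arg in args))
    raise ValueError(f"malformed node: {node!r}")


# Example model output: plain JSON data, never Python source.
llm_output = (
    '{"op": "and", "args": ['
    '{"op": "gt", "args": [{"feature": "rsi_14"}, 70]},'
    '{"op": "lt", "args": [{"feature": "close"}, {"feature": "sma_50"}]}]}'
)
signal = evaluate(json.loads(llm_output), {"rsi_14": 74.2, "close": 101.0, "sma_50": 105.5})
```

Because the interpreter only dispatches to whitelisted callables and never calls `eval` or `exec`, anything outside the whitelist is rejected with a `ValueError` instead of being executed.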
Dec 2023 – July 2025
Gurugram
Cars24
DATA SCIENTIST — RANKING, PERSONALIZATION & PRICING
  • N1 user-level personalization — replaced 25-cohort system; +150% Click Recall@50, +8% U2BI uplift across 100k+ monthly sessions; solved cold-start for zero-click users (35% of base) via implicit search/filter signals
  • GBDT Default Sort V2 — dynamic de-boosting of underperforming inventory; +21% V2Bi uplift (top-5 deciles); out-of-time validated + live A/B confirmed; PSI monitoring for ongoing drift detection
  • FSP pricing model — vehicle fingerprint + demand/supply + LLM-structured inspection data; ~70% predictions within ±5% of actual sale price; multi-dimensional OOT evaluation framework
  • Similar Cars hybrid recommendation (70:30 collab:content): +6% impressions, +12% SimilarCar U2BI; resolved exploration-vs-exploitation tradeoff
  • Migrated 9 DS models to GA4 with zero downtime; owned cross-geo DS for UAE, Thailand, and Australia simultaneously
  • Mentored 2 junior analysts on SQL-based A/B test validation, out-of-time evaluation methodology, and ML pipeline best practices
GBDT · LightGBM · Collaborative Filtering · A/B Testing · Recall@K · Snowflake · Redis · LLM Features · Multi-geo
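For readers unfamiliar with the PSI drift monitoring mentioned above, here is a minimal sketch of the Population Stability Index over pre-binned score distributions. The bin proportions are made-up illustration values, and the `eps` floor for empty bins is an assumption of this sketch, not a detail from the Cars24 pipeline.

```python
import math
from typing import Sequence


def population_stability_index(
    expected: Sequence[float], actual: Sequence[float], eps: float = 1e-6
) -> float:
    """PSI between two binned distributions given as per-bin proportions."""
    if len(expected) != len(actual):
        raise ValueError("expected and actual must have the same number of bins")
    psi = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)  # floor empty bins to avoid log(0)
        psi += (a - e) * math.log(a / e)
    return psi


# Illustrative proportions only (not production data).
training_bins = [0.25, 0.25, 0.25, 0.25]
live_bins = [0.10, 0.20, 0.30, 0.40]

psi_same = population_stability_index(training_bins, training_bins)
psi_shifted = population_stability_index(training_bins, live_bins)
```

A commonly cited rule of thumb treats PSI below about 0.1 as stable and above about 0.25 as a significant shift; the thresholds actually used in any given system may differ.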
May 2021 – Oct 2023
Bengaluru
6sense
DATA SCIENTIST · INTERN → FULL-TIME
  • Logo-similarity microservice — embedding-based CV deployed on AWS S3 + Hive; 400M+ comparisons/run, 900k backlog records, SLA-bound to fortnightly Fortune 500 data deliveries; 4% false-positive reduction at millions-of-detections scale
  • Contributed to Togylop — 6sense's internal NLP training library (BERT/RoBERTa multi-class, multi-label, token classification); listed as library maintainer
  • Scaled B2B intent taxonomy 57 → 198 divisions (12 → 15 functions) via supervised entity classification; signals consumed by Fortune 500 ABM workflows
Computer Vision · BERT/RoBERTa · AWS S3 · Apache Hive · NLP · B2B Intelligence
Dec 2021 – May 2023
New Delhi
Ministry of Science & Technology, Government of India
DATA ANALYST
  • Built automated ETL pipelines and analytics dashboards for policy monitoring, consolidating multi-source departmental datasets into standardised reports for senior government officials.
ETL · Data Analytics · Policy Monitoring

Systems I've
architected.

Cars24 · 2024
N1 User-Level Personalization Engine

Replaced a 25-cohort recommendation system with individual user-level rankings using GBDT-based affinity scoring and collaborative filtering. First system on the platform to serve personalized recommendations to zero-click users.

+150%
Click Recall@50 — 12% → 30%
+8%
U2BI (User-to-Buyer Intent) uplift
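As a reading aid for the headline numbers, here is a minimal sketch of how a per-user Click Recall@K and a relative uplift are typically computed. The item IDs are made up, and the handling of users with no clicks is an assumption of this sketch rather than the production definition.

```python
from typing import Iterable, Sequence


def recall_at_k(ranked_items: Sequence[str], clicked_items: Iterable[str], k: int = 50) -> float:
    """Fraction of a user's clicked items that appear in the top-k of the ranked list."""
    clicked = set(clicked_items)
    if not clicked:
        return 0.0  # assumption: users with no clicks score 0 rather than being skipped
    top_k = set(ranked_items[:k])
    return len(clicked & top_k) / len(clicked)


def relative_uplift(baseline: float, treatment: float) -> float:
    """Relative change of treatment over baseline."""
    return (treatment - baseline) / baseline


# Toy example with made-up item IDs (not Cars24 data).
ranked = ["car_7", "car_2", "car_9", "car_4"]
clicked = ["car_2", "car_4", "car_5"]
recall_top3 = recall_at_k(ranked, clicked, k=3)  # only car_2 is in the top 3

# Recall moving from 12% to 30% is a relative uplift of 1.5, i.e. +150%.
uplift = relative_uplift(0.12, 0.30)
```

The +150% figure is therefore a relative improvement; the absolute gain in recall is 18 percentage points.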
Cars24 · 2024–2025
Final Selling Price Prediction Model

Supervised regression model using vehicle fingerprint, demand/supply signals, market science, and LLM-extracted inspection quality scores to predict optimal listing price.

70%
Predictions within ±5% of actual sale price
60%
Inventory coverage at appointment level
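The "within ±5%" metric can be read as the share of predictions whose absolute error is at most 5% of the actual sale price. A minimal sketch under that assumption, with made-up prices:

```python
from typing import Sequence


def share_within_tolerance(
    predicted: Sequence[float], actual: Sequence[float], tolerance: float = 0.05
) -> float:
    """Share of predictions whose absolute error is within tolerance * actual price."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one prediction is required")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance * abs(a))
    return hits / len(actual)


# Made-up prices for illustration (not Cars24 data).
predicted_prices = [510_000, 480_000, 300_000, 760_000]
actual_prices = [500_000, 520_000, 310_000, 750_000]

# Errors are 2.0%, 7.7%, 3.2% and 1.3% of the actual price, so 3 of 4 are within 5%.
share = share_within_tolerance(predicted_prices, actual_prices)
```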

What I work
with.

ML & Modeling
Ranking Systems · Personalization · GBDT / LightGBM · LambdaMART · Collaborative Filtering · Two-Tower Retrieval · Pricing Models · A/B Testing · Recall@K · NDCG · Uplift Modeling · Cold-Start Solutions · Causal Inference · Training-Serving Skew
Agentic AI & LLMs
Multi-Agent Orchestration · RAG (Hybrid BM25 + HNSW) · LoRA / PEFT · LLM Evaluation (RAGAS) · Intent Classification · Semantic Caching · pgvector · Mastra AI · Prompt Engineering · Hallucination Prevention · Transformer Fine-tuning
Compliance & Systems
OPA / Rego · JSON DSL Compilers · SEBI Regulations · Sandbox Isolation · Feature Stores · PSI Drift Monitoring · Deterministic Execution
Data Engineering
ClickHouse · TimescaleDB · PostgreSQL · Redis · Snowflake · AWS S3 · Apache Hive · Parquet · WebSocket Pipelines
NLP & Research
BERT / RoBERTa · Legal-BERT · Domain Adaptation · Multi-class Classification · Token Classification · Hugging Face · PyTorch
Languages & Tools
Python · SQL · FastAPI · LightGBM · PyTorch · VectorBT · TA-Lib · Scikit-learn · Docker · Git
Experimentation & Validation
Out-of-Time Validation · User-Level A/B Splits · Power Analysis · Bayesian A/B Testing · MTC Correction · SRM Detection

Research that
ships.

My M.Tech thesis studied domain-specific transformer adaptation for legal NLP, the same principle I apply in production today.

JURISIN 2022 Workshop · JSAI International Symposium on AI · Published: Springer LNAI 2025
"Comparative Study of BERT and Legal-BERT for Predicting Indian Legal Case Judgements"
1st Author · 2 faculty co-authors · Peer-reviewed workshop proceedings

Demonstrated that domain-specific pre-training (Legal-BERT) substantially outperforms general BERT on Indian legal case judgment prediction tasks, and established a benchmark for domain-adapted transformer models in legal NLP. The result is consistent with the intrinsic dimensionality hypothesis, under which domain-specific adaptations occupy a low-rank subspace of the weight space. This is the same principle behind LoRA fine-tuning, which I apply at KotiLabs for intent classification.
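For reference, the low-rank update used by LoRA can be written as below. This is the standard formulation from the LoRA literature, not a result of the paper above: W_0 is the frozen pre-trained weight matrix, B and A are the trainable low-rank factors, r is the rank, and α is a scaling constant.

```latex
W = W_0 + \Delta W, \qquad
\Delta W = \frac{\alpha}{r}\, B A, \qquad
B \in \mathbb{R}^{d \times r},\;
A \in \mathbb{R}^{r \times k},\;
r \ll \min(d, k)
```

Only B and A are trained, so the number of trainable parameters per adapted matrix drops from d·k to r·(d + k).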

Let's talk
about ML.

I'm actively looking for Senior DS, MLE, and Applied Scientist roles. I respond to well-matched opportunities within 24 hours.

Open to opportunities · Immediate to 4 weeks
Email
amanj3335@gmail.com
Phone
+91 876 912 8326
Location
Bengaluru · Open to all metros + remote