Applied Scientist · AI Engineer · MLE · Founding Engineer roles

Aman
Jain.

I build ML systems that move metrics at scale — ranking, personalization, and pricing for 1M+ monthly users at Cars24, and India's first SEBI-compliant agentic trading platform at KotiLabs. 4+ years across marketplaces, fintech, and AI products. Open to FAANG and high-growth startups.

B.Tech + M.Tech CSE · LNMIT · SPI 9.25
Published · JURISIN 2022 · Springer LNAI

user intent: Voice / Chat Input
L2 · intent router: Mastra AI · LLM
L5 · compliance: OPA / Rego
L3 · strategy: JSON DSL Engine
L3 · backtester: VectorBT · 500+ inst.
storage: ClickHouse · TimescaleDB
L4 · execution: Zerodha · Groww · Upstox
memory · RAG: pgvector · HNSW
status: SEBI Apr 2026 ✓

Impact by numbers

Production ML.
Measurable outcomes.

+150%

Click Recall@50 improvement

12% → 30% · User-level personalization vs 25-cohort baseline

Cars24 · N1 Personalization

+21%

V2Bi uplift on default ranking

Top-5 deciles · OOT validated + live A/B confirmed

Cars24 · Default Sort V2

400M+

Logo comparisons per pipeline run

AWS S3 + Hive · SLA-bound to Fortune 500 clients

6sense · Logo Comparison

~70%

FSP predictions within ±5% of sale price

LLM-structured inspection features · 60% inventory coverage

Cars24 · Pricing Model

500+

Instruments screened per run

>10× throughput vs sequential · Memory-bounded vectorized execution

KotiLabs · Batch Screener

90%

LLM inference cost reduction

Multi-layered RAG memory (pgvector + PostgreSQL) + fine-tuned Llama-3-8B intent router with Redis semantic cache

KotiLabs · Trading Platform

+16%

Buyer conversion uplift

Live A/B · <100ms p99 serving latency

Cars24 · Default Ranking

Career

Where I've
built things.

July 2025 – Present

Bengaluru

KotiLabs

Pre-seed · 10-person team · SEBI-registered algo trading startup

FOUNDING AI ENGINEER — AAGMAN AI

Architected a 5-layer neuro-symbolic trading platform (0→1), compliant with SEBI's April 2026 circular: LLM planning (Mastra AI) + deterministic OPA/Rego policy enforcement, guaranteeing zero hallucination-induced regulatory breaches
Designed a JSON DSL compiler as a safe LLM output format — AI selects building blocks, interpreter executes them; eliminates arbitrary code execution and prompt injection risk in financial execution
Built VectorBT-backed backtesting engine with golden test suites verifying exact trade counts, timestamps, and metrics across all executions — deterministic by design
Engineered vectorized batch screener processing 500+ instruments per run at >10× sequential throughput; real-time ClickHouse + TimescaleDB ingestion pipeline with multi-broker adapters (Zerodha, Groww, Angel One, Upstox)
Designed HNSW-indexed RAG memory (pgvector) with hybrid BM25 + dense retrieval; intent router (distilled Llama-3 8B + Redis semantic cache) reduces full LLM calls by >90%, cutting per-query cost from $0.10 to $0.01
Mentored 2 engineers on deterministic execution design, OPA policy authoring, and CI/golden-test methodology

Python · OPA/Rego · ClickHouse · TimescaleDB · VectorBT · pgvector · Mastra AI · FastAPI · SEBI Compliance
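
The JSON DSL idea in the bullets above can be sketched in a few lines. This is an illustration under stated assumptions, not the Aagman AI implementation: the operator names (`crosses_above`, `greater_than`), the strategy schema, and the `evaluate` function are all hypothetical. The point it shows is that the LLM only emits data naming whitelisted building blocks, and anything outside the whitelist is rejected rather than executed.

```python
import json

def crosses_above(a, b):
    """True when series a was at or below b on the previous bar and is above it now."""
    return a[-2] <= b[-2] and a[-1] > b[-1]

def greater_than(a, b):
    """True when the latest value of a exceeds the latest value of b."""
    return a[-1] > b[-1]

# Hypothetical whitelists: the only operators and actions the interpreter accepts.
OPERATORS = {"crosses_above": crosses_above, "greater_than": greater_than}
ACTIONS = {"BUY", "SELL"}

def evaluate(strategy_json, market):
    """Interpret a strategy document; nothing in it is ever passed to eval() or exec()."""
    spec = json.loads(strategy_json)
    cond, action = spec["when"], spec["action"]
    if cond["op"] not in OPERATORS:
        raise ValueError(f"unknown operator: {cond['op']!r}")
    if action not in ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    for name in (cond["left"], cond["right"]):
        if name not in market:
            raise ValueError(f"unknown series: {name!r}")
    fired = OPERATORS[cond["op"]](market[cond["left"]], market[cond["right"]])
    return action if fired else None

# Toy two-bar market snapshot (made-up numbers).
market = {"price": [99.0, 101.5], "vwap": [100.0, 100.2]}
strategy = '{"when": {"op": "crosses_above", "left": "price", "right": "vwap"}, "action": "BUY"}'
signal = evaluate(strategy, market)
```

Because the interpreter looks names up in fixed dictionaries, a strategy that names an unknown operator fails with `ValueError` before any market logic runs.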

Dec 2023 – July 2025

Gurugram

Cars24

DATA SCIENTIST — RANKING, PERSONALIZATION & PRICING

N1 user-level personalization — replaced 25-cohort system; +150% Click Recall@50 (12%→30%), +16% buyer conversion uplift, <100ms p99 serving latency — 1M+ monthly sessions; solved cold-start for zero-click users (35% of base) via implicit search/filter signals
GBDT Default Sort V2 — dynamic de-boosting of underperforming inventory; +21% V2Bi uplift (top-5 deciles), +7% clicks per vehicle; out-of-time validated + live A/B confirmed; PSI monitoring for ongoing drift detection
FSP pricing model — vehicle fingerprint + demand/supply + LLM-structured inspection data; ~70% predictions within ±5% of actual sale price; multi-dimensional OOT evaluation framework
Similar Cars hybrid recommendation (70:30 collab:content) — +6% impressions, +12% SimilarCar U2Bi; resolved exploration-vs-exploitation tradeoff
Served recommendations at <100ms p99 latency via Two-Tower retrieval model with HNSW ANN index; expanded personalised coverage from 65% to 100% of user base
Migrated 9 DS models to GA4 with zero downtime; owned cross-geo DS for UAE, Thailand, and Australia simultaneously
Mentored 2 junior analysts on SQL-based A/B test validation, out-of-time evaluation methodology, and ML pipeline best practices

GBDT · LightGBM · Collaborative Filtering · A/B Testing · Recall@K · Snowflake · Redis · LLM Features · Multi-geo
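
The PSI drift monitoring mentioned for Default Sort V2 can be illustrated with the standard Population Stability Index formula, PSI = Σ (aᵢ − eᵢ) · ln(aᵢ / eᵢ), where eᵢ and aᵢ are the share of training-time and live values in bin i. The sketch below is a generic version: the equal-width binning, the bin count, and the `eps` floor are assumptions, not details taken from the Cars24 pipeline.

```python
import math

def psi(expected, actual, bins=10, eps=1e-6):
    """Population Stability Index between a reference sample and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the reference sample is constant

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values to the end bins
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

# Toy data: a uniform reference sample and a live sample shifted upwards.
train_scores = [i / 100 for i in range(100)]
live_scores = [min(s + 0.3, 0.99) for s in train_scores]
drift = psi(train_scores, live_scores)
```

Identical samples give a PSI of zero, and every term in the sum is non-negative, so the value only grows as the two distributions diverge.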

May 2021 – Oct 2023

Bengaluru

6sense

DATA SCIENTIST · INTERN → FULL-TIME

Logo-similarity microservice — embedding-based CV deployed on AWS S3 + Hive; 400M+ comparisons/run, 900k backlog records, SLA-bound to fortnightly Fortune 500 data deliveries; 4% false-positive reduction at millions-of-detections scale
Contributed to Togylop — 6sense's internal NLP training library (BERT/RoBERTa multi-class, multi-label, token classification); listed as library maintainer
Scaled B2B intent taxonomy 57 → 198 divisions (12 → 15 functions) via supervised entity classification; signals consumed by Fortune 500 ABM workflows

Computer Vision · BERT/RoBERTa · AWS S3 · Apache Hive · NLP · B2B Intelligence

Dec 2021 – May 2023

New Delhi

Ministry of Science & Technology, Government of India

DATA ANALYST

Built automated ETL pipelines and analytics dashboards for policy monitoring, consolidating multi-source departmental datasets into standardised reports for senior government officials.

ETL · Data Analytics · Policy Monitoring

Selected work

Systems I've
architected.

KotiLabs · 2025–Present

Aagman AI — SEBI-Compliant Agentic Trading Platform

A full 0→1 system that takes a user's voice command ("Buy Tata Steel if it breaks VWAP") through intent classification, LLM planning, compliance validation, strategy backtesting, and live broker execution — with zero hallucination risk at any step. Built to meet SEBI's April 2026 algorithmic trading circular.

>10×

screener throughput vs sequential baseline

>90%

LLM calls eliminated · $0.10 → $0.01 per query

0

hallucination-induced regulatory breaches via OPA
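
A rough sketch of how a semantic cache can take most queries off the full-LLM path, with the cost arithmetic behind the figures above. Everything here is illustrative: the `SemanticCache` class, the 0.95 similarity threshold, and the toy 3-dimensional embeddings are assumptions, and the production system is described as using Redis rather than an in-memory list. The cost check also assumes the cached path is close to free.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

class SemanticCache:
    """In-memory stand-in for a semantic cache keyed by query embeddings."""

    def __init__(self, threshold=0.95):
        self.threshold = threshold
        self.entries = []  # list of (embedding, response) pairs

    def get(self, embedding):
        """Return the cached response of the nearest entry, or None on a miss."""
        best = max(self.entries, key=lambda e: cosine(e[0], embedding), default=None)
        if best is not None and cosine(best[0], embedding) >= self.threshold:
            return best[1]
        return None

    def put(self, embedding, response):
        self.entries.append((embedding, response))

def expected_cost(hit_rate, llm_cost, cache_cost=0.0):
    """Average per-query cost when a fraction hit_rate of queries skips the full LLM."""
    return hit_rate * cache_cost + (1 - hit_rate) * llm_cost

cache = SemanticCache()
cache.put([1.0, 0.0, 0.0], "plan: buy on VWAP breakout")
hit = cache.get([0.99, 0.05, 0.0])   # near-duplicate query, served from cache
miss = cache.get([0.0, 1.0, 0.0])    # unrelated query, falls through to the LLM
avg_cost = expected_cost(hit_rate=0.9, llm_cost=0.10)  # roughly $0.01 per query
```

With 90% of queries avoiding the full LLM call and a negligible cached-path cost, the average drops from $0.10 to about $0.01, which matches the stated reduction.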

SYSTEM ARCHITECTURE

L1 User Surface — Next.js · React Native · Voice

↓

L2 Mastra AI Orchestrator · RAG Memory · pgvector

↓

L3 JSON DSL Compiler · Execution Engine · Explainability

↓

L4 Multi-Broker Adapters — Zerodha · Groww · Upstox

↓

L5 OPA / Rego Compliance Firewall — SEBI rules
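
The role of the compliance firewall in this stack can be sketched as a deterministic gate in front of the broker adapters. This is a Python stand-in, not OPA or Rego, and it assumes the policy check runs before any broker call (the page does not spell out the call order). The `Order` type, the limits, and the `fake_broker` adapter are hypothetical placeholders and do not represent actual SEBI rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Order:
    symbol: str
    side: str
    quantity: int

# Placeholder limits for illustration only; they are not actual SEBI rules.
TRADABLE_SYMBOLS = {"TATASTEEL", "INFY"}
ALLOWED_SIDES = {"BUY", "SELL"}
MAX_QUANTITY = 1_000

def violations(order):
    """Return every rule the order breaks; an empty list means it may proceed."""
    found = []
    if order.symbol not in TRADABLE_SYMBOLS:
        found.append("symbol not tradable")
    if order.side not in ALLOWED_SIDES:
        found.append("side not allowed")
    if not 0 < order.quantity <= MAX_QUANTITY:
        found.append("quantity out of bounds")
    return found

def execute(order, send_to_broker):
    """Call the broker adapter only when the policy check finds no violations."""
    problems = violations(order)
    if problems:
        return {"status": "rejected", "reasons": problems}
    return {"status": "sent", "broker_ref": send_to_broker(order)}

sent_orders = []

def fake_broker(order):
    """Hypothetical adapter that records the order instead of placing it."""
    sent_orders.append(order)
    return f"ref-{len(sent_orders)}"

ok = execute(Order("TATASTEEL", "BUY", 10), fake_broker)
blocked = execute(Order("TATASTEEL", "BUY", 5_000), fake_broker)
```

The rejected order never reaches `fake_broker`, which is the property the firewall layer is meant to provide regardless of what the planning layer produced.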

Cars24 · 2023–2025

N1 User-Level Personalization Engine

Rebuilt the recommendation engine using a Two-Tower retrieval model with HNSW ANN index, replacing a 25-cohort system with individual user-level rankings at 1M+ monthly sessions. Solved cold-start for zero-click users (35% of the base) via implicit search/filter signals, the first time the platform achieved personalisation for this segment.

+150%

Click Recall@50 — 12% → 30%

+8%

U2BI (User-to-Buyer Intent) uplift
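
For reference, a minimal sketch of how a Recall@K figure and its relative uplift are typically computed. It assumes the common definition (the share of a user's clicked items that appear in the top K recommendations); the exact definition of Click Recall@50 used at Cars24 is not given on this page, and the item IDs below are made up.

```python
def recall_at_k(ranked_items, clicked_items, k=50):
    """Share of clicked items that appear among the top-k ranked items."""
    clicked = set(clicked_items)
    if not clicked:
        return 0.0
    return len(set(ranked_items[:k]) & clicked) / len(clicked)

def relative_uplift(baseline, treatment):
    """Relative change from baseline to treatment, e.g. 0.5 means +50%."""
    return (treatment - baseline) / baseline

# Toy example: one of the two clicked cars is in the top 3.
example = recall_at_k(["c3", "c7", "c1", "c9"], ["c1", "c4"], k=3)

# Going from 12% to 30% recall is a relative uplift of about 1.5, i.e. +150%.
uplift = relative_uplift(0.12, 0.30)
```

The +150% headline is a relative change: the absolute gain is 18 percentage points on a 12% baseline.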

Cars24 · 2024–2025

Final Selling Price Prediction Model

Supervised regression model using vehicle fingerprint, demand/supply signals, market science, and LLM-extracted inspection quality scores to predict optimal listing price.

70%

Predictions within ±5% of actual sale price

60%

Inventory coverage at appointment level
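
The "within ±5%" metric can be computed as the share of predictions whose absolute error is at most 5% of the actual sale price. The sketch below assumes that definition (error measured relative to the actual price); the function name and the prices are illustrative, not Cars24 data.

```python
def share_within_tolerance(predicted, actual, tolerance=0.05):
    """Fraction of predictions whose error is within tolerance of the actual value."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("predicted and actual must be non-empty and the same length")
    hits = sum(abs(p - a) <= tolerance * a for p, a in zip(predicted, actual))
    return hits / len(actual)

# Made-up prices: the third prediction misses by about 9%, the others are within 5%.
predicted_prices = [500_000, 410_000, 300_000, 720_000]
actual_prices = [510_000, 400_000, 330_000, 700_000]
share = share_within_tolerance(predicted_prices, actual_prices)
```

Here three of the four predictions fall inside the band, so `share` is 0.75.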

Capabilities

What I work
with.

ML & Modeling

Ranking Systems, Personalization, GBDT / LightGBM, LambdaMART, Collaborative Filtering, Two-Tower Retrieval, Pricing Models, A/B Testing, Recall@K · NDCG, Uplift Modeling, Cold-Start Solutions, Causal Inference, Training-Serving Skew

Agentic AI & LLMs

Multi-Agent Orchestration, RAG (Hybrid BM25 + HNSW), LoRA / PEFT, LLM Evaluation (RAGAS), Intent Classification, Semantic Caching, pgvector, Mastra AI, Prompt Engineering, Hallucination Prevention, Transformer Fine-tuning

Compliance & Systems

OPA / Rego, JSON DSL Compilers, SEBI Regulations, Sandbox Isolation, Feature Stores, PSI Drift Monitoring, Deterministic Execution

Data Engineering

ClickHouse, TimescaleDB, PostgreSQL, Redis, Snowflake, AWS S3 · Apache Hive, Parquet, WebSocket Pipelines

NLP & Research

BERT / RoBERTa, Legal-BERT, Domain Adaptation, Multi-class Classification, Token Classification, Hugging Face, PyTorch

Languages & Tools

Python, SQL, FastAPI, LightGBM, PyTorch, VectorBT, TA-Lib, Scikit-learn, Docker, Git

Experimentation & Validation

Out-of-Time Validation, User-Level A/B Splits, Power Analysis, Bayesian A/B Testing, MTC Correction, SRM Detection

Publication

Research that
ships.

My M.Tech thesis on domain-specific transformer adaptation for legal NLP — the same principle I apply in production today.

JURISIN 2022 Workshop · JSAI International Symposium on AI · Published: Springer LNAI 2025

"Comparative Study of BERT and Legal-BERT for Predicting Indian Legal Case Judgements"

1st Author · 2 faculty co-authors · Peer-reviewed workshop proceedings

Demonstrated that domain-specific pre-training (Legal-BERT) substantially outperforms general BERT on Indian legal case judgment prediction tasks. Established a benchmark for domain-adapted transformer models in legal NLP. The result is consistent with the intrinsic dimensionality hypothesis, under which domain-specific adaptations occupy a low-rank subspace of the weight space. LoRA fine-tuning, which I apply at KotiLabs for intent classification, builds on the same principle.

GitHub ↗ Springer LNAI ↗

Prefer the
concise version.

Let's talk
about ML.

Send me a message
