2026 / Independent ML project
March Madness Prediction Pipeline
Top 1% Kaggle-style tournament model with leakage-safe features, calibrated ensembles, and Stage 2 submissions

Overview
Built an end-to-end March Machine Learning Mania pipeline for men's and women's NCAA tournament prediction, combining point-in-time feature engineering, calibrated model sweeps, strict walk-forward validation, and final Stage 2 submission generation. The project reached a top 1% result while keeping the modeling workflow auditable and reproducible.
Problem
Tournament prediction is a small-sample, high-variance forecasting problem where leakage is easy to introduce: seeds, ratings, injuries, and late-season signals must be available only as of the prediction date. The goal was to maximize calibrated win probabilities without letting future tournament outcomes contaminate training.
Role
Individual project - data pipeline, feature engineering, model selection, validation, submission strategy, and reporting
Timeline
2026
Tools
Python / pandas / CatBoost / LightGBM / XGBoost / scikit-learn / pytest
Data
- Kaggle men's and women's NCAA regular-season, tournament, seed, slot, conference, coach, city, and Massey ordinal files
- Stage 2 sample submission with 132,133 matchup rows validated against exact sample IDs
- Point-in-time feature snapshots cached by division, season, and cutoff day
- Optional external 2026 injury, prospect, and bracket projection snapshots for inference experiments
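The snapshot caching above can be made deterministic by keying each file on division, season, and cutoff day. A minimal sketch of such a layout; the path scheme and function names here are illustrative, not the project's actual API:

```python
from pathlib import Path

def snapshot_path(root: Path, division: str, season: int, cutoff_day: int) -> Path:
    """Deterministic cache location for a (division, season, cutoff) feature snapshot,
    e.g. data/features/M/2026/day_135.parquet."""
    return root / "features" / division / str(season) / f"day_{cutoff_day:03d}.parquet"

def is_cached(root: Path, division: str, season: int, cutoff_day: int) -> bool:
    """A snapshot is reusable iff its file already exists under the cache root."""
    return snapshot_path(root, division, season, cutoff_day).exists()
```

Keying on the cutoff day (rather than "latest data") is what makes reruns reproducible: the same key always resolves to the same point-in-time feature set.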
Approach
- Created leakage-safe team-season and matchup features from regular-season results only, including Elo, Glicko-like ratings, rating uncertainty, ORtg, DRtg, NetRtg, pace, eFG, turnover, rebounding, free-throw, 3PA, opponent-adjusted margin, conference strength, seed priors, and Massey aggregates
- Ran multi-family sweeps across logistic elastic net, HistGB, CatBoost, LightGBM, XGBoost, and OOF stacking
- Compared Platt, isotonic, and beta calibration under walk-forward validation
- Built four final strategies (balanced, chalk-leaning, upset-leaning, and uncertainty-robust), then generated Stage 2 A/B submissions
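The rating features above are leakage-safe because updates only fold in games played on or before the cutoff. A minimal Elo sketch of that pattern; the K-factor, base rating, and 400-point scale are illustrative defaults, not the project's tuned values:

```python
def elo_expected(r_a: float, r_b: float) -> float:
    """Expected score for team A under the standard logistic Elo curve."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update_elo(ratings: dict, games: list, cutoff_day: int, k: float = 20.0) -> dict:
    """Fold games with day <= cutoff_day into the ratings, in chronological order.

    games: list of (day, team_a, team_b, a_won) tuples, a_won in {0, 1}.
    """
    ratings = dict(ratings)
    for day, team_a, team_b, a_won in sorted(games):
        if day > cutoff_day:  # leakage guard: never touch games after the cutoff
            continue
        r_a = ratings.get(team_a, 1500.0)
        r_b = ratings.get(team_b, 1500.0)
        exp_a = elo_expected(r_a, r_b)
        ratings[team_a] = r_a + k * (a_won - exp_a)
        ratings[team_b] = r_b + k * ((1 - a_won) - (1 - exp_a))  # B's mirror update
    return ratings
```

Because the update is zero-sum and order-dependent, replaying the same games up to the same cutoff always reproduces the same ratings, which is what makes the snapshots auditable.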
Evaluation
- Strict walk-forward folds: 2022, 2023, 2024, and 2025, with training seasons strictly earlier than each validation season
- Primary metric: Brier score; secondary metrics: LogLoss and expected calibration error
- Best stable Stage 2 candidate: CatBoost rating-focused model with mean Brier 0.164668, mean LogLoss 0.494439, and mean ECE 0.032421 across four folds
- Final strategy audit reported robust strategy mean Brier 0.164916 and balanced strategy mean Brier 0.165139 under four-fold validation
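The headline metrics are straightforward to compute from predicted probabilities and 0/1 outcomes. A minimal sketch of the Brier score and an equal-width-bin expected calibration error; the 10-bin scheme is an assumption, not necessarily the project's binning:

```python
def brier(probs, outcomes):
    """Mean squared error between predicted win probabilities and 0/1 outcomes."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)

def ece(probs, outcomes, n_bins=10):
    """Expected calibration error: bin predictions by confidence, then take the
    count-weighted mean gap between mean confidence and empirical win rate."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 lands in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    err = 0.0
    for b in bins:
        if not b:
            continue
        avg_p = sum(p for p, _ in b) / len(b)  # mean confidence in the bin
        frac = sum(y for _, y in b) / len(b)   # empirical win rate in the bin
        err += len(b) / total * abs(avg_p - frac)
    return err
```

Brier rewards both discrimination and calibration, while ECE isolates calibration, which is why the two are reported side by side in the fold summaries above.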
Results
- Achieved top 1% performance with a fully reproducible prediction workflow
- Produced final Stage 2 submissions A and B plus four strategy submissions for balanced, chalk, upset, and robust risk profiles
- Built automated reports, figures, validation audits, submission checks, and tests for parsing, leakage guardrails, and output format
Deployment
- One-command training and report pipeline via Python modules
- Generated CSV submissions, model summaries, experiment leaderboards, calibration figures, and PDF reports
- Validation checks ensure exact sample ID alignment and probability bounds before submission
Limitations
- Tournament sample size remains limited and season-to-season variance is high
- External injury mapping is noisy, especially for women's coverage
- Some strategy-level metrics include additional calibration-selection layers and are treated directionally rather than as direct single-run comparisons
Repro Steps
- Install project requirements and place Kaggle competition CSVs under data/raw
- Run python -m src.experiments.stage2_finalize --asof 2026-02-21 --budget 35 --rebuild_features
- Validate submissions against SampleSubmissionStage2.csv and inspect outputs/reports
Next Steps
- Add minute-level player availability priors mapped to possession-level impact
- Expand uncertainty modeling with bootstrap and fold variance
- Increase sweep budget with early stopping and run-time pruning
- Add explicit monotonic constraints for seed features in LightGBM variants
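The seed-constraint idea in the last item maps onto LightGBM's `monotone_constraints` parameter, which takes one direction flag per feature. A sketch under an assumed feature order (the feature names are hypothetical; `seed_diff` is taken as team seed minus opponent seed, so a larger value means a relatively weaker team and the prediction should be non-increasing in it):

```python
# Hypothetical matchup-feature order; only seed_diff is constrained.
feature_names = ["seed_diff", "elo_diff", "netrtg_diff", "pace_sum"]

# -1: prediction must be non-increasing in the feature
#  0: no monotonicity constraint
monotone = {"seed_diff": -1}

params = {
    "objective": "binary",
    "learning_rate": 0.05,
    # LightGBM expects the flags in the same order as the training columns.
    "monotone_constraints": [monotone.get(f, 0) for f in feature_names],
}
```

Building the flag list from a name-to-direction mapping keeps the constraint tied to the feature name rather than a column index, so reordering features cannot silently misalign it.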