All Projects

A comprehensive collection of projects showcasing my expertise in data engineering, machine learning, and cloud infrastructure.

Rainfall Prediction

Top 0.2%

Feature engineering, K-fold cross-validation, and ensemble methods. Discovered that a simpler algorithm (KNN) can outperform complex ensembles when properly tuned.

KNN Ensemble K-Folds Feature Engineering Hyperparameter Tuning
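A minimal sketch of the tuning approach described above: selecting the KNN neighborhood size with K-fold cross-validation. The data here is synthetic stand-in data (the competition set isn't shown), and the scaling pipeline is an assumption, though it is the usual prerequisite for distance-based models.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the competition data (binary rain / no-rain target).
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Scaling matters for KNN: distances across unscaled features are meaningless.
pipe = make_pipeline(StandardScaler(), KNeighborsClassifier())
grid = GridSearchCV(
    pipe,
    {"kneighborsclassifier__n_neighbors": [3, 5, 11, 21, 51]},
    cv=KFold(n_splits=5, shuffle=True, random_state=0),
    scoring="roc_auc",
)
grid.fit(X, y)
print(grid.best_params_, round(grid.best_score_, 3))
```

Because the whole pipeline sits inside the grid search, the scaler is refit on each training fold, so no validation information leaks into the scaling statistics.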

House Prices Prediction

Top 1%

Regression techniques with domain knowledge and SHAP for feature importance analysis. Ranked 37 of 3,935 after discovering and exploiting data leakage for a near-perfect score.

Regression SHAP XGBoost Data Leakage Domain Knowledge

Loan Payback Prediction

Top 4%

Binary classification for financial risk assessment. Feature engineering combined credit metrics, debt-to-income ratios, and payment history; an ensemble of XGBoost, LightGBM, and CatBoost with SHAP for interpretability ranked 172 of 3,724.

Classification Ensemble SHAP XGBoost LightGBM CatBoost

Music BPM Prediction

Top 5%

Predicting song tempo from audio features. Combined signal processing with machine learning, leveraging music theory for feature engineering.

Audio ML Regression Signal Processing Gradient Boosting Feature Engineering

Road Accident Risk

Top 8%

Ensemble methods predicting accident risk. Created temporal and weather interaction features. Containerized training pipeline using Docker.

Ensemble Optuna Docker Feature Engineering Hyperparameter Optimization

Bank Marketing

Top 17%

Customer response prediction delivered in 7 days. YAML-based configuration management enabled rapid prototyping and experimentation.

Classification Optuna YAML Config Rapid Prototyping

Podcast Listening Time

Top 16%

Time-series regression with engineered temporal features and model stacking for improved predictions.

Time Series Stacking Temporal Features Regression
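One common way to implement the stacking mentioned above is out-of-fold (OOF) stacking; the sketch below uses synthetic data and arbitrary base models as stand-ins for the actual pipeline, which isn't shown.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 4))
y = X[:, 0] * 2 + np.sin(X[:, 1]) + rng.normal(scale=0.1, size=400)

base_models = [DecisionTreeRegressor(max_depth=4, random_state=0),
               KNeighborsRegressor(n_neighbors=10)]
kf = KFold(n_splits=5, shuffle=True, random_state=0)

# Out-of-fold predictions: each row is predicted by models that never saw it,
# so the meta-learner trains on leak-free base predictions.
oof = np.zeros((len(y), len(base_models)))
for train_idx, valid_idx in kf.split(X):
    for j, model in enumerate(base_models):
        model.fit(X[train_idx], y[train_idx])
        oof[valid_idx, j] = model.predict(X[valid_idx])

meta = Ridge().fit(oof, y)
print("meta weights:", meta.coef_.round(2))
```

The key point is that the meta-learner never sees in-fold base predictions; fitting it on predictions from models trained on the same rows would badly overstate the ensemble's skill.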

Optimal Fertilizer

Top 28%

Multi-label classification with MAP@3 optimization and agricultural domain knowledge for crop recommendation.

Multi-label MAP@3 Domain Knowledge Classification
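For reference, the MAP@3 metric being optimized above reduces, in the usual single-ground-truth case, to crediting 1/rank for the first correct label among the top three guesses:

```python
def map_at_3(y_true, y_pred_top3):
    """MAP@3 for one true label per row: each prediction is an ordered list
    of up to 3 labels; score is 1/rank of the first correct label, else 0."""
    total = 0.0
    for truth, preds in zip(y_true, y_pred_top3):
        for rank, label in enumerate(preds[:3], start=1):
            if label == truth:
                total += 1.0 / rank
                break
    return total / len(y_true)

# A correct label at rank 1, 2, 3 scores 1, 1/2, 1/3 respectively.
print(map_at_3(["a", "b", "c"], [["a", "x", "y"],
                                 ["x", "b", "y"],
                                 ["x", "y", "c"]]))  # (1 + 1/2 + 1/3) / 3
```

Because partial credit decays with rank, it pays to calibrate class probabilities well enough to order the second and third guesses, not just pick a single winner.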

Titanic Survival

Top 15%

First Kaggle experience. Feature engineering fundamentals and XGBoost for binary classification.

XGBoost Feature Engineering Classification

Spaceship Titanic

Top 34%

Lessons on overfitting and feature correlation. Improved model through careful feature selection.

Classification Feature Selection Overfitting

Credit Card Fraud

-

Imbalanced data handling with SMOTE. Learned to apply oversampling only within the training folds of K-fold cross-validation to avoid leakage.

Imbalanced Data SMOTE K-Folds Classification
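The lesson above is worth spelling out in code. The sketch below uses a minimal interpolation oversampler as a stand-in for SMOTE (real SMOTE interpolates between k-nearest minority neighbors) on synthetic data; the point is where the oversampling happens, not how.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import StratifiedKFold

def oversample_minority(X, y, rng):
    """SMOTE-style stand-in: synthesize minority points by interpolating
    between random pairs of existing minority points."""
    minority = X[y == 1]
    n_new = (y == 0).sum() - (y == 1).sum()
    a = minority[rng.integers(0, len(minority), n_new)]
    b = minority[rng.integers(0, len(minority), n_new)]
    X_new = a + rng.random((n_new, 1)) * (b - a)
    return np.vstack([X, X_new]), np.concatenate([y, np.ones(n_new, int)])

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 4))
y = (X[:, 0] + rng.normal(scale=0.5, size=600) > 1.5).astype(int)  # rare positives

aucs = []
for train_idx, valid_idx in StratifiedKFold(5, shuffle=True, random_state=0).split(X, y):
    # The pitfall: oversampling BEFORE splitting leaks synthetic copies of
    # validation rows into training. Oversample the training fold only.
    X_tr, y_tr = oversample_minority(X[train_idx], y[train_idx], rng)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    aucs.append(roc_auc_score(y[valid_idx], model.predict_proba(X[valid_idx])[:, 1]))
print(round(float(np.mean(aucs)), 3))
```

With `imbalanced-learn`, the same fold-local behavior comes for free by putting `SMOTE` inside an imblearn `Pipeline`, which resamples only during `fit`.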

Personality Type

Top 34%

Limited data handled with oversampling and Bayesian optimization for hyperparameter tuning.

SMOTE Bayesian Opt Classification Small Data K-Folds
