Playground Series S5E9

Predicting Beats-per-Minute of Songs

A regression challenge predicting the tempo (BPM) of music tracks based on audio features, combining signal processing with machine learning for music information retrieval.

Competition Rank: 131 / 2,581
Percentile: Top 5%
Evaluation Metric: MAE
Achievement: 🏆 Silver

Problem Overview

This competition combined audio signal processing with machine learning to predict song tempo. The dataset consisted of acoustic features extracted from audio files, including spectral characteristics, rhythm patterns, energy levels, and harmonic content. BPM prediction matters for music recommendation systems, DJ software, and automated playlist generation.

Technical Approach

Key Insight

The breakthrough came from recognizing that BPM relates non-linearly to the audio features. Songs at half or double tempo can have similar spectral characteristics but different BPM values. Features that captured these tempo harmonics (integer multiples and divisors of a base tempo) significantly improved predictions, especially for edge cases. Genre-aware feature engineering also helped distinguish between genres with naturally different tempo ranges.
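As a rough illustration of the tempo-harmonic idea, the sketch below derives half-tempo, double-tempo, and octave-folded values from a single tempo guess. The function name, the `bpm_estimate` input (for example, a first-stage model's prediction), the feature names, and the 70 BPM lower bound are all assumptions made for this example; the write-up does not specify the actual features used.

```python
import math


def tempo_harmonic_features(bpm_estimate, low=70.0):
    """Build half/double-tempo candidates and an octave-folded tempo.

    `bpm_estimate` is a hypothetical rough tempo guess, and `low` is an
    assumed lower bound of the reference octave [low, 2 * low).
    """
    if bpm_estimate <= 0 or low <= 0:
        raise ValueError("bpm_estimate and low must be positive")

    # Number of doublings (positive) or halvings (negative) that separate
    # the estimate from the reference octave.
    octave = math.floor(math.log2(bpm_estimate / low))

    return {
        "tempo_half": bpm_estimate / 2.0,
        "tempo_double": bpm_estimate * 2.0,
        "tempo_octave": octave,
        # Dividing by 2**octave maps the estimate into [low, 2 * low).
        "tempo_folded": bpm_estimate / (2.0 ** octave),
    }
```

For example, estimates of 60 and 120 BPM both fold to 120, so a downstream model sees them as the same underlying pulse, while `tempo_octave` keeps the information needed to tell them apart.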

Technology Stack

Python, Pandas, NumPy, Scikit-learn, XGBoost, LightGBM, CatBoost, SciPy, Matplotlib, Seaborn

Lessons Learned

This competition highlighted the value of domain knowledge in feature engineering. While I'm not a music expert, understanding basic concepts like tempo doubling/halving, rhythm patterns, and genre conventions made a significant difference. It also demonstrated the importance of exploratory data analysis: visualizing feature relationships with the target revealed non-linear patterns that informed both feature engineering and model selection.
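One lightweight numeric companion to such plots is to compare the mean target across equal-sized bins of a sorted feature: means that do not rise or fall steadily suggest a non-linear relationship. This is a sketch only; the function name and the five-bin default are illustrative choices, not the analysis actually used in the competition.

```python
from statistics import mean


def binned_target_means(feature, target, n_bins=5):
    """Return the mean target value within each equal-count bin of the feature."""
    # Sort the (feature, target) pairs by feature value.
    pairs = sorted(zip(feature, target))
    n = len(pairs)

    means = []
    for b in range(n_bins):
        # Integer arithmetic splits the sorted rows into near-equal chunks.
        start = b * n // n_bins
        stop = (b + 1) * n // n_bins
        chunk = [t for _, t in pairs[start:stop]]
        if chunk:  # skip empty bins when there are fewer rows than bins
            means.append(mean(chunk))
    return means
```

A U-shaped or otherwise non-monotonic sequence of bin means is the kind of pattern a linear model would miss and a tree-based model can pick up.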

The top 5% finish validated the approach of combining domain expertise with robust ML techniques, a pattern I now apply across different domains in my work as a Data Engineer.