About This Library
This is a collection of optimized utility functions that I've developed and refined through my machine learning projects. These functions handle common tasks in data preprocessing, visualization, and model evaluation, allowing me to maintain consistency and efficiency across different projects.
Model Ensemble & Evaluation
generate_stack()
Builds a stacking ensemble from multiple base models using cross-validation. A meta-learner is trained on the base models' predictions to learn how best to combine them, with the aim of outperforming any single base model.
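The signature of generate_stack() is not shown here, so the following is only a minimal sketch of the stacking idea using scikit-learn's StackingClassifier as a stand-in. The synthetic dataset, the choice of base models, and the logistic-regression meta-learner are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data stands in for a real project dataset (assumption).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Base models whose cross-validated predictions feed the meta-learner.
base_models = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
]

# cv=5: the meta-learner is fit on 5-fold out-of-fold base predictions,
# so it never sees a prediction made on a row the base model was fit on.
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=LogisticRegression(),
    cv=5,
)
stack.fit(X_train, y_train)

test_accuracy = stack.score(X_test, y_test)
print(f"stacked test accuracy: {test_accuracy:.3f}")
```

Whether the stack actually beats its best base model depends on the data, so it is worth comparing against each base model scored on the same split.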
plot_model_performance()
Evaluates model performance and visualizes several metrics together. The report covers accuracy, precision, recall, and F1-score.
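For reference, the four metrics named above can be computed from confusion counts as in the sketch below. The helper name binary_metrics and the binary-label setup are hypothetical; they are not taken from plot_model_performance() itself.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Hypothetical helper: accuracy, precision, recall, F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    n = len(pairs)
    accuracy = (tp + tn) / n if n else 0.0
    # Guard against division by zero when a class is never predicted/present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
report = binary_metrics(y_true, y_pred)
print(report)
```

In this toy example there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all four metrics come out to 0.75.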
oof_cross_val()
Runs out-of-fold (OOF) cross-validation with averaged predictions. Every training row receives a prediction from a model that was not fit on it, so the function validates the model and produces predictions for the entire training set at the same time.
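A rough sketch of the out-of-fold pattern follows. It assumes that "averaged predictions" means averaging each fold model's predictions on a held-out test set, which is a common convention but not confirmed by the description above. The Ridge model, the synthetic data, and all variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

# Synthetic regression data split into a training part and a test part.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)
X_train, y_train = X[:150], y[:150]
X_test = X[150:]

kfold = KFold(n_splits=5, shuffle=True, random_state=0)
oof_pred = np.zeros(len(X_train))   # one prediction per training row
test_pred = np.zeros(len(X_test))   # fold-averaged test predictions

for fit_idx, val_idx in kfold.split(X_train):
    model = Ridge(alpha=1.0)
    model.fit(X_train[fit_idx], y_train[fit_idx])
    # Each training row is predicted only by the model that did not see it.
    oof_pred[val_idx] = model.predict(X_train[val_idx])
    # Assumption: test predictions are averaged across the fold models.
    test_pred += model.predict(X_test) / kfold.get_n_splits()

oof_rmse = float(np.sqrt(mean_squared_error(y_train, oof_pred)))
print(f"OOF RMSE: {oof_rmse:.2f}")
```

Because oof_pred contains no prediction made on a row the model was trained on, it can also serve as a leakage-free input feature for a stacking meta-learner.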
Data Visualization
plot_nums()
Plots numerical data distributions using histograms and kernel density estimates (KDE). The output is styled as publication-ready figures for inspecting the shape of each distribution.
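To make the two ingredients concrete, here is a small NumPy-only sketch of the quantities such a plot would draw: a density-normalized histogram and a Gaussian KDE curve. The function gaussian_kde_1d and the Scott's-rule bandwidth are assumptions for illustration; plot_nums() may well rely on a plotting library's built-in KDE instead.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Hypothetical helper: evaluate a Gaussian KDE of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Scott's rule of thumb for one-dimensional data (assumed default).
        bandwidth = samples.std(ddof=1) * n ** (-1 / 5)
    # One Gaussian bump per sample, evaluated at every grid point.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)
    return kernel.sum(axis=1) / (n * bandwidth)


rng = np.random.default_rng(0)
values = rng.normal(loc=5.0, scale=2.0, size=500)

# density=True scales bar heights so the histogram integrates to 1,
# which puts it on the same vertical scale as the KDE curve.
counts, edges = np.histogram(values, bins=20, density=True)

grid = np.linspace(values.min() - 5.0, values.max() + 5.0, 400)
density = gaussian_kde_1d(values, grid)
```

Plotting counts as bars over edges and density as a line over grid gives the familiar histogram-plus-KDE view.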
plot_cats()
Plots categorical data using pie charts and bar plots. Styling and labeling are applied automatically.
heatmap_nums()
Generates a correlation heatmap and detects highly correlated pairs of features. The strong correlations it highlights can guide feature selection.
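The pair-detection step can be sketched as below. The helper name high_correlation_pairs, the 0.8 default threshold, and the use of Pearson correlation are assumptions; the actual heatmap_nums() parameters are not documented here.

```python
import numpy as np
import pandas as pd


def high_correlation_pairs(df, threshold=0.8):
    """Hypothetical helper: list column pairs with |Pearson r| >= threshold."""
    corr = df.select_dtypes(include="number").corr()
    cols = list(corr.columns)
    pairs = []
    # Walk the upper triangle only, so each pair is reported once
    # and the trivial self-correlations on the diagonal are skipped.
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) >= threshold:
                pairs.append((cols[i], cols[j], float(r)))
    # Strongest correlations first.
    return sorted(pairs, key=lambda p: abs(p[2]), reverse=True)


rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame(
    {
        "a": a,
        "b": 2.0 * a + rng.normal(scale=0.1, size=200),  # nearly collinear with a
        "c": rng.normal(size=200),                       # independent noise
    }
)

pairs = high_correlation_pairs(df, threshold=0.8)
print(pairs)
```

Here only the ("a", "b") pair clears the threshold, which flags one of the two columns as a candidate for removal.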
plot_feature_importance()
Visualizes feature importance for tree-based and linear models. The plots show which features contribute most to the model's predictions.
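One plausible way to support both model families is to read feature_importances_ from tree-based models and fall back to absolute coefficients for linear ones. The sketch below follows that approach with scikit-learn estimators; the helper extract_importance is hypothetical, and the plotting step is omitted.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression


def extract_importance(model, feature_names):
    """Hypothetical helper: return (name, score) pairs, most important first."""
    if hasattr(model, "feature_importances_"):
        # Tree-based models expose impurity-based importances.
        scores = np.asarray(model.feature_importances_, dtype=float)
    elif hasattr(model, "coef_"):
        # Linear models: mean absolute coefficient across classes/outputs.
        scores = np.abs(np.atleast_2d(model.coef_)).mean(axis=0)
    else:
        raise ValueError("model exposes neither feature_importances_ nor coef_")
    order = np.argsort(scores)[::-1]
    return [(feature_names[i], float(scores[i])) for i in order]


X, y = make_classification(
    n_samples=300, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
names = [f"f{i}" for i in range(X.shape[1])]

rf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
lr = LogisticRegression(max_iter=1000).fit(X, y)

rf_ranking = extract_importance(rf, names)
lr_ranking = extract_importance(lr, names)
```

Coefficient magnitudes are only comparable when the features share a similar scale, so standardizing inputs before fitting a linear model is advisable. The ranked pairs can then be passed to a horizontal bar chart.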
Feature Engineering & Interpretability
create_combination_features()
Creates feature combinations and interactions. Automatically generates polynomial features and interaction terms, which can improve model performance when the target depends on such nonlinear relationships.
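As an illustration of what interaction and polynomial terms look like, the sketch below adds pairwise products and squares to a DataFrame. The helper add_pairwise_features, its include_squares flag, and the column-naming scheme are hypothetical and may differ from what create_combination_features() produces.

```python
from itertools import combinations

import pandas as pd


def add_pairwise_features(df, columns, include_squares=True):
    """Hypothetical helper: append pairwise products and squared terms."""
    out = df.copy()
    # Interaction terms: one product column per unordered pair.
    for left, right in combinations(columns, 2):
        out[f"{left}_x_{right}"] = df[left] * df[right]
    # Degree-2 polynomial terms.
    if include_squares:
        for col in columns:
            out[f"{col}_sq"] = df[col] ** 2
    return out


df = pd.DataFrame(
    {
        "height": [1.0, 2.0, 3.0],
        "width": [4.0, 5.0, 6.0],
        "depth": [0.5, 1.0, 1.5],
    }
)
features = add_pairwise_features(df, ["height", "width", "depth"])
print(list(features.columns))
```

With k input columns this adds k(k-1)/2 interaction columns plus k squares, so the feature count grows quadratically and is worth pruning on wide datasets.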
plot_shap_values()
Visualizes SHAP values for model interpretability. The plots show how each feature contributes to individual predictions.
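To show what "contribution to an individual prediction" means, the sketch below computes SHAP values by hand for the one case where they have a simple closed form: a linear model with features treated as independent, where each value is the weight times the feature's deviation from its mean. It deliberately avoids the shap package, and the weights and data are made up for illustration; plot_shap_values() presumably works with real fitted models.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))

# A fixed linear model (assumed for illustration, not a fitted one).
weights = np.array([2.0, -1.0, 0.5])
bias = 0.3


def predict(data):
    return data @ weights + bias


# Closed-form SHAP values for a linear model with independent features:
# phi[i, j] = weights[j] * (X[i, j] - mean of feature j).
background_mean = X.mean(axis=0)
shap_values = (X - background_mean) * weights

# The base value is the prediction at the average input.
base_value = float(predict(background_mean[None, :])[0])

# Additivity: base value plus a row's SHAP values recovers its prediction.
reconstructed = base_value + shap_values.sum(axis=1)

# A simple global summary: mean absolute SHAP value per feature.
global_importance = np.abs(shap_values).mean(axis=0)
```

The additivity property holds for SHAP values in general, which is what lets per-feature contributions be read directly off a plot for a single prediction.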