Rob Kras

Data Engineer

About Me

I'm a DevOps and Data Engineer at Rabobank, building and operating a data factory that handles market and financial data.

My work bridges platform engineering and data engineering, from CI/CD and infrastructure-as-code to reliable data pipelines and observability, making complex systems reproducible, scalable, and trustworthy.

I hold BSc and MSc degrees in Computer Science and work primarily in Python (big data, machine learning), with hands-on research experience in modern AI architectures.

Education

MSc Computer Science · Universiteit Leiden · Data Science & AI · 2024–2025
BSc Computer Science · Vrije Universiteit Amsterdam · Minor: Data Science · 2020–2023

Technical Skills

Cloud & Data

Microsoft Azure · Databricks · CI/CD · Infrastructure-as-Code · Git · Linux · Delta Lakehouse · Observability

Machine Learning

Classification · Regression · Ensemble Methods · Supervised Learning · Unsupervised Learning

AI

HuggingFace Transformers · Vision-Language Models · LLaMA · SAM2 · OpenAI API · spaCy · Multimodal AI

Languages

Python · PySpark · SQL · Scala · C/C++ · Bash

Experience

Oct 2025 – Present

Data Engineer

Rabobank — W&R Markets and Treasury Tribe

  • Build and operate cloud and DevOps infrastructure on Azure and Databricks.
  • Design and maintain reliable data pipelines with a focus on observability, lineage, and recovery.
  • Lead ML engineering on an innovation pilot to automate operational tasks end-to-end.
  • Own non-functional concerns — reliability, security, cost, and disaster recovery.
Azure · Databricks · CI/CD · Infrastructure-as-Code · Observability

Kaggle Projects

A selection of Kaggle competition entries.

Music BPM Prediction

Top 5% — 131 / 2,581

Predicting song tempo from audio features using gradient boosting and music-theory-informed feature engineering.

Loan Payback Prediction

Top 5% — 172 / 3,724

Binary classification for financial risk using an ensemble of XGBoost, LightGBM, and CatBoost with credit and payment history features.

Road Accident Risk

Top 8% — 313 / 4,082

Ensemble model predicting accident severity using temporal and weather interaction features, with a containerised training pipeline.

Bank Marketing

Top 17% — 576 / 3,367

Customer response prediction built in 7 days using Optuna-tuned classifiers and YAML-driven configuration for rapid iteration.