
Robin (R.P.M.) Kras

Computer Science | Data Science | Artificial Intelligence

About Me

Hello there!

My name is Robin. I am passionate about the intersection of mathematics, algorithms, and programming, with a special focus on Data Science and Artificial Intelligence.

Kaggle Competitor

Top rankings in machine learning competitions with a proven track record

AI Research

Master's thesis on VLM capabilities and cognitive perception

Problem Solver

Enthusiastic about tackling complex algorithmic challenges

Kaggle Enthusiast
2 Computer Science Degrees
8+ Programming Languages

Education

Master of Science, Computer Science

2024 - 2025

Universiteit Leiden

Specialization: Data Science and Artificial Intelligence
Dissertation: Cross-Modal Sound Symbolism in Cutting-Edge Vision-Language Models
Grade: 8/10

Bachelor of Science, Computer Science

2020 - 2023

Vrije Universiteit Amsterdam

Minor: Data Science
Dissertation: Exploring the Efficacy of Different Machine Learning Techniques Across Diverse Classification Tasks
Grade: 7/10

Research & Achievements

Cross-Modal Sound Symbolism in Vision-Language Models

2024 - 2025 | Grade: 8/10 | Universiteit Leiden

Research Focus

Investigated how AI vision-language models perceive and associate phonetic sounds with visual shapes, specifically exploring the "bouba-kiki effect" in multimodal artificial intelligence systems.

AI Models Studied

LLaMA 3.2-11B-Vision-Instruct
Molmo 7B-D-0924
SAM2 (Segmentation)

Experimental Approach

Cross-Modal Probability Analysis

Quantified sound-shape associations in AI model responses

Image-to-Text Matching

Evaluated cross-modal understanding capabilities

Visual Grounding & Segmentation

Integrated phonetic prompts with visual space analysis

Key Findings & Impact

Both LLaMA 3.2 and Molmo showed inconsistent congruency patterns, providing only weak support for the bouba-kiki hypothesis
LLaMA's behavior appeared close to random, while Molmo demonstrated slightly more stable but still inconclusive preferences
Attention pattern analysis revealed a moderate tendency (61%) to pair curved images with sonorant-rounded pseudowords
Established a comprehensive experimental framework for cross-modal association testing in vision-language models
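
As a rough illustration of how a congruency rate like the 61% figure above can be computed, the sketch below counts the share of trials in which the chosen pseudoword type matches the shape class predicted by the bouba-kiki effect. The Trial record, the congruency_rate helper, and the toy trials are hypothetical and made up for this example; they are not the thesis data or code.

```python
from dataclasses import dataclass


@dataclass
class Trial:
    image_shape: str       # "curved" or "jagged"
    chosen_word_type: str  # "rounded" (bouba-like) or "sharp" (kiki-like)


# Pairings that count as congruent under the bouba-kiki hypothesis.
CONGRUENT = {("curved", "rounded"), ("jagged", "sharp")}


def congruency_rate(trials):
    """Fraction of trials whose shape/word pairing is congruent."""
    if not trials:
        raise ValueError("need at least one trial")
    hits = sum((t.image_shape, t.chosen_word_type) in CONGRUENT for t in trials)
    return hits / len(trials)


# Toy data for illustration only (not experimental results).
toy_trials = [
    Trial("curved", "rounded"),
    Trial("curved", "sharp"),
    Trial("jagged", "sharp"),
    Trial("jagged", "rounded"),
    Trial("curved", "rounded"),
]

rate = congruency_rate(toy_trials)
print(f"congruency rate: {rate:.0%} (chance level is 50%)")
```

With two word types, a rate near 50% is what random choice would produce, which is why values only moderately above it count as weak evidence.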

Interdisciplinary Contributions

Cognitive Science | Artificial Intelligence | Computational Linguistics | Computer Vision

Skills

Programming Languages

Python
C/C++
Scala
HTML5/CSS/JS
Assembly

Data Science & AI

Machine Learning
Data Analysis
Deep Learning
NLP
Reinforcement Learning

Tools & Frameworks

TensorFlow
PyTorch
Keras
Pandas
NumPy
SQL/MySQL
HuggingFace

Personal Qualities

Love of Learning
Time Management
Communication
Excellent Swimmer
Adaptability

Projects

In my free time I enjoy participating in Kaggle competitions and tinkering with open datasets!

Spaceship Titanic

Rank 613/1816
Practice problem - results cleared after one month
Approach & Results

Applied machine learning techniques to predict passenger transportation outcomes, with a 4.95% accuracy improvement between versions.

Key Methods: Feature engineering, XGBoost optimization, cross-validation

Impact: Demonstrated a systematic approach to model improvement and validation techniques.

Titanic

Rank 2331/15346
Practice problem - results cleared after one month
Approach & Results

Classic machine learning competition focused on survival prediction using passenger data and advanced feature selection.

Key Methods: Data analysis, feature correlation, XGBoost classification

Impact: Achieved significant performance improvement through strategic feature engineering and algorithm selection.

House Prices

Rank 37/3935
Practice problem - results cleared after one month
Approach & Results

Regression analysis project achieving a top 1% ranking through progressive model improvements and advanced feature engineering.

Key Methods: Linear regression, categorical encoding, ensemble methods, data leakage analysis

Impact: Demonstrated understanding of both competitive ML techniques and real-world model applicability.
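
As a quick check on the "top 1%" figure, a leaderboard rank can be turned into a percentile with simple arithmetic. This is a minimal sketch; the top_percent helper is a hypothetical name, and the only input is the rank stated above.

```python
def top_percent(rank, total):
    """Share of the leaderboard at or above this rank, in percent."""
    if not 1 <= rank <= total:
        raise ValueError("rank must be between 1 and total")
    return 100.0 * rank / total


# Rank taken from the House Prices entry above.
house_prices = top_percent(37, 3935)
print(f"House Prices: top {house_prices:.2f}%")
```

Rank 37 of 3935 works out to roughly the top 0.94% of entries.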

Rainfall Prediction

Rank 5/2529
Peak rank: 5th place - didn't submit the correct file...
Approach & Results

Weather prediction challenge achieving a top 5 ranking, with a dramatic 28.2% improvement through algorithm optimization.

Key Methods: k-Nearest Neighbors, feature normalization, cross-validation, algorithm comparison

Impact: Showed that simpler algorithms can outperform complex models when properly tuned for specific data patterns.
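
To illustrate why feature normalization matters for k-Nearest Neighbors, here is a minimal pure-Python sketch. It assumes min-max scaling as the normalization step, and the feature names and values are invented toy data, not the competition dataset or the actual solution.

```python
import math
from collections import Counter


def fit_min_max(rows):
    """Return per-column minimum and span (max - min) for scaling."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [(max(c) - min(c)) or 1.0 for c in cols]  # avoid division by zero
    return lows, spans


def scale(row, lows, spans):
    """Map each feature into [0, 1] using the fitted minimum and span."""
    return [(v - lo) / s for v, lo, s in zip(row, lows, spans)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Toy features: [cloud cover fraction, pressure in hPa]; label 1 = rain.
# In this made-up data only cloud cover carries signal.
train_x = [[0.9, 1012], [0.8, 1020], [0.85, 1005],
           [0.1, 1011], [0.2, 1019], [0.15, 1006]]
train_y = [1, 1, 1, 0, 0, 0]
query = [0.9, 1010]

raw_pred = knn_predict(train_x, train_y, query)

lows, spans = fit_min_max(train_x)
scaled_x = [scale(row, lows, spans) for row in train_x]
scaled_pred = knn_predict(scaled_x, train_y, scale(query, lows, spans))

print("unscaled prediction:", raw_pred)
print("scaled prediction:", scaled_pred)
```

On the raw values the pressure column dominates the distance because of its larger numeric range, so the neighbors are picked almost entirely by pressure. After scaling, both features contribute comparably and the cloud cover signal decides the vote.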

Fraud Detection

Performance Metrics

Metric      Score
Accuracy    99%
Precision   79%
Recall      85%
F1-score    82%
AUC         98%
Approach & Results

Fraud detection system for highly imbalanced financial data, achieving 99% accuracy, 85% recall, and an F1-score of 82%.

Key Methods: SMOTE sampling, XGBoost classification, model interpretability, performance optimization

Impact: Delivered a production-ready solution balancing fraud detection accuracy with explainable AI requirements.
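
The reported F1-score follows directly from the reported precision and recall, and the sketch below reproduces that arithmetic. It also shows why accuracy alone says little on imbalanced data. The 1% fraud rate is an assumed value for illustration only; the actual class ratio is not stated here.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the metrics above.
reported_f1 = f1_score(0.79, 0.85)
print(f"F1 from reported precision/recall: {reported_f1:.2f}")

# Assumption for illustration: 1% of transactions are fraudulent.
assumed_fraud_rate = 0.01

# A trivial model that always predicts "not fraud" is right on every
# legitimate transaction and wrong on every fraudulent one.
baseline_accuracy = 1 - assumed_fraud_rate
baseline_recall = 0.0
print(f"always-negative baseline: accuracy {baseline_accuracy:.0%}, "
      f"recall {baseline_recall:.0%}")
```

Under that assumed class ratio, a model that never flags fraud would also reach 99% accuracy, which is why recall, precision, and F1 are the more informative numbers for this kind of task.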

Podcast Listening Behavior

Rank 536/3310
Approach & Results

User behavior prediction system for podcast listening patterns using ensemble learning and feature engineering.

Key Methods: Model stacking, feature engineering, time-series validation, ensemble optimization

Impact: Demonstrated advanced ensemble techniques combining multiple algorithms for improved prediction accuracy.

Introvert/Extrovert Prediction

Rank 1379/4067
Limited dataset - not worth improving on given how little data was available. Also submitted the wrong file for the final submission. Shame!
Approach & Results

Binary classification project predicting personality type from social behavior and traits using advanced ML techniques.

Key Methods: Feature engineering, SMOTE balancing, stacking ensemble, XGBoost optimization

Impact: Comprehensive analysis of personality prediction with model interpretability and feature importance insights.

Top 3 Fertilizers

Rank 732/2650
Participated only in first 5 days of competition
Approach & Results

Agricultural optimization project for fertilizer recommendation using automated hyperparameter tuning within tight time constraints.

Key Methods: Multi-class classification, automated optimization, ensemble methods, domain-specific features

Impact: Showcased efficient model development and optimization techniques under competitive time pressure.

Development Tools & Resources

Utilities & Documentation

ML Workflow Guide

Systematic Methodology Documentation

PDF

Purpose: End-to-end machine learning process documentation

Data Exploration | Feature Engineering
Model Selection | Performance Evaluation
Deployment Strategy | Best Practices

๐Ÿ› ๏ธ Utility Functions Library

Reusable ML Components

Code

โšก Purpose: Optimized functions for streamlined ML workflows

๐Ÿงน Data Preprocessing ๐Ÿ“Š Visualization Tools
๐Ÿ“ Model Evaluation ๐Ÿ”„ Pipeline Utilities
๐ŸŽจ Custom Plotting โš™๏ธ Config Management

Languages I speak

Dutch: Native
English: Bilingual
German: Limited
French: Limited