Hi, I'm Kristy Natasha Yohanes

I am an AI/ML developer who loves hands-on development: experimenting with machine learning, natural language processing, cloud deployments, software development, and Arduino projects. To unwind, I enjoy wall climbing and archery in my free time.

Here's my contact info, CV, and portfolio of my qualifications. You can download the PDF version of my portfolio (with certifications) below:

# ABOUT ME

I hold a Bachelor's degree from the Institute of Technology Bandung (ITB) in Indonesia, specializing in Data Science & AI.

My expertise lies in Python programming, with a focus on ML, NLP, computer vision (CV), time-series analysis, graph neural networks (GNNs), and generative AI.

I have implemented advanced models for anomaly detection, fraud prevention, and personalized recommendations, and I have built scalable backend systems.

I am also proficient in SQL, JavaScript, C++, BI tools (Looker Studio, Tableau, Power BI), and cloud deployment (Compute Engine VM, Cloud Functions, Docker, and Kubernetes).

Projects & Case Studies

Throughout my academic and professional work, I have built forecasting models and analyzed weather data with machine learning techniques, notably a hybrid ANN-ARIMA model for predicting monsoonal patterns.

I also contributed to community service initiatives: I integrated the Weather Research and Forecasting (WRF) computational model into flood risk assessment tools and designed user-centric web applications that distribute early warning alerts, supporting proactive disaster mitigation.

Achievements & Publications

  • Semifinalist, AI Hackathon Bank Indonesia - FEKDI 2024

  • Top 10 Scientific Paper, Physics Fair, Padjadjaran University

  • Top 10 Scientific Paper, Economic Finance Study Club, Diponegoro University

PUBLISHED: Propagation Characteristics of Madden Julian Oscillation in the Indonesian Maritime Continent: Case Studies for 2020-2022, Agromet Journal, doi: 10.29244/j.agromet.38.1.1-12

PREPRINT / UNDER REVIEW: An Elementary Approach to Predicting Indonesian Monsoon Index: Combining Ann-Arima Hybrid Method and Practical Use

GitHub: Mock Projects

One side project I'm proud of is the YouTube video recommendation insights project. I used the YouTube Data API and NLP to analyze users' watch histories and uncover insights into video recommendations. Key features include OAuth 2.0 integration, transcript analysis, keyword extraction, visualization, and data export.

Another is the Customer Goods Data Modeling project, which addresses industrial challenges through predictive modeling of daily sales quantity and customer segmentation. It uses PostgreSQL and DBeaver for data ingestion, Tableau Public for interactive dashboards, and Python in Google Colab for predictive modeling, including ARIMA time-series forecasting and clustering techniques.

Professional Certifications

  • AML and Data Governance: Risk-Based Mentoring Program for Crimes of Money Laundering and Terrorism Financing in Human Trafficking and Financial Technology Crimes - PPATK

  • Data Science: Certificate of Competencies - Kalbe Nutritionals Data Scientist Project Based Internship Program

  • Full-Stack Development: Certificate of Competencies - BTPN Syariah Fullstack Developer Project Based Internship Program

Coursework Certifications

  • Computer Science for Artificial Intelligence (CS50) - Harvard University

  • Google Cloud Professional Machine Learning Engineer Cert Prep - Google Developers

  • Artificial Intelligence on Microsoft Azure - Microsoft

  • The Full Stack - Meta

  • SQL for Data Science - UC Davis

  • Python for Data Science, AI & Development - IBM

  • Bootcamp E-Learning: Data Science, Website Development/Backend (Python, Flask) - MySkill

  • Intro to Data Analytics - RevoU

  • Data Programming - Sololearn

  • Data Visualization ShortClass - MySkill

  • Intensive Bootcamp: Data Analysis - MySkill x Deloitte

  • Power BI Essential Training - LinkedIn Learning

  • Companies and Climate Change - ESSEC Business School

  • The Science of The Solar System - Caltech

  • The Evolving Universe - Caltech

WORK EXPERIENCE

Model development: Achieved 4/4 project completion across model development cycles (1 model per 2-4 sprints), including:
• Implemented PCA and Isolation Forest for anomaly detection in merchant-customer transactions;
• Applied Random Forest and Network Analysis to map illegal online gambling transactions;
• Developed a semi-supervised Relational Graph Convolutional Network (RGCN) model for collusion risk assessment;
• Created a gradient boosting model that utilizes the Heterogeneous Graph Transformer (HGT) architecture to detect syndicate fraudsters within clustered graphs, contributing to best practices in Anti-Money Laundering (AML).

ML Deployment: Delivered 100% successful model deployments in collaboration with AI/ML engineers, including:
• Used GCP services, including Cloud Functions and Compute Engine virtual machines, for model deployment.

BAU: Contributed to up to an 11.2% reduction in fraud incidents and saved an average of 95% in potential losses each month through:
• Utilizing device intelligence and computer vision to monitor tampering, cyberattacks, and account takeovers.
• Optimizing SQL queries in BigQuery data marts, reducing query execution times by an average of 180 seconds and saving storage costs.
• Collaborating on real-time data streaming with Apache Kafka and creating Looker dashboards for efficient monitoring.

• Utilized PostgreSQL in DBeaver for daily data exploration and developed interactive Tableau dashboards for monitoring.
• Accomplished 100% completion in 2 sprints: employed ARIMA time-series regression in Python to estimate daily product quantities, and applied K-Means clustering to optimize marketing strategies and provide personalized promotions based on customer segments.

• Designed and deployed a Flask-based recommendation system API incorporating NLP techniques such as topic modeling, named entity recognition, and sentiment analysis, improving personalized job matching between seekers and listers by 30%.
• Attained 100% project delivery by utilizing Docker and Kubernetes for deployment, enhancing system scalability.

• Machine Learning Researcher - FITB Research Grant ITB
• Full-Stack Developer Internship - BTPN Syariah
• Front-End Developer Internship - Core Initiative Studio
• Data Science Research Internship - LAPAN / Indonesian National Aeronautics & Space Administration
• GIS Data Analyst Internship - Garda Caah

TECH STACKS
Programming Languages: Python, SQL, JavaScript, C++
Libraries/Tools: ML (Scikit-learn, TensorFlow, PyTorch, Keras, XGBoost, OpenCV, ARIMA, LSTM), NLP (NLTK, Hugging Face, spaCy), Visualization (Matplotlib, Seaborn, Pandas, NumPy, Plotly), Graph Networks (NetworkX, RGCNConv, HGTConv), Generative AI (GPT-4, Llama, Stable Diffusion (SDXL)), LLM Framework (OpenAI)
Databases: PostgreSQL, BigQuery, MySQL, DBMS (DBeaver)
Development: Code Editor (VS Code, Google Colab), Version Control (Git), Backend Frameworks (Flask, Django, Gin)
Deployment: Compute Engine VM, Cloud Functions, Docker, Kubernetes