Alexander Karpekov

PhD Student in Computer Science. Data Scientist.

Political Science and Economics student turned Big Tech guy turned Computer Science grad student. After spending almost ten years as a Data Scientist in industry (seven years at Google and three years at a startup), I decided to go back to school to pursue a PhD in Computer Science with a focus on AI and Machine Learning.

I am particularly interested in Explainable Artificial Intelligence and in the intersection of AI and the social sciences. I am also always exploring how to use data visualization for storytelling.

I am always open to new opportunities to collaborate on projects at the intersection of AI and other fields, so feel free to reach out!

Education

Present

PhD in Computer Science

Current research focuses on Interactive Computing, Explainable AI, and Ubiquitous Computing.

Advisors: Sonia Chernova and Thomas Plötz.

2024

MSc in Computer Science | GPA: 3.9/4.0

Completed a second Master’s degree remotely while working full-time at Google. Focused on Machine Learning and Artificial Intelligence.

2015

MA in Economics | GPA: 3.8/4.0

Worked as a Teaching Assistant for 3 graduate-level classes in Statistics and Econometrics, leading sessions for 120+ students. Received the best TA award. Regional focus was on China. Studied Mandarin Chinese.

2013

MGIMO University

Moscow, Russia

BA in Political Science | GPA: 92/100

Studied Comparative Politics. Languages: English, French. Thesis on History of Migration in the United Kingdom.

Industry Experience

2017 - 2024

Google

Dublin, Ireland & San Francisco, CA

Senior Data Scientist (L5)

Worked as a Data Scientist in Google Search and YouTube Music, focusing on statistical data analysis and on A/B experiment design and evaluation to improve search quality and music recommendations. Was promoted twice, reaching L5. Presented my work and findings at regular director-, VP-, and executive-level meetings, including to YouTube CEO Susan Wojcicki. A few notable projects:

  • Developed a pathfinding algorithm in song embedding space that improved music recommendations and led to a 3% boost in user engagement and music discovery rates. This work was presented at a Google-level Data Science Conference in 2023.
  • Implemented a new methodology to cluster YouTube's multi-billion music corpus using text, sound, search, and co-watch embeddings, which led to a 30% reduction in harmful watchtime and a 0.5% increase in music revenue (hundreds of millions of dollars).
  • Created a new counterfactual causal impact methodology to evaluate the impact of a new feature launch on user engagement and conversion, which helped establish that the launch had no statistically significant long-term effect on key business metrics. The analysis was instrumental in the decision, made at the Engineering and Product VP level, to halt the global rollout.

2015 - 2017

Dataminr

London, UK & New York, NY

Data Analyst

Worked as a Data Analyst in the Data Science team, focusing on Twitter data analysis and news discovery algorithms.

  • Built statistical models to automatically classify Twitter user handles.
  • Conducted Twitter user clustering and unsupervised learning using network analysis methodologies to improve news discovery algorithms.
  • Led a company-wide effort to automate reporting, replacing Excel with Python.

Publications

2024

Transformer Explainer: Interactive Learning of Text-Generative Models

IEEE VIS, AAAI Demo Track

Aeree Cho, Grace Kim, Alexander Karpekov, et al.

An interactive visualization tool that helps users understand how transformer models work through hands-on experimentation and real-time feedback.

  • Best Poster Award at IEEE VIS 2024
  • Went viral: 150K+ visitors in the first 3 months

2023

Is Attention Truly All We Need?

Deep Learning for Text: Final Project

Alexander Karpekov, Sidney Miller

This project investigates the use of Transformer attention weights for deriving feature importance in NLP tasks. It demonstrates that combining attention weights with gradient information improves explainability, and it provides an open-source GitHub tool for applying this method to any Transformer model.

2014

Double-Relocation Policy Evaluation in Guangdong, China using Night Lights Data

ArcGIS: Final Project

Alexander Karpekov

This project examines Guangdong's shifting economic growth using Night Lights data from satellites, focusing on development beyond the Pearl River Delta and the impact of 2008 government policies.

Skills

Programming

  • Python
  • SQL
  • TypeScript
  • R
  • Stata
  • C
  • Java

ML & DS

  • PyTorch
  • Hugging Face
  • TensorFlow
  • Keras
  • Scikit-learn
  • Statsmodels
  • XGBoost

Data & Viz

  • NumPy
  • Pandas
  • SciPy
  • Jupyter
  • Colab
  • Matplotlib
  • Altair
  • Plotnine

Frontend

  • Svelte
  • D3
  • HTML
  • Tailwind CSS
  • Figma
  • Illustrator

Languages

  • English
  • Russian
  • French
  • German
  • Mandarin
  • Latin
  • Ancient Greek