Experience

Ongoing first (rest sorted by end date)

Max Planck Institute for Software Systems (MPI-SWS)

Research Software Engineer • April, 2024 — Present [Full Time] | Nov, 2023 — March, 2024 [Part Time] | Aug, 2023 — Oct, 2023 [Intern]

Working under Dr. Krishna Gummadi to explore different aspects of LLMs. Areas we have explored or are exploring include:

  • Memorization in LLMs
  • Evaluating factual knowledge in LLMs
  • Can LLMs learn regular grammars?

EleutherAI

Open Source Contributor • Dec, 2022 — Present

Working on predicting memorization behavior in LLMs by finding which strings from the training data will be memorized. Previously worked on the Pythia model suite.

Laboratory for Computational Social Systems (LCS2)

Undergraduate Student Researcher • June, 2021 — May, 2024

I've worked on a variety of projects, from hate speech normalization to designing recommendations for fine-tuning improved hate speech detectors. I also led the QUENCH project, a benchmark aimed at evaluating advanced reasoning abilities in large language models, with a particular emphasis on Indic contexts.

Goldman Sachs

Summer Analyst • May, 2023 — July, 2023

Worked in the Finance, Planning & Analysis Engineering division on revamping the department's central hub. Built POCs based on user feedback to improve the search and access experience on the webapp. Received a return offer to join full time as an Analyst.

Google Summer of Code - TensorFlow

Open Source Developer • May, 2022 — Sept, 2022

Worked with Matthew Watson & Chen Qian on adding support for data augmentation layers to KerasNLP, a library in the Keras/TensorFlow ecosystem that aims to build industry-oriented NLP solutions. I also contributed to several bug fixes and other utilities, such as tokenizers and the transformer encoder & decoder.

Education

Indraprastha Institute of Information Technology (IIIT-D)

B.Tech. in Computer Science and Engineering • 2020 — 2024

  • Dean's List for Academic Excellence (2022-23)
  • Dean's List for Innovation in Research and Development (2022-23)
  • Dean's List for Academic Excellence (2021-22)

GPA - 9.63/10 [Dept. Rank 2 & Batch Rank 3]

Lal Bahadur Shastri School

Senior-Secondary Education (12th Grade) • 2020

Secured 95% in All India Senior School Certificate Examination

Banyan Tree School

Secondary Education (10th Grade) • 2018

Secured 95.8% in All India Secondary School Examination

Publications

* indicates equal contribution

Qinyuan Wu, Mohammad Aflah Khan, Soumi Das, Vedant Nanda, Bishwamittra Ghosh, Camila Kolling, Till Speicher, Laurent Bindschaedler, Krishna P Gummadi, Evimaria Terzi • 2024

The 18th ACM International Conference on Web Search and Data Mining (ACM WSDM 2025)

USVSN Sai Prashanth, Alvin Deng, Kyle O'Brien, Jyothir S V, Mohammad Aflah Khan, Jaydeep Borkar, Christopher A. Choquette-Choo, Jacob Ray Fuehne, Stella Biderman, Tracy Ke, Katherine Lee, Naomi Saphra • 2024

Under Review

Mohammad Aflah Khan, Neemesh Yadav, Sarah Masud, Md Shad Akhtar • 2024

Under Review

Till Speicher, Mohammad Aflah Khan, Qinyuan Wu, Vedant Nanda, Soumi Das, Bishwamittra Ghosh, Krishna P. Gummadi, Evimaria Terzi • 2024

Under Review

Mohammad Aflah Khan*, Neemesh Yadav*, Diksha Sethi*, Raghav Sahni* • 2024

The Second Tiny Papers Track at the Twelfth International Conference on Learning Representations (ICLR)

Sarah Masud*, Mohammad Aflah Khan*, Vikram Goyal, Md Shad Akhtar, Tanmoy Chakraborty • 2024

Proceedings of The 18th Conference of the European Chapter of the Association for Computational Linguistics (EACL)

Shrey Satapara, Sarah Masud, Hiren Madhu, Mohammad Aflah Khan, Md Shad Akhtar, Tanmoy Chakraborty, Sandip Modha, Thomas Mandl • 2023

Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (FIRE)

Sarah Masud, Mohammad Aflah Khan, Md. Shad Akhtar, Tanmoy Chakraborty • 2023

Proceedings of the 15th Annual Meeting of the Forum for Information Retrieval Evaluation (FIRE)

Stella Biderman, Hailey Schoelkopf, Quentin Gregory Anthony, Herbie Bradley, Kyle O’Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, Oskar van der Wal • 2023

Proceedings of The 40th International Conference on Machine Learning (ICML)

Mohammad Aflah Khan*, Neemesh Yadav*, Mohit Jain, Sanyam Goyal • 2023

The First Tiny Papers Track at the Eleventh International Conference on Learning Representations (ICLR)

Neemesh Yadav*, Mohammad Aflah Khan*, Diksha Sethi, Raghav Sahni • 2023

The First Tiny Papers Track at the Eleventh International Conference on Learning Representations (ICLR)

Sarah Masud, Manjot Bedi, Mohammad Aflah Khan, Md Shad Akhtar, Tanmoy Chakraborty • 2022

Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining

Skills

Machine Learning/Deep Learning

PyTorch, TensorFlow, Keras, Scikit-learn, HuggingFace

Back-end Development

Flask, FastAPI, Spring Boot

Front-end Development

HTML, CSS, JavaScript, ReactJS, Bootstrap, Tailwind CSS, Streamlit

Programming Languages

Python, Java, JavaScript

Awards, Achievements, and Certifications

Selected for Amazon ML Summer School

Amazon • 2022

Finalist

Anveshan Hackathon • 2022

Runner-Up

Byld Hackathon • 2021

Runner-Up

Association for Computing Machinery (ACM) IIITD Induction Ideathon • 2021

All India Rank 491

JEE Mains Paper 2 • 2020

Top 0.66 Percentile

JEE Mains Paper 1 • 2020

All India Rank 130

Undergraduate Entrance Examination (UGEE) • 2020

Organizing, Reviewing, Volunteering & Talks

Max Planck Institute for Software Systems (MPI-SWS)

Talk on Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling • July, 2023

Goldman Sachs Internal NLP Paper Reading Club

Talk on Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling • June, 2023

Other Involvements

  • Teaching Assistant - Machine Learning under Dr. Anubha Gupta
  • Teaching Assistant - Data Structures and Algorithms under Dr. Piyus Kedia
  • Research Event Organizing Team Lead - Esya 2023 - IIITD's Annual Technical Festival
  • Coordinator - BioBytes: The Computational Biology and Data Science Club at IIITD
  • Core Member - Byld: The Development Club at IIITD
  • Mentor - Undergraduate Research Club

Outside Interests

  • Chess
  • Webtoons, Mangas, Manhwas, Manhuas and Anime