Research Software Engineer @ MPI-SWS · OSS @ EleutherAI
Hi, I’m Aflah, a research software engineer at the Max Planck Institute for Software Systems. My work centers on deepening our understanding of large language models (LLMs) and rigorously evaluating their capabilities. I’m also passionate about the systems side of LLMs, with hands-on experience in large-scale pretraining and inference. In the past, I’ve contributed to projects targeting hate speech reduction and other NLP applications for social good.
Open to roles in research, research engineering, or backend engineering
Working under Dr. Krishna Gummadi to explore different aspects of LLMs. Areas we have explored or are currently exploring include:
Optimizing pre-training and inference for LLMs
LLM memorization and the impact of Parameter-Efficient Fine-Tuning (PEFT) on memorization
Knowledge acquisition and evaluation of factual knowledge in LLMs
Built and currently maintain key internal tools: OpenChat (an internal chatbot), MaxCast (a research paper-to-podcast conversion service), and MaxChat (a document-based chat service). These services were developed from scratch, including hosting models on-premises and fine-tuning for optimal performance.
Published and submitted research to top-tier (A*) conferences
EleutherAI
Open Source Contributor • Dec, 2022 — Present
Currently working on the Multilingual Natural Instructions project to build a massive instruction tuning corpus for Hindi. Previously worked on:
Pythia - A Suite for Analyzing Large Language Models Across Training and Scaling (Accepted ICML'23) - Contributed substantially to the gender bias evals and intervention case study. The models have over 18 million downloads (as of April 2025).
Recite, Reconstruct, Recollect - Memorization in LMs as a Multifaceted Phenomenon (Accepted ICLR'25) - An intuitive taxonomy for classifying memorized sequences, used to build predictors based on these classes.
Laboratory for Computational Social Systems (LCS2)
Undergraduate Student Researcher • June, 2021 — May, 2024
I've worked on a variety of projects, from hate speech normalization to designing recommendations for fine-tuning improved hate speech detectors. I also led the QUENCH project, a benchmark aimed at evaluating advanced reasoning abilities in large language models, with a particular emphasis on Indic contexts.
Goldman Sachs
Summer Analyst • May, 2023 — July, 2023
Worked in the Finance, Planning & Analysis Engineering division on revamping the department's central hub. Built POCs based on user feedback to improve the search and access experience on the web app. Received a return offer to join full time as an Analyst.
Google Summer of Code - TensorFlow
Open Source Developer • May, 2022 — Sept, 2022
Worked with Matthew Watson & Chen Qian on adding support for data augmentation layers to KerasNLP, a library in the Keras/TensorFlow ecosystem that aims to provide industry-oriented NLP solutions. I also contributed several bug fixes and other utilities, such as tokenizers and the transformer encoder & decoder.
Mohammad Aflah Khan, Mahsa Amani, Soumi Das, Bishwamittra Ghosh, Qinyuan Wu, Krishna P. Gummadi, Manish Gupta, Abhilasha Ravichander
ICLR 2026 - The Fourteenth International Conference on Learning Representations
IASEAI 2026 - The International Association for Safe & Ethical AI (Non-Archival)
(An earlier version of this work was presented at R2-FM, ICML 2025)
Johnny Tian-Zheng Wei*, Ameya Godbole*, Mohammad Aflah Khan*, Ryan Wang, Xiaoyuan Zhu, James Flemings, Nitya Kashyap, Krishna P. Gummadi, Willie Neiswanger, Robin Jia
[Oral] ICLR 2026 - The Fourteenth International Conference on Learning Representations
Data-FM @ ICLR 2026 - Workshop on Navigating and Addressing Data Problems for Foundation Models (Non-Archival)
Qinyuan Wu, Soumi Das, Mahsa Amani, Bishwamittra Ghosh, Mohammad Aflah Khan, Krishna P. Gummadi, Muhammad Bilal Zafar
ICLR 2026 - The Fourteenth International Conference on Learning Representations
IASEAI 2026 - The International Association for Safe & Ethical AI (Non-Archival)
(An earlier version of this work was presented at MemFM, ICML 2025)
USVSN Sai Prashanth, Alvin Deng, Kyle O'Brien, Jyothir S V, Mohammad Aflah Khan, Jaydeep Borkar, Christopher A. Choquette-Choo, Jacob Ray Fuehne, Stella Biderman, Tracy Ke, Katherine Lee, Naomi Saphra
ICLR 2025 - The Thirteenth International Conference on Learning Representations
Stella Biderman, Hailey Schoelkopf, Quentin Gregory Anthony, Herbie Bradley, Kyle O'Brien, Eric Hallahan, Mohammad Aflah Khan, Shivanshu Purohit, USVSN Sai Prashanth, Edward Raff, Aviya Skowron, Lintang Sutawika, Oskar van der Wal
[Oral] ICML 2023 - The Fortieth International Conference on Machine Learning
ACL Rolling Review (ARR), International Conference on Computational Linguistics (COLING), Workshop on Online Abuse and Harms (WOAH), The Technical Symposium on Computer Science Education (SIGCSE TS)
Served as a reviewer for the above-mentioned venues • 2023 Onwards