Stella Li

Email
stelli@cs.washington.edu
Google Scholar
CRfOlOEAAAAJ
GitHub
stellalisy
Location

Seattle, Washington, USA

About me

I am a second-year Ph.D. student in the Allen School of Computer Science and Engineering at the University of Washington, advised by Yulia Tsvetkov. I do research in Natural Language Processing, and I'm particularly interested in using computational methods to model and potentially discover cognitive processes.

Before grad school, I received my B.S. and M.S.E. at Johns Hopkins with majors in Computer Science, Cognitive Science (linguistics focus), and Applied Mathematics (statistics focus). I worked as a research assistant at the Center for Language and Speech Processing advised by Philipp Koehn and Kenton Murray.

My research interests: NLP, Clinical Reasoning, Social Reasoning, Human-Centered NLP, Multilinguality, and more!

Please contact me at stelli [at] cs.washington.edu if you are interested in my work!

Click here to view my CV (updated Dec. 24)

Current Projects

Proactive Reasoning

I'm thinking about how to identify and proactively seek information using LLMs to improve model safety & reliability with statistical guarantees. How can we make LLMs ask good questions? How do we model "intuition" in expert domains?
Socially-Intelligent Personalization

How do people in different social groups interact differently? How do unspoken rules shape behaviors and interactions? We need to first learn these explicit & implicit social preferences of users, then personalize models to interact accordingly.

News

2025-06

Presenting Spurious Rewards at Cohere Labs [YouTube] [Slides].
2025-06

Prompt engineering can elicit similar behaviors in models as RLVR does. We show that "Spurious Prompt" can boost Qwen2.5-Math MATH-500 performance by 20% as well‼️ Check out our new blogpost "Spurious Rewards and Spurious Prompts."
2025-05

Doing RLVR on incorrect and even random rewards can boost Qwen2.5-Math MATH-500 performance by 20%🤯 We explore how and why this happens in our new paper "Spurious Rewards: Rethinking Training Signals in RLVR" (blogpost).
2025-04

Standard privacy evaluations focus mainly on surface-level lexical identifiers, but we show that semantic leakage is a more serious threat🚨. Check out our new paper "A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage."
2025-02

Check out our new paper "ALFA: Aligning LLMs to Ask Good Questions - A Case Study in Clinical Reasoning," in which we align LLMs by decomposing a complex goal into attributes and synthesizing paired data for preference learning.
2024-12

Presenting MediQ at NeurIPS 2024 in Vancouver 🍁.
2024-11

Presenting ValueScope and Multilingual Abstention at EMNLP 2024 in Miami 🌴.
2024-09

Joining Meta FAIR as a visiting researcher.
2024-08

Giving a lightning talk at the JHU Responsible AI for Health Symposium (RAIHS) on MediQ.
2024-08

Giving a talk at MSR Real-world Evidence Lab on MediQ.
2024-07

Check out our new paper "ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions" to see how we "read the room" using LLMs.
2024-06

Check out our new paper "MEDIQ: Question-Asking LLMs for Adaptive and Reliable Clinical Reasoning," in which we teach LLMs to ask--instead of answer--clinical questions!
2023-12

Presenting Condensing Multilingual Knowledge with Lightweight Language-Specific Modules at EMNLP 2023 in Singapore!
2023-09

Starting my Ph.D. at the University of Washington, excited to be part of Tsvetshop!
2022-12

Check out our new paper "A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors."
2022-11

Check out our new paper "Language Agnostic Code-Mixing Data Augmentation by Predicting Linguistic Patterns."
2022-10

Check out our new paper "A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters."
2022-10

Check out our new paper "PQLM - Multilingual Decentralized Portable Quantum Language Model for Privacy Protection."
2022-09

Check out our new paper "End-to-End Lyrics Recognition with Self-supervised Learning."
2022-07

Check out our workshop paper "Genetic improvement in the shackleton framework for optimizing LLVM pass sequences" accepted to GECCO'22 (Best Presentation Award).
2022-03

Check out our paper "Optimizing LLVM Pass Sequences with Shackleton: A Linear Genetic Programming Framework" accepted to GECCO'22.

Experience

Education

University of Washington
2023 — present | Seattle, WA
Ph.D. in Computer Science and Engineering

Advised by Yulia Tsvetkov.
Johns Hopkins University
2022 — 2023 | Baltimore, MD
M.S.E. in Computer Science with Human Language Technology Concentration

Advised by Philipp Koehn and Kenton Murray.

Thesis: Learning from Gibberish: Code-Mixing Data Augmentation for Sentiment Analysis
Johns Hopkins University
2019 — 2022 | Baltimore, MD
B.S. in Applied Mathematics and Statistics

Advised by Philipp Koehn and Ed Scheinerman.

Other Majors: Computer Science, Cognitive Science (linguistics focus)

Minor: Mathematics
Stanford Online High School
2018 — 2019 | Palo Alto, CA
Dual enrollment program with a focus in advanced mathematics
Robert Louis Stevenson School
2016 — 2019 | Pebble Beach, CA
Awards & Leadership: Cum Laude Society, USABO Semifinalist, USAMO Qualified, Bausch & Lomb National Science Award, Math Madness Silver Medalist, Math Team Captain, Spanish National Honor Society, Varsity Volleyball

Teaching/TA Experience

Ethics in AI: Teaching Assistant
2025 Winter
CSE 582
Introduction to Statistics: Teaching Assistant
2020 Spring, 2021 Fall, 2022 Spring, 2023 Spring
EN.503.430 (undergrad) & EN.503.630 (grad) & EN.503.431 (honors)
Artificial Intelligence: Course Assistant
2023 Spring
EN.601.464 (undergrad) & EN.601.664 (grad)
Human-Computer Interaction: Course Assistant
2022 Fall
EN.601.490 (undergrad) & EN.601.690 (grad)
Computer Ethics: Head Course Assistant
2022 Summer
EN.601.104
Intermediate Programming: Course Assistant
2020 Spring, 2021 Fall, 2022 Spring
EN.601.220

Work Experience

Meta FAIR: Visiting Researcher
2024 - Current | Seattle, WA
Advised by Asli Celikyilmaz.

Working on Social Alignment on the SAGE Team at Meta FAIR.
Yext: Software Engineering Intern
2022 Summer | Arlington, VA
Integrated client data into the Yext platform for real-time site information updates using Go.

Created a Figma Style Picker to improve developer workflow and scalability using ReactJS.
Michigan State University: Research Intern
2021 Summer | East Lansing, MI
Advised by Wolfgang Banzhaf.

Designed and implemented a novel GP algorithm for LLVM compiler flag optimization (20% runtime improvement).

Published work at GECCO; second author of GP paper; first author of GI paper.
Bytedance AI Lab: Research Intern
2020 Summer | Beijing, China
Trained neural networks for text normalization in text-to-speech tasks.

Implemented statistical information-retrieval algorithms for theme clustering and complexity ranking for TikTok videos.
Johns Hopkins Language and Cognition Lab: Research Assistant
2020 - 2022 | Baltimore, MD
Advised by Barbara Landau.

Investigated developmental spatial cognition using Lego Block building.

Created ML model for movement prediction and stability analysis using motion sensor data.

Publications

Below is a list of projects in which I was heavily involved (lead/co-lead/contributed significantly). For a more comprehensive list of papers, check out my Google Scholar page. I also try to record the time that I spent on each project in case anyone finds it helpful!

RL & Post Training
Jan. 25 - May 25

Spurious Rewards: Rethinking Training Signals in RLVR

Preprint

Rulin Shao*, Shuyue Stella Li*, Rui Xin*, Scott Geng*, Yiping Wang, Sewoong Oh, Simon Shaolei Du, Nathan Lambert, Sewon Min, Ranjay Krishna, Yulia Tsvetkov, Hannaneh Hajishirzi, Pang Wei Koh, Luke Zettlemoyer

We show that RLVR on incorrect and even random rewards can boost model performance on some models but not others, and investigate why.
Privacy & Safety
Jan. 24 - May 25

A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage

Preprint

Rui Xin, Niloofar Mireshghallah, Shuyue Stella Li, Michael Duan, Hyunwoo Kim, Yejin Choi, Yulia Tsvetkov, Sewoong Oh, Pang Wei Koh

We evaluate the effectiveness of sanitization methods in removing sensitive information from text data and show previously undetected semantic leakage.
Clinical Reasoning, Post Training
Mar. 24 - Feb. 25

ALFA: Aligning LLMs to Ask Good Questions - A Case Study in Clinical Reasoning

Preprint

Shuyue Stella Li*, Jimin Mun*, Faeze Brahman, Jonathan S. Ilgen, Yulia Tsvetkov, Maarten Sap

Guided by attributes from clinical communications and psychology, we generate synthetic paired data to align LLMs to ask good questions.
Social Reasoning
Oct. 23 - June 24

ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions

EMNLP 2024

Chan Young Park*, Shuyue Stella Li*, Hayoung Jung*, Svitlana Volkova, Tanushree Mitra, David Jurgens, Yulia Tsvetkov

We developed a computational framework to model and discover implicit social norms and values in online communities at scale.
Clinical Reasoning
Oct. 23 - May 24

MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning

NeurIPS 2024

Shuyue Stella Li, Vidhisha Balachandran, Shangbin Feng, Jonathan Ilgen, Emma Pierson, Pang Wei Koh, Yulia Tsvetkov

We establish a novel framework for interactive information seeking to enhance reliable medical reasoning abilities in LLMs.
Multilinguality
Jan. 23 - May 23

Learning from Mistakes: Towards Robust Neural Machine Translation for Disfluent L2 Sentences

MT Summit 2023

Shuyue Stella Li, Philipp Koehn

We develop a multilingual MT system from non-native source sentences to investigate the differences between human language acquisition and machine language acquisition. We create synthetic L2 data to enhance the robustness of L2 MT systems.
Speech Processing
Feb. 23 - May 23

Simple yet Effective Code-Switching Language Identification with Multitask Pre-Training and Transfer Learning

arXiv

Shuyue Stella Li, Cihan Xiao, Tianjian Li, Bismarck Odoom

We propose a residual deep CNN model and a multi-task learning model to identify segment-level languages in English-Mandarin code-switched utterances. Data augmentation and fine-tuning methods are also explored to improve performance.
Multilinguality
Feb. 23 - May 23

Condensing Multilingual Knowledge with Lightweight Language-Specific Modules

EMNLP 2023

Haoran Xu, Weiting Tan*, Shuyue Stella Li*, Yunmo Chen*, Benjamin Van Durme, Philipp Koehn, Kenton Murray

We introduce Language-Specific Matrix Synthesis (LMS) and Fuse Distillation (FD) to condense knowledge and improve model efficiency in multilingual language models.
Speech Processing
May 22 - May 23

A Quantitative Approach to Understand Self-Supervised Models as Cross-lingual Feature Extractors

ICNLSP 2023

Shuyue Stella Li*, Beining Xu*, Xiangyu Zhang*, Hexin Liu, Wenhan Chao, Leibny Paola Garcia

We propose a linguistically motivated metric, the Phonetic-Syntax Ratio (PSR), to quantify the composition of phonetic and syntactic contents in learned representations.
Multilinguality
May 22 - Nov. 22

Language Agnostic Code-Mixing Data Augmentation by Predicting Linguistic Patterns

arXiv

Shuyue Stella Li, Kenton Murray

We establish a language-agnostic synthetic code-mixing algorithm that generates code-mixing datasets in a zero-cost fashion.
Language Processing
Mar. 22 - Oct. 22

PQLM: Multilingual Decentralized Portable Quantum Language Model for Privacy Protection

ICASSP 2023

Shuyue Stella Li*, Xiangyu Zhang*, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola Garcia

We propose a quantum language model that can easily transmit learned information to downstream classical models, demonstrating ad hoc model portability.
Signal Processing
Mar. 22 - Oct. 22

A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters

ICASSP 2023

Yu Xuan, Xiangyu Zhang, Shuyue Stella Li, Zihan Shen, Leibny Paola Garcia, Roberto Togneri

This project uses an affine combination of adaptive filters to detect the fetal heart rate from ECG signals recorded at the mother's abdomen and thorax. Our algorithm outperforms both single adaptive filters and a convex combination of RLS filters.
Speech Processing
Feb. 22 - Oct. 22

End-to-End Lyrics Recognition with Self-supervised Learning

arXiv

Xiangyu Zhang, Shuyue Stella Li, Zhanhong He, Roberto Togneri, Leibny Paola Garcia

We design an end-to-end lyrics recognition model using self-supervised models trained on spoken speech that outperforms the previous SOTA.
Genetic Programming
May 21 - Jul. 22

Genetic improvement in the shackleton framework for optimizing LLVM pass sequences

GECCO 2022 Best Presentation Award

Shuyue Stella Li, Hannah Peeler, Andrew Sloss, Kenneth Reid, Yuan Yuan, Wolfgang Banzhaf

We present the novel use of genetic improvement to find problem-specific optimized LLVM Pass sequences. This demonstrates the flexibility of the Shackleton Framework in being applied to a variety of use cases.
Genetic Programming
May 21 - Jul. 22

Optimizing LLVM pass sequences with shackleton: a linear genetic programming framework

GECCO 2022

Hannah Peeler, Shuyue Stella Li, Andrew Sloss, Kenneth Reid, Yuan Yuan, Wolfgang Banzhaf

We design and implement a novel GP algorithm for LLVM compiler flag optimization, achieving a 20% runtime improvement over a strong default baseline.
Machine Learning
Sep. 20 - Jan. 22

Community Detection in Real World Complex Mobility Networks

JHU Design Symposium 2022

Weicheng Hu, Shuyue Stella Li, Shanelle Cao, Bohan Hou, TJ Bai, Anton Dahbura

We design an expansion-based clustering algorithm for community detection using COVID-19 mobility data to optimize disease simulation models.
Computational Biology
May 18 - Aug. 18

Analyzing the Topological Transformation Probability of DNA using Models of Cre Recombinase

UC Davis COSMOS Symposium 2018

Shuyue Stella Li, Janani Sekar, Jeffrey Yang

We model the activity of DNA recombinase CRE-lox and predict the topological transformations of DNA knots treated with Cre Recombinase. This research has potential pharmaceutical applications that can be furthered to improve the efficiency of enzyme activity in transforming circular DNA chains into the unknotted form.