About me

I am a third-year Ph.D. student in the Allen School of Computer Science and Engineering at the University of Washington, advised by Yulia Tsvetkov. I do research in Natural Language Processing and Cognitive Modeling, with a growing emphasis on personalization and proactive learning (question-asking).

I'm particularly interested in using computational methods to model cognitive processes, including how humans reason, communicate uncertainty, and make decisions in complex domains like healthcare. My long-term goal is to build socially and cognitively aligned AI systems that support safer, more personalized, and equitable care.

Research interests: Proactive Learning, Social Reasoning, AI for Health, Safety & Reliability, and more!

Before grad school, I received my B.S. and M.S.E. from Johns Hopkins with majors in Computer Science, Cognitive Science (linguistics focus), and Applied Mathematics (statistics focus). I worked as a research assistant at the Center for Language and Speech Processing, advised by Philipp Koehn and Kenton Murray.

Please contact me at stelli [at] cs.washington.edu if you are interested in my work!

  • Click here to view my CV (updated July 25)

     
  • I'm thinking about...

    • Proactive Reasoning

      How can LLMs identify and proactively seek information to improve model safety & reliability with statistical guarantees? How do we make LLMs ask good questions? How do we model "intuition" in expert domains like medicine?

    • Socially-Intelligent Personalization

      Modeling how different social groups express health concerns and interpret medical advice. Aiming to personalize AI systems for more equitable, culturally-aware health communication.

    News

    1. 2026-03

      Guest lecture at UBC NLP: "Proactive Question Asking for Reliable and Personalized LLMs." [Slides].

    2. 2026-02

      Check out our new paper "Cold-Start Personalization via Training-Free Priors from Structured World Models" that learns priors from population preferences for interactive personalization.

    3. 2026-01

      Our paper "🪩PrefDisco: Benchmarking Proactive Personalized Reasoning" got accepted to ICLR 2026🇧🇷!

    4. 2025-11

      Check out our new paper "Cognitive Foundations for Reasoning and Their Manifestation in LLMs" that extracts and analyzes patterns in LLM and human reasoning.

    5. 2025-11

      Guest lecture at UT Austin Computational Discourse and NLG class on PrefPalette [Slides].

    6. 2025-08

      "PrefPalette: Personalized Preference Modeling with Latent Attributes" won a Spotlight at COLM 2025🏆!

    7. 2025-06

      Presenting Spurious Rewards at Cohere Labs [YouTube] [Slides].

    8. 2025-06

      Prompt engineering can elicit behaviors in models similar to those elicited by RLVR. We show that a "Spurious Prompt" can also boost Qwen2.5-Math MATH-500 performance by 20%‼️ Check out our new blogpost "Spurious Rewards and Spurious Prompts."

    9. 2025-05

      Doing RLVR on incorrect and even random rewards can boost Qwen2.5-Math MATH-500 performance by 20%🤯 We explore how and why this happens in our new paper "Spurious Rewards: Rethinking Training Signals in RLVR" (blogpost).

    Experience

  • Education

    1. University of Washington

      2023 — present | Seattle, WA

      Ph.D. in Computer Science and Engineering

      Advised by Yulia Tsvetkov.

    2. Johns Hopkins University

      2022 — 2023 | Baltimore, MD

      M.S.E. in Computer Science with Human Language Technology Concentration

      Advised by Philipp Koehn and Kenton Murray.

      Thesis: Learning from Gibberish: Code-Mixing Data Augmentation for Sentiment Analysis

    3. Johns Hopkins University

      2019 — 2022 | Baltimore, MD

      B.S. in Applied Mathematics and Statistics

      Advised by Philipp Koehn and Ed Scheinerman.

      Other Majors: Computer Science, Cognitive Science (linguistics focus)

      Minor: Mathematics

    4. Stanford Online High School

      2018 — 2019 | Palo Alto, CA

      Dual enrollment program with a focus on advanced mathematics

    5. Robert Louis Stevenson School

      2016 — 2019 | Pebble Beach, CA

      Awards & Leadership: Cum Laude Society, USABO Semifinalist, USAMO Qualified, Bausch & Lomb National Science Award, Math Madness Silver Medalist, Math Team Captain, Spanish National Honor Society, Varsity Volleyball

    Teaching/TA Experience

    1. Ethics in AI: Teaching Assistant

      2025 Winter

      CSE 582

    2. Introduction to Statistics: Teaching Assistant

      2020 Spring, 2021 Fall, 2022 Spring, 2023 Spring

      EN.503.430 (undergrad) & EN.503.630 (grad) & EN.503.431 (honors)

    3. Artificial Intelligence: Course Assistant

      2023 Spring

      EN.601.464 (undergrad) & EN.601.664 (grad)

    4. Human-Computer Interaction: Course Assistant

      2022 Fall

      EN.601.490 (undergrad) & EN.601.690 (grad)

    5. Computer Ethics: Head Course Assistant

      2022 Summer

      EN.601.104

    6. Intermediate Programming: Course Assistant

      2020 Spring, 2021 Fall, 2022 Spring

      EN.601.220

    Work Experience

    1. Meta FAIR: Visiting Researcher

      2024 - Current | Seattle, WA

      Advised by Asli Celikyilmaz.

      Working on Social Alignment on the SAGE Team at Meta FAIR.

    2. Yext: Software Engineering Intern

      2022 Summer | Arlington, VA

      Integrated client data into the Yext platform for real-time site information updates using Go.

      Created a Figma Style Picker to improve developer workflow and scalability using ReactJS.

    3. Michigan State University: Research Intern

      2021 Summer | East Lansing, MI

      Advised by Wolfgang Banzhaf.

      Designed and implemented a novel GP algorithm for LLVM compiler flag optimization (20%).

      Published work at GECCO; second author of GP paper; first author of GI paper.

    4. Johns Hopkins Language and Cognition Lab: Research Assistant

      2020 - 2022 | Baltimore, MD

      Advised by Barbara Landau.

      Investigated developmental spatial cognition using Lego Block building.

      Created an ML model for movement prediction and stability analysis using motion sensor data.

    Publications

    Below is a list of projects in which I was heavily involved (lead/co-lead/contributed significantly). For a more comprehensive list of papers, check out my Google Scholar page. I also try to record the time that I spent on each project in case anyone finds it helpful!

    • Cold-Start Personalization via Training-Free Priors from Structured World Models

      Personalization

      Preprint

      Avinandan Bose*, Shuyue Stella Li*, Faeze Brahman, Pang Wei Koh, Simon Shaolei Du, Yulia Tsvetkov, Maryam Fazel, Lin Xiao, Asli Celikyilmaz

      When no user-specific data is available, we propose to learn priors from population preferences for interactive personalization.

    • Cognitive Foundations for Reasoning and Their Manifestation in LLMs

      Cognitive Reasoning

      Preprint

      Priyanka Kargupta*, Shuyue Stella Li*, Haocheng Wang, Jinu Lee, Shan Chen, Orevaoghene Ahia, Dean Light, Thomas L. Griffiths, Max Kleiman-Weiner, Jiawei Han, Asli Celikyilmaz, Yulia Tsvetkov

      What is reasoning? We introduce a taxonomy for cognitive elements used in human reasoning and analyze how LLMs exhibit these elements in their reasoning processes.

    • 🪩PrefDisco: Benchmarking Proactive Personalized Reasoning

      Personalization

      ICLR 2026

      Shuyue Stella Li*, Avinandan Bose*, Faeze Brahman, Simon Shaolei Du, Pang Wei Koh, Maryam Fazel, Yulia Tsvetkov

      We propose PrefDisco, a benchmark for proactive personalized reasoning in which models need to ask the user questions to learn their preferences, then adapt their reasoning and responses accordingly.

    • PrefPalette: Personalized Preference Modeling with Latent Attributes

      Personalization

      COLM 2025 Spotlight 🏆

      Shuyue Stella Li, Melanie Sclar, Hunter Lang, Ansong Ni, Jacqueline He, Puxin Xu, Andrew Cohen, Chan Young Park, Yulia Tsvetkov, Asli Celikyilmaz

      Grounded in multi-attribute decision making from cognitive science, we propose PrefPalette, a framework for learning preference models with additional signals from latent social attributes (e.g., humor, cultural values).

    • Spurious Rewards: Rethinking Training Signals in RLVR

      RL & Post Training

      Preprint

      Rulin Shao*, Shuyue Stella Li*, Rui Xin*, Scott Geng*, Yiping Wang, Sewoong Oh, Simon Shaolei Du, Nathan Lambert, Sewon Min, Ranjay Krishna, Yulia Tsvetkov, Hannaneh Hajishirzi, Pang Wei Koh, Luke Zettlemoyer

      We show that RLVR on incorrect and even random rewards can boost performance for some models but not others, and investigate why.

    • A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage

      Privacy & Safety

      SaTML 2026

      Rui Xin, Niloofar Mireshghallah, Shuyue Stella Li, Michael Duan, Hyunwoo Kim, Yejin Choi, Yulia Tsvetkov, Sewoong Oh, Pang Wei Koh

      We evaluate the effectiveness of sanitization methods in removing sensitive information from text data and show previously undetected semantic leakage.

    • ALFA: attribute-guided alignment for question-asking

      Clinical Reasoning, Post Training

      ALFA: Aligning LLMs to Ask Good Questions - A Case Study in Clinical Reasoning

      COLM 2025

      Shuyue Stella Li*, Jimin Mun*, Faeze Brahman, Jonathan S. Ilgen, Yulia Tsvetkov, Maarten Sap

      Guided by attributes from clinical communications and psychology, we generate synthetic paired data to align LLMs to ask good questions.

    • ValueScope: social norms and values detector

      Social Reasoning

      ValueScope: Unveiling Implicit Norms and Values via Return Potential Model of Social Interactions

      EMNLP 2024

      Chan Young Park*, Shuyue Stella Li*, Hayoung Jung*, Svitlana Volkova, Tanushree Mitra, David Jurgens, Yulia Tsvetkov

      We developed a computational framework to model and discover implicit social norms and values in online communities at scale.

    • MediQ: interactive medical consultation framework

      Clinical Reasoning

      MediQ: Question-Asking LLMs and a Benchmark for Reliable Interactive Clinical Reasoning

      NeurIPS 2024

      Shuyue Stella Li, Vidhisha Balachandran, Shangbin Feng, Jonathan Ilgen, Emma Pierson, Pang Wei Koh, Yulia Tsvetkov

      We establish a novel framework for interactive information seeking to enhance reliable medical reasoning abilities in LLMs.
