I am a sixth-year (and final-year!) PhD candidate at the Paul G. Allen School of Computer Science & Engineering at the University of Washington, where I am advised by Tim Althoff. I am affiliated with UW NLP.
Previously, I was a Student Researcher at Google Research, an ML Research Intern at Apple Health AI, and a data scientist (and the second full-time employee) at HealthRhythms.
My work focuses on methods, datasets, and benchmarks for training and evaluating large models (including language models) on time series data and code. Recently I have been teaching multimodal language models to reason about time series, and building LLM agents that can make decisions and take actions based on these inputs.
Language Models Still Struggle to Zero-shot Reason about Time Series
, Mingtian Tan, Vinayak Gupta, Tom Hartvigsen, and Tim Althoff
EMNLP, 2024 [PDF] [Data & Code]
BLADE: Benchmarking Language Model Agents for Data-Driven Science
Ken Gu, Ruoxi Shang, Ruien Jiang, Keying Kuang, Richard-John Lin, Donghe Lyu, Yue Mao, Youran Pan, Teng Wu, Jiaqian Yu, Yikun Zhang, Tianmai M. Zhang, Lanyi Zhu,
, Jeffrey Heer, and Tim Althoff
EMNLP, 2024 [PDF] [Data & Code]
Are Language Models Actually Useful for Time Series Forecasting?
Mingtian Tan,
, Vinayak Gupta, Tim Althoff, and Thomas Hartvigsen
NeurIPS [Spotlight 🔎], 2024 [PDF]
Transforming Wearable Data into Health Insights using Large Language Model Agents
, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, and Xin Liu
Preprint, 2024 [PDF] [Google Research Blog]
Homekit2020: A Benchmark for Time Series Classification on a Large Mobile Sensing Dataset with Laboratory Tested Ground Truth of Influenza Infections
, Esteban Safranchik, Arinbjorn Kolbeinsson, Piyusha Gade, Ernesto Ramirez, Ludwig Schmidt, Luca Foschini, and Tim Althoff
CHIL, 2023 [PDF]
Self-supervised Pretraining and Transfer Learning Enable Flu and COVID-19 Predictions in Small Mobile Sensing Datasets
and Tim Althoff
CHIL, 2023 [PDF]
CORAL: COde RepresentAtion Learning with Weakly-Supervised Transformers for Analyzing Data Analysis
Ge Zhang,
, Yang Liu, Jeffrey Heer, and Tim Althoff
EPJ Data Science, 2022 [PDF] *Co-First Author
Globem dataset: Multi-year datasets for longitudinal human behavior modeling generalization
Xuhai Xu, Han Zhang, Yasaman Sefidgar, Yiyi Ren, Xin Liu, Woosuk Seo, Jennifer Brown, Kevin Kuehn,
, Paula Nurius, Shwetak Patel, Tim Althoff, Margaret E. Morris, Eve Riskin, Jennifer Mankoff, and Anind K. Dey
NeurIPS, 2022 [PDF]
MULTIVERSE: Mining Collective Data Science Knowledge from Code on the Web to Suggest Alternative Analysis Approaches
, Ge Zhang, and Tim Althoff
KDD, 2021 [PDF]
CrossCheck: Integrating self-report, behavioral sensing, and smartphone use to identify digital indicators of psychotic relapse
Dror Ben-Zeev, Rachel Brian, Rui Wang, Weichen Wang, Andrew T. Campbell, Min S. H. Aung,
, Vincent W. S. Tseng, Tanzeem Choudhury, Marta Hauser, John M. Kane, and Emily A. Scherer
Psychiatric Rehabilitation Journal, 2017 [PDF]
CrossCheck: toward passive sensing and detection of mental health changes in people with schizophrenia
Rui Wang, Min S. H. Aung, Saeed Abdullah, Rachel Brian, Andrew T. Campbell, Tanzeem Choudhury, Marta Hauser, John Kane,
, Emily A. Scherer, Vincent W. S. Tseng, and Dror Ben-Zeev
Ubicomp, 2016 [PDF]
Assessing mental health issues on college campuses: Preliminary findings from a pilot study
Vincent W. S. Tseng,
, Franziska Wittleder, Saeed Abdullah, Min Hane Aung, and Tanzeem Choudhury
Ubicomp, 2016 [PDF]