My current research examines how novel data sources and algorithms can be leveraged to address pressing issues around young adults’ academic success and well-being, especially for those from disadvantaged communities. I bring together methods from machine learning, natural language processing, causal inference and network science, drawing on each as the inquiry requires. Specific topics are described below.

Online Learning and Instruction

Through an NSF-funded project, this strand identifies learning-conducive behavioral patterns and instructional designs from the system logs of virtual learning environments, in order to advance the learning sciences and inform personalized, just-in-time support, especially for metacognitive skills.

Student Life and College Success

Supported by the NSF project above and a Mellon-funded project, this strand investigates digitized records of students’ everyday lives to understand how achievement gaps accumulate through day-to-day experience, and to develop data products that facilitate institutional efforts to support college success.

Education, Occupation and Future of Work

In collaboration with IBM Research and IE University, this recent strand measures the alignment between curricular content and occupational requirements, and how that alignment relates to labor market outcomes. The practical goal is to support more informed decision-making for students, institutions and employers.

Ethics of Predictive Analytics (in Education)

This umbrella theme bridges decades of research on educational inequality and the recent literature on AI ethics. It is embedded in all of the strands above, mostly through scrutiny of the fairness and interpretability of predictive models and through a focus on equity outcomes.