Xujie Si

Affiliate Member

Canada CIFAR AI Chair

Assistant Professor, University of Toronto, Department of Computer Science

Biography

Xujie Si is an assistant professor in the Department of Computer Science at the University of Toronto. He is also an affiliate member of the Vector Institute and of Mila – Quebec Artificial Intelligence Institute, where he holds a Canada CIFAR AI Chair.

Si received his PhD from the University of Pennsylvania in 2020, his master's degree from Vanderbilt University, and his bachelor's degree (with Honors) from Nankai University.

Si’s research lies at the intersection of programming languages and AI. He is generally interested in developing learning-based techniques to help programmers build better software with less effort, integrating logic programming with differentiable learning systems for interpretable and scalable reasoning, and leveraging programming abstractions for reliable and data-efficient learning.

His work has been honoured with an ACM Special Interest Group on Programming Languages (SIGPLAN) Distinguished Paper Award, as well as highlighted at top conferences on programming languages and machine learning.

Current Students

Chuqin Geng

PhD - McGill University

chuqin.geng@mila.quebec

Considine Breandan

Postdoctorate - McGill University

Principal supervisor :

Xiaojie Xu

Master's Research - McGill University

xiaojie.xu@mila.quebec

Jaylene Yihan

Master's Research - McGill University

yihan.zhang@mila.quebec

Rebecca Wang

Master's Research - McGill University

zhaoyue.wang@mila.quebec

Ray Luo

PhD - McGill University

Co-supervisor: Doina Precup

luo.ziyan@mila.quebec

Publications

Can ChatGPT Pass An Introductory Level Functional Language Programming Course?

Chuqin Geng

Yihan Zhang

Brigitte Pientka

Xujie Si

The recent introduction of ChatGPT has drawn significant attention from both industry and academia due to its impressive capabilities in solving a diverse range of tasks, including language translation, text summarization, and computer programming. Its capability for writing, modifying, and even correcting code together with its ease of use and access is already dramatically impacting computer science education. This paper aims to explore how well ChatGPT can perform in an introductory-level functional language programming course. In our systematic evaluation, we treated ChatGPT as one of our students and demonstrated that it can achieve a grade B- and its rank in the class is 155 out of 314 students overall. Our comprehensive evaluation provides valuable insights into ChatGPT's impact from both student and instructor perspectives. Additionally, we identify several potential benefits that ChatGPT can offer to both groups. Overall, we believe that this study significantly clarifies and advances our understanding of ChatGPT's capabilities and potential impact on computer science education.

2023-04-29

ArXiv (preprint)

doi.org

arxiv.org

Identifying Different Student Clusters in Functional Programming Assignments: From Quick Learners to Struggling Students

Chuqin Geng

Wenwen Xu

Yingjie Xu

Brigitte Pientka

Xujie Si

Instructors and students alike are often focused on the grade in programming assignments as a key measure of how well a student is mastering the material and whether a student is struggling. This can be, however, misleading. Especially when students have access to auto-graders, their grades may be heavily skewed. In this paper, we analyze student assignment submission data collected from a functional programming course taught at McGill University incorporating a wide range of features. In addition to the grade, we consider activity time data, time spent, and the number of static errors. This allows us to identify four clusters of students: "Quick-learning", "Hardworking", "Satisficing", and "Struggling" through cluster algorithms. We then analyze how work habits, working duration, the range of errors, and the ability to fix errors impact different clusters of students. This structured analysis provides valuable insights for instructors to actively help different types of students and emphasize different aspects of their overall course design. It also provides insights for students themselves to understand which aspects they still struggle with and allows them to seek clarification and adjust their work habits.

2023-03-03

Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1 (published)

doi.org

arxiv.org

Towards Reliable Neural Specifications

Chuqin Geng

Nham Le

Xiaojie Xu

Zhaoyue Wang

Arie Gurfinkel

Xujie Si

2023-01-01

ICML (published)

proceedings.mlr.press

openreview.net

Novice Type Error Diagnosis with Natural Language Models

Chuqin Geng

Haolin Ye

Yixuan Li

Tianyu Han

Brigitte Pientka

Xujie Si

Strong static type systems help programmers eliminate many errors without much burden of supplying type annotations. However, this flexibility makes it highly non-trivial to diagnose ill-typed programs, especially for novice programmers. Compared to classic constraint solving and optimization-based approaches, the data-driven approach has shown great promise in identifying the root causes of type errors with higher accuracy. Instead of relying on hand-engineered features, this work explores natural language models for type error localization, which can be trained in an end-to-end fashion without requiring any features. We demonstrate that, for novice type error diagnosis, the language model-based approach significantly outperforms the previous state-of-the-art data-driven approach. Specifically, our model could predict type errors correctly 62% of the time, outperforming the state-of-the-art Nate's data-driven model by 11%, in a more rigorous accuracy metric. Furthermore, we also apply structural probes to explain the performance difference between different language models.

2022-10-07

ArXiv (preprint)

doi.org

arxiv.org
