Portrait of Xujie Si

Xujie Si

Affiliate Member
Canada CIFAR AI Chair
Assistant Professor, University of Toronto, Department of Computer Science

Biography

Xujie Si is an assistant professor in the Department of Computer Science at the University of Toronto. He is also a faculty affiliate at the Vector Institute and an affiliate member of Mila, the Quebec Artificial Intelligence Institute, where he holds a Canada CIFAR AI Chair. He received his PhD from the University of Pennsylvania in 2020. He also holds a master's degree from Vanderbilt University and a bachelor's degree (with honours) from Nankai University. His research lies at the intersection of programming languages and artificial intelligence. He is interested in developing learning-based techniques that help programmers build better software more easily, in integrating logic programming with differentiable learning systems to enable interpretable and scalable reasoning, and in leveraging programming abstractions for reliable and data-efficient learning. His work has been recognized with the ACM SIGPLAN Distinguished Service Award and has been presented at programming languages and machine learning conferences.

Current Students

PhD - McGill University
Research Master's - McGill University
Research Master's - McGill University
Research Master's - McGill University

Publications

Can ChatGPT Pass An Introductory Level Functional Language Programming Course?
Chuqin Geng
Yihan Zhang
Brigitte Pientka
The recent introduction of ChatGPT has drawn significant attention from both industry and academia due to its impressive capabilities in solving a diverse range of tasks, including language translation, text summarization, and computer programming. Its capability for writing, modifying, and even correcting code, together with its ease of use and access, is already dramatically impacting computer science education. This paper aims to explore how well ChatGPT can perform in an introductory-level functional language programming course. In our systematic evaluation, we treated ChatGPT as one of our students and demonstrated that it can achieve a grade of B-, ranking 155 out of 314 students overall. Our comprehensive evaluation provides valuable insights into ChatGPT's impact from both student and instructor perspectives. Additionally, we identify several potential benefits that ChatGPT can offer to both groups. Overall, we believe that this study significantly clarifies and advances our understanding of ChatGPT's capabilities and potential impact on computer science education.
Identifying Different Student Clusters in Functional Programming Assignments: From Quick Learners to Struggling Students
Chuqin Geng
Wenwen Xu
Yingjie Xu
Brigitte Pientka
Instructors and students alike are often focused on the grade in programming assignments as a key measure of how well a student is mastering the material and whether a student is struggling. This can be, however, misleading. Especially when students have access to auto-graders, their grades may be heavily skewed. In this paper, we analyze student assignment submission data collected from a functional programming course taught at McGill University incorporating a wide range of features. In addition to the grade, we consider activity time data, time spent, and the number of static errors. This allows us to identify four clusters of students: "Quick-learning", "Hardworking", "Satisficing", and "Struggling" through cluster algorithms. We then analyze how work habits, working duration, the range of errors, and the ability to fix errors impact different clusters of students. This structured analysis provides valuable insights for instructors to actively help different types of students and emphasize different aspects of their overall course design. It also provides insights for students themselves to understand which aspects they still struggle with and allows them to seek clarification and adjust their work habits.
Towards Reliable Neural Specifications
Chuqin Geng
Nham Le
Xiaojie Xu
Zhaoyue Wang
Arie Gurfinkel
Novice Type Error Diagnosis with Natural Language Models
Chuqin Geng
Haolin Ye
Yixuan Li
Tianyu Han
Brigitte Pientka
Strong static type systems help programmers eliminate many errors without much burden of supplying type annotations. However, this flexibility makes it highly non-trivial to diagnose ill-typed programs, especially for novice programmers. Compared to classic constraint solving and optimization-based approaches, the data-driven approach has shown great promise in identifying the root causes of type errors with higher accuracy. Instead of relying on hand-engineered features, this work explores natural language models for type error localization, which can be trained in an end-to-end fashion without requiring any features. We demonstrate that, for novice type error diagnosis, the language model-based approach significantly outperforms the previous state-of-the-art data-driven approach. Specifically, our model could predict type errors correctly 62% of the time, outperforming the state-of-the-art Nate's data-driven model by 11%, in a more rigorous accuracy metric. Furthermore, we also apply structural probes to explain the performance difference between different language models.