
Xujie Si

Affiliate Member
Assistant Professor, University of Toronto, Department of Computer Science
Research Topics
Learning to Program
Reasoning
Representation Learning

Biography

Xujie Si is an assistant professor in the Department of Computer Science at the University of Toronto. He is also an affiliate member of the Vector Institute and of Mila – Quebec Artificial Intelligence Institute, where he held a Canada CIFAR AI Chair from 2021 to 2025.

Si received his PhD from the University of Pennsylvania in 2020, his master's degree from Vanderbilt University, and his bachelor's degree (with Honors) from Nankai University.

Si’s research lies at the intersection of programming languages and AI. He is generally interested in developing learning-based techniques to help programmers build better software with less effort, integrating logic programming with differentiable learning systems for interpretable and scalable reasoning, and leveraging programming abstractions for reliable and data-efficient learning.

His work has been honoured with an ACM Special Interest Group on Programming Languages (SIGPLAN) Distinguished Paper Award, as well as highlighted at top conferences on programming languages and machine learning.

Current Students

PhD - McGill University
PhD - McGill University

Publications

Identifying Different Student Clusters in Functional Programming Assignments: From Quick Learners to Struggling Students
Wenwen Xu
Yingjie Xu
Brigitte Pientka
Instructors and students alike often focus on the grade in programming assignments as a key measure of how well a student is mastering the material and whether a student is struggling. This can, however, be misleading. Especially when students have access to auto-graders, their grades may be heavily skewed. In this paper, we analyze student assignment submission data collected from a functional programming course taught at McGill University, incorporating a wide range of features. In addition to the grade, we consider activity time data, time spent, and the number of static errors. This allows us to identify four clusters of students through clustering algorithms: "Quick-learning", "Hardworking", "Satisficing", and "Struggling". We then analyze how work habits, working duration, the range of errors, and the ability to fix errors impact different clusters of students. This structured analysis provides valuable insights for instructors to actively help different types of students and emphasize different aspects of their overall course design. It also provides insights for students themselves to understand which aspects they still struggle with and allows them to seek clarification and adjust their work habits.
Scalar Invariant Networks with Zero Bias
Just like weights, bias terms are learnable parameters of many popular machine learning models, including neural networks. Biases are thought to enhance the representational power of neural networks, enabling them to solve a variety of tasks in computer vision. However, we argue that biases can be disregarded for some image-related tasks such as image classification, by considering the intrinsic distribution of images in the input space and desired model properties from first principles. Our findings suggest that zero-bias neural networks can perform comparably to biased networks for practical image classification tasks. We demonstrate that zero-bias neural networks possess a valuable property called scalar (multiplication) invariance. This means that the prediction of the network remains unchanged when the contrast of the input image is altered. We extend scalar invariance to more general cases, enabling formal verification of certain convex regions of the input space. Additionally, we prove that zero-bias neural networks are fair in predicting the zero image. Unlike state-of-the-art models that may exhibit bias toward certain labels, zero-bias networks have uniform belief in all labels. We believe dropping bias terms can be considered as a geometric prior in designing neural network architecture for image classification, which shares the spirit of adapting convolutions as the translational invariance prior. The robustness and fairness advantages of zero-bias neural networks may also indicate a promising path towards trustworthy and ethical AI.
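The scalar invariance property described in this abstract can be illustrated with a minimal sketch. It assumes a small fully connected ReLU network with random weights and no bias terms; the helper names (`zero_bias_forward`, `random_layers`) and the layer sizes are hypothetical and not taken from the paper. Because every layer is positively homogeneous, scaling the input by a positive factor should scale the logits by the same factor and leave the predicted label unchanged, and the all-zero input should yield identical logits for every label.

```python
import random

def relu(v):
    return [max(0.0, x) for x in v]

def linear(weights, v):
    # Bias-free layer: a plain matrix-vector product.
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def zero_bias_forward(layers, v):
    # ReLU after every layer except the last, which returns raw logits.
    for weights in layers[:-1]:
        v = relu(linear(weights, v))
    return linear(layers[-1], v)

def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])

def random_layers(sizes, rng):
    # One weight matrix (n_out rows, n_in columns) per consecutive size pair.
    return [[[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
            for n_in, n_out in zip(sizes, sizes[1:])]

rng = random.Random(0)
layers = random_layers([8, 16, 16, 4], rng)
image = [rng.random() for _ in range(8)]  # stand-in for pixel intensities
alpha = 0.3                               # positive contrast factor

logits = zero_bias_forward(layers, image)
scaled_logits = zero_bias_forward(layers, [alpha * x for x in image])
zero_logits = zero_bias_forward(layers, [0.0] * 8)

# Same predicted label before and after the contrast change.
same_label = argmax(logits) == argmax(scaled_logits)
# Logits scale linearly with the input (up to floating-point error).
homogeneous = all(abs(s - alpha * l) < 1e-9
                  for s, l in zip(scaled_logits, logits))
# The zero image yields equal logits, hence a uniform softmax over labels.
uniform_on_zero = all(z == 0.0 for z in zero_logits)
```

The check only holds for positive `alpha`; a negative factor changes which ReLU units fire, so the invariance does not extend to sign flips.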
Towards Reliable Neural Specifications
Nham Le
Zhaoyue Wang
Arie Gurfinkel
Novice Type Error Diagnosis with Natural Language Models
Haolin Ye
Tianyu Han
Brigitte Pientka
Strong static type systems help programmers eliminate many errors without much burden of supplying type annotations. However, this flexibility makes it highly non-trivial to diagnose ill-typed programs, especially for novice programmers. Compared to classic constraint solving and optimization-based approaches, the data-driven approach has shown great promise in identifying the root causes of type errors with higher accuracy. Instead of relying on hand-engineered features, this work explores natural language models for type error localization, which can be trained in an end-to-end fashion without requiring any features. We demonstrate that, for novice type error diagnosis, the language model-based approach significantly outperforms the previous state-of-the-art data-driven approach. Specifically, our model could predict type errors correctly 62% of the time, outperforming the state-of-the-art Nate's data-driven model by 11% under a more rigorous accuracy metric. Furthermore, we also apply structural probes to explain the performance difference between different language models.
Toward Reliable Neural Specifications
Nham Le
Zhaoyue Wang
Arie Gurfinkel
We propose a new family of specifications called neural representation as specification, which uses the intrinsic information of neural networks, namely neural activation patterns (NAPs), rather than input data to specify the correctness and/or robustness of neural network predictions. We present a simple statistical approach to mining dominant neural activation patterns. We analyze NAPs from a statistical point of view and find that a single NAP can cover a large number of training and testing data points, whereas ad hoc data-as-specification only covers the given reference data point. To show the effectiveness of discovered NAPs, we formally verify important properties, such as that various types of misclassifications never happen for a given NAP and that there is no ambiguity between different NAPs. We show that by using NAPs, we can verify the prediction of [...] the input space [...]. [...] abstract the state of each neuron to only activated and deactivated by leveraging NAPs. We would like to explore more refined abstractions such as {(−∞, 0], (0, 1], (1, ∞)} in future work.
Techniques for Symbol Grounding with SATNet
Sever Topan
Many experts argue that the future of artificial intelligence is limited by the field's ability to integrate symbolic logical reasoning into deep learning architectures. The recently proposed differentiable MAXSAT solver, SATNet, was a breakthrough in its capacity to integrate with a traditional neural network and solve visual reasoning problems. For instance, it can learn the rules of Sudoku purely from image examples. Despite its success, SATNet was shown to succumb to a key challenge in neurosymbolic systems known as the Symbol Grounding Problem: the inability to map visual inputs to symbolic variables without explicit supervision ("label leakage"). In this work, we present a self-supervised pre-training pipeline that enables SATNet to overcome this limitation, thus broadening the class of problems that SATNet architectures can solve to include datasets where no intermediary labels are available at all. We demonstrate that our method allows SATNet to attain full accuracy even with a harder problem setup that prevents any label leakage. We additionally introduce a proofreading method that further improves the performance of SATNet architectures, beating the state-of-the-art on Visual Sudoku.