Chuqin Geng

SAT-DIFF: A Tree Diffing Framework Using SAT Solving

Chuqin Geng

Haolin Ye

Yihan Zhang

Brigitte Pientka

Computing differences between tree-structured data is a critical but challenging problem in software analysis. In this paper, we propose a novel tree diffing approach called SatDiff, which reformulates the structural diffing problem into a MaxSAT problem. By encoding the necessary transformations from the source tree to the target tree, SatDiff generates correct, minimal, and type-safe low-level edit scripts with formal guarantees. We then synthesize concise high-level edit scripts by effectively merging low-level edits in the appropriate topological order. Our empirical results demonstrate that SatDiff outperforms existing heuristic-based approaches by a significant margin in terms of conciseness while maintaining a reasonable runtime.

2024-04-06

ArXiv (preprint)

Scalar Invariant Networks with Zero Bias

Chuqin Geng

Xiaojie Xu

Haolin Ye

Just like weights, bias terms are the learnable parameters of many popular machine learning models, including neural networks. Biases are thought to enhance the representational power of neural networks, enabling them to solve a variety of tasks in computer vision. However, we argue that biases can be disregarded for some image-related tasks such as image classification, by considering the intrinsic distribution of images in the input space and desired model properties from first principles. Our findings suggest that zero-bias neural networks can perform comparably to biased networks for practical image classification tasks. We demonstrate that zero-bias neural networks possess a valuable property called scalar (multiplication) invariance. This means that the prediction of the network remains unchanged when the contrast of the input image is altered. We extend scalar invariance to more general cases, enabling formal verification of certain convex regions of the input space. Additionally, we prove that zero-bias neural networks are fair in predicting the zero image. Unlike state-of-the-art models that may exhibit bias toward certain labels, zero-bias networks have uniform belief in all labels. We believe dropping bias terms can be considered as a geometric prior in designing neural network architecture for image classification, which shares the spirit of adapting convolutions as the translational invariance prior. The robustness and fairness advantages of zero-bias neural networks may also indicate a promising path towards trustworthy and ethical AI.

2023-11-28

NeurIPS.cc/2023/Workshop/NeurReps (poster)

openreview.net
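
The scalar invariance property stated in the abstract above can be checked numerically on a toy model. The following is a minimal sketch, not the authors' code: it assumes a small fully connected ReLU network with random weights and no bias terms, and all names (`zero_bias_mlp`, `sizes`, `alpha`) are hypothetical. Because ReLU is positively homogeneous, scaling the input by a positive factor scales the logits by the same factor, so the predicted label is unchanged; the zero input yields all-zero logits, hence a uniform softmax.

```python
import math
import random


def relu(v):
    return [max(0.0, x) for x in v]


def matvec(W, v):
    # Plain matrix-vector product; note there is no bias term added.
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def zero_bias_mlp(weights, x):
    """Forward pass of a ReLU MLP without biases; returns the logits."""
    h = x
    for W in weights[:-1]:
        h = relu(matvec(W, h))
    return matvec(weights[-1], h)


def argmax(v):
    return max(range(len(v)), key=lambda i: v[i])


random.seed(0)
sizes = [8, 16, 16, 4]  # hypothetical layer widths
weights = [
    [[random.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]
    for n_in, n_out in zip(sizes, sizes[1:])
]

x = [random.uniform(0, 1) for _ in range(sizes[0])]  # stand-in for an image
alpha = 0.3  # positive contrast factor

logits = zero_bias_mlp(weights, x)
scaled_logits = zero_bias_mlp(weights, [alpha * xi for xi in x])
zero_logits = zero_bias_mlp(weights, [0.0] * sizes[0])

# f(alpha * x) == alpha * f(x) up to floating-point rounding
homogeneous = all(
    math.isclose(s, alpha * l, rel_tol=1e-9, abs_tol=1e-9)
    for s, l in zip(scaled_logits, logits)
)
same_label = argmax(logits) == argmax(scaled_logits)
```

Adding any nonzero bias to a layer would generally break `homogeneous`, which is the reason the property is tied to the zero-bias design.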

TorchProbe: Fuzzing Dynamic Deep Learning Compilers

Qidong Su

Chuqin Geng

Gennady G. Pekhimenko

Static and dynamic computational graphs represent two distinct approaches to constructing deep learning frameworks. The former prioritizes compiler-based optimizations, while the latter focuses on programmability and user-friendliness. The recent release of PyTorch 2.0, which supports compiling arbitrary deep learning programs in Python, signifies a new direction in the evolution of deep learning infrastructure to incorporate compiler techniques in a more dynamic manner and support more dynamic language features like dynamic control flows and closures. Given PyTorch's seamless integration with Python, its compiler aims to support arbitrary deep learning code written in Python. However, the inherent dynamism of Python poses challenges to the completeness and robustness of the compiler. While recent research has introduced fuzzing to test deep learning compilers, there is still a lack of comprehensive analysis on how to test dynamic features. To address this issue, we propose several code transformations to generate test cases involving dynamic features. These transformations preserve the program's semantics, ensuring that any discrepancy between the transformed and original programs indicates the presence of a bug. Through our approach, we have successfully identified twenty previously unknown bugs in the PyTorch compiler and its underlying tensor compiler Triton.

2023-10-30

ArXiv (preprint)
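
The testing idea in the TorchProbe abstract above (semantics-preserving transformations plus output comparison) can be illustrated in plain Python. This is only a sketch under stated assumptions: the transformation shown (routing the loop body through a closure) is a hand-written illustration rather than one of the paper's transformations, and `differential_test` with its `run` parameter is a hypothetical stand-in for executing a program under the compiler being tested.

```python
def original(xs):
    """Sum of squares of the even elements."""
    total = 0
    for x in xs:
        if x % 2 == 0:
            total += x * x
    return total


def transformed(xs):
    """Same semantics as `original`, rewritten to use a closure."""

    def make_step(acc):
        def step(x):
            nonlocal acc  # mutate state captured from the enclosing scope
            if x % 2 == 0:
                acc += x * x
            return acc

        return step

    step = make_step(0)
    result = 0
    for x in xs:
        result = step(x)
    return result


def differential_test(f, g, inputs, run=lambda fn, arg: fn(arg)):
    """Return the inputs on which f and g disagree.

    `run` is a hypothetical hook: in a real setting it would execute the
    function under the compiler being tested. Here it just calls it.
    """
    return [inp for inp in inputs if run(f, inp) != run(g, inp)]


mismatches = differential_test(
    original, transformed, [[], [1, 2, 3, 4], list(range(10))]
)
```

Since the two functions are equivalent by construction, a non-empty `mismatches` list would point to a fault in whatever `run` uses to execute them, not in the test programs.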

Can ChatGPT Pass An Introductory Level Functional Language Programming Course?

Chuqin Geng

Yihan Zhang

Brigitte Pientka

The recent introduction of ChatGPT has drawn significant attention from both industry and academia due to its impressive capabilities in solving a diverse range of tasks, including language translation, text summarization, and computer programming. Its capability for writing, modifying, and even correcting code, together with its ease of use and access, is already dramatically impacting computer science education. This paper aims to explore how well ChatGPT can perform in an introductory-level functional language programming course. In our systematic evaluation, we treated ChatGPT as one of our students and demonstrated that it can achieve a grade of B-, ranking 155th out of 314 students overall. Our comprehensive evaluation provides valuable insights into ChatGPT's impact from both student and instructor perspectives. Additionally, we identify several potential benefits that ChatGPT can offer to both groups. Overall, we believe that this study significantly clarifies and advances our understanding of ChatGPT's capabilities and potential impact on computer science education.

2023-04-29

ArXiv (preprint)

Identifying Different Student Clusters in Functional Programming Assignments: From Quick Learners to Struggling Students

Chuqin Geng

Wenwen Xu

Yingjie Xu

Brigitte Pientka

Instructors and students alike are often focused on the grade in programming assignments as a key measure of how well a student is mastering the material and whether a student is struggling. This can be, however, misleading. Especially when students have access to auto-graders, their grades may be heavily skewed. In this paper, we analyze student assignment submission data collected from a functional programming course taught at McGill University, incorporating a wide range of features. In addition to the grade, we consider activity time data, time spent, and the number of static errors. This allows us to identify four clusters of students: "Quick-learning", "Hardworking", "Satisficing", and "Struggling" through clustering algorithms. We then analyze how work habits, working duration, the range of errors, and the ability to fix errors impact different clusters of students. This structured analysis provides valuable insights for instructors to actively help different types of students and emphasize different aspects of their overall course design. It also provides insights for students themselves to understand which aspects they still struggle with and allows them to seek clarification and adjust their work habits.

2023-03-03

Proceedings of the 54th ACM Technical Symposium on Computer Science Education V. 1 (published)

Towards Reliable Neural Specifications

Chuqin Geng

Nham Le

Xiaojie Xu

Zhaoyue Wang

Arie Gurfinkel

2023-01-01

ICML (published)

proceedings.mlr.press

openreview.net

Novice Type Error Diagnosis with Natural Language Models

Chuqin Geng

Haolin Ye

Yixuan Li

Tianyu Han

Brigitte Pientka

Strong static type systems help programmers eliminate many errors without much burden of supplying type annotations. However, this flexibility makes it highly non-trivial to diagnose ill-typed programs, especially for novice programmers. Compared to classic constraint solving and optimization-based approaches, the data-driven approach has shown great promise in identifying the root causes of type errors with higher accuracy. Instead of relying on hand-engineered features, this work explores natural language models for type error localization, which can be trained in an end-to-end fashion without requiring any features. We demonstrate that, for novice type error diagnosis, the language model-based approach significantly outperforms the previous state-of-the-art data-driven approach. Specifically, our model could predict type errors correctly 62% of the time, outperforming the state-of-the-art Nate's data-driven model by 11%, under a more rigorous accuracy metric. Furthermore, we also apply structural probes to explain the performance difference between different language models.

2022-10-07

ArXiv (preprint)