Chuqin Geng

Towards Reliable Neural Specifications

Chuqin Geng

Nham Le

Xiaojie Xu

Zhaoyue Wang

Arie Gurfinkel

Xujie Si

2022-12-31

ICML (published)

proceedings.mlr.press

Novice Type Error Diagnosis with Natural Language Models

Chuqin Geng

Haolin Ye

Yixuan Li

Tianyu Han

Brigitte Pientka

Xujie Si

Strong static type systems help programmers eliminate many errors without much burden of supplying type annotations. However, this flexibility makes it highly non-trivial to diagnose ill-typed programs, especially for novice programmers. Compared to classic constraint solving and optimization-based approaches, the data-driven approach has shown great promise in identifying the root causes of type errors with higher accuracy. Instead of relying on hand-engineered features, this work explores natural language models for type error localization, which can be trained in an end-to-end fashion without requiring any features. We demonstrate that, for novice type error diagnosis, the language model-based approach significantly outperforms the previous state-of-the-art data-driven approach. Specifically, our model could predict type errors correctly 62% of the time, outperforming the state-of-the-art Nate's data-driven model by 11%, in a more rigorous accuracy metric. Furthermore, we also apply structural probes to explain the performance difference between different language models.

2022-10-06

arXiv (preprint)

doi.org

arxiv.org

Toward Reliable Neural Specifications

Chuqin Geng

Nham Le

Xiaojie Xu

Zhaoyue Wang

Arie Gurfinkel

Xujie Si

We propose a new family of specifications called neural representation as specification, which uses the intrinsic information of neural networks, namely neural activation patterns (NAPs), rather than input data to specify the correctness and/or robustness of neural network predictions. We present a simple statistical approach to mining dominant neural activation patterns. We analyze NAPs from a statistical point of view and find that a single NAP can cover a large number of training and testing data points, whereas ad hoc data-as-specification only covers the given reference data point. To show the effectiveness of discovered NAPs, we formally verify several important properties, such as that various types of misclassifications will never happen for a given NAP and that there is no ambiguity between different NAPs. We show that by using NAPs we can verify the prediction of the [...] space [...] abstract the state of each neuron to only activated and deactivated by leveraging NAPs. We would like to explore more refined abstractions such as {(−∞, 0], (0, 1], (1, ∞)} in future work.

2021-12-31

arXiv.org (preprint)

doi.org

openreview.net
