Publications
Privacy-Aware Compression for Federated Learning Through Numerical Mechanism Design
ProtST: Multi-Modality Learning of Protein Sequences and Biomedical Texts
Current protein language models (PLMs) learn protein representations mainly from their sequences, capturing co-evolutionary information well, but they cannot explicitly acquire protein functions, which is the end goal of protein representation learning. Fortunately, for many proteins textual property descriptions are available, in which their various functions are also described. Motivated by this fact, we first build the ProtDescribe dataset to augment protein sequences with text descriptions of their functions and other important properties. Based on this dataset, we propose the ProtST framework to enhance Protein Sequence pre-training and understanding by biomedical Texts. During pre-training, we design three types of tasks, i.e., unimodal mask prediction, multimodal representation alignment and multimodal mask prediction, to enhance a PLM with protein property information at different granularities while preserving the PLM's original representation power. On downstream tasks, ProtST enables both supervised learning and zero-shot prediction. We verify the superiority of ProtST-induced PLMs over previous ones on diverse representation learning benchmarks. Under the zero-shot setting, we show the effectiveness of ProtST on zero-shot protein classification, and ProtST also enables functional protein retrieval from a large-scale database without any function annotation.
2023-07-03
Proceedings of the 40th International Conference on Machine Learning (published)
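The multimodal representation alignment task above is contrastive in spirit: paired protein and text embeddings are pulled together while mismatched pairs in the batch are pushed apart. A minimal InfoNCE-style sketch, assuming hypothetical encoder outputs and an illustrative temperature; this is not the paper's code:

```python
# InfoNCE-style sketch of multimodal representation alignment: pull each
# protein's sequence embedding toward the embedding of its own text
# description, away from the other descriptions in the batch.
# Tensor names and the temperature value are illustrative assumptions.
import torch
import torch.nn.functional as F

def alignment_loss(protein_emb: torch.Tensor, text_emb: torch.Tensor,
                   temperature: float = 0.07) -> torch.Tensor:
    """protein_emb, text_emb: (batch, dim) embeddings of paired modalities."""
    p = F.normalize(protein_emb, dim=-1)
    t = F.normalize(text_emb, dim=-1)
    logits = p @ t.T / temperature                      # (batch, batch) similarities
    targets = torch.arange(p.size(0), device=p.device)  # diagonal = true pairs
    # Symmetric cross-entropy: match proteins to texts and texts to proteins.
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.T, targets))
```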
Multivariate probabilistic time series forecasts are commonly evaluated via proper scoring rules, i.e., functions that are minimal in expectation for the ground-truth distribution. However, this property is not sufficient to guarantee good discrimination in the non-asymptotic regime. In this paper, we provide the first systematic finite-sample study of proper scoring rules for time-series forecasting evaluation. Through a power analysis, we identify the "region of reliability" of a scoring rule, i.e., the set of practical conditions where it can be relied on to identify forecasting errors. We carry out our analysis on a comprehensive synthetic benchmark, specifically designed to test several key discrepancies between ground-truth and forecast distributions, and we gauge the generalizability of our findings to real-world tasks with an application to an electricity production problem. Our results reveal critical shortcomings in the evaluation of multivariate probabilistic forecasts as commonly performed in the literature.
2023-07-03
Proceedings of the 40th International Conference on Machine Learning (published)
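For readers unfamiliar with proper scoring rules, the energy score is one widely used example for multivariate forecasts: it is minimized in expectation by the true distribution, yet, as the paper argues, such rules can still discriminate poorly with finite samples. A minimal sample-based sketch (a simple, slightly biased estimator; names are illustrative):

```python
# Sample-based estimate of the energy score, a proper scoring rule for
# multivariate forecasts: ES = E||X - y|| - 0.5 * E||X - X'||.
# This simple estimator includes the zero diagonal in the pairwise term,
# so it is slightly biased; fine for illustration.
import numpy as np

def energy_score(samples: np.ndarray, observation: np.ndarray) -> float:
    """samples: (n, d) draws from the forecast; observation: (d,) ground truth."""
    term1 = np.mean(np.linalg.norm(samples - observation, axis=1))
    diffs = samples[:, None, :] - samples[None, :, :]   # pairwise differences
    term2 = 0.5 * np.mean(np.linalg.norm(diffs, axis=2))
    return term1 - term2                                # lower is better
```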
We consider the problem of performing Bayesian inference in probabilistic models where observations are accompanied by uncertainty, referred to as "uncertain evidence." We explore how to interpret uncertain evidence, and by extension the importance of proper interpretation as it pertains to inference about latent variables. We consider a recently proposed method, "distributional evidence," as well as revisit two older methods: Jeffrey's rule and virtual evidence. We devise guidelines on how to account for uncertain evidence and we provide new insights, particularly regarding consistency. To showcase the impact of different interpretations of the same uncertain evidence, we carry out experiments in which one interpretation is defined as "correct." We then compare inference results from each interpretation, illustrating the importance of careful consideration of uncertain evidence.
2023-07-03
Proceedings of the 40th International Conference on Machine Learning (published)
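Jeffrey's rule, one of the interpretations revisited above, keeps the conditionals P(H | E) fixed and re-mixes them under the new, uncertain belief about the evidence. A toy discrete sketch with made-up numbers:

```python
# Toy example of Jeffrey's rule: update the belief over a hypothesis H when
# the evidence E is itself uncertain, by holding P(H | E) fixed and
# averaging under the new evidence marginal Q(E). Numbers are illustrative.
import numpy as np

P_joint = np.array([[0.30, 0.10],   # P(H=h, E=e); rows index H, columns E
                    [0.20, 0.40]])
P_E = P_joint.sum(axis=0)           # prior marginal over evidence
P_H_given_E = P_joint / P_E         # column-wise conditionals P(H | E=e)

Q_E = np.array([0.8, 0.2])          # uncertain evidence: new belief about E
Q_H = P_H_given_E @ Q_E             # Jeffrey's rule: Q(H) = sum_e P(H|e) Q(e)
print(Q_H)                          # updated belief over H, e.g. [0.52 0.48]
```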
Slot attention is a powerful method for object-centric modeling in images and videos. However, its set-equivariance limits its ability to handle videos with a dynamic number of objects because it cannot break ties. To overcome this limitation, we first establish a connection between slot attention and optimal transport. Based on this new perspective we propose MESH (Minimize Entropy of Sinkhorn): a cross-attention module that combines the tiebreaking properties of unregularized optimal transport with the speed of regularized optimal transport. We evaluate slot attention using MESH on multiple object-centric learning benchmarks and find significant improvements over slot attention in every setting.
2023-07-03
Proceedings of the 40th International Conference on Machine Learning (published)
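The regularized building block behind MESH-style attention is the Sinkhorn algorithm: alternating row and column normalizations of an affinity matrix yield an entropy-regularized transport plan, which MESH then pushes toward a low-entropy, tie-breaking solution. A minimal log-space Sinkhorn sketch only of that building block; hyperparameters are illustrative assumptions, not the paper's:

```python
# Log-space Sinkhorn normalization over slot-input affinities: alternating
# row/column normalization approximates an entropy-regularized optimal
# transport plan, which can break ties that plain softmax attention cannot.
# Temperature and iteration count are illustrative assumptions.
import torch

def sinkhorn_attention(slots, inputs, n_iters=5, eps=0.05):
    """slots: (k, d), inputs: (n, d). Returns a (k, n) transport-like plan."""
    log_plan = slots @ inputs.T / eps   # scaled affinities
    for _ in range(n_iters):
        # Alternate normalization over inputs (rows) and slots (columns).
        log_plan = log_plan - torch.logsumexp(log_plan, dim=1, keepdim=True)
        log_plan = log_plan - torch.logsumexp(log_plan, dim=0, keepdim=True)
    return log_plan.exp()               # marginals approach fixed values
```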
Stochastic min-max optimization has gained interest in the machine learning community with the advancements in GANs and adversarial training. Although game optimization is fairly well understood in the deterministic setting, some issues persist in the stochastic regime. Recent work has shown that stochastic gradient descent-ascent methods such as the optimistic gradient are highly sensitive to noise or can fail to converge. Although alternative strategies exist, they can be prohibitively expensive. We introduce Omega, a method with optimistic-like updates that mitigates the impact of noise by incorporating an EMA of historic gradients in its update rule. We also explore a variation of this algorithm that incorporates momentum. Although we do not provide convergence guarantees, our experiments on stochastic games show that Omega outperforms the optimistic gradient method when applied to linear players.
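One plausible reading of an Omega-style update, reconstructed from the abstract above: take an optimistic-gradient step, but replace the usual "previous gradient" correction with an EMA of past gradients to damp noise. This is a hedged sketch with illustrative constants, not the paper's exact rule:

```python
# Hedged sketch of an Omega-style update for one player. The classical
# optimistic gradient step is x <- x - lr * (2*g_t - g_{t-1}); here the
# g_{t-1} correction is replaced by an EMA of historic gradients.
# The exact form and constants are illustrative assumptions.
def omega_step(params, grad, ema, lr=1e-2, beta=0.9):
    ema = beta * ema + (1.0 - beta) * grad      # EMA of historic gradients
    params = params - lr * (2.0 * grad - ema)   # optimistic-like extrapolation
    return params, ema
```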
The potential of automatic code generation through Model-Driven Engineering (MDE) frameworks has yet to be realized. Beyond their ability to help software professionals write more accurate, reusable code, MDE frameworks could make programming accessible for a new class of domain experts. However, domain experts have been slow to embrace these tools, as they still need to learn how to specify their applications' requirements using the concrete syntax (i.e., textual or graphical) of the new and unified domain-specific language. Conversational interfaces (chatbots) could smooth the learning process and offer a more interactive way for domain experts to specify their application requirements and generate the desired code. If integrated with MDE frameworks, chatbots may offer domain experts a richer domain vocabulary without sacrificing the power of agnosticism that unified modelling frameworks provide. In this paper, we discuss the challenges of integrating chatbots within MDE frameworks and then examine a specific application: the auto-generation of smart contract code based on conversational syntax. We demonstrate how this can be done and evaluate our approach by conducting a user experience survey to assess the usability and functionality of the chatbot framework. The paper concludes by drawing attention to the potential benefits of leveraging large language models (LLMs) in this context.
2023-07-01
2023 IEEE International Conference on Software Services Engineering (SSE) (published)
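As a toy illustration of the code-generation step described above, requirements extracted by a chatbot can be rendered into a contract template. The field names and the Solidity-like template below are hypothetical, not the framework from the paper:

```python
# Toy sketch of conversational code generation: structured requirements
# extracted from a chat session are mapped onto a code template.
# Template and field names are hypothetical illustrations.
TEMPLATE = """\
contract {name} {{
    address public owner = {owner};
    uint public price = {price};
}}"""

def generate_contract(spec: dict) -> str:
    """spec: structured requirements extracted from the conversation."""
    return TEMPLATE.format(**spec)

print(generate_contract({"name": "Escrow",
                         "owner": "msg.sender",
                         "price": "100"}))
```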