Publications
Explainable Machine Learning Model to Predict COVID-19 Severity Among Older Adults in the Province of Quebec.
Context: Patients over the age of 65 years are more likely than other populations to experience higher severity and mortality from COVID-19. Clinicians need assistance in supporting their decisions regarding the management of these patients. Artificial Intelligence (AI) can help in this regard. However, the lack of explainability of AI, defined as "the ability to understand and evaluate the internal mechanism of the algorithm/computational process in human terms", is one of the major challenges to its application in health care. We know little about the application of explainable AI (XAI) in health care. Objective: In this study, we aimed to evaluate the feasibility of developing explainable machine learning models to predict COVID-19 severity among older adults. Design: Quantitative machine learning methods. Setting: Long-term care facilities within the province of Quebec. Participants: Patients aged 65 years and older who presented to hospitals with a positive polymerase chain reaction test for COVID-19. Intervention: We used XAI-specific methods (e.g., EBM), machine learning methods (i.e., random forest, deep forest, and XGBoost), as well as explainable approaches such as LIME, SHAP, PIMP, and anchor combined with these machine learning methods. Outcome measures: Classification accuracy and area under the receiver operating characteristic curve (AUC). Results: The age of the patients (n=986, 54.6% male) was 84.5 ± 19.5 years. The best-performing models (and their performance) were as follows: deep forest with the model-agnostic XAI methods LIME (97.36% AUC, 91.65% accuracy), anchor (97.36% AUC, 91.65% accuracy), and PIMP (96.93% AUC, 91.65% accuracy). The reasoning identified behind our models' predictions aligned with clinical studies' findings about the association of variables such as diabetes and dementia with COVID-19 severity in this population. Conclusions: The use of explainable machine learning models to predict the severity of COVID-19 among older adults is feasible. We obtained a high level of performance as well as explainability in the prediction of COVID-19 severity in this population. Further studies are required to integrate these models into a decision support system to facilitate the management of diseases such as COVID-19 for (primary) health care providers and to evaluate their usability among them.
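A minimal sketch, not the study's actual pipeline, of pairing a tree-based classifier (here XGBoost) with the model-agnostic SHAP explainer mentioned in the abstract. The synthetic data, feature names, and hyperparameters are illustrative assumptions only.

```python
import numpy as np
import pandas as pd
import shap
import xgboost as xgb
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the patient cohort: features loosely echo the
# variables discussed in the abstract (age, diabetes, dementia).
rng = np.random.default_rng(0)
n = 1000
X = pd.DataFrame({
    "age": rng.normal(84, 8, n),
    "diabetes": rng.integers(0, 2, n),
    "dementia": rng.integers(0, 2, n),
    "sex_male": rng.integers(0, 2, n),
})
logits = 0.05 * (X["age"] - 84) + 0.8 * X["diabetes"] + 0.6 * X["dementia"]
y = (rng.random(n) < 1 / (1 + np.exp(-logits))).astype(int)  # severity label

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
model = xgb.XGBClassifier(n_estimators=200, max_depth=4, eval_metric="auc")
model.fit(X_train, y_train)

# SHAP attributes each prediction to per-feature contributions, which is one
# way post-hoc explainability can be obtained for tree-based models.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
shap.summary_plot(shap_values, X_test)  # global view of feature importance
```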
Disentangling poststroke cognitive deficits and their neuroanatomical correlates through combined multivariable and multioutcome lesion‐symptom mapping
When fine-tuning large neural networks, it is common to use multiple nodes and to communicate gradients at each optimization step. By contrast, we investigate completely local fine-tuning, which we refer to as lo-fi. During lo-fi, each node fine-tunes independently without any communication. Then, the weights are averaged across nodes at the conclusion of fine-tuning. When fine-tuning DeiT-base and DeiT-large on ImageNet, this procedure matches accuracy in-distribution and improves accuracy under distribution shift compared to the baseline, which observes the same amount of data but communicates gradients at each step. We also observe that lo-fi matches the baseline's performance when fine-tuning OPT language models (up to 1.3B parameters) on Common Crawl. By removing the communication requirement, lo-fi reduces resource barriers for fine-tuning large models and enables fine-tuning in settings with prohibitive communication cost.
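A minimal sketch, assuming PyTorch, of the final weight-averaging step the abstract describes: each node fine-tunes its own copy of the model with no communication, and the state dicts are averaged once at the end. The tiny linear model and random data are placeholders, not DeiT or OPT.

```python
import copy
import torch
import torch.nn as nn

def average_state_dicts(state_dicts):
    """Element-wise average of parameters from independently fine-tuned nodes."""
    avg = copy.deepcopy(state_dicts[0])
    for key in avg:
        avg[key] = torch.stack([sd[key].float() for sd in state_dicts]).mean(dim=0)
    return avg

# Stand-in for the base model; each "node" fine-tunes its own copy locally.
base = nn.Linear(16, 2)
node_models = [copy.deepcopy(base) for _ in range(4)]
for model in node_models:
    opt = torch.optim.SGD(model.parameters(), lr=0.1)
    x, y = torch.randn(32, 16), torch.randint(0, 2, (32,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    opt.step()

# Single averaging step at the conclusion of fine-tuning: no gradient
# communication happens during training itself.
base.load_state_dict(average_state_dicts([m.state_dict() for m in node_models]))
```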
A Framework for Obtaining Accurate Posteriors of Strong Gravitational Lensing Parameters with Flexible Priors and Implicit Likelihoods Using Density Estimation
We report the application of implicit likelihood inference to the prediction of the macroparameters of strong lensing systems with neural networks. This allows us to perform deep-learning analysis of lensing systems within a well-defined Bayesian statistical framework to explicitly impose desired priors on lensing variables, obtain accurate posteriors, and guarantee convergence to the optimal posterior in the limit of perfect performance. We train neural networks to perform a regression task to produce point estimates of lensing parameters. We then interpret these estimates as compressed statistics in our inference setup and model their likelihood function using mixture density networks. We compare our results with those of approximate Bayesian neural networks, discuss their significance, and point to future directions. Based on a test set of 100,000 strong lensing simulations, our amortized model produces accurate posteriors for any arbitrary confidence interval, with a maximum percentage deviation of 1.4% at the 21.8% confidence level, without the need for any added calibration procedure. In total, inferring 100,000 different posteriors takes a day on a single GPU, showing that the method scales well to the thousands of lenses expected to be discovered by upcoming sky surveys.
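A minimal sketch, not the paper's architecture, of a mixture density network that models the likelihood of a point estimate given a parameter, in the spirit of treating the regression output as a compressed statistic. Dimensions, the number of mixture components, and the training pairs are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MDN(nn.Module):
    """Models p(x_hat | theta) as a Gaussian mixture whose parameters depend on theta."""
    def __init__(self, dim_in=1, n_components=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim_in, hidden), nn.Tanh(),
                                 nn.Linear(hidden, hidden), nn.Tanh())
        self.logits = nn.Linear(hidden, n_components)      # mixture weights
        self.means = nn.Linear(hidden, n_components)       # component means
        self.log_sigmas = nn.Linear(hidden, n_components)  # component scales

    def log_prob(self, theta, x_hat):
        h = self.net(theta)
        log_w = torch.log_softmax(self.logits(h), dim=-1)
        comp = torch.distributions.Normal(self.means(h), torch.exp(self.log_sigmas(h)))
        # log p(x_hat | theta): log-sum-exp over the mixture components
        return torch.logsumexp(log_w + comp.log_prob(x_hat), dim=-1)

# Fit by maximum likelihood on simulated (theta, x_hat) pairs; a posterior can
# then be formed by combining this likelihood with an explicit prior on theta.
mdn = MDN()
theta = torch.randn(256, 1)
x_hat = theta + 0.1 * torch.randn(256, 1)  # stand-in for the network's point estimates
opt = torch.optim.Adam(mdn.parameters(), lr=1e-3)
loss = -mdn.log_prob(theta, x_hat).mean()
loss.backward()
opt.step()
```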
Current machine learning algorithms are successful in learning clearly defined tasks from large i.i.d. data. Continual learning (CL) requires learning without i.i.d.-ness and developing algorithms capable of knowledge retention and transfer, the latter of which can be boosted through systematic generalization. Dropping the i.i.d. assumption requires replacing it with another hypothesis. While there are several candidates, here we advocate that the independent mechanism assumption (IM) (Schölkopf et al., 2012) is a useful hypothesis for representing knowledge in a form that makes it easy to adapt to new tasks in CL. Specifically, we review several types of distribution shifts that are common in CL and point out in which way a system that represents knowledge in the form of causal modules may outperform monolithic counterparts in CL. Intuitively, the efficacy of the IM solution emerges since (i) causal modules learn mechanisms that are invariant across domains; and (ii) if causal mechanisms must be updated, modularity can enable efficient and sparse updates.
Logging is a common practice in traditional software development. Several studies have investigated the different characteristics of logging practices in traditional software systems (e.g., Android applications, JAVA applications, C/C++ applications). Nowadays, we are witnessing more and more development of Machine Learning-based applications (ML-based applications). Today, many popular libraries facilitate and contribute to the development of such applications, among which we can mention Pytorch, Tensorflow, Theano, MXNet, Scikit-Learn, Caffe, and Keras. Despite the popularity of ML, we do not have a clear understanding of logging practices in ML applications. In this paper, we aim to fill this knowledge gap and help ML practitioners understand the characteristics of logging in ML-based applications. In particular, we conduct an empirical study on 110 open-source ML-based applications. Through a quantitative analysis, we find that logging practice in ML-based applications is less pervasive than in traditional applications, including Android, JAVA, and C/C++ applications. Furthermore, the majority of logging statements in ML-based applications are at the info and warn levels, whereas in traditional applications info statements form the majority in C/C++ applications and the debug and error levels form the majority in Android applications. We also perform a quantitative and qualitative analysis of a random sample of logging statements to understand where ML developers put most of their logging statements and to examine why and how they use logging. These analyses led to the following observations: (i) ML developers put most of their logging statements in model training and in non-ML components; (ii) data and model management appear to be the main reasons behind the introduction of logging statements in ML-based applications.
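A minimal sketch of the kind of logging pattern the study discusses in ML code: info-level messages around model training and warn-level messages for data-management issues. The training loop and messages are illustrative, not drawn from the studied applications.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("trainer")

def train(dataset, epochs=3):
    if not dataset:
        # Data-management concern surfaced at the warn level
        logger.warning("Empty dataset received; skipping training.")
        return
    for epoch in range(epochs):
        loss = 1.0 / (epoch + 1)  # stand-in for a real training step
        # Progress of model training reported at the info level
        logger.info("epoch %d finished, loss=%.3f", epoch, loss)

train([1, 2, 3])
```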
The BigCode project is an open-scientific collaboration working on the responsible development of large language models for code. This tech report describes the progress of the collaboration until December 2022, outlining the current state of the Personally Identifiable Information (PII) redaction pipeline, the experiments conducted to de-risk the model architecture, and the experiments investigating better preprocessing methods for the training data. We train 1.1B parameter models on the Java, JavaScript, and Python subsets of The Stack and evaluate them on the MultiPL-E text-to-code benchmark. We find that more aggressive filtering of near-duplicates can further boost performance and, surprisingly, that selecting files from repositories with 5+ GitHub stars deteriorates performance significantly. Our best model outperforms previous open-source multilingual code generation models (InCoder-6.7B and CodeGen-Multi-2.7B) in both left-to-right generation and infilling on the Java, JavaScript, and Python portions of MultiPL-E, despite being a substantially smaller model. All models are released under an OpenRAIL license at https://hf.co/bigcode.
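A minimal sketch, assuming the `datasketch` library, of MinHash-based near-duplicate filtering of the kind the report says boosts performance. The toy files, token-level shingling, and threshold are illustrative choices, not the project's exact preprocessing configuration.

```python
from datasketch import MinHash, MinHashLSH

def minhash(text, num_perm=128):
    """Build a MinHash signature from the set of whitespace-separated tokens."""
    m = MinHash(num_perm=num_perm)
    for token in set(text.split()):
        m.update(token.encode("utf-8"))
    return m

files = {
    "a.py": "def add(a, b): return a + b",
    "b.py": "def  add(a, b):   return a + b",  # formatting-only copy of a.py
    "c.py": "print('hello world')",
}

lsh = MinHashLSH(threshold=0.7, num_perm=128)
kept = []
for name, text in files.items():
    sig = minhash(text)
    if lsh.query(sig):      # an already-kept file is too similar: drop this one
        continue
    lsh.insert(name, sig)
    kept.append(name)

print(kept)  # ['a.py', 'c.py'] -- the near-duplicate is filtered before training
```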