Un incubateur à temps plein de 4 mois à Mila, conçu spécifiquement pour les fondateurs et fondatrices de la deep tech issus de parcours d'élite en STIM.
Avantage IA : productivité dans la fonction publique
Apprenez à tirer parti de l’IA générative pour soutenir et améliorer votre productivité au travail. La prochaine cohorte se déroulera en ligne les 28 et 30 avril 2026.
Nous utilisons des témoins pour analyser le trafic et l’utilisation de notre site web, afin de personnaliser votre expérience. Vous pouvez désactiver ces technologies à tout moment, mais cela peut restreindre certaines fonctionnalités du site. Consultez notre Politique de protection de la vie privée pour en savoir plus.
Paramètre des cookies
Vous pouvez activer et désactiver les types de cookies que vous souhaitez accepter. Cependant certains choix que vous ferez pourraient affecter les services proposés sur nos sites (ex : suggestions, annonces personnalisées, etc.).
Cookies essentiels
Ces cookies sont nécessaires au fonctionnement du site et ne peuvent être désactivés. (Toujours actif)
Cookies analyse
Acceptez-vous l'utilisation de cookies pour mesurer l'audience de nos sites ?
Lecteur Multimédia
Acceptez-vous l'utilisation de cookies pour afficher et vous permettre de regarder les contenus vidéo hébergés par nos partenaires (YouTube, etc.) ?
Spoken language understanding system is traditionally designed as a pipeline of a number of components. First, the audio signal is processed… (voir plus) by an automatic speech recognizer for transcription or n-best hypotheses. With the recognition results, a natural language understanding system classifies the text to structured data as domain, intent and slots for down-streaming consumers, such as dialog system, hands-free applications. These components are usually developed and optimized independently. In this paper, we present our study on an end-to-end learning system for spoken language understanding. With this unified approach, we can infer the semantic meaning directly from audio features without the intermediate text representation. This study showed that the trained model can achieve reasonable good result and demonstrated that the model can capture the semantic attention directly from the audio features.
2018-04-14
2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (publié)
Recent advance of deep learning has been transforming the landscape in many domains. However, understanding the predictions of a deep networ… (voir plus)k remains a challenge, which is especially sensitive in health care domains as interpretability is key. Techniques that rely on saliency maps -highlighting the region of an image that influence the classifier’s decision the mostare often used for that purpose. However, gradients fluctuation make saliency maps noisy and thus difficult to interpret at a human level. Moreover, models tend to focus on one particular influential region of interest (ROI) in the image, even though other regions might be relevant for the decision. We propose a new framework that refines those saliency maps to generate segmentation masks over the ROI on the initial image. In a second contribution, we propose to apply those masks over the original inputs, then evaluate our classifier on the masked inputs to identify previously overlooked ROI. This iterative procedure allows us to emphasize new region of interests by extracting meaningful information from the saliency maps.
Deep networks have achieved impressive results across a variety of important tasks. However a known weakness is a failure to perform well wh… (voir plus)en evaluated on data which differ from the training distribution, even if these differences are very small, as is the case with adversarial examples. We propose Fortified Networks, a simple transformation of existing networks, which fortifies the hidden layers in a deep network by identifying when the hidden states are off of the data manifold, and maps these hidden states back to parts of the data manifold where the network performs well. Our principal contribution is to show that fortifying these hidden states improves the robustness of deep networks and our experiments (i) demonstrate improved robustness to standard adversarial attacks in both black-box and white-box threat models; (ii) suggest that our improvements are not primarily due to the gradient masking problem and (iii) show the advantage of doing this fortification in the hidden layers instead of the input space.
We extend the neural Turing machine (NTM) model into a dynamic neural Turing machine (D-NTM) by introducing trainable address vectors. This … (voir plus)addressing scheme maintains for each memory cell two separate vectors, content and address vectors. This allows the D-NTM to learn a wide variety of location-based addressing strategies, including both linear and nonlinear ones. We implement the D-NTM with both continuous and discrete read and write mechanisms. We investigate the mechanisms and effects of learning to read and write into a memory through experiments on Facebook bAbI tasks using both a feedforward and GRU controller. We provide extensive analysis of our model and compare different variations of neural Turing machines on this task. We show that our model outperforms long short-term memory and NTM variants. We provide further experimental results on the sequential [Formula: see text]MNIST, Stanford Natural Language Inference, associative recall, and copy tasks.
Although deep nets have resulted in high accuracies for various visual tasks, their computational and space requirements are prohibitively h… (voir plus)igh for inclusion on devices without high-end GPUs. In this paper, we introduce a neuron/filter level pruning framework based on Fisher's LDA which leads to high accuracies for a wide array of facial trait classification tasks, while significantly reducing space/computational complexities. The approach is general and can be applied to convolutional, fully-connected, and module-based deep structures, in all cases leveraging the high decorrelation of neuron activations found in the pre-decision layer and cross-layer deconv dependency. Experimental results on binary and multi-category facial traits from the LFWA and Adience datasets illustrate the framework's comparable/better performance to state-of-the-art pruning approaches and compact structures (e.g. SqueezeNet, MobileNet). Ours successfully maintains comparable accuracies even after discarding most parameters (98%-99% for VGG-16, 82% for GoogLeNet) and with significant FLOP reductions (83% for VGG-16, 64% for GoogLeNet).
With deep learning's success, a limited number of popular deep nets have been widely adopted for various vision tasks. However, this usually… (voir plus) results in unnecessarily high complexities and possibly many features of low task utility. In this paper, we address this problem by introducing a task-dependent deep pruning framework based on Fisher's Linear Discriminant Analysis (LDA). The approach can be applied to convolutional, fully-connected, and module-based deep network structures, in all cases leveraging the high decorrelation of neuron motifs found in the pre-decision layer and cross-layer deconv dependency. Moreover, we examine our approach's potential in network architecture search for specific tasks and analyze the influence of our pruning on model robustness to noises and adversarial attacks. Experimental results on datasets of generic objects, as well as domain specific tasks (CIFAR100, Adience, and LFWA) illustrate our framework's superior performance over state-of-the-art pruning approaches and fixed compact nets (e.g. SqueezeNet, MobileNet). The proposed method successfully maintains comparable accuracies even after discarding most parameters (98%-99% for VGG16, up to 82% for the already compact InceptionNet) and with significant FLOP reductions (83% for VGG16, up to 64% for InceptionNet). Through pruning, we can also derive smaller, but more accurate and more robust models suitable for the task.
Learning distributed sentence representations remains an interesting problem in the field of Natural Language Processing (NLP). We want to l… (voir plus)earn a model that approximates the conditional latent space over the representations of a logical antecedent of the given statement. In our paper, we propose an approach to generating sentences, conditioned on an input sentence and a logical inference label. We do this by modeling the different possibilities for the output sentence as a distribution over the latent representation, which we train using an adversarial objective. We evaluate the model using two state-of-the-art models for the Recognizing Textual Entailment (RTE) task, and measure the BLEU scores against the actual sentences as a probe for the diversity of sentences produced by our model. The experiment results show that, given our framework, we have clear ways to improve the quality and diversity of generated sentences.
Statistical methods protecting sensitive information or the identity of the data owner have become critical to ensure privacy of individuals… (voir plus) as well as of organizations. This paper investigates anonymization methods based on representation learning and deep neural networks, and motivated by novel information theoretical bounds. We introduce a novel training objective for simultaneously training a predictor over target variables of interest (the regular labels) while preventing an intermediate representation to be predictive of the private labels. The architecture is based on three sub-networks: one going from input to representation, one from representation to predicted regular labels, and one from representation to predicted private labels. The training procedure aims at learning representations that preserve the relevant part of the information (about regular labels) while dismissing information about the private labels which correspond to the identity of a person. We demonstrate the success of this approach for two distinct classification versus anonymization tasks (handwritten digits and sentiment analysis).
Exploring why stochastic gradient descent (SGD) based optimization methods train deep neural networks (DNNs) that generalize well has become… (voir plus) an active area of research. Towards this end, we empirically study the dynamics of SGD when training over-parametrized DNNs. Specifically we study the DNN loss surface along the trajectory of SGD by interpolating the loss surface between parameters from consecutive \textit{iterations} and tracking various metrics during training. We find that the loss interpolation between parameters before and after a training update is roughly convex with a minimum (\textit{valley floor}) in between for most of the training. Based on this and other metrics, we deduce that during most of the training, SGD explores regions in a valley by bouncing off valley walls at a height above the valley floor. This 'bouncing off walls at a height' mechanism helps SGD traverse larger distance for small batch sizes and large learning rates which we find play qualitatively different roles in the dynamics. While a large learning rate maintains a large height from the valley floor, a small batch size injects noise facilitating exploration. We find this mechanism is crucial for generalization because the valley floor has barriers and this exploration above the valley floor allows SGD to quickly travel far away from the initialization point (without being affected by barriers) and find flatter regions, corresponding to better generalization.
This paper introduces a novel measure-theoretic theory for machine learning that does not require statistical assumptions. Based on this the… (voir plus)ory, a new regularization method in deep learning is derived and shown to outperform previous methods in CIFAR-10, CIFAR-100, and SVHN. Moreover, the proposed theory provides a theoretical basis for a family of practically successful regularization methods in deep learning. We discuss several consequences of our results on one-shot learning, representation learning, deep learning, and curriculum learning. Unlike statistical learning theory, the proposed learning theory analyzes each problem instance individually via measure theory, rather than a set of problem instances via statistics. As a result, it provides different types of results and insights when compared to statistical learning theory.