Global biodiversity is declining at an unprecedented rate, yet little information is known about most species and how their populations are changing. Indeed, some 90% of Earth's species are estimated to be completely unknown. Machine learning has recently emerged as a promising tool to facilitate long-term, large-scale biodiversity monitoring, including algorithms for fine-grained classification of species from images. However, such algorithms typically are not designed to detect examples from categories unseen during training -- the problem of open-set recognition (OSR) -- limiting their applicability for highly diverse, poorly studied taxa such as insects. To address this gap, we introduce Open-Insect, a large-scale, fine-grained dataset to evaluate unknown species detection across different geographic regions with varying difficulty. We benchmark 38 OSR algorithms across three categories: post-hoc, training-time regularization, and training with auxiliary data, finding that simple post-hoc approaches remain a strong baseline. We also demonstrate how to leverage auxiliary data to improve species discovery in regions with limited data. Our results provide insights to guide the development of computer vision methods for biodiversity monitoring and species discovery.
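To illustrate what a "post-hoc" OSR approach can look like, here is a minimal sketch of one widely used baseline: thresholding the maximum softmax probability of an already-trained classifier. The abstract does not say which of the 38 benchmarked algorithms this corresponds to, so this should not be read as the paper's method. The function names and the threshold value are hypothetical, chosen only for illustration.

```python
import math


def softmax(logits):
    """Convert raw classifier scores into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_open_set(logits, threshold=0.5):
    """Return the predicted class index, or None if the input looks unknown.

    Post-hoc: the classifier is not retrained. We only inspect its output
    and reject inputs whose top probability falls below `threshold`
    (an assumed value; in practice it would be tuned on held-out data).
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return None  # treated as a species unseen during training
    return best


# A confident prediction is accepted; a flat one is flagged as unknown.
confident = predict_open_set([5.0, 0.0, 0.0])
uncertain = predict_open_set([0.1, 0.0, 0.05])
```

Because the score is computed from the outputs of a fixed model, a method of this kind can be applied to any trained species classifier without changing its training procedure.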
Question understanding is important to the success of a Knowledge-based Question Answering (KBQA) system. However, existing studies do not pay enough attention to this issue, given that the questions in existing KBQA datasets are usually expressed in a simple and straightforward way. This does not reflect actual linguistic usage, which often involves many modifiers. To facilitate the study of evaluating and enhancing the question understanding ability of KBQA systems, this paper proposes to construct a complex-modified question-answering (XMQAs) dataset based on existing KBQA datasets. With the help of knowledge bases and dictionaries, three kinds of modifiers are defined and applied to the original simply expressed questions. These modifiers make the expression of the questions complex without changing their semantics. Based on XMQAs, we then propose a novel question understanding algorithm built on existing KBQA models, which greatly improves the robustness of their question understanding abilities. We conduct extensive experiments on XMQAs and two widely acknowledged KBQA datasets. The empirical results demonstrate that our proposed algorithm can improve the performance of KBQA models not only on complex-modified questions but also on simply expressed questions.
2023-08-09
IEEE Transactions on Knowledge and Data Engineering (unknown)
Interpreting the predictions of existing Question Answering (QA) models is critical to many real-world intelligent applications, such as QA systems for healthcare, education, and finance. However, existing QA models lack interpretability and provide no feedback or explanation to help end-users understand why a specific prediction is the answer to a question. In this research, we argue that the evidence for an answer is critical to enhancing the interpretability of QA models. Unlike previous research that simply extracts one or more sentences from the context as evidence, we are the first to explicitly define the concept of evidence as the supporting facts in a context that are informative, concise, and readable. In addition, we provide effective strategies to quantitatively measure the informativeness, conciseness, and readability of evidence. Furthermore, we propose the Grow-and-Clip Evidence Distillation (GCED) algorithm to extract evidence from contexts by trading off informativeness, conciseness, and readability. We conduct extensive experiments on the SQuAD and TriviaQA datasets with several baseline models to evaluate the effect of GCED on interpreting answers to questions. Human evaluations are also carried out to check the quality of the distilled evidence. Experimental results show that automatically distilled evidence has human-like informativeness, conciseness, and readability, which can enhance the interpretability of the answers to questions.
2022-05-08
2022 IEEE 38th International Conference on Data Engineering (ICDE) (published)