Learning from Stochastic Teacher Representations Using Student-Guided Knowledge Distillation
Muhammad Haseeb Aslam
Clara Martinez
Alessandro Lameiras Koerich
Ali Etemad
Eric Granger
Advances in self-distillation have shown that when knowledge is distilled from a teacher to a student using the same deep learning (DL) architecture, the student's performance can surpass the teacher's, particularly when the network is overparameterized and the teacher is trained with early stopping. Alternatively, ensemble learning also improves performance, although training, storing, and deploying multiple models becomes impractical as the number of models grows. Even distilling an ensemble into a single student model, or using weight-averaging methods, first requires training multiple teacher models and does not fully leverage the inherent stochasticity of DL models for generating and distilling diversity. These constraints are particularly prohibitive in resource-constrained or latency-sensitive applications such as wearable devices. This paper proposes to train only one model and generate multiple diverse teacher representations using distillation-time dropout. However, generating these representations stochastically leads to noisy representations that are misaligned with the learned task. To overcome this problem, a novel stochastic self-distillation (SSD) training strategy is introduced for filtering and weighting teacher representations so that distillation draws only on task-relevant ones, using student-guided knowledge distillation (SGKD). The student representation at each distillation step is used as an authority to guide the distillation process. Experimental results on real-world affective computing, wearable/biosignal datasets from the UCR Archive, the HAR dataset, and image classification datasets show that the proposed SSD method can outperform state-of-the-art methods without increasing the model size at either training or testing time, and incurs negligible computational overhead compared to state-of-the-art ensemble learning and weight-averaging methods.
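To make the mechanism concrete, below is a minimal PyTorch-style sketch of the distillation-time-dropout idea with student-guided weighting. The function name `ssd_distillation_loss`, the cosine-similarity weighting, the softmax temperature `tau`, and the MSE matching loss are illustrative assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def ssd_distillation_loss(teacher, student_repr, x, num_samples=8, tau=1.0):
    """Weighted representation-matching loss from stochastic teacher samples.

    `teacher` is the single trained model; keeping it in train mode leaves
    its dropout layers active, so each forward pass on the same input `x`
    yields a different representation.
    """
    teacher.train()  # activate dropout at distillation time (weights stay frozen)
    with torch.no_grad():
        teacher_reprs = torch.stack([teacher(x) for _ in range(num_samples)])  # (K, B, D)

    # Student-guided weighting: samples closer to the current student
    # representation are treated as task-relevant and weighted more heavily.
    sims = F.cosine_similarity(teacher_reprs, student_repr.unsqueeze(0), dim=-1)  # (K, B)
    weights = F.softmax(sims / tau, dim=0).detach()

    # Match the student to each sampled teacher representation (MSE here),
    # down-weighting noisy, misaligned samples.
    per_sample = ((teacher_reprs - student_repr.unsqueeze(0)) ** 2).mean(dim=-1)  # (K, B)
    return (weights * per_sample).sum(dim=0).mean()
```

Keeping the teacher in train mode is what makes its dropout masks stochastic, so a single stored model can stand in for a diverse ensemble of teachers.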
Leveraging Machine Learning Techniques in Intrusion Detection Systems for Internet of Things
Saeid Jamshidi
Amin Nikanjam
Kawser Wazed Nafi
As the Internet of Things (IoT) continues to expand, ensuring the security of connected devices has become increasingly critical. Traditional Intrusion Detection Systems (IDS) often fall short in managing the dynamic and large-scale nature of IoT networks. This paper explores how Machine Learning (ML) and Deep Learning (DL) techniques can significantly enhance IDS performance in IoT environments. We provide a thorough overview of IDS deployment strategies and categorize the types of intrusions common in IoT systems. A range of ML methods -- including Support Vector Machines, Naive Bayes, K-Nearest Neighbors, Decision Trees, and Random Forests -- is examined alongside advanced DL models such as LSTMs, CNNs, Autoencoders, RNNs, and Deep Belief Networks. Each technique is evaluated on its accuracy, efficiency, and suitability for real-world IoT applications. We also address major challenges such as high false-positive rates, data imbalance, encrypted traffic analysis, and the resource constraints of IoT devices. In addition, we highlight the emerging role of Generative AI and Large Language Models (LLMs) in improving threat detection, automating responses, and generating intelligent security policies. Finally, we discuss ethical and privacy concerns, underscoring the need for responsible and transparent implementation. This paper aims to provide a comprehensive framework for developing adaptive, intelligent, and secure IDS solutions tailored to the evolving landscape of IoT.
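As a concrete illustration of the classical ML pipeline such surveys evaluate, here is a minimal, self-contained sketch that trains and scores a Random Forest intrusion classifier. The features and labels are synthetic placeholders standing in for flow-level IoT traffic features, not data from any IDS benchmark.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

# Placeholder labeled traffic: rows are flows, columns stand in for features
# such as packet size, duration, and protocol flags; labels mark intrusions.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 8))
y = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Per-class precision/recall; false positives are a key IDS concern.
print(classification_report(y_test, clf.predict(X_test)))
```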
LitLLMs, LLMs for Literature Review: Are we there yet?
Shubham Agarwal
Gaurav Sahu
Abhay Puri
Issam Hadj Laradji
Krishnamurthy Dj Dvijotham
Jason Stanley
LLMs are Greedy Agents: Effects of RL Fine-tuning on Decision-Making Abilities
Thomas Schmied
Jorg Bornschein
Jordi Grau-Moya
Markus Wulfmeier
Locate 3D: Real-World Object Localization via Self-Supervised Learning in 3D
Sergio Arnaud
Paul McVay
Ada Martin
Arjun Majumdar
Krishna Murthy
Phillip Thomas
Ruslan Partsey
Daniel Dugas
Abha Gejji
Alexander Sax
Vincent-Pierre Berges
Mikael Henaff
Ayush Jain
Ang Cao
Ishita Prasad
Mrinal Kalakrishnan
Nicolas Ballas
Mahmoud Assran
Oleksandr Maksymets …
Aravind Rajeswaran
Franziska Meier
Long Range Navigator (LRN): Extending robot planning horizons beyond metric maps
Matt Schmittle
Rohan Baijal
Nathan Hatch
Rosario Scalise
Mateo Guaman Castro
Sidharth Talia
Byron Boots
S. Srinivasa
Lugha-Llama: Adapting Large Language Models for African Languages
Happy Buzaaba
Alexander Wettig
Christiane Fellbaum
Min-Max Optimisation for Nonconvex-Nonconcave Functions Using a Random Zeroth-Order Extragradient Algorithm
Amir Ali Farzin
Yuen-Man Pun
Philipp Braun
Youssef Diouane
Iman Shames
Multiple-model coding scheme for electrical signal compression
Corentin Presvôts
Michel Kieffer
Thibault Prevost
Patrick Panciatici
Zuxing Li
Negative Language Transfer Identification in the English Writing of Chinese and Farsi Native Speakers
Mohammad Karimiabdolmaleki
Leticia Farias Wanderley
Mohsen Rezazadeh
Carrie Demmans Epp
Neural Kinematic Bases for Fluids
Yibo Liu
Paul Kry
Kenny Erleben
Sune Darkner
Teseo Schneider