Long-term survival and functional outcomes of critically ill patients with hematologic malignancies: a Canadian multicenter prospective study
Laveena Munshi
Bram Rochwerg
Farah Shoukat
Michael Detsky
Dean A. Fergusson
Bruno Ferreyro
Paul Heffernan
Margaret Herridge
Sheldon Magder
Mark Minden
Rakesh Patel
Salman Qureshi
Aaron Schimmer
Santhosh Thyagu
Han Ting Wang
Sangeeta Mehta
Perspectives on Robotic Systems for the Visually Impaired
Christopher Yee Wong
Rahatul Amin Ananto
Tanaka Akiyama
Joseph Paul Nemargut
Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science
Xiangru Tang
Qiao Jin
Kunlun Zhu
Tongxin Yuan
Yichi Zhang
Wangchunshu Zhou
Meng Qu
Yilun Zhao
Zhuosheng Zhang
Arman Cohan
Zhiyong Lu
Mark Gerstein
Stealing Part of a Production Language Model
Nicholas Carlini
Daniel Paleka
Krishnamurthy Dvijotham
Thomas Steinke
Jonathan Hayase
A. Feder Cooper
Katherine Lee
Matthew Jagielski
Milad Nasr
Arthur Conmy
Eric Wallace
Florian Tramèr
Stealing Part of a Production Language Model
Nicholas Carlini
Daniel Paleka
Krishnamurthy Dj Dvijotham
Thomas Steinke
Jonathan Hayase
A. Feder Cooper
Katherine Lee
Matthew Jagielski
Milad Nasr
Arthur Conmy
Eric Wallace
Florian Tramèr
We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like Op… (see more)enAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under \
Stealing Part of a Production Language Model
Nicholas Carlini
Daniel Paleka
Krishnamurthy Dj Dvijotham
Thomas Steinke
Jonathan Hayase
A. Feder Cooper
Katherine Lee
Matthew Jagielski
Milad Nasr
Arthur Conmy
Eric Wallace
Florian Tramèr
We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like Op… (see more)enAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under \
Stochastic gradient descent-based inference for dynamic network models with attractors
Hancong Pan
Xiaojing Zhu
Cantay Caliskan
Dino P. Christenson
Konstantinos Spiliopoulos
Dylan Walker
In Coevolving Latent Space Networks with Attractors (CLSNA) models, nodes in a latent space represent social actors, and edges indicate thei… (see more)r dynamic interactions. Attractors are added at the latent level to capture the notion of attractive and repulsive forces between nodes, borrowing from dynamical systems theory. However, CLSNA reliance on MCMC estimation makes scaling difficult, and the requirement for nodes to be present throughout the study period limit practical applications. We address these issues by (i) introducing a Stochastic gradient descent (SGD) parameter estimation method, (ii) developing a novel approach for uncertainty quantification using SGD, and (iii) extending the model to allow nodes to join and leave over time. Simulation results show that our extensions result in little loss of accuracy compared to MCMC, but can scale to much larger networks. We apply our approach to the longitudinal social networks of members of US Congress on the social media platform X. Accounting for node dynamics overcomes selection bias in the network and uncovers uniquely and increasingly repulsive forces within the Republican Party.
Stochastic gradient descent-based inference for dynamic network models with attractors
Hancong Pan
Xiaojing Zhu
Cantay Caliskan
Dino P. Christenson
Konstantinos Spiliopoulos
Dylan Walker
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lu
Zdeněk Kasner
We propose the problem of conversational web navigation, where a digital agent controls a web browser and follows user instructions to solve… (see more) real-world tasks in a multi-turn dialogue fashion. To support this problem, we introduce WEBLINX - a large-scale benchmark of 100K interactions across 2300 expert demonstrations of conversational web navigation. Our benchmark covers a broad range of patterns on over 150 real-world websites and can be used to train and evaluate agents in diverse scenarios. Due to the magnitude of information present, Large Language Models (LLMs) cannot process entire web pages in real-time. To solve this bottleneck, we design a retrieval-inspired model that efficiently prunes HTML pages by ranking relevant elements. We use the selected elements, along with screenshots and action history, to assess a variety of models for their ability to replicate human behavior when navigating the web. Our experiments span from small text-only to proprietary multimodal LLMs. We find that smaller finetuned decoders surpass the best zero-shot LLMs (including GPT-4V), but also larger finetuned multimodal models which were explicitly pretrained on screenshots. However, all finetuned models struggle to generalize to unseen websites. Our findings highlight the need for large multimodal models that can generalize to novel settings. Our code, data and models are available for research: https://mcgill-nlp.github.io/weblinx
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lu
Zdeněk Kasner
We propose the problem of conversational web navigation, where a digital agent controls a web browser and follows user instructions to solve… (see more) real-world tasks in a multi-turn dialogue fashion. To support this problem, we introduce WebLINX - a large-scale benchmark of 100K interactions across 2300 expert demonstrations of conversational web navigation. Our benchmark covers a broad range of patterns on over 150 real-world websites and can be used to train and evaluate agents in diverse scenarios. Due to the magnitude of information present, Large Language Models (LLMs) cannot process entire web pages in real-time. To solve this bottleneck, we design a retrieval-inspired model that efficiently prunes HTML pages by ranking relevant elements. We use the selected elements, along with screenshots and action history, to assess a variety of models for their ability to replicate human behavior when navigating the web. Our experiments span from small text-only to proprietary multimodal LLMs. We find that smaller finetuned decoders surpass the best zero-shot LLMs (including GPT-4V), but also larger finetuned multimodal models which were explicitly pretrained on screenshots. However, all finetuned models struggle to generalize to unseen websites. Our findings highlight the need for large multimodal models that can generalize to novel settings. Our code, data and models are available for research: https://mcgill-nlp.github.io/weblinx.
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
Massimo Caccia
Issam Hadj Laradji
Manuel Del Verme
Tom Marty
Léo Boisvert
Megh Thakkar
David Vazquez
Alexandre Lacoste
We study the use of large language model-based agents for interacting with software via web browsers. Unlike prior work, we focus on measuri… (see more)ng the agents' ability to perform tasks that span the typical daily work of knowledge workers utilizing enterprise software systems. To this end, we propose WorkArena, a remote-hosted benchmark of 29 tasks based on the widely-used ServiceNow platform. We also introduce BrowserGym, an environment for the design and evaluation of such agents, offering a rich set of actions as well as multimodal observations. Our empirical evaluation reveals that while current agents show promise on WorkArena, there remains a considerable gap towards achieving full task automation. Notably, our analysis uncovers a significant performance disparity between open and closed-source LLMs, highlighting a critical area for future exploration and development in the field.
Towards a Generic Representation of Combinatorial Problems for Learning-Based Approaches
Léo Boisvert
Hélène Verhaeghe
In recent years, there has been a growing interest in using learning-based approaches for solving combinatorial problems, either in an end-t… (see more)o-end manner or in conjunction with traditional optimization algorithms. In both scenarios, the challenge lies in encoding the targeted combinatorial problems into a structure compatible with the learning algorithm. Many existing works have proposed problem-specific representations, often in the form of a graph, to leverage the advantages of \textit{graph neural networks}. However, these approaches lack generality, as the representation cannot be easily transferred from one combinatorial problem to another one. While some attempts have been made to bridge this gap, they still offer a partial generality only. In response to this challenge, this paper advocates for progress toward a fully generic representation of combinatorial problems for learning-based approaches. The approach we propose involves constructing a graph by breaking down any constraint of a combinatorial problem into an abstract syntax tree and expressing relationships (e.g., a variable involved in a constraint) through the edges. Furthermore, we introduce a graph neural network architecture capable of efficiently learning from this representation. The tool provided operates on combinatorial problems expressed in the XCSP3 format, handling all the constraints available in the 2023 mini-track competition. Experimental results on four combinatorial problems demonstrate that our architecture achieves performance comparable to dedicated architectures while maintaining generality. Our code and trained models are publicly available at https://github.com/corail-research/learning-generic-csp.