On the Duality between Gradient Transformations and Adapters
Paper
by Lucas Torroba-Hennigen, Hunter Lang, Han Guo, Yoon Kim
preprint on arXiv, February 2025
We explore a duality between training with transformed gradients and training with one-sided adapters, and use this connection to derive more memory-efficient single-node and distributed pretraining methods.
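As a rough illustration of the kind of duality the paper studies (this sketch is not the paper's code; the shapes, toy loss, and plain-SGD setting are my own assumptions), the snippet below checks numerically that one SGD step on a one-sided adapter W0 + A·B with B frozen moves the effective weights exactly as one full-matrix SGD step whose gradient has been linearly transformed by G ↦ G·Bᵀ·B:

```python
# Minimal sketch of the SGD case of the duality (illustrative assumptions only).
import torch

torch.manual_seed(0)
m, n, r = 8, 6, 3
W0 = torch.randn(m, n)      # frozen pretrained weight matrix
B = torch.randn(r, n)       # frozen factor; "one-sided" means only A is trained
x = torch.randn(n)
target = torch.randn(m)
lr = 0.1

def loss_fn(W):
    # Toy quadratic loss, chosen only for the demonstration.
    return 0.5 * ((W @ x - target) ** 2).sum()

# Path 1: one SGD step on the adapter parameter A (starting from A = 0).
A = torch.zeros(m, r, requires_grad=True)
loss_fn(W0 + A @ B).backward()
with torch.no_grad():
    A_new = A - lr * A.grad
W_adapter = W0 + A_new @ B

# Path 2: one SGD step on the full matrix with a transformed gradient.
W = W0.clone().requires_grad_(True)
loss_fn(W).backward()
G = W.grad
W_transformed = W0 - lr * (G @ B.T @ B)   # gradient transformation G -> G B^T B

print(torch.allclose(W_adapter, W_transformed, atol=1e-6))  # True
```

The agreement follows because the adapter gradient at A = 0 is G·Bᵀ, so its update contributes ΔW = -η·G·Bᵀ·B, i.e., exactly a linearly transformed full-matrix gradient step; the paper develops this correspondence well beyond the plain-SGD case assumed here.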