Sitemap
A list of all the posts and pages found on the site. For you robots out there, there is an XML version available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit _config.yml and set future: false.
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Publications
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
Published in ACL Main 2024, 2024
Recommended citation: Philipp Mondorf and Barbara Plank. 2024. Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 9370–9402, Bangkok, Thailand. Association for Computational Linguistics.
Download Paper
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models – A Survey
Published in COLM 2024, 2024
Recommended citation: Philipp Mondorf and Barbara Plank. 2024. Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models – A Survey. In Proceedings of the First Conference on Language Modeling. URL: https://openreview.net/forum?id=Lmjgl2n11u.
Download Paper
Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models
Published in EMNLP Main 2024, 2024
Recommended citation: Philipp Mondorf and Barbara Plank. 2024. Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 7114–7137, Miami, Florida, USA. Association for Computational Linguistics.
Download Paper
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models
Published in ACL Main 2025, 2025
Recommended citation: Philipp Mondorf, Sondre Wold, and Barbara Plank. 2025. Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models. In Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 14934–14955, Vienna, Austria. Association for Computational Linguistics.
Download Paper
The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It
Published in EMNLP Main 2025, 2025
Recommended citation: Leonardo Bertolazzi, Philipp Mondorf, Barbara Plank, and Raffaella Bernardi. 2025. The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 29387–29424, Suzhou, China. Association for Computational Linguistics.
Download Paper
BlackboxNLP-2025 MIB Shared Task: Exploring Ensemble Strategies for Circuit Localization Methods
Published in BlackboxNLP 2025, 2025
Recommended citation: Philipp Mondorf, Mingyang Wang, Sebastian Gerstner, Ahmad Dawar Hakimi, Yihong Liu, Leonor Veloso, Shijia Zhou, Hinrich Schuetze, and Barbara Plank. 2025. BlackboxNLP-2025 MIB Shared Task: Exploring Ensemble Strategies for Circuit Localization Methods. In Proceedings of the 8th BlackboxNLP Workshop: Analyzing and Interpreting Neural Networks for NLP, pages 537–542, Suzhou, China. Association for Computational Linguistics.
Download Paper
If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models
Published in EACL Main 2026, 2026
Recommended citation: Jasmin Orth, Philipp Mondorf, and Barbara Plank. 2026. If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models. In Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics.
Download Paper
Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning
Published in ICLR 2026, 2026
Recommended citation: Philipp Mondorf, Shijia Zhou, Monica Riedler, and Barbara Plank. 2026. Compositional-ARC: Assessing Systematic Generalization in Abstract Spatial Reasoning. In Proceedings of the Fourteenth International Conference on Learning Representations.
Download Paper
Talks
Prompting LLMs to Reason: Common Pitfalls
Published:
Human-centric Evaluation of Language Models
Published:
Teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.
