Item

Responsibility Attribution for AI-Mediated Damages with Mechanistic Interpretability

Date

2025-10-01

Abstract

Artificial Intelligence (AI) raises profound ethical, moral, and legal challenges for society. In this paper, we focus on the legal challenge of adequately attributing responsibility for AI-mediated damages. In law, responsibility is usually attributed on the basis that a (human) actor takes actions which cause harm or damage to some subject or property. While this analysis seems straightforward, it remains unclear (a) what conception of causation it relies on, (b) how it can be applied to attribute responsibility when human actions rely on the use of opaque AI systems, and (c) how liability for AI-mediated damages should be handled in practice. The present paper sets out to answer these questions. We argue that causation in the legal context is best conceptualized as difference-making. To determine relevant difference-makers for AI-mediated damages, we propose that explainable AI (XAI) methods may serve as important tools. Specifically, we argue that mechanistic interpretability (MI) is well-suited to increase the ex-ante safety of opaque AI systems.

Keywords

Artificial Intelligence, Responsibility Attribution, Liability, Explainable AI, EU AI Act, Ex Ante Safety, Mechanistic Interpretability

Citation

Cordes J., Kästner L. & Zech H. (2025). Responsibility Attribution for AI-Mediated Damages with Mechanistic Interpretability. In B. Steffen (ed.), Bridging the Gap Between AI and Reality: Second International Conference, AISoLA 2024, Selected Papers (pp. 187-202). Springer Nature Switzerland. https://doi.org/10.1007/978-3-032-01377-4_10

Creative Commons license

Except where otherwise noted, this item's license is described as open access.