Responsibility Attribution for AI-Mediated Damages with Mechanistic Interpretability
Abstract
Artificial Intelligence (AI) raises profound ethical, moral, and legal challenges for society. In this paper, we focus on the legal challenge of adequately attributing responsibility for AI-mediated damages. In law, responsibility is usually attributed on the basis that a (human) actor takes actions which cause harm or damage to some subject or property. While this analysis seems straightforward, it remains unclear (a) what conception of causation it relies on, (b) how it can be applied to attribute responsibility when human actions rely on the use of opaque AI systems, and (c) how liability for AI-mediated damages should be handled in practice. The present paper sets out to answer these questions. We argue that causation in the legal context is best conceptualized as difference-making. To determine the relevant difference-makers for AI-mediated damages, we propose that explainable AI (XAI) methods may serve as important tools. Specifically, we argue that mechanistic interpretability (MI) is well suited to increasing the ex-ante safety of opaque AI systems.
