Digital Technologies in Society
This research focus investigates the relationship between digitalization, participation, and inequality; explores, through design, how digital technologies can be used to create opportunities for participation; and intervenes against new inequalities. To this end, it brings together perspectives from information systems research, design research, and computer science.
Listing Digital Technologies in Society by author "Berendt, Bettina"
Now showing 1 - 8 of 8
- Articulation Work and Tinkering for Fairness in Machine Learning (2024)
  Fahimi, Miriam; Russo, Mayra; Scott, Kristen M.; Vidal, Maria-Esther; Berendt, Bettina; Kinder-Kurlanda, Katharina
  The field of fair AI aims to counter biased algorithms through computational modelling. However, it faces increasing criticism for perpetuating the use of overly technical and reductionist methods. As a result, novel approaches appear in the field to address more socially oriented and interdisciplinary (SOI) perspectives on fair AI. In this paper, we take this dynamic as the starting point to study the tension between computer science (CS) and SOI research. Drawing on STS and CSCW theory, we position fair AI research as a matter of 'organizational alignment': what makes research 'doable' is the successful alignment of three levels of work organization (the social world, the laboratory, and the experiment). Based on qualitative interviews with CS researchers, we analyze the tasks, resources, and actors required for doable research in the case of fair AI. We find that CS researchers engage with SOI research to some extent, but organizational conditions, articulation work, and ambiguities of the social world constrain the doability of SOI research for them. Based on our findings, we identify and discuss problems for aligning CS and SOI as fair AI continues to evolve.
- Diversity and bias in DBpedia and Wikidata as a challenge for text-analysis tools (2023)
  Berendt, Bettina; Karadeniz, Oğuz Özgür; Kıyak, Sercan; Mertens, Stefan; d'Haenens, Leen
  Diversity Searcher is a tool originally developed to help analyse diversity in news media texts. It relies on automated content analysis and thus rests on prior assumptions and depends on certain design choices related to diversity. One such design choice is the external knowledge source(s) used. In this article, we discuss implications that these sources can have on the results of content analysis. We compare two data sources that Diversity Searcher has worked with, DBpedia and Wikidata, with respect to their ontological coverage and diversity, and describe implications for the resulting analyses of text corpora. We describe a case study of the relative over- or underrepresentation of Belgian political parties between 1990 and 2020. In particular, we found a staggering overrepresentation of the political right in the English-language DBpedia.
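The over-/underrepresentation analysis described in this abstract can be illustrated with a minimal sketch: compare each category's share of entities in a knowledge source against its share in a ground-truth baseline. All figures, category names, and the baseline here are invented for illustration, not taken from the paper.

```python
# Hypothetical illustration of measuring over-/underrepresentation:
# compare a category's share in one knowledge graph against a
# real-world baseline (all numbers invented).
baseline = {"party_A": 50, "party_B": 50}       # real-world share
source_counts = {"party_A": 80, "party_B": 20}  # entities found in one KG

def representation_ratio(category):
    expected = baseline[category] / sum(baseline.values())
    observed = source_counts[category] / sum(source_counts.values())
    return observed / expected  # > 1 means overrepresented in the KG

print(representation_ratio("party_A"))  # party_A is overrepresented
```

A ratio well above 1 for one political family, as in the paper's DBpedia finding, indicates that downstream text-analysis results will inherit that skew.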
- How Far Can It Go? On Intrinsic Gender Bias Mitigation for Text Classification (Association for Computational Linguistics, 2023)
  Tokpo, Ewoenam Kwaku; Delobelle, Pieter; Berendt, Bettina; Calders, Toon
  To mitigate gender bias in contextualized language models, different intrinsic mitigation strategies have been proposed, alongside many bias metrics. Considering that the end use of these language models is for downstream tasks like text classification, it is important to understand how these intrinsic bias mitigation strategies actually translate to fairness in downstream tasks, and to what extent. In this work, we design a probe to investigate the effects that some of the major intrinsic gender bias mitigation strategies have on downstream text classification tasks. We discover that instead of resolving gender bias, intrinsic mitigation techniques and metrics are able to hide it in such a way that significant gender information is retained in the embeddings. Furthermore, we show that each mitigation technique is able to hide the bias from some of the intrinsic bias measures but not all, and each intrinsic bias measure can be fooled by some mitigation techniques, but not all. We confirm experimentally that none of the intrinsic mitigation techniques, used without any other fairness intervention, is able to consistently impact extrinsic bias. We recommend that intrinsic bias mitigation techniques be combined with other fairness interventions for downstream tasks.
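The probing idea behind this abstract can be sketched in a few lines: if a simple classifier trained on "debiased" embeddings can still recover the protected attribute, the bias information was hidden rather than removed. The data below is synthetic and the setup is a toy stand-in for the paper's probe, not its actual experiment.

```python
# Toy probing experiment: a linear classifier tries to recover a
# protected attribute from supposedly debiased embeddings.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# 200 synthetic "debiased" embeddings in which one direction still
# correlates with the protected attribute (the residual signal).
labels = rng.integers(0, 2, size=200)
embeddings = rng.normal(size=(200, 16))
embeddings[:, 0] += 2.0 * labels  # leftover gender information

probe = LogisticRegression().fit(embeddings[:150], labels[:150])
accuracy = probe.score(embeddings[150:], labels[150:])
# Accuracy well above 0.5 means the attribute is still recoverable.
print(round(accuracy, 2))
```

In the paper's terms, high probe accuracy after mitigation is evidence that an intrinsic technique fooled the bias metric without removing the underlying information.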
- Proceedings of the Weizenbaum Conference 2023. AI, Big Data, Social Media and People on the Move (Weizenbaum Institute, 2023)
  Berendt, Bettina; Krzywdzinski, Martin; Kuznetsova, Elizaveta
  The contributions focus on the question of what role different digital technologies play for "people on the move", with "people on the move" understood both spatially (migration and flight) and in terms of economic and social change (changing working conditions, access conditions). From different perspectives, the authors discuss phenomena such as disinformation and algorithmic bias, as well as the possibilities, limits, and dangers of generative artificial intelligence.
- ResumeTailor: Improving Resume Quality Through Co-Creative Tools (IOS Press, 2023)
  Delobelle, Pieter; Wang, Sonja Mei; Berendt, Bettina; Lukowicz, Paul; Mayer, Sven; Koch, Janin; Shawe-Taylor, John; Tiddi, Ilaria
  Clear and well-written resumes can help jobseekers find better and better-suited jobs. However, many people struggle with writing their resumes, especially if they have just entered the job market. Although many tools have been created to help write resumes, an analysis we conducted showed that these tools focus mainly on layout and offer only very limited content-related support. This paper presents a co-creative resume-building tool that provides tailored advice to jobseekers. It is based on a comprehensive computational analysis of 444k resumes and the development of a Dutch language model, ResumeRobBERT, to provide contextual suggestions. Through the analysis of the resumes, we found that some expected sections, such as language proficiency, are often missing entirely, while conversely some resumes contain unexpected content, such as negative personality traits. This implies that jobseekers could benefit from more guidance when writing resumes. We aim to support them in the resume-writing process through our tool ResumeTailor, a co-creative resume-building tool that gives textual suggestions and provides a template for important resume sections.
- Silencing the Risk, Not the Whistle: A Semi-automated Text Sanitization Tool for Mitigating the Risk of Whistleblower Re-Identification (ACM, 2024)
  Staufer, Dimitri; Pallas, Frank; Berendt, Bettina
  Whistleblowing is essential for ensuring transparency and accountability in both public and private sectors. However, (potential) whistleblowers often fear or face retaliation, even when reporting anonymously. The specific content of their disclosures and their distinct writing style may re-identify them as the source. Legal measures, such as the EU Whistleblower Directive, are limited in their scope and effectiveness. Therefore, computational methods to prevent re-identification are important complementary tools for encouraging whistleblowers to come forward. However, current text sanitization tools follow a one-size-fits-all approach and take an overly limited view of anonymity. They aim to mitigate identification risk by replacing typical high-risk words (such as person names and other labels of named entities) and combinations thereof with placeholders. Such an approach, however, is inadequate for the whistleblowing scenario since it neglects further re-identification potential in textual features, including the whistleblower's writing style. Therefore, we propose, implement, and evaluate a novel classification and mitigation strategy for rewriting texts that involves the whistleblower in the assessment of the risk and utility. Our prototypical tool semi-automatically evaluates risk at the word/term level and applies risk-adapted anonymization techniques to produce a grammatically disjointed yet appropriately sanitized text. We then use a Large Language Model (LLM) that we fine-tuned for paraphrasing to render this text coherent and style-neutral.
  We evaluate our tool's effectiveness using court cases from the European Court of Human Rights (ECHR) and excerpts from a real-world whistleblower testimony, and measure the protection against authorship attribution attacks and utility loss statistically using the popular IMDb62 movie reviews dataset, which consists of texts by 62 individuals. Our method can significantly reduce authorship attribution accuracy from 98.81% to 31.22%, while preserving up to 73.1% of the original content's semantics, as measured by the established cosine similarity of sentence embeddings.
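The utility metric mentioned at the end of this abstract, cosine similarity of sentence embeddings, can be sketched as follows. The vectors below are tiny stand-ins; the paper would use embeddings from a real sentence-embedding model.

```python
# Illustrative utility check: cosine similarity between sentence
# embeddings of the original and the sanitized text. The embeddings
# here are invented stand-in vectors, not output of a real model.
import numpy as np

def cosine_similarity(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

original = np.array([0.2, 0.9, 0.4])    # stand-in embedding, original text
sanitized = np.array([0.25, 0.8, 0.5])  # stand-in embedding, sanitized text

print(round(cosine_similarity(original, sanitized), 3))
```

A similarity close to 1 means the sanitized text preserves most of the original's semantics, which is the trade-off the paper quantifies against authorship attribution accuracy.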
- The AI Act Proposal: Towards the next transparency fallacy? Why AI regulation should be based on principles based on how algorithmic discrimination works (Mohr Siebeck, 2022)
  Berendt, Bettina; Bundesministerium für Umwelt, Naturschutz, nukleare Sicherheit und Verbraucherschutz; Rostalski, Frauke
  Artificial Intelligence (AI) can entail large benefits as well as risks. To achieve the goals of protecting individuals and society and of establishing conditions under which citizens find AI "trustworthy" and developers and vendors can produce and sell AI, the ways in which AI works have to be understood better, and rules have to be established and enforced to mitigate the risks. This task can only be undertaken in collaboration. Computer scientists are called upon to align data, algorithms, procedures and larger designs with values, 'ethics' and laws. Social scientists are called upon to describe and analyse the plethora of interdependent effects and causes in socio-technical systems involving AI. Philosophers are expected to explain values and ethics. And legal experts and scholars as well as politicians are expected to create the social rules and institutions that support beneficial uses of AI and avoid harmful ones. This article starts from a computers-and-society perspective and focuses on the action space of lawmaking. It suggests an approach to AI regulation that starts from a critique of the European Union's (EU) proposal for a Regulation commonly known as the AI Act Proposal, published by the EU Commission on 21 April 2021.
- Tik-to-Tok: Translating Language Models One Token at a Time: An Embedding Initialization Strategy for Efficient Language Adaptation (2023)
  Remy, François; Delobelle, Pieter; Berendt, Bettina; Demuynck, Kris; Demeester, Thomas
  Training monolingual language models for low- and mid-resource languages is made challenging by limited and often inadequate pretraining data. In this study, we propose a novel model conversion strategy to address this issue, adapting high-resource monolingual language models to a new target language. By generalizing over a word translation dictionary encompassing both the source and target languages, we map tokens from the target tokenizer to semantically similar tokens from the source language tokenizer. This one-to-many token mapping greatly improves the initialization of the embedding table for the target language. We conduct experiments to convert high-resource models to mid- and low-resource languages, namely Dutch and Frisian. These converted models achieve new state-of-the-art performance on these languages across a wide range of downstream tasks. By significantly reducing the amount of data and time required for training state-of-the-art models, our novel model conversion strategy has the potential to benefit many languages worldwide.
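The one-to-many token mapping described in this abstract can be sketched minimally: each target-language token is initialized with the average of the source-language embeddings of its dictionary translations. The tokens, vectors, and the `token_map` dictionary below are invented for illustration; the paper's actual mapping is derived from a bilingual word translation dictionary and real tokenizers.

```python
# Minimal sketch of one-to-many embedding initialization: a target
# token's embedding starts as the mean of the embeddings of the source
# tokens it maps to. All names and vectors here are toy examples.
import numpy as np

source_embeddings = {
    "cat": np.array([1.0, 0.0]),
    "kitten": np.array([0.8, 0.2]),
}
# Hypothetical target-language token mapped to similar source tokens.
token_map = {"kat": ["cat", "kitten"]}

def init_target_embedding(target_token):
    vectors = [source_embeddings[t] for t in token_map[target_token]]
    return np.mean(vectors, axis=0)  # average of mapped source embeddings

print(init_target_embedding("kat"))  # midpoint of "cat" and "kitten"
```

Starting from such averaged vectors rather than random initialization is what lets the converted model reach strong performance with far less target-language training data.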