Enhancing language models with boosting and targeted fine-tuning for real-word error detection

Masanti, Corina; Witschel, Hans Friedrich; Riesen, Kaspar

dc.contributor.author	Masanti, Corina
dc.contributor.author	Witschel, Hans Friedrich
dc.contributor.author	Riesen, Kaspar
dc.date.accessioned	2026-06-16T06:52:44Z
dc.date.issued	2026
dc.description.abstract	We propose a boosting-based approach to enhance language models of diverse architectures with the goal of detecting real-word errors in documents. We thoroughly evaluate the benefits and limitations of our novel framework through experiments on a large real-world data set. Based on a thorough error analysis, we generate additional targeted training data to address identified weaknesses and apply targeted fine-tuning to further improve model performance. In recent years, extensive research has led to significant advances in tools for the automatic detection and correction of errors in documents. Despite this progress, several challenges remain unresolved. In particular, identifying real-word errors (errors involving words that are grammatically valid but contextually inappropriate within a given sentence) remains difficult. Addressing such errors requires models with a sophisticated understanding of linguistic context. Transformer-based language models are particularly well suited to this task because of their contextual modeling capabilities. To further enhance their performance, we propose a boosting-based training approach in conjunction with a synthetically generated data set created via pattern-based noise injection. We evaluate this method across three transformer-based architectures, namely mBERT, LLaMA 3, and Mistral 7B. Our experimental results show that the boosting-based strategy consistently improves real-word error detection across all models. A subsequent in-depth error analysis reveals limitations in the synthetic training data, prompting the development of a targeted fine-tuning procedure designed to address these shortcomings and further optimize model performance. A comparison with prompt-based inference using a large language model demonstrates that specialized, fine-tuned models yield more reliable performance for this task. Finally, an evaluation under realistic class imbalance highlights practical trade-offs between ranking quality and threshold-based detection, particularly for rare error types.
dc.identifier.doi	10.1016/j.nlp.2026.100202
dc.identifier.issn	2949-7191
dc.identifier.uri	https://irf.fhnw.ch/handle/11645/57047
dc.identifier.uri	https://doi.org/10.26041/fhnw-16512
dc.language.iso	en
dc.publisher	Elsevier
dc.relation.ispartof	Natural Language Processing Journal
dc.rights.uri	https://creativecommons.org/licenses/by/4.0/
dc.subject.ddc	004 - Computer science, Internet
dc.title	Enhancing language models with boosting and targeted fine-tuning for real-word error detection
dc.type	01A - Article in scientific journal
dc.volume	14
dspace.entity.type	Publication
fhnw.InventedHere	Yes
fhnw.ReviewType	peer-reviewed
fhnw.oastatus.aurora	Version: Published * Embargo: None * Licence: CC BY *** URL: https://v2.sherpa.ac.uk/id/publication/46861
fhnw.openAccessCategory	Gold
fhnw.pagination	100202-100202
fhnw.publicationState	Published
fhnw.targetcollection	d40e4c67-dd87-4d14-8518-b2f0a855e750
relation.isAuthorOfPublication	4f94a17c-9d05-433c-882f-68f062e0e6ae
relation.isAuthorOfPublication.latestForDiscovery	4f94a17c-9d05-433c-882f-68f062e0e6ae

Files

Original bundle

Name: 1-s2.0-S2949719126000063-main.pdf
Size: 4.24 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.66 KB
Format: Item-specific license agreed upon to submission

Collection

Institut für Wirtschaftsinformatik