Enhancing language models with boosting and targeted fine-tuning for real-word error detection

Masanti, Corina; Witschel, Hans Friedrich; Riesen, Kaspar

Files

1-s2.0-S2949719126000063-main.pdf (4.24 MB)

Authors

Masanti, Corina

Witschel, Hans Friedrich

Riesen, Kaspar

Publication date

2026

Collection

Institut für Wirtschaftsinformatik

Type

01A - Article in a scientific journal

Parent work

Natural Language Processing Journal

DOI of the original publication

https://doi.org/10.1016/j.nlp.2026.100202

URI

https://irf.fhnw.ch/handle/11645/57047
https://doi.org/10.26041/fhnw-16512

Volume

14

Pages / Duration

100202-100202

Publisher / Publishing institution

Elsevier

Abstract

• We propose a boosting-based approach to enhance language models of diverse architectures with the goal of detecting real-word errors in documents.
• We thoroughly evaluate the benefits and limitations of our novel framework through experiments on a large real-world data set.
• Based on a thorough error analysis, we generate additional targeted training data to address identified weaknesses and apply targeted fine-tuning to further improve model performance.

Over the past years, extensive research has led to significant advancements in tools for the automatic detection and correction of errors in documents. Despite this progress, several challenges remain unresolved. In particular, the identification of real-word errors (errors involving words that are grammatically valid but contextually inappropriate within a given sentence) continues to pose a considerable difficulty. Addressing such errors requires models with a sophisticated understanding of linguistic context. Transformer-based language models are particularly well-suited for this task due to their contextual modeling capabilities. To further enhance their performance, we propose a boosting-based training approach in conjunction with a synthetically generated data set created via pattern-based noise injection. We evaluate this method across three transformer-based architectures, viz. mBERT, LLaMA 3, and Mistral 7B. Our experimental results show that the boosting-based strategy consistently improves real-word error detection across all models. A subsequent in-depth error analysis reveals limitations in the synthetic training data, prompting the development of a targeted fine-tuning procedure designed to address these shortcomings and further optimize model performance. A comparison with prompt-based inference using a large language model demonstrates that specialized, fine-tuned models yield more reliable performance for this task. Finally, an evaluation under realistic class imbalance highlights practical trade-offs between ranking quality and threshold-based detection, particularly for rare error types.
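
The pattern-based noise injection mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the patterns used, so the confusion sets, the function name `inject_real_word_error`, and the policy of injecting one error per sentence are all assumptions made for illustration and should not be read as the authors' procedure.

```python
import random

# Hypothetical confusion sets: each word maps to valid words it is easily
# mistaken for. These pairs are illustrative assumptions, not the patterns
# used in the paper.
CONFUSION_SETS = {
    "their": ["there"],
    "there": ["their"],
    "form": ["from"],
    "from": ["form"],
    "than": ["then"],
    "then": ["than"],
}


def inject_real_word_error(tokens, rng):
    """Replace one confusable token with a valid but contextually wrong word.

    Returns (noisy_tokens, labels), where labels[i] is 1 at the injected
    error position and 0 elsewhere. If no token is confusable, the sentence
    is returned unchanged with all-zero labels.
    """
    candidates = [i for i, tok in enumerate(tokens) if tok.lower() in CONFUSION_SETS]
    labels = [0] * len(tokens)
    if not candidates:
        return list(tokens), labels
    pos = rng.choice(candidates)
    noisy = list(tokens)
    noisy[pos] = rng.choice(CONFUSION_SETS[tokens[pos].lower()])
    labels[pos] = 1
    return noisy, labels


# Usage: a seeded generator keeps the synthetic data reproducible.
rng = random.Random(0)
noisy, labels = inject_real_word_error("The letter came from their office".split(), rng)
print(noisy, labels)
```

Token-level labels of this kind could serve as training targets for a detection model; how the paper actually pairs the synthetic data with boosting is not described in the abstract.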

Subject area (DDC)

004 - Computer science, Internet

ISSN

2949-7191

Language

English

Created during FHNW affiliation

Yes

Publication status

Published

Peer review

peer-reviewed

Open access status

Gold

Citation

Masanti, C., Witschel, H. F., & Riesen, K. (2026). Enhancing language models with boosting and targeted fine-tuning for real-word error detection. Natural Language Processing Journal, 14, 100202. https://doi.org/10.1016/j.nlp.2026.100202
