Detecting Hidden Backdoors in Large Language Models

dc.contributor.author: Peechat, Jibin Mathew
dc.contributor.mentor: Christen, Patrik
dc.contributor.partner: Hochschule für Wirtschaft FHNW, Institut für Wirtschaftsinformatik, Basel
dc.date.accessioned: 2025-12-15T13:32:37Z
dc.date.issued: 2025
dc.description.abstract: The adoption of LLMs in critical systems raises concerns about privacy, security, and trust, particularly regarding the risk of hidden backdoors: malicious triggers that cause covert data transfer or altered behaviour. Such threats are difficult to detect, especially when models are black-box systems with concealed triggers. This thesis addresses the issue by empirically evaluating the DeepSeek-R1 model family for potential backdoors during local execution. The aim is to identify anomalies and recommend safer deployment practices.
dc.identifier.uri: https://irf.fhnw.ch/handle/11654/54704
dc.language.iso: en
dc.publisher: Hochschule für Wirtschaft FHNW
dc.spatial: Brugg-Windisch
dc.subject.ddc: 330 - Economics
dc.title: Detecting Hidden Backdoors in Large Language Models
dc.type: 11 - Student thesis
dspace.entity.type: Publication
fhnw.InventedHere: Yes
fhnw.StudentsWorkType: Bachelor
fhnw.affiliation.hochschule: Hochschule für Wirtschaft FHNW
fhnw.affiliation.institut: Bachelor of Science
relation.isMentorOfPublication: d6fa5f05-5123-4d2f-8e74-79adfe54acc7
relation.isMentorOfPublication.latestForDiscovery: d6fa5f05-5123-4d2f-8e74-79adfe54acc7