FLIE with rules

dc.contributor.author: Pustulka, Elzbieta
dc.contributor.author: Hanne, Thomas
dc.contributor.author: de Espona, Lucía
dc.date.accessioned: 2024-04-29T09:24:50Z
dc.date.available: 2024-04-29T09:24:50Z
dc.date.issued: 2021
dc.description.abstract: FLIE (Form Labelling for Information Extraction) allows us to extract information from Swiss insurance policies. Insurance policies are forms which are weakly aligned and do not lend themselves to automated data extraction without preprocessing. Our preprocessing annotates data with geometry and, combined with manual training data generation, yields an extraction accuracy of over 80% for a subset of attributes that have been seen 8 times or more. In this paper we extend FLIE with rules. The aim is to compare the machine learning used in FLIE to the standard industry approach of using rules to extract data. We hand-crafted rules (regular expressions in Python) for KTG insurance (27 rules), UVG insurance (29 rules), and UVG-Z insurance (23 rules), with each insurance type covering around 20 attributes. We also generated rules for building insurance policies, a domain that was new to us (16 rules encoded in spaCy). In all cases, using rules alone gives a data extraction accuracy similar to that of machine learning (around 80%). In the case of building insurance the accuracy is higher, above 96%, with precision and recall around 89-92%. To support annotation and experimental evaluation, we created an annotation GUI and a GUI that automates the ML experiment. Planned work includes a comparison of rule-based and ML approaches and an extension to further policy types.
dc.event: SwissText 2021
dc.event.end: 2021-06-16
dc.event.start: 2021-06-14
dc.identifier.uri: https://irf.fhnw.ch/handle/11654/43051
dc.language.iso: en
dc.spatial: Online
dc.subject.ddc: 330 - Economics
dc.title: FLIE with rules
dc.type: 06 - Presentation
dspace.entity.type: Publication
fhnw.InventedHere: Yes
fhnw.ReviewType: Anonymous ex ante peer review of an abstract
fhnw.affiliation.hochschule: Hochschule für Wirtschaft (de_CH)
fhnw.affiliation.institut: Institut für Wirtschaftsinformatik (de_CH)
relation.isAuthorOfPublication: 3e7f2a0a-692e-4652-b305-7a7e19e011de
relation.isAuthorOfPublication: 35d8348b-4dae-448a-af2a-4c5a4504da04
relation.isAuthorOfPublication.latestForDiscovery: 3e7f2a0a-692e-4652-b305-7a7e19e011de