FLIE with rules
Publication date
2021
Type
06 - Presentation
Abstract
FLIE (Form Labelling for Information Extraction) extracts information from Swiss insurance policies. Insurance policies are weakly aligned forms that do not lend themselves to automated data extraction without preprocessing. Our preprocessing annotates the data with geometry and, combined with manually generated training data, yields an extraction accuracy of over 80% for the subset of attributes that have been seen 8 times or more. In this paper we extend FLIE with rules. The aim is to compare the machine learning used in FLIE with the standard industry approach of extracting data with rules. We hand-crafted rules (regular expressions in Python) for KTG insurance (27 rules), UVG insurance (29 rules), and UVG-Z insurance (23 rules), covering around 20 attributes per insurance type. We also generated rules for building insurance policies, a domain that was new to us (16 rules encoded in spaCy). In all cases, rules alone gave a data extraction accuracy similar to that of machine learning (around 80%). For building insurance the accuracy is higher, above 96%, with precision and recall of around 89-92%. To support annotation and experimental evaluation, we created an annotation GUI and a GUI that automates the ML experiment. Planned work includes a comparison of rule-based and ML approaches and an extension to further policy types.
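As an illustration of the rule-based approach described in the abstract, the following is a minimal sketch of extracting attributes with Python regular expressions and scoring the result against a hand-labelled reference. The attribute names, patterns, sample text, and helper functions are hypothetical and are not taken from the FLIE rule sets; the scoring is a simple per-attribute exact match, which may differ from the evaluation used in the work itself.

```python
import re

# Hypothetical rules: attribute name -> pattern with exactly one capture group.
# These are illustrative only and are NOT the rules used in FLIE.
RULES = {
    "policy_number": re.compile(
        r"Policy\s*(?:No\.?|number)\s*:?\s*([A-Z0-9][A-Z0-9.\-/]*)", re.IGNORECASE
    ),
    "start_date": re.compile(
        r"Start\s+date\s*:?\s*(\d{2}\.\d{2}\.\d{4})", re.IGNORECASE
    ),
    "annual_premium": re.compile(
        r"Annual\s+premium\s*:?\s*(?:CHF\s*)?([\d']+(?:\.\d{2})?)", re.IGNORECASE
    ),
}


def extract(text):
    """Apply every rule to the text and return the first match per attribute."""
    found = {}
    for name, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            found[name] = match.group(1)
    return found


def precision_recall(predicted, gold):
    """Score extracted attributes against a gold standard by exact value match."""
    true_positives = sum(1 for k, v in predicted.items() if gold.get(k) == v)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical policy text and gold annotations for demonstration.
sample = "Policy No.: 12.345.678\nStart date: 01.01.2021\nAnnual premium: CHF 1'250.00"
gold = {
    "policy_number": "12.345.678",
    "start_date": "01.01.2021",
    "annual_premium": "1'250.00",
    "insured_company": "Example AG",  # no rule covers this attribute
}

predicted = extract(sample)
precision, recall = precision_recall(predicted, gold)
```

In this sketch every extracted value is correct, but one gold attribute has no matching rule, so recall falls below precision. This is the typical trade-off with hand-written rules: each one is precise, and coverage grows only as rules are added.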
Event
SwissText 2021
Conference start date
14.06.2021
Conference end date
16.06.2021
Language
English
Created during FHNW affiliation
Yes
Review
Peer review of the abstract
Citation
Pustulka, E., Hanne, T., & de Espona, L. (2021). FLIE with rules. SwissText 2021. https://irf.fhnw.ch/handle/11654/43051