Information Extraction from Building Insurance Policies

Publication date
2020
Type of student thesis
Master
Type
11 - Student thesis
Publisher / Publishing institution
Hochschule für Wirtschaft FHNW
Place of publication / Event location
Olten
Abstract
Extracting information from PDF documents such as Swiss insurance policies is a challenge for which no well-established solution yet exists. An ongoing FHNW project has made first attempts at extracting information from insurance policies using a machine-learning-based approach. However, the solution is not yet market-ready because its accuracy is insufficient. The goal of this master's thesis was to manually explore what kind of information a building insurance policy contains and to develop a rule-based approach for annotating building insurance policies. We received a database containing the bounding boxes of about 3,000 scanned documents that contain the term “Gebäudeversicherung” (building insurance). We programmed an algorithm that automatically annotates OCR-processed building insurance policies issued by the company Mobiliar. The algorithm returns useful output, with most of the text boxes labelled correctly. These results can be used to further develop the machine-learning-based approach of the FHNW project.
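The rule-based annotation of OCR text boxes described in the abstract can be illustrated with a minimal sketch. It is not the thesis's actual algorithm: the `TextBox` fields, the label names, and the regular expressions below are all hypothetical and chosen only to show the general idea of assigning a label to each text box by the first matching rule.

```python
import re
from dataclasses import dataclass


@dataclass
class TextBox:
    """One OCR text box; the field names are assumptions for this sketch."""
    text: str
    x: float
    y: float
    width: float
    height: float


# Hypothetical rules: (label, pattern). The first matching rule wins.
RULES = [
    ("insurance_type", re.compile(r"Gebäudeversicherung", re.IGNORECASE)),
    ("policy_number", re.compile(r"\bPolicen?[\s-]*Nr\.?\s*:?\s*\d", re.IGNORECASE)),
    ("amount_chf", re.compile(r"\bCHF\s*\d")),
    ("date", re.compile(r"\b\d{1,2}\.\d{1,2}\.\d{4}\b")),
]


def annotate(box: TextBox) -> str:
    """Return the label of the first rule whose pattern occurs in the box text."""
    for label, pattern in RULES:
        if pattern.search(box.text):
            return label
    return "other"


def annotate_all(boxes: list[TextBox]) -> list[tuple[str, TextBox]]:
    """Pair every text box with its label, keeping the input order."""
    return [(annotate(box), box) for box in boxes]


# Invented sample boxes, not taken from any real policy.
sample_boxes = [
    TextBox("Gebäudeversicherung", 50, 40, 300, 20),
    TextBox("Police Nr. 12345", 50, 70, 200, 15),
    TextBox("Prämie CHF 1'250.00", 50, 100, 220, 15),
    TextBox("gültig ab 01.01.2020", 50, 130, 220, 15),
    TextBox("Sehr geehrte Damen und Herren", 50, 160, 320, 15),
]
sample_labels = [label for label, _ in annotate_all(sample_boxes)]
```

In this sketch the bounding-box coordinates are carried along but not used by the rules; a position-aware rule (for example, one that only accepts a policy number near the top of the page) would be a natural extension under the same assumptions.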
Subject (DDC)
330 - Economics
Language
English
Created during FHNW affiliation
Yes
Citation
SCHMID, Celia Lorena, 2020. Information Extraction from Building Insurance Policies. Olten: Hochschule für Wirtschaft FHNW. Available at: https://irf.fhnw.ch/handle/11654/40412