Browsing by author "Biafora, Pasquale"
Now showing 1 - 2 of 2
Publication
FLIE: form labeling for information extraction (2021)
Pustulka, Elzbieta; Hanne, Thomas; Gachnang, Phillip; Biafora, Pasquale; Arai, Kohei; Kapoor, Supriya; Bhatia, Rahul
Information extraction (IE) from forms remains an unsolved problem, with some exceptions, like bills. Forms are complex and the templates are often unstable, due to the injection of advertising, extra conditions, or document merging. Our scenario deals with insurance forms used by brokers in Switzerland. Here, each combination of insurer, insurance type and language results in a new document layout, leading to a few hundred document types. To help brokers extract data from policies, we developed a new labeling method, called FLIE (form labeling for information extraction). FLIE first assigns a document to a cluster, grouping by language, insurer, and insurance type. It then labels the layout. To produce training data, the user annotates a sample document by hand, adding attribute names, i.e. provides a mapping. FLIE applies machine learning to propagate the mapping and extracts information. Our results are based on 24 Swiss policies in German: UVG (mandatory accident insurance), KTG (sick pay insurance), and UVGZ (optional accident insurance). Our solution has an accuracy of around 84-89%. It is currently being extended to other policy types and languages.
04B - Conference paper

Publication
Structured Information Extraction from Unstructured Documents (Hochschule für Wirtschaft FHNW, 2019)
Biafora, Pasquale; Pustulka, Elzbieta
The constant development and use of information technology to digitize and digitalize business processes leads to an increasing amount of data available in various formats. This data comes in two main forms, namely structured and unstructured. Nowadays, around 80% of data in organisations is unstructured (Grimes, 2008). Insurance policy documents are a good example of this kind of data, with a lot of text and domain-specific language.
It is difficult to leverage this data, as there is no clear structure and the massive amount of data makes it too time-consuming to analyse manually. Over the past years, the need to handle unstructured data has arisen, which is why many researchers are trying to tackle this challenge.
Currently, it is not possible to automatically extract structured information from insurance policy documents. Insurance brokers analyse every policy by hand and search for relevant terms. This information is then used to compare different insurance quotes and to offer the best combination to the customer. Instead of spending their time in consultancy and helping their clients find the best insurance solution, the brokers lose a lot of time in extracting relevant information....
11 - Student thesis