Pustulka, Elzbieta

Search results

Showing 1 - 6 of 6

FLIE: form labeling for information extraction
(2021) Pustulka, Elzbieta; Hanne, Thomas; Gachnang, Phillip; Biafora, Pasquale; Arai, Kohei; Kapoor, Supriya; Bhatia, Rahul [in: Proceedings of the Future Technologies Conference (FTC) 2020]
Information extraction (IE) from forms remains an unsolved problem, with some exceptions, like bills. Forms are complex and the templates are often unstable, due to the injection of advertising, extra conditions, or document merging. Our scenario deals with insurance forms used by brokers in Switzerland. Here, each combination of insurer, insurance type and language results in a new document layout, leading to a few hundred document types. To help brokers extract data from policies, we developed a new labeling method, called FLIE (form labeling for information extraction). FLIE first assigns a document to a cluster, grouping by language, insurer, and insurance type. It then labels the layout. To produce training data, the user annotates a sample document by hand, adding attribute names, and thereby provides a mapping. FLIE applies machine learning to propagate the mapping and extract information. Our results are based on 24 Swiss policies in German: UVG (mandatory accident insurance), KTG (sick pay insurance), and UVGZ (optional accident insurance). Our solution has an accuracy of around 84-89%. It is currently being extended to other policy types and languages.
04B - Conference paper
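The first FLIE step named in the abstract above, grouping documents by language, insurer, and insurance type, can be illustrated with a minimal sketch. The `Policy` record, its field names, and the sample values are hypothetical; the abstract does not say how FLIE determines these three properties, so the sketch simply assumes they are already known for each document.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    """Hypothetical document record; the field names are illustrative only."""
    doc_id: str
    language: str
    insurer: str
    insurance_type: str


def cluster_key(policy):
    # One cluster per (language, insurer, insurance type) combination,
    # mirroring the grouping criteria named in the abstract.
    return (policy.language, policy.insurer, policy.insurance_type)


def group_by_layout(policies):
    """Map each cluster key to the ids of the documents that share it."""
    clusters = defaultdict(list)
    for policy in policies:
        clusters[cluster_key(policy)].append(policy.doc_id)
    return dict(clusters)


# Made-up sample data, not taken from the paper.
policies = [
    Policy("p1", "de", "InsurerA", "UVG"),
    Policy("p2", "de", "InsurerA", "KTG"),
    Policy("p3", "de", "InsurerA", "UVG"),
]
clusters = group_by_layout(policies)
```

Under these assumptions, documents in the same cluster are expected to share a layout, so one hand-annotated sample per cluster can serve as the mapping to propagate.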
FLIE with rules
(2021) Pustulka, Elzbieta; Hanne, Thomas; de Espona, Lucía
FLIE (Form Labelling for Information Extraction) allows us to extract information from Swiss insurance policies. Insurance policies are forms which are weakly aligned and do not lend themselves to automated data extraction without preprocessing. Our preprocessing annotates data with geometry and, combined with manual training data generation, gives an extraction accuracy of over 80% for the subset of attributes which have been seen 8 times or more. In this paper we extend FLIE with rules. The aim is to compare the machine learning used in FLIE to the standard industry approach of using rules to extract data. We hand-crafted rules (regular expressions in Python) for KTG insurance (27 rules), UVG insurance (29 rules), and UVG-Z insurance (23 rules), covering around 20 attributes for each insurance type. We also wrote rules for building insurance policies, a policy type that was new to us (16 rules encoded in SpaCy). In all cases we saw that using rules alone gives a data extraction accuracy similar to that of machine learning (around 80%). In the case of building insurance the accuracy is higher, above 96%, with precision and recall around 89-92%. To support annotation and experimental evaluation, we created an annotation GUI and a GUI which automates the ML experiment. Planned work includes a comparison of rule-based and ML approaches and extension to further policy types.
06 - Presentation
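The rule-based approach in the abstract above (regular expressions in Python, one rule per attribute) might look roughly like the following sketch. The attribute names, the patterns, and the sample text are all hypothetical and are not taken from the authors' rule sets.

```python
import re

# Hypothetical rules: attribute name -> pattern with a single capture group.
RULES = {
    "policy_number": re.compile(r"Police\s*Nr\.?\s*:?\s*([\d.]+)"),
    "start_date": re.compile(r"Beginn\s*:?\s*(\d{2}\.\d{2}\.\d{4})"),
    "payroll_chf": re.compile(r"Lohnsumme\s*:?\s*CHF\s*([\d']+\.\d{2})"),
}


def extract(text):
    """Apply every rule to the text and keep the first match per attribute."""
    result = {}
    for attribute, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            result[attribute] = match.group(1)
    return result


# Made-up policy text for illustration only.
SAMPLE = "Police Nr.: 12.345.678\nBeginn: 01.01.2021\nLohnsumme: CHF 1'250'000.00\n"
extracted = extract(SAMPLE)
```

Attributes whose rule does not match are simply absent from the result, which makes it straightforward to count misses when comparing against a machine learning baseline.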
Text mining innovation for business
(Springer, 2020) Pustulka, Elzbieta; Hanne, Thomas; Dornberger, Rolf [in: New trends in business information systems and technology. Digital innovation and digital business Transformation]
This chapter reflects on the business innovation supported by developing text-mining solutions to meet the business needs communicated by Swiss companies. Two related projects from different industries and with different challenges are discussed in order to identify common procedures and methodologies that can be used. One of the partners, in the gig work sector, offers a platform solution for employee recruitment for temporary work. Work is assessed through short reviews, for which a machine-learning method for sentiment assessment has been developed. The other partner, in the financial advice sector, operates an information extraction service for business documents, including insurance policies. This requires automation in the extraction of structured information from PDF files. The common path to innovation in such projects includes business process modeling and the implementation of novel technological solutions, including text-mining techniques.
04A - Book chapter
Sentiment analysis for a Swiss gig platform company
(2019) Pustulka, Elzbieta; Hanne, Thomas
We work with a Swiss gig platform company to identify innovative solutions which could strengthen its position as a market leader in Switzerland and Europe. The company mediates between employers and employees in short-term work contracts via a platform system. We first looked at the business processes and saw that some process parts were not being controlled by the company, which is now being remedied. Second, we analyzed the job reviews which the employers and employees write, and implemented a prototype which can detect negative statements automatically, even if the review is positive overall. We worked with a dataset of 963 job reviews from employers and employees, in German, French and English. The reviews have a star rating (1 to 4 stars), with some discrepancies between the star rating and the text. We scored the reviews manually as negative or other, as negative reviews are important for business improvement. We tested several machine learning methods and a hybrid method from Lexalytics.
06 - Presentation
A game teaching population based optimization using teaching-learning-based optimization
(2019) Pustulka, Elzbieta; Hanne, Thomas; Wetzel, Richard; Kaba, Egemen; Adriaensen, Benjamin; Eggenschwiler, Stefan [in: GSGS'19. 4th Gamification & Serious Game Symposium]
We want to lower the entry barrier to optimization courses. To that aim, we deployed a game prototype and tested it with students who had no previous optimization experience. We found that the prototype led to increased student motivation, an intuitive understanding of the principles of optimization, and strong interaction within the team. We will build on this experience to develop further games for classroom use.
04B - Conference paper
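For readers unfamiliar with teaching-learning-based optimization (TLBO), the algorithm named in the title above, a minimal sketch follows. It is a textbook-style variant with a teacher phase and a learner phase, not the game or the implementation from the paper; the sphere objective, the parameter values, and all function names are assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Toy objective to minimize: sum of squares, with its minimum at the origin."""
    return sum(v * v for v in x)


def clip(x, lo, hi):
    return [min(max(v, lo), hi) for v in x]


def tlbo(objective, dim=2, pop_size=10, iterations=30, lo=-5.0, hi=5.0, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(pop_size)]
    history = [min(objective(x) for x in pop)]
    for _ in range(iterations):
        # Teacher phase: move each learner toward the best solution and
        # away from the class mean, scaled by a teaching factor of 1 or 2.
        teacher = min(pop, key=objective)
        mean = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        for i, learner in enumerate(pop):
            tf = rng.randint(1, 2)
            candidate = clip(
                [learner[d] + rng.random() * (teacher[d] - tf * mean[d])
                 for d in range(dim)],
                lo, hi,
            )
            if objective(candidate) < objective(learner):
                pop[i] = candidate  # greedy acceptance
        # Learner phase: each learner interacts with one random peer and
        # moves toward the peer if it is better, away from it otherwise.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            a, b = pop[i], pop[j]
            sign = 1.0 if objective(a) < objective(b) else -1.0
            candidate = clip(
                [a[d] + sign * rng.random() * (a[d] - b[d]) for d in range(dim)],
                lo, hi,
            )
            if objective(candidate) < objective(a):
                pop[i] = candidate
        history.append(min(objective(x) for x in pop))
    return min(pop, key=objective), history


best, history = tlbo(sphere)
```

Because a candidate only replaces a learner when it improves the objective, the best value recorded in `history` never gets worse from one iteration to the next.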
An experiment with an optimization game
(2019) Pustulka, Elzbieta; Hanne, Thomas; Adriaensen, Benjamin; Eggenschwiler, Stefan; Kaba, Egemen; Wetzel, Richard; Blashki, Katherine; Xiao, Yingcai [in: IADIS International Conference Interfaces and Human Computer Interaction 2019 (part of MCCSIS 2019)]
We aim to improve the teaching of the principles of optimization, including computational intelligence (CI), to a mixed audience of business and computer science students. Our students do not always have sufficient programming or mathematics experience and may be put off by the expected difficulty of the course. In this context we are testing the potential of games in teaching. We deployed a game prototype (design probe) and found out that the prototype led to increased student motivation, intuitive understanding of the principles of optimization, and strong interaction in a team. Ultimately, with the future work we sketch out, this novel approach could improve the learning and understanding of optimization algorithms and CI in general, contributing to the future of Explainable AI (XAI).
04B - Conference paper