Pustulka, Elzbieta
Last name: Pustulka
First name: Elzbieta
Search results (4)
- FLIE: form labeling for information extraction (2021)
  Pustulka, Elzbieta; Hanne, Thomas; Gachnang, Phillip; Biafora, Pasquale; Arai, Kohei; Kapoor, Supriya; Bhatia, Rahul [in: Proceedings of the Future Technologies Conference (FTC) 2020]
  Information extraction (IE) from forms remains an unsolved problem, with some exceptions, like bills. Forms are complex and the templates are often unstable, due to the injection of advertising, extra conditions, or document merging. Our scenario deals with insurance forms used by brokers in Switzerland. Here, each combination of insurer, insurance type and language results in a new document layout, leading to a few hundred document types. To help brokers extract data from policies, we developed a new labeling method, called FLIE (form labeling for information extraction). FLIE first assigns a document to a cluster, grouping by language, insurer, and insurance type. It then labels the layout. To produce training data, the user annotates a sample document by hand, adding attribute names, i.e., providing a mapping. FLIE applies machine learning to propagate the mapping and extracts information. Our results are based on 24 Swiss policies in German: UVG (mandatory accident insurance), KTG (sick pay insurance), and UVGZ (optional accident insurance). Our solution has an accuracy of around 84-89%. It is currently being extended to other policy types and languages.
  Type: 04B - Conference paper
- A logistics serious game (2021)
  Pustulka, Elzbieta; Güler, Attila; Hanne, Thomas [in: GSGS'21. 6th International Conference on Gamification & Serious Game]
  Switzerland is a logistics hub which needs many trained professionals. As logistics does not have a strong public image, the profession does not attract enough young people. A logistics game could help recruit more candidates at the apprenticeship and university level and help in teaching. We have prototyped a logistics game and found out that it raises interest in logistics and successfully teaches about cargo ships. The game test showed that the game is visually appealing but the competitive aspect may interfere with learning.
  Type: 04B - Conference paper
- FLIE with rules (2021)
  Pustulka, Elzbieta; Hanne, Thomas; de Espona, Lucía
  FLIE (Form Labelling for Information Extraction) allows us to extract information from Swiss insurance policies. Insurance policies are forms which are weakly aligned and do not lend themselves to automated data extraction without preprocessing. Our preprocessing annotates data with geometry and, combined with manual training data generation, gives an extraction accuracy of over 80% for a subset of attributes which have been seen 8 times or more. In this paper we extend FLIE with rules. The aim is to compare machine learning used in FLIE to the standard industry approach of using rules to extract data. We hand-crafted rules (regular expressions in Python) for the KTG insurance (27 rules), UVG insurance (29 rules), and UVG-Z (23 rules), for each insurance type covering around 20 attributes. We also generated rules for building insurance policies, which were new to us (16 rules encoded in SpaCy). In all cases we saw that using rules alone gives us a similar accuracy in data extraction to machine learning (around 80%). In the case of building insurance the accuracy is higher, above 96%, with precision and recall around 89-92%. To support annotation and experimental evaluation, we created an annotation GUI and a GUI which automates the ML experiment. Planned work includes a comparison of rule-based and ML approaches and extension to further policy types.
  Type: 06 - Presentation
- Text mining innovation for business (Springer, 2020)
  Pustulka, Elzbieta; Hanne, Thomas; Dornberger, Rolf [in: New trends in business information systems and technology. Digital innovation and digital business Transformation]
  This chapter reflects on the business innovation supported by developing text-mining solutions to meet the business needs communicated by Swiss companies. Two related projects from different industries and with different challenges are discussed in order to identify common procedures and methodologies that can be used. One of the partners, in the gig work sector, offers a platform solution for employee recruitment for temporary work. The work assessment is performed using short reviews, for which a method for sentiment assessment based on machine learning has been developed. The other partner, in the financial advice sector, operates an information extraction service for business documents, including insurance policies. This requires automation in the extraction of structured information from PDF files. The common path to innovation in such projects includes business process modeling and the implementation of novel technological solutions, including text-mining techniques.
  Type: 04A - Contribution to edited volume