Hanne, Thomas
Search results
FLIE: form labeling for information extraction
2021, Pustulka, Elzbieta, Hanne, Thomas, Gachnang, Phillip, Biafora, Pasquale, Arai, Kohei, Kapoor, Supriya, Bhatia, Rahul
Information extraction (IE) from forms remains an unsolved problem, with some exceptions, such as bills. Forms are complex and their templates are often unstable, due to the injection of advertising, extra conditions, or document merging. Our scenario deals with insurance forms used by brokers in Switzerland. Here, each combination of insurer, insurance type and language results in a new document layout, leading to a few hundred document types. To help brokers extract data from policies, we developed a new labeling method, called FLIE (form labeling for information extraction). FLIE first assigns a document to a cluster, grouping by language, insurer, and insurance type. It then labels the layout. To produce training data, the user annotates a sample document by hand with attribute names, thereby providing a mapping. FLIE applies machine learning to propagate the mapping and extract information. Our results are based on 24 Swiss policies in German: UVG (mandatory accident insurance), KTG (sick pay insurance), and UVGZ (optional accident insurance). Our solution has an accuracy of around 84-89%. It is currently being extended to other policy types and languages.
Text mining innovation for business
2020, Pustulka, Elzbieta, Hanne, Thomas, Dornberger, Rolf
This chapter reflects on the business innovation supported by developing text-mining solutions to meet the business needs communicated by Swiss companies. Two related projects from different industries and with different challenges are discussed in order to identify common procedures and methodologies that can be used. One of the partners, in the gig work sector, offers a platform solution for recruiting employees for temporary work. Work is assessed using short reviews, for which a machine-learning method for sentiment assessment has been developed. The other partner, in the financial advice sector, operates an information extraction service for business documents, including insurance policies. This requires automating the extraction of structured information from PDF files. The common path to innovation in such projects includes business process modeling and the implementation of novel technological solutions, including text-mining techniques.
An experiment with an optimization game
2019, Pustulka, Elzbieta, Hanne, Thomas, Adriaensen, Benjamin, Eggenschwiler, Stefan, Kaba, Egemen, Wetzel, Richard, Blashki, Katherine, Xiao, Yingcai
We aim to improve the teaching of the principles of optimization, including computational intelligence (CI), to a mixed audience of business and computer science students. Our students do not always have sufficient programming or mathematics experience and may be put off by the expected difficulty of the course. In this context we are testing the potential of games in teaching. We deployed a game prototype (design probe) and found that it led to increased student motivation, an intuitive understanding of the principles of optimization, and strong interaction within a team. Ultimately, with the future work we sketch out, this novel approach could improve the learning and understanding of optimization algorithms and CI in general, contributing to the future of Explainable AI (XAI).
A logistics serious game
2021, Pustulka, Elzbieta, Güler, Attila, Hanne, Thomas
Switzerland is a logistics hub which needs many trained professionals. As logistics does not have a strong public image, the profession does not attract enough young people. A logistics game could help recruit more candidates at the apprenticeship and university level and help in teaching. We have prototyped a logistics game and found that it raises interest in logistics and successfully teaches about cargo ships. The game test showed that the game is visually appealing but that the competitive aspect may interfere with learning.
Sentiment analysis for a Swiss gig platform company
2019, Pustulka, Elzbieta, Hanne, Thomas
We work with a Swiss gig platform company to identify innovative solutions which could strengthen its position as a market leader in Switzerland and Europe. The company mediates between employers and employees in short-term work contracts via a platform system. We first looked at the business processes and saw that some process parts were not being controlled by the company, which is now being remedied. Second, we analyzed the job reviews which the employers and employees write, and implemented a prototype which can detect negative statements automatically, even if the review is positive overall. We worked with a dataset of 963 job reviews from employers and employees, in German, French and English. The reviews have a star rating (1 to 4 stars), with some discrepancies between the star rating and the text. We scored the reviews manually as negative or other, as negative reviews are important for business improvement. We tested several machine learning methods and a hybrid method from Lexalytics.
Multilingual Sentiment Analysis for a Swiss Gig
2018-08-27, Pustulka, Elzbieta, Hanne, Thomas, Blumer, Eliane, Frieder, Manuel, Wong, Ka Chun
We are developing a multilingual sentiment analysis solution for a Swiss human resource company working in the gig sector. To examine the feasibility of using machine learning in this context, we carried out three sentiment assignment experiments. As test data we use 963 hand-annotated comments made by workers and their employers. Our baseline, machine learning (ML) on Twitter, had an accuracy of 0.77 with a Matthews correlation coefficient (MCC) of 0.32. A hybrid solution, Semantria from Lexalytics, had an accuracy of 0.8 with an MCC of 0.42, while a tenfold cross-validation on the gig data yielded an accuracy of 0.87, an F1 score of 0.91, and an MCC of 0.65. Our solution did not require language assignment or stemming and used standard ML software. This shows that with more training data and some feature engineering, an industrial-strength solution to this problem should be possible.
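The three scores reported in this abstract (accuracy, F1, and MCC) can all be derived from the counts of a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' evaluation code; the function name `binary_metrics` and the counts in the usage line are made up for illustration and are not results from the paper.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, F1 and MCC for a binary confusion matrix.

    tp, fp, fn, tn are counts of true positives, false positives,
    false negatives and true negatives.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total

    # Precision and recall fall back to 0.0 when undefined.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    # Matthews correlation coefficient; 0.0 when the denominator vanishes.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


# Illustrative counts only (hypothetical, not from the paper).
example = binary_metrics(tp=40, fp=10, fn=10, tn=40)
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is commonly reported alongside accuracy when classes are imbalanced.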
FLIE with rules
2021, Pustulka, Elzbieta, Hanne, Thomas, de Espona, Lucía
FLIE (Form Labelling for Information Extraction) allows us to extract information from Swiss insurance policies. Insurance policies are forms which are weakly aligned and do not lend themselves to automated data extraction without preprocessing. Our preprocessing annotates data with geometry and, combined with manual training data generation, gives an extraction accuracy of over 80% for the subset of attributes which have been seen 8 times or more. In this paper we extend FLIE with rules. The aim is to compare machine learning used in FLIE to the standard industry approach of using rules to extract data. We hand-crafted rules (regular expressions in Python) for the KTG insurance (27 rules), UVG insurance (29 rules), and UVG-Z (23 rules), for each insurance type covering around 20 attributes. We also generated rules for building insurance policies, a policy type which was new to us (16 rules encoded in SpaCy). In all cases we saw that using rules alone gives us a similar accuracy in data extraction to machine learning (around 80%). In the case of building insurance the accuracy is higher, above 96%, with precision and recall around 89-92%. To support annotation and experimental evaluation, we created an annotation GUI and a GUI which automates the ML experiment. Planned work includes a comparison of rule-based and ML approaches and extension to further policy types.
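To illustrate what rule-based extraction with regular expressions in Python can look like, here is a minimal sketch. It is not taken from the paper: the attribute names, the patterns, and the sample policy text are all hypothetical, and real rules would need to handle many more layout variants.

```python
import re

# Hypothetical rules: one compiled pattern per attribute, with the value
# to extract in the first capture group. These are illustrative only.
RULES = {
    "policy_number": re.compile(r"Police\s*Nr\.?\s*:?\s*([\d.]+)"),
    "premium_rate": re.compile(r"Prämiensatz\s*:?\s*(\d+(?:[.,]\d+)?)\s*%"),
}


def extract(text):
    """Apply every rule to the text and collect the first match of each."""
    result = {}
    for attribute, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            result[attribute] = match.group(1)
    return result


# Invented sample text, standing in for text extracted from a policy PDF.
sample = "Police Nr. 12.345.678\nPrämiensatz: 1,25 %"
extracted = extract(sample)
```

Keeping the rules in a dictionary keyed by attribute name makes it straightforward to count rules per insurance type and to compare the extracted values attribute by attribute against hand-annotated data.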
A game teaching population based optimization using teaching-learning-based optimization
2019, Pustulka, Elzbieta, Hanne, Thomas, Wetzel, Richard, Kaba, Egemen, Adriaensen, Benjamin, Eggenschwiler, Stefan
We want to lower the entry barrier to optimization courses. To that end, we deployed a game prototype and tested it with students who had no previous optimization experience. We found that the prototype led to increased student motivation, an intuitive understanding of the principles of optimization, and strong interaction within a team. We will build on this experience to develop further games for classroom use.
Gig work business process improvement
2018, Pustulka, Elzbieta, Telesko, Rainer, Hanne, Thomas, Wong, Ka Chun
We collaborate with a gig work platform company (GPC) in Switzerland. The project aims to improve the business by influencing process management within the GPC, providing automated matching of jobs to workers, improving worker acquisition and worker commitment, and particularly focusing on the prevention of no-shows. We expect to achieve financial, organizational and efficiency gains. As research tools we use a combination of text mining and sentiment analysis, Business Process Model and Notation (BPMN), interviews with workers and employers, and the design of sociotechnical improvements to the process, including platform improvements and prototypes. Here, we focus on the successful combination of BPMN modelling with sentiment analysis in the identification of problems and generation of ideas for future modifications to the business processes.