Hochschule für Wirtschaft FHNW

Permanent URI for this collection: https://irf.fhnw.ch/handle/11654/60

Collection: Search results

Showing 1 - 10 of 17
  • Publication
    Sentiment analysis for a Swiss gig platform company
    (2019) Pustulka, Elzbieta; Hanne, Thomas
    We work with a Swiss Gig Platform Company to identify innovative solutions which could strengthen its position as a market leader in Switzerland and Europe. The company mediates between employers and employees in short-term work contracts via a platform system. We first looked at the business processes and saw that some process parts were not being controlled by the company, which is now being remedied. Second, we analyzed the job reviews which the employers and employees write, and implemented a prototype which can detect negative statements automatically, even if the review is positive overall. We worked with a dataset of 963 job reviews from employers and employees, in German, French and English. The reviews have a star rating (1 to 4 stars), with some discrepancies between the star rating and the text. We scored the reviews manually as negative or other, as negative reviews are important for business improvement. We tested several machine learning methods and a hybrid method from Lexalytics.
    06 - Presentation
  • Publication
    FLIE with rules
    (2021) Pustulka, Elzbieta; Hanne, Thomas; de Espona, Lucía
    FLIE (Form Labelling for Information Extraction) allows us to extract information from Swiss insurance policies. Insurance policies are forms which are weakly aligned and do not lend themselves to automated data extraction without preprocessing. Our preprocessing annotates data with geometry and, combined with manual training data generation, gives an extraction accuracy of over 80% for a subset of attributes which have been seen 8 times or more. In this paper we extend FLIE with rules. The aim is to compare machine learning used in FLIE to the standard industry approach of using rules to extract data. We hand-crafted rules (regular expressions in Python) for the KTG insurance (27 rules), UVG insurance (29 rules), and UVG-Z (23 rules), for each insurance type covering around 20 attributes. We also generated rules for building insurance policies, which were new to us (16 rules encoded in spaCy). In all cases we saw that using rules alone gives us a similar accuracy in data extraction to machine learning (around 80%). In the case of building insurance the accuracy is higher, above 96%, with precision and recall around 89-92%. To support annotation and experimental evaluation, we created an annotation GUI and a GUI which automates the ML experiment. Planned work includes a comparison of rule-based and ML approaches and extension to further policy types.
    06 - Presentation
  • Publication
    Learning Java Loops and Control Structures by Moving a Ladybird
    (2023) Pustulka, Elzbieta; Spadola, Alessandro
    We adapted an existing Java teaching game called JavaKara to help students learn how to use loops and control statements and tested it in class. Two groups of BSc students in an introductory Java course played the game for about an hour. The game was evaluated using the MEEGA+ game evaluation method. A questionnaire was administered to get feedback and the game got a score of 53.45, i.e. good. Students reported that they lost track of time and were satisfied with this new learning paradigm.
    04B - Conference paper
  • Publication
    An experiment with an optimization game
    (2019) Pustulka, Elzbieta; Hanne, Thomas; Adriaensen, Benjamin; Eggenschwiler, Stefan; Kaba, Egemen; Wetzel, Richard; Blashki, Katherine; Xiao, Yingcai
    We aim to improve the teaching of the principles of optimization, including computational intelligence (CI), to a mixed audience of business and computer science students. Our students do not always have sufficient programming or mathematics experience and may be put off by the expected difficulty of the course. In this context we are testing the potential of games in teaching. We deployed a game prototype (design probe) and found out that the prototype led to increased student motivation, intuitive understanding of the principles of optimization, and strong interaction in a team. Ultimately, with the future work we sketch out, this novel approach could improve the learning and understanding of optimization algorithms and CI in general, contributing to the future of Explainable AI (XAI).
    04B - Conference paper
  • Publication
    FLIE: form labeling for information extraction
    (2021) Pustulka, Elzbieta; Hanne, Thomas; Gachnang, Phillip; Biafora, Pasquale; Arai, Kohei; Kapoor, Supriya; Bhatia, Rahul
    Information extraction (IE) from forms remains an unsolved problem, with some exceptions, like bills. Forms are complex and the templates are often unstable, due to the injection of advertising, extra conditions, or document merging. Our scenario deals with insurance forms used by brokers in Switzerland. Here, each combination of insurer, insurance type and language results in a new document layout, leading to a few hundred document types. To help brokers extract data from policies, we developed a new labeling method, called FLIE (form labeling for information extraction). FLIE first assigns a document to a cluster, grouping by language, insurer, and insurance type. It then labels the layout. To produce training data, the user annotates a sample document by hand, adding attribute names, i.e. provides a mapping. FLIE applies machine learning to propagate the mapping and extracts information. Our results are based on 24 Swiss policies in German: UVG (mandatory accident insurance), KTG (sick pay insurance), and UVGZ (optional accident insurance). Our solution has an accuracy of around 84-89%. It is currently being extended to other policy types and languages.
    04B - Conference paper
  • Publication
    Measuring the benefits of CI/CD practices for database application development
    (2023) Fluri, Jasmin; Fornari, Fabrizio; Pustulka, Elzbieta
    Modern software development practices automate software integration and reduce repetitive software engineering work. Automation reduces the time it takes from defining software requirements to deploying the software in production. However, when it comes to database applications, the database integration and deployment are often executed manually, making them costly and error-prone. To mitigate this, we extended current software development methodologies by designing a CI/CD pipeline that takes into consideration the database setting. We report on two industrial case studies in which we implemented a newly designed pipeline and we measure the benefits of integration and deployment automation in database development projects. From a quantitative perspective, we found that introducing CI/CD pipelines reduces failed deployments, improves stability and increases the number of executed deployments. From a qualitative perspective, we interviewed the developers before and after the implementation of the CI/CD pipeline and the results show the CI/CD pipeline brings clear benefits to the development team (i.e., reduced cognitive load). This finding calls into question current database release practices driven by business expectations such as fixed release windows.
    04B - Conference paper
  • Publication
    Extending SQL Scrolls to teach SQL DML
    (2022) Pustulka, Elzbieta; de Espona, Lucía; Kennel, Andrea
    SQL (Structured Query Language) allows a business user to communicate with a relational database. A learner who wants to master SQL needs practice, patience and motivation, which we support in a game called SQL Scrolls. Student surveys we carried out show that this approach encourages our students to practice and students are enthusiastic and want to see more games in other subjects. We are now extending the game to cover all of SQL DML and offer 500 questions.
    04B - Conference paper
  • Publication
    Building a NoSQL ERP
    (2022) Pustulka, Elzbieta; von Arx, Stefan; Espona, Lucía
    Enterprise Resource Planning (ERP) systems are needed in many business activities. SMEs (small and medium enterprises) are not well served by current ERPs, as such systems are hard to tailor. This prompts us to experiment with building an ERP on top of a NoSQL database, which is intended to be more flexible, as it is based on JSON and not on a relational data model. We present a novel ERP solution specifically designed to grow and evolve as the world changes. The ERP is for a service company which bills for time spent on customer projects. The work involves various challenges: data modelling, query specification, write and read performance analysis, versioning, user interface generation, and query optimisation. Here, we report on the performance of a NoSQL ERP using MongoDB and show that writes are fast and queries and reports are fast enough.
    04B - Conference paper
  • Publication
    A game teaching population based optimization using teaching-learning-based optimization
    (2019) Pustulka, Elzbieta; Hanne, Thomas; Wetzel, Richard; Kaba, Egemen; Adriaensen, Benjamin; Eggenschwiler, Stefan
    We want to lower the entry barrier to optimization courses. To that end, we deployed a game prototype and tested it with students who had no previous optimization experience. We found that the prototype led to increased student motivation, an intuitive understanding of the principles of optimization, and strong interaction in a team. We will build on this experience to develop further games for classroom use.
    04B - Conference paper
  • Publication
    Automatic indexing for MongoDB
    (Springer, 2023) Espona, Lucía; Vichalkovski, Anton; Steingartner, William; Pustulka, Elzbieta; Abelló, Alberto; Vassiliadis, Panos; Romero, Oscar; Wrembel, Robert; Bugiotti, Francesca; Gamper, Johann; Vargas Solar, Genoveva; Zumpano, Ester
    We present a new method for automated index suggestion for MongoDB, based solely on the queries (called aggregation pipelines), without requiring data or usage information. The solution handles complex aggregations and is suitable for both cloud and standalone databases. We validated the algorithm on TPC-H and showed that all suggested indexes were used. We report on the performance and provide hints for further development of an automated method of index selection. Our algorithm is, to the best of our knowledge, the first query-based solution for automated indexing in MongoDB.
    04B - Conference paper
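
To illustrate what query-based index suggestion for an aggregation pipeline can look like, here is a minimal Python sketch. It is not the algorithm from the paper above. It is a simplified heuristic that orders index keys by MongoDB's documented equality, sort, range (ESR) guideline and reads only the `$match` and `$sort` stages at the start of a pipeline. The function name `suggest_index`, the set `RANGE_OPS`, and the example pipeline (which borrows TPC-H lineitem column names) are assumptions made for illustration.

```python
from typing import Any

# Operators treated as range predicates in this sketch (an assumption;
# everything else in a $match is treated as an equality predicate).
RANGE_OPS = {"$gt", "$gte", "$lt", "$lte"}


def suggest_index(pipeline: list[dict[str, Any]]) -> list[tuple[str, int]]:
    """Propose one compound index from the leading $match/$sort stages.

    Hypothetical helper: keys are ordered equality, then sort, then range.
    """
    equality: list[tuple[str, int]] = []
    sort: list[tuple[str, int]] = []
    ranges: list[tuple[str, int]] = []

    for stage in pipeline:
        if "$match" in stage:
            for field, cond in stage["$match"].items():
                if field.startswith("$"):
                    continue  # $and/$or/$expr are out of scope for this sketch
                is_range = isinstance(cond, dict) and any(
                    op in RANGE_OPS for op in cond
                )
                (ranges if is_range else equality).append((field, 1))
        elif "$sort" in stage:
            for field, direction in stage["$sort"].items():
                if direction in (1, -1):  # ignore {"$meta": ...} sort keys
                    sort.append((field, direction))
        else:
            # Later stages work on transformed documents, so stop here.
            break

    # Keep the first occurrence of each field.
    seen: set[str] = set()
    keys: list[tuple[str, int]] = []
    for field, direction in equality + sort + ranges:
        if field not in seen:
            seen.add(field)
            keys.append((field, direction))
    return keys


# Illustrative pipeline; only the query text is inspected, no data is needed.
example_pipeline = [
    {"$match": {"l_returnflag": "R", "l_shipdate": {"$gte": "1994-01-01"}}},
    {"$sort": {"l_extendedprice": -1}},
    {"$group": {"_id": "$l_orderkey", "total": {"$sum": "$l_extendedprice"}}},
]
suggested_keys = suggest_index(example_pipeline)
```

The returned list of `(field, direction)` pairs has the shape that PyMongo's `Collection.create_index` accepts for a compound index, so a suggestion could be applied and then checked with `explain` to see whether the index is actually used.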