Multimodal Human-Robot Interaction Combining Speech, Facial Expressions, and Eye Gaze

Author
Timothy Applewhite
Publication date
2021
Type of thesis
Master
Type
11 - Student thesis
Publisher / publishing institution
Hochschule für Wirtschaft FHNW
Place of publication
Olten
Abstract
Human-Robot Interaction (HRI) is being applied in more and more areas as technology continues to advance. While early dialogue systems only recognized spoken or text input, the field has shifted towards multimodal dialogue systems. In essence, this means that multiple input channels are captured: in addition to the verbal input, various nonverbal channels such as the human's facial expression are also processed. The artifact developed during this Master thesis is a multimodal dialogue system involving Pepper, a humanoid robot developed by SoftBank Robotics (n.d.-a), with input channels capturing the human's speech, facial expression, and eye gaze. Through a modular architecture based on network communication, the collected inputs are combined and sent to Rasa (2021), an open-source conversational agent running on an intermediate server. Upon receiving the response selected by Rasa, Pepper performs a body language animation and displays an emoji matching the social context on the attached tablet, while simultaneously speaking the response to the interaction partner. The results of the evaluation phase suggest that while speech and eye gaze recognition achieve high levels of accuracy, the facial expression recognition component cannot provide the same reliability. Apart from the facial expression recognition concept proposed by SoftBank Robotics (n.d.-g), two further approaches were defined by the author of this Master thesis and require further evaluation. The overall response times of the multimodal system remain low, with Rasa requiring the majority of the time to select the appropriate response.
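
To illustrate the architecture described above, the following is a minimal sketch of one request-response round trip, assuming Rasa's standard REST channel (POST /webhooks/rest/webhook) is enabled on the intermediate server and that the robot side uses the NAOqi qi Python SDK. The host names, the metadata payload shape, and the animation and emoji assets are illustrative assumptions, not details taken from the thesis.

    import qi
    import requests

    RASA_URL = "http://intermediate-server:5005/webhooks/rest/webhook"  # assumed host

    def query_rasa(sender_id, utterance, facial_expression, eye_gaze):
        # Send the fused multimodal input to Rasa; the nonverbal channels travel
        # as metadata next to the verbal message (payload shape is an assumption).
        payload = {
            "sender": sender_id,
            "message": utterance,
            "metadata": {
                "facial_expression": facial_expression,  # e.g. "happy"
                "eye_gaze": eye_gaze,                    # e.g. "engaged"
            },
        }
        reply = requests.post(RASA_URL, json=payload, timeout=5)
        reply.raise_for_status()
        # Rasa's REST channel returns a list of response objects.
        return [msg["text"] for msg in reply.json() if "text" in msg]

    def respond_on_pepper(session, text):
        # Run a gesture and show an emoji while speaking the response,
        # mirroring the simultaneous output behaviour described in the abstract.
        animation = session.service("ALAnimationPlayer")
        tablet = session.service("ALTabletService")
        tts = session.service("ALTextToSpeech")
        animation.run("animations/Stand/Gestures/Hey_1", _async=True)   # assumed animation
        tablet.showImage("http://intermediate-server/emoji/happy.png")  # assumed emoji asset
        tts.say(text)

    if __name__ == "__main__":
        session = qi.Session()
        session.connect("tcp://pepper.local:9559")  # assumed robot address
        for text in query_rasa("pepper-user", "Hello, how are you?", "happy", "engaged"):
            respond_on_pepper(session, text)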
Subject area (DDC)
330 - Economics
Language
English
Created during FHNW affiliation
Yes
Citation
APPLEWHITE, Timothy, 2021. Multimodal Human-Robot Interaction Combining Speech, Facial Expressions, and Eye Gaze. Olten: Hochschule für Wirtschaft FHNW. Available at: https://irf.fhnw.ch/handle/11654/40432