Electronic Markets (2023) 33:56
https://doi.org/10.1007/s12525-023-00673-0

RESEARCH PAPER

Seeking empathy or suggesting a solution? Effects of chatbot messages on service failure recovery

Martin Haupt1,2 · Anna Rozumowski3 · Jan Freidank2 · Alexander Haas1

Received: 7 July 2023 / Accepted: 13 September 2023 / Published online: 4 November 2023
© The Author(s) 2023

Abstract

Chatbots, as a prominent form of conversational agents, are increasingly implemented as a user interface for digital customer-firm interactions on digital platforms and electronic markets, but they often fail to deliver suitable responses to user requests. In turn, individuals are left dissatisfied and turn away from chatbots, which harms successful chatbot implementation and ultimately firms’ service performance. Based on the stereotype content model, this paper explores the impact of two universally usable failure recovery messages as a strategy to preserve users’ post-recovery satisfaction and chatbot re-use intentions. Results of three experiments show that chatbot recovery messages have a positive effect on recovery responses, mediated by different elicited social cognitions. In particular, a solution-oriented message elicits stronger competence evaluations, whereas an empathy-seeking message leads to stronger warmth evaluations. The preference for one of these message types over the other depends on failure attribution and failure frequency. This study provides meaningful insights for chatbot technology developers and marketers seeking to understand and improve customer experience with digital conversational agents in a cost-effective way.
Keywords Artificial intelligence · Chatbot · Service failure · Failure recovery · Social cognitions · Digital platform

JEL Classification C91 · L86 · M15 · M31

Introduction

Driven by innovative technological advancements such as artificial intelligence or machine learning, chatbots are widely used nowadays and provide customer service on digital platforms such as social media, enterprise messengers, or websites (Pizzi et al., 2021; Stoeckli et al., 2020). These agents increasingly substitute for human staff in electronic markets (van Pinxteren et al., 2020), and the global chatbot market is predicted to rise substantially from $17 billion in 2020 to over $102 billion in 2026 (Mordor Intelligence, 2021). As a remarkable and most recent example, OpenAI’s “ChatGPT” attracted over 1 million users in 5 days, is expected to disrupt numerous tasks in marketing, law, or journalism, and might even threaten Google by offering more humanlike answers and a smoother experience (Olson, 2022). However, despite these technological advancements and considerable market potential, chatbots often fail in practice to deliver satisfactory responses to users’ requests (Adam et al., 2021; Brandtzaeg & Følstad, 2018; Seeger & Heinzl, 2021).
Responsible Editor: Martin Adam

* Martin Haupt, martin.haupt@w.thm.de
Anna Rozumowski, anna.rozumowski@fhnw.ch
Jan Freidank, jan.freidank@w.thm.de
Alexander Haas, alexander.haas@wirtschaft.uni-giessen.de

1 Department of Marketing and Sales Management, Justus-Liebig University, Licher Strasse 66, 35394 Giessen, Germany
2 THM Business School, Technische Hochschule Mittelhessen, Wiesenstrasse 14, 35390 Giessen, Germany
3 School of Business, Institute for Competitiveness and Communication, FHNW University of Applied Sciences and Arts Northwestern Switzerland, Riggenbachstrasse 16, 4600 Olten, Switzerland

Customers are often left dissatisfied after receiving a response failure message from chatbots, which exposes firms to negative consequences such as usage discontinuance and a decrease in firm performance (Diederich et al., 2020; Weiler et al., 2022). According to a recent survey from the banking industry, four out of five consumers are dissatisfied with chatbot interactions and almost 75% of consumers confirm that chatbots are often unable to provide correct answers (Sporrer, 2021). Concerning the consequences, about one-third of consumers (30%) stated that they would turn away from the company or spread negative word of mouth after just one negative experience with the chatbot. Due to that threat and high levels of service failures, numerous companies including Facebook or SAP shut down chatbots on their digital platforms (Dilmegani, 2022; Thorbecke, 2022).

A chatbot response failure refers to an inadequate answer or no answer at all, which is sometimes also labelled as conversational breakdown (Benner et al., 2021; Weiler et al., 2022).
Chatbot response failures reflect a service failure for the company, as the digital agent was unable to deliver satisfying information to support users’ goals. Users evaluate a chatbot’s response failures as an insufficient service offer, comparable to response failures from a human frontline service employee, service robots, or other digital self-service technologies (Sungwoo Choi et al., 2021; Collier et al., 2017). Service failures have serious impacts on firms as they harm favorable customer reactions such as satisfaction, loyalty, or positive word-of-mouth (Roschk & Gelbrich, 2014). According to research from Qualtrics and ServiceNow (2021), almost half of the respondents consider switching brands after a single negative customer service interaction, and US companies risk losing around $1.9 trillion of customer spending annually due to such poor experiences. This is particularly relevant for digital platforms (e.g., Airbnb, Uber), as their major basis for value creation resides in providing “efficient and convenient facilitation of transactions” (Hein et al., 2020, p. 91). In contrast to other industries, these platforms depend heavily on their service offer (instead of products) and on positive user experiences. Thus, providing the option to use chatbots offers large efficiencies for them, but at the same time poses a threat in case of insufficient implementation.

Failure recovery strategies are therefore urgently needed to mitigate negative user responses and financial losses. In this regard, recovery messages are suggested as a viable option for chatbots to mitigate negative responses after a self-inflicted response failure (Ashktorab et al., 2019; Benner et al., 2021). Such recovery messages aim to increase the chatbot’s response capabilities to address the response failure, but also to mitigate negative user reactions to reduce the impacts of the perceived service failure.
Yet, relatively little is known about the impact of recovery strategies in chatbot conversations on customer responses. Recent studies have instead focused on reasons for response failures (Janssen et al., 2021; Reinkemeier & Gnewuch, 2022), identified different recovery strategies (Benner et al., 2021), or assessed user preferences for diverse recovery strategies (Ashktorab et al., 2019). These scholars asserted the potential of such strategies to prevent negative reactions following a failure (Benner et al., 2021; Reinkemeier & Gnewuch, 2022), but analyses of the effectiveness of recovery messages or comparisons of different types remain scarce. In contrast to such post-failure messages, Weiler et al. (2022) investigated how ex-ante messages (i.e., at the beginning of the chatbot interaction) influence users’ discontinuance of the chatbot interaction.

Based on the stereotype content model (Cuddy et al., 2008) and the results of a pilot study which assessed chatbots’ failure recovery strategies in real life, this study investigates the effects of two fundamental recovery message types, namely seeking the user’s empathy versus suggesting a solution, on users’ perceived warmth and competence, as well as on post-recovery responses. Although recovery messages remain largely unexplored, these two message types reflect two relevant recovery strategies as presented in the literature-based analysis of Benner et al. (2021). To obtain initial empirical evidence, the pilot study assessed the failure recovery messages of 101 chatbots from business, education, and public administration and revealed that these two message types (along with a simple error message) are the most commonly used recovery attempts in business practice.

Furthermore, this research aims to understand under which circumstances which message type is advantageous regarding user satisfaction.
Therefore, it considers two situational factors, in particular failure attribution and failure frequency. Results of three experimental studies show that both messages (i.e., empathy and solution) trigger specific social cognitions, more precisely either higher warmth or higher competence perceptions. In turn, these perceptions were found to influence people’s post-recovery satisfaction and re-use intentions, but they do so to different degrees depending on the context.

This research contributes to the growing literature in information systems (IS) related to chatbots as digital conversational agents and offers relevant implications for firms on how their chatbots should respond to a response failure in different contexts. We integrate the technological (and IS) perspective related to chatbots’ limited functionality and response failures with the service-oriented (and consumer psychology) perspective of recovery attempts after a service failure has occurred. Our findings highlight the possibility of using recovery messages as a low-cost, easy-to-program, and universally usable strategy. Furthermore, they reveal the need to design a chatbot conversation carefully, and that the choice of an effective recovery message depends on situational factors. Recommendations for chatbot software developers and chatbot-employing firms are provided.

Conceptual background

Chatbots as digital conversational agents

Chatbots are text-based digital conversational agents that use natural language processing (NLP) to interact with users (Gnewuch et al., 2017; Wirtz et al., 2018). These features lead to higher interaction and intelligence levels compared to other IS technologies (Maedche et al., 2019). Chatbots are a cost-effective tool for companies to automate customer-firm interactions while maintaining value and personalized service for their clients.
Due to the convenient, easy, and fast service and their 24/7 availability, the integration of chatbots is growing exponentially in various industries such as service, hospitality, healthcare, or education (van Pinxteren et al., 2020). With the rise of chatbots, research has increased tremendously in recent years, and scholars have mainly investigated chatbot interactions from three perspectives, namely the digital agent’s design elements (Diederich et al., 2020; Gnewuch et al., 2018; Kull et al., 2021), consumer responses to the digital interaction (Mozafari et al., 2022), and consumer responses to chatbot failures (see Sands et al., 2022 for an overview). Among these research fields, finding appropriate solutions for recovery from chatbot failures is particularly relevant, as it determines consumers’ continuance decisions and ultimately a chatbot’s success (Adam et al., 2021; Lv et al., 2022a, 2022b; Song et al., 2022). This is because, despite continuous development and the promising advantages for both customers and companies in service encounters, chatbots often do not live up to customer expectations and fail to understand or process user enquiries (Lv et al., 2022a, 2022b; Weiler et al., 2022; Xu & Liu, 2022).

Chatbot response failures

Lately, scholars have started to analyze the impacts of chatbot response failures. For instance, Seeger and Heinzl (2021) showed that a digital agent’s failures harm customer trust and stimulate negative word-of-mouth. Chatbot response failures also increase people’s frustration and anger (Gnewuch et al., 2017; Mozafari et al., 2022; van der Goot et al., 2021), and create skepticism and reluctance to follow the bot’s instructions (Adam et al., 2021). As a consequence, users frequently quit the conversation (Akhtar et al., 2019) and might even reject future chatbot interactions (Benner et al., 2021; van der Goot et al., 2021).
Chatbots fail frequently, because the processing of natural language input was found to be a complex task for machines due to unpredictable entries (Brendel et al., 2020). Moreover, chatbots are often integrated on digital platforms in the wrong use cases and are not connected to relevant data sources (Janssen et al., 2021; Mostafa & Kasamani, 2022). In addition, users were found to have exaggerated expectations of chatbots due to their humanlike design. According to the “computers are social actors” (CASA) paradigm, people ascribe social rules, norms, and expectations to interactions with computers although they are aware that they are interacting with a machine (Nass et al., 1996). As such, people expect a chatbot to understand their request and respond with a suitable answer, just as they would expect of a human (Wirtz et al., 2018).

Parallel to the increased interest in chatbot technology, research on chatbot failure recovery strategies has gained traction in recent years (see Table 1 for an overview). This literature stream can be divided into three major sub-divisions. First, some scholars reviewed the literature or conducted expert interviews to derive critical success factors for chatbot interactions (Janssen et al., 2021) or categories of recovery strategies (Benner et al., 2021; Poser et al., 2021). The second body of research empirically examines how chatbot interaction could be designed pre-failure in order to mitigate negative consumer perceptions due to failures. For instance, research results indicate that higher chatbot anthropomorphism (Seeger & Heinzl, 2021; Sheehan et al., 2020) or specific message techniques (Weiler et al., 2022) positively influence consumer responses before the failure occurs. Third, and contrasting this, other scholars investigated the effects of post-failure recovery strategies. As one of the first studies, Ashktorab et al.
(2019) compared user preferences for eight different recovery strategies and found that providing explanations or answer options is favored, as these display chatbot initiative. Mozafari et al. (2022) found that merely disclosing the chatbot (vs. human) identity already has a mitigating effect following a failure. Further scholars found that chatbots are preferred over human agents after a functional failure (but not after a nonfunctional failure) (Xing et al., 2022), and that chatbot self-recovery (vs. human agent recovery) leads to more positive user reactions (Song et al., 2022).

Scholars have also started to investigate effects of post-failure messages and discovered, for instance, that some communication patterns (e.g., chatbot as “victim” or “helper”) lead to more positive responses than other patterns (e.g., “persecutor”) (Brendel et al., 2020). Other studies revealed that cute or empathic responses (Lv et al., 2021; Lv et al., 2022a, 2022b), expressions of gratitude or apology (Lv et al., 2022a, 2022b), or self-deprecating humor (Xu & Liu, 2022; Yang et al., 2023) lead to more positive consumer reactions. Moreover, messages highlighting the human-chatbot relationship (i.e., appreciation message) were found to be more effective in increasing post-recovery satisfaction than apology-related messages (Song et al., 2023).

Table 1 Literature review: Chatbots service failure and recovery

| Research streams | Source | Topic | Literature field(s) | Mediator(s), moderator(s) | Main findings |
| --- | --- | --- | --- | --- | --- |
| Conceptual/qualitative | Benner et al. (2021) | Literature review on chatbot failure recovery strategies | IS, HCI | - | Literature analysis presents six categories of chatbots’ conversational breakdown recovery strategies |
| | Janssen et al. (2021) | Reasons of chatbot failures and critical factors for success | Design science, IS, HCI | | Based on literature review and expert interviews, major reasons for chatbot failures and 12 critical success factors are identified |
| | Poser et al. (2021) | Design principles for handover from chatbot to human agent after failure | Design science, HCI | - | Experts confirmed positive perception of real-time hand-over to human agent when chatbots fail to resolve requests. The process of instant hand-over is proposed. Experts confirmed a positive perception of this hybrid approach for service recovery |
| Pre-failure interaction strategies (to mitigate failure severity perception) | Sands et al. (2022) | Effects of agent type (human vs. chatbot) related to failure type and magnitude | Customer service, IS | Mod: Service failure magnitude | Consumers respond less negatively to a process failure when it occurs in human agent interaction (vs. chatbot as virtual agent). In case of a large outcome failure, consumers respond more positively when the agent is virtual (vs. human) |
| | Seeger and Heinzl (2021) | Anthropomorphic design to protect trust levels from damages due to service failure | IS, HCI | Med: Perceived norm violation | Anthropomorphic design of a chatbot mitigates negative effect of a failure on trust and word-of-mouth |
| | Sheehan et al. (2020) | Effects of clarification technique on anthropomorphism and adoption intention | HCI, customer service | Med: Anthropomorphism; Mod: Need for human interaction | Chatbot interactions using clarification technique are similarly effective regarding chatbot adoption compared to a perfectly understanding (i.e., error-free) chatbot, and both types are significantly more effective than a chatbot which failed to understand a user input. These effects are mediated by higher anthropomorphism compared to a failing chatbot |
| | Weiler et al. (2022) | Effects of ex-ante messages to reduce discontinuance after chatbot failures | Customer service | Mod: Performance level (low vs. high), Linguistic form (quantitative vs. qualitative) | Inoculation messages (i.e., preparing users for possible failures) reduce user’s decision to discontinue chatbot after service failure. Quantitative form of the message moderated the effect of perceived performance on discontinuance, whereas a qualitative form moderated the effect of trusting beliefs |
| Post-failure recovery strategies | Ashktorab et al. (2019) | User preferences of different chatbot failure repair strategies | HCI, Conversational design | - | User preferences of eight chatbot notification strategies after a chatbot response failure. Notifications which provide options or explanations are favored as they reflect a chatbot’s initiative and are actionable to recover from failures |
| | Brendel et al. (2020) | Effects of three communication patterns on user’s emotions and responses | IS, HCI | - | After a service failure, a “helping” communication style leads to low levels of user harassment, whereas both a “persecutor” and a “victim” style lead to significantly higher harassment. User frustration is not different across communication styles, but post-failure satisfaction is significantly higher for “helping” and “victim” (vs. “persecutor”) |
| | Lv (Linxiang) et al. (2022a) | Effects of communication recovery strategies for AI device failures | | Med: Relational needs and efficacy needs | When chatbot message expresses gratitude (vs. apology), consumers are more likely to forgive a failure if the service failure includes being rejected (vs. being ignored) |
| | Lv (Xingyang) et al. (2021) | Effects of cuteness after AI service failure | Design science, customer service | Med: Tenderness and performance expectancy; Mod: Failure severity, time pressure | Cuteness increases consumers’ tolerance of AI service failure, mediated by tenderness and performance expectancy |
| | Lv (Xingyang) et al. (2022b) | Effects of empathic response on users’ continuance intentions | HCI, Customer service | Med: Psychological distance, trust | Highly empathic AI response increases continuance intentions. Multisensory (i.e., text and audio) interactions amplify the effect, compared to text-only interactions |
| | Mozafari et al. (2022) | Effects of chatbot disclosure depending on service-related factors | Service | Med: Trust; Mod: Service criticality, Failure vs. no failure | In case of service failure, chatbot identity disclosure leads to positive effect on retention. Without a failure, disclosure negatively affects customer retention through reduced trust for services with high levels of criticality |
| | Song et al. (2022) | Effect of chatbot self-recovery vs. human support on recovery satisfaction | IS, Customer service | Mod: Chatbot intelligence | Chatbot self-recovery has a more positive effect on post-recovery satisfaction compared to human-supported recovery. Effect is moderated by chatbot intelligence |
| | Song et al. (2023) | Effects of two politeness strategies on post-recovery satisfaction | Customer service | Med: Face concern; Mod: Time pressure | Focusing on human-chatbot relationship with an appreciation message led to more post-recovery satisfaction compared to an admittance of the limited abilities with an apology message |
| | Xing et al. (2022) | Effects of service failure types on consumers’ recovery strategy choices | HCI, conversational design | Mod: Chatbot intelligence | After a functional (nonfunctional) failure, consumers tend to use the chatbot (a human) for service recovery. Chatbot intelligence moderates this choice. Chatbots (vs. human employees) have higher perceived governance |
| | Xu & Liu (2022) | Effects of humor in messages after AI service failure | HCI, Conversational design | Med: Perceived warmth and competence; Mod: Time pressure, customer inoculation | Humorous response of chatbot after service failure increases consumer tolerance, mediated by warmth and competence |
| | Yang et al. (2023) | Impact of AI’s self-deprecating humor on service recovery satisfaction | Customer service, conversational design | Med: Perceived intelligence, perceived sincerity; Mod: Failure experience, Sense of power | Self-deprecating humor improves consumers’ willingness to tolerate failure and recovery satisfaction, mediated by higher perceptions of intelligence and sincerity. Effect of humor was present in process-related failures, but not in outcome-related failures |
| | Our Study | Effectiveness of two failure-recovery message strategies depending on situational factors | Customer service, Conversational design | Med: Perceived warmth and competence | Using an empathy-seeking (solution-suggesting) message as recovery strategy increases post-recovery satisfaction and re-use intention, mediated by warmth (competence), respectively. Moreover, message preference depends on failure attribution (user vs. chatbot) and failure frequency |

Chatbots and the stereotype content model

According to related studies on human–machine interactions (i.e., with robots or chatbots), people quickly draw inferences about a bot’s personality as an interaction partner, much as they would evaluate a human frontline employee (Belanche et al., 2021; Choi et al., 2021). For example, following the “computers are social actors” (CASA) paradigm (Nass et al., 1996), consumers are expected to evaluate a chatbot as a digital interaction partner much as they would evaluate a human conversation partner, for instance by assessing its warmth and competence.

According to the stereotype content model (Fiske et al., 2007), one of the most established frameworks regarding social cognitions, people use warmth and competence as two universal dimensions of social perception when judging others. Warmth covers aspects like honesty, kindness, or trustworthiness, while competence perceptions reflect capability, confidence, intelligence, and skillfulness (Dubois et al., 2016; Fiske et al., 2007; Judd et al., 2005). Taken together, these dimensions are suggested to “account almost entirely how people characterize others” (Fiske et al., 2007, p. 77). Originally, this system of social judgment was applied to explain perceptions of social groups (Fiske et al., 2007) or individuals (Judd et al., 2005). Since then, scholars have extended its use to brands (Aaker et al., 2010) and more recently to service interactions with humans (Scott et al., 2013) or non-human entities (i.e., robots or virtual agents) (Choi et al., 2021; Kull et al., 2021; Xu & Liu, 2022).

Judgments of warmth and competence influence how people interact with others, as well as how people feel and behave (Cuddy et al., 2008; Marinova et al., 2018). Warmth is generally linked to cooperative intentions and prosocial behavior, whereas competence is associated with the power and ability to realize one’s goals (Cuddy et al., 2008). Inferred warmth and competence assessments enhance customer- and service-related outcomes such as satisfaction, trust, or brand admiration, and they influence downstream behaviors like purchase intentions and retention (Aaker et al., 2010; Cuddy et al., 2008; Marinova et al., 2018; Scott et al., 2013).

Recently, scholars have increasingly investigated the impact of social cognitions, i.e., warmth and competence perceptions, on various outcomes in the field of digital agents (Belanche et al., 2021; Choi et al., 2021; Kull et al., 2021; McKee et al., 2022; Xu & Liu, 2022). These studies mainly focus on anthropomorphism effects. For instance, Sungwoo Choi et al. (2021) found that people perceive humanoid (vs. nonhumanoid) service robots as warmer but not as more competent. In turn, higher warmth influences satisfaction after a failure and supports recovery effectiveness. In contrast, Belanche et al. (2021) revealed that both dimensions of warmth and competence indicate a robot’s level of “humanness,” and both dimensions positively influence customers’ loyalty. Warmth and competence perceptions are also found to influence human-digital agent collaboration. More precisely, perceptions of these social cognitions predict people’s choice of a particular agent, irrespective of the agent’s objective performance level (McKee et al., 2022). Moreover, Xu and Liu (2022) found that humorous chatbot answers increase consumers’ tolerance after a service failure, mediated by higher warmth and competence. In general, Han et al. (2021) found that chatbot service failures trigger consumers’ reactance (i.e., anger and negative cognitions). In turn, these negative cognitions reduce competence perceptions and subsequently decrease service quality and satisfaction. Finally, Kull et al. (2021) found that when chatbots use a warm (vs. competent) initial message, people’s brand engagement increases, because they feel closer to the brand in that condition.

Despite these initial insights, however, little is known about effects of message-related cues on respondents’ warmth or competence evaluations and subsequent service assessments. This gap is relevant because many chatbots are text-based agents, and thus, users mainly have to rely on the chatbot’s (text-based) messages as cues to, e.g., evaluate the chatbot’s warmth and competence (van Pinxteren et al., 2020). Moreover, although chatbot service failures are common (Seeger & Heinzl, 2021), scholars confirm that there is still a lack of scientific knowledge about chatbot service recovery and its effectiveness (Xu & Liu, 2022). Therefore, this study evaluates how two distinct chatbot messages increase perceptions of social cognitions and enhance subsequent recovery responses.

Recovery strategies for chatbot failure

As chatbot response failures seem inevitable and lead to severe negative outcomes, firms are well advised to consider failure recovery strategies (Benner et al., 2021; Janssen et al., 2021). A recovery strategy refers to an “effort [that] mitigates the previous negative effect of the failure” (Roschk & Gelbrich, 2014, p. 196). Scholars have revealed a wide range of such strategies as organizational responses, mainly with regard to service failures (for an overview, see van Vaerenbergh et al. (2019)). There are two basic dimensions of such failure recovery responses, namely (1) tangible compensation, such as monetary refunds, and (2) psychological compensations, including positive service employee behavior (Roschk & Gelbrich, 2014; van Vaerenbergh et al., 2019). (1) Tangible compensations mainly include financial and process-related efforts within a firm. A common approach to tangible compensation in chatbot failures is to hand over the conversation to a human employee to manage the problem and to prevent negative experiences (Ashktorab et al., 2019; Janssen et al., 2021). However, this solution comes with additional costs and reduces the level of automation (Reinkemeier & Gnewuch, 2022). In contrast, (2) psychological compensations generally come without costs and could be executed by the service encounter agent (i.e., frontline employee or chatbot) directly. Prominent examples are apologies from the service employee or expressions of regret for the failure that occurred (van Vaerenbergh et al., 2019).

This research focuses on psychological compensations, as these are of interest for both research and management. Scientifically, this study complements initial research which evaluates effects of different message elements (such as expressions of humor, cuteness, apology, or gratitude) (Lv et al., 2022a, 2022b; Lv et al., 2021; Xu & Liu, 2022; Yang et al., 2023). Managerially, this type of compensation requires fewer resources (vs. human recovery) and can be integrated directly into the conversational process. In fact, a textual addition is all that is required to deliver these types of psychological compensation.

As gestures and nonverbal behaviors do not exist in chatbot conversations, people judge the chatbot conversation mainly based on written messages (van Pinxteren et al., 2020). We, therefore, analyze how different messages trigger social cognitions. As the study’s outcomes, post-recovery satisfaction and re-use intentions were chosen to evaluate recovery effectiveness. Post-recovery satisfaction represents one of the most widely used metrics to indicate successful recovery efforts (Song et al., 2022; Worsfold et al., 2007; Yang et al., 2023). Re-use intentions indicate continued acceptance of a chatbot and are relevant for its long-term success on digital platforms (Adam et al., 2021; Lv et al., 2022a, 2022b; Weiler et al., 2022).
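The observation that a textual addition is all that is required can be illustrated with a minimal sketch of a chatbot fallback handler. Everything in it is an assumption made for illustration: the names `RecoveryMessage`, `TEMPLATES`, and `respond` are hypothetical, and the message wordings are invented examples of an empathy-seeking and a solution-oriented message, not the stimuli used in this study. Which message type to send is left as a parameter, since the preferable type depends on situational factors.

```python
from enum import Enum
from typing import Optional


class RecoveryMessage(Enum):
    """The two recovery message types compared in this paper."""
    EMPATHY = "empathy"
    SOLUTION = "solution"


# Hypothetical wordings for illustration only; not the study's stimuli.
TEMPLATES = {
    RecoveryMessage.EMPATHY: (
        "Sorry, I am still learning and could not answer that. "
        "Thank you for your patience and understanding."
    ),
    RecoveryMessage.SOLUTION: (
        "Sorry, I could not process that. "
        "Could you rephrase your request in short, simple words?"
    ),
}


def respond(answer: Optional[str], message_type: RecoveryMessage) -> str:
    """Return the bot's answer, or a recovery message on a response failure.

    A response failure is modeled here as a missing or empty answer.
    """
    if answer:
        return answer
    return TEMPLATES[message_type]
```

In this sketch the recovery strategy is only a lookup of a fixed text, which is why it adds no handover to human staff and no change to the chatbot's language understanding component.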
Different messages as failure recovery strategies

Even small changes in the framing of communication messages were found to influence people’s judgments and behaviors (You et al., 2020). Regarding chatbot conversations, different messages could be used in response to a service failure. In this research, two distinct message types labeled as an empathy-seeking message or solution-oriented message were deliberately chosen as they (1) represent the most common failure recovery strategies as revealed by our pilot study (see below in the empirical studies section) and (2) are thought to influence warmth and competence perceptions, respectively. Both types express a request from the chatbot.

As the first type, a chatbot might ask for a user’s empathy and understanding regarding its limited abilities. This request for understanding is intended to elicit empathic concern for the chatbot’s “infancy” and difficulties in handling requests. Scholars also refer to this message as a “social” recovery strategy, which reflects apologizing for the failure “to appeal to the users’ empathy and understanding similar to that which is shown in human–human conversations” (Benner et al., 2021, p. 9).

Empathy is defined as a person’s intellectual or imaginative understanding of another person’s condition or state (Hogan, 1969). Related to service, empathic customers were found to be less angry and more forgiving when they encounter a service failure (Wieseke et al., 2012). In a study with “classic” human frontline employees, customer empathy towards an employee was found to enhance social interactions, foster supportive attitudes, and create a more satisfying experience (Lazarus, 1991; Wieseke et al., 2012).
Scholars in the field of social service research support this, showing that empathy-related expressions are often beneficial to build or strengthen social bonds between interaction partners (Gerdes, 2011), which in turn increase warmth perceptions (Cuddy et al., 2008; Judd et al., 2005).

These well-documented effects could also be observed in human interactions with digital agents. As a chatbot reflects a digital version of a service employee, a chatbot message that evokes empathy (e.g., asking for patience and to hold on to the joint interaction) should trigger these warmth perceptions. Scholars have demonstrated that humans can feel empathy with inanimate objects such as chatbots or robots (Misselhorn, 2009). Related to the adjacent field of service robots, Wirtz et al. (2018) conclude that a bot’s social-emotional and relational elements (e.g., social interactivity) increase warmth. As the chatbot’s empathy message contains mainly such social-emotional and relational elements (e.g., asking for patience and to hold on to the joint interaction), we propose:

H1. The message type empathy increases consumer-perceived chatbot warmth.

As an alternative option, a chatbot could request the user to adapt the input to the chatbot’s abilities, e.g., by rephrasing the input in short and simple words. Input from users was often found to be complex, and a shorter and more precise input has a higher probability of being processed correctly (Ashktorab et al., 2019; Luger & Sellen, 2016). Indeed, conversational agents were found to respond more successfully when the input was rather simple, short, and unambiguous (Luger & Sellen, 2016). This type of request could be labeled as a solution-oriented message, as the chatbot tries to solve the failure actively.
Related IS research has already used the solution-oriented message (i.e., “please rephrase your inquiry and try again”) to encourage users to continue with the chatbot (Benner et al., 2021; Weiler et al., 2022). While Weiler et al. (2022) use this message as an ex-ante strategy at the beginning of the interaction, this study employs it as an ex-post strategy to address the chatbot response failure directly when it occurs.

This concept has also been observed in human service interactions. When a frontline employee focuses on the task (vs. social components) as the “core” of the service delivery and offers a possible solution to make the interaction more successful and convenient, this task-related behavior increases the perceived competence of this employee (Marinova et al., 2018). Several scholars support this argumentation, and acknowledge that competence-oriented messages imply that service providers are “very capable in providing consumers with solutions” (Huang & Ha, 2020, p. 620).

Related to the chatbot, the solution message type focuses on the task, that is, to make the interaction with the customer effective. As consumers perceive digital assistants such as chatbots as social actors (Nass et al., 1996; van Pinxteren et al., 2020), this solution-oriented message should increase chatbot competence perceptions (Marinova et al., 2018). Moreover, the solution message indicates that the chatbot is aware of the linguistic complexity of user input and of options to improve the quality of the chatbot’s answer (Weiler et al., 2022). Both aspects (i.e., awareness of a problem, and presentation of a possible solution) indicate a kind of skillfulness or intelligence, two key items reflecting competence (Cuddy et al., 2008; Xu & Liu, 2022).
In addition, related service robot literature proposed that when a bot can serve a user’s functional needs (e.g., offering a solution to a request), this service enhances perceptions of its usefulness and competence (Wirtz et al., 2018). Therefore, we hypothesize:

H2. The message type solution increases consumer-perceived chatbot competence.

According to scholars, warmth and competence perceptions can serve as underlying mechanisms that explain how consumers respond to technology infusion in service (Belanche et al., 2021; van Doorn et al., 2016). According to van Doorn et al. (2016), warmth and competence perceptions elicited by digital service technology both enhance consumers’ satisfaction and loyalty intentions. A chatbot study found that chatbot use increases if chatbots elicit warmth perceptions within human-chatbot interactions (Mozafari et al., 2021). Supporting this, research from the related field of service robots found that warmth perceptions significantly increased post-failure satisfaction and loyalty (Choi et al., 2021).

Similarly, research confirmed that consumers’ competence perceptions (e.g., the belief that chatbots are capable of fulfilling a task or enabling successful service recovery) increase their interaction satisfaction and re-use intentions (Lv et al., 2022a, 2022b; Mozafari et al., 2022). Further studies about human (Babbar & Koufteros, 2008; Güntürkün et al., 2020; Habel et al., 2017) and digital service agents (Belanche et al., 2021; Han et al., 2021) support that higher warmth and competence perceptions drive consumers’ service value perceptions, satisfaction, and loyalty. Thus:

H3. Stronger consumer-perceived (a) warmth and (b) competence increase consumers’ post-recovery satisfaction.

H4. Stronger consumer-perceived (a) warmth and (b) competence increase consumers’ chatbot re-use intentions.
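Taken together, H1 to H4 describe a parallel mediation structure. One way to write it down is sketched below. The notation is an illustrative assumption and is not taken from the study: E and S are dummy variables for the empathy and solution messages (control message as reference), W and C are perceived warmth and competence, and Y is either post-recovery satisfaction or re-use intentions.

```latex
\begin{align}
W &= i_W + a_1 E + a_2 S + \varepsilon_W \\
C &= i_C + a_3 E + a_4 S + \varepsilon_C \\
Y &= i_Y + c'_1 E + c'_2 S + b_1 W + b_2 C + \varepsilon_Y
\end{align}
```

Under these assumptions, H1 corresponds to $a_1 > 0$ and H2 to $a_4 > 0$. H3 and H4 correspond to $b_1 > 0$ and $b_2 > 0$ with satisfaction and re-use intentions as $Y$, respectively. The indirect effects of interest are then the products $a_1 b_1$ (empathy message via warmth) and $a_4 b_2$ (solution message via competence).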
Factors influencing the perception of recovery messages

Research has shown that situational factors regarding chatbot interactions influence user perceptions and responses (Gnewuch et al., 2017; Janssen et al., 2021; Pizzi et al., 2021). Therefore, we identified two relevant factors, namely failure attribution and failure frequency, which are thought to impact users’ reactions and preference for one of the recovery messages. Both factors were found to be important elements in the failure and recovery literature (Choi & Mattila, 2008; Collier et al., 2017; Ozgen & Duman Kurt, 2012; van Vaerenbergh et al., 2019).

Failure frequency

In chatbot conversations, users regularly need to make multiple attempts to enter a request in a way that the chatbot will understand (Ashktorab et al., 2019). That means that many initial service failures are not recovered adequately but lead to a second service failure, a situation also labeled as double deviation (Johnston & Fern, 1999; van Vaerenbergh et al., 2019). Such double deviations were found to reinforce negative customer responses that were caused by the first failure, such as customer dissatisfaction, anger, or churn (Ozgen & Duman Kurt, 2012; van Vaerenbergh et al., 2019). Furthermore, people were found to prefer different recovery strategies for a single vs. double deviation, leading to the conclusion that the service provider should adequately account for the failure frequency in choosing the appropriate recovery strategy (Pacheco et al., 2019). Therefore, chatbot creators need to identify the best-possible “match” for the response to the failure (Roschk & Gelbrich, 2014). After a first failure, both response messages are expected to mitigate negative consequences via the paths of warmth and competence as proposed above.
Yet, when users re-enter their request and the chatbot fails again to deliver an appropriate answer, this represents a new situation with (potential) implications for the effectiveness of both message types after the second failure.

The empathy-related message seeks to evoke understanding and empathy and create feelings of warmth and mutual connection (Cuddy et al., 2008; Lazarus, 1991; Wieseke et al., 2012). Asking for understanding regarding the chatbot’s limited abilities is possible at any interaction stage or situation, as the chatbot refers to its own lack of abilities (vs. the user). Therefore, an empathy message is assumed to create warmth perceptions irrespective of the failure frequency. Related to the message type solution, as argued above, people are expected to accept the request to re-phrase their input to better adhere to a chatbot’s needs after a first failure and even perceive that chatbot as competent (Chong et al., 2021; Marinova et al., 2018). However, after re-phrasing the request and being confronted with a second service failure, this competence perception is assumed to be negatively affected as the chatbot was again not able to provide a solution. As Johnston and Fern’s (1999) study showed, more than half of the respondents lost confidence in a service agent’s competence after a double deviation. Taken together, after a double deviation, an empathy-seeking message should be more effective than a solution-oriented message. Formally:

H5. After a double deviation, an empathy message is more effective than a solution message in that the effect of the empathy message on consumer-perceived chatbot warmth is stronger than the effect of the solution message on consumer-perceived chatbot competence.
Failure attribution

Following attribution theory (Weiner, 1985, 2012), particularly in its application to service failures, customers seek to attribute the responsibility for the occurrence of a negative incident to some person or thing as a way to understand the situation and regain control over their environment. Thereby, people mainly differentiate between two dimensions of a so-called locus of control: either they blame others (i.e., external attribution) or they blame themselves (i.e., internal attribution) for the failure that has occurred (Weiner, 1985). Previous research showed that customers respond differently to service failures depending on which party they believe to be responsible for the failure (Choi & Mattila, 2008; Collier et al., 2017). For instance, when people hold the firm or its service agent responsible for the failure, they react more negatively than when they perceive that they are (at least partially) responsible for the failure as well (Choi & Mattila, 2008). Consequently, people respond more positively to service failures that are self-attributed (versus firm-attributed), remain more satisfied with the firm, and are more likely to forgive such failures (Choi & Mattila, 2008; Gelbrich, 2010).

When considering which chatbot recovery message should be employed (i.e., solution or empathy), failure attributions are expected to determine its effectiveness. More precisely, we expect that the recovery message should match the failure attribution to create positive outcomes. Recovery research has shown that matching the recovery strategy with the failure type (e.g., monetary compensation for monetary failure) is more effective than a non-match (Roschk & Gelbrich, 2014). Related to chatbot interaction, when users attribute the failure to the chatbot (i.e., blame it for the failure), an empathy (vs.
solution) message should be a better match, as in that case the attributed party “takes the blame” by asking for empathy and understanding. Scholars have established that such messages send cues that clarify and acknowledge blame attributions, and they help users to understand the possible reason for the failure (e.g., the “infancy” of the chatbot). In turn, these cues work as a coping mechanism to handle the negative consumer reactions caused by the failed service (Gelbrich, 2010). In line with that, an empathy message as response to a chatbot-caused failure is supposed to match, while the solution message expresses that the user is also part of the failure, a message cue which does not match the responsibility perception of the user.

Fig. 1 Overview of research studies

Vice versa, the solution message matches a user-attributed failure because it offers guidance for the user to tailor the request to the chatbot. When a user acknowledges being (at least partly) responsible for the failure or is unsure about whom to blame, a solution message (vs. empathy) should better match this perception. To put it differently, users are supposed to accept a request to rephrase their entry when they admit to being part of the problem (Choi & Mattila, 2008), and they might even be thankful for guidance on how to react in the interrupted process. Yet, when a chatbot is believed to be the responsible party, a solution message that asks for a user action to resolve the situation is expected to be perceived as less appropriate, and should therefore affect consumers’ competence perceptions to a smaller extent. Thus:

H6a. An empathy message leads to higher consumer-perceived warmth in the case of a chatbot-attributed failure (match) than in the case of a user-attributed failure (mismatch).

H6b.
A solution message leads to higher consumer-perceived competence in the case of a user-attributed failure (match) than in the case of a chatbot-attributed failure (mismatch).

Empirical studies

Pilot study

As an initial pilot study, chatbots from different companies and across industries in the DACH region (Germany, Austria, and Switzerland) were analyzed to assess which recovery strategy they used after a service failure. A service failure, meaning that the chatbot did not understand the user’s request, was provoked by entering random letters as incomprehensible input. The final sample comprised 101 chatbots from business, education, and public administration. Roughly a quarter of these bots (i.e., 27) did not allow any free-text entry but only offered a set of options to choose from, and consequently no “failure” in communication could occur when engaging with them. Out of the remaining 74 (free-text processing) chatbots, 34 asked the user to reformulate the request, reflecting the solution message type. Users were asked to use short sentences and simple words, and to be as precise as possible in their wording. Furthermore, 12 chatbots appealed to the user’s empathy and understanding. Lastly, no clear strategy was identified for 28 chatbots, and most of these chatbots just replied with a simple error feedback message, that is, a short message like “Sorry I did not understand that.”

In sum, the pilot study revealed that four major message-based recovery strategies are prominent in chatbot conversations: (1) pre-defined answers, (2) a solution-oriented message, (3) an empathy-seeking message, and (4) a simple error feedback message. As pre-defined answers limit the variety of entries, they are generally less flexible. Therefore, this message type was omitted and the latter three types were analyzed.
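The pilot-study counts can be cross-checked with a few lines of arithmetic. The sketch below only restates the figures reported above; the variable names and the rounding to one decimal place are illustrative choices.

```python
# Counts as reported in the pilot study; names are illustrative.
total_chatbots = 101
no_free_text = 27  # bots offering only pre-defined answer options
free_text = total_chatbots - no_free_text

strategy_counts = {
    "solution-oriented": 34,
    "empathy-seeking": 12,
    "no clear strategy": 28,
}

# The three strategy groups should exhaust the free-text chatbots.
assert sum(strategy_counts.values()) == free_text


def share(count, base):
    """Percentage share of count in base, rounded to one decimal place."""
    return round(100 * count / base, 1)


shares_of_free_text = {name: share(n, free_text) for name, n in strategy_counts.items()}
share_no_free_text = share(no_free_text, total_chatbots)

for name, pct in shares_of_free_text.items():
    print(f"{name}: {pct}% of free-text chatbots")
print(f"pre-defined answers only: {share_no_free_text}% of all chatbots")
```

Expressed as shares of the 74 free-text chatbots, this makes the relative prevalence of the solution-oriented and empathy-seeking strategies directly comparable.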
Study 1

Study design

To investigate the influence of the type of recovery message on users’ post-recovery satisfaction, Study 1 applies a one-factorial between-subjects experiment with three cases (message type: empathy vs. solution vs. control) (Fig. 1). Participants were recruited from two European universities through email distribution lists and randomly assigned to one of the scenarios (see Fig. 2 for detailed scenarios). After excluding four participants who failed the attention check (i.e., “If you read this, please press button 1”), our sample comprised 178 participants (M_Age: 24 years, SD: 18.34, 56.2% females).

Participants had to imagine that they interacted with a chatbot of an electronics provider, as electronic retail and service offers are nowadays widely provided by digital platforms and electronic markets, and prior research considered e-commerce a prevalent field of chatbot service (Adam et al., 2021; Alt, 2020; Gnewuch et al., 2017). In the conversation, three questions about a camera were asked; the chatbot answered two of them correctly and responded to the last one by stating that it did not understand the user request (i.e., response failure, see Table 2). As a manipulation, we varied the failure responses: The chatbot either asked the user to have empathy with its limited abilities and to try again (i.e., type empathy) or to adapt and simplify the input (i.e., type solution). In the control case, the chatbot just replied, “Sorry, I did not understand your request.”

As manipulation checks, we relied on Hosseini and Caragea (2021), who described empathy-seeking behavior: People in the empathy message scenario perceived more strongly that the chatbot had “asked for their empathy and understanding” (M_Empathy: 6.31, M_Solution: 2.88, M_Control: 3.02; F = 90.14, p < 0.001) compared to the other scenarios. Likewise, for the solution case, we relied on Marinova et al.
(2018) to describe problem-solving behavior: Respondents of the solution message perceived more strongly that the chatbot “has asked to rephrase my request” (M_Empathy: 2.53, M_Solution: 5.98, M_Control: 2.32, F(2, 175) = 94.94, p < 0.001). Thus, the manipulation was effective. Moreover, the scenarios were perceived as realistic (i.e., “The scenario is realistic” and “I can imagine a chatbot interaction happening like this in real life.”) (α: 0.89, M: 5.71, SD: 1.41, on a 7-point Likert scale).

Measures

For all three studies, reflective multi-item measures with 7-point Likert scales (1 = strongly disagree and 7 = strongly agree) from the extant literature were used and adapted to the study context. Post-recovery satisfaction was captured with three items from Agustin and Singh (2005). Perceived competence and warmth of the chatbot were captured by three-item scales each from Aaker et al. (2010), followed by some demographics (i.e., age, gender). Reliability and validity values were all above the thresholds (see Table 3). First, Cronbach’s alpha and composite reliability values are above the cut-off value of 0.70, indicating construct-level reliability (Hulland et al., 2018). Second, the average variance extracted (AVE) for every multiple-item construct exceeded 0.50, showing appropriate convergent validity. Third, the AVE values were found to be larger than the shared variance with any other construct, indicating discriminant validity (Hulland et al., 2018). All items and factor loadings are illustrated in Table 3, and means and standard deviations for the main variables are provided in Table 4.

Results

An ANOVA revealed significant effects of the three message types on post-recovery satisfaction (F(2, 175) = 15.97, p < 0.001).
Post-hoc tests (Bonferroni) showed that both the empathy message and the solution message led to significantly higher post-recovery satisfaction than the control message (M_Solution: 2.93 vs. M_Control: 1.79, p < 0.001; M_Empathy: 2.60 vs. M_Control: 1.79, p < 0.001). In contrast, the empathy and solution messages did not lead to significantly different satisfaction (p = 0.36). Thus, both messages enhance satisfaction compared to control, but not to a different degree when compared to each other.

To test H1 to H3, a mediation analysis was conducted with PROCESS Model 4 using 5,000 bootstrapping samples and 95% confidence intervals (CIs) (Hayes, 2018). The message types were used as a multicategorical independent variable, warmth and competence served as parallel mediators, satisfaction was the outcome, and age and gender were covariates. As hypothesized, the empathy message (vs. control) increased warmth (b = 1.01, p < 0.001), and the solution message (vs. control) led to higher competence perceptions (b = 0.88, p < 0.001), supporting H1 and H2, respectively. The empathy message (vs. control) did not increase competence perceptions (p = 0.94), while the solution message also increased warmth (b = 0.54, p < 0.05). In turn, both warmth (b = 0.12, p < 0.05) and competence (b = 0.42, p < 0.001) had a positive effect on satisfaction, supporting H3 (a and b). The indirect effects of the empathy message on satisfaction were significant via warmth (b = 0.12, [0.01, 0.27]), and they were significant for the solution message on satisfaction via competence (b = 0.37, [0.17, 0.60]).¹

Study 2: Failure frequency

Design and procedure

Study 2 examined the effect of the recovery messages on post-recovery satisfaction and re-use intentions under different failure recovery conditions (i.e., success vs. second failure after the recovery). Two hundred fifty-eight respondents were recruited via the online platform Prolific (US participants with a 95% approval rate on prior tasks).
Participants were randomly assigned to a 3 (message type: empathy vs. solution vs. control) × 2 (recovery outcome: success vs. second failure) between-subjects experiment and had to imagine a chatbot interaction for a table reservation in a restaurant (see Table 2). The chatbot did not understand the initial user request and responded with one of the three message types from Study 1. After reading the recovery message, respondents had to rate warmth, competence, anger, and satisfaction and enter an individual input as response.

On the next page, the survey tool displayed the interaction including the individual user input and added either a success message (i.e., “I successfully booked a table”) or a second failure message. In case of the second failure, one of the three message types (i.e., empathy-seeking, solution-oriented, control) was displayed again, with a slightly adapted text for the solution message to fit the context. After these messages, respondents again rated their perceptions (i.e., warmth, competence, satisfaction, re-use intentions, anger). On average, respondents needed 8 min to complete the survey. To increase realism and the fit of user entry and message, we excluded fourteen participants in the recovery success condition who entered nonsensical input. Furthermore, we excluded seven participants who failed the attention check (i.e., participants who agreed to the false statement “the chatbot has forwarded me to a human service employee”); the final sample consisted of 237 respondents (M_Age: 45 years, SD_Age: 14.56, 49% female).

Measures

Scales were identical to those used in Study 1. Chatbot re-use intentions were measured with the scale from Wallenburg and Lukassen (2011). As a control variable, we assessed participants’ anger with three items from Xie et al. (2015), as this emotional response could influence user reactions in chatbot interactions (Crolic et al., 2021).
All scales displayed adequate validity and reliability (see Table 3). Moreover, scenarios were perceived as realistic (M: 5.41, SD: 1.45) and the manipulation checks were effective. The empathy message was perceived more strongly as seeking empathy and understanding (M_Solution: 3.26, M_Empathy: 5.70, M_Control: 2.25, F(2, 234) = 71.16, p < 0.001), and respondents in the solution message scenario agreed more strongly that the chatbot had asked them to rephrase the input as a possible solution (M_Solution: 6.62, M_Empathy: 2.45, M_Control: 2.56, F(2, 234) = 155.62, p < 0.001). Regarding the recovery success manipulation, participants in the success scenarios (vs. second failure) agreed significantly more strongly that the chatbot “has successfully reserved a table” (M_Success: 6.67, M_Second-Failure: 1.28, t(235) = 50.82, p < 0.001). Moreover, Table 4 provides descriptives for the main variables.

¹ We also conducted a study (students from two European universities, n = 270, M_Age = 27 years, 52% female) with the same measures based on a further scenario, i.e., a chatbot as pizza delivery agent, as food delivery represents another common field for digital platforms (e.g., Uber Eats, Deliveroo, HelloFresh) and for chatbot services (Li et al., 2020; van Pinxteren et al., 2020). Results of a mediation analysis (PROCESS Model 4) showed that the empathy message (vs. control) again led to higher warmth perceptions (b = 1.16, p < 0.001) while the solution message (vs. control) did not (p = 0.45). The solution message (vs. control) led to higher competence perceptions (b = 0.56, p < 0.05), whereas the empathy message (vs. control) did not (p = 0.56). In turn, satisfaction was influenced by warmth (b = 0.30, p < 0.001) and competence (b = 0.48, p < 0.001). Neither message influenced satisfaction directly. In sum, the results again support H1–H3 and add further validity to Study 1.
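The mediation analyses in Studies 1 and 2 rest on bootstrapped confidence intervals for indirect effects. The following is a minimal sketch of that idea, not the PROCESS macro itself: it uses a single mediator, no covariates, simulated data, and a percentile interval, all of which are simplifying assumptions. Every function and variable name is hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope of y regressed on the single predictor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx


def partial_slope(m, x, y):
    """OLS coefficient of m in the regression y ~ x + m."""
    mx, mm, my = mean(x), mean(m), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    smm = sum((mi - mm) ** 2 for mi in m)
    sxm = sum((xi - mx) * (mi - mm) for xi, mi in zip(x, m))
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    smy = sum((mi - mm) * (yi - my) for mi, yi in zip(m, y))
    return (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)


def indirect_effect(x, m, y):
    """Product a*b: a = effect of x on m, b = effect of m on y given x."""
    return slope(x, m) * partial_slope(m, x, y)


def bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    draws = []
    while len(draws) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        xs = [x[i] for i in idx]
        if len(set(xs)) < 2:
            continue  # degenerate resample: only one condition was drawn
        ms = [m[i] for i in idx]
        ys = [y[i] for i in idx]
        draws.append(indirect_effect(xs, ms, ys))
    draws.sort()
    lower = draws[int(alpha / 2 * n_boot)]
    upper = draws[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated example (not study data): x = 1 for a recovery message, 0 for control;
# m stands for a mediator such as warmth, y for an outcome such as satisfaction.
# The simulated direct effect of x on y is zero.
data_rng = random.Random(42)
n = 300
x = [i % 2 for i in range(n)]
m = [4.0 + 1.0 * xi + data_rng.gauss(0, 1) for xi in x]
y = [1.0 + 0.5 * mi + data_rng.gauss(0, 1) for mi in m]

point = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y, n_boot=2000)
print(f"indirect effect = {point:.2f}, 95% CI [{lo:.2f}, {hi:.2f}]")
```

An indirect effect is read as significant when the interval excludes zero, which is how the bracketed intervals in the results are interpreted. The example call uses 2,000 resamples only to keep the run short; the studies report 5,000.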
Results

To test H1 to H3 in one comprehensive model, we again conducted a mediation analysis (PROCESS Model 4; Hayes, 2018; 5,000 bootstrap samples and 95% CIs) with the same setup as in Study 1. Anger, age, and gender were added as covariates. Consumer perceptions were evaluated after the first failure. The empathy message (vs. control) increased perceived warmth (b = 1.22, p < 0.001), and the solution message (vs. control) led to higher competence perceptions (b = 0.49, p < 0.05), supporting H1 and H2. The solution message also increased perceived warmth (b = 0.50, p < 0.05), whereas the empathy message did not increase competence (p = 0.84). Satisfaction was influenced by warmth (b = 0.26, p < 0.001) and competence (b = 0.51, p < 0.001), supporting H3. The indirect effect of the empathy message on satisfaction via warmth was significant (b = 0.31, [0.15, 0.51]) and the indirect effect of the solution message on satisfaction via competence was significant (b = 0.25, [0.02, 0.49]). In our analysis, neither of the two message types had a direct impact on satisfaction.

Next, to examine effects of responses to the second failure (H3, H4, and H5), we used a moderated mediation analysis (PROCESS Model 8; Hayes, 2018; 5,000 bootstrap samples and 95% CIs) and compared the different messages after the second response of the chatbot. The response condition (i.e., second failure vs. successful chatbot answer) was used as moderator, and anger, age, and gender were again covariates. Related to the effects of the mediators on the dependent variables (i.e., H3, H4), results of the mediation model with satisfaction showed that warmth and competence significantly increased post-recovery satisfaction (b_warmth = 0.16, p < 0.001; b_competence = 0.72, p < 0.001), supporting H3. Similarly, when using re-use intentions as dependent variable, warmth and competence significantly increased re-use intentions (b_warmth = 0.24, p < 0.005; b_competence = 0.50, p < 0.001), supporting H4.

Results of the messages after the second failure on the mediators (H5) show that the empathy message still led to perceived warmth (b = 0.94, p < 0.01), whereas the solution message did not lead to higher competence perceptions (p = 0.21). Thus, H5 could be supported. […] 0.31]), whereas the indirect effect of the solution message on satisfaction via competence was not significant (b = 0.25, [−0.19; 0.70]). Similarly, the indirect effect of the empathy message on re-use intentions via warmth was significant (b = 0.23, [0.05; 0.47]), whereas the indirect effect of the solution message on re-use intentions via competence was not significant (b = 0.15, [−0.11; 0.44]). Thus, as hypothesized, the empathy message was found to be more effective than the solution message after the second failure.

In case of a successfully resolved second attempt, the indirect effect of the empathy message on satisfaction via warmth was not significant (b = 0.11, [−0.004; 0.28]). Similarly, the indirect effect of solution on satisfaction via competence was not significant (b = 0.23, [−0.15; 0.60]). Thus, message effects dissolve when the chatbot solved the user’s request.

Fig. 2 Exemplary scenarios

Table 2 Scenario descriptions

Study 1 (Electronics provider)
Dialogue shown in all three scenarios:
[Chatbot]: Hello! I am your digital Assistant from [Camera shop]! How can I help you?
[User]: I am searching for a new camera
[Chatbot]: Sure! Which features are particularly important to you when you look for a camera?
[User]: It should be lightweight and have newest technology. And it should also be affordable
[Chatbot]: Sorry, I did not understand
Solution message scenario, followed by:
[Chatbot]: Please try to formulate your questions or entry as precise as possible. Particularly shorter sentences or words will help me to understand your request better. Thank you!
Empathy message scenario, followed by:
[Chatbot]: Please be patient with me as I am new to this job and have a lot to learn. I really try my best to answer all your questions to your satisfaction. Please bear with me and give me another chance! Thank you very much!
Control: no further message.

Study 2 (Table reservation)
Dialogue shown in all three scenarios:
[Chatbot]: Hello! I am your digital Assistant from Pizza House! How can I help you?
[User]: I would like to reserve a table for next week Tuesday between 12 and 13 h for me and my partner
[Chatbot]: Sorry, I did not understand
Solution message scenario, followed by:
[Chatbot]: Please try to formulate your questions or entry as precise as possible. Particularly shorter sentences or precise words will help me to understand your request better. Thank you!
[User]: [individual entry]
[Chatbot], 2nd failure: Sorry, I did not understand you again. Please formulate your entry as precise as possible. For example, please enter only the date, arrival time and the number of people for the reservation.
Empathy message scenario, followed by:
[Chatbot]: Please be patient with me as I am new to this job and have a lot to learn. I really try my best to answer all your questions to your satisfaction. Please bear with me and give me another chance. Thank you very much!
[User]: [individual entry]
[Chatbot], 2nd failure: Sorry, I did not understand you again. Please be patient with me again as I am new to this job and have a lot to learn. I really try my best to answer all your questions to your satisfaction. Please bear with me and give me another chance. Thank you very much!
Control, followed by:
[User]: [individual entry]
[Chatbot], 2nd failure: Sorry, I did not understand you again
Recovery success (all three scenarios, shown instead of the 2nd failure message):
[Chatbot]: Thank you! I booked a table for Tuesday for flexible arrival between 12:00 and 13:00 for 2 people. Your booking code is 2553. See you soon!

Study 3 (Pizza delivery)
Dialogue shown in all three scenarios:
[Chatbot]: Hello! I am your digital Assistant from Pizza House! How can I help you?
[User]: I would like to order a Pizza Salami to my home
[Chatbot]: Sure! To which address may I deliver your Pizza?
[User]: To Schlösschen Street 12, please [Chatbot-attributed failure] / “To my home please” [User-attributed failure]
Solution message scenario, followed by:
[Chatbot]: Sorry, I did not understand. I do not know the address “To Schlösschen Street 12 please” [Chatbot-attributed failure] / “To my home please” [User-attributed failure]
[Chatbot]: Please try to formulate your questions as precise as possible. Particularly shorter sentences or words will help me to understand your request better. Thank you!
Empathy message scenario, followed by:
[Chatbot]: Sorry, I did not understand. I do not know the address “To Schlösschen Street 12 please” [Chatbot-attributed failure] / “To my home please” [User-attributed failure]
[Chatbot]: Please be patient with me as I am new to this job and have a lot to learn. I really try my best to answer all your questions to your satisfaction. Please bear with me and give me another chance! Thank you very much!
Control, followed by:
[Chatbot]: Sorry, I did not understand

Study 3: Failure attributions

Design and procedure

Study 3 aimed to examine the effect of the recovery messages on post-recovery satisfaction and re-use intentions under different failure attribution conditions, i.e., either chatbot or user was responsible for the failure.
Corre- spondingly, the indirect effect of the empathy message on satisfaction via warmth was significant (b = 0.15, [0.04; Table 3 Scale items and statistics Study 2a: values after first failure; 2b: values after second response Construct name and items Factor loading Study 1 Study 2a/b Study 3 Warmth (Study 1/2a and b/3: α = .80/.95 and .96/.87; CR = .87/.92 and .93/.91; AVE = .69/.80 and .81/.76)   I perceive the chatbot as …    • warm .80 .90/.91 .84    • kind .84 .91/.92 .88    • generous .85 .87/.87 .91 Competence (Study 1/2a and b/3: α = .87/.95 and .98/.90; CR = .88/.90 and .89/.88; AVE = .71/.75 and .72/.71)   I perceive the chatbot as …    • competent .77 .84/.86 .77    • effective .87 .88/.84 .90    • efficient .89 .88/.85 .86 Post-recovery satisfaction (Study 1/2a and b/3: α = .74/.93 and .98/.90; CR = .80/.83 and .86/.82; AVE = .58/.62 and .67/.60)   The interaction with the chatbot service was …    • satisfying .79 .74/.84 .73    • pleasant .64 .81/.79 .81    • good .83 .81/.83 .79 Re-use intentions (Study 2b/3: α = .96/.96, CR = .85/.87; AVE = .65/.70)    • I would use this chatbot again - /.79 .79    • I would use this chat service in my daily life - /.83 .89    • I would order my pizza again with this chatbot - /.80 .83 Electronic Markets (2023) 33:56 1 3 Page 15 of 22 56 Respondents from a German university were recruited via E-Mail distribution lists and randomly assigned to a 3 (mes- sage type: empathy vs. solution vs. control) × 2 (user fault vs. chatbot fault) between-subjects experiment. After excluding eight participants who failed the attention check (i.e., if you read this, please press button 1), the final sample consisted of 249 respondents (MAge: 27 years, SDAge: 14.24, 63% female). As scenario, a pizza delivery case was used (see Table 2), as this case represents another common field for digital plat- forms (e.g., Uber eats, Deliveroo, HelloFresh) and for chatbot services (Li et al., 2020; van Pinxteren et al., 2020). 
As user-fault scenario, the user entered "to my home" as the delivery address, which obviously could not be found in a database. As chatbot-fault scenario, the user entered the address "to Schlösschen Street 12," which a chatbot would be expected to find in a location database. Recovery messages were taken from Study 1 and slightly adapted to fit the failure situation.

Measures. After reading the scenario, participants rated their post-recovery satisfaction, followed by demographics and manipulation and realism checks. Scales were identical to the ones used in Study 1 and Study 2. All scenarios were perceived as realistic (α = 0.81; M: 5.71, SD: 1.26). As manipulation check for failure attribution, respondents rated "who was responsible for the failure," anchored at "user (1)" up to "chatbot (7)." People in the chatbot-fault scenario held the chatbot more responsible for the failure than people in the user-fault scenario (M_Chatbot-fault: 4.69, SD: 1.89 vs. M_User-fault: 3.56, SD: 2.27; t(247) = −4.08, p < 0.001). Moreover, for the message types, respondents in the empathy scenario agreed significantly more strongly that the chatbot asked for their empathy and understanding (M_Empathy: 5.33, M_Solution: 2.90, M_Control: 2.25; F(2, 246) = 130.93, p < 0.001). Similarly, respondents in the solution message scenario perceived more strongly that the chatbot had suggested a solution (M_Solution: 4.34, M_Empathy: 3.10, M_Control: 2.39; F(2, 246) = 26.62, p < 0.001). Again, all scales exhibited adequate validity and reliability (see Table 3). In addition, Table 4 shows the means and standard deviations of the key variables.
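The composite reliability (CR) and average variance extracted (AVE) values in Table 3 can be approximately reproduced from the standardized factor loadings. The following Python sketch assumes the conventional Fornell-Larcker formulas, since the text does not state which formulas were used. It takes the Study 1 warmth and competence loadings from Table 3; the function and variable names are illustrative.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings."""
    return sum(l ** 2 for l in loadings) / len(loadings)


def composite_reliability(loadings):
    """CR: (sum of loadings)^2 / ((sum of loadings)^2 + sum of error variances)."""
    squared_sum = sum(loadings) ** 2
    # Error variance of a standardized indicator is 1 - loading^2.
    error_variance = sum(1 - l ** 2 for l in loadings)
    return squared_sum / (squared_sum + error_variance)


# Study 1 factor loadings as reported in Table 3
warmth_study1 = [0.80, 0.84, 0.85]
competence_study1 = [0.77, 0.87, 0.89]

for name, loadings in [("warmth", warmth_study1), ("competence", competence_study1)]:
    ave = average_variance_extracted(loadings)
    cr = composite_reliability(loadings)
    print(f"{name}: AVE = {ave:.2f}, CR = {cr:.2f}")
```

Under these assumptions the sketch gives AVE = .69 and CR = .87 for warmth and AVE = .71 and CR = .88 for competence, in line with the Study 1 values in Table 3. Because the published loadings are rounded, other constructs may differ from the table in the second decimal.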
Table 4 Descriptives for studies 1, 2, and 3
Numbers represent means (standard deviations)

                              Warmth        Competence    Post-recovery   Re-use
                                                          satisfaction    intentions
Study 1
  Empathy                     4.40 (1.43)   2.66 (1.27)   2.11 (1.03)     -
  Solution                    3.87 (1.30)   3.19 (1.29)   2.50 (1.19)     -
  Control                     3.53 (1.33)   2.66 (1.35)   1.78 (0.79)     -
Study 2, first failure
  Empathy                     4.35 (1.51)   2.88 (1.61)   3.08 (1.48)     -
  Solution                    3.64 (1.49)   3.51 (1.52)   3.18 (1.56)     -
  Control                     3.04 (1.61)   2.78 (1.52)   2.41 (1.39)     -
Study 2, second failure
  Empathy                     3.59 (1.64)   1.80 (1.11)   1.95 (1.14)     1.86 (1.25)
  Solution                    3.28 (1.54)   2.66 (1.75)   2.66 (1.79)     2.23 (1.87)
  Control                     2.57 (1.49)   1.90 (1.32)   1.77 (1.18)     2.17 (1.34)
Study 2, success
  Empathy                     4.67 (1.73)   5.17 (1.50)   5.10 (1.65)     4.63 (1.81)
  Solution                    3.69 (1.55)   5.02 (1.03)   4.64 (1.22)     4.81 (1.36)
  Control                     3.85 (1.84)   4.76 (1.61)   4.81 (1.61)     4.68 (1.69)
Study 3, user-attributed failure
  Empathy                     4.63 (1.41)   3.66 (1.47)   4.49 (1.62)     4.37 (1.89)
  Solution                    3.70 (1.16)   4.54 (1.49)   4.31 (1.53)     4.13 (1.70)
  Control                     3.48 (1.36)   3.79 (1.34)   4.27 (1.65)     4.02 (1.78)
Study 3, chatbot-attributed failure
  Empathy                     4.28 (1.30)   3.21 (1.15)   3.63 (1.22)     3.33 (1.70)
  Solution                    3.53 (1.53)   3.49 (1.72)   3.15 (1.39)     3.46 (1.99)
  Control                     3.06 (1.35)   2.75 (1.67)   2.72 (1.22)     2.38 (1.46)

Results. To test H1 to H3 in one comprehensive model, we used a moderated mediation analysis (PROCESS Model 8; Hayes, 2018; 5000 bootstrap samples and 95% CIs) with the same setup as in the studies above, including age and gender as covariates. Regarding H1 and H2, results confirmed Study 1 and Study 2. Again, the empathy message (vs. control) increased perceived warmth (b = 1.18, p < 0.01), and the solution message (vs. control) led to higher competence perceptions (b = 1.22, p < 0.01), supporting H1 and H2. In addition, results showed that the solution message (vs. control) did not increase warmth (p = 0.51) and the empathy message did not increase competence (p = 0.50).
Satisfaction was influenced by warmth (b = 0.16, p < 0.01) and competence (b = 0.58, p < 0.001), supporting H3 again. Both message types had no direct impact on satisfaction.

Regarding H6a, the interaction of empathy message × failure attribution had no significant impact on warmth (p = 0.64). The indirect effect of the empathy message (vs. control) on satisfaction via warmth was significant in the case of a user-attributed failure (b = 0.19; [0.03, 0.41]) and in the case of a chatbot-attributed failure (b = 0.15; [0.04, 0.31]). Consequently, the index of moderated mediation was not significant (b = −0.04; [−0.23, 0.14]). This means that, irrespective of the failure attribution, there is a mediation effect of empathy on satisfaction via warmth. As a consequence, H6a could not be supported.

However, the situation changes when considering the solution message (H6b). In this case, the interaction of the solution message × failure attribution had a negative impact on competence (b = −1.23, p < 0.05). The indirect effect of the solution message (vs. control) on satisfaction via competence was significant in the case of a user-attributed failure (b = 0.71; [0.32, 1.14]), but not significant in the case of a chatbot-attributed failure (b = 0.06; [−0.31, 0.45]). The index of moderated mediation was significant and negative (b = −0.65; [−1.22, −0.12]). This indicates that the positive (mediated) effect of the solution message through competence on satisfaction is only supported when the failure is attributed to the user. When the chatbot is responsible for the failure, the positive indirect effect is no longer significant. In sum, H6b could be supported.

Finally, to test H4 (a and b), we applied the same moderated mediation model (Model 8) and replaced satisfaction with re-use intentions. Results are comparable to those above. Empathy led to warmth (b = 1.18, p < 0.01) and solution increased competence (b = 1.22, p < 0.01).
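The conditional indirect effects and indices of moderated mediation reported for the satisfaction model can be cross-checked with simple arithmetic. The sketch below assumes that failure attribution is a two-level moderator coded with a one-unit difference and with user-attributed failure as the reference level; this coding is an assumption, not stated in the text. Under that coding, the indirect effect in the reference condition is the product of the message-to-mediator path and the mediator-to-satisfaction path, and the index of moderated mediation equals the difference between the two conditional indirect effects. Variable names are illustrative.

```python
# Path coefficients reported for the Study 3 satisfaction model
a_empathy_warmth = 1.18        # empathy message -> warmth
a_solution_competence = 1.22   # solution message -> competence
b_warmth = 0.16                # warmth -> satisfaction
b_competence = 0.58            # competence -> satisfaction

# Indirect effect in the assumed reference condition (user-attributed failure)
indirect_empathy_user = round(a_empathy_warmth * b_warmth, 2)
indirect_solution_user = round(a_solution_competence * b_competence, 2)

# Reported conditional indirect effects for the chatbot-attributed failure
indirect_empathy_chatbot = 0.15
indirect_solution_chatbot = 0.06

# Index of moderated mediation = difference between conditional indirect effects
index_empathy = round(indirect_empathy_chatbot - indirect_empathy_user, 2)
index_solution = round(indirect_solution_chatbot - indirect_solution_user, 2)

print(indirect_empathy_user, indirect_solution_user)
print(index_empathy, index_solution)
```

The resulting values (0.19 and 0.71 for the user-attributed indirect effects, −0.04 and −0.65 for the indices) agree with the reported point estimates. The bootstrap confidence intervals cannot be reproduced without the raw data.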
Moreover, solution did not lead to warmth (p = 0.51) and empathy did not lead to competence (p = 0.50). In turn, chatbot re-use intentions were influenced by warmth (b = 0.16, p < 0.05) and competence (b = 0.69, p < 0.001), supporting H4 (a and b). The effects of moderated mediation remained comparable to those above: The indirect effects of empathy via warmth on re-use intentions were significant irrespective of failure attribution (b_user-attribution = 0.19, [0.01; 0.46] and b_chatbot-attribution = 0.15, [0.01; 0.33]; index = −0.04, [−0.25; 0.16]), while the indirect effects of solution via competence on re-use intentions were only significant in case of user attribution and not for chatbot attribution (b_user-attribution = 0.84, [0.37; 1.37] and b_chatbot-attribution = 0.07, [−0.38; 0.54]; index = −0.77, [−1.48; −0.13]).

Discussion

As response failures occur frequently during chatbot interactions, recovery strategies are greatly needed to mitigate negative user reactions, avoid financial losses, and assure re-use intentions. This is especially relevant for electronic markets and digital platforms such as Airbnb, Booking, or Uber, as service provision and customer-facing support are part of their key assets. To help answer the question of whether and how recovery messages might support these goals, the present research investigated how people respond to two characteristic recovery messages in chatbot conversations and focused on the mediating role of social cognition. Three experiments in two contexts compared the two characteristic messages, empathy and solution, and identified that these messages trigger social cognitions of warmth or competence (H1 and H2), which positively influence post-recovery satisfaction and chatbot re-use intentions (H3 and H4).

Furthermore, the impacts of situational factors on message effectiveness were analyzed. First, failure frequency influences which message should be preferred (H5).
More precisely, after a double deviation, only the empathy message has a significantly positive effect on warmth, whereas the solution message no longer had a significant effect on competence. In contrast, when the chatbot solved the user request successfully after an initial failure, effects of different recovery messages dissolved. Thus, the final success of a chatbot interaction shifts post-hoc perceptions of the previous recovery messages.

Second, integrating the factor of failure attribution (H6a/b) showed that a solution message is particularly ineffective for user satisfaction after a chatbot-attributed failure (i.e., a mismatch). In this situation, the solution message did not lead to higher satisfaction (mediated via competence). In contrast, in a user-attributed failure situation, people seemed to accept a solution message more, as this message type led to higher post-recovery satisfaction via increased competence perceptions. An empathy message was found to be acceptable for both user- and chatbot-failure attributions. This indicates that an apology and request for understanding is "always possible" and a less critical approach compared to the solution message, and rather preferable when failure attribution remains unclear.

Theoretical contributions

This research responds to scholarly calls for further user-centered investigation of chatbot response failures (Diederich et al., 2020) and provides several theoretical contributions. First, we add to the growing body of research regarding digital agents' conversational design (Crolic et al., 2021; Sands et al., 2021; Song et al., 2022; Weiler et al., 2022). Interactions in electronic markets and particularly digital platforms (e.g., Airbnb, eBay) rise continuously, leading to a parallel increase in demand for effective and efficient customer service (Hein et al., 2020; Suta et al., 2020).
Next to such user-facing platforms, chatbots are also increasingly implemented in corporate applications (e.g., Slack or Microsoft Teams) to facilitate processes and information access (Stoeckli et al., 2020). Thus, as chatbots are increasingly taking over tasks in the digital surrounding and are a major service innovation, an appropriate design of chatbot responses is key for positive customer experiences and firm profitability (Mozafari et al., 2022). This study proposes that message types, when used as a psychological recovery attempt, should be carefully chosen depending on situational factors like failure frequencies or failure attribution. These results offer a more nuanced view on the effectiveness of recovery messages and confirm former studies that stated that chatbot designs should follow human service chat interactions in order to be successful (Belanche et al., 2021; Gnewuch et al., 2018; van Pinxteren et al., 2020).

Second, this research adds to the literature of service failures and recovery, particularly in the domain of digital agents (Chong et al., 2021; Mozafari et al., 2022). With this study, we respond to scholars who have called for an examination of effective recovery strategies to improve users' service experience after chatbot failures (Benner et al., 2021; Janssen et al., 2021; van der Goot et al., 2021). We also complement the findings of Weiler et al. (2022) who examined ex-ante strategies by showing that messages directly after the failure (ex-post) also have a positive effect on re-use intention and thus reduced discontinuance. Moreover, this research complements studies which consider the impact of recovery messages of digital agents (L. Lv et al., 2022a, 2022b; Song et al., 2023). As service delivery by chatbots becomes more widespread, understanding how people respond to chatbot recovery attempts is of crucial relevance to secure service quality and consumer loyalty (Mozafari et al., 2022; Sands et al., 2021).
Supporting findings from related studies (such as Xu & Liu, 2022), our study results show that messages could trigger different social cognitions and achieve their goal of increasing post-recovery satisfaction via different paths. In addition, this study examines several conditions that influence the effectiveness of a particular message. By including failure frequency (i.e., double deviation) and failure attributions in the research design, we illustrated that such dimensions indeed play a role for the optimal message choice. As such, this paper also adds to the scant research around double deviations (Pacheco et al., 2019) and to knowledge of the effects of failure attributions in the field of human–computer interaction. Additionally, results might encourage future related work to incorporate these factors as well.

Third, this study adds to research assessing social cognitions. Only recently have scholars started to assess the perceptions of warmth or competence in relation to digital (conversational) agents (Choi et al., 2021; Han et al., 2021; Kull et al., 2021; McKee et al., 2022; Xu & Liu, 2022). As new technology, such as artificial intelligence or machine learning, further develops, digital agents will interact in more humanlike service interactions and will increasingly imitate human behavior in order to create more favorable user responses. While many studies in this field concentrate on anthropomorphism as visual cues for warmth or competence (e.g., Choi et al., 2021), our research extends insights about text-related cues (Han et al., 2021; Kull et al., 2021). While prior studies focused on effects due to people's reactance (Han et al., 2021) or the impacts of an initial warmth- or competence-related message at the beginning of a conversation (Kull et al., 2021), this study considers post-failure messages and examines how two prototypical messages trigger social cognitions.
Warmth and competence perceptions were found to be the underlying mechanisms of the respective messages on users' post-recovery responses. More precisely, message elements requesting a person's understanding are social-oriented and were perceived as warm, whereas a message which presents a possible solution is task-oriented and was perceived as competent. In turn, both perceptions increased post-recovery satisfaction and re-use intentions. This supports the "computers are social actors" (CASA) paradigm (Nass et al., 1996) and shows that chatbot responses are processed and perceived like human service-agent messages. However, the study also shows that the mediation through social perceptions could be eliminated by external circumstances. For instance, a double deviation (i.e., a chatbot's second non-understanding) removed the mediated effect of solution-oriented messages via competence.

Managerial implications

Results of the three studies provide guidance to both software designers and companies employing chatbots on how to implement chatbot recovery messages as a cost-effective and universally usable tool to mitigate negative service experiences. First, using a dedicated recovery message is beneficial to mitigate users' negative responses after a chatbot failure with only marginal costs for software programming. This research revealed that each message follows a distinct path to increase post-recovery satisfaction, either by driving competence perceptions or warmth perceptions of users. Uncovering these underlying mechanisms helps managers to understand how consumers' responses are formed. In particular, software designers can now formulate precise warmth- or competence-related messages as an effective response to service failures.

Second, across the studies, competence perceptions generally exerted a stronger total effect on satisfaction than warmth.
As the solution message fosters competence perceptions, this message type could therefore be considered a more effective strategy for both product- and service-related contexts. Using the solution message also allows chatbot designers to employ corrective measures to successfully conclude the conversation. However, if the recovery process was successful after the initial failure (i.e., the chatbot successfully resolved the request), the impact of the recovery messages dissolved, as consumers do not seem to care (post-hoc) how they got to this point. Nevertheless, as the likelihood of failure is high, managers and chatbot developers should be encouraged to incorporate one of the two message forms to safeguard against negative effects in case of failure without risking negative effects in case of success.

Third, the analysis of situational factors revealed several insights. When failing twice, the empathy message led to warmth (which acts as a mediator between message and satisfaction), while the solution message did not increase competence (and subsequently did not mediate between message and satisfaction). Regarding the final outcomes, however, the solution message generated higher means for post-recovery satisfaction and re-use intentions than the empathy message (see Table 4). This should be considered by managers when deciding on which message to use. Moreover, when people hold the chatbot responsible for the failure, only the empathy message is preferable. In that case, the solution message had no indirect effect on satisfaction (via competence), while the empathy message had a positive indirect effect on satisfaction. When managers are in doubt about whether the chatbot or user is responsible for the failure, the empathy message is a rather uncritical choice. In sum, our results show that the "solution" message is more effective than the "empathy" message in some situations, while it is the other way round in other situations.
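One way to read these recommendations is as a simple decision rule for choosing a recovery message. The Python sketch below is a hypothetical illustration, not an implementation used in the studies: the names `Attribution` and `select_recovery_message` and the message constants are assumptions, and the message texts are taken from the study scenarios. The rule encodes only the mediation-based findings, that is, empathy when the chatbot is at fault, when attribution is unclear, or after a repeated failure, and solution after a first user-attributed failure.

```python
from enum import Enum


class Attribution(Enum):
    USER = "user"
    CHATBOT = "chatbot"
    UNKNOWN = "unknown"


EMPATHY_MESSAGE = (
    "Please be patient with me as I am new to this job and have a lot to learn. "
    "I really try my best to answer all your questions to your satisfaction. "
    "Please bear with me and give me another chance. Thank you very much!"
)
SOLUTION_MESSAGE = (
    "Please try to formulate your questions as precise as possible. "
    "Particularly shorter sentences or words will help me to understand "
    "your request better. Thank you!"
)


def select_recovery_message(failure_count: int, attribution: Attribution) -> str:
    """Pick a recovery message after a failed chatbot response (hypothetical rule)."""
    if failure_count < 1:
        raise ValueError("failure_count must be at least 1")
    # Repeated failure: only the empathy message still worked via warmth.
    if failure_count >= 2:
        return EMPATHY_MESSAGE
    # First failure attributed to the user: solution worked via competence.
    if attribution is Attribution.USER:
        return SOLUTION_MESSAGE
    # Chatbot at fault or attribution unclear: empathy is the safer choice.
    return EMPATHY_MESSAGE


print(select_recovery_message(1, Attribution.USER))
print(select_recovery_message(2, Attribution.USER))
```

The rule deliberately ignores the descriptive pattern in Table 4, where the solution message showed higher mean satisfaction than the empathy message after a second failure, so the choice after repeated failures remains a judgment call.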
Therefore, managers need to be aware of the type of failure to evaluate failure attributions, and of the failure frequency, in order to adapt the recovery messages accordingly.

More generally, with the fast-paced developments in the field of deep learning and large language models, managers might be tempted to integrate chatbots such as "ChatGPT" in their service processes (Dwivedi et al., 2023). However, unlike most current chatbots (based on natural language processing or simple decision trees), which generally respond with some sort of error message (e.g., "Sorry, I don't know"), ChatGPT generally responds with a text expressing the most likely answer. Based on a vast amount of available text, the algorithm aims to anticipate the highest likelihood of an answer by forecasting what a human would use to reply to the specific request. Thus, instead of acknowledging failure, ChatGPT often "hallucinates," meaning that this kind of chat tool produces information that may be nonsensical, untrue, or inconsistent with the content of the source input (Dwivedi et al., 2023; Ji et al., 2023). In the context of diverse service interactions, such hallucinated responses to user queries pose a significant threat, as service activities are often associated with actions (e.g., customer data, confirmations, bookings, and returns). Therefore, while integrating language processing models such as ChatGPT may be beneficial for service interactions, failure acknowledgment and recovery attempts (e.g., via messages) remain highly relevant for digital service interactions.

Limitations and future research

Although this research offers valuable insights, it also has some limitations. First, our study relied on screenshots of chat conversations to ensure high internal validity. To add external validity to our findings, future research could investigate our framework in the field.
In this vein, scholars could also analyze whether new and more sophisticated bots such as ChatGPT are less prone to service failures, and whether these bots could also integrate more context-aware information to create a more personalized and failure-congruent recovery message. Moreover, longitudinal designs would provide a fuller perspective on the chatbot recovery process and allow investigation of possible long-term effects of chatbot messages.

Second, this study considered failure frequency and failure attribution as two situational factors. Future studies could include additional factors such as the type of product or service associated with the chatbot service. While our studies used a product-related and two service-related cases to partly cover this situational factor, future studies might investigate product- or service-specific features (e.g., simple vs. complex; hedonic vs. utilitarian; low- vs. high-risk). Next to this, future research might also explore the effects of other design elements, such as different message tonalities or recovery feedback elements, in combination with the two message types. For instance, a chatbot could present a message and ask if the information was helpful. Related chatbot studies revealed that even minor adaptations in the conversational design (e.g., response delays or chatbot- vs. user-initiation) have effects on users' satisfaction with the chatbot (Gnewuch et al., 2018; Pizzi et al., 2021).

Third, while our research did not focus on the role of emotions in chatbot failure and recovery, prior research found emotions to influence consumers' reactions in chatbot interactions (Crolic et al., 2021). Future studies should therefore investigate the role of emotions such as anger, frustration, and helplessness in human-chatbot interactions.
Fourth, we used two prototypical messages to measure their effects precisely, neglecting other possible forms or mixtures of messages, or even the combination with other forms of compensation such as vouchers or human interaction, leaving open a fruitful field for future research related to digital agents' conversational design.

Funding Open Access funding enabled and organized by Projekt DEAL.

Data Availability Data is available on reasonable request.

Open Access This article is