How Multimodal Can Jumpstart Voice in Retail

Voice-driven shopping is powerful, but it understandably comes with some skepticism on the consumer side. Yes, we have grown accustomed to a variety of digital retail methods – such as social commerce and mobile app- and web-driven commerce – but in every scenario, consumers are able to see the entire transaction in front of their eyes. Voice removes that visual element, and for many, that causes a measurable level of uncertainty and discomfort.

That said, 22% of Americans who own a smart speaker have actually made a purchase with it, according to Edison Research. Sure, this is a small fraction of the entire market, but there is clearly traction to the concept. While there are several barriers impacting the adoption of voice shopping, the one that can make an immediate difference is the integration of a screen into the experience.

Visuals Build Consumer Confidence

Every time consumers make an online purchase, they have one final chance to review their purchase details before confirming the transaction. If nothing else, this provides peace of mind in knowing the right items are in the cart. However, when purchasing through voice, that same visual confirmation of the cart is typically unavailable. We place a lot of trust in technology, but confidence in voice comprehension is something we’re still getting used to.

Adding a visual component to your voice experience can significantly reduce this unease. Not only does it validate the customer’s voice commands, but it also retains the frictionless quality that makes voice interactions appealing in the first place. With this validation, customers gain the confidence to carry out a transaction and ideally come back for more.


Visuals Improve Shopping Experience

Consumers have shown they’re willing to buy many items, such as consumables, sight unseen, but when it comes to apparel, it can be a tough sell. Unless consumers know well in advance the specific items they want, they often struggle to know exactly what they’ll be receiving. By bringing a screen into the equation, customers can use a brand’s voice component to filter by style, color, size, and more, but use the screen to browse the catalog and verify their selections.

Consumers feel strongly about this ability to blend technology to form a more complete experience. In a recent survey of the global Applause Community, 69% of those with voice-enabled devices reported they would be more inclined to make a voice purchase through a multimodal experience. Consumers are committing to multimodal displays faster than ever before – there was 558% growth in ownership by U.S. adults from January 2018 to January 2019 – so, as a retailer, you don’t have time to waste in delivering the right experience.

Visuals Spur Purchase Frequency

The time is now to provide a quality multimodal experience. Per, smart display owners are 133% more likely to make monthly voice purchases, proving that multimodal experiences are more than a novelty. As the growth in smart display ownership continues, the potential to capture voice-driven revenue will follow suit.

While repeat purchases can be confidently made through any voice device, a multimodal experience offers a greater opportunity for incremental sales. By bringing personalized recommendations back into play, as in traditional ecommerce, you enable consumers to see additional products and make snap decisions as they are wont to do.

Voice represents the next generation of retail, much like mobile commerce did many years ago. As a result, voice adoption by retailers is expected to increase by 127% in the coming year, per Salesforce. While it’s exciting to see this level of adoption, success is no guarantee. Heed the advice and behavior of your customers, though, and you will be one step ahead of your competition.

As we explore the impact of voice on the retail landscape, we will next dive into the world of voice search and how voice engine optimization is the most important piece of the puzzle that you didn’t even know existed.


Emerson Sklar
Tech Evangelist and Solution Architect
