A Microphone, a Speaker, and an Internet Connection Walk into a Bar...

 

 EXPLORING VOICE AI AT SXSW INTERACTIVE 2018

 

Both AI (Artificial Intelligence) and voice interfaces were hot topics at this year’s SXSW Interactive conference. While the technology still falls short of the independently cognitive vision familiar to science fiction fans, progress has been rapid, and opportunity is expanding. Two presentations from this year’s conference, Crafting Conversations: Design in the Age of AI and The Role of Voice in Music Discovery, captured important and notably contrasting approaches to designing for voice-enabled interfaces.

In Crafting Conversations, Google™ Conversation Design Lead Daniel Padgett summarized the foundations of design for voice and highlighted the priorities that guide design practices for Google Home. “I teach robots to talk...” stated Padgett, and he positioned today’s voice interfaces in stark contrast to earlier stages of computer technology in which “we had to learn to speak to the computer in its native language.” Indeed, to anyone who worked with early technologies such as punch cards and command-line inputs, today’s voice-enabled devices must seem almost miraculous.

INDUSTRY TRENDS

Padgett’s views on the growth of Voice AI reveal much about Google’s broader strategy. He stressed the speed and simplicity of voice queries—comparing them to the number of “taps” necessary for text input for even simple searches—as well as the ubiquity of the service. These map well to Google’s core brand values exemplified by its clean white search landing page. He also cited statistics illustrating the category’s growth: 400 million-plus devices running Google Assistant alone, a sales volume of one Google voice-enabled device per second from October 2017–January 2018, and a platform supporting 22 languages (also a core competency for Google).

In The Role of Voice in Music Discovery, SoundHound™ Inc. VP and General Manager Katie McMahon expressed a contrasting view of the state of Voice AI. While Padgett emphasized the evolution of the technology, McMahon framed the current point in Voice AI development as a generational one. She stated that while the year 2000 defined the “Touch-Tap-Swipe generation,” 2015 marked “Gen V: the voice-first generation.” She also identified 2017 as a tipping point in the development of voice-enabled AI, much as 2007 marked the takeoff for mobile-first UX/UI strategies. She noted the coming growth of the category as well, from approximately 4.9 billion devices running Voice AI in 2016 to a projected 21 billion by 2020.

GOOGLE’S SEARCH FOR A VOICE AI DESIGN PROCESS

Google’s approach to designing for Voice AI centers on the cognitive aspects of human conversation. Padgett outlined four broad considerations that explain his team’s design approach for voice.

The first involves modeling conversation. Central to this is the Cooperative Principle developed by Paul Grice in 1975. Grice emphasized four “maxims” that facilitate effective conversation by building cooperation between speakers. Quality (truth), quantity (the right amount), relevance (the topic at hand), and manner (clarity) are all important to keep a conversation engaged. Other linguistic cues, such as turn-taking, questions, silence, and even gesture, also inform our conversations. Indeed, Padgett emphasized the need for clear and concise inquiries to make best use of Voice AI.

 

Google has optimized language processing to a word error rate of 4.9%.

The second consideration is knowing the two speakers involved in a Voice AI conversation. The first is the human. Padgett described the human in the conversation as a “hands-busy/eyes-busy/multitasker.” The team’s personas identify these users as “instant experts” with high standards and low tolerance for error in their use of Voice AI. They are happiest when acting within what they instinctively know and do in conversation. The flip side is the Voice AI itself. Padgett astutely described this as literally “the voice of your brand.” As such, it deserves a specific role and even its own backstory to establish it as a core brand channel.

Google’s third consideration is the toolkit for Voice AI: the nature of the voice signal itself, and the ability to recognize and understand speech. The spoken word, according to Padgett, is both always moving and by nature ephemeral: always fading (best illustrated by a game of Pass It On). Google constructs its Voice AI responses around the ephemeral nature of speech, answering the primary query and then adding prompts to explore further information. And while Google has reduced its word error rate in language processing to 4.9%, work remains to reconcile what someone says with the intuitive interpretation of how they said it.

Lastly, Google considers the expanding ecosystem of technology and communication. It aspires to “design for the overlaps” between voice-only, voice-forward, intermodal, and visual communication and function.

SOUNDHOUND’S EXPANDING VOICE ECOSYSTEM

SoundHound appears to take a more holistic, and perhaps more innovative, approach to the use of Voice AI. From its position as a leader in music discovery, it has developed a self-contained ecosystem for Voice AI-enabled devices and applications. SoundHound continues to focus on music, but has integrated two additional applications: Houndify™, a Voice AI platform, and Hound™, an AI-enabled voice assistant.

The Houndify AI offers two strategic technology advances in voice queries relative to other platforms such as the Amazon Echo™/Alexa™, Apple’s Siri™ or OK Google.

The first of these is a different model of query, described by McMahon as “compound/complex.” Padgett stressed the ideal of simple, concise questions, but that ideal limits utility and remains at the level of “speaking in the computer’s language.” The Houndify AI can handle queries with both inclusions and exclusions. An example would be “OK Hound, find me a restaurant within 3 miles but not a pizza place,” or “find me a flight next week to Chicago but not on United Airlines.” The answers generated by Houndify, while lengthier and more detailed than those of Google’s assistant, are also more specific. This is also a more intuitive manner of voice search: people often know more about what they aren’t looking for when they’re in browsing mode.
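
For readers who want to see what a query with both an inclusion and an exclusion might boil down to once it has been understood, here is a minimal Python sketch. It is only an illustration under stated assumptions: the Restaurant record, the sample data, and the compound_search function are all hypothetical, and nothing here reflects how Houndify itself is built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Restaurant:
    # Hypothetical record used only for this illustration.
    name: str
    cuisine: str
    distance_miles: float


def compound_search(places, max_miles, excluded_cuisines):
    """Keep places within max_miles (inclusion) whose cuisine is not excluded."""
    excluded = {c.lower() for c in excluded_cuisines}
    return [
        p for p in places
        if p.distance_miles <= max_miles and p.cuisine.lower() not in excluded
    ]


# Made-up sample data, not real listings.
PLACES = [
    Restaurant("Luigi's Slice", "pizza", 1.2),
    Restaurant("Taco Norte", "mexican", 2.5),
    Restaurant("Far Away Sushi", "sushi", 7.0),
]

# "find me a restaurant within 3 miles but not a pizza place"
results = compound_search(PLACES, max_miles=3, excluded_cuisines=["pizza"])
print([p.name for p in results])  # ['Taco Norte']
```

The point of the sketch is that the spoken sentence carries two constraints at once. A simple one-shot query would need a follow-up question to rule out the pizza place.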

 

The second tech innovation for Houndify involves what McMahon called “Speech to Meaning.” This integrates the two primary Machine Learning aspects of Voice AI: Automatic Speech Recognition (ASR) and Natural Language Understanding (NLU). By making the two interoperable, Houndify makes the interactions between human and AI more seamless and organic.

SoundHound displays its own penchant for innovation in its use of voice technology. Through UX research, SoundHound discovered that the number one reason users abandoned its platform was that navigating the app was frustrating. Rather than taking a visual UX/UI approach to address this problem, SoundHound looked at voice-driven navigation as a better, more user-focused solution. Now, SoundHound users can search, discover and play music using voice commands instead of clicking, texting, tapping or swiping. The company’s willingness to go beyond an incremental move exemplifies the innovative DNA of a company comfortable in applying the principles of design thinking and agile development.

BRAND VOICES

McMahon echoed Padgett’s endorsement of brand as an important dimension of Voice AI. Here, too, Houndify diverges from Google’s strategy. While Google enjoys the strength of its brand and Android™ ecosystem, Houndify can be adapted by an independent brand to create and integrate Voice AI functionality and applications. Houndify and Hound don’t own a brand of device or system, and thus become an enabling platform, McMahon continued. This makes Houndify a potentially valuable partner for brands that prefer to amplify their own voice through the technology. This open approach offers additional flexibility by being adaptable across devices.

 

Houndify is a potentially valuable partner for brands that prefer to amplify their own voice.

Houndify’s flexibility gives designers and companies an additional dimension of choice to consider when integrating Voice AI. Companies may prefer the brand halo of Alexa, Google, or Siri as an amplifying feature. Or they may see a potential competitive advantage in creating their own Voice AI presence, one that’s unique to their brand.

WHAT’S NEXT

Padgett indicated that Google’s strategy addresses both the static placement of in-home smart speakers and mobile devices. Each has unique operating conditions, levels of privacy, and utility for the user. Google’s expansion into smart displays (also being developed by Amazon™, Panasonic™, and others) also tips its hand. It’s clear that the company sees an integration of voice and visual browsing ahead, particularly in the home environment.

Padgett also emphasized the need for better use of linguistics, creative writing, and script writing as part of the UX toolkit for voice. McMahon countered that “with little or no UI, systems need to become smarter.” It is clear that this portends an advantage for the systems best able to automate Machine Learning and expand AI capabilities.

These are still early days for Voice AI, and while there are early leaders, it seems that there is still ample time to develop best practices and claim leadership in multiple markets.

 


Contact Tom Berno directly at tb.idea21@gmail.com for more information.

Strategic convergence comes to big technology brands.

Modernist, sans-serif typography makes the logotypes of Google, Spotify, and Pinterest nearly identical.

It's a given in brand communications that differentiation (i.e., standing out from the crowd) is a top priority. Yet a number of the largest tech companies show a clear trend toward sameness in their approach to brand logotypes. In a recent post on its Co.Design web portal, Fast Company highlights this trend in the article "Why Do Google, Airbnb, And Pinterest All Have Such Similar Logos?"

The article identifies a number of possible explanations, which undoubtedly have merit. Among those is the observation that a logo no longer equals the brand. This is definitely true, although hardly a new insight. Marty Neumeier made this idea a central pillar of his seminal book on brand building, The Brand Gap. Other contributing factors identified by the various experts quoted in the article include a necessary simplicity, unity across UI elements, and a focus on the broader visual programs each brand creates. "So much of the identity now is defined by a lot of elements and experiences that surround the logo, that are supporting it," stated Howard Belk, co-CEO at Siegel + Gale. It is also true that each brand identity integrates a symbol, or in Google's case, a monogram, that helps differentiate the brand.

However, the article does not go further into examining the underlying condition: that strategic convergence in big tech brands is rampant. Strategic convergence occurs when a significant number of players in an industry or market establish similar strategic approaches. In the case at hand, strategic convergence is evident in the visual approach to typography in brand identity. Indeed, modernist, sans-serif typography makes the logotypes of Google, Spotify, and Pinterest nearly identical, and the new Airbnb logotype follows suit.

The extended look and feel of the Airbnb brand system creates its unique personality.

When an industry arrives at a state of strategic convergence (often described as "best practices"), it creates barriers to innovation, as more players seek to adopt the strategy choices of others. It's ironic that many of these companies, while they enjoy reputations as being innovative or disruptive, found such a similar approach desirable.

One possible explanation is less perceived risk; the success of the approach for key companies makes it appealing on its own. Drilling down further, by reflecting the design of logotypes from iconic companies like Google et al., a new entity acquires just a little of the former's brand halo. The similar appearance of one company to another transfers just a little of the established company's trustworthiness, another brand imperative along with differentiation. Again from the Fast Company article: "All these bold and neutral logos are telling the consumer the same message: Our brand and our services are simple, straight-forward, and clear. And extremely readable," said Thierry Brunfaut, creative director and founding partner at Base Design.

It should also be said that there is no evidence that any of these companies made a deliberate decision to emulate another. One of the conditions that defines strategic convergence is that players frequently arrive at the same or similar conclusions independently. Nor are these conclusions limited to issues of visual brand design.

There is also another, more urgent risk to any company residing in a state of strategic convergence: those companies are ripe for disruption. By failing to look deeper at ways to create a unique identity and personality, companies leave a host of potential competitive advantages on the table. The "bold, neutral" approach above may be simple, and even effective, but mostly falls short of the sustainable advantage enjoyed by truly iconic brands, via their associated identities.

It is clear that thoughtful design can meet the baseline requirements for design compatible with 21st-century technology and an appropriate level of simplicity while remaining distinctive. One example of design that, intentionally or not, defies the strategic convergence of technology branding comes from Medium. The selection of a serif font rather than a sans serif immediately distinguishes the brand in a unique and memorable way.

Medium's visual brand standards illustrate an approach that counters the status quo.

Companies should always deeply examine their goals and purposes, and be vigilant to avoid slipping into the false comfort of strategic convergence. When it comes to brand identity, moving away from the current status quo offers a much greater prospect of building a more compelling and dynamic personality. This offers the best opportunity for a company to truly own its market and insulate itself from encroaching disruption.

Contact Tom Berno to continue this discussion and find out more about how your brand can evolve beyond its competition.

Does David Ogilvy still matter?

One of the most valuable things about Twitter—if one uses it right—is that it's the ultimate RSS feed. Inevitably, compelling content comes right into one's feed. So it was with an interesting Point/Counterpoint (H/T @TheDrum) on the legacy of advertising giant David Ogilvy.

In two posts, different authors take opposing positions on the continuing relevance of Ogilvy. In the first, David Baldwin (Founder of the Raleigh, NC agency Baldwin& and author of The Belief Economy) argues that Ogilvy's methods have become irrelevant. In our times, he argues, media has "become more personal" and people, particularly millennials, value brands that express and act on beliefs that reflect their own priorities. Furthermore, he dismisses Ogilvy's time as one in which advertisers didn't face the myriad mobile, social and interactive media channels that proliferate today. More succinctly, Baldwin states that "the world David Ogilvy inhabited no longer exists."

This is striking language, and also familiar. Fast Company magazine—in a seminal article on the future of advertising from 2010—addressed those who believed they could preserve the traditional strategies and tactics of "Big Idea" advertising:

“This is a holdover from 20th-century marketing,” says Brian Collins, a former Ogilvy exec who now runs an innovation consultancy. “People who think that way are supremely well equipped to work in a world that no longer exists.”

Note the pedigree of the source.

JP Hanson (Chief Executive of brand agency Rouser) responded aggressively. He stated that the singular purpose of marketing has not changed since Ogilvy's heyday: to sell.

Hanson questions the purpose-based view of branding, as well as its real relevance to consumers, stating it is overrated. He opined that the focus on purpose and values has more to do with brand marketers feeling discomfort with the idea that "selling" is their true purpose. Successful modern brands, he continues, enjoy preferred status because of positioning, rather than values or purpose.

These are interesting points, but both authors overlook more relevant factors: two major trends have altered brand communication since Ogilvy's time, while a third pillar has remained central.

Customers prefer to choose how and when they engage with a brand

The first trend is a change in the nature of branding that directly affects marketing. Branding is no longer simply a matter of selling products or services. It is a matter of journeys and experiences. Every interaction with a brand defines its relevance and value, and the total experience determines if people will trust the brand with continued patronage, and even advocacy. International ad conglomerate WPP states that, while experiences and journeys are central to modern branding and marketing, these are not linear matters. Customers prefer to choose how and when they engage with a brand, and the possible number of journeys is myriad. Brands must connect with consumers along all points of the experience to maximize impact and build preferred trust. Noted author and brand consultant Denise Lee Yohn adds: "Eventually, Apple may no longer exist as a product company, or even a technology company, but an experience company."

The second trend reflects our current social and mobile media age. But beyond the adoption of these technologies lies a more central issue: the traditional advertising/marketing model of sender > message > receiver is no longer relevant. Communication for brands is now circular, and customer participation is as important as the brand's.

Ogilvy's ongoing relevance has to do with whether his approaches intersect with new realities

McKinsey published an article summarizing many of these new factors. Important insights included:

"In today’s decision journey, consumer-driven marketing is increasingly important as customers seize control of the process and actively “pull” information helpful to them. Our research found that two-thirds of the touch points during the active-evaluation phase involve consumer-driven marketing activities, such as Internet reviews and word-of-mouth recommendations from friends and family..."

Yet for all the changes in the nature of branding, marketing, communication and technology, one consistency has remained: the premium on emotional connection between brands and people. WPP states that "Consistent emotions deliver consistent brand experiences." Harvard Business Review recommended brands define emotional connection as the "true North" of their strategy, and that a focus on delivering emotionally-relevant experiences at key points in the customer journey increased both customer satisfaction and ROI.

Ultimately, Ogilvy's ongoing relevance as a creative or strategy guide has less to do with changes in technology or shifting priorities in younger generations, and more to do with whether his approaches, and which of them, intersect with new realities. It's highly unlikely that a creative who drove many successful campaigns over time could not have dealt with new conditions. The better question may be how one might apply Ogilvy's techniques and views to experiences and journeys, to a circular communications model, and to engagement at an emotional level. Equally important: what would make those approaches the appropriate, valid choice in application?