See Inside January 2012

How Siri Makes Computers (and Coders) More Human

How much personality do we want from our gadgets?



Illustration by Thomas Fuchs

The most buzzed-about new feature in the latest iPhone is Siri, the virtual minion. You can give her an amazing range of spoken commands, without any training or special syntax, and marvel as she does your bidding.

You can say, “Call my assistant” or “Wake me up at eight” or “Make an appointment with Dr. Woodward for Friday at 2 p.m.” You can say, “How do I get to the airport from here?” or “Play Taylor Swift” or “When I get to the office, remind me to file the Smithers report.” You can ask her how many fluid ounces there are in a liter or the distance to Mars or when George Washington was born.

In each case, Siri briefly contacts Apple’s servers and then responds in a calm female voice, simultaneously displaying the information you requested.

It didn’t take long, though, for Internet wiseacres to start asking her questions with less concrete answers—and marveling at her witty, sometimes snarky replies.

You: “Siri, I love you.” Siri: “That’s sweet, David. Now can we get back to work?”

You: “What’s the meaning of life?” Siri: “I can’t answer that now, but give me some time to write a very long play in which nothing happens.”

You: “Open the pod bay doors, Siri.” Siri: “I’m sorry, David, I’m afraid I can’t do that. [Pause] Are you happy now?”

Siri is a breakthrough in voice control, sure, but she’s also a breakthrough in computerized personality. The question is: Do we want our gadgets to have personality?

Programmers and designers have always struggled with that question. The creators of every operating system have had to come up with a consistent syntax for communicating with people. Over the years various companies have flitted uncertainly from one philosophy to another.

Until Siri came along, Apple’s software had always avoided personal pronouns such as “I” and “you.” The result: some awkward passive-voice snarls like “The document could not be opened because it could not be found.”

Microsoft’s dialog-box English not only favors the passive voice, but it’s usually aimed at programmers, not humans: “SL_E_CHREF_BINDING_OUT_OF_TOLERANCE: The activation server determined that the specified product key has exceeded its activation count.” Ah, of course!

Citibank’s automated-teller machines lie at the opposite end of the Emily Post spectrum. They take the “I”/“you” personal approach to an extreme. “Hello. How may I help you?” says the welcome screen. When you sign off, you get, “Thank you. It’s always a pleasure to serve you.” These machines even try to take the blame for your own dumb mistakes: “I’m sorry, I don’t recognize that password.”

Now, deep down—actually, not that far down—we all know that our computers are not really engaging us; every utterance they make was written by a programmer somewhere. So why do the software companies even bother? If everyone knows it’s just a trick, should we even care how personable our machines are?

Yes, we should.

The designers’ intention, no doubt, was to make their machines more user-friendly by simulating casual conversation with fellow humans. But there’s a side effect of that intention: in trying to program machines that speak like people, the programmers are forced to think like people.

In Citibank’s case, writing messages in that second-person conversational style forced the engineers to put themselves in the mind-set of real humans. You can’t write an “I” statement for your ATM without also considering the logic, the terminology and the clarity of those messages. Someone writing in that frame of mind would never come up with “The activation server determined that the specified product key has exceeded its activation count.”
