
Game on:
Microsoft's answer to the Wii, called Natal, responds to gestures of players, who in this game must block a barrage of balls.
Image: Courtesy of Microsoft
Editor's note: The online version of this story was posted on January 7.
When Nintendo’s Wii game console debuted in November 2006, its motion-sensing handheld “Wiimotes” got players off the couch and onto their feet. Now Microsoft hopes to outdo its competitor by eliminating the controller altogether: this past January it revealed details of Project Natal, which will give Xbox 360 users the ability to manipulate on-screen characters via natural body movement. The machine-learning technology will enable players to kick a digital soccer ball or swat a handball simply by mimicking the motion in their living room.
Microsoft, which announced its ambitious Xbox upgrade plan in June 2009, has not set a release date, but many observers expect to see Natal at the end of the year. It will consist of a depth sensor that uses infrared signals to create a digital 3-D model of a player’s body as it moves, a video camera that can pick up fine details such as facial expressions, and a microphone that can identify and locate individual voices.
Programming a game system to discern the almost limitless combinations of joint positions in the human body is a fearsome computational problem. “Every single motion of the body is an input, so you’d need to program near-infinite reactions to actions,” explains Alex Kipman, Microsoft’s director of innovation for Xbox 360.
Instead of trying to preprogram actions, Microsoft decided to teach its gaming technology to recognize gestures in real time, just as a human does: by extrapolating from experience. Jamie Shotton of Microsoft Research Cambridge in the U.K. devised a machine-learning algorithm for that purpose. It also recognizes poses and renders them in the game space on-screen at 30 frames per second, a rate more than sufficient to convey smooth motion. Essentially, a Natal-enhanced Xbox will capture movement on the fly, without the need for the mirror-studded spandex suit of conventional motion-capture approaches.
Training Natal for the task has required Microsoft to amass a large amount of biometric data. The firm sent observers to homes around the globe, where they videotaped basic motions such as turning a steering wheel or catching a ball, Kipman says. Microsoft researchers later laboriously selected key frames within this footage and marked each joint on each person’s body. Kipman and his team also went into a Hollywood motion-capture studio to gather data on more acrobatic movements.
“During training, we need to provide the algorithm with two things: realistic-looking images that are synthesized and, for each pixel, the corresponding part of the body,” Shotton says. The algorithm processes the data and changes the values of different elements to achieve the best performance.
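As a rough illustration of what Shotton describes, the toy sketch below pairs each pixel with a body-part label and then adjusts a single value until the labels are reproduced as well as possible. It is not Microsoft's algorithm: the one-threshold rule, the depth readings, the two labels and every name in it are assumptions made up for this example.

```python
# Hypothetical illustration: every name and number here is invented.
from typing import List, Tuple

# One training example: a pixel's depth reading (in meters) and the
# body-part label supplied for that pixel.
Sample = Tuple[float, str]


def accuracy(samples: List[Sample], threshold: float) -> float:
    """Fraction of pixels labeled correctly by a one-parameter rule:
    pixels nearer than `threshold` are 'hand', all others are 'torso'."""
    correct = 0
    for depth, label in samples:
        predicted = "hand" if depth < threshold else "torso"
        correct += predicted == label
    return correct / len(samples)


def fit_threshold(samples: List[Sample], candidates: List[float]) -> float:
    """Try each candidate value and keep the one that scores best."""
    return max(candidates, key=lambda t: accuracy(samples, t))


training_pixels = [
    (1.2, "hand"), (1.3, "hand"), (1.4, "hand"),
    (1.9, "torso"), (2.0, "torso"), (2.1, "torso"),
]
best = fit_threshold(training_pixels, [1.0, 1.5, 2.5])
```

A real system would tune many values at once over far more labeled pixels, but the loop is the same in spirit: score a setting against the labels, then keep whatever performs best.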
To keep the amount of data manageable, the team had to figure out which data were most relevant for training. For example, the system doesn’t need to recognize the entire mass of a person’s body, but only the spacing of his or her skeletal joints. After whittling down the data to the essential motions, the researchers mapped each unique pose to 12 models representing different ages, genders and body types.
The end result was a huge database consisting of frames of video with people’s joints marked. Twenty percent of the data was used to train the system’s brain to recognize movements. Engineers are keeping the rest in a “ground truth” database used to test Natal’s accuracy. The better the system can recognize gestures, the more fun it will be to play the game.
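The split described above can be sketched in a few lines. This is a minimal example under stated assumptions: the frame IDs, the random shuffle and the function name `split_frames` are hypothetical, and only the 20 percent training share comes from the article.

```python
import random
from typing import List, Sequence, Tuple


def split_frames(frames: Sequence[int], train_fraction: float = 0.2,
                 seed: int = 0) -> Tuple[List[int], List[int]]:
    """Shuffle labeled frames, then set aside `train_fraction` of them for
    training and hold out the rest as ground truth for testing."""
    shuffled = list(frames)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


frame_ids = list(range(100))  # stand-ins for 100 labeled video frames
train, held_out = split_frames(frame_ids)
```

Keeping the held-out frames away from training is what makes the accuracy test meaningful: the system is scored on poses it has never seen.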
Of course, Microsoft is not the only company exploring gestural interfaces. Last May, Sony demonstrated a prototype unit that relies on stereo video cameras and depth sensors that, it says, could be used to control a computer cursor, game avatar or even a robot. Canesta, a company that makes computer-vision hardware, has demonstrated a system that lets couch potatoes control the TV with a wave of the hand and has partnered with computer manufacturers Hitachi and GestureTek to create gestural controls for PC applications.






6 Comments
I'm a math student and am really annoyed at seeing expressions like "almost limitless" and "near-infinite". No number, however big, is more limitless or infinite than any other.
I love my Wii, and I have to say that holding that Wiimote in my hand makes slashing at the bad guys, swinging at golf balls and batting at badminton birdies way more fun.
How would this system do a shooter? I'd feel pretty silly if I had to form a little gun with my thumb and forefinger. I think the technology is cool, and I'd love for my Wii to have something like this, and it opens up all kinds of options, but I still want something in my hand. Combining the 2 technologies would be fantastic.
@Derick, I agree. Combining the two technologies would be brilliant. As to your sword/gun in hand dilemma, I don't see why Microsoft would be unwilling to provide props for gameplay, such as swords (although I suppose you could improvise that with even a handle to something or a stick), steering wheels, and guns.
I wonder if, in keeping with their thinking on the brilliant "Surface" project, this will require a gymnasium to store it in and operate.
Haven't you guys seen "Gamer"?
Yes, Gamer was very futuristic and the world had a lot of convicts apparently. But 1 point I extracted from your comment may be the fact that the FAT American chick playing xbox3millioon couldn't move out of her chair even if she wanted to. So I have no idea why macsoft* would ever bother going down the out-of-sofa experience way. I guess they aren't bothered about losing the American market.
*macsoft ... Microsoft bought McDonald's in the future, and now you can order your pies from the comfort of your armchair while playing dragonslayer 2065. In fact you get a free pie T-livered** when you hit 1000, 5000 and 10000 points.
**T-livered (or tele-delivered) is the new teleport-from-your-TV technology... push what you want on the TV and bang, there it is in the box next to you.
Unfortunately, a few years later the technology was hacked by a 16-year-old American kid, later described by the FBI as the largest kid in the world, who had fed his entire town for several months before they caught him.
Witnesses wondered why it took them several months to catch him ... and wondered how fast his chair was travelling.
:)