May 6, 2002

Autonomic Computing

Programs crash, people make mistakes, networks grow and change. That's life, and computer scientists are finally building systems that can deal with it

By W. Wayt Gibbs   

 

Computer hardware increases in speed and capacity by factors of thousands each decade; computer software piles on new features and fancier interfaces nearly as fast. So why do computers still waste our time and drive us crazy?


One quarter of those under age 25 polled in a recent British survey said they had kicked their computers or seen friends do so. And the cost of sophisticated networked systems (on which nearly all large organizations are coming to depend) is now dominated not by ever-cheaper hardware and software but by the rising salaries of the gurus who can keep it all up and running. According to a study published in March 2002 by researchers at the University of California at Berkeley, labor costs outstrip equipment costs by factors of three to 18, depending on the type of system. One third to one half of the total budget is spent preventing or recovering from crashes. And no wonder: a system failure at a brokerage or credit-card authorization center can run up millions of dollars per hour in lost business.

Computer Crisis

"There is no less than a crisis today in three areas: cost, availability and user experience," says Robert Morris, director of IBM's Almaden Research Laboratory. At a conference in Almaden, Calif., last month, research leaders from most of the largest computer companies and several universities agreed on the problem as it was sketched out in a "manifesto" released last October by IBM. "The growing complexity of the I.T. infrastructure threatens to undermine the very benefits information technology aims to provide," the anonymously authored manifesto asserts. The sheer number of computer devices is forecast to rise at a compound rate of 38 percent a year; most of these devices will be connected to one another and to the Internet. "Up until now, we've relied mainly on human intervention and administration to manage this complexity," the manifesto continues. "Unfortunately, we are starting to gunk up the works."

There is less agreement on the solution. IBM argues in its treatise that the goal should be "autonomic" computer systems analogous to the involuntary nervous system that allows the human body to cope with environmental change, external attack and internal failures. "Our bodies have great availability," Morris observes. "I have soft errors all the time: my memory fails once in a while, but I don't 'crash.' My whole body doesn't shut down when I cut a finger."

Morris and the other heads of IBM's autonomic computing research effort have more in mind than just fault tolerance. The manifesto lists eight defining characteristics (right) of autonomic computing systems. Some have already been demonstrated in prototypes.

An autonomic system must have a sense of self, for example. It must keep track of its parts, some of which may be borrowed from or lent out to other systems. And it must keep its public and private parts segregated. At Columbia University, Gail Kaiser and colleagues in the Programming Systems Lab have worked out ways to add software probes, gauges and configuration controls to certain kinds of existing systems so that they can be monitored, tuned and even repaired automatically rather than by highly paid engineers.
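To make the idea of probes, gauges and configuration controls concrete, here is a minimal Python sketch. It is not the Columbia group's software; every name in it (WorkerPool, Gauge, probe_queue_length, tune) and every threshold is a hypothetical stand-in chosen for illustration. A probe reads one raw number from a running system, a gauge smooths recent readings, and a control nudges one setting when the gauge drifts out of range.

```python
from collections import deque
from statistics import mean


class WorkerPool:
    """Hypothetical stand-in for an existing system with one tunable setting."""

    def __init__(self, workers=2):
        self.workers = workers
        self.queue_length = 0  # raw state that a probe can observe


def probe_queue_length(pool):
    """Probe: read one raw measurement without changing the system."""
    return pool.queue_length


class Gauge:
    """Gauge: interpret raw readings as an average over a sliding window."""

    def __init__(self, window=3):
        self.readings = deque(maxlen=window)

    def record(self, value):
        self.readings.append(value)

    def value(self):
        return mean(self.readings) if self.readings else 0.0


def tune(pool, gauge, high=10.0, low=2.0, max_workers=8):
    """Control: adjust the worker count by one step based on the gauge.

    The thresholds are illustrative assumptions, not measured values.
    """
    load = gauge.value()
    if load > high and pool.workers < max_workers:
        pool.workers += 1
    elif load < low and pool.workers > 1:
        pool.workers -= 1
    return pool.workers
```

In use, a monitoring loop would set or observe pool.queue_length, call gauge.record(probe_queue_length(pool)) and then tune(pool, gauge) on each tick. Keeping the three roles separate is what would let this kind of instrumentation be bolted onto a system that was not designed for it.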

Autonomic systems should also be able to heal, to recover from damage by some means other than a suicidal crash. Armando Fox and co-workers at Stanford University have demonstrated one way to accomplish this. Fox redesigned a satellite ground station system so that every subsystem can be rebooted independently if--or rather, when--it gets knocked offline. The system still goes down occasionally, but now it can resume operation in six seconds rather than 30. The same principle, called recursive restartability, could be applied to many kinds of complex systems to prevent small glitches from accumulating and cascading into full-blown outages.
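The principle can be sketched in a few lines of Python. This is an illustration only, not the Stanford ground station design: the Component class, the recover function and the component names are all hypothetical. The system is modeled as a tree, and recovery restarts the smallest subtree that contains a failure while leaving healthy siblings untouched.

```python
class Component:
    """Hypothetical node in a restart tree; children restart with their parent."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.healthy = True
        self.restarts = 0  # how many times this component has been rebooted

    def fail(self):
        """Simulate this component being knocked offline."""
        self.healthy = False

    def restart(self):
        """Reboot this component and everything beneath it."""
        for child in self.children:
            child.restart()
        self.healthy = True
        self.restarts += 1


def recover(node):
    """Restart only the smallest failed subtrees; return their root names."""
    if not node.healthy:
        node.restart()
        return [node.name]
    restarted = []
    for child in node.children:
        restarted.extend(recover(child))
    return restarted
```

For example, in a tree where a "station" root holds "radio" (with a "tuner" child), "tracker" and "logger", a failed tuner causes recover to reboot the tuner alone. A failed radio reboots the radio and its tuner, while the tracker and logger keep running. A fuller version would presumably escalate to the parent when a restart does not cure the fault; that step is omitted here.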



© 1996-2009 Scientific American Inc. All Rights Reserved. Reproduction in whole or in part without permission is prohibited.