Clay Shields, a professor of computer science at Georgetown University, explains.
Computers crash because of errors in the operating system (OS) software or errors in the computer hardware. Software errors are probably more common, but hardware errors can be devastating and harder to diagnose.
A variety of hardware components must function correctly in order for a computer to work. These components, like many things, age over time and can develop faults. Unfortunately, these faults are often transient, and can be hard to diagnose because they do not appear consistently. The system power supply can fail in this manner. Normally a computer's power supply converts alternating current to clean direct current. If it starts to fail, the computer can crash unexpectedly when the power supply generates a noisy signal. The random access memory (RAM) can also fail in an intermittent way, particularly if it gets hot. Because the values stored in RAM are corrupted unpredictably, the result is random system crashes. The central processing unit (CPU) can also be the source of crashes due to excessive heat. The (often loud) fans on most common computers are there to prevent this type of crash, though they may eventually fail. The fans that bring cooling air into the case also carry dirt and dust inside. This dirt can accumulate and cause intermittent short circuits as it blows around. Fortunately, compressed air or a vacuum cleaner easily gets rid of the dirt. Still other hardware problems that can cause crashes are trickier to identify and require software tests or sequential replacement of components.
More permanent faults happen with errors on a computer's disk. Each disk stores information in units called sectors. Most new disks come with some bad sectors that arise in the manufacturing process and are marked at the factory. Makers expect this and include ample additional sectors to replace the defective ones. Sectors can go bad later, however, and lose the information stored on them. If these sectors happen to hold system information, they can cause a crash. Worse, a disk can fail completely when the computer gets jarred and the head that reads information makes contact with the disk surface. This may cause all data on the disk to be lost.
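The idea of replacing defective sectors with spare ones can be illustrated with a toy model. This is only a sketch under simplifying assumptions, not how any real drive firmware works: the `Disk` class, its method names, and the choice to place spares just past the normal address range are all hypothetical.

```python
class Disk:
    """Toy model of a disk that remaps bad sectors to spares (illustrative only)."""

    def __init__(self, num_sectors, num_spares):
        self.num_sectors = num_sectors
        self.data = {}    # physical sector number -> stored payload
        self.remap = {}   # logical sector number -> spare sector number
        # Assumption: spare sectors sit just past the normal address range.
        self.free_spares = list(range(num_sectors, num_sectors + num_spares))

    def _physical(self, sector):
        """Translate a logical sector number into the sector actually used."""
        if not 0 <= sector < self.num_sectors:
            raise IndexError("sector out of range")
        return self.remap.get(sector, sector)

    def write(self, sector, payload):
        self.data[self._physical(sector)] = payload

    def read(self, sector):
        # Returns None when nothing (or nothing readable) is stored there.
        return self.data.get(self._physical(sector))

    def mark_bad(self, sector):
        """Retire a failed sector: its contents are lost, and later
        reads and writes are redirected to a spare."""
        if not self.free_spares:
            raise RuntimeError("no spare sectors left")
        failed = self._physical(sector)
        self.data.pop(failed, None)  # the information on the bad sector is gone
        self.remap[sector] = self.free_spares.pop(0)
```

In this model, marking a sector bad loses whatever it held, which mirrors the point above: if that sector stored system information, the remapping keeps the disk usable but does not bring the data back.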
Although crashes caused by hardware are possible, most computer crashes are caused by errors in the OS software. The OS does more than provide an interface for the user to operate the computer. It also provides a consistent interface between applications and the hardware, and shares system resources among different programs. As a result, a number of errors can occur. Perhaps the most common is a glitch that arises when the OS tries to access an incorrect memory address, often as a result of a programming error. In Windows, this can lead to an error known as a General Protection Fault (GPF). Other errors drive the OS into an infinite loop, in which the computer executes the same instructions over and over without hope of escape. In these cases, the computer might seem to "lock up": the system doesn't crash, but is no longer responsive to input and needs to be reset. Still other problems result when a bug allows information to be written into a memory buffer that is too small to accept it. The additional data "overflows" out of the buffer and overwrites information in memory, corrupting the OS state. These same errors can occur in application programs. Newer OSs are robust against application crashes, but in older systems application bugs can affect the OS and cause a system-wide crash. Modern operating systems are carefully tested and tend to be relatively stable, but drivers that are added to the OS to allow the use of additional devices such as printers may not be, and are often the source of crashes. This is why most modern OSs allow for a special boot mode that disables loading drivers. The drivers can then be added one at a time to determine which one causes the error.
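The buffer overflow described above can be sketched with a small simulation. Python itself guards against this kind of error, so the example models memory as a flat `bytearray` in which an 8-byte buffer sits directly in front of some stand-in "OS state." The layout and the function names (`make_memory`, `unchecked_write`, `checked_write`) are assumptions made for illustration, not part of any real operating system.

```python
BUFFER_SIZE = 8  # hypothetical size of the input buffer, in bytes


def make_memory():
    """Return 16 bytes of simulated memory.

    Bytes 0-7 are the input buffer; bytes 8-15 stand in for OS state.
    """
    memory = bytearray(16)
    memory[BUFFER_SIZE:] = b"OS_STATE"
    return memory


def unchecked_write(memory, data):
    """Copy data into the buffer without checking its length (the bug).

    Anything past BUFFER_SIZE spills into the neighboring OS state.
    """
    for offset, byte in enumerate(data):
        memory[offset] = byte


def checked_write(memory, data):
    """Copy data into the buffer, refusing input that does not fit."""
    if len(data) > BUFFER_SIZE:
        raise ValueError("input is larger than the buffer")
    memory[:len(data)] = data
```

Writing 12 bytes with `unchecked_write` fills the buffer and then overwrites the first 4 bytes of the simulated OS state, while `checked_write` rejects the same input and leaves the state intact. The length check is the essential difference between the two.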


