File Systems Gone Bad
Copyright(c) Management Analytics, 1995 - All Rights Reserved
Copyright(c), 1990, 1995 Dr. Frederick B. Cohen - All Rights Reserved
Problem:
A little while ago, a cleaning lady sprayed one of my disk
drives with cleaning solution, and very nearly caused a disk crash. In
fact, there were many transient errors, and the system reported a disk
crash, but through some miracle, the system did not go down. I did an
immediate backup onto tape, and all was well (whew!). The greatest fear
of the computer user is not death by fire; it is a disk crash.
Under UNIX, file systems are not completely stored on the
disk. In order to enhance performance, many systems keep portions of
the file system in a memory cache area. As a result, if the system is
simply turned off, there may be an inconsistent state stored on the
disk. Fortunately, UNIX file systems normally contain enough
redundant information to recover from many such problems. The recovery
process is normally performed as an automatic consequence of bootup disk
checks, but in some cases systems administrators have to clean up disks
during normal operation.
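
The caching behavior described above can be illustrated with a minimal
sketch in Python, assuming a POSIX-style system. The function name
durable_write and the file name demo.txt are hypothetical and chosen
only for illustration. The sketch shows a program asking the system to
push its cached changes out to the disk instead of leaving them in
memory, where a sudden power-off could lose them.

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path, then ask the OS to push them out to the disk.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # move Python's own buffer into the kernel cache
        os.fsync(f.fileno())  # ask the kernel to write its cached blocks to the device


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.txt")
    durable_write(demo_path, b"important data\n")
    # os.sync() requests a flush of all cached file-system changes;
    # it is only available on UNIX-like systems, hence the guard.
    if hasattr(os, "sync"):
        os.sync()
```

Even with explicit flushing, a failure in the middle of a write can
still leave work for the bootup disk checks, so this narrows the window
of exposure without closing it.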
Prevention:
File-system failures cannot be completely prevented, but
there are some important techniques to help reduce the rate of
occurrence. The most common fault leading to a file-system failure is
a power failure. Because most UNIX systems cache file-system
changes to enhance performance, a power failure at the wrong time may be
catastrophic. The best defense is an uninterruptible power supply
(UPS). In my facility, we experience power failures or serious
fluctuations more than 20 times per year. Without the UPS, we would
have massive problems, but with the UPS, we haven't lost file
information in over 15 years of timesharing under UNIX.
Detection:
File-system crashes are very easily detected, because UNIX
systems perform an automatic self-test at bootup. Crashes also tend to
produce obvious and dramatic results.
Cure:
The only real cure for otherwise irreparable file-system failures
is restoration from backups.
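
As a rough illustration of the backup-and-restore cycle, here is a
minimal sketch using Python's standard tarfile module. It writes to an
ordinary archive file rather than the tape mentioned earlier, and the
function names make_backup and restore_backup are hypothetical. It
assumes the archive is stored outside the directory being backed up.

```python
import os
import tarfile
import tempfile


def make_backup(source_dir, archive_path):
    """Archive the contents of source_dir into a gzip-compressed tar file."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in sorted(os.listdir(source_dir)):
            # Store paths relative to source_dir so they restore anywhere.
            tar.add(os.path.join(source_dir, name), arcname=name)


def restore_backup(archive_path, target_dir):
    """Unpack a previously made archive into target_dir."""
    os.makedirs(target_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(target_dir)


if __name__ == "__main__":
    work = tempfile.mkdtemp()
    src = os.path.join(work, "data")
    os.makedirs(src)
    with open(os.path.join(src, "notes.txt"), "w") as f:
        f.write("keep me\n")

    archive = os.path.join(work, "backup.tar.gz")
    make_backup(src, archive)
    restore_backup(archive, os.path.join(work, "restored"))
```

A backup is only a cure if it can actually be read back, so it may be
worth restoring to a scratch directory from time to time to confirm
the archive is usable.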