Defragmentation

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Meaganmurphy (talk | contribs) at 19:57, 13 September 2007 (Defragmentation issues). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

In the context of administering computer systems, defragmentation is a process that reduces the amount of fragmentation in file systems. It does this by physically reorganizing the contents of the disk so that the pieces of each file are stored close together and contiguously. It also attempts to create larger regions of free space using compaction, to impede the return of fragmentation. Some defragmenters also try to keep smaller files within a single directory together, as they are often accessed in sequence. According to a survey, 42% of PC users fail to defragment their systems regularly, which adversely affects system performance. [1]

Aims of defragmentation

Reading and writing data on a heavily fragmented file system is slower, because the disk heads spend more time moving between fragments and waiting for the disk platter to rotate into position (see seek time and rotational delay). For many common operations, the hard disk is the performance bottleneck of the entire computer; thus the desire to process data more efficiently encourages defragmentation. Operating system vendors often recommend periodic defragmentation to keep disk access speed from degrading over time.
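The effect of seek time and rotational delay can be illustrated with a rough back-of-the-envelope model. The sketch below assumes that every fragment costs one average seek plus half a platter revolution before any data is transferred; the function name and the default figures (9 ms seek, 7200 rpm, 60 MB/s transfer) are illustrative assumptions, not measurements from any particular drive.

```python
def read_time_ms(size_mb, fragments, seek_ms=9.0, rpm=7200, transfer_mb_s=60.0):
    """Rough time in milliseconds to read one file stored in `fragments` pieces.

    All default figures are assumed, illustrative values.
    """
    # On average the platter must turn half a revolution before the data arrives.
    rotational_ms = 0.5 * 60_000 / rpm
    # Each fragment needs its own head movement and rotational wait.
    positioning_ms = fragments * (seek_ms + rotational_ms)
    # The transfer itself takes the same time however the file is laid out.
    transfer_ms = size_mb / transfer_mb_s * 1000
    return positioning_ms + transfer_ms


contiguous = read_time_ms(10, fragments=1)
fragmented = read_time_ms(10, fragments=200)
print(f"contiguous: {contiguous:.0f} ms, fragmented: {fragmented:.0f} ms")
```

In this simplified model the transfer time is identical in both cases; the whole difference comes from the per-fragment positioning cost, which is the part defragmentation tries to remove.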

Fragmented data also spreads over more of the disk than it needs to. Thus, one may defragment to gather data together in one area, before splitting a single partition into two or more partitions (for example, with GNU Parted, or PartitionMagic).

Defragmenting may help to increase the life-span of the hard drive itself, by minimizing head movement and simplifying data access operations.[citation needed]

Causes and cures

Fragmentation occurs when the operating system cannot or will not allocate enough contiguous space to store a complete file as a unit, but instead puts parts of it in gaps between other files (usually those gaps exist because they formerly held a file that the operating system has subsequently deleted or because the operating system allocated excess space for the file in the first place). Larger files and greater numbers of files also contribute to fragmentation and consequent performance loss. Defragmentation attempts to alleviate these problems.

Consider the following scenario, as shown by the image on the right:

An otherwise blank disk has 5 files, A, B, C, D and E, each using 10 blocks of space (for this section, a block is an allocation unit of that system; it could be 1K, 100K or 1 megabyte and is not any specific size). On a blank disk, all of these files will be allocated one after the other. (Example (1) on the image.)

If file B is deleted, there are two options: leave the space for B empty and use it again later, or shift all the files after B forward so that the empty space comes after them. The second option could be time consuming if there were hundreds or thousands of files which needed to be moved, so in general the empty space is simply left there, marked in a table as available for later use, then used again as needed.[2] (Example (2) on the image.)

Now, if a new file, F, is allocated 7 blocks of space, it can be placed into the first 7 blocks of the space formerly holding the file B, and the 3 blocks following it will remain available. (Example (3) on the image.) If another new file, G, is added and needs only three blocks, it could then occupy the space after F and before C. (Example (4) on the image.)

If F subsequently needs to be expanded, the space immediately following it is no longer available, so there are two options: (1) add a new block somewhere else and indicate that F has a second extent, or (2) move the file F to someplace else where it can be created as one contiguous file of the new, larger size. The latter operation may not be possible, as the file may be larger than any one contiguous space available, or the file could be so large that the operation would take an undesirably long period of time. Thus the usual practice is simply to create an extent somewhere else and chain the new extent onto the old one. (Example (5) on the image.)

Repeat this practice hundreds or thousands of times and eventually the file system has many free segments in many places, and many files may be spread over many extents. If, as a result of free space fragmentation, a newly created file (or a file which has been extended) has to be placed in a large number of extents, access time for that file (or for all files) may become excessively long.
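The scenario above can be replayed with a small simulation. This is only a sketch under stated assumptions: the disk is modelled as a list of 60 blocks, and the hypothetical `allocate` helper simply hands out the lowest-numbered free blocks, which is one simple policy among many and not how any specific file system necessarily behaves.

```python
def allocate(disk, name, count):
    """Give `count` free blocks to file `name`, lowest-numbered free blocks first."""
    free = [i for i, owner in enumerate(disk) if owner is None]
    if len(free) < count:
        raise ValueError("not enough free space")
    for i in free[:count]:
        disk[i] = name


def delete(disk, name):
    """Mark every block of `name` as free; nothing else is moved."""
    for i, owner in enumerate(disk):
        if owner == name:
            disk[i] = None


def extents(disk, name):
    """Return the (start, length) runs of contiguous blocks owned by `name`."""
    runs = []
    for i, owner in enumerate(disk):
        if owner != name:
            continue
        if runs and runs[-1][0] + runs[-1][1] == i:
            runs[-1] = (runs[-1][0], runs[-1][1] + 1)  # extend the current run
        else:
            runs.append((i, 1))  # start a new extent
    return runs


disk = [None] * 60            # assumed disk size, in blocks
for f in "ABCDE":
    allocate(disk, f, 10)     # (1) five files laid out one after the other
delete(disk, "B")             # (2) B's ten blocks become a gap
allocate(disk, "F", 7)        # (3) F takes the first seven blocks of the gap
allocate(disk, "G", 3)        # (4) G fills the remaining three
allocate(disk, "F", 5)        # (5) F grows and must continue elsewhere
print(extents(disk, "F"))     # [(10, 7), (50, 5)]
```

After the last step F occupies two extents, one in B's old gap and one after E, which is exactly the chained-extent situation described in the text.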

The process of creating new files, and of deleting and expanding existing files, is sometimes colloquially referred to as churn, and can occur both at the level of the root file system and within subdirectories. Fragmentation occurs not only at the level of individual files, but also when different files in a directory (and maybe its subdirectories) that are often read in sequence start to "drift apart" as a result of churn.

A defragmentation program must move files around within the free space available to undo fragmentation. This is a memory intensive operation and cannot be performed on a file system with no free space. The reorganization involved in defragmentation does not change the logical location of the files (defined as their location within the directory structure).
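A minimal sketch of this idea follows, assuming a toy disk in which each block is either free (`None`) or holds a hypothetical `(file_name, block_index)` pair. It is not the algorithm of any real defragmenter; it only illustrates why free blocks are needed as working space and that only physical positions change, not file names.

```python
def defragment(disk):
    """Rearrange `disk` in place so every file is contiguous.

    Returns the number of single-block moves performed. Raises RuntimeError
    when a block must be moved out of the way but no free block exists.
    """
    # Target layout: files in order of first appearance, blocks in logical order.
    order = []
    for block in disk:
        if block is not None and block[0] not in order:
            order.append(block[0])
    target = sorted((b for b in disk if b is not None),
                    key=lambda b: (order.index(b[0]), b[1]))

    moves = 0
    for pos, wanted in enumerate(target):
        if disk[pos] == wanted:
            continue  # already in its final place
        if disk[pos] is not None:
            # Evict the current occupant into a free block further along.
            try:
                spare = disk.index(None, pos + 1)
            except ValueError:
                raise RuntimeError("no free block to use as working space") from None
            disk[spare], disk[pos] = disk[pos], None
            moves += 1
        src = disk.index(wanted)
        disk[pos], disk[src] = wanted, None
        moves += 1
    return moves


disk = [("A", 0), ("B", 0), ("A", 1), None]
moves = defragment(disk)
print(disk, moves)
```

Running the same function on a completely full, fragmented disk such as `[("A", 0), ("B", 0), ("A", 1)]` raises the error, mirroring the statement that defragmentation cannot be performed without free space.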

Another common strategy to optimize defragmentation and to reduce the impact of fragmentation is to partition the hard disk(s) in a way that separates the parts of the file system that experience many more reads than writes from the more volatile zones where files are created and deleted frequently. In Microsoft Windows, the contents of directories such as "\Program Files" or "\Windows" are modified far less frequently than they are read. The directories that contain the users' profiles are modified constantly (especially with the Temp directory and Internet Explorer cache creating thousands of files that are deleted in a few days). If files from user profiles are held on a dedicated partition (as is commonly done on UNIX systems), the defragmenter runs better since it does not need to deal with all the static files from other directories. For partitions with relatively little write activity, defragmentation performance greatly improves after the first defragmentation, since the defragmenter will need to defragment only a small number of new files in the future.

Defragmentation issues

The presence of immovable system files, especially a swap file, can impede defragmentation. These files can be safely moved when the operating system is not in use. For example, ntfsresize moves these files to resize an NTFS partition.

All files with read-only attributes are immovable if the defragmenter is not run with administrative rights. While the system files are correctly read-only, most computers today contain many inappropriately read-only files. For example, files copied from a CD retain the read-only attribute. These immovable files will interfere with defragmentation. Unsetting all the read-only flags can be accomplished in MS-DOS and Windows with the command "attrib -R /S /D *", which will not affect files marked with the system attribute.

On systems without fragmentation resistance, fragmentation builds upon itself when left unhandled, so periodic defragmentation is necessary to keep the disk performance at peak and avoid the excess overhead of less frequent defragmentation.

According to recent articles, defragmentation is important in virtual environments. [3] [4]

A recent article gives tips on improving computer performance, including the need to defragment. [5]

Recent articles describe disk file fragmentation as an "insidious disease" that is still very much alive. [6] [7]

Myths

Defragging the disk will not stop a system from malfunctioning or crashing because the filesystem is designed to work with fragmented files. [8] Since defrag cannot be run on a filesystem marked as dirty without first running chkdsk [9], a user who intends to run defrag "to fix a system acting strangely" often ends up running chkdsk, which repairs file system errors, the end result of which may mislead the user into thinking that defrag fixed the problem when it was actually fixed by chkdsk.

In fact, in a modern multi-user operating system, an ordinary user cannot defragment the system disks since superuser access is required to move system files. Additionally, modern file systems such as NTFS are designed to decrease the likelihood of fragmentation. [10] Improvements in modern hard drives such as RAM cache, faster platter rotation speed, and greater data density reduce the negative impact of fragmentation on system performance to some degree, though increases in commonly used data quantities offset those benefits. However, modern systems profit enormously from the huge disk capacities currently available, since partially filled disks fragment much less than full disks. [11] In any case, these limitations of defragmentation have led to design decisions in modern operating systems like Windows Vista to automatically defragment in a background process but not to attempt to defragment a volume 100% because doing so would only produce negligible performance gains. [12]

Filesystems

  • FAT: DOS 6.x and Windows 9x systems come with a defragmentation utility called Defrag. The DOS version is a limited version of Norton SpeedDisk.
  • NTFS: Windows 2000 and newer include an online defragmentation tool based on Diskeeper. NT 4 and below do not have built-in defragmentation utilities. The integrated defragmenter does not consolidate free space, so a heavily fragmented drive with many small files may still have no large contiguous free space after defragmentation. Any new large file will then immediately be split into small fragments, with an immediate impact on performance. This can happen even if the overall disk usage is less than 60%.
  • ext2 uses an offline defragmenter called e2defrag, which does not work with its successor ext3, unless the ext3 filesystem is temporarily down-graded to ext2.
  • JFS has a defragfs utility on IBM operating systems.
  • HFS Plus in 1998 introduced a number of optimizations to the allocation algorithms in an attempt to defragment files while they're being accessed without a separate defragmenter.
  • XFS provides an online defragmentation utility called xfs_fsr.

References

  • Jensen, Craig (1994). Fragmentation: The Condition, the Cause, the Cure. Executive Software International. ISBN 0-9640049-0-9.

Notes

  1. ^ "42% of PC Users Fail to Defrag their Computers: Survey".
  2. ^ The practice of leaving the empty space behind after a file is deleted, marking it in a table as available for later use and reusing it as needed, is why undelete programs were able to work: they simply recovered a file whose name had been deleted from the directory but whose contents were still on disk.
  3. ^ Maximizing Performance in the 'Green Computing' Trend
  4. ^ The Hidden Hindrance to Virtualization Performance
  5. ^ Still a Problem; Writer Provides Tips for Improving Computer Performance
  6. ^ The Disease of Epidemic Proportions
  7. ^ The Disease of Epidemic Proportions
  8. ^ Defragmentation is not a solution to program or system crashes
  9. ^ Defrag cannot be run on file system marked as dirty until chkdsk is run
  10. ^ NTFS decreases the likelihood of fragmentation as compared to older file systems
  11. ^ Modern hard drive improvements minimize negative impact of fragmentation
  12. ^ Windows Vista automatic defragmentation does not attempt to reach 100% defragmentation because that would not help system performance