
Virtual memory

From Wikipedia, the free encyclopedia


The memory pages of the virtual address space seen by the process may reside non-contiguously in primary or even secondary storage.

Virtual memory, or virtual memory addressing, is a memory management technique used by multitasking computer operating systems wherein non-contiguous memory is presented to a software application (a process) as contiguous memory. The contiguous memory is referred to as the virtual address space.

Virtual memory addressing is typically used in paged memory systems. This in turn is often combined with memory swapping, whereby memory pages stored in primary storage are written to secondary storage, thus freeing faster primary storage for other processes to use.

The term "virtual memory" is often confused with "memory swapping" (or "page/swap file" use), probably due in part to the prolific Microsoft Windows family of operating systems referring to the enabling/disabling of memory swapping as virtual memory. In fact, Windows uses paged memory and virtual memory addressing, even if the so called "virtual memory" is disabled.

In technical terms, virtual memory allows software to run in a memory address space whose size and addressing are not necessarily tied to the computer's physical memory. To properly implement virtual memory, the CPU (or a device attached to it) must provide a way for the operating system to map virtual memory to physical memory, and to detect when an address is referenced that does not currently correspond to main memory so that the needed data can be swapped in. While it would certainly be possible to provide virtual memory without the CPU's assistance, doing so would essentially require emulating a CPU that did provide the needed features.

Background

Most computers possess four kinds of memory: registers in the CPU; CPU caches (generally some kind of static RAM), both inside and adjacent to the CPU; main memory (generally dynamic RAM), which the CPU can read and write directly and reasonably quickly; and disk storage, which is much slower but also much larger. CPU register use is generally handled by the compiler, and this is not a great burden, as data does not usually stay in registers for very long. The decision of when to use cache and when to use main memory is generally dealt with by hardware, so both are usually regarded together by the programmer as simply physical memory.

Many applications require access to more information (code as well as data) than can be stored in physical memory. This is especially true when the operating system allows multiple processes or applications to run seemingly in parallel. The obvious response to the problem that physical memory is smaller than what all running programs require is for an application to keep some of its information on disk and move it back and forth to physical memory as needed, but there are a number of ways to do this.

One option is for the application software itself to be responsible both for deciding which information is to be kept where, and for moving it back and forth. The programmer would do this by determining which sections of the program (and also its data) were mutually exclusive, and then arranging for the appropriate sections to be loaded into and unloaded from physical memory as needed. The disadvantage of this approach is that each application's programmer must spend time and effort designing, implementing, and debugging this mechanism instead of focusing on his or her application; this hampers programmers' efficiency. Also, if programmers could truly choose which of their items of data to store in physical memory at any one time, their decisions could easily conflict with those of another programmer who also wanted to use all the available physical memory at that point.

Another option is to store some form of handle to data rather than direct pointers, and let the OS deal with swapping the data associated with those handles between the swap file and physical memory as needed. This works, but it has a couple of problems: it complicates application code, it requires applications to cooperate (they generally need the power to lock the data into physical memory while actually working on it), and it prevents the language's standard library from doing its own suballocations inside large blocks obtained from the OS to improve performance. The best known example of this kind of arrangement is probably the 16-bit versions of Windows.
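
The pattern looks roughly like the sketch below, written in C against the Win32 descendants of the Win16 calls (GlobalAlloc, GlobalLock and friends), since the 16-bit originals followed the same allocate/lock/unlock discipline; the block size and flags are purely illustrative.

    #include <windows.h>

    int main(void)
    {
        /* Ask the OS for a movable block: what comes back is a handle,
           not a pointer, so the system remains free to relocate or swap
           the data while it is not locked. */
        HGLOBAL hBlock = GlobalAlloc(GMEM_MOVEABLE, 64 * 1024);
        if (hBlock == NULL)
            return 1;

        /* The block must be locked (pinned) before it can be addressed
           directly. */
        char *p = (char *)GlobalLock(hBlock);
        if (p != NULL) {
            p[0] = 'A';               /* work on the data while it is pinned */
            GlobalUnlock(hBlock);     /* release the pin again               */
        }

        GlobalFree(hBlock);
        return 0;
    }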

The modern solution is to use virtual memory, in which a combination of special hardware and operating system software makes use of both kinds of memory to make it look as if the computer has a much larger main memory than it actually does, and to lay that space out differently at will. It does this in a way that is invisible to the rest of the software running on the computer. It usually provides the ability to simulate a main memory of almost any size, limited by the size of the addresses used by the operating system and CPU: the total size of the virtual memory can be 2^32 bytes (approximately 4 gigabytes) on a 32-bit system, though OS design decisions can make the amount available to applications in practice much less than this, while newer 64-bit chips and operating systems use 64- or 48-bit addresses and can index much more virtual memory.

This makes the job of the application programmer much simpler. No matter how much memory the application needs, it can act as if it has access to a main memory of that size and can place its data wherever in that virtual space it likes. The programmer can also completely ignore the need to manage the moving of data back and forth between the different kinds of memory.

Basic operation

When virtual memory is used, whenever a main memory location is read or written by the CPU, hardware within the computer translates the address of the memory location generated by the software (the virtual memory address) into either:

  • the address of a real memory location (the physical memory address) which is assigned within the computer's physical memory to hold that memory item, or
  • an indication that the desired memory item is not currently resident in main memory (a so-called virtual memory exception or page fault)

In the former case, the memory reference operation is completed, just as if the virtual memory were not involved. In the latter case, the operating system is invoked to handle the situation, since the actions needed before the program can continue are usually quite complex.
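
A minimal sketch of this two-way outcome, assuming a hypothetical single-level page table with 32-bit addresses and 4 KiB pages (real MMUs do this in hardware, usually with multi-level tables), might look like this in C:

    #include <stdbool.h>
    #include <stdint.h>

    #define PAGE_SHIFT 12                        /* 4 KiB pages (assumption) */
    #define PAGE_SIZE  (1u << PAGE_SHIFT)
    #define NUM_PAGES  (1u << (32 - PAGE_SHIFT))

    typedef struct {
        bool     present;  /* is the page currently in physical memory?    */
        uint32_t frame;    /* physical frame number, valid only if present */
    } pte_t;

    static pte_t page_table[NUM_PAGES];

    /* Returns true and fills *phys with the physical address, or returns
       false to signal a page fault that the operating system must handle. */
    bool translate(uint32_t virt, uint32_t *phys)
    {
        uint32_t vpn    = virt >> PAGE_SHIFT;     /* virtual page number */
        uint32_t offset = virt & (PAGE_SIZE - 1); /* offset within page  */

        if (!page_table[vpn].present)
            return false;                         /* page fault          */

        *phys = (page_table[vpn].frame << PAGE_SHIFT) | offset;
        return true;
    }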

The effect of this is to swap sections of information between physical memory and the disk; the area of the disk which holds the information that is not currently in physical memory is called the swap file (OS/2, early versions of Windows, and others), page file (Windows), or swap partition (a dedicated partition of a hard disk, commonly seen in the Linux operating system).

Also, on most modern systems, virtual address space can be mapped to disk storage other than the swap file. This allows parts of executables to be paged in as needed directly from the executable file, avoiding the need to load the entire executable at application load time and reducing the demand for swap space. It also allows the operating system to keep only one copy of an executable in memory rather than loading it separately for each running instance, thereby reducing the pressure on physical memory.
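
On POSIX systems this file-backed mapping is exposed to applications through mmap; the hedged sketch below maps a file read-only, after which its pages are brought into physical memory on demand (the file name is only a placeholder):

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("/bin/ls", O_RDONLY);    /* placeholder file to map */
        if (fd < 0)
            return 1;

        struct stat st;
        if (fstat(fd, &st) != 0)
            return 1;

        /* Map the whole file; no data is read yet.  Each page is paged in
           from the file the first time it is touched. */
        const unsigned char *p =
            mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED)
            return 1;

        printf("first byte: 0x%02x\n", p[0]);  /* this access may page-fault */

        munmap((void *)p, st.st_size);
        close(fd);
        return 0;
    }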

Details

The translation from virtual to physical addresses is implemented by an MMU (Memory Management Unit). This may be either a module of the CPU, or an auxiliary, closely coupled chip.

The operating system is responsible for deciding which parts of the program's simulated main memory are kept in physical memory. The operating system also maintains the translation tables which provide the mappings between virtual and physical addresses, for use by the MMU. Finally, when a virtual memory exception occurs, the operating system is responsible for allocating an area of physical memory to hold the missing information (possibly pushing something else out to disk in the process), bringing the relevant information in from the disk, updating the translation tables, and finally resuming execution of the software that incurred the virtual memory exception.

In most computers, these translation tables are stored in physical memory. Therefore, a virtual memory reference might actually involve two or more physical memory references: one or more to retrieve the needed address translation from the page tables, and a final one to actually do the memory reference.

To minimize the performance penalty of address translation, most modern CPUs include an on-chip MMU and maintain a table of recently used virtual-to-physical translations, called a Translation Lookaside Buffer, or TLB. Addresses with entries in the TLB require no additional memory references (and therefore time) to translate. However, the TLB can only maintain a fixed number of mappings between virtual and physical addresses; when the needed translation is not resident in the TLB, action must be taken to load it.

On some processors, this is performed entirely in hardware; the MMU has to do additional memory references to load the required translations from the translation tables, but no other action is needed. In other processors, assistance from the operating system is needed; an exception is raised, and on this exception, the operating system replaces one of the entries in the TLB with an entry from the translation table, and the instruction which made the original memory reference is restarted.
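
A rough model of the TLB's role, using a hypothetical, tiny fully-associative TLB that is refilled from a page table on a miss (real TLBs are hardware structures, and as noted above the refill may happen with no software involvement at all), could be sketched in C as follows; page_table_lookup stands in for the page-table walk:

    #include <stdbool.h>
    #include <stdint.h>

    #define TLB_ENTRIES 8                  /* deliberately tiny (assumption) */

    typedef struct {
        bool     valid;
        uint32_t vpn;                      /* virtual page number   */
        uint32_t frame;                    /* physical frame number */
    } tlb_entry_t;

    static tlb_entry_t tlb[TLB_ENTRIES];
    static unsigned    next_victim;        /* trivial round-robin replacement */

    /* Hypothetical page-table walk; in a real system this reads the
       in-memory translation tables (one or more extra memory accesses). */
    extern bool page_table_lookup(uint32_t vpn, uint32_t *frame);

    bool tlb_translate(uint32_t vpn, uint32_t *frame)
    {
        for (unsigned i = 0; i < TLB_ENTRIES; i++)
            if (tlb[i].valid && tlb[i].vpn == vpn) {
                *frame = tlb[i].frame;     /* hit: no extra memory reference */
                return true;
            }

        /* Miss: walk the page table, then cache the translation. */
        if (!page_table_lookup(vpn, frame))
            return false;                  /* page fault */

        tlb[next_victim] = (tlb_entry_t){ .valid = true, .vpn = vpn, .frame = *frame };
        next_victim = (next_victim + 1) % TLB_ENTRIES;
        return true;
    }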

The hardware that supports virtual memory almost always supports memory protection mechanisms as well. The MMU may have the ability to vary its operation according to the type of memory reference (for read, write or execution), as well as the privilege mode of the CPU at the time the memory reference was made. This allows the operating system to protect its own code and data (such as the translation tables used for virtual memory) from corruption by an erroneous application program and to protect application programs from each other and (to some extent) from themselves (e.g. by preventing writes to areas of memory which contain code).
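
At the application level this protection machinery is visible, for example, through the POSIX mprotect call. The sketch below (assuming a Unix-like system) marks a page read-only, after which any attempt to write to it would be reported as a protection fault:

    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        long pagesize = sysconf(_SC_PAGESIZE);

        /* Obtain one page of ordinary, writable memory. */
        void *page = mmap(NULL, pagesize, PROT_READ | PROT_WRITE,
                          MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (page == MAP_FAILED)
            return 1;

        memset(page, 0, pagesize);          /* writes are allowed here */

        /* Ask the MMU (via the OS) to make the page read-only. */
        if (mprotect(page, pagesize, PROT_READ) != 0)
            return 1;

        /* ((char *)page)[0] = 1;  would now trigger a protection fault,
           typically delivered to the process as SIGSEGV.               */

        munmap(page, pagesize);
        return 0;
    }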

Paging and virtual memory

Virtual memory is usually (but not necessarily) implemented using paging. In paging, the low order bits of the binary representation of the virtual address are preserved, and used directly as the low order bits of the actual physical address; the high order bits are treated as a key to one or more address translation tables, which provide the high order bits of the actual physical address.

For this reason, a range of consecutive addresses in the virtual address space whose size is a power of two will be translated into a corresponding range of consecutive physical addresses. The memory referenced by such a range is called a page. The page size is typically in the range of 512 to 8192 bytes (with 4 kilobytes currently being very common), though page sizes of 4 megabytes or larger may be used for special purposes. (Using the same or a related mechanism, contiguous regions of virtual memory larger than a page are often mappable to contiguous physical memory for purposes other than virtualization, such as setting access and caching control bits.)
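
As a concrete, hypothetical example with 4 KiB pages, the low 12 bits pass straight through while the remaining high bits select the page:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint32_t virt   = 0x12345ABC;     /* an arbitrary example address      */
        uint32_t vpn    = virt >> 12;     /* high bits, 0x12345: table lookup  */
        uint32_t offset = virt & 0xFFF;   /* low bits,  0xABC: passed through  */

        printf("page number 0x%X, offset 0x%X\n", vpn, offset);
        return 0;
    }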

The operating system stores the address translation tables, the mappings from virtual to physical page numbers, in a data structure known as a page table.

If a page is marked as unavailable (perhaps because it is not present in physical memory, but is instead in the swap area), then when the CPU tries to reference a memory location in that page, the MMU responds by raising an exception (commonly called a page fault), and the CPU jumps to a routine in the operating system. If the page is in the swap area, this routine invokes an operation called a page swap to bring in the required page.

The page swap operation involves a series of steps. First, the operating system selects a page in memory to evict, for example a page that has not been recently accessed and (preferably) has not been modified since it was last read from disk or the swap area (see page replacement algorithms for details). If the page has been modified, it is written back to the swap area. The next step is to read in the needed page (the page corresponding to the virtual address the original program was trying to reference when the exception occurred) from the swap file. When the page has been read in, the tables for translating virtual addresses to physical addresses are updated to reflect the revised contents of physical memory. Once the page swap completes, the routine exits, and the program is restarted at the point that caused the exception and continues as if nothing had happened.

It is also possible that a virtual page was marked as unavailable because the page was never previously allocated. In such cases, a page of physical memory is allocated and filled with zeros, the page table is modified to describe it, and the program is restarted as above.
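
Put together, the fault-handling path described above (including the zero-fill case for never-allocated pages) can be summarised in a hedged C sketch; the helper routines named here are hypothetical stand-ins for the operating system's own disk I/O and page-replacement code:

    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical helpers provided elsewhere by the operating system. */
    extern uint32_t choose_victim_frame(void);               /* page replacement  */
    extern void     unmap_victim(uint32_t frame);
    extern bool     frame_is_dirty(uint32_t frame);
    extern void     write_frame_to_swap(uint32_t frame);
    extern bool     page_is_in_swap(uint32_t vpn);
    extern void     read_page_from_swap(uint32_t vpn, uint32_t frame);
    extern void     zero_fill_frame(uint32_t frame);
    extern void     map_page(uint32_t vpn, uint32_t frame);  /* update page table */

    /* Invoked by the page-fault exception for virtual page 'vpn'. */
    void handle_page_fault(uint32_t vpn)
    {
        /* 1. Pick a resident page to evict (see page replacement algorithms). */
        uint32_t frame = choose_victim_frame();
        unmap_victim(frame);

        /* 2. If the victim was modified, write it back to the swap area. */
        if (frame_is_dirty(frame))
            write_frame_to_swap(frame);

        /* 3. Fill the frame: from swap if the page was paged out, or with
              zeros if it was never allocated before. */
        if (page_is_in_swap(vpn))
            read_page_from_swap(vpn, frame);
        else
            zero_fill_frame(frame);

        /* 4. Update the translation tables; the faulting instruction is
              then restarted and proceeds as if nothing had happened. */
        map_page(vpn, frame);
    }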

Additional details

One additional advantage of virtual memory is that it allows a computer to multiplex its CPU and memory between multiple programs without the need to perform expensive copying of the programs' memory images. If the combination of virtual memory system and operating system supports swapping, then the computer may be able to run simultaneous programs whose total size exceeds the available physical memory. Since most programs have a small subset (active set) of pages that they reference over significant periods of their execution, the performance penalty is less than might be expected. If too many programs are run at once, or if a single program continuously accesses widely scattered memory locations, then page swapping becomes excessively frequent and overall system performance becomes unacceptably slow. This is often called thrashing (since the disk is being excessively overworked, i.e. thrashed) or a paging storm; the slowdown is severe because accessing the swap medium is roughly three orders of magnitude slower than accessing main memory.

Note that virtual memory is not a requirement for precompilation of software, even if the software is to be executed on a multiprogramming system. Precompiled software loaded by the operating system has the opportunity to carry out address relocation at load time. This suffers by comparison with virtual memory in that a copy of a program relocated at load time cannot be moved to a different address once it has started execution.

It is possible to avoid the overhead of address relocation using a process called rebasing, which uses metadata in the executable image header to guarantee to the run-time loader that the image will only run within a certain virtual address space. This technique is used on the system libraries on Win32 platforms, for example.

Also, many systems run multiple instances of the same program using the same physical copy of the program in physical memory but separate virtual address spaces. This is possible because the separate virtual address spaces can all have the same layout, which avoids the need to relocate the code at load time. Some operating systems take this even further, implementing copy-on-write to allow a process to fork into two copies of itself without a complete copy of its data being created immediately.
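
The effect of copy-on-write is easy to observe on a POSIX system: after fork(), parent and child nominally each have a copy of a large buffer, yet physical pages are only duplicated when one of them writes. A minimal sketch, assuming a Unix-like OS with copy-on-write fork:

    #include <stdlib.h>
    #include <string.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        /* 64 MiB of data in the parent's address space. */
        size_t size = 64u * 1024 * 1024;
        char *buf = malloc(size);
        if (buf == NULL)
            return 1;
        memset(buf, 0, size);

        pid_t pid = fork();
        if (pid == 0) {
            /* Child: both processes now reference the same physical pages.
               Writing a single byte copies only the one page it lives on;
               the rest of the buffer stays shared with the parent. */
            buf[0] = 1;
            _exit(0);
        }

        wait(NULL);
        free(buf);
        return 0;
    }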

History

Before the development of the virtual memory technique, programmers in the 1940s and 1950s had to manage two-level storage (main memory or RAM, and secondary memory in the form of hard disks or earlier, magnetic drums) directly.

Virtual memory was developed in approximately 1959–1962, at the University of Manchester for the Atlas Computer, completed in 1962. However, Fritz-Rudolf Güntsch, one of Germany's pioneering computer scientists and later the developer of the Telefunken TR 440 mainframe, claims to have invented the concept in his doctoral dissertation Logischer Entwurf eines digitalen Rechengerätes mit mehreren asynchron laufenden Trommeln und automatischem Schnellspeicherbetrieb (Logic Concept of a Digital Computing Device with Multiple Asynchronous Drum Storage and Automatic Fast Memory Mode) in 1957.

Like many technologies in the history of computing, virtual memory was not accepted without challenge. Before it could be regarded as a stable entity, many models, experiments, and theories had to be developed to overcome the numerous problems with virtual memory. Specialized hardware had to be developed that would take a "virtual" address and translate it into an actual physical address in memory (secondary or primary). Some worried that this process would be expensive, hard to build, and take too much processor power to do the address translation.

By 1969 the debates over virtual memory for commercial computers were over. An IBM research team, led by David Sayre, showed that the virtual memory overlay system worked consistently better than the best manually controlled systems.

Nevertheless, early personal computers in the 1980s were developed without virtual memory, on the assumption that such issues would only apply to large-scale commercial computers. Virtual memory was introduced to Microsoft Windows only with Windows 3.1 (1992), as described below, but it had been available on the Apple Macintosh since System 7 (1991).

Windows example

Virtual memory has been a feature of Microsoft Windows since Windows 3.1 in 1992. 386SPART.PAR (or WIN386.SWP on Windows 3.11 and Windows for Workgroups) is a hidden file created by Windows 3.x for use as a virtual memory swap file. It is generally found in the root directory, but it may appear elsewhere (typically in the WINDOWS directory). Its size depends on how much virtual memory the system has set up under Control Panel - Enhanced, under "Virtual Memory". If a user moves or deletes this file, Windows will show a blue-screen error the next time it is started, reporting "The permanent swap file is corrupt", and will ask the user whether they want to delete the file (it asks this whether or not the file exists). On NT-based versions of Windows (including Windows 2000 and Windows XP) the page file is located at C:\pagefile.sys, though Windows may be configured to place additional page files on other drives.

Windows 95 uses a similar file, also named WIN386.SWP, and the controls for it are located under Control Panel - System - Performance tab - Virtual Memory. Windows automatically sets the page file to start at 1.5 times the size of physical memory and to expand up to 3 times physical memory if necessary. If a user runs memory-intensive applications on a system with little physical memory, it is preferable to manually set these sizes to values higher than the defaults.

Misconceptions about the Windows page file

There are some common misconceptions about Windows page file expansion, chiefly that a page file can become heavily "fragmented" and cause "performance issues". The common advice given to avoid this problem is to set a single, fixed page file size and not allow Windows to resize the page file. This is problematic for a few reasons:

  • If a Windows application requests more memory than is available from both physical memory and the page file, and Windows cannot resize the page file to fulfill this request, then the memory is not successfully allocated. Many applications (and sometimes Windows itself) will crash (sometimes gracefully, sometimes not) as a result of being unable to allocate more memory.
  • Concerns about "performance" are moot when a Windows system is using two or three times its total physical memory. Performance concerns about a further expanding pagefile are not going to be a user's primary concern at this time.
  • Concerns about "fragmentation" are not significant, when you consider how and when the page file is used. Windows does not read from or write to the page file in sequential order for long periods of time, so the performance advantages of having a completely sequential page file is minimal at best. Also, if a large number of pages need to be moved in or out of the page file, chances are quite good that other hard-disk activity is taking place at the same time, further reducing performance.

In short, a Windows system does not benefit from having a locked page file size. A larger "minimum" size will indeed help systems with little physical memory, but a large "maximum" will incur no performance penalty.

Virtual Memory in Linux

In the Linux operating system, it is possible to use a whole partition of the hard disk for virtual memory. Though it is still possible to use a regular file for swapping, it is recommended to use a separate partition, because this rules out fragmentation, which would reduce the performance of swapping. A swap area is created using the command mkswap filename/device, and may be turned on and off using the commands swapon and swapoff, respectively, accompanied by the name of the swap file or the swap partition.
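
These commands are thin wrappers over the Linux swapon(2) and swapoff(2) system calls, which a privileged program can also invoke directly. In the sketch below, /dev/sda2 is only a placeholder for a partition that has already been prepared with mkswap:

    #include <stdio.h>
    #include <sys/swap.h>     /* Linux-specific: swapon(2), swapoff(2) */

    int main(void)
    {
        const char *swap_dev = "/dev/sda2";   /* placeholder swap partition */

        /* Both calls require root privileges. */
        if (swapon(swap_dev, 0) != 0) {
            perror("swapon");
            return 1;
        }

        /* ... the kernel may now page out to this device ... */

        if (swapoff(swap_dev) != 0) {
            perror("swapoff");
            return 1;
        }
        return 0;
    }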

To further increase swapping performance, it is advisable to put the swap partition at the beginning of the hard disk, because transfer speeds there are somewhat higher than at the end of the disk.

There have also been some successful attempts to use memory located on the video card for swapping, as modern video cards often have 128 or even 256 megabytes of RAM which is normally only put to use when playing games.

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.
