Magnetic-tape data storage

Magnetic tape has been used for data storage for over 50 years. In this time, many advances in tape formulation, packaging, and data density have been made. Modern magnetic tape is most commonly packaged in cartridges and cassettes. The device that performs actual writing or reading of data is a tape drive. Autochangers and tape libraries are frequently used to automate cartridge handling.

When storing large amounts of data, tape can be substantially less expensive than disk or other data storage options. Tape storage has always been used with large computer systems. Modern usage is primarily as a high capacity medium for backups and archives. As of 2007, the highest capacity tape cartridges (DLT-S4, LTO-4) can store 800 GB of data without using compression.

Open reels

Initially, magnetic tape for data storage was wound on large (10.5 in) reels. This de facto standard for large computer systems persisted through the late 1980s. Tape cartridges and cassettes were available as early as the mid-1970s and were frequently used with small computer systems. With the introduction of the IBM 3480 cartridge in 1984, large computer systems started to move away from open-reel tapes and towards cartridges.

UNIVAC

Magnetic tape was first used to record computer data in 1951 on the Eckert-Mauchly UNIVAC I. The UNISERVO drive recording medium was a thin metal strip of ½″ (12.7 mm) wide nickel-plated phosphor bronze. Recording density was 128 characters per inch (198 micrometres per character) on eight tracks at a linear speed of 100 in/s (2.54 m/s), yielding a data rate of 12,800 characters per second. Of the eight tracks, six were data, one was a parity track, and one was a clock, or timing, track. Making allowance for the empty space between tape blocks, the actual transfer rate was around 7,200 characters per second.
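The UNISERVO figures above follow from simple arithmetic. The short sketch below reproduces them; the constant names are illustrative, and the last value is only the ratio implied by the two quoted transfer rates, not a figure from the original documentation.

```python
# Figures quoted above for the UNISERVO drive.
DENSITY_CPI = 128        # characters per inch
TAPE_SPEED_IPS = 100     # inches per second
EFFECTIVE_RATE = 7200    # characters per second once gaps are counted
MM_PER_INCH = 25.4

# Peak rate: characters passing the head each second.
peak_rate = DENSITY_CPI * TAPE_SPEED_IPS

# Length of tape occupied by one character, in micrometres.
char_pitch_um = MM_PER_INCH / DENSITY_CPI * 1000

# Share of the peak rate left after the empty space between blocks.
gap_efficiency = EFFECTIVE_RATE / peak_rate

print(peak_rate)                 # 12800
print(round(char_pitch_um))      # 198
print(round(gap_efficiency, 4))  # 0.5625
```

In other words, a little under half of the tape passing the head carried no data.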

IBM formats

IBM computers from the 1950s used ferrous-oxide-coated tape similar to that used in audio recording. IBM's technology soon became the de facto industry standard. The tape was 0.5" (12.7 mm) wide and wound on removable reels of up to 10.5 inches (267 mm) in diameter. Different tape lengths were available, with 1200' and 2400' on one-and-a-half-mil-thick tape being somewhat standard. Later, during the 1980s, longer lengths such as 3600' became available, but only with a much thinner Mylar plastic. Most tape drives could support a maximum reel size of 10.5".

Early IBM tape drives, such as the IBM 727 and IBM 729, were mechanically sophisticated floor-standing drives that used vacuum columns to buffer long U-shaped loops of tape. Between active control of powerful reel motors and vacuum control of these U-shaped tape loops, extremely rapid start and stop of the tape at the tape-to-head interface could be achieved (1.5 ms from stopped tape to full speed of up to 112.5 in/s). When active, the two tape reels fed tape into or pulled tape out of the vacuum columns, intermittently spinning in rapid, unsynchronized bursts that produced visually striking action. Stock shots of such vacuum-column tape drives in motion were widely used to represent "the computer" in movies and television.

Early half-inch tape had 7 parallel tracks of data along the length of the tape, allowing six-bit characters plus parity to be written across the tape. This was known as 7-track tape. With the introduction of the IBM System/360, 9-track tapes became common to support 8-bit characters or "bytes."

An inter-record gap (IRG) was placed between data records to allow the tape to stop. 7-track tapes used a 0.75" IRG, 9-track 800 NRZI and 1600 PE (phase encoding) tapes used a 0.60" IRG, and 6250 GCR tapes used a very tight 0.3" IRG. Both 7- and 9-track tapes had reflective stickers placed near each end (10', 14') to signal beginning of tape (BOT) and end of tape (EOT) to the hardware. Signaling EOT with space remaining to write trailer blocks allowed support for multivolume labelled tapes.

Effective recording density increased over time. Common 7-track densities started at 200, then 556, and finally 800 cpi. Nine-track tapes commonly had densities of 800, 1600, and 6250 cpi, giving approximately 20 MB, 40 MB and 140 MB respectively on a standard 2400' tape.
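The capacity figures above depend on how much tape is spent on gaps, which in turn depends on the block size. The following is a rough sketch under stated assumptions: one byte per character position on 9-track tape, a hypothetical block size of 4096 bytes, and no allowance for leaders, tape marks, or the extra format overhead of GCR. The function name is illustrative.

```python
def tape_capacity_bytes(density_cpi, gap_in, block_bytes, tape_ft=2400):
    """Estimate usable bytes on a reel for a given density, gap, and block size."""
    tape_in = tape_ft * 12                       # reel length in inches
    block_in = block_bytes / density_cpi         # tape consumed by one data block
    blocks = int(tape_in // (block_in + gap_in)) # whole block-plus-gap units that fit
    return blocks * block_bytes

# Densities and gaps from the text; the 4096-byte block size is an assumption.
cap_800 = tape_capacity_bytes(800, 0.60, 4096)
cap_1600 = tape_capacity_bytes(1600, 0.60, 4096)
cap_6250 = tape_capacity_bytes(6250, 0.30, 4096)

# Very short blocks (80 bytes, one punched card) waste most of the tape on gaps.
cap_800_cards = tape_capacity_bytes(800, 0.60, 80)

for label, cap in [("800 cpi", cap_800), ("1600 cpi", cap_1600),
                   ("6250 cpi", cap_6250), ("800 cpi, 80-byte blocks", cap_800_cards)]:
    print(f"{label}: {cap / 1e6:.1f} MB")
```

With these assumptions the estimates come out in the same range as the quoted capacities, while the 80-byte case shows how small blocks leave only a few megabytes usable.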

Standards

ANSI INCITS 40-1993 (R2003) Unrecorded Magnetic Tape for Information Interchange (9-Track, 800 CPI, NRZI; 1600 CPI, PE; and 6250 CPI, GCR)
ISO/IEC 1863:1990 9-track, 12.7 mm (½ in) wide magnetic tape for information interchange using NRZ1 at 32 ftpmm (800 ftpi) - 32 cpmm (800 cpi)
ISO/IEC 3788:1990 9-track, 12.7 mm (½ in) wide magnetic tape for information interchange using phase encoding at 126 ftpmm (3 200 ftpi), 63 cpmm (1600 cpi)
ANSI INCITS 54-1986 (R2002) Recorded Magnetic Tape for Information Interchange (6250 CPI, Group Code Recording)
ANSI INCITS 27-1987 (R2003) Magnetic Tape Labels and File Structure for Information Interchange

Since then, a multitude of tape formats have been used.

DEC format

LINCtape, and its derivative, DECtape, were variations on this "round tape." They were essentially a personal storage medium. The tape was ¾ inch wide and featured a fixed formatting track which, unlike standard tape, made it feasible to read and rewrite blocks repeatedly in place. LINCtapes and DECtapes had similar capacity and data transfer rate to the diskettes that displaced them, but their "seek times" were on the order of thirty seconds to a minute.

Cartridges and cassettes

In the context of magnetic tape, the term cassette usually refers to an enclosure that holds two reels with a single span of magnetic tape. The term cartridge is more generic, but frequently means a single reel of tape in a plastic enclosure.

The type of packaging is a large determinant of the load and unload times as well as the length of tape that can be held. A tape drive that uses a single-reel cartridge has a take-up reel in the drive, while cassettes have the take-up reel in the cassette. A tape drive (or "transport" or "deck") uses precisely controlled motors to wind the tape from one reel to the other, passing a read/write head as it does.

A different type of tape cartridge has a continuous loop of tape wound on a special reel that allows tape to be withdrawn from the center of the reel and then wrapped up around the edge. This type is similar to a cassette in that there is no take-up reel inside the tape drive.

In the 1970s and 1980s, audio Compact Cassettes were frequently used as an inexpensive data storage system for home computers. Most modern magnetic tape systems use reels that are fixed inside a cartridge to protect the tape and facilitate handling. Modern cartridge formats include DAT/DDS, DLT and LTO, with capacities in the tens to hundreds of gigabytes.

Technical details

Tape width

Medium width is the primary classification criterion for tape technologies. The most common width of tape for high capacity data storage has long been one half inch. Many other sizes exist and most were developed to either have smaller packaging or higher capacity.

Recording method

The choice of recording method also provides an important classification of tape technologies. The main methods are:

linear (or longitudinal)
helical scan
linear serpentine

The linear method arranges data in long parallel tracks that span the length of the tape. Multiple tape heads simultaneously write parallel tracks on a single medium. Used in early tape drives, it was the simplest but also the least efficient method.

The helical scan method writes short, dense tracks diagonally across the tape rather than along its length. The head is placed on a drum that rotates at high speed over a slowly moving tape.

A much later variation on linear technology is linear serpentine recording, which uses more tracks than tape heads. Each head still writes one track at a time. After making a pass over the whole length of the tape, all heads shift slightly and make another pass in the reverse direction, writing the next set of tracks. This procedure is repeated until all tracks have been read or written. By using the linear serpentine method, the tape medium can have many more tracks than there are read/write heads. Compared to simple linear recording, tape capacity is multiplied even when using the same tape length and the same number of heads.
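The serpentine pass sequence can be sketched in a few lines. This is an illustration only: the function name is hypothetical, and the assumed layout (each head owns a contiguous band of tracks and steps one track per pass) is one plausible arrangement, not the layout of any particular format.

```python
def serpentine_passes(num_tracks, num_heads):
    """List (pass number, direction, tracks written) for each end-to-end pass."""
    if num_tracks % num_heads:
        raise ValueError("num_tracks must be a multiple of num_heads")
    passes = num_tracks // num_heads   # each head writes one track per pass
    schedule = []
    for p in range(passes):
        # The tape reverses direction after every pass instead of rewinding.
        direction = "forward" if p % 2 == 0 else "reverse"
        # Assumed layout: head h owns tracks h*passes .. h*passes + passes - 1.
        tracks = [h * passes + p for h in range(num_heads)]
        schedule.append((p, direction, tracks))
    return schedule

schedule = serpentine_passes(num_tracks=8, num_heads=2)
for entry in schedule:
    print(entry)
# (0, 'forward', [0, 4])
# (1, 'reverse', [1, 5])
# (2, 'forward', [2, 6])
# (3, 'reverse', [3, 7])
```

Two heads cover eight tracks in four passes, so the same tape length holds four times the data of a two-track linear recording.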

Figure: The recording methods (linear, helical, linear serpentine).

Block layout

In a typical format, data is written to tape in blocks with inter-block gaps between them, and each block is written in a single operation with the tape running continuously during the write. However, since the rate at which data is written or read to the tape drive is not deterministic, a tape drive usually has to cope with a difference between the rate at which data goes on and off the tape and the rate at which data is supplied or demanded by its host.

Various methods have been used alone and in combination to cope with this difference. The tape drive can be stopped, backed up, and restarted (known as shoe-shining, because of increased wear of both medium and head). A large memory buffer can be used to queue the data. The host can assist this process by choosing appropriate block sizes to send to the tape drive. There is a complex tradeoff between block size, the size of the data buffer in the record/playback deck, the percentage of tape lost on inter-block gaps, and read/write throughput.

Finally, modern tape drives offer a speed-matching feature, in which the drive can dynamically reduce the physical tape speed by as much as 50% to avoid shoe-shining.
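The decision a speed-matching drive faces can be sketched as follows. This is a simplified model, not any vendor's algorithm: the function name and the example rates are hypothetical, and real drives typically choose from a few discrete speeds rather than a continuous range.

```python
def matched_speed(host_rate, native_rate, min_fraction=0.5):
    """Return (tape speed as a fraction of full speed, True if the drive can stream)."""
    wanted = host_rate / native_rate
    if wanted >= 1.0:
        return 1.0, True            # host keeps up, run at full speed
    if wanted >= min_fraction:
        return wanted, True         # slow the tape to match the host
    return min_fraction, False      # host too slow even at minimum speed

# Hypothetical drive with a native rate of 120 MB/s.
fast_host = matched_speed(120.0, 120.0)
medium_host = matched_speed(90.0, 120.0)
slow_host = matched_speed(30.0, 120.0)

print(fast_host)    # (1.0, True)
print(medium_host)  # (0.75, True)
print(slow_host)    # (0.5, False)
```

In the last case the host supplies data at a quarter of the native rate, so even at half speed the drive would still have to stop and reposition, or rely on buffering.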

Sequential access to data

From the user's perspective, the primary difference between tape data storage and disk data storage is that tape is a sequential access medium while disk is a random access medium. Hence tape uses a very simple filesystem in which files are addressed by number, not by filename. Metadata such as file name or modification time is typically not stored at all. Over time, tools such as tar were introduced to store metadata by using richer formats that pack multiple files into a single large 'tape file'.

Another difference from hard disk storage is that data is generally added by appending a file to the end of the recording, not by overwriting a particular file (or part of a file) in the middle of the tape.

A trivial example interaction of an *nix user with tape might be:

 (user inserts an empty tape manually; in case of a robotic library uses mtx)
 # /dev/rmt/mytape is assumed to be a no-rewind device, so each write continues after the previous file
 tar cvf /dev/rmt/mytape   /mydata                  # backup mydata to file 1
 dd if=/dev/urandom of=/dev/rmt/mytape count=123    # write random bytes to file 2 (urandom does not block)
 cat myfile   > /dev/rmt/mytape                     # copy "myfile" to tape as file 3
 mt -f /dev/rmt/mytape offline                      # eject the tape

Example of reading from tape:

 (user inserts the same tape)
 mt -f /dev/rmt/mytape fsf 2                        # wind the tape right to file 3
 cat /dev/rmt/mytape  > myrestoredfile              # copy entire file back to disk
 mt -f /dev/rmt/mytape offline                      # eject the tape
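To illustrate how a tar-style archive carries the metadata that a bare tape file lacks, the sketch below writes and re-reads a small archive with Python's standard tarfile module. An in-memory buffer stands in for the tape device, and the file name and timestamp are made up for the example.

```python
import io
import tarfile

# Build a small archive in memory; on a real system the target would be a tape device.
payload = b"hello tape\n"
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    info = tarfile.TarInfo(name="mydata/hello.txt")  # hypothetical file name
    info.size = len(payload)
    info.mtime = 1_000_000_000                        # fixed timestamp for reproducibility
    archive.addfile(info, io.BytesIO(payload))

# Read the archive back from the start, as a drive would after a rewind.
buffer.seek(0)
with tarfile.open(fileobj=buffer, mode="r") as archive:
    members = [(m.name, m.size, m.mtime) for m in archive]

print(members)  # [('mydata/hello.txt', 11, 1000000000)]
```

The name, size, and modification time travel inside the archive itself, so the tape only needs to hold one numbered file.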

Access time

Tape has quite a long latency for random accesses since the deck must wind an average of one-third the tape length to move from one arbitrary data block to another. Most tape systems attempt to alleviate the intrinsic long latency, either using indexing, where a separate lookup table (tape directory) is maintained which gives the physical tape location for a given data block number (a must for serpentine drives), or by marking blocks with a tape mark that can be detected while winding the tape at high speed.
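The one-third figure can be checked numerically. The sketch below averages the distance between every pair of evenly spaced positions on a tape of unit length; it assumes a single linear track with source and destination equally likely anywhere along it, which does not hold for serpentine layouts. The function name is illustrative.

```python
def mean_seek_fraction(positions=1000):
    """Average |a - b| over all pairs of evenly spaced positions on a unit-length tape."""
    points = [i / (positions - 1) for i in range(positions)]
    total = sum(abs(a - b) for a in points for b in points)
    return total / positions ** 2

average = mean_seek_fraction()
print(round(average, 3))  # 0.334
```

The result approaches 1/3 as the number of positions grows, matching the closed-form expectation of the distance between two independent uniform points on a line segment.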

Data compression

Most tape drives now include some kind of data compression. There are several algorithms which provide similar results: LZ (most), IDRC (Exabyte), ALDC (IBM, QIC) and DLZ1 (DLT). Embedded in tape drive hardware, these operate on a relatively small buffer of data at a time, so they cannot achieve especially high compression ratios. A ratio of 2:1 is typical, though some vendors claim 2.6:1 or 3:1. However, hardware compression is a must for any high-end or midrange tape drive, as it provides very high throughput that would not be achievable using standard software compression on a host CPU.

Some enterprise tape drives allow encryption to be performed after compression. (Once data has been encrypted, compression algorithms are no longer effective.) Symmetric streaming encryption algorithms are also implemented in the drive to provide high performance.

The actual compression algorithms used in low-end products are not the most effective known today, and better results can usually be obtained by turning off the compression built into the device and using a software compression (and encryption) program instead.
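The reason compression has to come before encryption can be demonstrated with a short experiment. This is a sketch only: zlib stands in for the drive's compression algorithm, and random bytes stand in for ciphertext, on the assumption that well-encrypted data is statistically indistinguishable from random data.

```python
import os
import zlib

# Highly repetitive sample data, as backups of structured records often are.
plain = b"customer record 0001;" * 5000

# Stand-in for encrypted output of the same length (assumption: ciphertext looks random).
ciphertext_like = os.urandom(len(plain))

ratio_plain = len(plain) / len(zlib.compress(plain))
ratio_cipher = len(ciphertext_like) / len(zlib.compress(ciphertext_like))

print(f"plain data:      {ratio_plain:.1f}:1")
print(f"ciphertext-like: {ratio_cipher:.2f}:1")
```

The repetitive input shrinks dramatically, while the random input does not shrink at all and usually grows by a few bytes of framing overhead.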

Viability

Tape remains a viable alternative to disk due to its higher bit density and lower cost per bit. Tape has historically offered enough advantage in these two areas above disk storage to make it a viable product, particularly for backup. The rapid improvement in disk storage density and price, coupled with arguably less-vigorous innovation in tape storage, has reduced the market share of tape storage products.

Chronological list of tape formats

1951 - UNISERVO
1952 - IBM 7 Track

1962 - LINCtape
1963 - DECtape
1964 - 9 Track
1964 - Magnetic Tape Selectric Typewriter

1972 - QIC
1975 - KC Standard, Compact Cassette
1976 - DC100
1977 - Datassette
1979 - DECtape II
1979 - Exatron Stringy Floppy

1983 - ZX Microdrive
1984 - Rotronics Wafadrive
1984 - IBM 3480
1984 - DLT

1986 - SLR
1987 - Data8
1989 - DDS/DAT

1992 - Ampex DST
1994 - Mammoth
1995 - IBM 3590
1995 - Redwood SD-3
1995 - Travan
1996 - AIT
1997 - IBM 3570 MP
1998 - T9840
1999 - VXA

2000 - T9940
2000 - LTO Ultrium
2003 - SAIT
2006 - T10000
