FLOPS

In computing, FLOPS (or flops) is an acronym for FLoating point Operations Per Second. It is used as a measure of a computer's performance, especially in scientific fields that make heavy use of floating-point calculations, and is analogous to instructions per second. Strictly, one should speak in the singular of a FLOPS and not of a FLOP, although the latter is frequently encountered: the final S stands for "second" and does not indicate a plural. Alternatively, the singular FLOP (or flop) is used as an abbreviation for "FLoating-point OPeration", and a flop count is a count of these operations (e.g., the number required by a given algorithm or computer program). In this context, "flops" is simply the plural rather than a rate.

Computing devices exhibit an enormous range of performance levels in floating-point applications, so it makes sense to introduce larger units than FLOPS.
The standard SI prefixes can be used for this purpose, resulting in such units as:

megaFLOPS (MFLOPS = 1,000,000 FLOPS)
gigaFLOPS (GFLOPS = 1,000 MFLOPS)
teraFLOPS (TFLOPS = 1,000 GFLOPS)
petaFLOPS (PFLOPS = 1,000 TFLOPS)
exaFLOPS (EFLOPS = 1,000 PFLOPS) and
zettaFLOPS (ZFLOPS = 1,000 EFLOPS).
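
These conversions are simple powers-of-ten scalings. As a minimal illustration in Python (the helper below is our own, not a standard function):

    # Sketch: render a raw FLOPS figure using the SI-prefixed units above.
    # The function name and table are illustrative, not a standard API.
    _PREFIXES = [
        (1e21, "ZFLOPS"),
        (1e18, "EFLOPS"),
        (1e15, "PFLOPS"),
        (1e12, "TFLOPS"),
        (1e9, "GFLOPS"),
        (1e6, "MFLOPS"),
    ]

    def format_flops(flops: float) -> str:
        """Return a FLOPS value scaled to the largest fitting SI prefix."""
        for factor, unit in _PREFIXES:
            if flops >= factor:
                return f"{flops / factor:.3g} {unit}"
        return f"{flops:.3g} FLOPS"

    print(format_flops(360e12))  # "360 TFLOPS" -- Blue Gene/L (see Records)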

As of 2006, the fastest supercomputer's performance tops out at one petaflops (the special-purpose MDGRAPE-3; see Records below). A basic calculator, by contrast, performs relatively few FLOPS. Each calculation request to a typical calculator requires only a single operation, so there is rarely any need for its response time to be faster than the operator requires. Since any response time below 0.1 second is perceived as instantaneous by a human operator, a simple calculator could be said to need only about 10 FLOPS.
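The calculator estimate is simple arithmetic; a sketch of the reasoning in Python:

    # The estimate above: one floating-point operation per request,
    # delivered within the 0.1 s threshold humans perceive as instantaneous.
    response_time_s = 0.1   # perceived-instant response threshold
    ops_per_request = 1     # a basic calculator does one operation per request
    print(ops_per_request / response_time_s)  # 10.0 -> about 10 FLOPS suffices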

FLOPS as a measure of performance

In order for FLOPS to be useful as a measure of floating-point performance, a standard benchmark must be available on all computers of interest. One example is the LINPACK benchmark.
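
The general recipe of such benchmarks is to time a known floating-point workload and divide its nominal operation count by the elapsed time. A toy sketch in that spirit (this is not the LINPACK benchmark itself, and it assumes NumPy is installed):

    import time
    import numpy as np

    n = 1000
    a = np.random.rand(n, n)
    b = np.random.rand(n, n)

    start = time.perf_counter()
    c = a @ b  # dense matrix-matrix multiply
    elapsed = time.perf_counter() - start

    # A dense n x n matrix multiply performs about 2*n**3 floating-point
    # operations (n multiplies and n-1 adds per output element).
    flop_count = 2 * n**3
    print(f"{flop_count / elapsed / 1e9:.2f} GFLOPS sustained on this kernel")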

FLOPS in isolation are arguably not very useful as a benchmark for modern computers. There are many factors in computer performance other than raw floating-point computation speed, such as I/O performance, interprocessor communication, cache coherence, and the memory hierarchy. This means that supercomputers are in general capable of only a small fraction of their "theoretical peak" FLOPS throughput (obtained by adding together the theoretical peak FLOPS of every element of the system). Even when operating on large, highly parallel problems, their performance will be bursty, mostly due to the residual effects of Amdahl's law. Real benchmarks therefore measure both actual peak FLOPS and sustained FLOPS performance.
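
A "theoretical peak" figure of this kind is assembled by multiplying out the peak throughput of every element of the system. A sketch of the arithmetic, with purely illustrative machine parameters:

    # Sketch: theoretical peak FLOPS for a hypothetical cluster.
    # Every number here is an assumption for illustration only.
    nodes = 1024
    cores_per_node = 2
    flops_per_core_per_cycle = 4   # e.g., a vector or fused multiply-add unit
    clock_hz = 2.0e9

    peak = nodes * cores_per_node * flops_per_core_per_cycle * clock_hz
    print(f"theoretical peak: {peak / 1e12:.1f} TFLOPS")  # 16.4 TFLOPS
    # Sustained performance on real workloads is typically a small
    # fraction of this figure, for the reasons given above.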

For ordinary (non-scientific) applications, integer operations (measured in MIPS) are far more common. Floating-point speed therefore does not accurately predict how a processor will perform on arbitrary workloads. For many scientific jobs, however, such as data analysis, a FLOPS rating is an effective measure.

Historically, the earliest reliably documented serious use of the floating-point operation as a metric appears to be the U.S. Atomic Energy Commission's (AEC) justification to Congress for purchasing a Control Data CDC 6600 in the mid-1960s.

The terminology remains confusing. Until April 24, 2006, U.S. export control was based upon a measurement of "Composite Theoretical Performance" (CTP) in millions of "Theoretical Operations Per Second", or MTOPS. On that date, the U.S. Department of Commerce's Bureau of Industry and Security amended the Export Administration Regulations to base controls on Adjusted Peak Performance (APP), measured in Weighted TeraFLOPS (WT).


Records

Today, IBM's Blue Gene/L is the world's fastest general-purpose computer, at 360 teraflops. In June 2006, a new computer was announced by the Japanese research institute RIKEN: the MDGRAPE-3. Its performance tops out at one petaflops, nearly three times that of Blue Gene/L. MDGRAPE-3 is not a general-purpose computer, which is why it does not appear in the TOP500 list; it has special-purpose pipelines for simulating molecular dynamics. MDGRAPE-3 houses 4,808 custom processors, 64 servers each with 256 dual-core processors, and 37 servers each containing 74 processors, for a total of 40,314 processor cores, compared with the 131,072 needed for Blue Gene/L. MDGRAPE-3 can perform many more computations with far fewer chips because of its specialized architecture. The computer is a joint project between RIKEN, Hitachi, Intel, and NEC subsidiary SGI Japan.

Distributed computing uses the Internet to link personal computers to achieve a similar effect:

  • The entire BOINC network averages 536 TFLOPS. [1]
  • SETI@home computes data at more than 700 TFLOPS. [2]
  • Folding@home has reached over 1 PFLOPS. [3] As of March 22, 2007, PlayStation 3 owners may also participate in the project; because of this, it is now sustaining considerably more than 210 TFLOPS (990 TFLOPS as of March 25, 2007). See the current stats [4] for details.
  • Einstein@home is crunching more than 65 TFLOPS. [5]
  • As of June 2005, GIMPS is sustaining 23 TFLOPS. [6]
  • Intel Corporation has recently unveiled the experimental multi-core POLARIS chip, which achieves 1 TFLOPS at 3.2 GHz. The 80-core chip can increase this to 1.8 TFLOPS at 5.6 GHz, although the thermal dissipation at this frequency exceeds 260 watts (see the back-of-the-envelope check after this list).
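
As a back-of-the-envelope check on the POLARIS figures in the last item above (assuming, purely for illustration, that all 80 cores contribute equally):

    # Implied per-core throughput from the quoted POLARIS numbers.
    cores = 80
    clock_hz = 3.2e9
    total_flops = 1.0e12  # 1 TFLOPS at 3.2 GHz
    print(f"{total_flops / (cores * clock_hz):.2f} FLOPs per core per cycle")
    # ~3.91, i.e., a few floating-point operations per core each cycle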

Cost of computing

  • 1997: about US$30,000 per GFLOPS, with two 16-processor Pentium Pro Beowulf cluster computers [7]
  • 2000, April: $1,000 per GFLOPS, Bunyip, Australian National University; the first machine under US$1 per MFLOPS. Gordon Bell Prize, 2000.
  • 2000, May: $640 per GFLOPS, KLAT2, University of Kentucky
  • 2003, August: $82 per GFLOPS, KASY0, University of Kentucky
  • 2005: about $2.60 per GFLOPS ($300 for 115 GFLOPS, CPU only) in the Xbox 360, assuming Linux is implemented on it as intended [8] (see the arithmetic sketch after this list)
  • 2006, February: about $1 per GFLOPS in the ATI PC add-in graphics card (X1900 architecture); these figures are disputed, as they refer to highly parallelized GPU power (see above)
  • 2007, March: about $0.42 per GFLOPS in Ambric AM2045 [9]
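
The dollars-per-GFLOPS figures above are simple price-to-performance ratios. For example, the Xbox 360 entry works out as follows:

    # Cost-per-GFLOPS arithmetic behind the 2005 Xbox 360 entry above.
    price_usd = 300.0
    gflops = 115.0  # CPU-only figure quoted above
    print(f"${price_usd / gflops:.2f} per GFLOPS")  # $2.61, ~ the $2.60 quoted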


This trend toward low cost follows Moore's law.

Pop culture references

  • In the Star Trek fictional universe, circa 2364, the android Data was constructed with an initial linear computational speed rated at 60 trillion operations per second, or 60 TIPS (thereby potentially 'dating' the series Star Trek: The Next Generation, in which he appears); however, he was later able to far exceed this limit by modifying his hardware and software.
  • In the movie Terminator 3: Rise of the Machines, Skynet is said to be operating at "60 teraflops per second", a nonsensical misuse of the term, since the "per second" is already part of "flops".

References