Ackermann function

In the theory of computation, the Ackermann function or Ackermann–Péter function is a simple example of a recursive function that is not primitive recursive. It takes two natural numbers as arguments and yields a natural number, and its value grows extremely quickly. Even for small inputs, such as (4, 3), the values of the Ackermann function become so large that they cannot be feasibly computed; indeed, their decimal expansions cannot even be stored in the entire physical universe.

History

In 1928, Wilhelm Ackermann, a mathematician studying the foundations of computation, originally considered a function A(m, n, p) of three variables: the p-fold iterated exponentiation of m with n, written m → n → p in Conway chained arrow notation. When p = 1, this is simply m^n, which is m multiplied by itself n times. When p = 2, it is a tower of exponents with n levels, or roughly m raised to its own power n times. This generalization can be continued indefinitely as p becomes larger.
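
The rapid growth of this three-variable scheme can be illustrated with a short program. The sketch below (written in Python for illustration; the name chain is not standard) follows the usual recursion for three-element chained-arrow expressions:

 def chain(m, n, p):
     """m → n → p: the p-fold iterated exponentiation of m with n.
     chain(m, n, 1) is m**n; chain(m, n, 2) is a tower of n copies of m."""
     if p == 1:
         return m ** n
     if n == 1:
         return m
     return chain(m, chain(m, n - 1, p), p - 1)

 print(chain(3, 3, 1))   # 3**3 = 27
 print(chain(3, 3, 2))   # 3**3**3 = 7625597484987

Even chain(3, 3, 3), a tower of 7625597484987 threes, is already far too large to evaluate.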

Ackermann proved that A is a recursive function, that is, a function that a computer with unlimited memory can calculate, but that it is not a primitive recursive function; the primitive recursive functions form a class that includes almost all familiar functions, such as addition and the factorial.

In On the Infinite, David Hilbert hypothesized that the Ackermann function was not primitive recursive, but it was Ackermann, a former student of Hilbert and his personal secretary, who actually proved the hypothesis in his paper On Hilbert's Construction of the Real Numbers. On the Infinite was Hilbert's most important paper on the foundations of mathematics, serving as the heart of Hilbert's program to secure the foundation of transfinite numbers by basing them on finite methods. The paper also outlines a proof of the continuum hypothesis and was central in influencing Kurt Gödel to study the completeness and consistency of mathematics, leading to Gödel's incompleteness theorem.

A similar function of only two variables was later defined by Rózsa Péter and Raphael Robinson; its definition is given below. In the table of its values given below, the entries beyond the first few rows are three less than powers of two. For the exact relation between the two functions, see below.

Definition and properties

The Ackermann function is defined recursively for non-negative integers m and n as follows:

 A(m, n) = n + 1                         if m = 0
 A(m, n) = A(m − 1, 1)                   if m > 0 and n = 0
 A(m, n) = A(m − 1, A(m, n − 1))         if m > 0 and n > 0

The Ackermann function can be calculated by a simple function based directly on the definition:


 def ack(m, n):
     # Direct transcription of the recursive definition above.
     if m == 0:
         return n + 1
     elif n == 0:
         return ack(m - 1, 1)
     else:
         return ack(m - 1, ack(m, n - 1))

The same function can be written partially iteratively as:

 def ack(m, n):
     # Iterate on m; the inner call A(m, n - 1) still uses recursion.
     while m != 0:
         if n == 0:
             n = 1
         else:
             n = ack(m, n - 1)
         m = m - 1
     return n + 1

It may be surprising that these functions always return a value. This is because at each recursive call either m decreases, or m stays the same and n decreases. The pair (m, n) therefore decreases in lexicographic order at every call, and since there is no infinite descending sequence of such pairs of natural numbers, the recursion must eventually terminate. Note, however, that when m decreases there is no upper bound on how much n can increase, and it will often increase greatly.

The Ackermann function can also be expressed non-recursively in terms of Conway chained arrow notation:

A(m, n) = (2 → (n+3) → (m − 2)) − 3 for m>2

hence

2 → n → m = A(m+2, n − 3) + 3 for n > 2

(The cases n = 1 and n = 2 would correspond to A(m, −2) = −1 and A(m, −1) = 1, which could logically be added.)

or in terms of the hyper operators:

A(m, n) = hyper(2, m, n + 3) − 3.
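
This identity can be checked for small arguments with a short program. In the sketch below (Python, for illustration only), hyper follows the usual recursive definition of the hyper operators, with the rank as its second argument:

 def hyper(a, k, b):
     """hyper(a, k, b): successor, addition, multiplication, exponentiation, ..."""
     if k == 0:
         return b + 1
     if k == 1:
         return a + b
     if k == 2:
         return a * b
     if k == 3:
         return a ** b
     if b == 0:
         return 1
     return hyper(a, k - 1, hyper(a, k, b - 1))

 def ack(m, n):
     if m == 0:
         return n + 1
     if n == 0:
         return ack(m - 1, 1)
     return ack(m - 1, ack(m, n - 1))

 # Verify A(m, n) = hyper(2, m, n + 3) - 3 for small m and n.
 for m in range(4):
     for n in range(4):
         assert ack(m, n) == hyper(2, m, n + 3) - 3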

For small values of m like 1, 2, or 3, the Ackermann function grows relatively slowly with respect to n (at most exponentially). For m ≥ 4, however, it grows much more quickly; even A(4, 2) is about 2×10^19728, and the decimal expansion of A(4, 3) cannot be recorded in the physical universe. If we define the function f(n) = A(n, n), which increases both m and n at the same time, we have a function of one variable that dwarfs every primitive recursive function, including very fast-growing functions such as the exponential function, the factorial function, multi- and superfactorial functions, and even functions defined using Knuth's up-arrow notation (except when the indexed up-arrow is used).

This extreme growth can be exploited to show that f, which is obviously computable on a machine with infinite memory such as a Turing machine and so is a recursive function, grows faster than any primitive recursive function and is therefore not primitive recursive. In combination with the Ackermann function's applications in the analysis of algorithms, discussed later, this refutes the idea that every useful or simply defined function is primitive recursive.

One surprising aspect of the Ackermann function is that the only arithmetic operations it ever uses are addition and subtraction of 1. Its properties come solely from the power of unlimited recursion. This also implies that its running time is at least proportional to its output, and so is also extremely large. In most cases the running time is in fact far larger than the output; see below.
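
This can be observed directly by counting recursive calls. The sketch below (Python, for illustration only) compares the number of calls with the value returned for small arguments:

 # Count the recursive calls made while evaluating A(m, n). For m >= 1 the
 # count is at least the returned value, and it rapidly becomes far larger.
 calls = 0

 def ack(m, n):
     global calls
     calls += 1
     if m == 0:
         return n + 1
     if n == 0:
         return ack(m - 1, 1)
     return ack(m - 1, ack(m, n - 1))

 for m in range(4):
     for n in range(4):
         calls = 0
         value = ack(m, n)
         print(f"A({m}, {n}) = {value} after {calls} calls")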

Table of values

Computing the Ackermann function can be restated in terms of an infinite table. We place the natural numbers along the top row. To determine a number in the table, take the number immediately to its left, then look up the entry in the previous row at the position given by that number. If there is no number to its left, simply look at column 1 of the previous row. Here is a small upper-left portion of the table (a short program carrying out exactly this procedure follows it):

Values of A(m, n)
m\n   0         1              2                3                  4                  n
0     1         2              3                4                  5                  n + 1
1     2         3              4                5                  6                  n + 2
2     3         5              7                9                  11                 2n + 3
3     5         13             29               61                 125                2^(n+3) − 3
4     13        65533          2^65536 − 3      A(3, 2^65536 − 3)  A(3, A(4, 3))      A(3, A(4, n−1))
5     65533     A(4, 65533)    A(4, A(5, 1))    A(4, A(5, 2))      A(4, A(5, 3))      A(4, A(5, n−1))
6     A(5, 1)   A(5, A(5, 1))  A(5, A(6, 1))    A(5, A(6, 2))      A(5, A(6, 3))      A(5, A(6, n−1))
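
The following sketch (Python, for illustration only) carries out this procedure, filling in earlier rows on demand; it is feasible only for the first few rows:

 def table_value(table, m, n):
     """Entry in row m, column n of the table, filling earlier rows as needed."""
     row = table.setdefault(m, [])
     while len(row) <= n:
         k = len(row)
         if m == 0:
             row.append(k + 1)                              # top row: n + 1
         elif k == 0:
             # nothing to the left: use column 1 of the previous row
             row.append(table_value(table, m - 1, 1))
         else:
             # take the number immediately to the left and look up that
             # position in the previous row
             row.append(table_value(table, m - 1, row[k - 1]))
     return row[n]

 table = {}
 for m in range(4):   # rows beyond m = 3 quickly need astronomically long earlier rows
     print(m, [table_value(table, m, n) for n in range(5)])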

A(4, 2) is greater than the number of particles in the universe raised to the power 200. A(5, 2) is the item at column A(5, 1) in the m = 4 row, and cannot be written as a decimal expansion in the physical universe. Beyond row 4 and column 1, the values can no longer be feasibly written with any standard notation other than the Ackermann function itself — writing them as decimal expansions, or even as references to rows with lower m, is not possible.

If you were able to expand every particle in the universe to a universe the size of ours by snapping your fingers, and likewise with all the particles in the created universes, and did this repeatedly, you would die of old age before the number of particles reached A(4, 3). Note that A(5, 1) is larger than even this number.

Despite the inconceivably large values occurring in this early section of the table, some even larger numbers have been defined, such as Graham's number, which cannot be written with any small (or, indeed, recordable) number of Knuth arrows. This number is constructed with a technique similar to applying the Ackermann function to itself recursively. Extending the table far enough to surpass it would be about as hopeless as trying to reach it by counting up through the natural numbers.

Explanation

To see how the Ackermann function grows so quickly, it helps to expand out some simple expressions using the rules in the original definition. For example, we can fully evaluate A(1, 2) in the following way:

 
A(1, 2) = A(0, A(1,1))
        = A(0, A(0, A(1,0)))
        = A(0, A(0, A(0,1)))
        = A(0, A(0, 2))
        = A(0, 3)
        = 4
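
The same sequence of calls can be reproduced mechanically. The sketch below (Python, for illustration only) prints every call made while evaluating A(1, 2); these are exactly the calls appearing in the expansion above:

 def ack_traced(m, n, depth=0):
     """Evaluate A(m, n), printing each call with indentation showing its depth."""
     print("  " * depth + f"A({m}, {n})")
     if m == 0:
         return n + 1
     if n == 0:
         return ack_traced(m - 1, 1, depth + 1)
     inner = ack_traced(m, n - 1, depth + 1)
     return ack_traced(m - 1, inner, depth + 1)

 # Prints A(1, 2), A(1, 1), A(1, 0), A(0, 1), A(0, 2), A(0, 3), then the value 4.
 print(ack_traced(1, 2))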

Now let us attempt the more complex A(4, 3), the first value with fairly small n which cannot be recorded as a decimal expansion in the physical universe:

A(4, 3) = A(3, A(4, 2))
        = A(3, A(3, A(4, 1)))
        = A(3, A(3, A(3, A(4, 0))))
        = A(3, A(3, A(3, A(3, 1))))
        = A(3, A(3, A(3, A(2, A(3, 0)))))
        = A(3, A(3, A(3, A(2, A(2, 1)))))
        = A(3, A(3, A(3, A(2, A(1, A(2, 0))))))
        = A(3, A(3, A(3, A(2, A(1, A(1, 1))))))
        = A(3, A(3, A(3, A(2, A(1, A(0, A(1, 0)))))))
        = A(3, A(3, A(3, A(2, A(1, A(0, A(0, 1)))))))
        = A(3, A(3, A(3, A(2, A(1, A(0, 2))))))
        = A(3, A(3, A(3, A(2, A(1, 3)))))
        = A(3, A(3, A(3, A(2, A(0, A(1, 2))))))
        = A(3, A(3, A(3, A(2, A(0, A(0, A(1, 1)))))))
        = A(3, A(3, A(3, A(2, A(0, A(0, A(0, A(1, 0))))))))
        = A(3, A(3, A(3, A(2, A(0, A(0, A(0, A(0, 1))))))))
        = A(3, A(3, A(3, A(2, A(0, A(0, A(0, 2)))))))
        = A(3, A(3, A(3, A(2, A(0, A(0, 3))))))
        = A(3, A(3, A(3, A(2, A(0, 4)))))
        = A(3, A(3, A(3, A(2, 5))))
        = ...
        = A(3, A(3, A(3, 13)))
        = ...
        = A(3, A(3, 65533))
        = ...

We stop here because A(3, 65533) equals 2^65536 − 3, a number much larger than the number of atoms in the universe, and fully expanding out the expression A(3, 65533) would form a line roughly 10^20000 times longer than the diameter of the universe. The final result is then obtained by using this number as an exponent of 2 once more: A(4, 3) = A(3, 2^65536 − 3) = 2^(2^65536) − 3.

Inverse

Since the function f(n) = A(n, n) considered above grows very rapidly, its inverse function, f⁻¹, grows very slowly. This inverse Ackermann function f⁻¹ is usually denoted by α. In fact, α(n) is less than 5 for any conceivable input size n, since A(4, 4) has a number of digits that cannot itself be written in binary in the physical universe. For all practical purposes, f⁻¹(n) can be regarded as a constant.
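
Because only the first few values of A(k, k) are small, this inverse can be computed directly; the sketch below (Python, for illustration only; the function name is not standard) does so:

 def inverse_ackermann(n):
     """Smallest k with A(k, k) >= n; at most 4 for any n that could be stored."""
     # A(0, 0) = 1, A(1, 1) = 3, A(2, 2) = 7, A(3, 3) = 61; A(4, 4) is already
     # far too large ever to be exceeded by a representable n.
     for k, value in enumerate([1, 3, 7, 61]):
         if value >= n:
             return k
     return 4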

This inverse appears in the time complexity of some algorithms, such as the disjoint-set data structure and Chazelle's algorithm for minimum spanning trees. Sometimes Ackermann's original function or other variations are used in these settings, but they all grow at similarly high rates. In particular, some modified functions simplify the expression by eliminating the −3 and similar terms.

A two-parameter variation of the inverse Ackermann function can be defined as follows:

 α(m, n) = min { i ≥ 1 : A(i, ⌊m/n⌋) ≥ log2 n }

This function arises in more precise analyses of the algorithms mentioned above, and gives a more refined time bound. In the disjoint-set data structure, m represents the number of operations while n represents the number of elements; in the minimum spanning tree algorithm, m represents the number of edges while n represents the number of vertices. Several slightly different definitions of α(mn) exist; for example, log2 n is sometimes replaced by n, and the floor function is sometimes replaced by a ceiling.

Use as benchmark

The Ackermann function, because of its definition in terms of extremely deep recursion, can be used as a benchmark of a compiler's ability to optimize recursion. For example, a compiler that, in computing A(3, 30), is able to save intermediate values such as the A(3, n) and A(2, n) arising in that calculation, rather than recomputing them, can speed up the computation of A(3, 30) by a factor of hundreds of thousands. Also, if A(2, n) is computed directly rather than as a recursive expansion of the form A(1, A(1, A(1,...A(1, 0)...))), this saves a significant amount of time. Computing A(1, n) takes time linear in n. Computing A(2, n) requires quadratic time, since it expands to O(n) nested calls to A(1, i) for various i. Computing A(3, n) requires time roughly proportional to 4^(n+1). Note that the computation of A(3, 1) in the example above takes 16 (4^2) steps.
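
The effect of saving intermediate values can be demonstrated at a smaller argument. The sketch below (Python, for illustration only) counts how many times the function body actually runs, with and without a cache, when computing A(3, 8); A(3, 30) itself is far too expensive to expand recursively here:

 import sys
 from functools import lru_cache

 sys.setrecursionlimit(20000)   # plain recursion for A(3, 8) goes a few thousand frames deep
 calls = 0

 def ack_plain(m, n):
     global calls
     calls += 1
     if m == 0:
         return n + 1
     if n == 0:
         return ack_plain(m - 1, 1)
     return ack_plain(m - 1, ack_plain(m, n - 1))

 @lru_cache(maxsize=None)
 def ack_cached(m, n):
     global calls
     calls += 1                 # only counts cache misses, i.e. distinct evaluations
     if m == 0:
         return n + 1
     if n == 0:
         return ack_cached(m - 1, 1)
     return ack_cached(m - 1, ack_cached(m, n - 1))

 calls = 0
 print(ack_plain(3, 8), calls)    # 2045, after hundreds of thousands of calls
 calls = 0
 print(ack_cached(3, 8), calls)   # 2045, after only a few thousand distinct evaluations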

It may be noted that A(4, 2), which appears as a decimal expansion on several web pages, cannot possibly be computed by straightforward recursive application of the Ackermann function in any even remotely plausible amount of time. Instead, closed-form formulas such as A(3, n) = 8×2^n − 3 are used to quickly complete some of the recursive calls.
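
A sketch of this kind of shortcut (Python, for illustration only; ack_fast is not a standard name) replaces the first few rows by their closed forms and leaves the rest of the recursion unchanged:

 def ack_fast(m, n):
     """Ackermann function with closed-form shortcuts for rows 0 to 3."""
     if m == 0:
         return n + 1
     if m == 1:
         return n + 2
     if m == 2:
         return 2 * n + 3
     if m == 3:
         return 2 ** (n + 3) - 3    # = 8 * 2**n - 3, as in the text above
     if n == 0:
         return ack_fast(m - 1, 1)
     return ack_fast(m - 1, ack_fast(m, n - 1))

 print(len(str(ack_fast(4, 2))))    # A(4, 2) = 2**65536 - 3 has 19729 decimal digits

Even with these shortcuts, A(4, 3) = 2^(2^65536) − 3 remains far out of reach.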

References

  • Jean van Heijenoort, From Frege to Gödel: A Source Book in Mathematical Logic, 1879–1931, Harvard University Press, 1967. This is an invaluable reference for understanding the context of Ackermann's paper On Hilbert's Construction of the Real Numbers; it contains that paper as well as Hilbert's On the Infinite and Gödel's two papers on the completeness and consistency of mathematics.