
Inferential statistics

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 193.163.255.100 (talk) at 09:57, 5 September 2005. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Inferential statistics, or statistical induction, is the branch of statistics concerned with generalizing from samples to populations, testing hypotheses, determining relationships among variables, and making predictions. Its two main schools are frequentist inference (frequency probability), which typically uses maximum likelihood estimation, and Bayesian inference. The following is an example of the latter.

From a population containing N items of which I are special, a sample containing n items of which i are special can be chosen in

        C(I,i)·C(N-I,n-i)

ways, where C(a,b) denotes the binomial coefficient "a choose b".

Fixing (N,n,I), this expression is the unnormalized deduction distribution function of i (after normalization, the hypergeometric distribution).

Fixing (N,n,i), this expression is the unnormalized induction distribution function of I.
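The two readings of the same count can be sketched in Python. This is only an illustration: it assumes the count is the product of binomial coefficients C(I,i)·C(N-I,n-i), and the function names `ways`, `deduction_distribution` and `induction_distribution` are hypothetical, not taken from the source. Normalizing over I with equal weights amounts to assuming every value of I is equally likely beforehand.

```python
from fractions import Fraction
from math import comb


def ways(N, n, I, i):
    """Count samples of size n holding i special items (assumed form of the count)."""
    if not (0 <= i <= I and 0 <= n - i <= N - I):
        return 0  # impossible combination
    return comb(I, i) * comb(N - I, n - i)


def deduction_distribution(N, n, I):
    """Normalize the count over i, with (N, n, I) held fixed."""
    weights = {i: ways(N, n, I, i) for i in range(n + 1)}
    total = sum(weights.values())
    return {i: Fraction(w, total) for i, w in weights.items() if w}


def induction_distribution(N, n, i):
    """Normalize the count over I, with (N, n, i) held fixed."""
    weights = {I: ways(N, n, I, i) for I in range(N + 1)}
    total = sum(weights.values())
    return {I: Fraction(w, total) for I, w in weights.items() if w}


print(deduction_distribution(5, 2, 2))
print(induction_distribution(1, 0, 0))
```

Exact fractions are used so that the normalized values can be compared without rounding error.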

The mean value ± the standard deviation of the deduction distribution is used to estimate i when (N,n,I) is known:

        i≈f(N,n,I) 

where

        f(N,n,I)=(nI±√[nI(N-n)(N-I)/(N-1)])/N 
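As a rough check, the closed form f(N,n,I) can be compared with the mean and standard deviation obtained by summing over every possible i. The sketch below assumes the weights are C(I,i)·C(N-I,n-i); the helper name `enumerated_moments` is hypothetical, and the closed form needs N > 1 to avoid dividing by zero.

```python
from math import comb, isclose, sqrt


def f(N, n, I):
    """Return (mean, standard deviation) from the closed form f(N,n,I); needs N > 1."""
    mean = n * I / N
    sd = sqrt(n * I * (N - n) * (N - I) / (N - 1)) / N
    return mean, sd


def enumerated_moments(N, n, I):
    """Return (mean, standard deviation) of i by summing over i = 0..n."""
    weights = [comb(I, i) * comb(N - I, n - i) for i in range(n + 1)]
    total = sum(weights)
    mean = sum(i * w for i, w in enumerate(weights)) / total
    var = sum((i - mean) ** 2 * w for i, w in enumerate(weights)) / total
    return mean, sqrt(var)


closed = f(10, 4, 3)
summed = enumerated_moments(10, 4, 3)
print(closed, summed)
```

For (N,n,I)=(10,4,3) both routes should agree up to floating-point rounding.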

The mean value ± the standard deviation of the induction distribution is used to estimate I when (N,n,i) is known:

        I≈-1-f(-2-n,-2-N,-1-i) 

Thus deduction is translated into induction by means of the involution

        (N,n,I,i) <———> (-2-n,-2-N,-1-i,-1-I)
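The involution can be tried out numerically. The following sketch evaluates f at the transformed, negative arguments and compares the result with a direct sum over I = 0..N, weighting every I by C(I,i)·C(N-I,n-i). The names `induction_estimate` and `enumerated_induction` are hypothetical, and the equal weighting of all I is an assumption of this illustration. Because N becomes negative after the transformation, the sign of the ± term flips, so the absolute value is taken.

```python
from math import comb, isclose, sqrt


def f(N, n, I):
    """Return (mean, signed deviation term) from the closed form f(N,n,I)."""
    mean = n * I / N
    sd = sqrt(n * I * (N - n) * (N - I) / (N - 1)) / N
    return mean, sd


def induction_estimate(N, n, i):
    """Apply I ≈ -1 - f(-2-n, -2-N, -1-i) and return (mean, standard deviation)."""
    mean, sd = f(-2 - n, -2 - N, -1 - i)
    return -1 - mean, abs(sd)  # abs(): the ± term has no preferred sign


def enumerated_induction(N, n, i):
    """Return (mean, standard deviation) of I by summing over I = 0..N."""
    weights = [comb(I, i) * comb(N - I, n - i) for I in range(N + 1)]
    total = sum(weights)
    mean = sum(I * w for I, w in enumerate(weights)) / total
    var = sum((I - mean) ** 2 * w for I, w in enumerate(weights)) / total
    return mean, sqrt(var)


print(induction_estimate(10, 4, 1), enumerated_induction(10, 4, 1))
```

The comparison is only meaningful for 0 ≤ i ≤ n ≤ N; outside that range the square root may receive a negative argument.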

Considering the case where the population contains 1 item and the sample is empty

        (N,n,i)=(1,0,0) 

the induction formula gives

        I≈1/2±1/2 

confirming that the number of special items in the population is either 0 or 1.
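The arithmetic behind this result can be written out step by step. Substituting (N,n,i)=(1,0,0) into the induction formula means evaluating f at (-2-n,-2-N,-1-i)=(-2,-3,-1):

```latex
\begin{aligned}
f(-2,-3,-1) &= \frac{(-3)(-1) \pm \sqrt{(-3)(-1)\,\bigl(-2-(-3)\bigr)\,\bigl(-2-(-1)\bigr)/(-2-1)}}{-2} \\
            &= \frac{3 \pm \sqrt{3\cdot 1\cdot(-1)/(-3)}}{-2} = \frac{3 \pm 1}{-2}, \\
I &\approx -1 - \frac{3 \pm 1}{-2} = \tfrac{1}{2} \pm \tfrac{1}{2}.
\end{aligned}
```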

The frequentist estimate for this problem is I≈Ni/n=0/0, which is undefined.

In the limiting case where N is a large number, the deduction distribution of i tends towards the binomial distribution with the probability P=I/N as a parameter,

        i≈nP(1±√[(1/P−1)/n])

and the induction distribution of P tends towards the beta distribution

        P≈((i+1)±√[(i+1)(n−i+1)/(n+3)])/(n+2)

The frequentist estimate for this problem is P≈i/n: the probability is estimated by the relative frequency.
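The large-N formulas can be sketched and compared with the finite-population formula for a large but finite N. The names `binomial_estimate`, `beta_estimate` and `finite_population_estimate` are hypothetical, and the choice N = 10**6 is an arbitrary stand-in for "large".

```python
from math import isclose, sqrt


def binomial_estimate(n, P):
    """Return (mean, sd) of i from i ≈ nP(1 ± √[(1/P-1)/n]); needs 0 < P <= 1."""
    mean = n * P
    return mean, mean * sqrt((1 / P - 1) / n)


def beta_estimate(n, i):
    """Return (mean, sd) of P from P ≈ ((i+1) ± √[(i+1)(n-i+1)/(n+3)])/(n+2)."""
    mean = (i + 1) / (n + 2)
    sd = sqrt((i + 1) * (n - i + 1) / (n + 3)) / (n + 2)
    return mean, sd


def f(N, n, I):
    """Closed form f(N,n,I) for the finite population; returns (mean, signed term)."""
    mean = n * I / N
    sd = sqrt(n * I * (N - n) * (N - I) / (N - 1)) / N
    return mean, sd


def finite_population_estimate(N, n, i):
    """Return (mean, sd) of I/N using I ≈ -1 - f(-2-n, -2-N, -1-i)."""
    mean, sd = f(-2 - n, -2 - N, -1 - i)
    return (-1 - mean) / N, abs(sd) / N


n, i = 10, 3
print("beta limit:       ", beta_estimate(n, i))
print("N = 10**6:        ", finite_population_estimate(10**6, n, i))
print("relative frequency:", i / n)
```

With few observations the beta-based estimate (i+1)/(n+2) differs noticeably from the relative frequency i/n; the two approach each other as n grows.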

In the limiting case where N/n and n are large numbers, the deduction distribution of i tends towards the Poisson distribution with the intensity M=nI/N as a parameter,

        i≈M±√M

and the induction distribution of M tends towards the gamma distribution

        M≈i+1±√[i+1]
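Both limiting formulas can be checked numerically. The sketch below sums the Poisson probabilities up to a cutoff and integrates the unnormalized gamma density M^i·e^(-M) with a midpoint rule. The function names, the cutoff of 100 terms, and the integration range [0, 60] are assumptions of this illustration and are only adequate for small M and i.

```python
from math import exp, factorial, isclose, sqrt


def poisson_estimate(M):
    """Return (mean, sd) of i from i ≈ M ± √M."""
    return M, sqrt(M)


def gamma_estimate(i):
    """Return (mean, sd) of M from M ≈ i+1 ± √[i+1]."""
    return i + 1, sqrt(i + 1)


def poisson_moments(M, cutoff=100):
    """Mean and sd of the Poisson distribution by direct summation (truncated)."""
    pmf = [exp(-M) * M**k / factorial(k) for k in range(cutoff)]
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    return mean, sqrt(var)


def gamma_moments(i, upper=60.0, steps=60000):
    """Mean and sd of the density proportional to M**i * exp(-M), midpoint rule."""
    h = upper / steps
    xs = [(k + 0.5) * h for k in range(steps)]
    ws = [x**i * exp(-x) for x in xs]
    total = sum(ws)
    mean = sum(x * w for x, w in zip(xs, ws)) / total
    var = sum((x - mean) ** 2 * w for x, w in zip(xs, ws)) / total
    return mean, sqrt(var)


print(poisson_estimate(4.0), poisson_moments(4.0))
print(gamma_estimate(3), gamma_moments(3))
```

Treating M^i·e^(-M) as the induction distribution of M corresponds to weighting all intensities equally beforehand, which is an assumption of this sketch.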

(cf. Bo Jacoby: En formel til statistisk induktion [A formula for statistical induction], 2005. boja@dk.ibm.com)