Still under testing!
This library is a port of Larry Hunter's Lisp statistics library to Chicken Scheme.
The library provides a number of formulae and methods taken from the book "Fundamentals of Biostatistics" by Bernard Rosner (5th edition).
To use this library, you need to understand the underlying statistics. In brief:
The Binomial distribution is used when counting discrete events in a series of trials, each of which has a probability p of producing a positive outcome. An example would be tossing a coin n times: the probability of a head is p, and the distribution gives the probability of each possible number of heads in the n trials.
The Poisson distribution is used to count discrete events which occur with a known average rate. A typical example is the decay of radioactive elements.
The Normal distribution is used for real-valued events which cluster symmetrically around a specific mean. A typical example would be the distribution of people's heights.
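For reference, the three distributions described above are usually written as follows. These are the standard textbook forms, not formulas quoted from the library source; the symbols λ (average rate), μ (mean) and σ (standard deviation) are notation introduced here for illustration.

```latex
% Binomial: probability of exactly k positive outcomes in n trials
P(X = k) = \binom{n}{k} \, p^{k} (1 - p)^{n - k}, \qquad k = 0, 1, \dots, n

% Poisson: probability of exactly k events when the average rate is \lambda
P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots

% Normal: density of a real-valued outcome x with mean \mu and standard deviation \sigma
f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\frac{(x - \mu)^{2}}{2 \sigma^{2}} \right)
```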
Utilities
[procedure] (average-rank value sorted-values)
returns the average position of the given value in the list of sorted values; ranks are counted from 1.
> (average-rank 2 '(1 2 2 3 4))
5/2
[procedure] (beta-incomplete x a b)
[procedure] (bin-and-count items n)
Divides the range of the list of items into n bins, and returns a vector of the number of items which fall into each bin.
> (bin-and-count '(1 1 2 3 3 4 5) 5)
#(2 1 2 1 1)
[procedure] (combinations n k)
returns the number of ways to select k items from n, where the order does not matter.
[procedure] (factorial n)
returns the factorial of n.
[procedure] (find-critical-value p-function p-value)
[procedure] (fisher-z-transform r)
transforms a correlation coefficient r into a value that is approximately normally distributed.
[procedure] (gamma-incomplete a x)
[procedure] (gamma-ln x)
[procedure] (permutations n k)
returns the number of ways to select k items from n, where the order does matter.
[procedure] (random-normal mean sd)
returns a random number drawn from a normal distribution with the specified mean and standard deviation.
[procedure] (random-pick items)
returns a random item from the given list of items.
[procedure] (random-sample n items)
returns a random sample of size n from the list of items, drawn without replacement.
[procedure] (sign n)
returns 0, 1 or -1 according to whether n is zero, positive or negative.
[procedure] (square n)

These functions provide information on a given list of numbers, the items. Note that the list does not have to be sorted.
[procedure] (mean items)
returns the arithmetic mean of the items (the sum of the numbers divided by the number of numbers).
(mean '(1 2 3 4 5)) => 3
[procedure] (median items)
returns the value which separates the upper and lower halves of the list of numbers.
(median '(1 2 3 4)) => 5/2
[procedure] (mode items)
returns two values. The first is a list of the modes and the second is the frequency. (A mode of a list of numbers is the most frequently occurring value.)
> (mode '(1 2 3 4))
(1 2 3 4)
1
> (mode '(1 2 2 3 4))
(2)
2
> (mode '(1 2 2 3 3 4))
(2 3)
2
[procedure] (geometric-mean items)
returns the geometric mean of the items (the result of multiplying the items together and then taking the nth root, where n is the number of items).
(geometric-mean '(1 2 3 4 5)) => 2.60517108469735
[procedure] (range items)
returns the difference between the biggest and the smallest value from the list of items.
(range '(5 1 2 3 4)) => 4
[procedure] (percentile items percent)
returns the value closest to the given percent position once the items are sorted into order; the returned value may be an item in the list or the average of two adjacent items.
(percentile '(1 2 3 4) 50) => 5/2
(percentile '(1 2 3 4) 67) => 3
[procedure] (variance items)
[procedure] (standard-deviation items)
[procedure] (coefficient-of-variation items)
returns 100 * (std-dev / mean) of the items.
(coefficient-of-variation '(1 2 3 4)) => 51.6397779494322
[procedure] (standard-error-of-the-mean items)
returns std-dev / sqrt(length items).
(standard-error-of-the-mean '(1 2 3 4)) => 0.645497224367903
[procedure] (mean-sd-n items)
returns three values, one for the mean, one for the standard deviation, and one for the length of the list.
> (mean-sd-n '(1 2 3 4))
5/2
1.29099444873581
4
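The variance and standard-deviation entries above carry no description, so the following is a minimal sketch of how the documented values for '(1 2 3 4) fit together, assuming the sample variance uses the n - 1 denominator. That assumption is consistent with the outputs shown above but is not confirmed by this page. The sketch- names are hypothetical and do not belong to the library.

```scheme
;; Sketch only: NOT the library's definitions. All sketch-* names are
;; hypothetical, chosen so they cannot clash with the library's procedures.

;; Arithmetic mean: sum of the items divided by their count.
(define (sketch-mean items)
  (/ (apply + items) (length items)))

;; Sample variance, ASSUMING the n - 1 denominator.
(define (sketch-variance items)
  (let ((m (sketch-mean items)))
    (/ (apply + (map (lambda (x) (* (- x m) (- x m))) items))
       (- (length items) 1))))

;; Standard deviation: square root of the variance.
(define (sketch-standard-deviation items)
  (sqrt (sketch-variance items)))

;; 100 * (std-dev / mean), as described for coefficient-of-variation.
(define (sketch-coefficient-of-variation items)
  (* 100 (/ (sketch-standard-deviation items) (sketch-mean items))))

;; std-dev / sqrt(length items), as described for standard-error-of-the-mean.
(define (sketch-standard-error-of-the-mean items)
  (/ (sketch-standard-deviation items) (sqrt (length items))))
```

With these definitions, (sketch-standard-deviation '(1 2 3 4)) is approximately 1.2909944, (sketch-coefficient-of-variation '(1 2 3 4)) is approximately 51.639778, and (sketch-standard-error-of-the-mean '(1 2 3 4)) is approximately 0.6454972, which agrees with the examples above.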
Distributional functions
[procedure] (binomial-probability n k p)
returns the probability that the number of positive outcomes for a binomial distribution B(n, p) is k.
> (do-ec (: i 0 11) (format #t "i = ~d P = ~f~&" i (binomial-probability 10 i 0.5)))
i = 0 P = 0.0009765625
i = 1 P = 0.009765625
i = 2 P = 0.0439453125
i = 3 P = 0.1171875
i = 4 P = 0.205078125
i = 5 P = 0.24609375
i = 6 P = 0.205078125
i = 7 P = 0.1171875
i = 8 P = 0.0439453125
i = 9 P = 0.009765625
i = 10 P = 0.0009765625
[procedure] (binomial-cumulative-probability n k p)
returns the probability that fewer than k positive outcomes occur for a binomial distribution B(n, p).
> (do-ec (: i 0 11) (format #t "i = ~d P = ~f~&" i (binomial-cumulative-probability 10 i 0.5)))
i = 0 P = 0.0
i = 1 P = 0.0009765625
i = 2 P = 0.0107421875
i = 3 P = 0.0546875
i = 4 P = 0.171875
i = 5 P = 0.376953125
i = 6 P = 0.623046875
i = 7 P = 0.828125
i = 8 P = 0.9453125
i = 9 P = 0.9892578125
i = 10 P = 0.9990234375
[procedure] (binomial-ge-probability n k p)
returns the probability of k or more positive outcomes for a binomial distribution B(n, p).
[procedure] (binomial-le-probability n k p)
returns the probability of k or fewer positive outcomes for a binomial distribution B(n, p).
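How the four binomial procedures relate to one another can be sketched in plain Scheme, summing the textbook binomial formula directly. This is an illustration only and is probably not how the library computes these values. Every sketch- name is hypothetical.

```scheme
;; Sketch only: NOT the library's implementation. All sketch-* names are
;; hypothetical helpers introduced for this illustration.

;; Number of ways to choose k items from n. After step i the accumulator
;; equals C(n-k+i, i), so every division is exact.
(define (sketch-choose n k)
  (let loop ((i 1) (acc 1))
    (if (> i k)
        acc
        (loop (+ i 1) (/ (* acc (+ (- n k) i)) i)))))

;; P(X = k) for X ~ B(n, p).
(define (sketch-binomial-probability n k p)
  (* (sketch-choose n k) (expt p k) (expt (- 1 p) (- n k))))

;; Sum of P(X = i) for i from lo to hi inclusive (0 if lo > hi).
(define (sketch-binomial-sum n lo hi p)
  (let loop ((i lo) (total 0))
    (if (> i hi)
        total
        (loop (+ i 1) (+ total (sketch-binomial-probability n i p))))))

;; P(X < k): fewer than k positive outcomes.
(define (sketch-binomial-lt n k p) (sketch-binomial-sum n 0 (- k 1) p))

;; P(X <= k): k or fewer positive outcomes.
(define (sketch-binomial-le n k p) (sketch-binomial-sum n 0 k p))

;; P(X >= k): k or more positive outcomes.
(define (sketch-binomial-ge n k p) (sketch-binomial-sum n k n p))
```

For n = 10 and p = 0.5, (sketch-binomial-probability 10 5 0.5) gives 0.24609375 and (sketch-binomial-lt 10 5 0.5) gives 0.376953125, matching the tables above. Direct summation is fine for small n; for large n a numerically stabler method would be preferable.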
Confidence intervals
[procedure] (binomial-probability-ci n p alpha)
returns two values, the lower and upper bounds on an observed probability p from n trials with confidence (1-alpha).
> (binomial-probability-ci 10 0.8 0.9)
0.724273681640625
0.851547241210938
; 2 values
Sample size estimates
Correlation and regression
Significance test functions
GPL version 3.0.
Needs srfi-1, srfi-25, srfi-69, vector-lib, numbers, extras, foreign, format
Uses the GNU scientific library for basic numeric processing, so requires libgsl, libgslcblas and the development files for libgsl.
trunk, for testing