Still under testing!

Introduction

This library is a port of Larry Hunter's Lisp statistics library to Chicken Scheme.

The library provides a number of formulae and methods taken from the book "Fundamentals of Biostatistics" by Bernard Rosner (5th edition).

Binomial Distribution

The binomial distribution arises from counting positive outcomes in a series of n independent trials, each of which has the same probability p of producing a positive outcome. An example is tossing a coin n times: the probability of a head on each toss is p, and the distribution gives the probability of observing each possible number of heads, from 0 to n, in the n trials.

Provided Functions

Utilities

[procedure] (average-rank value sorted-values)

returns the average position of the given value in the sorted list of values. Ranks are counted from 1, and repeated occurrences of the value share the average of their positions.

> (average-rank 2 '(1 2 2 3 4))
5/2
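
The tie handling in the example can be reproduced with a short sketch in plain Scheme. The name average-rank-sketch is hypothetical and this is not the library's code; returning #f for an absent value is an assumption, since the page does not say what the library does in that case.

```scheme
;; Sketch: average 1-based position of `value` in an ascending list.
;; Hypothetical helper, written independently of the library.
(define (average-rank-sketch value sorted-values)
  (let loop ((rest sorted-values) (pos 1) (sum 0) (hits 0))
    (cond ((null? rest)
           (if (= hits 0)
               #f                        ; assumption: absent value has no rank
               (/ sum hits)))            ; mean of the matching positions
          ((= (car rest) value)
           (loop (cdr rest) (+ pos 1) (+ sum pos) (+ hits 1)))
          (else
           (loop (cdr rest) (+ pos 1) sum hits)))))

;; 2 occupies positions 2 and 3, so the average rank is (2 + 3) / 2.
(average-rank-sketch 2 '(1 2 2 3 4))
```

On an implementation with exact rationals the call evaluates to 5/2, matching the example above; without them it yields 2.5.
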
[procedure] (beta-incomplete x a b)

[procedure] (bin-and-count items n)

Divides the range of the list of items into n bins, and returns a vector of the number of items which fall into each bin.

> (bin-and-count '(1 1 2 3 3 4 5) 5)
#(2 1 2 1 1)
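
One binning rule that reproduces the example is equal-width bins over [min, max], with the maximum value placed in the last bin. The sketch below is an assumption about the rule, not the library's implementation; bin-and-count-sketch is a hypothetical name, and the code assumes the items contain at least two distinct values.

```scheme
;; Sketch: count items in n equal-width bins spanning [min, max].
;; Hypothetical helper; assumes at least two distinct values in `items`.
(define (bin-and-count-sketch items n)
  (let* ((lo (apply min items))
         (hi (apply max items))
         (width (/ (- hi lo) n))
         (counts (make-vector n 0)))
    (for-each
     (lambda (x)
       (let* ((raw (inexact->exact (floor (/ (- x lo) width))))
              (i (min raw (- n 1))))     ; the maximum falls into the last bin
         (vector-set! counts i (+ 1 (vector-ref counts i)))))
     items)
    counts))

(bin-and-count-sketch '(1 1 2 3 3 4 5) 5)   ; => #(2 1 2 1 1)
```

With five bins over the range 1 to 5 each bin is 0.8 wide, so the two 1s share the first bin and the two 3s share the third.
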
[procedure] (combinations n k)

returns the number of ways to select k items from n, where the order does not matter.

[procedure] (factorial n)

returns the factorial of n.
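
The two counts above follow from the textbook definitions, n! and n! / (k! (n - k)!). The sketch below is written independently of the library; factorial-sketch and combinations-sketch are hypothetical names.

```scheme
;; Sketch of the textbook definitions; not the library's code.
(define (factorial-sketch n)
  (let loop ((i n) (acc 1))
    (if (< i 2)
        acc
        (loop (- i 1) (* acc i)))))

(define (combinations-sketch n k)
  ;; n! / (k! * (n - k)!)
  (/ (factorial-sketch n)
     (* (factorial-sketch k) (factorial-sketch (- n k)))))

(factorial-sketch 5)          ; => 120
(combinations-sketch 10 3)    ; => 120
```

For large n the factorials grow quickly, so a production version would more likely multiply and divide term by term.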

[procedure] (find-critical-value p-function p-value)

[procedure] (fisher-z-transform r)

returns the Fisher z-transformation of a correlation coefficient r; the transformed value has an approximately normal sampling distribution.
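
The page does not spell out the formula, so the following is a sketch of the standard textbook definition, z = 0.5 * ln((1 + r) / (1 - r)), and only an assumption about what the library computes. The name fisher-z-sketch is hypothetical, and r must lie strictly between -1 and 1.

```scheme
;; Sketch of the textbook Fisher z-transformation; not the library's code.
;; Valid for -1 < r < 1.
(define (fisher-z-sketch r)
  (* 0.5 (log (/ (+ 1 r) (- 1 r)))))

(fisher-z-sketch 0.5)   ; half the natural log of 3, roughly 0.549
```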

[procedure] (gamma-incomplete a x)

[procedure] (gamma-ln x)

[procedure] (permutations n k)

returns the number of ways to select k items from n, where the order does matter.
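
When order matters the count is the falling product n * (n - 1) * ... * (n - k + 1). A minimal sketch, independent of the library, with the hypothetical name permutations-sketch:

```scheme
;; Sketch: number of ordered selections of k items from n.
;; Hypothetical helper; multiplies the k largest factors of n!.
(define (permutations-sketch n k)
  (let loop ((i 0) (acc 1))
    (if (= i k)
        acc
        (loop (+ i 1) (* acc (- n i))))))

(permutations-sketch 10 3)          ; => 720
(/ (permutations-sketch 10 3) 6)    ; => 120, the unordered count, since 3! = 6
```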

[procedure] (random-normal mean sd)

returns a random number drawn from a normal distribution with the specified mean and standard deviation.

[procedure] (random-pick items)

returns a random item from the given list of items.

[procedure] (random-sample n items)

returns a random sample of size n from the list of items, drawn without replacement.

[procedure] (sign n)

returns 0, 1 or -1 according to whether n is zero, positive or negative.

[procedure] (square n)

returns the square of n.

Descriptive statistics

These functions provide information on a given list of numbers, the items. The list does not need to be sorted.

[procedure] (mean items)

returns the arithmetic mean of the items (the sum of the numbers divided by the number of numbers).

(mean '(1 2 3 4 5)) => 3

[procedure] (median items)

returns the value which separates the upper and lower halves of the list of numbers.

(median '(1 2 3 4)) => 5/2
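
A common way to compute the median is to sort the items and take the middle one, or the mean of the two middle ones when the count is even. The sketch below follows that definition and is not taken from the library; all names in it are hypothetical, and it carries its own insertion sort so that it runs in plain Scheme.

```scheme
;; Sketch: median by sorting; not the library's code.
(define (insert-sorted x sorted)
  (cond ((null? sorted) (list x))
        ((<= x (car sorted)) (cons x sorted))
        (else (cons (car sorted) (insert-sorted x (cdr sorted))))))

(define (sort-ascending items)
  (let loop ((rest items) (acc '()))
    (if (null? rest)
        acc
        (loop (cdr rest) (insert-sorted (car rest) acc)))))

(define (median-sketch items)
  (let* ((sorted (sort-ascending items))
         (n (length sorted))
         (half (quotient n 2)))
    (if (odd? n)
        (list-ref sorted half)                      ; single middle item
        (/ (+ (list-ref sorted (- half 1))          ; mean of the two middle items
              (list-ref sorted half))
           2))))

(median-sketch '(5 1 3))     ; => 3
(median-sketch '(4 1 3 2))   ; 5/2 with exact rationals, otherwise 2.5
```
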
[procedure] (mode items)

returns two values. The first is a list of the modes and the second is the frequency. (A mode of a list of numbers is the most frequently occurring value.)

> (mode '(1 2 3 4))
(1 2 3 4)
1
> (mode '(1 2 2 3 4))
(2)
2
> (mode '(1 2 2 3 3 4))
(2 3)
2

[procedure] (geometric-mean items)

returns the geometric mean of the items (the result of multiplying the items together and then taking the nth root, where n is the number of items).

(geometric-mean '(1 2 3 4 5)) => 2.60517108469735
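
The nth root of the product can also be computed as the exponential of the mean logarithm, which avoids forming a very large product. This is a sketch of that equivalent form for positive items, not necessarily how the library computes it; geometric-mean-sketch is a hypothetical name.

```scheme
;; Sketch: geometric mean as exp(mean(log x)); items must be positive.
;; Hypothetical helper, independent of the library.
(define (geometric-mean-sketch items)
  (exp (/ (apply + (map log items))
          (length items))))

(geometric-mean-sketch '(1 2 3 4 5))   ; roughly 2.605, the 5th root of 120
```
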
[procedure] (range items)

returns the difference between the biggest and the smallest value from the list of items.

(range '(5 1 2 3 4)) => 4

[procedure] (percentile items percent)

returns the value at the given percentile of the items once they are sorted into ascending order. The result is either an item from the list or the average of two adjacent items.

(percentile '(1 2 3 4) 50) => 5/2
(percentile '(1 2 3 4) 67) => 3
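
One rule that reproduces both examples: let k = n * percent / 100; if k is a whole number, average the items at 0-based positions k - 1 and k, otherwise take the item at position floor(k). The sketch below implements that rule as an assumption about the library's behaviour. The name percentile-sketch is hypothetical, the list is assumed to be already sorted to keep the code short, and percent must lie strictly between 0 and 100.

```scheme
;; Sketch of one percentile rule consistent with the examples above.
;; Assumes `sorted-items` is ascending and 0 < percent < 100.
(define (percentile-sketch sorted-items percent)
  (let* ((n (length sorted-items))
         (k (* n (/ percent 100)))
         (floor-k (inexact->exact (floor k))))
    (if (= k floor-k)
        ;; k is whole: average the two items either side of the cut
        (/ (+ (list-ref sorted-items floor-k)
              (list-ref sorted-items (- floor-k 1)))
           2)
        ;; otherwise take the item just above the cut
        (list-ref sorted-items floor-k))))

(percentile-sketch '(1 2 3 4) 50)   ; 5/2 with exact rationals, otherwise 2.5
(percentile-sketch '(1 2 3 4) 67)   ; => 3
```
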
[procedure] (variance items)

returns the variance of the items.

[procedure] (standard-deviation items)

returns the standard deviation of the items.

[procedure] (coefficient-of-variation items)

returns 100 * (std-dev / mean) of the items.

(coefficient-of-variation '(1 2 3 4)) => 51.6397779494322

[procedure] (standard-error-of-the-mean items)

returns std-dev / sqrt(length items).

(standard-error-of-the-mean '(1 2 3 4)) => 0.645497224367903

[procedure] (mean-sd-n items)

returns three values, one for the mean, one for the standard deviation, and one for the length of the list.

> (mean-sd-n '(1 2 3 4))
5/2
1.29099444873581
4
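
The figures quoted for (1 2 3 4) in this group are consistent with the sample form of the variance, which divides by n - 1 rather than n. The sketch below recomputes them from the definitions; it is written independently of the library and every name in it is hypothetical.

```scheme
;; Sketch: spread measures built on the sample (n - 1) variance.
;; Hypothetical helpers, not the library's code.
(define (mean-sketch items)
  (/ (apply + items) (length items)))

(define (sample-variance-sketch items)
  (let ((m (mean-sketch items)))
    (/ (apply + (map (lambda (x) (* (- x m) (- x m))) items))
       (- (length items) 1))))

(define (sample-sd-sketch items)
  (sqrt (sample-variance-sketch items)))

(define (cv-sketch items)                 ; 100 * (std-dev / mean)
  (* 100 (/ (sample-sd-sketch items) (mean-sketch items))))

(define (sem-sketch items)                ; std-dev / sqrt(n)
  (/ (sample-sd-sketch items) (sqrt (length items))))

(sample-sd-sketch '(1 2 3 4))   ; roughly 1.29099
(cv-sketch '(1 2 3 4))          ; roughly 51.6398
(sem-sketch '(1 2 3 4))         ; roughly 0.645497
```

For (1 2 3 4) the squared deviations from the mean 5/2 sum to 5, so the sample variance is 5/3 and the standard deviation is its square root.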

Distributional functions

[procedure] (binomial-probability n k p)

returns the probability that the number of positive outcomes for a binomial distribution B(n, p) is k.

> (do-ec (: i 0 11)   ; do-ec is provided by SRFI 42
    (format #t "i = ~d P = ~f~&" i (binomial-probability 10 i 0.5)))
i = 0 P = 0.0009765625
i = 1 P = 0.009765625
i = 2 P = 0.0439453125
i = 3 P = 0.1171875
i = 4 P = 0.205078125
i = 5 P = 0.24609375
i = 6 P = 0.205078125
i = 7 P = 0.1171875
i = 8 P = 0.0439453125
i = 9 P = 0.009765625
i = 10 P = 0.0009765625
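
The table above matches the usual binomial formula P(X = k) = C(n, k) * p^k * (1 - p)^(n - k). A minimal sketch of that formula in plain Scheme follows; choose and binomial-pmf-sketch are hypothetical names and this is not the library's code.

```scheme
;; Sketch of the binomial probability formula; not the library's code.
(define (choose n k)
  ;; C(n, k) built up term by term; every intermediate value is an integer
  (let loop ((i 1) (acc 1))
    (if (> i k)
        acc
        (loop (+ i 1) (/ (* acc (+ (- n k) i)) i)))))

(define (binomial-pmf-sketch n k p)
  (* (choose n k)
     (expt p k)
     (expt (- 1 p) (- n k))))

(binomial-pmf-sketch 10 5 0.5)   ; => 0.24609375, i.e. 252 / 1024
```
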
[procedure] (binomial-cumulative-probability n k p)

returns the probability that fewer than k positive outcomes occur for a binomial distribution B(n, p). The bound is strict: the outcome of exactly k positives is not included.

> (do-ec (: i 0 11)   ; do-ec is provided by SRFI 42
    (format #t "i = ~d P = ~f~&" i (binomial-cumulative-probability 10 i 0.5)))
i = 0 P = 0.0
i = 1 P = 0.0009765625
i = 2 P = 0.0107421875
i = 3 P = 0.0546875
i = 4 P = 0.171875
i = 5 P = 0.376953125
i = 6 P = 0.623046875
i = 7 P = 0.828125
i = 8 P = 0.9453125
i = 9 P = 0.9892578125
i = 10 P = 0.9990234375

[procedure] (binomial-ge-probability n k p)

returns the probability of k or more positive outcomes for a binomial distribution B(n, p).

[procedure] (binomial-le-probability n k p)

returns the probability of k or fewer positive outcomes for a binomial distribution B(n, p).
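
The three cumulative forms differ only in which outcomes they sum over: 0 to k - 1 for the strict cumulative probability, 0 to k for "k or fewer", and k to n for "k or more". The sketch below sums the binomial formula directly to make those bounds explicit. It is an illustration with hypothetical names, not the library's implementation.

```scheme
;; Sketch: cumulative binomial probabilities by direct summation.
;; Hypothetical helpers, independent of the library.
(define (choose n k)
  (let loop ((i 1) (acc 1))
    (if (> i k)
        acc
        (loop (+ i 1) (/ (* acc (+ (- n k) i)) i)))))

(define (pmf n k p)
  (* (choose n k) (expt p k) (expt (- 1 p) (- n k))))

(define (sum-pmf n p from to)            ; sums P(X = i) for from <= i <= to
  (let loop ((i from) (acc 0))
    (if (> i to)
        acc
        (loop (+ i 1) (+ acc (pmf n i p))))))

(define (binomial-lt-sketch n k p) (sum-pmf n p 0 (- k 1)))   ; fewer than k
(define (binomial-le-sketch n k p) (sum-pmf n p 0 k))         ; k or fewer
(define (binomial-ge-sketch n k p) (sum-pmf n p k n))         ; k or more

(binomial-lt-sketch 10 5 0.5)   ; => 0.376953125
(binomial-le-sketch 10 5 0.5)   ; => 0.623046875
(binomial-ge-sketch 10 5 0.5)   ; => 0.623046875
```

The strict and "k or more" results for the same k add up to 1, which gives a quick consistency check.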

Confidence intervals

[procedure] (binomial-probability-ci n p alpha)

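returns two values, the lower and upper bounds of the confidence interval for an observed probability p from n trials, with confidence (1-alpha). In the example, alpha = 0.9 corresponds to a confidence level of only 10%, which is why the interval is narrow.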

> (binomial-probability-ci 10 0.8 0.9)
0.724273681640625 
0.851547241210938
; 2 values

Hypothesis testing

(parametric)

(non parametric)

Sample size estimates

Correlation and regression

Significance test functions

Authors

Peter Lane wrote the Scheme version of this library. The original Lisp version was written by Larry Hunter.

License

GPL version 3.0.

Requirements

Needs srfi-1, srfi-25, srfi-69, vector-lib, numbers, extras, foreign and format.

Uses the GNU Scientific Library (GSL) for basic numeric processing, so it requires libgsl, libgslcblas and the development files for libgsl.

Version History

trunk, for testing