• egg

probdist

Probability density functions.

Usage

• (require-extension sampled-pdf)
• (require-extension normal-pdf)

Documentation

probdist is a library of routines to compute univariate or multivariate normal probability function for a given mean and covariance; and to approximate an arbitrary probability distribution as a weighted set of samples (sampled PDF).

This code is based on code in the dysii project by Lawrence Murray.

The sampled PDF algorithm is based on the paper by Kitagawa:

Monte Carlo Filter and Smoother for Non-Gaussian Nonlinear State Space Models_, Journal of Computational and Graphical Statistics, 1996, 5, 1-25.

Normal PDF

(define-datatype normal-pdf normal-pdf? (NormalPDF S mu sigma R sigmaInv sigmaDet ZI mu-zero? ))

A representation of a normal PDF:

S
number of dimensions
mu
expectation
sigma
covariance
R
upper triangular Cholesky decomposition of covariance matrix
sigmaInv
inverse of covariance matrix
determinant of covariance matrix
ZI
the constant factor (2*pi)^(-S/2) / prod(diag(R))
mu-zero?
true if expectation is zero
[procedure] make-normal-pdf:: S * MU * SIGMA -> NORMAL-PDF

Creates a new normal PDF object with the specified dimension, expectation, and covariance. If the dimension S is 1, then the expectation MU and the covariance SIGMA must be scalars; otherwise, they must be SRFI-4 f64 vectors. In the latter case, MU must be a vector of size S, and SIGMA must be a matrix of size S x S.

[procedure] normal-pdf:density:: NORMAL-PDF * X -> DENSITY

Computes the density of the distribution at the given point X. If the dimension S of the distribution is 1, then X must be a scalar, otherwise it must be an SRFI-4 f64 vector of length S.

[procedure] normal-pdf:sample:: NORMAL-PDF * X -> SAMPLE

Sample from the distribution. Let X be a sample from a standard normal distribution. Then the sample is SAMPLE = MU + R*X, where R is the upper triangular Cholesky decomposition of the covariance matrix.

[procedure] normal-pdf:expectation:: NORMAL-PDF -> MU

Return the expectation value of the distribution (scalar or f64vector).

[procedure] normal-pdf:covariance:: NORMAL-PDF -> SIGMA

Return the covariance value of the distribution (scalar or f64vector).

Sampled PDF

(define-datatype sampled-pdf sampled-pdf? (SampledPDF N W WXS MU SIGMA))

A representation of a weighted sampled PDF:

N
sample set size
W
total weight of the sample set
wxs
weighted samples (see below)
mu
expectation
sigma
covariance
(define-datatype weighted-samples weighted-samples? (WeightedSamples S senv make-senv))

A representation of weighted samples:

S
number of dimensions
senv
an ordered dictionary data structure that follows the API of rb-tree
make-senv
a procedure to create an empty samples dictionary
[procedure] make-sampled-pdf:: S * MAKE-SENV * XS [* WS * XCAR * XCDR * XNULL?] -> SAMPLED-PDF

Creates a new sampled PDF object with the specified dimension, samples environment creation procedure that follows the API of rb-tree, and weighted samples. The samples can be specified in one of two ways. If both XS and WS are provided, then XS is a sequence that contains the samples, and WS is a sequence that contains the corresponding weights. If only XS is provided, or WS is false, then XS must consist of cons cells, where the car is the weight, and the cdr is the sample. Optional arguments XCAR XCDR XNULL? could be used for sequences other than lists (e.g. streams).

[procedure] sampled-pdf:expectation:: SAMPLED-PDF -> MU

Return the expectation value of the distribution (scalar or f64vector).

[procedure] sampled-pdf:covariance:: SAMPLED-PDF -> SIGMA

Return the covariance value of the distribution (scalar or f64vector).

[procedure] sampled-pdf:find-sample:: U * SAMPLED-PDF -> X

Given a number u in [0,1], returns the sample x(i) such that \sum_{j=1}^{i-1} w_j < Wu <= \sum_{j=1}^{i} w_j, where W is the total weight of the sample set.

[procedure] sampled-pdf:normalize:: SAMPLED-PDF -> SAMPLED-PDF

Normalizes the sample weights to sum to 1.

[procedure] sampled-pdf:resample:: MAKE-SENV * SAMPLED-PDF [ * M] -> SAMPLED-PDF

Resamples the distribution. This produces a new approximation of the same distribution using a set of equally weighted sample points. Sample points are selected using the deterministic resampling method given in the appendix of Kitagawa. Optional argument M is the number of samples to take for the new distribution. If not given, defaults to the number of samples in the existing distribution.

Examples

;;
;; Multivariate test of normal-pdf.
;;
;; This test creates a 3-dimensional normal distribution with a
;; random mean and variance. It then samples from this distribution
;; and calculates the sample mean and variance for comparison.
;;

(require-extension srfi-1)
(require-extension srfi-4)
(require-extension srfi-4-utils)
(require-extension matrix-utils)
(require-extension blas)
(require-extension normal-pdf)
(require-extension random-mtzig)

(define-utility-matrices f64)

;; Number of samples
(define N 100000)

;; Number of dimensions (sample size)
(define S 3)

(define (f64vector-scale a x)
(let ((n (f64vector-length x)))
(dscal n a x)))

(define (f64vector-sum x y)
(let ((n (f64vector-length x)))
(daxpy n 1.0 x y)))

(define (f64vector-sub x y)
(let ((n (f64vector-length x)))
(daxpy n -1.0 y x)))

(define (f64vector-mul a b)
(f64vector-map (lambda (v1 v2) (* v1 v2)) a b))

(define f64matrix-map (make-matrix-map f64vector-ref f64vector-set!))

(define (upper M N A)
(f64matrix-map RowMajor M N A (lambda (i j v) (if (>= j i) v 0.0))))

;;
;;  Computes the arithmetic mean of a list of f64 vectors using the
;;  recurrence relation mean_(n) = mean(n-1) + (v[n] -
;;  mean(n-1))/(n+1)
;;
(define (mean s vs)
(let loop ((i 0) (vs vs) (mean (make-f64vector s 0.0)))
(if (null? vs)  mean
(loop (fx+ 1 i) (cdr vs)
(let ((d (f64vector-sub (car vs) mean)))
(f64vector-sum mean (f64vector-scale (/ 1 (+ 1 i)) d)))))))

(define (covariance s vs mean)
(let* ((cov  (matrix-zeros S S)))
(let ((ds (map (lambda (x) (f64vector-sub x mean)) vs)))
(let ((n (fold (lambda (d n)
(dsyr! RowMajor Upper s 1.0 d cov)
(fx+ 1 n)) 0 ds)))
(f64vector-scale (/ 1 (- n 1)) cov)))))

;; Given a vector of values in the range [0..1], scale all values in
;; the vector to the specified range.
(define (f64vector-urange x lo hi)
(if (< lo hi) (let ((d (- hi lo)))
(f64vector-map (lambda (x) (+ lo (* d x))) x))))

(define rng  (random-mtzig:init))

;;  distribution parameters
(define mu     (f64vector-urange (random-mtzig:f64vector-randu! S rng) -5 5))
(define sigma  (let ((x (f64vector-urange (random-mtzig:f64vector-randu! (* S S) rng) 0 5)))
(dgemm! RowMajor NoTrans Trans S S S
1.0 x x 0.0 (make-f64vector (* S S)))))

(define gpdf  (begin
(print "mu = " mu)
(print "sigma = " sigma)
(make-normal-pdf S mu sigma)))

(define input (list-tabulate N (lambda (i) (random-mtzig:f64vector-randn! S rng))))

(define samples
(let loop ((i 0) (input input) (samples (list)))
(if (null? input)  samples
(let ((samples (cons (normal-pdf:sample gpdf  (car input))  samples)))
(loop (fx+ 1 i) (cdr input) samples)))))

(if (<= N 100) (print "input = " input))
(if (<= N 100) (print "samples = " samples))

;; test for mean and covariance equality to one decimal place
(equal? (f64vector-map (lambda (x) (truncate (* 10 x))) mu)
(f64vector-map (lambda (x) (truncate (* 10 x)))
(mean S samples)))

(equal? (f64vector-map (lambda (x) (truncate (* 10 x))) (upper S S sigma))
(f64vector-map (lambda (x) (truncate (* 10 x)))
(covariance S samples (mean S samples))))


Ivan Raikov

Version history

1.7
Updated use of blas
1.6
Documentation converted to wiki format
1.5
Ported to Chicken 4.
1.4
Added some error checking in make-normal-pdf and a record printer for the normal-pdf datatype.
1.3
Build script updated for better cross-platform compatibility
1.2
More metafile fixes
1.1
Metafile fixes
1.0
Initial release

Copyright 2007-2011 Ivan Raikov and the Okinawa Institute of Science and Technology.

This program is free software: you can redistribute it and/or modify
<http://www.gnu.org/licenses/>.