You are looking at historical revision 27975 of this page. It may differ significantly from its current revision.

## hashes

Miscellaneous Hash Functions

## Documentation

A suite of hash procedures. All have the same signature. All return hash values in an unsigned 32-bit range. Not a one of these is suitable for cryptographic work!

The `HASH` name refers to one of the hash algorithm names below.

#### Usage

(require-extension {{HASH}})

#### Signatures

*[procedure]*

`(*{{HASH}} STRING LENGTH SEED) => unsigned-integer32`

Returns the hash of `STRING` of `LENGTH`, using `SEED`.

*[procedure]*

`({{HASH}} STRING [LENGTH [SEED]]) => unsigned-integer32`

Returns the hash of `STRING`. When `LENGTH` is missing (string-length `STRING`) is assumed. When `SEED` is missing 0 is assumed.

*[procedure]*

`({{HASH}}-primitive) => message-digest-primitive`

Returns the hash message-digest-primitive object.

### Hash Procedures

`HASH`- Description.
- RJMXHash
- Bob Jenkins' MIX Lookup 2 hash function.
- RJL3Hash
- Bob Jenkins' MIX Lookup 3 hash function.
- TWMXHash
- Thomas Wang's hash function with original MIX.
- TWSHMXHash
- Thomas Wang's hash function with shift MIX.
- TWSHMLMXHash
- Thomas Wang's hash function with shift-multiply MIX.
- TWMGMXHash
- Thomas Wang's hash function with Magic MIX.
- FNVHash
- Fowler/Noll/Vo 1 hash function. Substitutes the algorithm's initial value when the
`SEED`is non-zero. - FNVAHash
- Fowler/Noll/Vo 1a hash function. Substitutes the algorithm's initial value when the
`SEED`is non-zero. - PHSFHash
- Paul Hsieh's SuperFast hash function. Substitutes the algorithm's initial value when the
`SEED`is non-zero. - RSHash
- Robert Sedgwick's "Algorithms in C" hash function.
- JSHash
- A bitwise hash function written by Justin Sobel. Ignores the
`SEED`when 0. - PJWHash
- Hash algorithm is based on work by Peter J. Weinberger of AT&T Bell Labs.
- ELFHash
- Similar to the PJW Hash function, but tweaked for 32-bit processors. It's the hash function widely used on most UNIX systems.
- BKDRHash
- This hash function comes from Brian Kernighan and Dennis Ritchie's book "The C Programming Language". It is a simple hash function using a strange set of possible seeds which all constitute a pattern of 31....31...31 etc, it seems to be very similar to the DJB hash function
- SDBMHash
- This is the algorithm of choice which is used in the open source SDBM project. The hash function seems to have a good over-all distribution for many different data sets. It seems to work well in situations where there is a high variance in the MSBs of the elements in a data set.
- DJBHash
- An algorithm produced by Professor Daniel J. Bernstein and shown first to the world on the usenet newsgroup comp.lang.c. It is one of the most efficient hash functions ever published. Substitutes the algorithm's initial value when the
`SEED`is non-zero. - NDJBHash
- Now favored by Bernstein. Substitutes the algorithm's initial value when the
`SEED`is non-zero. - DEKHash
- An algorithm proposed by Donald E. Knuth in "The Art Of Computer Programming, Volume 3", under the topic of sorting and search chapter 6.4. Substitutes the algorithm's initial value when the
`SEED`is non-zero. - APHash
- Arash Partow's hash function. A hybrid rotative and additive hash function algorithm based around four primes 3, 5, 7, and 11.
- BRPHash
- A hash function by Bruno R. Preiss.
- PYHash
- The Python String Object hash function.
- ISPLHash
- The ISpell hash function.
- CRCHash
- The crc32 procedure wrapped as above. Ignores the
`SEED`when 0.

### TWUserMixHash Procedures

Thomas Wang's hash function with a user supplied MIX procedure.

#### Usage

(require-extension TWUserMixHash)

#### make-TWUserMixHash-primitive-procedure

*[procedure]*

`(make-TWUserMixHash-primitive-procedure MIXER [UNSAFE #f]) => procedure`

Returns a hash primitive procedure, `(procedure (scheme-object unsigned-integer32 unsigned-integer32) unsigned-integer32)`, for the procedure `MIXER`, `(procedure (unsigned-integer32) unsigned-integer32)`.

When `UNSAFE` no exception checking is performed.

#### make-TWUserMixHash

*[procedure]*

`(make-TWUserMixHash MIXER [UNSAFE #f]) => (procedure procedure procedure)`

Returns 3 values: `*HASH`, `HASH`, and `HASH-primitive`.

### Hash Utilities

*Note* the domain `number` below means `flonum` or `fixnum`, not the full numeric tower.

#### Usage

(require-extension hash-utils)

#### current-hash-seed

*[parameter]*

`(current-hash-seed [NEW-SEED]) => integer`

Gets or sets the current default hash seed. The initial value is 0.

#### make-fixnum-bounded-hash

*[procedure]*

`(make-fixnum-bounded-hash HASH [GETLEN string-length] [GETINT 0]) => procedure`

Returns a hash function with a SRFI-69 signature, but with a fixnum domain; i.e. `(procedure (* #!optional positive-fixnum) fixnum)`.

The `GETLEN` will be used to aquire the `OBJECT` length for the `HASH`.

The `GETINT` will supply the initial hash value.

#### make-bounded-hash

*[procedure]*

`(make-bounded-hash HASH [GETLEN string-length] [GETINT 0]) => procedure`

Returns a hash function with a SRFI-69 signature, but with a real domain; i.e. `(procedure (* #!optional positive-number) number)`.

The `GETLEN` will be used to aquire the `OBJECT` length for the `HASH`.

The `GETINT` will supply the initial hash value.

#### make-seeded-hash

*[procedure]*

`(make-seeded-hash HASH [SEED]) => procedure`

Returns a curried `HASH` of 1 or 2 arguments with the supplied `SEED`. When the seed is missing the (current-hash-seed) is assumed.

#### make-mask-hash

*[procedure]*

`(make-mask-hash HASH MASK) => procedure`

Returns a `HASH` with the hash value bitwise and'ed with the supplied `MASK`.

#### make-range-hash

*[procedure]*

`(make-range-hash HASH UPPER [LOWER]) => procedure`

Returns a `HASH` with the hash value restricted to the supplied exact interval, `[LOWER UPPER]`. When `LOWER` is missing 0 is assumed. The signature is that of the `HASH`.

#### make-real-hash

*[procedure]*

`(make-real-hash HASH) => procedure`

Returns a `HASH` with the hash value restricted to the interval, `[0.0 1.0]`. The signature is that of the `HASH`.

#### make-hash-procedure

*[procedure]*

`(make-hash-procedure HASH-PRIM [BYTE-LENGTH string-length]) => procedure`

Returns a hash procedure, `(procedure (scheme-object #!optional unsigned-integer32 unsigned-integer32) unsigned-integer32)`, for the hash primitive procedure `HASH-PRIM`.

#### make-hash-message-digest-primitive

*[procedure]*

`(make-hash-message-digest-procedures HASH-PRIM) => message-digest-primitive`

Returns the `message-digest-primitive` for the hash primitive procedure `HASH-PRIM`.

#### make-range-restriction

*[procedure]*

`(make-range-restriction UPPER [LOWER]) => procedure`

Returns a procedure of 1 argument, `(procedure (number) number)`. The arguments will be swapped if necessary so the range is `[LOWER UPPER]`. When `LOWER` missing 0 is assumed.

#### make-fixnum-range-restriction

*[procedure]*

`(make-fixnum-range-restriction UPPER [LOWER]) => procedure`

Returns a procedure of 1 argument, `(procedure (number) fixnum)`. The arguments will be swapped if necessary so the range is `[LOWER UPPER]`. When `LOWER` missing 0 is assumed.

`LOWER` & {UPPER} must be `fixnum`.

#### unsigned-integer32-ref

*[procedure]*

`(unsigned-integer32-ref OBJECT [INDEX])`

Returns the 32-bits at `INDEX` from `OBJECT` as an unsigned-integer32.

`INDEX` is a *word* index, not a *byte* index, and defaults to `0`.

`OBJECT` must be a `string`, `blob`, `pointer` or `locative` of 32-bit-aligned memory.

The procedure is **unsafe**.

#### unsigned-integer32-set!

*[procedure]*

`(unsigned-integer32-set! OBJECT NUMBER [INDEX])`

Sets the 32-bits at `INDEX` in `OBJECT` to `NUMBER`.

`INDEX` is a *word* index, not a *byte* index, and defaults to `0`.

`OBJECT` must be a `string`, `blob`, `pointer` or `locative` of 32-bit-aligned memory.

The procedure is **unsafe**.

### Hash Search

## Usage

(require-extension rabin-karp)

#### make-rabin-karp-string-search

*[procedure]*

`(make-rabin-karp-string-search SUBSTRINGS [TEST [HASH]])`

Returns a procedure of one argument, the search string, and two optional arguments, the start and end positions within the string. The search procedure returns a list of the matched substring and a list of the start and end positions of the match in the search string. Returns #f when no match found. Similar to the regex unit string-match procedure.

`SUBSTRINGS` is a list of strings. `TEST` is an equivalence procedure. `HASH` is a SRFI-69 compliant hash procedure.

## Requirements

miscmacros moremacros message-digest box crc

## Bugs and Limitations

- The hash values are platform specific.
`unsigned-integer32-ref`&`unsigned-integer32-set!`do not belong here.

## Author

## Version history

- 1.1.1
- Restricted TWMXHash to 32-bit computations.
- 1.1.2
- 1.1.1
- No -unboxing compile option.
- 1.1.0
- Added optional index to
`unsigned-integer32-ref`&`unsigned-integer32-set!`. - 1.0.0
- A Chicken 4 version of the Chicken 3 egg, with minor changes.

## License

Copyright (C) 2010 Kon Lovett. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the Software), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ASIS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.