You are looking at historical revision 34394 of this page. It may differ significantly from its current revision.

hashes

Miscellaneous Hash Functions

Documentation

A suite of hash procedures. All have the same signature. All return hash values in an unsigned 32-bit range. Not a one of these is suitable for cryptographic work!

The HASH name refers to one of the hash algorithm names below.

Usage

(require-extension {{HASH}})

Signatures

[procedure] (*{{HASH}} STRING LENGTH SEED) => unsigned-integer32

Returns the hash of STRING of LENGTH, using SEED.

[procedure] ({{HASH}} STRING [LENGTH [SEED]]) => unsigned-integer32

Returns the hash of STRING. When LENGTH is missing (string-length STRING) is assumed. When SEED is missing 0 is assumed.

[procedure] ({{HASH}}-primitive) => message-digest-primitive

Returns the hash message-digest-primitive object.

Hash Procedures

HASH
Description.
RJMXHash
Bob Jenkins' MIX Lookup 2 hash function.
RJL3Hash
Bob Jenkins' MIX Lookup 3 hash function.
TWMXHash
Thomas Wang's hash function with original MIX.
TWSHMXHash
Thomas Wang's hash function with shift MIX.
TWSHMLMXHash
Thomas Wang's hash function with shift-multiply MIX.
TWMGMXHash
Thomas Wang's hash function with Magic MIX.
FNVHash
Fowler/Noll/Vo 1 hash function. Substitutes the algorithm's initial value when the SEED is non-zero.
FNVAHash
Fowler/Noll/Vo 1a hash function. Substitutes the algorithm's initial value when the SEED is non-zero.
PHSFHash
Paul Hsieh's SuperFast hash function. Substitutes the algorithm's initial value when the SEED is non-zero.
RSHash
Robert Sedgwick's "Algorithms in C" hash function.
JSHash
A bitwise hash function written by Justin Sobel. Ignores the SEED when 0.
PJWHash
Hash algorithm is based on work by Peter J. Weinberger of AT&T Bell Labs.
ELFHash
Similar to the PJW Hash function, but tweaked for 32-bit processors. It's the hash function widely used on most UNIX systems.
BKDRHash
This hash function comes from Brian Kernighan and Dennis Ritchie's book "The C Programming Language". It is a simple hash function using a strange set of possible seeds which all constitute a pattern of 31....31...31 etc, it seems to be very similar to the DJB hash function
SDBMHash
This is the algorithm of choice which is used in the open source SDBM project. The hash function seems to have a good over-all distribution for many different data sets. It seems to work well in situations where there is a high variance in the MSBs of the elements in a data set.
DJBHash
An algorithm produced by Professor Daniel J. Bernstein and shown first to the world on the usenet newsgroup comp.lang.c. It is one of the most efficient hash functions ever published. Substitutes the algorithm's initial value when the SEED is non-zero.
NDJBHash
Now favored by Bernstein. Substitutes the algorithm's initial value when the SEED is non-zero.
DEKHash
An algorithm proposed by Donald E. Knuth in "The Art Of Computer Programming, Volume 3", under the topic of sorting and search chapter 6.4. Substitutes the algorithm's initial value when the SEED is non-zero.
APHash
Arash Partow's hash function. A hybrid rotative and additive hash function algorithm based around four primes 3, 5, 7, and 11.
BRPHash
A hash function by Bruno R. Preiss.
PYHash
The Python String Object hash function.
ISPLHash
The ISpell hash function.
CRCHash
The crc32 procedure wrapped as above. Ignores the SEED when 0.

TWUserMixHash Procedures

Thomas Wang's hash function with a user supplied MIX procedure.

Usage

(require-extension TWUserMixHash)

make-TWUserMixHash-primitive-procedure

[procedure] (make-TWUserMixHash-primitive-procedure MIXER [UNSAFE #f]) => procedure

Returns a hash primitive procedure, (procedure (scheme-object unsigned-integer32 unsigned-integer32) unsigned-integer32), for the procedure MIXER, (procedure (unsigned-integer32) unsigned-integer32).

When UNSAFE no exception checking is performed.

make-TWUserMixHash

[procedure] (make-TWUserMixHash MIXER [UNSAFE #f]) => (procedure procedure procedure)

Returns 3 values: *HASH, HASH, and HASH-primitive.

Hash Utilities

Note the domain number below means flonum or fixnum, not the full numeric tower.

Usage

(require-extension hash-utils)

current-hash-seed

[parameter] (current-hash-seed [NEW-SEED]) => integer

Gets or sets the current default hash seed. The initial value is 0.

make-fixnum-bounded-hash

[procedure] (make-fixnum-bounded-hash HASH [GETLEN string-length] [GETINT 0]) => procedure

Returns a hash function with a SRFI-69 signature, but with a fixnum domain; i.e. (procedure (* #!optional positive-fixnum) fixnum).

The GETLEN will be used to aquire the OBJECT length for the HASH.

The GETINT will supply the initial hash value.

make-bounded-hash

[procedure] (make-bounded-hash HASH [GETLEN string-length] [GETINT 0]) => procedure

Returns a hash function with a SRFI-69 signature, but with a real domain; i.e. (procedure (* #!optional positive-number) number).

The GETLEN will be used to aquire the OBJECT length for the HASH.

The GETINT will supply the initial hash value.

make-seeded-hash

[procedure] (make-seeded-hash HASH [SEED]) => procedure

Returns a curried HASH of 1 or 2 arguments with the supplied SEED. When the seed is missing the (current-hash-seed) is assumed.

make-mask-hash

[procedure] (make-mask-hash HASH MASK) => procedure

Returns a HASH with the hash value bitwise and'ed with the supplied MASK.

make-range-hash

[procedure] (make-range-hash HASH UPPER [LOWER]) => procedure

Returns a HASH with the hash value restricted to the supplied exact interval, [LOWER UPPER]. When LOWER is missing 0 is assumed. The signature is that of the HASH.

make-real-hash

[procedure] (make-real-hash HASH) => procedure

Returns a HASH with the hash value restricted to the interval, [0.0 1.0]. The signature is that of the HASH.

make-hash-procedure

[procedure] (make-hash-procedure HASH-PRIM [BYTE-LENGTH string-length]) => procedure

Returns a hash procedure, (procedure (scheme-object #!optional unsigned-integer32 unsigned-integer32) unsigned-integer32), for the hash primitive procedure HASH-PRIM.

make-hash-message-digest-primitive

[procedure] (make-hash-message-digest-procedures HASH-PRIM) => message-digest-primitive

Returns the message-digest-primitive for the hash primitive procedure HASH-PRIM.

make-range-restriction

[procedure] (make-range-restriction UPPER [LOWER]) => procedure

Returns a procedure of 1 argument, (procedure (number) number). The arguments will be swapped if necessary so the range is [LOWER UPPER]. When LOWER missing 0 is assumed.

make-fixnum-range-restriction

[procedure] (make-fixnum-range-restriction UPPER [LOWER]) => procedure

Returns a procedure of 1 argument, (procedure (number) fixnum). The arguments will be swapped if necessary so the range is [LOWER UPPER]. When LOWER missing 0 is assumed.

LOWER & {UPPER} must be fixnum.

unsigned-integer32-ref

[procedure] (unsigned-integer32-ref OBJECT [INDEX])

Returns the 32-bits at INDEX from OBJECT as an unsigned-integer32.

INDEX is a word index, not a byte index, and defaults to 0.

OBJECT must be a string, blob, pointer or locative of 32-bit-aligned memory.

The procedure is unsafe.

unsigned-integer32-set!

[procedure] (unsigned-integer32-set! OBJECT NUMBER [INDEX])

Sets the 32-bits at INDEX in OBJECT to NUMBER.

INDEX is a word index, not a byte index, and defaults to 0.

OBJECT must be a string, blob, pointer or locative of 32-bit-aligned memory.

The procedure is unsafe.

Usage

(require-extension rabin-karp)
[procedure] (make-rabin-karp-string-search SUBSTRINGS [TEST [HASH]])

Returns a procedure of one argument, the search string, and two optional arguments, the start and end positions within the string. The search procedure returns a list of the matched substring and a list of the start and end positions of the match in the search string. Returns #f when no match found. Similar to the regex unit string-match procedure.

SUBSTRINGS is a list of strings. TEST is an equivalence procedure. HASH is a SRFI-69 compliant hash procedure.

Requirements

miscmacros moremacros message-digest box crc

Bugs and Limitations

Author

kon lovett

Version history

1.2.0
Add raw-update.
1.1.1
Restricted TWMXHash to 32-bit computations.
1.1.2
1.1.1
No -unboxing compile option.
1.1.0
Added optional index to unsigned-integer32-ref & unsigned-integer32-set!.
1.0.0
A Chicken 4 version of the Chicken 3 egg, with minor changes.

License

Copyright (C) 2010 Kon Lovett. All rights reserved.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the Software), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED ASIS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.