persistent-hash-map

A persistent associative data structure.

  1. persistent-hash-map
    1. Overview
    2. Implementation notes
    3. Compatibility
      1. Chicken version requirement
      2. Compiler __builtin_popcount requirement
    4. API
      1. Persistent maps
      2. Transient maps
    5. About this egg
      1. Source
      2. Author
      3. Version history
      4. License

Overview

This egg provides a (mostly) direct port of ClojureScript's PersistentHashMap which is an implementation of Phil Bagwell's Hash Array Mapped Trie (HAMT). As an extension to the original HAMT, this implementation provides a persistent map data structure which means that update operations yield a new version without modifying the old one. In order to preserve space and time efficieny it employs structural sharing between different versions by means of path copying.

A persistent map can be converted to a transient map which can be mutated and later be turned back into a persistent map. This allows for more efficient modifications than the persistent map does at the expense of having to coordinate concurrent updates.

Preliminary benchmarks indicate that key lookup times are comparable but slightly worse than those of SRFI 69 hash tables whereas transient maps insertion times are about on par.

Implementation notes

The data structures provided by this egg (persistent and transient maps) use the equal?-hash function from the srfi-69 unit for hashing keys and thus equal? for comparing keys on hash collision.

Compatibility

Currently a few compatibility constraints apply to this extension.

Chicken version requirement

The current release only works with Chicken versions >= 4.7.5 as it makes heavy use of the type declaration system which was introduced with that version. Future releases are planned to be compatible with older Chicken versions, as well.

Compiler __builtin_popcount requirement

The current release requires a C compiler with support for __builtin_popcount. This applies at least to GCC since version 3.4 and LLVM / clang since version 1.5. If your processor supports a native popcount instruction you can enable it by passing -C -mpopcnt via the CSC_OPTIONS environment variable when installing this extension.

API

Persistent maps

The following procedures all operate on persistent maps. Updating procedures (such as map-add) always return a new map instead of mutating the passed map(s).

[procedure] (persistent-map [key value ...])

Returns a persistent map, optionally populated with the given key value pairs.

Example:

(persistent-map 'foo 1 'bar 2)
=> #<persistent-hash-map (bar . 2) (foo . 1)>
[procedure] (alist->map alist)

Constructs a persistent map from alist.

Example:

(alist->map '((foo . 1) (bar . 2)))
=> #<persistent-hash-map (bar . 2) (foo . 1)>
[procedure] (map->alist map)

Constructs an alist from map.

Example:

(map->alist (persistent-map 'foo 1 'bar 2))
=> ((bar . 2) (foo . 1))
[procedure] (map? x)

Checks whether x is a persistent map.

[procedure] (map-size map)

Returns the number of entries in map.

[procedure] (map-add map key value [key value ...])

Returns map with the given key value pairs added. Existing keys will be overridden.

Example:

(map-add (persistent-map 'foo 1) 'foo 2 'bar 3)
=> #<persistent-hash-map (bar . 3) (foo . 2)>
[procedure] (map-contains? map key)

Checks whether map contains an entry for key.

[procedure] (map-delete map key [key ...])

Returns map without the given keys. Non-existing keys will be ignored.

Example:

(map-delete (persistent-map 'foo 1 'bar 2) 'foo 'qux)
=> #<persistent-hash-map (bar . 2)>
[procedure] (map-ref map key #!optional not-found)

Returns the value associated with key in map or not-found (default #f) if key does not exist.

Example:

(define m (persistent-map 'foo 1))

(map-ref m 'foo 2) => 1
(map-ref m 'bar)   => #f
(map-ref m 'bar 2) => 2
[procedure] (map-ref-in map keys #!optional not-found)

Recursively looks up the list of keys in the nested map and returns the value associated with the innermost key or not-found (default #f) if one of the keys does not exist.

Example:

(define m (persistent-map 'foo (persistent-map 'bar (persistent-map 'qux 123))))

(map-ref-in m '(foo bar qux)) => 123
(map-ref-in m '(foo qux))     => #f
[procedure] (map-update-in map keys proc . args)

Recursively looks up the keys in the nested map like map-ref-in but applies proc to the existing value of the innermost key (or #f if it doesn't exist, yet) and the given args. For non-existing intermediate keys new persistent maps will be added.

Example:

(define m (persistent-map 'foo (persistent-map 'bar 1)))

(map-update-in m '(foo bar) + 1)
=> #<persistent-hash-map (foo . #<persistent-hash-map (bar . 2)>)>

(map-update-in m '(foo qux) (lambda (x) (or x 9)))
=> #<persistent-hash-map (foo . #<persistent-hash-map (bar . 1) (qux . 9)>)>
[procedure] (map-equal? x y)

Checks whether the given maps x and y are equal?. Values which are maps themselves will be recursively compared with map-equal?.

[procedure] (map-keys map)

Returns a list of the keys contained in map.

[procedure] (map-values map)

Returns a list of the values contained in map.

[procedure] (map-merge map1 map2)

Returns a new map with the entries of map2 merged into map1. Entries which are present in both maps will be overridden by those in map2.

[procedure] (map-reduce f init map)

Reduces the entries of map by applying them to f together with the current reduction value, starting with init.

Example:

(define m (persistent-map 'foo 1 'bar 2))

(map-reduce cons* '() m) => (bar 2 foo 1)
[procedure] (map-collect proc map)

Applies proc to each entry of map and collects the return values in a list as the final result.

Example:

(map-collect cons (persistent-map 1 2 3 4))
=> ((1 . 2) (3 . 4))
[procedure] (map-each proc map)

Applies proc to each entry of map for its side-effects. The return value is undefined.

Example:

(map-each (lambda (key value)
            (pp (list key: key value: value)))
          (persistent-map 'foo 1 'bar 2))

Output:

(key: foo value: 1)
(key: bar value: 2)

Transient maps

The following procedures all operate on transient maps which are mutable versions of persistent maps. A transient map can be turned into a persistent map and cannot be modified afterwards anymore.

[procedure] (map->transient-map map)

Returns a transient copy of the persistent map.

[procedure] (transient-map? x)

Checks whether x is a transient map.

[procedure] (persist-map! map)

Marks map as persistent and returns a persistent version of it. After persist-map! has been called on a transient map it can't be mutated anymore. Calling persist-map! twice on the same transient map is an error.

[procedure] (map-add! map key value [key value ...])

Adds the given key value pairs to map and returns it.

Example:

(define m (map->transient-map (persistent-map)))

(for-each
 (lambda (k v) (map-add! m k v))
 '(foo bar)
 '(1 2))

(persist-map! m)
=> #<persistent-hash-map (bar . 2) (foo . 1)>
[procedure] (map-delete! map key [key ...])

Deletes the given keys from map and returns it. Non-existing keys are ignored.

Example:

(define m
  (map->transient-map (persistent-map 'foo 1 'bar 2)))

(map-delete! m 'bar)
(persist-map! m)
=> #<persistent-hash-map (foo . 1)>

About this egg

Source

The source code is hosted at Bitbucket. Feel free to fork it and send pull requests there.

Author

Moritz Heidkamp, based on the ClojureScript implementation by Rich Hickey (c).

Version history

0.0.4
Fix transient maps to not modify the original persistent map
0.0.3
Actual 32 bit compatibility
0.0.2
32 bit compatibility (broken)
0.0.1
Initial release

License

Copyright (c) 2013, Moritz Heidkamp. All rights reserved.

The use and distribution terms for this software are covered by the Eclipse Public License 1.0.