pathfinder

Find files in a search path.

Overview

pathfinder provides an interface for finding files in a search path. During a path-find of a search term, each directory in the search path is traversed in order. A matcher is called once for each directory and returns the best match. Optionally, the match must also pass a pathfinder test (for example, that it is a regular file); if it fails, the matcher can continue looking for matches. The first match that also passes the test is returned.

You can additionally find all shadowed matches with path-find-all. In this case, all successful matches in every directory in the search path are returned.

The entire directory contents (names only) are cached when a directory is read for the first time. File stat data is cached as needed; for example, testing that a filename is a regular file is done once, and only after it has been matched. It is possible to clear the caches to pick up any changes to the filesystem.
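The traversal described above can be modelled in a few lines of plain Scheme. This is only an illustrative sketch and not the egg's implementation: directories are in-memory listings, and the names model-find, model-find-all, exact-match and always are hypothetical. It also simplifies one point: the matcher offers at most one candidate per directory, so it does not retry within a directory when the test fails.

```scheme
;; Illustrative model only; NOT the pathfinder egg's code.
;; A directory is a list whose car is the directory name and whose
;; cdr is the list of filenames it contains.

;; Hypothetical matcher: exact filename match within one listing.
(define (exact-match term names)
  (if (member term names) term #f))

;; Walk the directories in order and return the first match that
;; also passes TEST, or #f if there is none.
(define (model-find dirs term matcher test)
  (let loop ((dirs dirs))
    (if (null? dirs)
        #f
        (let* ((dir (caar dirs))
               (hit (matcher term (cdar dirs))))
          (if (and hit (test dir hit))
              (string-append dir "/" hit)
              (loop (cdr dirs)))))))

;; Collect every passing match, shadowed ones included, in search order.
(define (model-find-all dirs term matcher test)
  (let loop ((dirs dirs) (acc '()))
    (if (null? dirs)
        (reverse acc)
        (let* ((dir (caar dirs))
               (hit (matcher term (cdar dirs))))
          (loop (cdr dirs)
                (if (and hit (test dir hit))
                    (cons (string-append dir "/" hit) acc)
                    acc))))))

;; Search path: /usr/local/bin first, then /usr/bin.
(define tree
  '(("/usr/local/bin" "rsync")
    ("/usr/bin" "ls" "rsync")))

;; Hypothetical test that accepts everything.
(define (always dir name) #t)

(model-find tree "ls" exact-match always)
  ; => "/usr/bin/ls"
(model-find-all tree "rsync" exact-match always)
  ; => ("/usr/local/bin/rsync" "/usr/bin/rsync")
```

In the model, as in the description above, the earlier directory shadows the later one, and only the find-all variant reports the shadowed file.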

  1. pathfinder
  2. Overview
  3. API
    1. Finding files
    2. Constructors
    3. Matchers
      1. User-defined matchers
    4. Tests
      1. User-defined tests
    5. Miscellaneous
    6. Parameters
  4. Bugs and limitations
  5. About this egg
    1. Author
    2. Version history
    3. License

API

Finding files

[procedure] (path-find pf pathname #!optional matcher test)

Find the best (first) match for PATHNAME using pathfinder PF, and return the absolute pathname if a match occurs or #f if not.

PATHNAME may be a filename such as "bar", which is looked for in each search path, or a relative path such as "foo/bar", in which "bar" is looked for in subdirectory "foo" under each search path. It can also be an absolute path; in this case, it is considered relative to the pathfinder root, confining the search to a single directory in the search path. If the resulting directory is not in or under the search path, #f is returned.

MATCHER is an optional pathfinder matcher which indicates the matching method; if #f or not present, the matcher associated with the pathfinder object is used.

TEST is an optional pathfinder test which performs an additional test before admitting the file. If #f or not present, the test associated with the pathfinder object is used.

A very simple example:

; The file tree:
; /usr/bin/ls
; /usr/bin/rsync
; /usr/local/bin/rsync

(define pf (make-pathfinder '("/usr/local/bin" "/usr/bin")))
(path-find pf "ls")
  ; => "/usr/bin/ls"
(path-find pf "rsync")
  ; => "/usr/local/bin/rsync"
(path-find pf "flotz")
  ; => #f

A more complicated example. Here we add a pathfinder root, relative search paths and absolute pathnames into the mix:

; The file tree:
; /home/hacker/usr/bin/ls
; /home/hacker/usr/local/bin/ls
; /home/hacker/usr/sbin/fsck

(define pf (make-pathfinder '("local/bin" "bin") root: "/home/hacker/usr"))
(path-find pf "ls")
  ; => "/home/hacker/usr/local/bin/ls"
(path-find pf "/local/bin/ls")
  ; => "/home/hacker/usr/local/bin/ls"
(path-find pf "/bin/ls")         ;; Absolute search of /home/hacker/usr/bin
  ; => "/home/hacker/usr/bin/ls"
(path-find pf "/sbin/fsck")      ;; Maps to /home/hacker/usr/sbin, but it's not in search path
  ; => #f

For further examples, see the matchers pf:exact, pf:extensions and pf:compound.

[procedure] (path-find-all pf pathname #!optional matcher test)

Like path-find, but returns a list that also includes "shadowed" matches. In other words, MATCHER gets a chance to match multiple times in multiple directories.

; The file tree:
; /usr/bin/ls
; /usr/bin/rsync
; /usr/local/bin/rsync

(define pf (make-pathfinder '("/usr/local/bin" "/usr/bin")))
(path-find-all pf "ls")
  ; => ("/usr/bin/ls")
(path-find-all pf "rsync")
  ; => ("/usr/local/bin/rsync" "/usr/bin/rsync")

For another example, see pf:extensions.

Note: rather than filtering the results on file type, size etc. after calling path-find-all, consider using pathfinder tests.

[procedure] (path-fold pf func init pathname #!optional matcher test)

Like path-find-all, this finds all shadowed matches for PATHNAME, performing a fold over the results. INIT is the initial seed and FUNC is called with arguments (X XS) where X is the current absolute pathname and XS is the accumulated seed.

Note: filtering the results on file type, size etc. is usually better done with pathfinder tests.
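As a rough illustration of path-fold, the sketch below counts the matches for a search term and then collects them into a list. It assumes the pathfinder egg is installed and uses the same search path as the path-find-all example; the results depend on your filesystem, so none are shown.

```scheme
(use pathfinder)   ; same (use ...) form as the other examples in this document

(define pf (make-pathfinder '("/usr/local/bin" "/usr/bin")))

;; Count the matches for "rsync", shadowed ones included.
;; FUNC receives the current pathname X and the accumulated seed XS.
(path-fold pf (lambda (x xs) (+ xs 1)) 0 "rsync")

;; Collect the matches into a list. Consing onto the seed builds the
;; list in the reverse of the order in which FUNC was called.
(path-fold pf cons '() "rsync")
```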

Constructors

[procedure] (make-pathfinder paths #!key matcher test root)

Construct a pathfinder object with search path PATHS. PATHS is a list of absolute or relative pathnames; if relative, they are relative to the pathfinder ROOT.

Keyword ROOT specifies the root of this pathfinder and defaults to the current directory. ROOT specifies both the base for relative PATHS, and the base for absolute search terms (see path-find). ROOT itself can be absolute or relative; if relative, it is relative to the current directory at the time the object is created.

Keyword MATCHER specifies a pathfinder matcher for this object, and defaults to (pathfinder-default-matcher), usually the exact filename matcher pf:exact. This becomes the default matcher for this pathfinder object, and can be overridden later in path-find.

Keyword TEST specifies a pathfinder test for this object, and defaults to (pathfinder-default-test), usually pf:regular-file?. This becomes the default test for this pathfinder object, and can be overridden later in path-find. To avoid filtering, use the test pf:any?.

;; unix path search, exact filename match
(define pu (make-pathfinder
            (string-split (get-environment-variable "PATH") ":")))
(path-find pu "csi")
  ; => "/usr/local/bin/csi"
(path-find pu "bash")
  ; => "/bin/bash"

;; windows path search, using filepath egg; extension match
(use filepath)
(define pw (make-pathfinder (filepath:get-search-path) 
            matcher: (pf:extensions '(".com" ".exe" ".bat"))))

(path-find pw "csi")
  ; => "X:\\chicken4\\bin\\csi.exe"
(path-find pw "command")
  ; => "c:\\windows\\system32\\command.com"

Matchers

[procedure] pf:exact

A matcher object suitable for passing to path-find or path-find-all. For each directory, this matcher checks that the current search term exactly matches a filename that is present under the directory.

[procedure] (pf:extensions exts)

Create a matcher object suitable for passing to path-find or path-find-all. For each directory, this matcher tries each extension in the EXTS list in order, by appending it to the current search term. If the extension was already present in the search term, it is not appended again.

The extensions in the EXTS list should include the dot, meaning ".exe" not "exe". It is legal to include a dot in the extension itself. The empty string "" is allowed as well, meaning the search term itself will be tried without any extension.

An example in which we search only one path, the Chicken repository path, with several extensions:

(define p (make-pathfinder (list (repository-path))))
(define eggs (pf:extensions '(".setup-info" ".so" ".import.so")))

(path-find p "matchable" eggs)
  ; => "/usr/local/lib/chicken/6/matchable.setup-info"
(path-find-all p "matchable" eggs)
  ; => ("/usr/local/lib/chicken/6/matchable.setup-info"
  ;     "/usr/local/lib/chicken/6/matchable.so"
  ;     "/usr/local/lib/chicken/6/matchable.import.so")

(path-find p "txpath" eggs)
  ; => "/usr/local/lib/chicken/6/txpath.so"
(path-find-all p "txpath" eggs)
  ; => ("/usr/local/lib/chicken/6/txpath.so"
  ;     "/usr/local/lib/chicken/6/txpath.import.so")

(path-find p "posix" eggs)
  ; => "/usr/local/lib/chicken/6/posix.import.so"
(path-find-all p "posix" eggs)
  ; => ("/usr/local/lib/chicken/6/posix.import.so")

Example in which we search subdirectories in multiple paths:

(define p (make-pathfinder (list "user" "template") root: "/var/www/htdocs"))
(define webs (pf:extensions '(".css" ".scss" ".js")))

(path-find p "js/jquery" webs)
  ; => "/var/www/htdocs/template/js/jquery.js"
(path-find p "css/screen" webs)
  ; => "/var/www/htdocs/user/css/screen.css"
(path-find-all p "css/screen" webs)
  ; => ("/var/www/htdocs/user/css/screen.css"
  ;     "/var/www/htdocs/user/css/screen.scss"
  ;     "/var/www/htdocs/template/css/screen.css"
  ;     "/var/www/htdocs/template/css/screen.scss")
(path-find p "css/screen.scss" webs)
  ; => "/var/www/htdocs/user/css/screen.scss"
(path-find-all p "css/screen.scss" webs)
  ; => ("/var/www/htdocs/user/css/screen.scss"
  ;     "/var/www/htdocs/template/css/screen.scss")
(path-find p "/template/css/screen" webs)
  ; => "/var/www/htdocs/template/css/screen.css"

[procedure] (pf:compound exts)

Create a matcher object suitable for passing to path-find or path-find-all. For each directory, this matches any files that match the current search term and additionally have zero or more of the extensions in the EXTS list. Furthermore, these results are returned sorted by extension priority, where the priority is higher for extensions earlier in the list. This behavior is like that of the Hike library for Ruby.

The extensions in the EXTS list should include the dot at the beginning, meaning ".exe" not "exe". It is not legal to use the empty string, nor to have a dot within the extension.

(define p (make-pathfinder '("app/assets" "vendor/plugins/3e8/assets"
                             "vendor/plugins/jquery" "template/assets")
           matcher: (pf:compound '(".scss" ".coffee" ".sass" ".scm" ".php"))
           root: "/var/zb-app"))

; tree under /var/zb-app
; ./app/assets/js/app.js
; ./app/assets/js/menu.js.coffee
; ./template/assets/css/menu.css
; ./template/assets/js/search.js
; ./template/assets/js/menu.js
; ./vendor/plugins/3e8/assets/css/menu.css.sass  
; ./vendor/plugins/3e8/assets/css/menu.css.scss
; ./vendor/plugins/3e8/assets/css/menu.css.scss.php
; ./vendor/plugins/3e8/assets/css/menu.css.scss.scm
; ./vendor/plugins/jquery/js/jquery.js

(path-find p "js/menu.js")
  ; => "/var/zb-app/app/assets/js/menu.js.coffee"
(path-find-all p "js/menu.js")
  ; => ("/var/zb-app/app/assets/js/menu.js.coffee"
  ;     "/var/zb-app/template/assets/js/menu.js")
(path-find p "js/search.js")
  ; => "/var/zb-app/template/assets/js/search.js"
(path-find p "js/jquery.js")
  ; => "/var/zb-app/vendor/plugins/jquery/js/jquery.js"
(path-find p "css/menu.css")
  ; => "/var/zb-app/vendor/plugins/3e8/assets/css/menu.css.scss"
(path-find-all p "css/menu.css")
  ; => ("/var/zb-app/vendor/plugins/3e8/assets/css/menu.css.scss"
  ;     "/var/zb-app/vendor/plugins/3e8/assets/css/menu.css.sass"
  ;     "/var/zb-app/vendor/plugins/3e8/assets/css/menu.css.scss.scm"
  ;     "/var/zb-app/vendor/plugins/3e8/assets/css/menu.css.scss.php"
  ;     "/var/zb-app/template/assets/css/menu.css")
(path-find p "css/menu.css" pf:exact)
  ; => "/var/zb-app/template/assets/css/menu.css"

User-defined matchers

You can create your own custom matchers, but the API is not yet finalized. If you wish to do this anyway, please have a look at how pf:extensions is implemented in the source code, and caveat schemer.

Tests

[procedure] (pf:regular-file? DIRENT FILENAME)

Pathfinder test which selects regular files, to be passed to path-find, path-find-all, path-fold or make-pathfinder. This is the default test, so specifying it is optional.

;; Find a regular file named "ls" in the search path.
(path-find p "ls" pf:exact pf:regular-file?)

[procedure] (pf:directory? DIRENT FILENAME)

Pathfinder test which selects directories, to be passed to path-find, path-find-all, path-fold or make-pathfinder.

[procedure] (pf:any? DIRENT FILENAME)

Pathfinder test which will pass any file. It can be used when you don't want to filter anything.

;; Find any file or directory named ".emacs.d" in the search path,
;; overriding pathfinder p's default test of pf:regular-file?
(define p (make-pathfinder paths))
(path-find p ".emacs.d" pf:exact pf:any?)

;; Same thing, but associate the test with pathfinder p,
;; so we don't have to specify it in each find.
(define p (make-pathfinder paths test: pf:any?))
(path-find p ".emacs.d")

User-defined tests

It is possible to define custom pathfinder tests, which is useful if you want to check that a file is executable, has non-zero size, has a recent mtime, and so on. Tests are two-argument procedures (DIRENT FILENAME) which return a boolean value indicating whether to accept the file that is currently being examined. The following helpers are useful.

[procedure] (dirent-stat DIRENT FILENAME)

Return the stat vector for FILENAME in DIRENT. If the stat fails (for example, if the file has gone missing since the dirent was populated) then #f is returned. Stat vectors are cached upon first access and subsequent stats are returned from cache.

(use posix-extras)
(define (pf:nonempty-file? de fn)
  (and-let* ((s (dirent-stat de fn)))
    (and (stat-regular-file? s)    ;; or, (pf:regular-file? de fn)
         (> (stat-size s) 0))))

(path-find pf "notes.txt" pf:exact pf:nonempty-file?)

[procedure] (dirent-pathname DIRENT FILENAME)

Return the absolute pathname associated with FILENAME in DIRENT. You might use this when performing a query that cannot be answered by the stat cache.

(define (pf:execute-access? de fn)
  (file-execute-access?
   (dirent-pathname de fn)))

[procedure] (dirent-directory DIRENT)

Return the directory name associated with DIRENT.
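Because tests are plain two-argument procedures, they can be combined with ordinary Scheme. The helper below is a hypothetical sketch, not part of the egg: test-and builds a new test that accepts a file only if every given test accepts it.

```scheme
;; Hypothetical helper, not provided by the pathfinder egg.
;; Returns a test (DIRENT FILENAME) that passes only when all of
;; TESTS pass; with no tests at all it accepts everything.
(define (test-and . tests)
  (lambda (de fn)
    (let loop ((ts tests))
      (or (null? ts)
          (and ((car ts) de fn)
               (loop (cdr ts)))))))
```

Assuming the pf:execute-access? test defined above, a search for an executable regular file might then read (path-find pf "csi" pf:exact (test-and pf:regular-file? pf:execute-access?)). Putting the cheaper, cached stat test first means the uncached access check only runs on files that are already known to be regular.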

Miscellaneous

[procedure] (pathfinder-reset pf)

Reset the cached directory and file stat data in pathfinder PF, allowing you to pick up any changes that have occurred in the search path.
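A minimal sketch of when a reset is needed, assuming the pathfinder egg is installed; the filename "newtool" is hypothetical and stands for a file created after the first search.

```scheme
(use pathfinder)   ; same (use ...) form as the other examples in this document

(define pf (make-pathfinder '("/usr/local/bin" "/usr/bin")))

;; The first search reads the directories it traverses and caches
;; their contents.
(path-find pf "newtool")

;; ... "newtool" is installed into /usr/local/bin by another process ...

;; Drop the cached directory and stat data, so the next search reads
;; the directories again and can see the new file.
(pathfinder-reset pf)
(path-find pf "newtool")
```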

[procedure] (pathfinder-stat pf pathname)

Using PF's stat cache, stat absolute PATHNAME and return its stat vector (as in posix's file-stat). For example, this could be used on the filenames returned by path-find-all in order to filter them further.

Note that this returns #f for a file in any directory which has not yet been traversed during a search operation.

It is recommended to write your own pathfinder test instead and use dirent-stat, which is more efficient and will additionally work even with plain path-find.

[procedure] (pathfinder-paths PF)

Return the search path list of pathfinder PF. The paths may not equal what you passed to the constructor -- they may have been adjusted for the pathfinder root, normalized, tilde expanded, etc.

[procedure] (pathfinder-root PF)

Return the root of the pathfinder, an absolute pathname.

[procedure] (pathfinder-matcher PF)

Return the pathfinder matcher procedure associated with PF.

[procedure] (pathfinder-test PF)

Return the pathfinder test procedure associated with PF.

Parameters

[parameter] (pathfinder-default-test test)

Default test assigned to new pathfinder objects; it is performed after matching a filename. The default value is pf:regular-file?. To not filter the results at all, set this to pf:any?.

[parameter] (pathfinder-default-matcher matcher)

Default matcher assigned to new pathfinder objects. The default value is pf:exact.
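Since these are parameters, they can be rebound for a limited extent with parameterize instead of being set globally. The sketch below assumes the pathfinder egg is installed and that the default is read when make-pathfinder is called, as the constructor description suggests; the names dirs-too and files-only are illustrative.

```scheme
(use pathfinder)   ; same (use ...) form as the other examples in this document

;; A pathfinder created inside the parameterize picks up pf:any? as its
;; test, so directories can match as well as regular files.
(define dirs-too
  (parameterize ((pathfinder-default-test pf:any?))
    (make-pathfinder '("/usr/local/bin" "/usr/bin"))))

;; A pathfinder created outside it keeps the usual default test.
(define files-only
  (make-pathfinder '("/usr/local/bin" "/usr/bin")))
```

The accessor pathfinder-test can be used to check which test each object ended up with.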

Bugs and limitations

About this egg

Author

Jim Ursetto

Some design inspiration as well as handling of compound extensions was taken from the MIT-licensed Hike library by Sam Stephenson.

Version history

0.2
API change (pathfinder test now specified in find, not matcher); add pf:any? identity test; a few new accessors and helpers; bugfixes
0.1
Initial release

License

Copyright (c) 2011, Ursetto Consulting, Inc. MIT license.