Fancypants

  1. Fancypants
    1. Description
    2. Author
    3. Requirements
    4. Documentation
      1. Rulesets
      2. Constants
      3. Helper procedures
    5. Changelog
    6. License

Description

SXML/HTML automatic ligature detection and quote smartening.

Author

Peter Bex

Requirements

This egg declares no dependencies of its own, but you will need sxml-transforms in order to use it.

Documentation

Fancypants is a fairly simple set of functions plus an SXSLT ruleset to automagically convert SXML with plain-ASCII strings to typographically enhanced Unicode strings. Ligatures are added and quotes are educated, i.e., opening quotes are curled to the left while closing quotes are curled the other way. An example piece of SXML:

(sxml-apply-rules
  '(blockquote "\"The affable Estonian wasn't fired\","
               " said the --- strangely afflicted ---"
               " flying monkey at the office.")
  (make-fancy-rules)
  (make-smart-quote-rules))

When rendered, this looks like the following:

“The affable Estonian wasn’t fired”, said the — strangely afflicted — flying monkey at the office.

Without Fancypants, the same text looks like this:

"The affable Estonian wasn't fired", said the --- strangely afflicted --- flying monkey at the office.

As you can see, the quotes are curled correctly, the three consecutive hyphens are converted to real em dashes (but this wiki renders them incorrectly, unfortunately) and the 'fi', 'ffl', 'fl', 'ffi' and 'ff' character sequences are replaced by ligatures that merge the characters in a nice way.

A word of warning: how the ligatures are displayed depends heavily on the font in use and on how that font is implemented. For example, on a Mac, most MS Corefonts are apparently modified by Apple to support all ligatures, while the basic Corefonts by Microsoft (as found on Windows and many Unix installations) lack ligatures in most fonts. Consider this before using Fancypants' ligature capability (the fi and ff ligatures are reasonably safe to use in most cases, though). Unfortunately, testing on a number of platforms is still a good idea when doing web development.

Fancypants was inspired by SmartyPants and, more specifically, Mikhail Wolfson's ligatures hack for SmartyPants.

Rulesets

There are two rulesets: one for auto-conversion of ligatures and other types of character combinations to Unicode and one for smartening quotes. Both rulesets are generated by functions.

[procedure] (make-fancy-rules [exceptions] [character-map])

Create a ruleset that performs ASCII->Unicode mappings for all entries in the character-map argument. character-map defaults to default-map (see below).

Please note that the order of the entries matters because the replacement algorithm employs a nongreedy search: if one entry is a prefix of another (for example 'ff' and 'ffi'), place the shorter one after the longer one and there is no problem. The symbols in exceptions are the tags to leave alone (i.e., nothing below these is fancified); exceptions defaults to default-exceptions (see below).
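To see why the order of entries matters, here is a minimal, self-contained sketch of first-match substitution over an ordered alist. It is not the egg's actual implementation: the procedure names (prefix-at?, first-match, substitute) and the bracketed placeholder replacements are hypothetical and chosen only for illustration.

```scheme
;; Hypothetical illustration only; this is NOT how Fancypants is implemented.

;; Does KEY occur in STR starting at index POS?
(define (prefix-at? str pos key)
  (let ((end (+ pos (string-length key))))
    (and (<= end (string-length str))
         (string=? (substring str pos end) key))))

;; Return the first (key . replacement) pair whose key matches at POS, or #f.
(define (first-match str pos mapping)
  (cond ((null? mapping) #f)
        ((prefix-at? str pos (caar mapping)) (car mapping))
        (else (first-match str pos (cdr mapping)))))

;; Walk STR left to right, replacing the first matching key at each position.
(define (substitute str mapping)
  (let loop ((pos 0) (acc '()))
    (if (>= pos (string-length str))
        (apply string-append (reverse acc))
        (let ((hit (first-match str pos mapping)))
          (if hit
              (loop (+ pos (string-length (car hit)))
                    (cons (cdr hit) acc))
              (loop (+ pos 1)
                    (cons (string (string-ref str pos)) acc)))))))

;; Longer keys first: "ffi" gets a chance to match before "ff".
(define good-order '(("ffi" . "<ffi>") ("ff" . "<ff>") ("fi" . "<fi>")))
;; Prefix first: "ff" always wins, so "ffi" can never match.
(define bad-order  '(("ff" . "<ff>") ("ffi" . "<ffi>") ("fi" . "<fi>")))

(display (substitute "office" good-order)) (newline) ; prints o<ffi>ce
(display (substitute "office" bad-order))  (newline) ; prints o<ff>ice
```

With the prefix listed first, the longer entry is shadowed, which is why a custom character-map should list longer sequences before their prefixes.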

[procedure] (make-smart-quote-rules [exceptions] [quotes])

Create a ruleset that educates quotes. quotes defines the strategy for translating quotes to smart quotes; see the documentation for all-quotes for more information on the structure of this argument. Please note that here the order doesn't matter because the replacement algorithm uses simple regexes. The symbols in exceptions are the tags to leave alone (i.e., nothing under these has its quotes changed).

exceptions defaults to default-exceptions and quotes defaults to all-quotes (see below).
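As a small sketch of overriding the exceptions argument, the following extends the default list with two extra tags. The tag names tt and var and the variable names are hypothetical, and the import line assumes CHICKEN 5 syntax.

```scheme
;; Assumes CHICKEN 5 import syntax; on CHICKEN 4, write (use fancypants) instead.
(import fancypants)

;; Hypothetical choice: also leave <tt> and <var> elements untouched.
(define my-exceptions
  (append '(tt var) default-exceptions))

;; Both constructors take the exceptions list as their first optional argument.
(define my-fancy-rules (make-fancy-rules my-exceptions))
(define my-quote-rules (make-smart-quote-rules my-exceptions))
```

These rulesets can then be passed to sxml-apply-rules in place of the defaults used in the example at the top of the Documentation section.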

Constants

[constant] default-exceptions

This constant is a list of all the tags (symbols) that are ignored by default.

These are: (head script pre code kbd samp @).

[constant] default-ligature-map

An alist of default ASCII sequences that are translated to ligatures by make-fancy-rules.

Contains mappings for 'ffi', 'ffl', 'ff', 'fi', 'fl' and 'ft'. The mapping for 'st' is intentionally left out because this ligature is too elaborate to use in body copy. You could easily define a ruleset for, e.g., headings that does include the 'st' ligature (its Unicode code point is U+FB06).
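A ruleset that includes the 'st' ligature might be sketched as follows. This assumes that map entries are (ascii-string . unicode-string) pairs, which this page does not state explicitly; the names heading-ligature-map and heading-rules are hypothetical.

```scheme
;; Assumes CHICKEN 5 import syntax; on CHICKEN 4, write (use fancypants) instead.
(import fancypants)

;; ASSUMPTION: entries are (ascii-string . unicode-string) pairs.
;; The replacement string holds U+FB06 (LATIN SMALL LIGATURE ST).
(define heading-ligature-map
  (cons '("st" . "ﬆ") default-ligature-map))

;; Ruleset that applies the extended ligature map.
(define heading-rules
  (make-fancy-rules default-exceptions heading-ligature-map))
```

How to restrict this ruleset to heading elements is up to the calling code and is not shown here.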

[constant] default-punctuation-map

An alist of default ASCII punctuation sequences to translate to 'fancy' Unicode versions. Contains mappings for '...' => '…', '..' => '‥', '. . .' => '…', '---' => '—' and '--' => '–'.

[constant] default-arrow-map

An alist of default ASCII sequences to translate to 'fancy' Unicode versions. This contains several types of arrows. Useful mostly for mathematical texts and 'evaluates to' examples.

[constant] default-map

The default map to use for fancifying text. This is simply a concatenation of default-ligature-map, default-punctuation-map and default-arrow-map.
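Because default-map is just a concatenation, a reduced map can be built the same way. The sketch below leaves out the ligatures, which may be useful given the font-support warning earlier on this page; the names no-ligature-map and no-ligature-rules are hypothetical.

```scheme
;; Assumes CHICKEN 5 import syntax; on CHICKEN 4, write (use fancypants) instead.
(import fancypants)

;; Punctuation and arrows only, in the same relative order as in default-map.
(define no-ligature-map
  (append default-punctuation-map default-arrow-map))

;; Ruleset that fancifies punctuation and arrows but never inserts ligatures.
(define no-ligature-rules
  (make-fancy-rules default-exceptions no-ligature-map))
```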

[constant] all-quotes

The quote patterns to be translated by make-smart-quote-rules. Remove any you don't want to have handled.

The structure of an entry in this list is:

 (pre match post how counts?)

pre is the part of the string before the quote to match, match is the quote itself, and post is the part of the string after the match. These are all irregex literals.

how is one of the following symbols: single, double, single-open, double-open, single-close or double-close.

counts? is a boolean describing whether the quote should influence the nesting of subsequent quotes. For example, the ' in "isn't" gets #f, since it is not a quote that matches a preceding quote or is matched by a subsequent one.
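Removing unwanted entries can be done by filtering on the how field. The sketch below assumes each entry is a proper list laid out exactly as described above, so that how is the fourth element; keep-quotes, double-only and double-quote-rules are hypothetical names.

```scheme
;; Assumes CHICKEN 5 import syntax; on CHICKEN 4, write (use fancypants) instead.
(import fancypants)

;; Keep only the entries whose `how` field (the fourth element) is one of
;; the symbols in HOWS. Written without SRFI-1 to avoid an extra dependency.
(define (keep-quotes hows quotes)
  (cond ((null? quotes) '())
        ((memq (list-ref (car quotes) 3) hows)
         (cons (car quotes) (keep-quotes hows (cdr quotes))))
        (else (keep-quotes hows (cdr quotes)))))

;; Only the double-quote strategies from the default list.
(define double-only
  (keep-quotes '(double double-open double-close) all-quotes))

(define double-quote-rules
  (make-smart-quote-rules default-exceptions double-only))
```

With such a list, single quotes and apostrophes would presumably be left as plain ASCII, since no remaining entry describes them.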

Helper procedures

These procedures are used internally by Fancypants, but they are probably useful enough to export, so here they are.

[procedure] (fancify string character-map)

Perform simple substitution within string, replacing every ASCII character sequence listed in the character-map alist with its Unicode replacement.
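A minimal usage sketch, calling fancify directly on a plain string with the full default map. The input string and the name fancy-text are made up for illustration, and the exact output depends on the contents of default-map.

```scheme
;; Assumes CHICKEN 5 import syntax; on CHICKEN 4, write (use fancypants) instead.
(import fancypants)

;; Substitute using the full default map (ligatures, punctuation, arrows).
(define fancy-text
  (fancify "The affable monkey --- at the office..." default-map))

(display fancy-text)
(newline)
```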

[procedure] (smarten-quotes sxml quotes exceptions)

Smarten the sxml. Translates only the quotes described by the quotes argument and skips all tag names in the exceptions list.
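A minimal usage sketch with the documented argument order. The sample document and the names doc and smart-doc are hypothetical.

```scheme
;; Assumes CHICKEN 5 import syntax; on CHICKEN 4, write (use fancypants) instead.
(import fancypants)

;; A small SXML fragment with quoted text and a code element.
(define doc
  '(p "\"It isn't finished\", she said. "
      (code "(display \"raw\")")))

;; Argument order as documented: sxml, then quotes, then exceptions.
(define smart-doc
  (smarten-quotes doc all-quotes default-exceptions))
```

Since code is listed in default-exceptions, the string inside the code element should be left with its straight quotes.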

Changelog

License

 Copyright (c) 2006-2011, Peter Bex (peter.bex@xs4all.nl)
 All rights reserved.
 
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
 1. Redistributions of source code must retain the above copyright
    notice, this list of conditions and the following disclaimer.
 2. Redistributions in binary form must reproduce the above copyright
    notice, this list of conditions and the following disclaimer in the
    documentation and/or other materials provided with the distribution.
 3. Neither the name of author nor the names of any contributors may
    be used to endorse or promote products derived from this software
    without specific prior written permission.
 
 THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
 ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
 LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
 A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
 HOLDERS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
 SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
 LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
 DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
 THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
 OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.