Types (historical revision 30951) - The CHICKEN Scheme wiki

Types

A dynamically typed language like Scheme does not restrict the type of values bound or assigned to variables to be constant throughout the run-time of a program. This provides a lot of flexibility and makes it easy to get code up and running quickly, but it can make maintenance of larger code bases more difficult, as the implicit assignment of types to variables done by the programmer has to be "recovered" when the code is inspected or debugged again. Statically typed languages enforce distinct types for all variables, optionally providing type-inference to compute types without requiring the user to specify explicit type declarations in many cases.

If the compiler has some knowledge of the types of local or global variables, then it can help in catching type-related errors like passing a value of the wrong type to a user-defined or built-in procedure. Type-information can also be used to generate more efficient code by omitting unnecessary type-checks.

CHICKEN provides an intra-procedural flow-analysis pass and two compiler options for using type-information in this manner:

-specialize will replace certain generic library procedure calls with faster type-specific operations.

-strict-types makes type-analysis more optimistic and gives more opportunities for specialization, but may result in unsafe code if type-declarations are violated.

Note that the interpreter will always ignore type-declarations and will not perform any flow-analysis of interpreted code.

Declaring types

Type information for all core library units is available by default. User-defined global variables can be declared to have a type using the (declare (type ...)) or : syntax.

:

[syntax] (: IDENTIFIER TYPE)

Declares that the global variable IDENTIFIER is of the given type.
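As a minimal sketch of how such declarations might look in practice (the names shout and default-port are made up for illustration), a declaration is typically placed just before the definition it describes:

```scheme
;; CHICKEN 4 needs no import for `:`; on CHICKEN 5 the assumption is that
;; (import (chicken type)) must be added first.

;; Hypothetical procedure: declared to take one string and return one string.
(: shout (string -> string))
(define (shout s)
  (string-append s "!"))

;; Hypothetical non-procedure global with a declared type.
(: default-port fixnum)
(define default-port 8080)

(print (shout "hello"))   ; prints hello!
```

In the interpreter the declarations are ignored, so the code behaves the same with or without them.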

the

[syntax] (the TYPE EXPRESSION)

Equivalent to EXPRESSION, but declares that the result will be of the given type. Note that this form always declares the type of a single result; the cannot be used to declare types for multiple result values. TYPE should be a subtype of the type inferred for EXPRESSION; the compiler will issue a warning if this is not the case.
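A small illustration, assuming a hypothetical helper first-length: the element taken from an arbitrary list has no specific known type, so the can be used to narrow it to string.

```scheme
;; CHICKEN 4 needs no import for `the`; on CHICKEN 5 the assumption is that
;; (import (chicken type)) must be added first.

;; Hypothetical helper: the caller promises that the list holds strings.
(define (first-length words)
  ;; (car words) has no specific known type, so narrow it to string.
  (string-length (the string (car words))))

(print (first-length '("alpha" "beta")))   ; prints 5
```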

assume

[syntax] (assume ((VARIABLE TYPE) ...) BODY ...)

Declares that at the start of execution of BODY ..., the variables will be of the given types. This is equivalent to

(let ((VARIABLE (the TYPE VARIABLE)) ...) 
  BODY ...)
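For instance, a sketch with a hypothetical procedure repeat-char that states the expected types of both of its parameters at once:

```scheme
;; CHICKEN 4 needs no import for `assume`; on CHICKEN 5 the assumption is
;; that (import (chicken type)) must be added first.

;; Hypothetical procedure: inside the body, c is taken to be a char
;; and n a fixnum.
(define (repeat-char c n)
  (assume ((c char) (n fixnum))
    (make-string n c)))

(print (repeat-char #\* 3))   ; prints ***
```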

define-type

[syntax] (define-type NAME TYPE)

Defines a type-abbreviation NAME that can be used in place of TYPE. Type-abbreviations defined inside a module are not visible outside of that module.
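A brief sketch of an abbreviation in use; the type name point and the two procedures are invented for this example:

```scheme
;; CHICKEN 4 needs no import for these forms; on CHICKEN 5 the assumption is
;; that (import (chicken type)) must be added first.

;; Hypothetical abbreviation: a point is a pair of two fixnums.
(define-type point (pair fixnum fixnum))

(: make-point (fixnum fixnum -> point))
(define (make-point x y) (cons x y))

(: point-x (point -> fixnum))
(define (point-x p) (car p))

(print (point-x (make-point 3 4)))   ; prints 3
```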

Type syntax

Types declared with the type declaration (see Declarations) or : should follow the syntax given below:

TYPE	meaning
`deprecated`	any use of this variable will generate a warning
VALUETYPE

VALUETYPE	meaning
`(or VALUETYPE ...)`	"union" or "sum" type
`(not VALUETYPE)`	non-matching type (*)
`(struct STRUCTURENAME)`	record structure of given kind
`(procedure [NAME] (VALUETYPE ... [#!optional VALUETYPE ...] [#!rest [VALUETYPE]]) . RESULTS)`	procedure type, optionally with name
`(VALUETYPE ... [#!optional VALUETYPE ...] [#!rest [VALUETYPE]] -> . RESULTS)`	alternative procedure type syntax
`(VALUETYPE ... [#!optional VALUETYPE ...] [#!rest [VALUETYPE]] --> . RESULTS)`	procedure type that is declared not to modify locally held state
`(VALUETYPE -> VALUETYPE : VALUETYPE)`	predicate procedure type
`(forall (TYPEVAR ...) VALUETYPE)`	polymorphic type
COMPLEXTYPE
BASICTYPE
TYPEVAR	`VARIABLE` or `(VARIABLE TYPE)`

BASICTYPE	meaning
`*`	any value
`blob`	byte vector
`boolean`	true or false
`char`	character
`eof`	end-of-file object
`false`	boolean false
`fixnum`	word-sized integer
`float`	floating-point number
`list`	null or pair
`locative`	locative object
`null`	empty list
`number`	fixnum or float
`pair`	pair
`pointer-vector`	vector of native pointers
`pointer`	native pointer
`input-port` `output-port`	input- or output-port
`procedure`	unspecific procedure
`string`	string
`symbol`	symbol
`true`	boolean true
`vector`	vector

COMPLEXTYPE	meaning
`(pair TYPE1 TYPE2)`	pair with given component types
`(list-of TYPE)`	proper list with given element type
`(list TYPE1 ...)`	proper list with given length and element types
`(vector-of TYPE)`	vector with given element type
`(vector TYPE1 ...)`	vector with given length and element types

RESULTS	meaning
`*`	any number of unspecific results
`(RESULTTYPE ...)`	specific number of results with given types

RESULTTYPE	meaning
`undefined`	a single undefined result
`noreturn`	procedure does not return normally
VALUETYPE

(*) Note: no type-variables are bound inside (not TYPE).

Note that type-variables in forall types may be given "constraint" types, for example:

 (: sort (forall (e (s (or (vector-of e) (list-of e))))
           (s (e e -> *) -> s)))

declares that sort is a procedure of two arguments, the first being a vector or list of an undetermined element type e and the second being a procedure that takes two arguments of the element type. The result of sort is of the same type as the first argument.

Some types are internally represented as structure types, but you can also use these names directly in type-specifications - TYPE corresponds to (struct TYPE) in this case:

Structure type	meaning
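`u8vector`	SRFI-4 vector of unsigned 8-bit integers
`s8vector`	SRFI-4 vector of signed 8-bit integers
`u16vector`	SRFI-4 vector of unsigned 16-bit integers
`s16vector`	SRFI-4 vector of signed 16-bit integers
`u32vector`	SRFI-4 vector of unsigned 32-bit integers
`s32vector`	SRFI-4 vector of signed 32-bit integers
`f32vector`	SRFI-4 vector of 32-bit floating-point numbers
`f64vector`	SRFI-4 vector of 64-bit floating-point numbers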
`thread`	SRFI-18 thread
`queue`	see "data-structures" unit
`environment`	evaluation environment
`time`	SRFI-18 "time" object
`continuation`	continuation object
`lock`	lock object from "posix" unit
`mmap`	memory mapped file
`condition`	object representing exception
`hash-table`	SRFI-69 hash-table
`tcp-listener`	listener object from "tcp" unit

Additionally, some aliases are allowed:

Alias	Type
`any`	`*`
`immediate`	`(or eof null fixnum char boolean)`
`port`	`(or input-port output-port)`
`void`	`undefined`

For portability the aliases &optional and &rest are allowed in procedure type declarations as an alternative to #!optional and #!rest, respectively.
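To make the tables above more concrete, here is a sketch of a few declarations that combine optional arguments, rest arguments, union types and complex types. All procedure names are hypothetical.

```scheme
;; CHICKEN 4 needs no import for `:`; on CHICKEN 5 the assumption is that
;; (import (chicken type)) must be added first.

;; One required string and one optional string argument.
(: greet (string #!optional string -> string))
(define (greet name #!optional (greeting "Hello"))
  (string-append greeting ", " name))

;; Any number of string arguments.
(: total-length (#!rest string -> fixnum))
(define (total-length . strings)
  (apply + (map string-length strings)))

;; An association list of symbol/fixnum pairs; the result is a fixnum or #f.
(: lookup (symbol (list-of (pair symbol fixnum)) -> (or false fixnum)))
(define (lookup key alist)
  (let ((entry (assq key alist)))
    (and entry (cdr entry))))

(print (greet "Ada"))                    ; prints Hello, Ada
(print (total-length "ab" "cde"))        ; prints 5
(print (lookup 'b '((a . 1) (b . 2))))   ; prints 2
```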

Predicates

Procedure-types of the form (DOM -> RNG : TYPE) specify that the declared procedure will be a predicate, i.e. it accepts a single argument of type DOM, returns a result of type RNG (usually a boolean) and returns a true value if the argument is of type TYPE and false otherwise.
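A sketch of a user-defined predicate declared this way; text? and text->string are hypothetical names, and the predicate returns a true value exactly when its argument is a string or a symbol, as the declaration requires:

```scheme
;; CHICKEN 4 needs no import for `:`; on CHICKEN 5 the assumption is that
;; (import (chicken type)) must be added first.

;; DOM is *, RNG is boolean, and the tested TYPE is (or string symbol).
(: text? (* -> boolean : (or string symbol)))
(define (text? x)
  (or (string? x) (symbol? x)))

;; Hypothetical client of the predicate.
(define (text->string x)
  (if (text? x)
      (if (symbol? x) (symbol->string x) x)
      "<not text>"))

(print (text->string 'abc) " " (text->string "def") " " (text->string 42))
;; prints abc def <not text>
```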

Purity

Procedure types are by default assumed not to be referentially transparent and to possibly modify locally held state. Using the (... --> ...) syntax, you can declare that a procedure does not modify local state, i.e. that it does not cause any side-effects on local variables or on data contained in local variables. This gives more opportunities for optimization, but the declaration must not be violated; otherwise the results are undefined.
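As an illustration, a hypothetical procedure initials only reads its arguments and builds a fresh string, so declaring it with --> should be safe:

```scheme
;; CHICKEN 4 needs no import for `:`; on CHICKEN 5 the assumption is that
;; (import (chicken type)) must be added first.

;; Hypothetical procedure: no assignments and no mutation of its arguments,
;; so the --> declaration is not violated.
(: initials (string string --> string))
(define (initials given family)
  (string (string-ref given 0) (string-ref family 0)))

(print (initials "Grace" "Hopper"))   ; prints GH
```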

Using type information in extensions

Type information of declared toplevel variables can be used in client code that refers to the definitions in a compiled file. The following compiler options allow saving type-declarations to a file and consulting the type declarations retained in this manner:

-emit-type-file FILENAME writes the type-information for all declared definitions in an internal format to FILENAME.

-types FILENAME loads and registers the type-information in FILENAME, which should be a file generated through a previous use of -emit-type-file.

If library code is used with require-extension or (declare (unit ...)) and a .types file of the same name exists in the extension repository path, then it is automatically consulted. This allows code using these libraries to take advantage of type-information for library definitions.
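A sketch of how the two options above might be used together. The file names mylib.scm, mylib.types and client.scm are hypothetical, and the csc invocations in the comments are assumptions about a typical setup, not commands taken from this page.

```scheme
;; mylib.scm (hypothetical library file)
;; Possible compilation, writing the declared types to a file:
;;   csc -s mylib.scm -emit-type-file mylib.types
;; CHICKEN 4 needs no import for `:`; on CHICKEN 5 the assumption is that
;; (import (chicken type)) must be added first.

(: clamp (fixnum fixnum fixnum -> fixnum))
(define (clamp lo hi x)
  (max lo (min hi x)))

(print (clamp 0 10 42))   ; prints 10

;; A hypothetical client.scm that calls clamp could then be compiled with
;;   csc client.scm -types mylib.types
;; so that calls to clamp are checked against the declaration above.
```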

Note that procedure-definitions in dynamically loaded code that was compiled with -strict-types will not check the types of their arguments, which results in unsafe code. Invoking such procedures with incorrectly typed arguments will result in undefined program behaviour.

Optimizations done by specialization

If argument types are known, then calls to known library procedures are replaced with non-checking variants (if available). Additionally, procedure checks can be omitted in cases where the value in operator position of a procedure call is known to be a procedure. Performance results will vary greatly depending on the nature of the compiled code. In general, specialization will not make code that is compiled in unsafe mode any faster: compilation in unsafe mode will omit most type checks anyway. But specialization can often improve the performance of code compiled in safe (default) mode.

Specializations can also be defined by the user:

define-specialization

[syntax] (define-specialization (NAME ARGUMENT ...) [RESULTS] BODY)

Declares that calls to the globally defined procedure NAME with arguments matching the types given by ARGUMENT ... should be replaced by BODY (a single expression). NAME should have a declared type, for example by using : (this is currently not checked). Each ARGUMENT should be an identifier naming the formal parameter or a list of the form (IDENTIFIER TYPE); in the former case, the argument specializes on the * type. If given, RESULTS (which follows the syntax given above under "Type syntax") narrows the result type(s) if it differs from the result types previously declared for NAME. User-defined specializations are always local to the compilation unit in which they occur and cannot be exported. When encountered in the interpreter, define-specialization does nothing and returns an unspecified result.

Note that the exact order of specialization application is not specified and nested specializations may result in not narrowing down the result types to the most specific type, due to the way the flow-analysis is implemented. It is recommended to not define "chains" of specializations where one variant of a procedure call is specialized to another one that is intended to specialize further. This can not always be avoided, but should be kept in mind.

Note that the matching of argument types is done "exactly". This means, for example, that an argument type specialized for list will not match null: even though null is a subtype of list and will match during normal flow-analysis, we want to be able to control what happens when a procedure is called with exactly a list argument. To handle the case when it is called with a null argument, define another specialization for exactly that type or use an (or ...) type-specifier.

There is currently no way of ensuring specializations take place. You can use the -debug o compiler option to see the total number of specializations performed on a particular named function call during compilation.
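A minimal sketch of user-defined specializations, assuming a hypothetical generic procedure size. When the compiler knows that the argument is a string or a vector, the call may be replaced by the direct length operation; otherwise the general definition is used.

```scheme
;; CHICKEN 4 needs no import for these forms; on CHICKEN 5 the assumption is
;; that (import (chicken type)) must be added first.

;; Hypothetical generic procedure with a declared type.
(: size (* -> fixnum))
(define (size x)
  (cond ((string? x) (string-length x))
        ((vector? x) (vector-length x))
        (else (length x))))

;; Each BODY is a single expression that may refer to the named argument.
(define-specialization (size (x string)) (string-length x))
(define-specialization (size (x vector)) (vector-length x))

(print (size "abc") " " (size (vector 1 2)) " " (size '(1 2 3 4)))
;; prints 3 2 4
```

The printed result is the same whether or not the specializations are applied; only the code generated for the calls differs.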

compiler-typecase

[syntax] (compiler-typecase EXP (TYPE BODY ...) ... [(else BODY ...)])

Evaluates EXP and executes the first clause which names a type that matches the type inferred during flow analysis as the result of EXP. The result of EXP is ignored and should be a single value. If a compiler-typecase form occurs in evaluated code, or if it occurs in compiled code but specialization is not enabled, then it must have an else clause which specifies the default code to be executed after EXP. If no else clause is given and no TYPE matches, then a compile-time error is signalled.
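A small sketch with a hypothetical procedure literal-kind. Which clause is selected depends on how the code is run: compiled with specialization enabled, the string clause is expected to match, while in the interpreter the else clause is used.

```scheme
;; CHICKEN 4 needs no import for compiler-typecase; on CHICKEN 5 the
;; assumption is that (import (chicken type)) must be added first.

;; Hypothetical procedure: dispatches on the inferred type of a literal.
(define (literal-kind)
  (compiler-typecase "abc"
    (string 'string)
    (fixnum 'fixnum)
    (else 'unknown)))

;; Result is either the symbol string or the symbol unknown,
;; depending on whether flow analysis was performed.
(print (literal-kind))
```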

Caveats

Assignments make flow-analysis much harder and remove opportunities for optimization. Generally you should avoid heavy use of mutation, both of local variables and of data held in local variables. Mutations that violate type-declarations may even make your code do unexpected things.

Note that using threads which modify local state makes all type-analysis pointless.
