Wiki
Download
Manual
Eggs
API
Tests
Bugs
show
edit
history
You can edit this page using
wiki syntax
for markup.
Article contents:
== Allocating C Structures Under the Control of the CHICKEN GC [[toc:]] This tip describes how to allocate C structures such that they will be under the control of the CHICKEN garbage collector. CHICKEN has an unusual garbage collection scheme, roughly following the procedure described in [[http://home.pipeline.com/~hbaker1/CheneyMTA.html|Cheney on the MTA, Part II]]. Please keep in mind the manual sections [[/manual/Data representation|Data representation]] and [[/manual/Interface to external functions and variables|Interface to external functions and variables]] while you read this. The code in this section will be a mix of C and Scheme; recall that you can embed C code directly in your Scheme programs using {{(foreign-declare CODE-STRING)}}. === CHICKEN GC Basics The CHICKEN GC expects all non-immediate objects to have a "tag" value in their first {{sizeof(C_header)}} bytes. This tag stores three pieces of information: * The type of the object (i.e. {{C_STRING_TYPE}}, {{C_VECTOR_TYPE}}, etc). * Various GC tags (one which is relevant to us is {{C_BYTEVECTOR_BIT}}). * The size of the object (in bytes if it's a bytevector type, or in words if it's not). The manual section [[/manual/Data representation|Data representation]] explains this. === Allocating Atomic Data "Atomic" data are data which contain no pointers. For example, suppose we want to have the following C structure definition for point-mass bodies in a three-dimensional gravitational simulation: <enscript highlight=c> typedef struct { C_header tag; double m; double q[3]; double p[3]; } body; </enscript> Note that a {{body}} structure contains no pointers. We ''could'' use a declaration like <enscript highlight=scheme> (define-foreign-record body (scheme-object tag) (double m) (double q 3) (double p 3)) </enscript> to represent this object within CHICKEN. Unfortunately, if we do that, the procedures {{body-m}}, {{body-q}}, etc, will expect to have foreign pointer objects which contain a pointer to a {{body}} object. But, this body object must be allocated outside the GC-managed heap, because the pointer in a foreign-pointer is not scanned during garbage collection. To manage it, we must either set a finalizer for the pointer object which contains it (using {{set-finalizer!}}) or deallocate it by hand. Neither of these is an attractive option for lots of data (maybe we're running a [[http://www.mpa-garching.mpg.de/galform/millennium/|Big Simulation]]). So, we'll have to do things by hand. First, let's define the body tag. Since bodies will be fixed-size bytevectors, we'll use <enscript highlight=c> static const C_header BODY_TAG = (sizeof(body) - sizeof(C_header)) | C_BYTEVECTOR_BIT | C_STRING_TYPE; </enscript> This stores the size of the data in a body struct (minus the tag size), sets the bit which tells the GC that we're dealing with a bytevector, and gives a body the (arbitrary) scheme type {{string}}. Now we define a {{make-body}} Scheme procedure, which takes a mass, f64vector position, and f64vector momentum and produces a body: <enscript highlight=scheme> (define make-body (foreign-primitive scheme-object ((double m) (f64vector q) (f64vector p)) #<<END body result; result.tag = BODY_TAG; result.m = m; memcpy(result.q, q, 3*sizeof(double)); memcpy(result.p, p, 3*sizeof(double)); C_return (&result); END )) </enscript> Note the return type of {{scheme-object}}, since a {{body}} is a scheme object. Note also, that we allocate {{result}} on the stack, and "return" the address of {{result}} at the end of the function. (The {{foreign-primitive}} actually never returns; it calls its continuation with the return value.) We can define a {{body?}} predicate. This predicate is not exact---it cannot distinguish between bodies and strings of the same length---but there is no way to make it exact with CHICKEN's memory model as it stands now. <enscript highlight=c> static C_word body_p(C_word obj) { if (C_immediatep(obj)) { return C_SCHEME_FALSE; } else if (C_block_header(obj) == BODY_TAG) { return C_SCHEME_TRUE; } else { return C_SCHEME_FALSE; } } </enscript> and <enscript highlight=scheme> (define body? (foreign-lambda scheme-object "body_p" scheme-object)) </enscript> Defining accessors is similarly straightforward. === Allocating non-atomic data "Non-atomic" data are data which contain only pointers. Suppose we have a binary tree datastructure which can store bodies from the above example (contrived, I know, but it should illustrate the technique): <enscript highlight=c> struct btree_struct { C_header tag; body *b; struct btree_struct *left; struct btree_struct *right; }; typedef struct btree_struct btree; </enscript> Now we define a tag which indicates that this is a vector-like object (in fact, it will be a Scheme vector), and its size: <enscript highlight=c> static const C_header BTREE_TAG = ((sizeof(btree) - sizeof(C_header)) / sizeof(C_word)) | C_VECTOR_TYPE; </enscript> Note that, for non-bytevector objects, the size part of the header is the size in {{C_word}}s, not bytes. From here, things follow the examples in the last section. === Allocating mixed data You can't. Sorry. Break your data up into one atomic struct which stores all the non-pointers, and then a "pointer" struct which stores a pointer to the atomic data, and all other pointers involved in your datastructure. === Allocating large data This is also not a good idea---it's easy to overflow the stack (which is the nursery in the generational GC that CHICKEN uses). It's non-trivial to allocate some data directly in the second generation. If you absolutely must do this, then have a look at {{C_allocate_vector}} in the {{runtime.c}} file of the CHICKEN distribution. Good luck. === Why Would You Want to do This? ==== Speed CHICKEN's allocation is '''fast'''. Much faster than {{malloc}} and {{free}}---it just requires bumping the stack pointer. So, if you're looking for speed (which you probably are if you're writing C code for CHICKEN) you're much better off using it than trying to {{malloc}} your datastructures and stuff them into CHICKEN's {{pointer}} type. ==== Convenience If you {{malloc}} your datastructures, you'll have to either * Explicitly {{free}} them when you're done OR * Register a finalizer for them using {{set-finalizer!}} The first option is not attractive if you have complicated data with lots of pointers between structures (that's why GC's were invented, after all). The second option is OK for a couple of objects, but CHICKEN doesn't play very well with millions of finalizers registered. If you want to allocate a lot of objects, don't register finalizers for them.
Description of your changes:
I would like to authenticate
Authentication
Username:
Password:
Spam control
What do you get when you subtract 24 from 4?