Database Interface Discussion

Proposed Interface

Core Functionality

These functions should be optimized for speed.

[procedure] (dbi:connect TYPE #!key FILE PORT USER PASS etc)
Connect to a database. TYPE is a symbol, like mysql, postgres, sqlite, etc. Returns a connection object.

Possibly put the type, file, and port info into a URI-type string as JDBC does? That would make the whole thing a lot more extensible; it would be nice to do it both ways, having the separate keyword parameters also. -- vincent

[procedure] (dbi:query CONNECTION SQL)
Query the database (see the mysql egg)
[procedure] (dbi:num-rows CONNECTION)
Return the number of rows produced by the last query (see mysql)
[procedure] (dbi:fetch-row CONNECTION)
Return the next row (see mysql).
[procedure] (dbi:insert-id CONNECTION)
Returns the ID generated by the last insert statement.

I'd like to see support for prepared statements. Especially in transactional environments, on some DBMSs they can result in big performance wins. And a DBMS that doesn't support them can easily implement stubs (e.g., prepare creating a record that holds the original string). Also, another vote for streams. -- vincent

Convenience Functions

These functions should be optimized for convenience in i.e. quick and dirty scripts. Perhaps these types of functions should be their own library on top of DBI? Perhaps there are even more convenient functions we could have?

[procedure] (dbi:query-list CONNECTION SQL)
Returns a list of result rows.
[procedure] (dbi:query-fold CONNECTION PROC SQL SEED)
fold
[procedure] (dbi:query-map CONNECTION PROC SQL)
map
[procedure] (dbi:query-for-each CONNECTION PROC SQL SEED)
for-each

Stream Functions

[procedure] (dbi:query-stream CONNECTION SQL)
Return an SRFI-40/SRFI-41 stream of rows.

Open issues

Row Representation

How are rows returned from a query represented?

Ozzi proposes either an association list or a plain list.

Plain list or vector would be my choice. There should be a separate way of getting the field names. Also, there's the issue of how field values are represented. For those data types that correspond to Scheme values, the obvious conversion would be fine; for other datatypes it may not be so obvious.There also needs to be a blob API, so that one can read a piece of a blob without having to ship the whole thing over the network. -- vincent

Null Representation

How are null values represented? We need an object that corresponds to SQL null. I don't think this object should be nil. -- vincent

Return values

What should be returned from query functions. For example, could we return the insert id of a row instead of having a separate function for that? How about the number of rows in a result?

Type Conversion

Following is a rough comparison of how the existing eggs do this. It needs to be fleshed out some more.

Type Postgresql MySQL Sqlite
TEXT string string string
BYTEA string string -
CHAR char string -
BPCHAR char string -
BOOL boolean string -
INT8 fixnum or inexact string ?
INT4 fixnum or inexact string ?
INT2 fixnum string ?
FLOAT8 inexact string ?
FLOAT4 inexact string ?
ABSTIME unsupported (returned as text) string -
RELTIME unsupported (returned as text) string -
DATE vector string -
TIME vector string -
TIMESTAMP vector string -
TIMESTAMPTZ vector string -
INTERVAL unsupported2 string -
NUMERIC fixnum or inexact string ?
OID fixnum or inexact string -
NULL pg:sql-null-object #f (void)