There are not a lot of choices of relational database interfaces for PLT - part of the reason is just that the community is not yet the size of Perl or Python, so not as many infrastructure packages yet.
However, PLT does have the interfaces for three of the most important open source databases -
postgresql,
mysql, and
sqlite, so your open source database needs are likely addressed. But the challenge is that each sports a different interface, so you couldn't really abstract the database away to the degree that you can in other languages.
sqlid was probably the closest thing for PLT to have a single interface and multiple drivers, but it hasn't appeared to be active for a while, so I decided to take the plunge to create an abstracte database interface so all three packages can function uniformly. Hopefully it can attract collaborations from driver writers.
Intoducing bzlib/dbi - extensible abstract database interface for PLT
Inspired by
Perl's DBI, the package
bzlib/dbi
follows the separation between the interface and the driver, so if you want to use a particular driver, you just need to require:
(require (planet bzlib/dbi)
(planet bzlib/dbd-<driver>))
Currently there are three such drivers available:
All of the four packages are released under LGPL.
To connect to the database, you just need to issue
connect
and pass it the symbol that identifies the driver, along with the arguments expected by the underlying driver. Once you have required
bzlib/dbi
and the appropriate driver, for example, to connect against
schematics/spgsql
:
(define handle (connect 'spgsql '#:server "localhost" '#:port 5432 ...))
And to connect against
jaymccarthy/sqlite
:
(define handle (connect 'jsqlite ':temp:)) ;; creates a temp database
To connect against
jaz/mysql
:
(define handle (connect 'jazmysql "localhost" ...))
Note that if the underlying driver expects keyword arguments, pass in the keyword with a quote, so PLT will treat it as an argument rather than a keyword for
connect
(which will result in an exception, since
connect
has no keyword arguments).
To disconnect, just do:
(disconnect handle)
Query and Named Parameters
The goal of
bzlib/dbi
is to keep the interface as simple as possible, so there is currently only one function that handles the queries:
(query handle "select * from ..." args)
The arguments is a list of pairs of symbol and values. It is currently not optional (so if you have no arguments you would still have to pass in
'()
).
The query statements takes in
named parameteres instead of ordinal parameters, so instead of writing:
select * from table1 where c1 = ? and c2 = ?
You write
select * from table1 where c1 = ?c1 and c2 = ?c2 -- ?c1 & ?c2 are named parameters.
Then you pass in
'((c1 . val1) (c2 . val2))
as the argument.
The nice thing about named parameters is that you do not have to worry about matching the order of the parameters, and you only need to specify the value once if you have to pass in the same value to multiple positions. The name can consist of regular alphanumeric characters, underscores, dashes, and exclamation mark.
bzlib/dbi
currently standardizes on using question mark as the placeholder and translates the query and the parameters to their corresponding version for each of the drivers. This can be customized.
Prepared Statements
bzlib/dbi
also provides the ability to create prepared statements:
;; (prepare handle name-symbol query)
;; example
(prepare handle 'select-table1 "select * from table1 where c1 = ?c1 and c2 = ?c2")
(query handle 'select-table1 `((c1 . ,val1) (c2 . ,val2)))
Basically - the prepared statement is created and kept along with the handle, and identified via the name symbol. And it is run via the same
query
interface by passing in the name symbol (instead of the query) along with the arguments.
Unlike other database interfaces where the prepared object is returned to the caller for passing around,
bzlib/dbi
hides them to 1) simplify the interface, and 2) minimize the chance of error when the same underlying handle is used by multiple threads at once - I settle on this design choice when
running into issues with a prepared statement run by a separate thread where the handle was terminated by its originating thread so you wouldn't have to manage the same object with multiple threads.
Transactions
You can trigger transactions manually with the SQL statement.
(begin-trans handle)
(commit handle)
(rollback handle)
Or you can use the
with-trans
syntax, which wraps the transaction management with exception handling - so if there are errors triggered it would automatically cause rollback of the transaction.
(with-trans (handle1 ...)
exp ...)
You can wrap multiple database handles and if one errors all will be rolled back - this is a simple distributed transaction.
Data Type Mapping
The data type mapping depends on the underlying driver, but the following are the mapping for the general types:
- integer <-> sql integer
- number <-> sql float
- srfi/19 date time <-> sql date/time
- string <-> sql string/text
- bytes <-> sql binary/blobs
- null <-> sql null
For
bzlib/dbd-spgsql
, there is also a mapping for the array type as a list.
The recordset is returned as a list of lists. The first row holds the column names (so there will always be one row in a select query even if there are no results).
Additional Query Helpers
The following provides additional helpers that wraps around
query
to simplify the developer's tasks in certain situations:
exec
- used for non select queries
rows
- strips the column names and just return the rows
row
- returns the first row (and raise error if no rows)
row/false
- returns the first row or #f
if there are no rows
exists?
- a shorthand for row/false
(looks more natural for people used to SQL)
cell
- returns the first cell of the first row or throw an error if there are no rows
cell/null
- returns the first cell of the first row or returns null
cell/false
- returns the first cell of the first row or returns #f
All of the above helpers also takes in an optional converter parameter right after the args parameter, so you can directly process the result before it returns. The converter should be a function that can take in as many arguments as there are columns in the query. The full signature of these helpers look like:
(query-helper handle symbol-or-query key/val-args converter)
Extensible Database Interface
An important principle of the design is to make the database interface extensible, so code can be reused and new database driver can be developed more easily (of course it would still require intimate knowledge with the particular database's protocol). Hence the interface is extensible with the chaining of drivers.
The following drivers are currently available for extending the capability of the three database drivers.
- app - provides a kill-safe interface over the raw database driver
- pool - provides a database connection pool interface over the raw database driver
Kill-Safe App Driver
To make use of the kill-safe app driver, you need to pass in the
'app
driver identifier to
connect
:
(define kill-safe-handle (connect 'app #f <inner-driver-args> ...))
So for example, if you want to make
schematics/spgsql
kill-safe, use the following:
(define handle (connect 'app #f 'spgsql '#:server "localhost" '#:port 5432 ...))
And if you want to do so with
jaymccarthy/sqlite
(which doesn't do much since
jaymccarthy/sqlite
is FFI-based and already serializes all threads):
(define handle (connect 'app #f 'jsqlite "database.db"))
What happens is that
bzlib/dbi
will initiate the app driver, which then initiates the underlying database driver. The
#f
arg is the argument that tells the app driver whether there is a pool driver that helps handle the response. You only need to explicitly pass this value in if you are initiating app driver directly.
Database Connection Pool
The database connection pool driver further builds on top of the app driver and manages the database connections. It works in conjunction with the app driver. For example - to initiate a database pool of maximum 10 database connections:
(define pool-handle (connect 'pool 10 'spgsql '#:server "localhost" '#:port 5432 ...))
You do not need to specify the app driver in this case, since the pool driver does that for you internally.
The pool driver is a bit special in that it does not directly handle the queries, but offers up one of its connections to the caller and let the caller use the connection until the caller thread dies. To retrieve a connection from the pool driver, do:
(query pool-handle 'connect '())
which would return the underlying app driver handle.
The app and pool driver demonstrates how
bzlib/dbi
can be extended. You can certainly extend it further to fulfill your needs (an example of possible future extension would be to create a load balancer handle that sits in front of a database cluster).
TODO
There are a lot more the design of
bzlib/dbi
can accomodate, but one of the important thing to do is to collaborate with the driver writers so there can be a joint effort. Since
bzlib/dbi
is the new kid on the block, the effort will have to come later after people have chance to test out the interface. But now you can switch between the three major database with relative ease.