Thursday, September 17, 2009

Create a Driver for bzlib/dbi (1) - DBI Internals

Since we claimed bzlib/dbi as an extensible abstract database interface for PLT Scheme, let's take a look on how to extend it.  As an exercise we'll create a database driver that uses the filesystem as the database. 

Motivation 

What we want is to use the filesystem as a simple form of database, where the path of the file is the key, and the content is the value.  We should be able to select the data, update, delete, and insert new data.  Let's see what we can come up with.

Overview of Creating an Database Driver

The first step is to require bzlib/base (contains bzlib/base/registry) and bzlib/dbi (contains the interface that we need to extend):

#lang scheme/base
(require (planet bzlib/base) (planet bzlib/dbi)) 

Then we need to create four functions for the driver that matches the following signature:

(connect driver . args) ;; a connect function that takes in the database driver definition, and a variable list of arguments
(disconnect handle) ;; a disconnect function that takes in the database handle
(prepare handle statement) ;; a prepare function that takes in the database handle and a statement
(query handle statement args) ;; a query function that takes in the database handle, the statement (or the key to the prepared statement, and the args as a key/value plist

The above four functions are sufficient for creating a database driver, and if we need to provide custom transaction functions, we will customize three additional functions:

(begin-trans handle) ;; to provide the begin transaction clause for the database handle
(commit handle) ;; to provide the commit transaction clause for the database handle
(rollback handle) ;; to provide the rollback transaction clause for the database handle 

Overriding the transaction functions is an optional step as you can use the default version, which are named default-begin, default-commit, and default-rollback.  All of the default version basically issues the corresponding SQL statements to the underlying database connection.

Then we just need to register the functions as a database driver:

 
;; we'll explain meaning of registry-set! and the driver structure later 
(registry-set! drivers <driver-name-symbol> 
          (make-driver <connect-function>
                       <disconnect-function>
                       <query-function>
                       <prepare-function>
                       <begin-trans-function or default-begin>
                       <commit-trans-function or default-commit>
                       <rollback-trans-function or default-rollback>))

Voila - we now have a new database driver.   Let's say we pass in 'file as the <driver-name-symbol, then we can use the driver with the following:

#lang scheme/base
(require (planet bzlib/dbi) <your-driver-module-path>)
(define handle (connect 'file <arg1> ...)) 

The above form the basis of writing a database driver.

Note - Since bzlib/dbi wraps around the calls to the underlying driver, its full behavior is largely determined by the underlying driver.  For example - the driver might not support SQL queries, transactions, or even atomicities.  Once our example is fully fleshed out, we'll see the constraint of the underlying driver's capability have on the bzlib/dbi.

Before we proceed to write the driver, we should finish defining the definiton of driver, handle, etc., so we will know how to use them in our custom driver:

Database Driver Internals

The database driver structure consists of the connect, disconnect, query, prepare, begin-trans, commit, and rollback function we passe in:

(define-struct driver (connect disconnect query prepare begin commit rollback))  

And the make-driver function above shows its usage.  And the driver is the first parameter passed into your custom connect function, which should return a database handle structure.

The database handle structure consists of four properties: the underlying database driver, the actual database connection object (can be anything), a registry (an extensible dictionary) of the prepared queries, and a mutable transaction count:

(define-struct handle (driver conn query (trans #:mutable))) 

Registry - Extensible Dictionary 


Registry is an extensible dictionary that holds other dictionary objects called registry-table (it can be anything resembling key/value storage, even if the keys are not unique) in bzlib/base/registry, and has the following interface:

(define-struct registry (get set del (table #:mutable)))  
(registry-ref registry key default) ;; returns the underlying registry value or default (just a value, not a thunk)
(registry-set! registry key value) 
(registry-del! registry key) 

And by default there are three different registries provided by bzlib/base/registry:

(make-hash-registry) ;; returns a registry with a mutable hash as the underlying registry table
(make-immutable-hash-registry) ;; returns a registry with an immutable hash as the registry table
(make-cond-registry) ;; returns a registry with a list of pair of procedures as the underlying registry table - useful for constructing extensible conditional statement. 

By default, make-immutable-hash-registry is probably most useful in crafting database driver.

Prepared Queries and Placeholder Queries 


The handle registry holds a list of prepared queries, which holds the query and the "prepared" version of the query:

;; query - holds the phq - placeholder query object
;; inner - holds the wrapped prepared object (or anything else that represents the actual prepared statement) 
(define-struct prepared (query inner)) 

The prepared-query field holds a parsed version of the string query called phq (stands for placeholder-query), which holds the following fields:

;; query - the original raw query
;; converted - the converted version of the query - generally converting from named placeholder to ordinal placeholder
;; args - a list of the named placeholder so we can map the named arguments into their ordinal positions 
(define-struct phq (query converted args))  

The prepared structure is designed to wrap around existing prepared statement objects that schematics/spgsql and jaz/mysql packages provide, but you can of course extend the structure to hold other values as you see fit, as long as your driver code tie all of the above together correctly.  The same goes for the rest of the structures.

In the next installment we'll talk about how to tie all of them together in our quest for a simple filesystem-based database driver.  Stay tuned.

No comments:

Post a Comment