Caveats about Filesystems and the Driver
Since each filesystem behaves differently, it is difficult to guarantee particular performance characteristics for the driver. For now, the caveats are:
- No transactional support - building transactional support on top of a bare filesystem takes more effort, and it is something we will not tackle for now
- No atomic writes on Windows filesystems - Windows does not fully support atomic rename (a program that has the file open holds a lock on it, so the rename fails with an error), so we cannot guarantee that writes will succeed on Windows. Under Unix variants we can guarantee atomic writes
- SQL statement mappings - for now we will not support SQL statements
- Prepared statements - since we are not supporting complex SQL mappings, there is no reason to have prepared statement capabilities; i.e. the prepare call will be a no-op.
What the driver will do:
- manage all files within a single directory as the database data
- open files by path and return their contents
- list the files in a particular directory (within the base directory)
- save data against a particular path (either an "insert" or an "update" operation), atomically on Unix platforms (this might cause errors on Windows, given the limitation described above)
- delete a particular file
- delete a particular directory
- create a particular directory (even when the intermediate directories do not exist yet)
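The atomic save mentioned in the list is usually built on rename: write the new contents to a temporary file, then rename it over the target. The following is only a minimal sketch of that idea, not the driver's actual code. The name save-atomically! is hypothetical, the code is written against racket/base (the successor of scheme/base), and it assumes target is a complete path whose directory already exists.

```racket
#lang racket/base
(require racket/file)

;; Hypothetical helper, not part of bzlib/dbi: write DATA to a temporary
;; file next to TARGET, then rename the temporary file over TARGET.
(define (save-atomically! target data)
  (define-values (dir name must-be-dir?) (split-path target))
  ;; Keep the temporary file in the target's directory so that the rename
  ;; never has to cross a filesystem boundary.
  (define tmp (make-temporary-file "save-~a.tmp" #f dir))
  (with-handlers ((exn:fail? (lambda (e)
                               ;; Clean up the temporary file, then re-raise.
                               (when (file-exists? tmp) (delete-file tmp))
                               (raise e))))
    (call-with-output-file tmp
      (lambda (out) (write-bytes data out))
      #:exists 'truncate)
    ;; #t allows replacing an existing target; this covers both the
    ;; "insert" and the "update" case.
    (rename-file-or-directory tmp target #t)))
```

On Unix the rename replaces the target in one step, so a reader sees either the old contents or the new ones. On Windows the rename can fail while another program has the target open, which is the limitation described in the caveats.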
Connect, Disconnect, Prepare, and Transaction Handling
Since we want to have a directory representing the root of the database, our database connection is really the root directory:
#lang scheme/base
(require (planet bzlib/base)
         (planet bzlib/dbi))

(define (file-connect driver path)
  (assert! (directory-exists? path)) ;; assert! comes from bzlib/base
  (make-handle driver path (make-immutable-hash-registry) 0))
Disconnect is even more straightforward, since there aren't any external resources that have to be released:
(define (file-disconnect handle)
  (void))
And since prepare is out of scope, it is also a NOOP:
(define (file-prepare handle stmt)
  (void))
Furthermore, transaction support is also out of scope, so we have more NOOPs:
(define (file-begin handle)
  (void))
(define (file-commit handle)
  (void))
(define (file-rollback handle)
  (void))
The default transaction functions will not suffice here, since they issue the corresponding SQL statements against the handle.
Assuming we have the corresponding file-query defined, we now have a complete driver with:
(registry-set! drivers 'file
               (make-driver file-connect
                            file-disconnect
                            file-query
                            file-prepare
                            file-begin
                            file-commit
                            file-rollback))
Now we just need to flesh out file-query, which is the meat of the driver.
List the Files In a Path
Let's do it one step at a time, and the first step is to list the files in a path. Keep in mind that the path is a *path* within the root directory, and since we have control over the specifications, let's make the path appear as an absolute path with Unix syntax.
Example - let's say we want to check the files located at path /foo/bar, and the root directory of the database is /var/bzlib/data/; then the combined path should be /var/bzlib/data/foo/bar. The call through query would then look like:
(query handle 'list `((path . "/foo/bar"))) ;; notice it's a symbol
This should then return a list of the paths within /foo/bar (let's say there are 3 files: abc.txt, def.txt, and ghi.txt), which would look like:
'("/foo/bar/abc.txt" "/foo/bar/def.txt" "/foo/bar/ghi.txt")
Note the returned paths are also absolute paths within the database - yes, this filesystem database does not work with relative paths.
All right - the following code will satisfy our list needs so far:
(require (planet bzlib/file))
(define (file-query handle stmt (args '()))
  (define (path-helper path)
    (if (not path)
        (handle-conn handle)
        (build-path* (handle-conn handle) path)))
  ;; directory-list returns bare entry names, so join each name with its
  ;; directory before converting it back into a database path
  (define (convert-path dir name)
    (relative-abs-path (handle-conn handle) (build-path dir name)))
  (case stmt
    ((list)
     (let ((path (path-helper (assoc/cdr 'path args))))
       (if (directory-exists? path)
           (map (lambda (name) (convert-path path name))
                (directory-list path))
           #f)))
    (else
     (error 'file-query "unknown statement: ~a" stmt))))
The not-yet-released package bzlib/file contains utility functions for manipulating paths and files on top of scheme/path and scheme/file, a couple of which are introduced here:
- build-path* - used to build paths in a similar fashion to build-path, except that the trailing segments can themselves be full paths instead of individual segments (i.e. the following is legal for build-path*):
(build-path* "/var/data" "/abc/def/ghi" "/foo/bar") ;; => "/var/data/abc/def/ghi/foo/bar"
- relative-abs-path - used to return the relative path against a base in the absolute form we specified above:
(relative-abs-path "/var/data/" "/var/data/abc/def/ghi") ;; => "/abc/def/ghi"
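Since bzlib/file is not released yet, here is a rough sketch of how two helpers with the behavior described above could be written using only the standard library. These are not the actual bzlib/file implementations. The names join-under and path-under-root are hypothetical stand-ins, the code is written against racket/base, and it assumes Unix-style paths that contain no "." or ".." elements.

```racket
#lang racket/base
(require racket/list racket/path racket/string)

;; Hypothetical stand-in for build-path*: every segment after BASE may be a
;; whole "/"-separated path, and a leading "/" does not reset the result.
(define (join-under base . segments)
  (apply build-path base
         (append-map (lambda (seg) (string-split seg "/" #:repeat? #t))
                     segments)))

;; Hypothetical stand-in for relative-abs-path: drop ROOT's elements from the
;; front of FULL and render the remainder as a "/"-prefixed string.
(define (path-under-root root full)
  (let loop ((r (explode-path root)) (f (explode-path full)))
    (cond ((null? r)
           (string-append "/" (string-join (map path->string f) "/")))
          ((and (pair? f) (equal? (car r) (car f)))
           (loop (cdr r) (cdr f)))
          (else
           (error 'path-under-root "~a is not inside ~a" full root)))))

;; On a Unix system:
(path->string (join-under "/var/data" "/abc/def/ghi" "/foo/bar"))
;; => "/var/data/abc/def/ghi/foo/bar"
(path-under-root "/var/data/" "/var/data/abc/def/ghi")
;; => "/abc/def/ghi"
```

Note that this sketch does nothing about ".." segments, so a path such as "/../etc" would escape the root; a real driver would want to reject those.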
By default, bzlib/dbi does not enforce any sort of return format, so you can simply return the results as you see fit. However, if you wish to use the query helper functions such as rows, cell/false, etc., you'll need to ensure the data are returned as a list of rows, where each row is also a list of cells (the cells can be anything, with the Scheme null as the database NULL), and the first row is the list of column names. Using them is optional, but you'll need to inform your users whether your driver works with those helper functions.
Of course, you can have your cake and eat it too if you provide two separate drivers: the first driver does not work with the query helper functions, while the second extends the first by wrapping the query results and converting them into the recordset format. For our current example we can do the following:
(define (file-recordset-query handle stmt (args '()))
  (let ((value (file-query handle stmt args)))
    (case stmt
      ((list)
       (if (not value) value
           (map list (cons "path" value)))))))
(registry-set! drivers 'file/rs
               (make-driver file-connect
                            file-disconnect
                            file-recordset-query
                            file-prepare
                            file-begin
                            file-commit
                            file-rollback))
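To make the recordset shape concrete, the snippet below applies the same wrapping to the three-file listing from the earlier example. It uses plain lists only and calls nothing from bzlib/dbi; the variable names are just for illustration.

```racket
#lang racket/base

;; The raw result of the 'list query from the earlier example.
(define raw '("/foo/bar/abc.txt" "/foo/bar/def.txt" "/foo/bar/ghi.txt"))

;; Prepend the column name, then wrap every entry in its own one-cell row.
(define recordset (map list (cons "path" raw)))
;; recordset =>
;; '(("path")
;;   ("/foo/bar/abc.txt")
;;   ("/foo/bar/def.txt")
;;   ("/foo/bar/ghi.txt"))

(define column-names (car recordset)) ; the first row holds the column names
(define data-rows (cdr recordset))    ; the remaining rows hold the data
```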
Now the 'file/rs driver will work with rows, cell, cell/false, etc.
Reading File(s)
Let's add one more capability in this post: reading files based on the paths passed in. To make it interesting, we'll take in multiple paths, so something like this:
(query handle 'open `((path . "/abc.txt") (path . "/def.txt") (path . "/ghi.txt")))
;; => (listof bytes?)
The following accomplishes the goal:
(define (file-query handle stmt (args '()))
  (define (path-helper path)
    (if (not path)
        (handle-conn handle)
        (build-path* (handle-conn handle) path)))
  ;; directory-list returns bare entry names, so join each name with its
  ;; directory before converting it back into a database path
  (define (convert-path dir name)
    (relative-abs-path (handle-conn handle) (build-path dir name)))
  ;; collect every value stored under the 'path key, in order
  (define (get-paths)
    (map path-helper (map cdr (filter (lambda (kv)
                                        (equal? (car kv) 'path))
                                      args))))
  (case stmt
    ((list)
     (let ((path (path-helper (assoc/cdr 'path args))))
       (if (directory-exists? path)
           (map (lambda (name) (convert-path path name))
                (directory-list path))
           #f)))
    ((open)
     (map file->bytes (get-paths)))
    (else
     (error 'file-query "unknown statement: ~a" stmt))))
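The reason the open branch filters the arguments itself, rather than doing a single lookup, is that the same key appears several times. The sketch below shows the difference using only racket/base. It assumes that a standard first-match lookup such as assq is a fair comparison for a single-key lookup, and all-values is a hypothetical name.

```racket
#lang racket/base

(define args '((path . "/abc.txt") (path . "/def.txt") (path . "/ghi.txt")))

;; A first-match lookup stops at the first pair with the key ...
(define first-path (cdr (assq 'path args)))
;; first-path => "/abc.txt"

;; ... so collecting every value needs a pass over the whole list.
(define (all-values key alist)
  (map cdr (filter (lambda (kv) (equal? (car kv) key)) alist)))

(define every-path (all-values 'path args))
;; every-path => '("/abc.txt" "/def.txt" "/ghi.txt")
```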
And to make it work for the 'file/rs driver, we should update it correspondingly:
(define (file-recordset-query handle stmt (args '()))
  (let ((value (file-query handle stmt args)))
    (case stmt
      ((list)
       (if (not value) value
           (map list (cons "path" value))))
      ((open)
       (map list (cons "content" value))))))
Now our drivers will read in all of the file contents as bytes and return them. Stay tuned for the addition of other features...