Tuesday, September 29, 2009

Handling Time Zone in Scheme (3): Computing Offsets

This is the third post of the timezone handling series. Please see the previous posts for additional details:
  1. motivation and overview
  2. parsing zoneinfo database
At this point we are able to parse the zoneinfo database files into a giant structure that contains all of the applicable zones and rules. Let's see how to utilize the data.

There are two main purposes for using the time zone information, the first is to figure out the actual offset for a given zone for a particular date:
(tz-offset (build-date 2009 7 1 ...) "America/Los_Angeles") ;; => -25200 
(tz-offset (build-date 2009 12 1 ...) "America/Los_Angeles") ;; => -28800 
And the second use is to convert between different timezones of the same date:

(tz-convert (build-date 2009 7 1 0 0 0) "America/Los_Angeles" "GMT") 
;; => (date 2009 7 1 7 0 0 0 0) 
(tz-convert (build-date 2009 7 1 0 0 0) "America/Los_Angeles" "Asia/Taipei") 
;; => (date 2009 7 1 15 0 0 0 28800)

It should be clear that tz-convert can be built on top of tz-offset, so we first should figure out how tz-offset to compute the offsets.

Basic Idea
The zoneinfo database structure more or less denotes how we should be making use of the library:
  1. look up the zone - find the zone that you are looking for
  2. based on the zone, map to the correct set of rules by testing which rules is applicable (based on the UNTIL field of the zone value)
  3. apply the set of rules against the date to find the proper daylight saving offset

With that in mind, let's first load the parsed structure into something more useable. Let's convert it into a hashtable loading the zones by name, which in turn holds the references to the applicable rules, off of which we can then use to compute actual offsets. Assuming the zoneinfo database has been serialized to a file, then the following would load the data back in:

(define (make-zoneinfo zi)
  (make-links (make-zones (caddr zi) (make-rules (cadr zi)))
              (cadddr zi)))

(make-zoneinfo (call-with-input-file path read))
Basically - we'll first load the rules, and then pass the rules into make-zones, which will then be passed to make-links to complete building the hashtable.

With the following structure definitions for zone and rule, we have:

(define-struct zone (offset rule format until) #:transparent)

(define-struct rule (from to type month date time offset format) #:transparent)

(define (make-rules rules)
  (make-immutable-hash (map (lambda (kv)
                              (cons (car kv)
                                    (map (lambda (rule)
                                           (apply make-rule rule)) 
                                         (cdr kv))))
                            (cdr rules))))

(define (make-zones zones rules)
  (make-immutable-hash (map (lambda (kv)
                              (cons (car kv)
                                    (map (lambda (zone)
                                           (define (helper offset rule format until)
                                             (make-zone offset (hash-ref rules rule #f) format until))
                                           (apply helper zone))
                                         (cdr kv))))
                            (cdr zones))))

And make-link left-folds over the zones and the links list:


(define (make-links zones links)
  (define (helper zones key rest)
    (if (null? rest) 
        zones
        (helper (hash-set zones (car rest) 
                          (hash-ref zones key)) 
                key (cdr rest))))
  (foldl (lambda (link zones)
           (let ((key (car link)) 
                 (links (map car (cdr link))))
             (if (hash-ref zones key #f)
                 (helper zones key links)
                 zones)))
         zones 
         (cdr links)))

Now we'll be able to refer to a specific zone by (hash-ref zones "America/Los_Angeles"), which will return a list of zone structure that in turn contains the applicable rule lists.

Computing Standard Offset

Now we have the structure, let's try to compute the offsets. The easiest type of offsets to compute is the standard offsets (i.e. without daylight savings applied). All we need to do is to ref the zones, and then figure out which duration that the date falls into, and finally retrieve the offset value:

(define zones (load-zoneinfo)) 

(define (tz-standard-offset date zone-name)
  (define (until/offset-helper until offset)
    (define (until->date year month day hour minute second type)
      (date->seconds (build-date year month day hour minute second #:tz offset)))
    (cons (if (not until) 
              +inf.0
              (apply until->date until))
          offset))
  (define (match-until/offset-helper date u/o)
    (cond ((null? u/o) 
           (error 'tz-standard-offset "invalid zone ~a for date ~a" zone-name date))
          ((<= date (caar u/o))
           (cdar u/o)) 
          (else
           (match-until/offset-helper date (cdr u/o)))))
  (define (helper date zone) 
    (let ((u/o
           (sort 
            (map until/offset-helper
                (map zone-until zone)
                (map zone-offset zone))
            (lambda (u/o1 u/o2) 
              (<= (car u/o1) (car u/o2))))))
      (match-until/offset-helper (date->seconds date) u/o)))
  (helper date (hash-ref zones zone-name))) 
The until/offset-helper computes the list of until (in terms of seconds from epoch) and offset pairs from the zones. Then the match-until/offset-helper will iterate through the pairs to find one where the date is less than the until (and because the until/offset is sorted by the offset incrementally, it'll find the correct zone) value, which will be our offset.

Calculating the Daylight Saving Offset

Calculating the daylight saving offset, unfortunately, is quite a bit more complicated, but the tz-standard-offset gives us a starting point, because it helps us find the applicable zone that we are interested. What we need to do is that instead of returning the until/offset pair, we want to return the until/rules pair.

(define (tz-daylight-saving-offset date zone-name)
  (define (until/rules-helper until offset rules)
    (define (until->date year month day hour minute second type)
      (date->seconds (build-date year month day hour minute second #:tz offset)))
    (cons (if (not until) 
              +inf.0
              (apply until->date until))
          rules))
  (define (match-until/rules-helper date u/r)
    (cond ((null? u/r) 
           (error 'tz-standard-offset "invalid zone ~a for date ~a" zone-name date))
          ((<= date (caar u/r))
           (tz-rules-offset (seconds->date date) (cdar u/r))) 
          (else
           (match-until/rules-helper date (cdr u/r)))))
  (define (helper date zone) 
    (let ((u/r
           (sort 
            (map until/rules-helper
                (map zone-until zone)
                (map zone-offset zone)
                (map zone-rule zone))
            (lambda (u/r1 u/r2) 
              (<= (car u/r1) (car u/r2))))))
      (match-until/rules-helper (date->seconds date) u/r)))
  (helper date (hash-ref zones zone-name)))

The code take the same approach - find the correct rule to apply, and then try to apply the rules against the date (by using tz-rules-offset, which we have not yet written).

Determine the Applicable Rules

Unfortunately, which at a glimpse the rule structure look similar to the zone structures, they are nowhere as easy to work with. For starters, zones nicely form a range of time that we can simply walk through by comparing the dates, but rules do not. Instead, they simply describes when a particular offset will apply by a particular time within a year, but we cannot easy to determine which is the next rule that'll apply. Instead, we must calculate them.

Since each rule has a FROM and TO field that denotes the years for which the rule should be applicable, it's easy to know whether or not a particular rule should be considered:

(define (applicable-rules date rules)
  (define (applicable? rule)
    (<= (rule-from rule) (date-year date) (rule-to rule)))
  (filter applicable? rules)) 
But this alone is not enough. Consider the following two rules for America/Los_Angeles:

  (2007 +inf.0 - 3 (match 0 >= 8) (2 0 0 w) 3600 "D")
  (2007 +inf.0 - 11 (match 0 >= 1) (2 0 0 w) 0 "S")
They specifies that on 2009, the offset of +3600 starts on 3/8/2009, and then the offset of 0 starts on 11/1/2009. So what should be the offset for 1/8/2009? For that we'll have to calculate the applicable rule from the previous year, 2008, which will tell us starting from 11/2/2008, the offset should be 0. This means we'll have to figure out the applicable of the previous year as well (and we should pair the year with the rule so we won't get confused later):

(define (applicable-rules date rules)
  (define (by-year year rules)
    (define (applicable? rule)
      (<= (rule-from rule) year (rule-to rule)))
    (map (lambda (rule)
           (cons year rule)) 
         (filter applicable? rules))) 
  (append (by-year (date-year date) rules)
          (by-year (sub1 (date-year date)) rules))) 

At this moment we have the applicable rules - we'll have to figure out the exact time when the rules applies. For that it means we'll need to be able to convert the rule itself into a particular date structure. We'll talk about that in the next post. Stay tuned.

No comments:

Post a Comment