Tuesday, September 22, 2009

Network Development in PLT: Overview

One of the nice things about PLT Scheme is that you can write network apps in native Scheme, rather than having to use the FFI to interface with C networking libraries. The native approach works with PLT's threads, whereas a blocking FFI call blocks all PLT threads (and effectively makes network access synchronous). And according to Geoffrey, the FFI also has the disadvantage that you have to worry about whether it expects 32-bit or 64-bit pointers.

The challenge with network development, though, is that we seldom have to do it, so it's hard to know all of its ins and outs. This series of posts will provide a tutorial on network development. The goal is to keep the principles as general as possible so they apply to other programming languages, but obviously this is a Scheme blog.

General Concept

We'll quickly go over the basic network architecture concepts for the sake of completeness.

In general, networking involves clients and servers interacting by sending information to each other and (if necessary) waiting for responses from the other party.

Clients are programs that send out requests, and servers are programs that "fulfill" the requests (and possibly send back a response). A program can be both a client and a server at the same time.

The information they send each other is serialized to bytes, which the receiver reconstructs into a meaningful representation that it can interpret.

Depending on the nature of the work, clients and servers might only need to exchange information once and be done (HTTP is one such protocol). Sometimes they need to communicate back and forth to accomplish the task (SMTP is one such protocol), and in such situations the client and the server might need to keep track of state in order to manage the work.
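To make the serialize/reconstruct round trip concrete, here is a minimal sketch using `write` and `read` over byte ports. The helper names `message->bytes` and `bytes->message` and the message shape are made up for illustration; a real protocol defines its own wire format.

```scheme
#lang scheme/base

;; Sender side: turn a structured message into bytes.
;; The message '(set "greeting" 5) is a made-up example, not a real protocol.
(define (message->bytes msg)
  (let ([out (open-output-bytes)])
    (write msg out)
    (get-output-bytes out)))

;; Receiver side: reconstruct the message from the bytes it received.
(define (bytes->message bstr)
  (read (open-input-bytes bstr)))

(define wire-bytes (message->bytes '(set "greeting" 5)))
(define received (bytes->message wire-bytes))
```

Here `wire-bytes` is what would travel over the connection, and `received` is the value the other side rebuilds from it.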

So, at a high level, we need to focus on the following in order to do network development:

  • manage the connection (initiation, termination, etc.)
  • serialize and send information to the other party
  • receive information from the other party and interpret the meaning (and act accordingly)
  • track and manage the state if the protocol requires it

The above applies to all network development. The specific details and the associated complexity come down to the specific protocol you are implementing.
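As a rough illustration of the last point, state tracking often boils down to a small state machine. The sketch below assumes a hypothetical three-command protocol (HELLO, DATA, QUIT) invented for this example; `next-state` and `run-commands` are illustrative names, not part of PLT.

```scheme
#lang scheme/base

;; Hypothetical protocol: HELLO must come first, then any number of DATA
;; commands, then QUIT. The state is one of 'start, 'greeted, or 'closed.
(define (next-state state command)
  (cond
    [(and (eq? state 'start) (equal? command "HELLO")) 'greeted]
    [(and (eq? state 'greeted) (equal? command "DATA")) 'greeted]
    [(and (eq? state 'greeted) (equal? command "QUIT")) 'closed]
    [else (error 'next-state "unexpected ~a in state ~a" command state)]))

;; Run a list of commands through the state machine, starting from 'start.
(define (run-commands commands)
  (let loop ([state 'start] [remaining commands])
    (if (null? remaining)
        state
        (loop (next-state state (car remaining)) (cdr remaining)))))
```

Calling `(run-commands '("HELLO" "DATA" "QUIT"))` walks through all three states, while a command that arrives out of order raises an error instead of being silently accepted.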

Let's take a look at what PLT offers to help us handle each of the needs.

Network Connection Management 

Out of the box, PLT offers network programming over TCP, UDP, and SSL. You can require them into your module:

(require scheme/tcp scheme/udp openssl)

We'll focus on just TCP network development in this tutorial since most network protocols will be TCP-based.

Initiate Connections 

If you are developing a client, you can initiate a client connection with:

(tcp-connect <host> <port>) ;; => input-port? output-port?

The function returns one input-port and one output-port for the communications.  If you need to send info to the server, write to the output-port; conversely, if you need to receive information from the server, read from the input-port.

If you are developing a server, then you can initiate a server with the following:

(define listener (tcp-listen <port>))
(tcp-accept listener) ;; => input-port? output-port?

tcp-listen sets the server up to "listen" on the port for connections, and tcp-accept blocks until a client connects, at which point an input-port and an output-port are returned, just like with tcp-connect. In this case, if you need to send info to the client, write to the output-port, and read from the input-port to receive information from the client.
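Putting the client and server calls together, the following is a minimal sketch of one round trip in a single program. It assumes an arbitrary local port (8765) and runs the server side in its own PLT thread so that tcp-accept does not block the client; the extra tcp-listen arguments are the backlog size, the reuse flag, and the hostname to bind to.

```scheme
#lang scheme/base
(require scheme/tcp)

;; Arbitrary port chosen for this sketch; #t allows rebinding on reruns.
(define port 8765)
(define listener (tcp-listen port 4 #t "127.0.0.1"))

;; Server side: accept one connection in its own thread and echo one line back.
(define server
  (thread
   (lambda ()
     (let-values ([(in out) (tcp-accept listener)])
       (write-string (read-line in) out)
       (newline out)
       (flush-output out)
       (close-input-port in)
       (close-output-port out)))))

;; Client side: connect, send one line, and read the reply.
(define-values (in out) (tcp-connect "127.0.0.1" port))
(write-string "ping\n" out)
(flush-output out) ; output is buffered, so flush before waiting for a reply
(define reply (read-line in))
(close-input-port in)
(close-output-port out)

(thread-wait server)
(tcp-close listener)
```

Note the flush-output calls: the output ports are buffered, so without flushing, the other side may wait forever for data that has not left the buffer yet.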

Closing Connections 

Since the TCP connections for both clients and servers are abstracted into ports, closing the ports closes the connection.

Only one side needs to close the connection to actually terminate it, of course, and this means that you'll have to handle situations where the connection is terminated unexpectedly by capturing exn:fail:network.  This can occur quite frequently in practice since many networks can experience outages, the server can be overwhelmed, etc.
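A minimal sketch of catching exn:fail:network with with-handlers follows. The helper `try-connect` is a made-up name for illustration; it only covers the connection attempt, but the same handler shape can wrap reads and writes on an established connection.

```scheme
#lang scheme/base
(require scheme/tcp)

;; Hypothetical helper: returns #t if a connection could be opened,
;; or #f if the attempt raised a network error (refused, unreachable, etc.).
(define (try-connect host port)
  (with-handlers ([exn:fail:network? (lambda (e) #f)])
    (let-values ([(in out) (tcp-connect host port)])
      (close-input-port in)
      (close-output-port out)
      #t)))
```

In a real client the handler would more likely log the error, retry, or clean up protocol state rather than just return #f.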

To fully terminate the server (i.e. disassociate it from the port), you'll need to do the following:

(tcp-close listener)

PLT offers a function called tcp-abandon-port that allows you to close one port without closing the underlying connection (and keep the other port open). This comes in handy in a single request/response model (or for the last request/response in a multi request/response model), where you no longer need to accept input from or write output to the other party.
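Here is a small sketch of that pattern, again with client and server in one program and an arbitrary local port (8767). The client sends its last request, abandons the output port, and can still read the response from the input port.

```scheme
#lang scheme/base
(require scheme/tcp)

;; Arbitrary port chosen for this sketch.
(define port 8767)
(define listener (tcp-listen port 4 #t "127.0.0.1"))

;; Server side: read one request line, answer it, then close both ports.
(define server
  (thread
   (lambda ()
     (let-values ([(in out) (tcp-accept listener)])
       (let ([request (read-line in)])
         (write-string (string-append "got " request "\n") out))
       (close-output-port out)
       (close-input-port in)))))

;; Client side: send the last request, then give up the output port.
(define-values (in out) (tcp-connect "127.0.0.1" port))
(write-string "last request\n" out)
(flush-output out)
(tcp-abandon-port out) ; done writing; the input port stays usable
(define response (read-line in))
(close-input-port in)

(thread-wait server)
(tcp-close listener)
```

The request is flushed explicitly before the port is abandoned, so nothing is left sitting in the output buffer.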

In Between 

The above applies to every network app: all network apps need to initiate and terminate connections. What happens in between is completely dependent on the protocol. But there are of course commonalities among the protocols:

  • most protocols fit the request/response model - request, then response (more complex protocols will have the server send *notifications* that weren't initiated by the client)
  • many protocols are line oriented - the commands are terminated by CRLF
  • many protocols are text oriented - numbers are written as decimal text rather than as two's-complement binary
  • binary protocols mostly use network byte order, which is big-endian

Here's where you will use all of the read and write related functions a lot. And for more complex protocol logic you might even find writing lexers and parsers helpful. For us to know exactly what we would be doing here though, we need to have an example protocol.
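To give a feel for the text versus binary distinction, here is a minimal sketch that encodes the same number both ways. The LENGTH command is made up for illustration and does not belong to any real protocol.

```scheme
#lang scheme/base

;; Line-oriented, text-oriented: a command terminated by CRLF,
;; with the number written out in decimal. LENGTH is a made-up command.
(define text-command
  (string->bytes/utf-8
   (string-append "LENGTH " (number->string 258) "\r\n")))

;; Reading it back: 'return-linefeed makes read-line treat CRLF as the terminator.
(define parsed-line
  (read-line (open-input-bytes text-command) 'return-linefeed))

;; Binary: the same number as a 4-byte unsigned integer in big-endian order.
;; Arguments: number, size in bytes, signed?, big-endian?.
(define binary-length (integer->integer-bytes 258 4 #f #t))

;; Decoding takes the same signed? and big-endian? flags.
(define decoded-length (integer-bytes->integer binary-length #f #t))
```

Since 258 is 0x0102, the big-endian encoding puts the high-order bytes first: 0, 0, 1, 2.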

For that, we'll try to write a memcached client. We'll start with that in the next post. Stay tuned.
