I am going to describe two current techniques for building web applications and services that produce/consume data either by Request-Response or by Publish-Subscribe.
All web browsers support HTTP, which is inherently a request-response protocol typically built on top of TCP. It is used to serve HTML documents (for human consumption) but can also serve other document types such as raw text, XML, SVG or JSON (JavaScript Object Notation). JSON documents are typically for machine consumption (as are XML and SVG) but with the advantage of being human friendly. HTTP supports a small, fixed set of methods for client-server interaction, of which GET, POST, PUT, and DELETE are the most often used. The addressing scheme used by HTTP is the familiar URL notation which identifies resources uniformly (hence Uniform Resource Locator). The methods are verbs and the resources are nouns; this is an intentional design feature of HTTP.
REST is a term coined by Roy Fielding in his Ph.D. thesis about a decade ago which stands for Representational State Transfer. It is a design philosophy that exploits the technical features of HTTP to build web-based services (a role traditionally filled by Remote Procedure Calls). It has a strong emphasis on the noun-resource and verb-method correlation, which distinguishes it from SOAP and XML-RPC. To design RESTful services, these four questions must be addressed:
- What are the URLs (resources)?
- What is the data format (state representation)?
- What methods are supported at each URL (verbs)?
- What status codes can be returned (400, 401, 404, 500, etc)?
Let’s consider a simple system which exposes its Ethernet interfaces as resources.
1. URLs:
/resources/network/<interface name>
eg: /resources/network/eth0
2. State representation:
A JSON dictionary with the two attributes: ipv4_address, ipv4_netmask
3. Methods:
GET – to get the interface state
PUT – to modify the interface state (i.e. change interface address)
4. Status codes:
GET – either OK (200) or Not Found (404)
PUT – OK (200), Not Found (404) or Bad Request (400)
POST and DELETE – Method Not Allowed (405)
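To make the four answers concrete, here is a minimal sketch of how a server might map a request onto them. It is not tied to any framework: handleRequest, the in-memory interfaces table and the sample addresses are all made up for illustration, and a real service would read and write the actual interface configuration.

```javascript
// In-memory stand-in for the system's interfaces (sample data, an assumption).
const interfaces = {
  eth0: { ipv4_address: "192.168.1.10", ipv4_netmask: "255.255.255.0" },
};

const PREFIX = "/resources/network/";
const IPV4 = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/;

// True when the value is a dotted quad with every octet in 0..255.
function isIPv4(value) {
  if (typeof value !== "string") return false;
  const match = IPV4.exec(value);
  return match !== null && match.slice(1).every((octet) => Number(octet) <= 255);
}

// Map (method, url, body) to a { status, body } pair.
function handleRequest(method, url, body) {
  // Unknown URL or unknown interface name: the resource does not exist.
  if (!url.startsWith(PREFIX)) return { status: 404 };
  const name = url.slice(PREFIX.length);
  if (!Object.prototype.hasOwnProperty.call(interfaces, name)) return { status: 404 };

  if (method === "GET") {
    return { status: 200, body: JSON.stringify(interfaces[name]) };
  }
  if (method === "PUT") {
    let state;
    try {
      state = JSON.parse(body);
    } catch (err) {
      return { status: 400 }; // body is not valid JSON
    }
    const valid = state !== null && typeof state === "object" &&
      isIPv4(state.ipv4_address) && isIPv4(state.ipv4_netmask);
    if (!valid) return { status: 400 }; // missing or malformed attributes
    interfaces[name] = {
      ipv4_address: state.ipv4_address,
      ipv4_netmask: state.ipv4_netmask,
    };
    return { status: 200, body: JSON.stringify(interfaces[name]) };
  }
  // Every other verb is rejected for this resource.
  return { status: 405 };
}

console.log(handleRequest("GET", "/resources/network/eth0").status);
```

Notice that the function body reads almost like the design itself: one branch per verb, one status code per outcome.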
As you can see, RESTful design lends itself to very clean and intuitive APIs. An implication of using HTTP is that the browser must always initiate the requests. The server cannot push data to the browser at will. For fast-changing state data, the browser must constantly poll the server. HTTP requests have considerable overhead, are short-lived, and most browsers limit the number of simultaneous active connections to the server. All of this means that there are performance limitations with HTTP polling.
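The cost of polling is easy to see in a small simulation. The sketch below is only illustrative: createPoller and fetchState are hypothetical names, and the "server" is just an array of states. In a browser, tick() would be driven by setInterval and fetchState would issue a real HTTP request.

```javascript
// Build a poller around a hypothetical fetchState() that performs one
// request and returns the current state as a string.
function createPoller(fetchState, onChange) {
  let lastSeen = null;
  let requests = 0;
  return {
    // One polling cycle: always costs a request, only sometimes yields new data.
    tick() {
      requests += 1;
      const current = fetchState();
      if (current !== lastSeen) {
        lastSeen = current;
        onChange(current);
      }
    },
    requestCount() {
      return requests;
    },
  };
}

// Simulated server: ten polling cycles, during which the state changes twice.
const serverStates = ["up", "up", "up", "down", "down", "down", "down", "up", "up", "up"];
let cycle = 0;
const updates = [];
const poller = createPoller(() => serverStates[cycle++], (state) => updates.push(state));

for (let i = 0; i < serverStates.length; i++) poller.tick();
console.log(poller.requestCount() + " requests, " + updates.length + " updates");
// prints "10 requests, 3 updates" (the initial read plus two changes)
```

Seven of the ten requests carried nothing new, and each of them would still pay the full HTTP overhead.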
An alternative data consumption model for rapidly changing data is publish-subscribe. The consumer (browser client) subscribes to the resource on the server. The server either publishes the data periodically or only when there is new data to publish. In either strategy, the browser only needs to process the data as it arrives and does not need to keep issuing requests to the server.
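Stripped of any networking, the publish-subscribe model looks roughly like the sketch below. createChannel and the sample health readings are made up for illustration; the point is only that the consumer registers interest once and the producer decides when data flows.

```javascript
// A minimal publish-subscribe channel: consumers register callbacks and
// the producer pushes data to all of them when it chooses to.
function createChannel() {
  const subscribers = new Set();
  return {
    subscribe(callback) {
      subscribers.add(callback);
      // Return a function that cancels this subscription.
      return () => subscribers.delete(callback);
    },
    publish(data) {
      subscribers.forEach((callback) => callback(data));
      return subscribers.size; // number of consumers that were notified
    },
  };
}

const health = createChannel();
const received = [];
const unsubscribe = health.subscribe((data) => received.push(data));

health.publish({ cpu: 0.42 });
health.publish({ cpu: 0.57 });
unsubscribe();
health.publish({ cpu: 0.61 }); // no subscribers left, so nothing is delivered

console.log(received.length); // 2
```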
This would be easiest if HTTP had a ‘SUBSCRIBE’ method, but the IETF and W3C have settled on an alternative specification for publish-subscribe capability on the internet. The name for this new specification is Web Sockets, which aims to provide two-way, connection-oriented socket communication between the client and server. Arguably, bidirectional sockets can lead to more powerful data transfer models than just publish-subscribe. While the specification is not finalized yet, preliminary support has landed in the Firefox trunk and Chrome unstable branches.
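As a rough sketch of how publish-subscribe could be layered on a raw socket, the function below uses only the onopen, onmessage and send members of the browser WebSocket interface. Everything else is an assumption: the JSON message format is invented for this example (Web Sockets themselves do not define one), and the URL and render callback in the comment are placeholders.

```javascript
// Wire a subscription onto a WebSocket-like object. The JSON messages
// ({ action, channel } and { channel, data }) are an assumed format.
function subscribeOverSocket(socket, channel, onData) {
  socket.onopen = function () {
    // The client speaks once to announce interest in the channel.
    socket.send(JSON.stringify({ action: "subscribe", channel: channel }));
  };
  socket.onmessage = function (event) {
    // From here on the server pushes data whenever it wants.
    const message = JSON.parse(event.data);
    if (message.channel === channel) onData(message.data);
  };
  return socket;
}

// In a browser this might be called as:
//   subscribeOverSocket(new WebSocket("ws://localhost:8080/"), "/channel/health", render);
// A fake socket stands in here so the sketch runs without a server.
const fakeSocket = {
  sent: [],
  send(text) {
    this.sent.push(text);
  },
};
const pushed = [];
subscribeOverSocket(fakeSocket, "/channel/health", (data) => pushed.push(data));

fakeSocket.onopen(); // connection established
fakeSocket.onmessage({ data: JSON.stringify({ channel: "/channel/health", data: "ok" }) });
console.log(pushed.length); // 1
```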
While sockets are a powerful interface, they are too low-level and most programmers would still prefer higher-level abstractions like publish-subscribe. There are several programming libraries out there that aim to provide these abstractions. One of these is called Orbited. It is a combination of a client-side JavaScript API and a Twisted Python proxy server.
Orbited’s technical inspiration is Twisted, which can be both a good thing and a bad thing. Like Twisted, it supports several protocols, such as Echo, STOMP, IRC, XMPP, etc., and it is extensible so new protocols can be added. A simple example of a STOMP client is given below:
// create the client object using the STOMP messaging protocol
var client = new STOMPClient();
// callback when connection is successful
client.onconnectedframe = function () {
    // subscribe to the health channel
    client.subscribe('/channel/health');
};
// callback when data is received
client.onmessageframe = function (data) {
    // ... process the data
};
// ... other callbacks omitted for simplicity ...
// connect to the server:
client.connect('localhost', 61663);
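For the curious, STOMP itself is a simple text protocol: each frame is a command line, some header lines, a blank line, an optional body and a terminating NUL byte. The sketch below builds such frames by hand to show roughly what a subscription looks like on the wire. buildFrame is a hypothetical helper, not part of Orbited's API, and the guest credentials are placeholders.

```javascript
// Serialize a STOMP frame: command line, header lines, a blank line,
// the body, and a terminating NUL byte.
function buildFrame(command, headers, body) {
  const lines = [command];
  for (const key of Object.keys(headers)) {
    lines.push(key + ":" + headers[key]);
  }
  return lines.join("\n") + "\n\n" + (body || "") + "\0";
}

// CONNECT carries optional login and passcode headers (placeholder values here).
const connectFrame = buildFrame("CONNECT", { login: "guest", passcode: "guest" });

// SUBSCRIBE names the destination the client wants messages from.
const subscribeFrame = buildFrame("SUBSCRIBE", { destination: "/channel/health" });

// JSON.stringify makes the newlines and the NUL terminator visible.
console.log(JSON.stringify(connectFrame));
console.log(JSON.stringify(subscribeFrame));
```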
The Orbited daemon is deployed on the server where it listens for incoming connections and forwards them based on its configuration. For example, all port 80 traffic is sent to Apache/Quixote while all port 61663 traffic is sent to a Twisted daemon.
Another advantage of Orbited is its portability. It works just the same in browsers that do not have Web Socket support as it does in more bleeding-edge browsers. It does so transparently, emulating socket communication through a series of fall-back strategies which are collectively known as Comet (and which are, at best, the bane of all web application developers due to their hackish nature). Thankfully, Orbited abstracts out all of that complexity.
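The general shape of such a fall-back chain can be sketched as follows. To be clear, this is not how Orbited is implemented: the transport names, the supportsStreaming flag and chooseTransport are all invented here to illustrate the idea of trying the best option first and degrading gracefully.

```javascript
// Ordered from most to least preferred. Names and capability checks are
// invented for this sketch and are not Orbited's internals.
const transports = [
  { name: "websocket", supported: (env) => typeof env.WebSocket === "function" },
  {
    name: "xhr-streaming",
    supported: (env) => typeof env.XMLHttpRequest === "function" && env.supportsStreaming === true,
  },
  { name: "long-polling", supported: (env) => typeof env.XMLHttpRequest === "function" },
];

// Return the name of the first transport the environment can use, or null.
function chooseTransport(env) {
  const match = transports.find((transport) => transport.supported(env));
  return match ? match.name : null;
}

// Fake environments standing in for the browser's global object.
const modernBrowser = { WebSocket: function () {}, XMLHttpRequest: function () {} };
const olderBrowser = { XMLHttpRequest: function () {} };

console.log(chooseTransport(modernBrowser)); // websocket
console.log(chooseTransport(olderBrowser)); // long-polling
```

The application code above the transport layer stays the same whichever entry is picked, which is the whole point of the abstraction.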
While I have focused on Orbited, I should also note that there are other solutions available, for example the Tornado web server, Twisted Nevow, Strophe.js, Kaazing, etc. It’s not clear what advantages these offer over one another or which one is likely to come out ahead in the long term. However, the overall concepts are the same.
It is also worth mentioning that Web Sockets are just one of the proposals under consideration by the IETF. The competing proposals include Reverse HTTP, BEEP, and Asynchronous HTTP. For now, however, Web Sockets seem to be getting the most attention. Each of these proposals has unique strengths and weaknesses, which is why the IETF committee is also weighing a completely alternative approach whereby HTTP proxies, servers, and clients will negotiate and select the best protocol available for bidirectional communication (eg: UDP). Suffice it to say, techniques for publish-subscribe on the web are evolving.
Further reading:
http://us.pycon.org/media/2010/talkdata/PyCon2010/021/PyCon2010_Restful_Web_Services_.pdf
http://www.thebitsource.com/tech-conferences/pycon-real-time-python-web-development/
Comments welcome.
How would your stomp client example handle authentication?
@brian I’m not sure what is the best way to handle authentication with STOMP but I suspect it will depend on the message queue being used. If you need advanced session management, I would recommend looking into XMPP/Jabber as well (although XMPP is quite complicated).