pieterh wrote on 06 Jun 2011 19:14
Unprotocol enlightenment comes one step at a time. While it can be tempting to try to use generic solutions, it's often wiser to divide problems into classes, each with an optimal solution. Let's look at Cheap and Nasty, two essential patterns for unprotocol design.
Cheap is a pattern for low-volume chatty dialogs, and Nasty is a pattern for high-volume data flows. Realistic unprotocols often need to combine both chatty dialogs and high-volume data flows, and it's a common mistake (one I'm well aware of, having made it more than once) to try to use a single design to cover both.
Let's start with Cheap. Cheap is essentially synchronous, verbose, descriptive, and unstable. A Cheap message is full of rich information that changes with practically every application. Your goal as designer is to make this information easy to encode and to parse, trivial to extend for experimentation or growth, and highly robust against change both forwards and backwards. The ideal Cheap pattern looks like this:
- It uses a simple self-describing structured encoding for data, be it XML, JSON, tnetstrings, or some other. Any encoding is fine so long as there are standard simple parsers for it.
- It uses a straight request-reply model where each request has a success/failure reply. This makes it trivial to write correct clients and servers for a Cheap dialog.
- It doesn't try, even marginally, to be fast. Performance doesn't matter when you do something once a minute, or once a second.
A good example of a Cheap protocol would be HTTP. But over ØMQ, you can create your own Cheap protocol with very little effort. A Cheap parser is something you take off the shelf, and throw data at. It shouldn't crash, shouldn't leak memory, should be highly tolerant, and should be relatively simple to work with. That's it.
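To make the Cheap pattern concrete, here is a minimal sketch in Python using only the standard `json` module as the off-the-shelf parser. The command name `HELLO`, the `status` and `reason` fields, and the function names are all assumptions made up for this illustration, and the transport (a ØMQ request-reply socket pair, say) is left out so the example stays self-contained.

```python
import json

def encode_request(command, **fields):
    """Build a self-describing request; extra fields are free-form."""
    return json.dumps({"command": command, **fields}).encode("utf-8")

def encode_reply(ok, **fields):
    """Every request gets exactly one reply, marked as success or failure."""
    return json.dumps({"status": "OK" if ok else "FAILED", **fields}).encode("utf-8")

def handle_request(raw):
    """Parse one request and always answer, even when the input is garbage."""
    try:
        request = json.loads(raw.decode("utf-8"))
    except ValueError:
        # Covers both bad UTF-8 and bad JSON (both are ValueError subclasses).
        return encode_reply(False, reason="malformed request")
    if not isinstance(request, dict):
        return encode_reply(False, reason="request must be an object")
    if request.get("command") == "HELLO":
        # Fields we don't know about are simply ignored, so an older
        # server tolerates a newer client that sends extra information.
        name = str(request.get("name", "stranger"))
        return encode_reply(True, greeting="hello " + name)
    return encode_reply(False, reason="unknown command")
```

A client would send `encode_request("HELLO", name="world")` and block until the reply arrives. Nothing here is fast, and that is the point: the code is short, hard to get wrong, and survives new fields on either side.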
Now let's look at Nasty. Nasty is essentially asynchronous, terse, silent, and stable. A Nasty message carries minimal information that practically never changes. Your goal as designer is to make this information ultrafast to parse, and possibly even impossible to extend and experiment with. The ideal Nasty pattern looks like this:
- It uses a hand-optimized binary layout for data, where every bit is precisely crafted.
- It uses a pure asynchronous model where one or both peers send data without acknowledgments (or if they do, they use sneaky asynchronous techniques like credit-based flow control).
- It doesn't try, even marginally, to be friendly. Performance is all that matters when you are doing something several million times per second.
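The credit-based flow control mentioned in the list above can be sketched as a small in-process simulation. This is only an illustration under assumptions: the class names, the one-credit-per-chunk accounting, and the `transfer` driver are invented here, and a real implementation would exchange credit over the network rather than through method calls.

```python
from collections import deque

class Sender:
    """Sends queued chunks only while it holds credit from the receiver."""
    def __init__(self, chunks):
        self.pending = deque(chunks)
        self.credit = 0

    def grant(self, amount):
        self.credit += amount

    def pump(self):
        """Return the chunks that may go out now, without waiting for acks."""
        out = []
        while self.credit > 0 and self.pending:
            out.append(self.pending.popleft())
            self.credit -= 1
        return out

class Receiver:
    """Hands out credit up front and replenishes it as chunks are consumed."""
    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window
        self.received = []

    def consume(self, chunks):
        """Accept chunks and return fresh credit equal to what was consumed."""
        self.received.extend(chunks)
        return len(chunks)

def transfer(chunks, window):
    """Drive a whole transfer; returns (received chunks, largest batch sent)."""
    sender, receiver = Sender(chunks), Receiver(window)
    sender.grant(receiver.window)          # initial credit
    largest_batch = 0
    while sender.pending:
        batch = sender.pump()
        largest_batch = max(largest_batch, len(batch))
        sender.grant(receiver.consume(batch))
    return receiver.received, largest_batch
```

The sender never waits for a per-message acknowledgment. It only stops when it runs out of credit, so the receiver bounds how much data is in flight without forcing a round trip per message.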
A good example of a Nasty protocol would be ØMQ's publish-subscribe protocol, which is so brutally terse that it feels more like a weapon invented by Iain M. Banks than a network protocol. A Nasty parser is something you write by hand, which writes or reads bits, bytes, words, and integers individually and precisely. It rejects anything it doesn't like, does no memory allocations at all, and never crashes.
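As a rough sketch of what a Nasty codec looks like, here is a fixed 16-byte frame handled with Python's standard `struct` module. The layout, the magic value `0xA5`, and the field names are hypothetical, chosen only for this example; they are not ØMQ's wire format. Python also still allocates a tuple on every parse, so a real Nasty parser would be written in C or similar; the sketch only shows the shape.

```python
import struct

# Hypothetical frame layout, network byte order, no padding, 16 bytes:
#   magic (uint8) | flags (uint8) | channel (uint16) | sequence (uint32) | value (uint64)
FRAME = struct.Struct("!BBHIQ")
MAGIC = 0xA5

def pack_frame(buffer, channel, sequence, value, flags=0):
    """Write one frame into a preallocated, writable buffer, in place."""
    FRAME.pack_into(buffer, 0, MAGIC, flags, channel, sequence, value)

def parse_frame(buffer):
    """Return (flags, channel, sequence, value), or None if the frame is rejected."""
    if len(buffer) != FRAME.size:
        return None                        # wrong size: reject, don't guess
    magic, flags, channel, sequence, value = FRAME.unpack_from(buffer, 0)
    if magic != MAGIC:
        return None                        # not our protocol: reject
    return flags, channel, sequence, value
```

A sender would reuse one `bytearray(FRAME.size)` for every frame. There are no field names on the wire, no version negotiation, and no room for extension, which is exactly the trade the Nasty pattern makes.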
You'll see people trying to use Cheap everywhere ("everything is an XML message!"), or Nasty for everything ("Our wire level protocol is 100% binary!"). Or some intermediate solution that is neither Cheap nor Nasty ("we use protobufs for everything!"). The results will be mediocre. Don't be afraid of going to extremes when you're writing software that has to be extremely good.
Remember: Cheap for the chatty stuff, Nasty for the real work. Enjoy!