pieterh wrote on 29 Mar 2014 13:12
While ZeroMQ gives you a powerful communications engine to use in many different ways, building a conventional server is still fairly solid work. We use the ROUTER socket for that, managing the state for each individual client connection. Today I'll present a new tool — zproto — that generates whole servers, in C, from state machine models.
Quick Background to zproto
zproto is based on work I did for Chapter 7 of the ZeroMQ book, used in FileMQ, Zyre, and several other projects. It's a collection of code generation tools that take models and turn them into perfect code.
iMatix used to do code generation as our main business. We got… very good at it. There are lots of ways to generate code, and the most powerful and sophisticated code generator ever built by mankind lives on Github.com at imatix/gsl. It's an interpreter for programs that eat models (self-describing documents) and spew out text of any shape and form.
The only problem with sophisticated magic like GSL is that it quickly excludes other people. So in ZeroMQ I've been very careful to not do a lot of code generation, only opening that mysterious black box when there was real need.
The first case was in CZMQ, to generate the classes for ZeroMQ socket options. Then in CZMQ, to generate project files (for various build systems) from a single project.xml file. Yes, we still use XML models. It's actually a good use case for simple schema-free XML.
Then I started generating binary codecs for protocols, starting with FILEMQ. We used these codecs for a few different projects and they started to be quite solid. Something like a protobufs for ZeroMQ. You can see that the generated code looks as good as hand-written code. It's actually better: more consistent, with fewer errors.
Finally, I drew back on an even older iMatix speciality, which was state machines. My first free software tool was Libero, a great tool for designing code as state machines and producing lovely, solid engines in pretty much any language. Libero predates GSL, so isn't as flexible. However it uses a very elegant and simple state-event-action model.
The Libero model is especially good at designing server-side logic, where you want to capture the exact state machine for a client connection, from start to end. This happens to be one of the more solid problems in ZeroMQ architecture: how to build capable protocol servers. We do a lot of this, it turns out. You can do only so much with low-level patterns like pub-sub and push-pull. Quite soon you need to implement stateful services.
So this is what I made: a GSL code generator that takes a finite-state machine model inspired by Libero, and turns out a full working server. The current code generator produces C (that builds on CZMQ). In this article I'll explain briefly how this works, and how to use it.
The State Machine Model
State machines are a little unusual, conceptually. If you're not familiar with them it'll take a few days before they click. The Libero model is fairly simple and high level, meant to be designed and understood by humans:
- The machine exists in one of a number of named states. By convention the machine starts in the first state.
- In each state, the machine accepts one of a set of named events. Unhandled events are either ignored or cause the machine to die, depending on your taste.
- Given an event in a state, the machine executes a list of actions, which correspond to your code.
- After executing the actions the machine moves to the next state. An empty next state means "stay in same state".
- In the next state, the machine either continues with an internal event produced by the previous actions, or waits for an external event coming as a protocol command.
- Any action can raise an exception event that interrupts the flow through the action list and to the next state.
- A defaults state collects events that may occur in any state, together with their actions.
The zproto Server Model
The zproto_server_c.gsl code generator outputs a single .h file called an engine that does the hard work. If needed, it'll also generate you a skeleton .c file for your server, which you edit and build. It doesn't re-create that file again, though it will append new action stubs.
The server is a class that exposes an API like this (taken from the zeromq/zbroker zpipes_server, a good example):
// Create a new zpipes_server
zpipes_server_t *
zpipes_server_new (void);
// Destroy the zpipes_server
void
zpipes_server_destroy (zpipes_server_t **self_p);
// Load server configuration data
void
zpipes_server_configure (zpipes_server_t *self, const char *config_file);
// Set one configuration path value
void
zpipes_server_setoption (zpipes_server_t *self, const char *path, const char *value);
// Binds the server to an endpoint, formatted as printf string
long
zpipes_server_bind (zpipes_server_t *self, const char *format, ...);
// Self test of this class
void
zpipes_server_test (bool verbose);
Rather than run the server as a main program, you write a main program that creates and works with server objects. These run as background services, accepting clients on a ZMQ ROUTER port. The bind method exposes that port to the outside world.
Here's the state machine for the ZPIPES server:
<class
name = "zpipes_server"
title = "ZPIPES server"
proto = "zpipes_msg"
script = "zproto_server_c"
>
This is a server implementation for the ZPIPES protocol
<include filename = "license.xml" />
<!-- State machine for a client connection -->
<state name = "start">
<event name = "INPUT" next = "reading">
<action name = "open pipe for input" />
<action name = "send" message = "INPUT OK" />
</event>
<event name = "OUTPUT" next = "writing">
<action name = "open pipe for output" />
<action name = "send" message = "OUTPUT OK" />
</event>
</state>
This state allows two protocol commands, READ and CLOSE.
Names of states, events, and actions are case insensitive.
By convention we use uppercase for protocol events:
<state name = "reading">
<event name = "READ" next = "expecting chunk">
<action name = "expect chunk on pipe" />
</event>
<event name = "CLOSE">
<action name = "close pipe" />
<action name = "send" message = "CLOSE OK" />
<action name = "terminate" />
</event>
</state>
This internal state shows how we make a decision in the
state machine. These are three internal events; they can
happen come immediately from 'expect chunk on pipe', or
come later from various places. The server can ask for a
wakeup event (timeout expired). We can also get events
from other clients' machines.
<state name = "expecting chunk">
<event name = "have chunk" next = "reading">
<action name = "clear pending reads" />
<action name = "fetch chunk from pipe" />
<action name = "send" message = "READ OK" />
</event>
<event name = "pipe terminated" next = "reading">
<action name = "clear pending reads" />
<action name = "send" message = "END OF PIPE" />
</event>
<event name = "timeout expired" next = "reading">
<action name = "clear pending reads" />
<action name = "send" message = "TIMEOUT" />
</event>
</state>
<state name = "writing">
<event name = "WRITE" next = "writing">
<action name = "store chunk to pipe" />
<action name = "send" message = "WRITE OK" />
</event>
<event name = "CLOSE">
<action name = "close pipe" />
<action name = "send" message = "CLOSE OK" />
<action name = "terminate" />
</event>
</state>
The defaults state handles an 'exception_event' (sends
FAILED back to the client). It also specifies that any
unrecognized protocol command (an '$other' event) will
also cause the engine to send FAILED back to the client.
<state name = "defaults">
<event name = "exception">
<action name = "send" message = "FAILED" />
<action name = "terminate" />
</event>
<event name = "$other">
<action name = "send" message = "FAILED" />
</event>
</state>
We can define API methods to pass commands back from the
application through to the engine.
<!-- API methods
<method name = "somename" return = "number">
One-line description of method here
<argument name = "some value" type = "string">What is this argument?</argument>
<argument name = "some value" type = "number">What is this argument?</argument>
</method>
-->
</class>
The generated server manages clients automatically. There's various options to expire dead clients, and so on — read the zproto_server_c.gsl code to understand more. The state machine applies to *single client connection* from start to finish.
There are two predefined actions: "send" and "terminate". Other actions map to functions in your own code. The engine calls these functions as needed, passing the client context:
static void
store_chunk_to_pipe (client_t *self)
{
// State machine guarantees we're in a valid state for writing
assert (self->pipe);
assert (self->writing);
// Always store chunk on list, even to pass to pending reader
zlist_append (self->pipe->queue, zpipes_msg_get_chunk (self->request));
if (self->pipe->reader) {
send_event (self->pipe->reader, have_chunk_event);
assert (zlist_size (self->pipe->queue) == 0);
}
}
There are a set of methods your actions can call:
// Set the next event, needed in at least one action in an internal
// state; otherwise the state machine will wait for a message on the
// router socket and treat that as the event.
static void set_next_event (client_t *self, event_t event);
// Raise an exception with 'event', halting any actions in progress.
// Continues execution of actions defined for the exception event.
static void raise_exception (client_t *self, event_t event);
// Set wakeup alarm after 'delay' msecs. The next state should
// handle the wakeup event. The alarm is cancelled on any other
// event.
static void set_wakeup_event (client_t *self, size_t delay, event_t event);
// Execute 'event' on specified client. Use this to send events to
// other clients. Cancels any wakeup alarm on that client.
static void send_event (client_t *self, event_t event);
For More Information
Though the Libero documentation is quite old now, it's useful as a guide to what's possible with state machines. The Libero model added superstates, substates, and other useful ways to manage larger state machines.
The current working example of the zproto server generator is the zeromq/zbroker project, and specifically the zpipes_server class.
You can find GSL on Github and there's a old backgrounder for the so-called "model oriented programming" we used at iMatix.
If you have something to say, comment below or come discuss on zeromq-dev.
Comments
MOP seems to be the right tool to create new protocols. I discovered it when reading through the zguide. I am working on a new protocol based on ZMQ and Protobuf.
Do you have some tips for creating good models or is this a matter of experience?
It is a matter of experience. It is like designing a language or an API: keep the level of abstraction high enough that you don't have to be repetitive, yet keep it simple enough that people don't go crazy trying to understand it.
Good models are hard to make. Happily with MOP it's cheap to make and test models.
Portfolio
There is an xml parser in nodejs ,cheerio, which can be used to metacompile xml documents to servers.
Cheerio uses the jquery syntax who is widely known to people.
The GSL tool is something that's badly missing in every programmer's toolbox. What prevents it to catch on IMO is the syntax is kind of inconsistent and hard to learn and use. And of course, dealing with XML is pain it the ass. If it was based on JSON, it would improve the usability of the tool by an order of magnitude.
Since we put GSL onto github, it's gotten a small but growing community. Still arcane stuff, though :)
Portfolio