Writing My Network Protocol in Rust
A dev shows us some Rust code he used to configure network protocols. Follow along and knock the rust off your coding skills!
After spending a lot of time writing my network protocol in C, I decided that it would be a nice exercise to do the same in Rust. I keep getting back to this language because I want to like it. I hope that having a straightforward task and the passage of time will make things easier than before.
I gotta say, the compiler feels a lot nicer now. Check this out:
I really like it. This feels like the compiler is actively trying to help me get on the right path…
And this error message made me even happier:
Thank you, this is clear, concise and to the point.
I can report that the borrow checker is indeed very much present, but at least now it will tell you what magic incantation you need to make it happy:
This is my third or fourth foray into Rust, and I stopped a few times before because the learning curve and the… Getting Things Done curve just weren’t there for me. In this case, however, I’m getting a very different feel all around. There is a lot less messing around and a lot more actually getting to where I want to go. I’m still significantly slower, naturally, since I’m learning idioms and the mechanics of the language, but there is a lot less friction. And the documentation has been top notch, and I’m leaning on that a lot.
In particular, one of my pet peeves seems to have been resolved. Composite error handling is now as simple as:
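#[macro_use]
extern crate custom_error;

use std::io;

custom_error! {
    ConnectionError
        Io{source: io::Error} = "unable to read from the network",
        Utf8{source: std::str::Utf8Error} = "Invalid UTF8 character sequence",
        Parse{origin: String} = "Unable to parse command: {origin}",
        // Used by read_full_message() when no message boundary shows up within 8KB.
        MessageTooBig = "Message is too big"
}

impl ConnectionError {
    // Convenience constructor so callers can pass a &str.
    fn parsing(origin: &str) -> ConnectionError {
        ConnectionError::Parse{ origin: origin.to_string() }
    }
}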
Removing the ceremony from error handling is a great relief, I have to say.
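To show what that ceremony looks like, here is a rough, std-only sketch of roughly the kind of code you would otherwise write by hand. It is not the macro's actual expansion, and the `decode` helper is a hypothetical name used only to demonstrate the `?` conversion.

```rust
use std::fmt;
use std::io;
use std::str::Utf8Error;

/// Hand-written counterpart of the macro-generated error type (illustrative only).
#[derive(Debug)]
enum ConnectionError {
    Io { source: io::Error },
    Utf8 { source: Utf8Error },
    Parse { origin: String },
}

impl fmt::Display for ConnectionError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            ConnectionError::Io { .. } => write!(f, "unable to read from the network"),
            ConnectionError::Utf8 { .. } => write!(f, "Invalid UTF8 character sequence"),
            ConnectionError::Parse { origin } => write!(f, "Unable to parse command: {}", origin),
        }
    }
}

impl std::error::Error for ConnectionError {
    fn source(&self) -> Option<&(dyn std::error::Error + 'static)> {
        match self {
            ConnectionError::Io { source } => Some(source),
            ConnectionError::Utf8 { source } => Some(source),
            ConnectionError::Parse { .. } => None,
        }
    }
}

// These `From` impls are what let `?` convert the underlying errors automatically.
impl From<io::Error> for ConnectionError {
    fn from(source: io::Error) -> Self {
        ConnectionError::Io { source }
    }
}

impl From<Utf8Error> for ConnectionError {
    fn from(source: Utf8Error) -> Self {
        ConnectionError::Utf8 { source }
    }
}

/// Hypothetical helper: decodes bytes, relying on `?` to map the Utf8Error.
fn decode(bytes: &[u8]) -> Result<&str, ConnectionError> {
    let text = std::str::from_utf8(bytes)?;
    Ok(text)
}
```

That is around fifty lines for three variants, which is the part the macro takes off my hands.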
It took me a couple of evenings to get to the point where I can set up a TCP server, accept a connection, and parse the command. Let me see if I can break it up into manageable parts. Even though the whole thing is only about 100 lines of code, it is pretty packed.
I’m certainly feeling the fact that Rust actually has a runtime library. I mean, C has one, but it is pretty anemic, to say the least. And having an OOTB solution for packaging is not really optional today; it is a baseline requirement.
Here is the code that handles the connection itself.
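use std::io::Write;
use std::net::TcpStream;

fn handle_connection(mut stream: TcpStream) -> Result<(), ConnectionError> {
    // Greet the client; write_all keeps writing until the whole greeting is sent.
    stream.write_all(b"OK\r\n")?;
    let mut cmd_buffer = Vec::new();
    loop {
        let consumed_bytes = {
            let msg = read_full_message(&stream, &mut cmd_buffer)?;
            let cmd_str = std::str::from_utf8(msg)?;
            let cmd = parse_cmd(cmd_str)?;
            dispatch_cmd(&stream, cmd)?;
            // The returned slice stops before the "\r\n\r\n" terminator,
            // so count those 4 bytes as consumed as well.
            msg.len() + 4
        };
        // Drop the processed message but keep the buffer (and its capacity).
        cmd_buffer.drain(0..consumed_bytes);
    }
}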
By itself, it is pretty simple. It starts by letting the client know that we are ready (the OK msg), then reads a message from the client, parses it, and dispatches it. Rinse, repeat, etc.
There are a couple of interesting things going on here that might be worth your attention:
- The read_full_message() function works on bytes, and it is responsible for finding the message boundaries. The code is meant to handle pipelined messages, partial reads, etc.
- Once we have the range of bytes that represent a single message, we translate them to UTF8 strings and use string processing to parse the command. This is simpler than the tokenization we did in C.
- After processing a message, we drain the data we already processed and continue with the data still in the buffer. Note that we basically reuse the same buffer, so hopefully there shouldn’t be too many allocations along the way.
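To make the pipelining and draining concrete, here is a minimal std-only sketch. The `take_complete_messages` helper is hypothetical and not part of my code; unlike the real loop, it copies each message into a `String` to keep the example short.

```rust
/// Hypothetical helper: pulls every complete message out of `buffer`,
/// leaving any trailing partial message in place for the next read.
fn take_complete_messages(buffer: &mut Vec<u8>) -> Vec<String> {
    const BREAK: &[u8] = b"\r\n\r\n";
    let mut messages = Vec::new();
    // Look for the first terminator in whatever is currently buffered.
    while let Some(end) = buffer.windows(BREAK.len()).position(|w| w == BREAK) {
        messages.push(String::from_utf8_lossy(&buffer[..end]).into_owned());
        // Drop the message and its terminator; the Vec keeps its capacity.
        buffer.drain(..end + BREAK.len());
    }
    messages
}
```

Feeding it a buffer holding two full messages followed by the start of a third should yield the two messages and leave the partial bytes in the buffer until the rest arrives.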
The buffer handling is the responsibility of the read_full_message() function, which looks like:
lazy_static! {
    static ref msg_break: TwoWaySearcher<'static> = {
        TwoWaySearcher::new("\r\n\r\n".as_bytes())
    };
}

fn read_full_message<'a>(mut stream: &TcpStream, buffer: &'a mut Vec<u8>) -> Result<&'a [u8], ConnectionError> {
    let mut to_scan = 0;
    let mut tmp_buf = [0; 256];
    loop {
        match msg_break.search_in(&buffer[to_scan..]) {
            // Not found: next time, rescan only the last 3 bytes, since the
            // 4-byte terminator may be split across two reads.
            None => to_scan = buffer.len().saturating_sub(3),
            Some(msg_end) => return Ok(&buffer[0..(to_scan + msg_end)]),
        }
        // Note: a read of 0 bytes (peer closed the connection) is not handled here.
        let read = stream.read(&mut tmp_buf)?;
        if read + buffer.len() > 8192 {
            return Err(ConnectionError::MessageTooBig);
        }
        buffer.extend_from_slice(&tmp_buf[0..read]);
    }
}
I’m using an inefficient method, where I’m reading only 256 bytes at a time from the network, mostly to prove that I can process things that come over in multiple calls. But the basic structure is simple:
- Use TwoWaySearcher to search for a byte pattern in the buffer. This is basically the memmem() call.
- If the byte pattern (\r\n\r\n) is found, we have a message boundary and can return that to the caller.
- If the message boundary isn’t found, we need to read more data from the network.
- I’m playing some tricks with the to_scan variable, avoiding the case where I need to scan over data that I have already scanned.
- I’m also validating that I’m never reading too much from the network and abort the connection if we can’t find the message boundary in a reasonable size (8KB).
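The same steps can be sketched with nothing but the standard library. This is an illustration under assumptions, not my actual code: it swaps TwoWaySearcher for a naive `windows()` scan, reports errors as plain `io::Error`, and the names `read_until_break`, `BREAK`, and `MAX_MESSAGE` are made up for the example.

```rust
use std::io::{self, Read};

const BREAK: &[u8] = b"\r\n\r\n";
const MAX_MESSAGE: usize = 8192;

/// Reads from `reader` until `buffer` contains a full message and returns
/// the index at which the terminator starts.
fn read_until_break<R: Read>(reader: &mut R, buffer: &mut Vec<u8>) -> io::Result<usize> {
    let mut to_scan = 0;
    let mut chunk = [0u8; 256];
    loop {
        // Naive scan standing in for the memmem-style search.
        if let Some(pos) = buffer[to_scan..].windows(BREAK.len()).position(|w| w == BREAK) {
            return Ok(to_scan + pos);
        }
        // Resume just before the end: the terminator may straddle two reads.
        to_scan = buffer.len().saturating_sub(BREAK.len() - 1);
        let read = reader.read(&mut chunk)?;
        if read == 0 {
            return Err(io::Error::new(io::ErrorKind::UnexpectedEof, "connection closed mid-message"));
        }
        if buffer.len() + read > MAX_MESSAGE {
            return Err(io::Error::new(io::ErrorKind::InvalidData, "message too big"));
        }
        buffer.extend_from_slice(&chunk[..read]);
    }
}
```

Because it is generic over `Read`, it can be exercised with an in-memory `&[u8]` instead of a socket, which makes the straddling-terminator case easy to try out.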
What remains is some string parsing, which ended up being really easy, since Rust has normal string processing routines.
use std::collections::HashMap;

struct Cmd<'a> {
    args: Vec<&'a str>,
    headers: HashMap<&'a str, &'a str>,
}

fn parse_cmd<'a>(cmd_str: &'a str) -> Result<Cmd<'a>, ConnectionError> {
    let mut lines = cmd_str.lines();
    // The first line holds the command and its space-separated arguments.
    let cmd_line = match lines.next() {
        None => {
            return Err(ConnectionError::parsing(cmd_str));
        }
        Some(v) => v,
    };
    let mut cmd = Cmd {
        args: cmd_line.split(' ').collect(),
        headers: HashMap::new(),
    };
    // Every following line must be a "name: value" header.
    for line in lines {
        let parts: Vec<&str> = line.splitn(2, ':').collect();
        if parts.len() != 2 {
            return Err(ConnectionError::parsing(line));
        }
        cmd.headers.insert(parts[0].trim(), parts[1].trim());
    }
    Ok(cmd)
}
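One detail worth spelling out is why the header loop uses splitn(2, ':') rather than a plain split: only the first colon separates the name from the value, so values that contain colons survive intact. A small sketch, using a hypothetical `split_header` helper that mirrors the loop body:

```rust
/// Hypothetical helper mirroring the header handling in `parse_cmd`.
fn split_header(line: &str) -> Option<(&str, &str)> {
    let mut parts = line.splitn(2, ':');
    let name = parts.next()?;
    let value = parts.next()?; // None when the line has no ':' at all
    Some((name.trim(), value.trim()))
}
```

With this, a line such as `Host: example.com:8080` splits into `Host` and `example.com:8080`, while a line without any colon is rejected.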
And this is pretty much it, I gotta say. For that amount of code, it took a long time to get there, but I’m pretty happy with the state of the code. Of course, this is still pretty early in the game and it isn’t doing anything really interesting yet. The TCP server can accept only a single connection (breaking the connection will kill the server, at this point), error handling is the same as not having a single catch in the entire system, etc.
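For what it’s worth, one possible way around the "broken connection kills the server" problem, while staying sequential, is to log a failed connection and move on to the next one. This is only a sketch under assumptions: `serve` and its `max_connections` parameter are hypothetical, and `handle_connection` here is a stand-in that just sends the greeting.

```rust
use std::io::{self, Write};
use std::net::{TcpListener, TcpStream};

/// Stand-in for the real handler: only sends the greeting.
fn handle_connection(mut stream: TcpStream) -> io::Result<()> {
    stream.write_all(b"OK\r\n")
}

/// Accepts up to `max_connections` clients one after another; a failing
/// connection is logged and skipped instead of ending the loop.
fn serve(listener: &TcpListener, max_connections: usize) {
    for incoming in listener.incoming().take(max_connections) {
        let result = incoming.and_then(handle_connection);
        if let Err(err) = result {
            eprintln!("connection failed: {}", err);
        }
    }
}
```

This still handles one client at a time, so it does nothing for concurrency; it only keeps one bad client from taking the whole process down.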
What I expect to be… interesting is the use of SSL and concurrent (hopefully async) I/O. We’ll see where that will take us…
Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.