Rust: Systems Programming With a Safety Net

While Rust is a fairly new language, it's gaining a lot of traction. Read on for some details about the language and how it can be leveraged.

The Rust programming language is a systems programming language from Mozilla that had its first stable release in 2015. Rust is a relatively young language but, in my experience at Postmates, it has remarkable utility for certain kinds of problems.

Previously on Codeship, we’ve talked about coming to Rust from Ruby, thanks to Daniel Clark. In this article, we’ll discuss Rust with a particular emphasis on its effective application niche, as well as the challenges an adopter of the language in 2016 will face.

The Modern Landscape of Systems Programming

Unfortunately, as a term of art, “systems programming” is a bit on the broad side. A “systems program” is one that is intended to service other software components and has some form of mechanical sympathy.

A systems programming language, then, is one which gives the programmer facilities to map to underlying hardware resources.

Historically, the use of such languages has been a tightrope walk. Consider that C and C++ give programmers significant control over layout of structures in memory and — to a lesser degree on modern processors with modern compilers — flow control primitives that map to underlying machine primitives. But they don’t give programmers many guarantees about the correctness of the resulting program, either in its side effect on the machine in terms of memory access or in its logical operation. This has caused a great deal of trouble over the years.

Programming language theory has focused on solving these problems, and practices that address them have become widespread.

Memory safety issues, for example, are broadly addressed by garbage collection. However, garbage collection restricts the programmer’s control over memory layout, which rules it out for most systems programming domains.

Advances in the compilation of strongly typed, pure functional constructs have made a mathematical sort of programming more feasible. But this technique, which requires a Sufficiently Advanced Compiler, does not map cleanly onto a model of the target hardware’s underlying execution.

Some languages, like Ada, target the systems space but are awkward to use, for want of helpful compilers or ergonomics born of theory subsequent to the language’s development.

A Breakdown of Features in Rust

Rust is intended to resolve memory safety issues. It targets C++’s niche of relatively high-level code with convenient mapping to machine resources.

The Rust website declares that the language offers:

  • Zero-cost abstractions.
  • Move semantics.
  • Guaranteed memory safety.
  • Threads without data races.
  • Trait-based generics.
  • Pattern matching.
  • Type inference.
  • Minimal runtime.
  • Efficient C bindings.

I’ll be honest, this list is a little inside baseball. The list breaks down into three broad categories.

Rust and Efficiency

The first category of features promises that Rust code will be roughly as efficient as an equivalent C/C++ program and that reasoning about code efficiency ought to be possible with a bit of study. These are:

  • Zero-cost abstractions.
  • Minimal runtime.
  • Efficient C bindings.
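As a toy sketch of what “zero-cost abstraction” means in practice (a hypothetical example, not from the original article): the high-level iterator pipeline below and the hand-written loop compute the same result, and with optimizations enabled the compiler emits essentially the same machine code for both.

```rust
// Hypothetical example: a high-level iterator chain and an explicit
// loop compute the same sum of squares. With -C opt-level=3, LLVM
// compiles the iterator version down to the same tight loop, so the
// abstraction costs nothing at runtime.
fn sum_of_squares_iter(xs: &[u64]) -> u64 {
    xs.iter().map(|x| x * x).sum()
}

fn sum_of_squares_loop(xs: &[u64]) -> u64 {
    let mut total = 0;
    for x in xs {
        total += x * x;
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4];
    assert_eq!(sum_of_squares_iter(&data), sum_of_squares_loop(&data));
    println!("{}", sum_of_squares_iter(&data)); // prints 30
}
```

Inspecting the output of `rustc --emit asm -C opt-level=3` on both functions is a good way to convince yourself of the claim.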

Rust and Memory Safety

The second category includes features that offer memory safety without sacrificing control over memory layout. This is enforced at compilation time through the ownership model, a nifty bit of type-theory application that statically checks that each variable access and modification is safe. These are:

  • Move semantics.
  • Guaranteed memory safety.
  • Threads without data races.
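A minimal sketch of move semantics (a hypothetical example, not code from the article): assigning a `Vec` transfers ownership of its heap buffer, and the compiler statically rejects any later use of the moved-from binding. This is how use-after-free and double frees are ruled out at compile time rather than at runtime.

```rust
fn main() {
    let v = vec![1, 2, 3];
    let w = v; // ownership of the heap buffer moves from `v` into `w`

    // The line below would not compile: `v` was moved, so a
    // use-after-move (and any chance of a double free) is rejected
    // statically by the ownership model.
    // println!("{:?}", v);

    let total: i32 = w.iter().sum(); // borrowing `w` is fine
    assert_eq!(total, 6);
    println!("{:?}", w);
}
```

Uncommenting the rejected line produces a “use of moved value” error at compile time; no equivalent check exists in C or C++.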

Rust and the Type System

The third category emphasizes use of the type system to reduce logic defects in a way that is familiar to anyone who has worked in ML or Haskell:

  • Trait-based generics.
  • Pattern matching.
  • Type inference.

If you haven’t used ML or Haskell, no worries. The main show here is that it’s possible to represent your programs as transformations over data and have guarantees about these transforms checked at compilation time. It takes some getting used to.

However, programming in such a style does remove a whole class of defects and allows for greater program modularity. Much of the literature on this style refers extensively to functional programming languages, but such transformations exist outside of them, as in Rust or the C++ algorithms library.
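If a concrete taste helps, here is a small hypothetical example of the style: data modeled as an enum, behavior as an exhaustive `match`. Forgetting a variant in the `match` is a compile error, so the transformation is checked against the shape of the data before the program ever runs.

```rust
// Hypothetical example: an enum plus exhaustive pattern matching.
// Adding a third variant to Shape without updating `area` would be
// rejected by the compiler.
enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
}

fn area(s: &Shape) -> f64 {
    match *s {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { w, h } => w * h,
    }
}

fn main() {
    let shapes = [Shape::Circle { radius: 1.0 },
                  Shape::Rect { w: 2.0, h: 3.0 }];
    // A transformation over data: map each shape to its area, then sum.
    let total: f64 = shapes.iter().map(area).sum();
    println!("total area: {}", total);
}
```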

Where Rust Differs from C++

Both Rust and C++ target the systems programming domain by providing generic programming, ready access to hardware-level resources, efficient compilation, and a “pay for what you use” approach to higher-level abstractions. Rust resolves a class of bugs that plague C++, as a result of being invented in a time with more advanced type logics and faster computers to use for compilation.

Rust benefits significantly from hindsight. The language incorporates the most promising features of existing languages while avoiding their safety defects by design.

Compare also to D, which integrated a garbage collector in its early days and has struggled to make it optional. Rust has no garbage collection — only smart pointer types — but work is in progress to provide one for situations where GC is awfully handy.
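For reference, a minimal sketch of the two most common smart pointer types (a hypothetical example): `Box` for sole ownership of a heap allocation and `Rc` for reference-counted shared ownership. Both are freed deterministically when the last owner goes out of scope, with no collector involved.

```rust
use std::rc::Rc;

fn main() {
    // Box: sole ownership of one heap allocation; freed when `b`
    // goes out of scope.
    let b = Box::new(41);
    assert_eq!(*b + 1, 42);

    // Rc: reference-counted shared ownership; the vector is freed
    // when the last clone is dropped.
    let shared = Rc::new(vec![1, 2, 3]);
    let alias = Rc::clone(&shared);
    assert_eq!(Rc::strong_count(&shared), 2);
    assert_eq!(alias.len(), 3);
    println!("count: {}", Rc::strong_count(&shared));
}
```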

The community has invested heavily in tooling and documentation to make the language convenient to use as well as approachable to programmers unfamiliar with systems programming languages. There are many ways to get involved with the community if you’ve got questions while learning.

The Code of Rust

In my opinion, Rust is a pleasure to program in. A colleague and I were recently experimenting with the behavior of different operating systems when their UDP kernel buffers fill up. We wanted to know what the packet per second numbers looked like when:

  • Each packet is a 64-bit, unsigned, big-endian integer,
  • All communication is localhost, and
  • We ignore potential out-of-order receipt.

Here’s the consumer, called consumer.rs:

#![feature(integer_atomics)]

use std::sync::atomic::{AtomicU64, ATOMIC_U64_INIT, Ordering};
use std::{thread, time};
use std::net::UdpSocket;

static GLOBAL_PER_SECOND: AtomicU64 = ATOMIC_U64_INIT;

#[inline]
fn u8tou64abe(v: &[u8]) -> u64 {
    v[7] as u64 + ((v[6] as u64) << 8) + ((v[5] as u64) << 16) + ((v[4] as u64) << 24) +
    ((v[3] as u64) << 32) + ((v[2] as u64) << 40) + ((v[1] as u64) << 48) +
    ((v[0] as u64) << 56)
}

fn main() {
    let socket = UdpSocket::bind("0.0.0.0:2387").unwrap();

    // Once per second swap 0 onto the top of GLOBAL_PER_SECOND
    // and print the previous value to the squishy meat-beast waiting
    // for that information. 
    //
    // These will be the number of packets _received_ per second. 
    thread::spawn(move || {
        let one_second = time::Duration::from_millis(1000);
        loop {
            thread::sleep(one_second);
            println!("{}", GLOBAL_PER_SECOND.swap(0, Ordering::Relaxed));
        }
    });

    let mut buf: [u8; 8] = [0; 8];

    let (sz, _) = socket.recv_from(&mut buf).expect("oops");
    let mut cur = u8tou64abe(&buf[..sz]);

    // Loop around forever pulling packets off the socket. Inspect
    // each packet and compare to `cur`, printing if the difference
    // between the last packet seen isn't 1. Implies dropped packets
    // since we're going to ignore out-of-order.
    loop {
        let (sz, _) = socket.recv_from(&mut buf).expect("oops");
        let cnt = u8tou64abe(&buf[..sz]);

        if (cnt - 1) != cur {
            println!("GAP: {}", cnt - cur);
        }
        cur = cnt;

        GLOBAL_PER_SECOND.fetch_add(1, Ordering::Relaxed);
    }
}

Notice that the #![feature(integer_atomics)] signals that we’re using an unstable feature. It hasn’t been blessed yet as a part of the standard library. This means we’ll need to compile on a nightly compiler (I discuss the distinction between compiler kinds later). Assuming your rustc version is agreeable, you can compile consumer.rs like so:

> rustc -C opt-level=3 -C target-cpu=native consumer.rs

You now have a native executable consumer sitting on disk. The producer, called producer.rs, is:

#![feature(integer_atomics)]

use std::sync::atomic::{AtomicU64, ATOMIC_U64_INIT, Ordering};
use std::{thread, time};
use std::net::{UdpSocket, Ipv4Addr, SocketAddrV4};

static GLOBAL_PER_SECOND: AtomicU64 = ATOMIC_U64_INIT;

#[inline]
fn u64tou8abe(v: u64) -> [u8; 8] {
    [(v >> 56) as u8,
     (v >> 48) as u8,
     (v >> 40) as u8,
     (v >> 32) as u8,
     (v >> 24) as u8,
     (v >> 16) as u8,
     (v >> 8) as u8,
     v as u8]
}

fn main() {
    let addr = SocketAddrV4::new(Ipv4Addr::new(0, 0, 0, 0), 0);
    let dest = SocketAddrV4::new(Ipv4Addr::new(0, 0, 0, 0), 2387);

    let socket = UdpSocket::bind(addr).unwrap();
    socket.set_nonblocking(true).unwrap();

    // Once per second swap 0 onto the top of GLOBAL_PER_SECOND
    // and print the previous value to the squishy meat-beast waiting
    // for that information. 
    //
    // These will be the number of packets _sent_ per second. 
    thread::spawn(move || {
        let one_second = time::Duration::from_millis(1000);
        loop {
            thread::sleep(one_second);
            println!("{}", GLOBAL_PER_SECOND.swap(0, Ordering::Relaxed));
        }
    });

    let mut cnt: u64 = 0;
    loop {
        socket.send_to(&u64tou8abe(cnt), dest).unwrap();
        cnt = cnt.wrapping_add(1); // loop to 0 when we reach std::u64::MAX
        GLOBAL_PER_SECOND.fetch_add(1, Ordering::Relaxed);
    }
}

It’s compiled in the same way as consumer.rs:

> rustc -C opt-level=3 -C target-cpu=native producer.rs

Both programs are broadly similar and rely on POSIX sockets, bit fiddling, and hardware-supported atomic integers to get their work done. Equivalent programs in C++ were longer and more challenging to write. Performance between the two languages was indistinguishable, as to be expected from programs doing syscalls, bit fiddling, and little else.

Meanwhile we get many more safety guarantees from the Rust compiler. I think that’s pretty neat!

The Cons of Rust

Nothing’s perfect, right? Rust is a new language, and there are many areas where things are still sort of a research project.

Community discussions of Rust’s fitness for purpose across different application domains are incredibly helpful for gauging where the language stands. In my work, the sticking points are:

  • Cross-platform async IO is kind of an open question. Long-term techniques are being actively worked on, and existing libraries are intentionally low-level.
  • Building web services in Rust is mix and match, depending on your ambitions. Are We Web Yet? tracks and explains Rust’s progress in this area.
  • Cross-compilation to embedded environments is a challenge. There is built-in support to disable the already minimal runtime called no_std, but it’s not something you might do just for giggles yet. Building no_std binaries is a nightly-compiler proposition. Speaking of which…
  • Rust follows a nightly/beta/stable release model. Only nightly allows for the compilation of Rust programs with unstable features enabled. The serde serialization library is a little goofy to integrate into stable builds as a result, while clippy requires a nightly compiler. I tend to develop in nightly and deploy stable. Continuous integration catches accidental incompatibilities between nightly/stable, but it’s a pain, even so.

So. Should I Use Rust?

It depends on what you’re building.

At Postmates, I’ve been working on telemetry gathering systems for product teams. These are wee little daemons that are meant to run on-system and collect information about the running application without interfering with the running application’s good function. Classic systems programming problem.

Concerned about some of Rust’s rough spots, I strongly considered using C++. I even built a working prototype in both Rust and Modern C++14.

In my estimation, both programs were of roughly the same difficulty to produce, except that maintaining the same memory allocation patterns in C++ would require a more expert understanding of the language; the equivalent mistake in Rust would simply fail to compile. That is, it was easier to accidentally goof up in C++.

Otherwise, both programs were roughly the same in terms of performance. Both languages might frustrate my colleagues should I get hit by a bus, but only one would keep things safe by design.

Rust is a fine language. It’s not a silver bullet by any means, and you shouldn’t go rewriting the world in it.

But if you’re looking to get into systems programming, learn Rust. That’s especially true if you’re coming from dynamic languages: The Rust Programming Language is written specifically for you. If you’re thinking about building a new system in C or C++ and can tolerate a little experimentation, maybe go for Rust instead.

I’m willing to bet Rust is a language with staying power. I think in the coming years, we’ll see more important lower-level software written in it. The safety net that the compiler provides without having to sacrifice performance ambitions is just too good to pass up.


Published at DZone with permission of Brian Troutwine, DZone MVB. See the original article here.
