Comparing a Simple C Function to Rust

20 Jul 2023

Coming from an embedded C background, I started to notice when reviewing Rust code that I didn’t need to focus as much on function input validation. I wanted to capture my current understanding here.

One often hears the following saying about functional programs, and at times about Rust:

It compiles and just works

I think this is an oversimplification, but hopefully the rest of this post better communicates why some of that feeling may occur.

This isn’t meant to be a "Rust is better than C" post. I’m just trying to explain where one area of Rust reduces cognitive load, leaving more mental capacity to focus on other areas of development.

C Implementation

Let’s jump right into a simple C function that one might encounter:

int get_thing(const char * name, thing_type * thing) {
    ...
}

This function will use name to find and fill out the passed in thing. It returns an int to communicate if the function failed or not.

Ignoring how we use name to find thing, we can dig into some of what this implementation may need:

The function should probably check for a null pointer to name
Not often practical to check, but name should be null terminated.
The function should probably check for a null pointer to thing
We may want to memset or have a default for thing even if the function errors out.

These items are some sharp edges of C, for good and bad.

One could argue against name as a const char * and that this should use an enumeration. The example is meant to capture common inputs. Strings are seen in many C APIs.

Rust Implementation

Translating more or less one-for-one into Rust, we get:

fn get_thing(name: &str, thing: &mut ThingType) -> u32 {
    ...
}

We’ll break down some of this signature for those unfamiliar with rust.

name: &str is an argument named name with a type of &str. &str is the common Rust type for string references. It carries a pointer and a length, so no null terminator is needed. The compiler will ensure that the &str points to valid memory and has a valid length.

thing: &mut ThingType is an argument named thing. It is a mutable reference to a ThingType. By default in Rust, all variables and arguments are immutable, meaning they cannot be modified. When one wants to modify a variable or argument, they must use the mut keyword. One can think of it as the opposite of const in C. The compiler will ensure that thing is not a null pointer.

-> u32 is the return type of the function. It is an unsigned 32 bit integer.
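To make the &str point concrete, here is a minimal sketch. The describe helper is hypothetical and exists only for illustration. It shows that a &str already knows its own length, so a sub-slice can be borrowed without copying or terminating anything.

```rust
// Hypothetical helper, only for illustration: reports what a &str carries.
fn describe(name: &str) -> (usize, bool) {
    // A &str is a pointer plus a byte length, so no terminator scan is needed.
    (name.len(), name.is_empty())
}

fn main() {
    let full = "fish and chips";
    // Borrow the first four bytes; no copy and no null terminator involved.
    let name = &full[..4];
    let (len, empty) = describe(name);
    println!("{name} has {len} bytes, empty: {empty}");
}
```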

Comparing this to the C function, we can see there are fewer gotchas.

name cannot be a null pointer. The compiler will ensure this.
The compiler will ensure that name has a valid length.
thing cannot be a null pointer. The compiler will ensure this.
thing will have to be initialized prior to the function call. The compiler will ensure this. Requiring the caller to initialize could be a drawback, as it puts more burden on the caller and could hurt performance by setting values that will be overwritten in the get_thing call.
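A minimal runnable sketch of this signature might look like the following. The ThingType definition, the lookup logic, and the error code 1 are all assumptions made up for illustration; the post does not specify them.

```rust
// Hypothetical stand-in for the post's ThingType.
#[derive(Debug, Default, PartialEq)]
struct ThingType {
    id: u32,
}

// Returns 0 on success and a nonzero code on failure, mirroring the C style.
fn get_thing(name: &str, thing: &mut ThingType) -> u32 {
    match name {
        "fish" => {
            thing.id = 7;
            0
        }
        _ => 1, // made-up error code: name not found
    }
}

fn main() {
    // The caller must build a valid ThingType before borrowing it mutably.
    let mut thing = ThingType::default();
    let status = get_thing("fish", &mut thing);
    println!("status = {status}, thing = {thing:?}");
}
```

Note that there is no null check anywhere in the body, and the caller cannot skip the initialization step: passing a reference to an uninitialized variable is rejected at compile time.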

Using the Result Type

The function could be further updated to follow common rust idioms.

Rust code often uses the Result<T, E> type for error handling. This type is so ingrained in the language that there is a dedicated question mark operator, ?, which makes working with Results ergonomic.

Coming from C, the Result type took me a bit to fully appreciate. I’ll attempt to communicate its usage. If we update the Rust function to use the Result type, we get:

fn get_thing(name: &str) -> Result<ThingType, u32> {
    ...
}

This function will return either a ThingType or a u32. This means that when there is an error, the caller will only be able to see a u32 value; the compiler will enforce this. If the function succeeds, it will only return a ThingType. The caller no longer needs to inspect a u32 status code, because the presence of the ThingType tells them it succeeded.

An example usage is:

// Assumes ThingType implements Display so that {thing} can be printed.
match get_thing("fish") {
    Ok(thing) => println!("The thing is {thing}"),
    Err(value) => println!("The error code was {value}"),
}

The match expression is similar to a C switch statement; however, the compiler will ensure that you have logic for both the Ok and the Err branches.

The Result type has two states, Ok() and Err(). In this case, the Ok() state contains a ThingType, and the match expression gives us access to it via the variable that we named thing. We could have written Ok(foo) => ... and referenced it as foo. The Err() state contains the u32 error code, which we reference as value.
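For readers who want to run this, here is a self-contained sketch. The ThingType fields, its Display implementation, the lookup logic, the error code 1, and the report helper are all hypothetical and chosen only for illustration.

```rust
use std::fmt;

// Hypothetical stand-in for the post's ThingType.
struct ThingType {
    id: u32,
}

// Display is what lets a ThingType be printed with {thing}.
impl fmt::Display for ThingType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "thing #{}", self.id)
    }
}

fn get_thing(name: &str) -> Result<ThingType, u32> {
    if name == "fish" {
        Ok(ThingType { id: 7 })
    } else {
        Err(1) // made-up error code: name not found
    }
}

// Hypothetical helper: builds the message instead of printing it directly.
fn report(name: &str) -> String {
    match get_thing(name) {
        Ok(thing) => format!("The thing is {thing}"),
        Err(value) => format!("The error code was {value}"),
    }
}

fn main() {
    println!("{}", report("fish"));
    println!("{}", report("bird"));
}
```

Removing either arm of the match makes the program fail to compile, since the match would no longer cover every state of the Result.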

There are a few other ways to use the Result type, but with any of them the compiler will force you to acknowledge the two different states.
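As a rough sketch of some of those other ways, the following shows the ? operator, a fallback via unwrap_or, and if let. As before, ThingType, the lookup logic, the error code, and the thing_id helpers are assumptions made up for illustration.

```rust
// Hypothetical stand-in for the post's ThingType.
#[derive(Debug, PartialEq)]
struct ThingType {
    id: u32,
}

fn get_thing(name: &str) -> Result<ThingType, u32> {
    if name == "fish" {
        Ok(ThingType { id: 7 })
    } else {
        Err(1) // made-up error code: name not found
    }
}

// `?` returns early with the Err value; otherwise it unwraps the Ok value.
fn thing_id(name: &str) -> Result<u32, u32> {
    let thing = get_thing(name)?;
    Ok(thing.id)
}

// Fall back to a default value instead of propagating the error.
fn thing_id_or_zero(name: &str) -> u32 {
    get_thing(name).map(|thing| thing.id).unwrap_or(0)
}

fn main() {
    println!("{:?}", thing_id("fish"));
    println!("{:?}", thing_id("bird"));
    println!("{}", thing_id_or_zero("bird"));

    // `if let` handles only the success state and explicitly skips the error.
    if let Ok(thing) = get_thing("fish") {
        println!("found {thing:?}");
    }
}
```

In each case the ThingType is only reachable after the success state has been checked or a fallback has been chosen.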

The returning of the ThingType assumes that it’s not too large. One could return the result code and item in C by wrapping them in a struct, but this isn’t often done.

Summary

While this example function was simple, I think it is a common C API idiom, and hopefully it communicates how there are fewer concerns to look for when writing and reviewing the Rust version.

I intentionally left out discussion of ownership, borrowing, and the ability to use unsafe in Rust. While one can introduce similar gotchas in Rust using unsafe code, it isn’t the norm.

There are some software practices that can reduce the C gotchas, for example static analyzers or organizational coding policies about when and where null pointers are checked. These require extra work and diligence on top of the language itself. Static checkers can also be difficult to tune to the right signal-to-noise ratio.