Rust's Move by Default

11 Jun 2022

As a C developer coming to Rust, one thing that has taken me a while to get used to is Rust’s move behavior.

C developers often learn to avoid passing large structures around because doing so incurs a copy cost. This practice can lead to a general avoidance of pass by value except for primitive types (int, char, float, etc.).

We’ll go through a basic example of passing by value versus passing by pointer (or reference in Rust) and see how they differ in both the code written and the resulting assembly.

Below we have a struct which keeps a hash value and the number of times it’s been hashed. It’s a bit of a nonsensical toy idea, but should serve the purpose. We’re going to run three iterations of this hash for pass by value versus pass by pointer to see how it looks.

typedef struct {
    /// The number of times the data has been hashed 
    int instances;
    /// The current hash
    char hash[64];
} some_hash_type;

Pass By Value

For those newer to programming or not from a C (or C++) background, pass by value may be the most intuitive way to think about things. We take the value in, we modify it, and then we return the modified value.

some_hash_type hash_by_value(some_hash_type hash_tracker){
    int index;
    hash_tracker.instances++;
    index = hash_tracker.instances % sizeof(hash_tracker.hash);
    hash_tracker.hash[index]++;
    return hash_tracker;
}

Don’t try to understand the hash logic too much; it’s pretty nonsensical and is really just a fancy counter of its own. I didn’t want to have to pull in a library with a real hash function.

The operation of three iterations would look like:

some_hash_type hash_tracker;
memset(&hash_tracker, 0, sizeof(hash_tracker));
hash_tracker = hash_by_value(hash_by_value(hash_by_value(hash_tracker)));

Using the embed feature of Compiler Explorer we can see what this looks like and its assembly.

The important thing to notice is the movups instructions.

        movups  xmm0, xmmword ptr [rbp - 288]
        movups  xmm1, xmmword ptr [rbp - 272]
        movups  xmm2, xmmword ptr [rbp - 256]
        movups  xmm3, xmmword ptr [rbp - 240]
        movups  xmmword ptr [rax + 48], xmm3
        movups  xmmword ptr [rax + 32], xmm2
        movups  xmmword ptr [rax + 16], xmm1
        movups  xmmword ptr [rax], xmm0

It is here that we need to disambiguate the term move. In assembly, an instruction like movups copies data from one location to another. In languages like Rust and C++, move means that an item may stay in place in storage, but the original creator of the object no longer needs it and will not try to reuse the object’s storage location.

I’m not happy with that disambiguation of move, but the assembly instruction name deviates from the higher level language definition. That is probably why Rust prefers the term ownership.
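
To make the higher level meaning concrete, here is a minimal Rust sketch. The `Tracker` type and `consume` function are hypothetical names used only for illustration. Once the value has been moved into `consume`, the compiler rejects any further use of the original binding, regardless of whether any bytes were physically copied.

```rust
// Hypothetical type for illustration; it deliberately does not implement Copy.
struct Tracker {
    instances: u32,
}

// Takes ownership of the tracker and hands a modified one back.
fn consume(mut tracker: Tracker) -> Tracker {
    tracker.instances += 1;
    tracker
}

fn main() {
    let original = Tracker { instances: 0 };
    let updated = consume(original); // `original` is moved here

    // Uncommenting the next line fails to compile with error E0382,
    // because `original` was moved in the call above.
    // println!("{}", original.instances);

    println!("{}", updated.instances); // prints 1
}
```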

Each movups moves 128 bits (16 bytes). The eight above are four loads followed by four stores, so this one block copies 64 bytes of the structure, which is 68 bytes (just over 512 bits). Blocks like this surround the calls, so it looks like we’re copying the value into and out of each function invocation.
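
A quick way to double check the sizes involved is to ask the compiler. The sketch below assumes a Rust mirror of the C struct (the `#[repr(C)]` attribute and the `copied_bytes` helper are additions for illustration, not part of the original example) and works in bytes rather than bits.

```rust
use std::mem::size_of;

// Assumed Rust mirror of the C struct; #[repr(C)] asks for C layout rules.
#[allow(dead_code)]
#[repr(C)]
struct SomeHashType {
    instances: i32,
    hash: [u8; 64],
}

// One movups moves a 128-bit xmm register, which is 16 bytes.
const MOVUPS_BYTES: usize = 128 / 8;

// Hypothetical helper: bytes copied by a run of movups load/store pairs.
fn copied_bytes(load_store_pairs: usize) -> usize {
    load_store_pairs * MOVUPS_BYTES
}

fn main() {
    let size = size_of::<SomeHashType>();
    // prints "struct size: 68 bytes (544 bits)"
    println!("struct size: {} bytes ({} bits)", size, size * 8);
    // prints "4 load/store pairs copy 64 bytes"
    println!("4 load/store pairs copy {} bytes", copied_bytes(4));
}
```

The four bytes not covered by the xmm registers correspond to the `instances` field, which a compiler would typically copy with an ordinary mov.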

Pass By Pointer

As mentioned above, pass by value results in copying in order to take the value in and to return it. When using pass by pointer, one only needs to copy the storage location of the data, and that storage location is directly modified.

Often in C, functions that take a pointer return a boolean or integer result in order to communicate failures and return codes. Our toy example has no failures to report, so it simply returns void. A version of this hash logic might look like:

void hash_by_pointer(some_hash_type * hash_tracker){
    int index;
    hash_tracker->instances++;
    index = hash_tracker->instances % sizeof(hash_tracker->hash);
    hash_tracker->hash[index]++;
}

If one were to use this for 3 iterations it would look like:

some_hash_type hash_tracker;
memset(&hash_tracker, 0, sizeof(hash_tracker));
hash_by_pointer(&hash_tracker);
hash_by_pointer(&hash_tracker);
hash_by_pointer(&hash_tracker);

Since this returns void, we could make it look like the pass by value version by returning the pointer that was passed in.

Side note, I was a C developer for over a decade before I realized memcpy() returns the destination.

some_hash_type * hash_by_pointer(some_hash_type * hash_tracker){
    int index;
    hash_tracker->instances++;
    index = hash_tracker->instances % sizeof(hash_tracker->hash);
    hash_tracker->hash[index]++;
    return hash_tracker;
}

Then our operation could look almost like the pass by value version:

some_hash_type hash_tracker;
memset(&hash_tracker, 0, sizeof(hash_tracker));
hash_by_pointer(hash_by_pointer(hash_by_pointer(&hash_tracker)));

And the Compiler Explorer version of this:

Comparing

If you look at the assembly of the two versions there are a couple of differences.

Looking at just the function definitions we can see that the pass by value seems to be manipulating the stack a bit more to make room for the struct.

hash_by_value:                          # @hash_by_value
        push    rbp
        mov     rbp, rsp
        sub     rsp, 16
        mov     rax, rdi
        mov     qword ptr [rbp - 16], rax       # 8-byte Spill
        lea     rsi, [rbp + 16]

Where things really start to show up is when one looks at the usage of the functions in my_hasher() in the Compiler Explorer samples. The pass by pointer version has none of the movups instructions that were so prevalent in the pass by value version.

Rust Pass by Value (Move)

Let’s look at how Rust does this. When Rust does a pass by value, it’s normally considered a move of the value. Again, move here does not imply an assembly mov instruction.

Passing by value in Rust would look something like:

pub struct SomeHashType {
    instances: u32,
    hash: [u8; 64],
}

pub fn hash_by_value(mut hash_tracker: SomeHashType) -> SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] += 1;
    hash_tracker
}

pub fn my_hasher() -> SomeHashType {
    let hash_tracker = SomeHashType{ instances: 0, hash: [0; 64]};
    hash_by_value(hash_by_value(hash_by_value(hash_tracker)))
}

And the Compiler Explorer output is:

Looking at the assembly, there is a bit more of it overall. Rust is a memory safe language, and some of its safety checks are part of why it is not quite as fast as C by default. For example, this chunk panics if we exceed the bounds of the hash array:

.LBB0_5:
        mov     rdi, qword ptr [rsp + 8]
        lea     rdx, [rip + .L__unnamed_2]
        mov     rax, qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
        mov     esi, 64
        call    rax
        ud2

The important bit is here:

        call    qword ptr [rip + example::hash_by_value@GOTPCREL]
        lea     rdi, [rsp + 160]
        lea     rsi, [rsp + 232]
        call    qword ptr [rip + example::hash_by_value@GOTPCREL]
        mov     rdi, qword ptr [rsp + 8]
        lea     rsi, [rsp + 160]
        call    qword ptr [rip + example::hash_by_value@GOTPCREL]

You’ll notice there are no memcpy or movups between the function calls.

Rust Pass by Reference

Rust also allows one to pass by reference. This is very similar to C’s pass by pointer.

Pass by reference would look like:

pub struct SomeHashType {
    instances: u32,
    hash: [u8; 64],
}

pub fn hash_by_reference(hash_tracker: &mut SomeHashType) -> &mut SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] += 1;
    hash_tracker
}

pub fn my_hasher() -> SomeHashType {
    let mut hash_tracker = SomeHashType{ instances: 0, hash: [0; 64]};
    hash_by_reference(hash_by_reference(hash_by_reference(&mut hash_tracker)));
    hash_tracker
}

The Compiler Explorer output is:

There are a few differences between the Rust pass by value and the Rust pass by reference output, but for the most part they are fairly close.

One may notice that the pass by value version contains a memcpy that the pass by reference version omits. This is because these examples are compiled without optimizations. One can either compile with optimizations or change the pass by value example to the following to get rid of that memcpy:

pub fn my_hasher() -> SomeHashType {
    hash_by_value(hash_by_value(hash_by_value(SomeHashType{ instances: 0, hash: [0; 64]})))
}

The pass by reference version uses mov instructions for the argument memory location, while the pass by value version uses lea instructions, with a bit more indirection and a bit more loading cost. But this extra cost generally won’t come close to the cost of copying values around.

Summary

Passing by value in Rust does not have the same negative impact as passing by value in C. As such, the stigma that pass by value carries in C need not be carried over to Rust.

It does appear that at the micro level pass by reference in Rust may have fewer assembly instructions than pass by value. However, the two outputs are probably close enough to require one to profile with timings before trying to argue one way or another in the general case.
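
As a starting point for that kind of profiling, here is a rough timing sketch. It reuses the two functions from this post with two assumptions: the derives on the struct are added so the results can be compared, and `wrapping_add` replaces `+= 1` so that a long run does not overflow a `u8` and panic in a debug build. The `run_by_value` and `run_by_reference` helpers and the iteration count are hypothetical. A proper benchmark harness would give more trustworthy numbers; this only shows the shape of the comparison.

```rust
use std::hint::black_box;
use std::time::Instant;

// Derives are an addition so the two results can be compared and printed.
#[derive(Clone, Debug, PartialEq)]
pub struct SomeHashType {
    instances: u32,
    hash: [u8; 64],
}

pub fn hash_by_value(mut hash_tracker: SomeHashType) -> SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    // wrapping_add is an assumption: it avoids an overflow panic on long runs.
    hash_tracker.hash[index] = hash_tracker.hash[index].wrapping_add(1);
    hash_tracker
}

pub fn hash_by_reference(hash_tracker: &mut SomeHashType) -> &mut SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] = hash_tracker.hash[index].wrapping_add(1);
    hash_tracker
}

// Hypothetical driver: repeatedly moves the tracker in and out of the function.
fn run_by_value(iterations: u32) -> SomeHashType {
    let mut tracker = SomeHashType { instances: 0, hash: [0; 64] };
    for _ in 0..iterations {
        // black_box discourages the optimizer from eliding the call.
        tracker = hash_by_value(black_box(tracker));
    }
    tracker
}

// Hypothetical driver: repeatedly updates the tracker in place.
fn run_by_reference(iterations: u32) -> SomeHashType {
    let mut tracker = SomeHashType { instances: 0, hash: [0; 64] };
    for _ in 0..iterations {
        hash_by_reference(black_box(&mut tracker));
    }
    tracker
}

fn main() {
    const ITERATIONS: u32 = 1_000_000;

    let start = Instant::now();
    let by_value = run_by_value(ITERATIONS);
    let value_time = start.elapsed();

    let start = Instant::now();
    let by_reference = run_by_reference(ITERATIONS);
    let reference_time = start.elapsed();

    // Both approaches must end in the same state for the timing to mean anything.
    assert_eq!(by_value, by_reference);
    println!("by value:     {:?}", value_time);
    println!("by reference: {:?}", reference_time);
}
```

Timings will vary by machine and by optimization level, so it is worth running this both with and without optimizations before drawing conclusions.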

While it wasn’t discussed here, pass by value in Rust can provide some API benefits due to Rust’s ownership rules.

One may notice that all of these examples were done without compiler optimizations turned on. When optimizations are turned on for the Rust examples, the function calls get inlined, resulting in more assembly bloat. While LTO (link time optimization) may inline functions from other modules, I wanted to focus on the common case for non-local functions.
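
As one small illustration of such a benefit, a method that takes `self` by value can make misuse a compile error. In the sketch below, `new`, `hash_once`, and `finish` are hypothetical methods that are not part of the examples above. Because `finish` consumes the tracker, nothing can keep hashing it afterwards.

```rust
pub struct SomeHashType {
    instances: u32,
    hash: [u8; 64],
}

impl SomeHashType {
    // Hypothetical constructor for the zeroed starting state.
    pub fn new() -> Self {
        SomeHashType { instances: 0, hash: [0; 64] }
    }

    // Same logic as hash_by_value, written as a method so calls can be chained.
    pub fn hash_once(mut self) -> Self {
        self.instances += 1;
        let index = self.instances as usize % self.hash.len();
        self.hash[index] += 1;
        self
    }

    // Hypothetical: consumes the tracker and returns its final contents.
    pub fn finish(self) -> (u32, [u8; 64]) {
        (self.instances, self.hash)
    }
}

fn main() {
    let tracker = SomeHashType::new().hash_once().hash_once().hash_once();
    let (instances, hash) = tracker.finish();

    // Uncommenting the next line fails to compile, because `tracker`
    // was moved into `finish` above.
    // tracker.hash_once();

    println!("{} {:?}", instances, &hash[..4]); // prints 3 [0, 1, 1, 1]
}
```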