Rust's Move by Default
As a C developer coming to Rust, one thing that has taken me a while to get used to is Rust’s move behavior.
C developers often learn to avoid passing large structures around because doing so incurs a copy cost. This practice can lead to a general avoidance of pass by value except for primitive types (int, char, float, etc.).
We’ll go through a basic example of passing by value versus passing by pointer (or by reference in Rust) and see how they differ in both the code written and the resulting assembly.
Below we have a struct which keeps a hash value and the number of times it’s been hashed. It’s a bit of a nonsensical toy idea, but should serve the purpose. We’re going to run three iterations of this hash for pass by value versus pass by pointer to see how it looks.
typedef struct {
    /// The number of times the data has been hashed
    int instances;
    /// The current hash
    char hash[64];
} some_hash_type;
Pass By Value
For those newer to programming or not from a C (or C++) background, pass by value may be the most intuitive way to think about things. We take the value in, modify it, and then return the modified value.
some_hash_type hash_by_value(some_hash_type hash_tracker){
    int index;
    hash_tracker.instances++;
    index = hash_tracker.instances % sizeof(hash_tracker.hash);
    hash_tracker.hash[index]++;
    return hash_tracker;
}
Don’t try to understand the hash logic too much; it’s pretty nonsensical and is really just a fancy counter of its own. I didn’t want to have to pull in a library with a real hash function.
The operation of three iterations would look like:
some_hash_type hash_tracker;
memset(&hash_tracker, 0, sizeof(hash_tracker));
hash_tracker = hash_by_value(hash_by_value(hash_by_value(hash_tracker)));
Using the embed feature of Compiler Explorer we can see what this looks like along with its assembly.
The important thing to notice is the movups instructions.
movups xmm0, xmmword ptr [rbp - 288]
movups xmm1, xmmword ptr [rbp - 272]
movups xmm2, xmmword ptr [rbp - 256]
movups xmm3, xmmword ptr [rbp - 240]
movups xmmword ptr [rax + 48], xmm3
movups xmmword ptr [rax + 32], xmm2
movups xmmword ptr [rax + 16], xmm1
movups xmmword ptr [rax], xmm0
It is here that we need to disambiguate the term move. For assembly instructions like movups, a move copies data from one location to another. For languages like Rust and C++, a move means that an item may stay in place in storage, but the original owner of the object no longer needs it and will not try to reuse the object’s storage location.
I’m not happy with that disambiguation of move, but the assembly instruction name deviates from the higher-level language definition. That is probably why Rust prefers the term ownership.
The movups
are 128 bit copies. We’re doing 8 of these or copying 1024 bytes.
The example structure happens to be just over 512. So it looks like we’re
copying the value into and out of each function invocation.
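To make the size arithmetic concrete, here is a minimal sketch in Rust that checks the numbers involved. It assumes the Rust struct shown later in the post has the same layout size as the C one; `MOVUPS_BYTES` and `copy_breakdown` are hypothetical names introduced only for this illustration.

```rust
use std::mem::size_of;

/// Mirrors the C struct: a 4-byte counter plus a 64-byte hash buffer.
pub struct SomeHashType {
    pub instances: u32,
    pub hash: [u8; 64],
}

/// One movups transfers a 128-bit XMM register, i.e. 16 bytes.
/// (Hypothetical constant, named here only for the illustration.)
pub const MOVUPS_BYTES: usize = 128 / 8;

/// Returns (whole 16-byte chunks, leftover bytes) needed to copy the struct once.
pub fn copy_breakdown() -> (usize, usize) {
    let size = size_of::<SomeHashType>();
    (size / MOVUPS_BYTES, size % MOVUPS_BYTES)
}
```

The struct is 68 bytes, so `copy_breakdown()` gives four full 16-byte chunks plus 4 leftover bytes. That lines up with the four load/store pairs of movups in the listing above, with the last 4 bytes presumably handled by a smaller move.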
Pass By Pointer
As mentioned above, pass by value results in copying in order to take the value in and to return it. When using pass by pointer, one only needs to copy the storage location of the data, and that storage location is directly modified.
Often in C, functions that take a pointer return a boolean or integer result in order to communicate failures and return codes. A version of this hash logic might look like:
void hash_by_pointer(some_hash_type * hash_tracker){
    int index;
    hash_tracker->instances++;
    index = hash_tracker->instances % sizeof(hash_tracker->hash);
    hash_tracker->hash[index]++;
}
If one were to use this for 3 iterations it would look like:
some_hash_type hash_tracker;
memset(&hash_tracker, 0, sizeof(hash_tracker));
hash_by_pointer(&hash_tracker);
hash_by_pointer(&hash_tracker);
hash_by_pointer(&hash_tracker);
Since this returns void, we could make it look like the pass by value version and return the pointer passed in.
Side note: I was a C developer for over a decade before I realized memcpy() returns the destination.
some_hash_type * hash_by_pointer(some_hash_type * hash_tracker){
    int index;
    hash_tracker->instances++;
    index = hash_tracker->instances % sizeof(hash_tracker->hash);
    hash_tracker->hash[index]++;
    return hash_tracker;
}
Then our operation could look almost like the pass by value version:
some_hash_type hash_tracker;
memset(&hash_tracker, 0, sizeof(hash_tracker));
hash_by_pointer(hash_by_pointer(hash_by_pointer(&hash_tracker)));
And the Compiler Explorer version of this:
Comparing
If you look at the assembly of the two versions there are a couple of differences.
Looking at just the function definitions we can see that the pass by value seems to be manipulating the stack a bit more to make room for the struct.
hash_by_value: # @hash_by_value
push rbp
mov rbp, rsp
sub rsp, 16
mov rax, rdi
mov qword ptr [rbp - 16], rax # 8-byte Spill
lea rsi, [rbp + 16]
Where things really start to show up is when one looks at the usage of the functions in my_hasher() in the Compiler Explorer samples. The pass by pointer version has none of the movups instructions that were so prevalent in the pass by value version.
Rust Pass by Value (Move)
Let’s look at how Rust does this. When Rust does a pass by value, it’s normally considered a move of the value. Again, move here does not imply an assembly mov instruction.
Passing by value in Rust would look something like:
pub struct SomeHashType {
    instances: u32,
    hash: [u8; 64],
}
pub fn hash_by_value(mut hash_tracker: SomeHashType) -> SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] += 1;
    hash_tracker
}
pub fn my_hasher() -> SomeHashType {
    // No `mut` needed here: the value is moved into the call, not modified in place.
    let hash_tracker = SomeHashType{ instances: 0, hash: [0; 64]};
    hash_by_value(hash_by_value(hash_by_value(hash_tracker)))
}
And the Compiler Explorer output:
Looking at the assembly, there is a bit more of it overall. There are reasons Rust is both a memory-safe language and not quite as fast as C by default. For example, this chunk is the panic path taken if an index exceeds the bounds of the hash array:
.LBB0_5:
mov rdi, qword ptr [rsp + 8]
lea rdx, [rip + .L__unnamed_2]
mov rax, qword ptr [rip + core::panicking::panic_bounds_check@GOTPCREL]
mov esi, 64
call rax
ud2
The important bit is here:
call qword ptr [rip + example::hash_by_value@GOTPCREL]
lea rdi, [rsp + 160]
lea rsi, [rsp + 232]
call qword ptr [rip + example::hash_by_value@GOTPCREL]
mov rdi, qword ptr [rsp + 8]
lea rsi, [rsp + 160]
call qword ptr [rip + example::hash_by_value@GOTPCREL]
You’ll notice there are no memcpy or movups instructions between the function calls.
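As a small sketch of what the move means at the source level, the snippet below spells out the chained calls one binding at a time. `three_rounds` is a hypothetical helper name, and the fields are made `pub` only so the result can be inspected; the rest follows the example above.

```rust
pub struct SomeHashType {
    pub instances: u32,
    pub hash: [u8; 64],
}

pub fn hash_by_value(mut hash_tracker: SomeHashType) -> SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] += 1;
    hash_tracker
}

/// Runs three by-value rounds, like `my_hasher()`, but names the intermediate values.
pub fn three_rounds() -> SomeHashType {
    let first = SomeHashType { instances: 0, hash: [0; 64] };
    let second = hash_by_value(first);
    // `first` has been moved into the call above. Uncommenting the next
    // line is a compile error (E0382), because the value no longer
    // belongs to this binding.
    // let oops = first.instances;
    hash_by_value(hash_by_value(second))
}
```

After three rounds the counter is 3 and slots 1, 2 and 3 of the hash array have each been bumped once, the same result the C pass by value version produces. The difference is that the compiler tracks who owns the value, so the caller cannot touch the old binding once it has been handed off.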
Rust Pass by Reference
Rust allows one to pass by reference. This is very similar to C’s pass by pointer.
Pass by reference would look like:
pub struct SomeHashType {
    instances: u32,
    hash: [u8; 64],
}
pub fn hash_by_reference(hash_tracker: &mut SomeHashType) -> &mut SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] += 1;
    hash_tracker
}
pub fn my_hasher() -> SomeHashType {
    let mut hash_tracker = SomeHashType{ instances: 0, hash: [0; 64]};
    hash_by_reference(hash_by_reference(hash_by_reference(&mut hash_tracker)));
    hash_tracker
}
The Compiler Explorer output is:
There are a few differences between the Rust pass by value and the Rust pass by reference versions, but for the most part they are fairly close.
One may notice the omission of a memcpy in the pass by reference version. The memcpy shows up in the pass by value version because these examples are compiled without optimizations. One can compile with optimizations, or change the pass by value example to the following, to get rid of the memcpy that isn’t present in the pass by reference version.
pub fn my_hasher() -> SomeHashType {
    hash_by_value(hash_by_value(hash_by_value(SomeHashType{ instances: 0, hash: [0; 64]})))
}
The pass by reference version uses mov instructions to pass along the argument’s memory location, while the pass by value version uses lea instructions to compute the addresses of the stack slots involved, which looks like a bit more bookkeeping. But this extra cost generally won’t come close to the cost of copying values around.
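As a sanity check that the two styles are interchangeable in terms of results, the sketch below runs both side by side. `by_value_result` and `by_reference_result` are hypothetical helper names, and the `Debug` and `PartialEq` derives are added only so the two results can be compared.

```rust
#[derive(Debug, PartialEq)]
pub struct SomeHashType {
    pub instances: u32,
    pub hash: [u8; 64],
}

pub fn hash_by_value(mut hash_tracker: SomeHashType) -> SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] += 1;
    hash_tracker
}

pub fn hash_by_reference(hash_tracker: &mut SomeHashType) -> &mut SomeHashType {
    hash_tracker.instances += 1;
    let index = hash_tracker.instances as usize % hash_tracker.hash.len();
    hash_tracker.hash[index] += 1;
    hash_tracker
}

/// Three rounds, moving the value through each call.
pub fn by_value_result() -> SomeHashType {
    hash_by_value(hash_by_value(hash_by_value(SomeHashType {
        instances: 0,
        hash: [0; 64],
    })))
}

/// Three rounds, mutating one value in place through a reference.
pub fn by_reference_result() -> SomeHashType {
    let mut hash_tracker = SomeHashType { instances: 0, hash: [0; 64] };
    hash_by_reference(hash_by_reference(hash_by_reference(&mut hash_tracker)));
    hash_tracker
}
```

Both helpers end with the same counter and the same hash contents, so choosing between them is a question of API shape and generated code, not behavior.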
Summary
Passing by value in Rust does not have the same negative impact as passing by value in C. As such, the C stigma against pass by value should not be carried over to Rust.
It does appear that, at the micro level, pass by reference in Rust may produce fewer assembly instructions than pass by value. However, the two outputs are probably close enough that one would need to profile with timings before arguing one way or the other in the general case.
While it wasn’t discussed here, pass by value in Rust can provide some API benefits due to Rust’s ownership rules.
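One way such an API benefit might look, as a sketch only: a method that takes `self` by value consumes the object, so the compiler rejects any later use of it. The methods `new`, `update` and `finalize`, and the helper `finalize_after_two_rounds`, are hypothetical names made up for this illustration and are not part of the examples above.

```rust
pub struct SomeHashType {
    instances: u32,
    hash: [u8; 64],
}

impl SomeHashType {
    pub fn new() -> Self {
        SomeHashType { instances: 0, hash: [0; 64] }
    }

    /// Takes and returns ownership, so calls can be chained.
    pub fn update(mut self) -> Self {
        self.instances += 1;
        let index = self.instances as usize % self.hash.len();
        self.hash[index] += 1;
        self
    }

    /// Consumes the tracker; nothing can call `update` on it afterwards.
    pub fn finalize(self) -> [u8; 64] {
        self.hash
    }
}

pub fn finalize_after_two_rounds() -> [u8; 64] {
    let tracker = SomeHashType::new().update().update();
    let digest = tracker.finalize();
    // `tracker` was moved into `finalize`, so this would not compile:
    // let again = tracker.update();
    digest
}
```

In C, nothing stops a caller from continuing to use a struct after handing it to a "finish" function. Here the misuse is caught at compile time, without any runtime flag or check.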
One may notice that all of these examples were compiled without compiler optimizations turned on. When optimizations are turned on for the Rust examples, the function calls get inlined, resulting in more assembly bloat. While LTO (link-time optimization) may inline functions from other modules, I wanted to focus on the common case for non-local functions.