avatarAnkit Tanna

Summary

The article provides an in-depth explanation of memory management in Rust, focusing on the differences between stack memory usage by variables and vectors, and how vectors are stored in heap memory due to their dynamic size.

Abstract

The Rust Programming Language article delves into the intricacies of memory allocation, contrasting the stack memory usage of fixed-size variables with the heap memory allocation required for vectors, which have an unknown size at compile time. It illustrates how Rust handles memory for variables with examples, showing how values are pushed onto and popped from the stack during execution. The article emphasizes that Rust does not explicitly clear memory but overrides it, which mirrors CPU register behavior. For functions returning values, Rust reserves stack space for the return value. However, when it comes to vectors, Rust uses a VecMetadata struct to store metadata about the vector on the stack, while the actual data is held in the heap. The metadata includes the first element's index, the length of the vector, and its capacity. The article also discusses the performance implications of vector resizing and suggests using the with_capacity method to preallocate memory for vectors when their size is known.

Opinions

  • The author suggests that understanding stack and heap memory is crucial for Rust developers.
  • The article implies that Rust's approach to memory management, particularly its lack of explicit memory clearing, is efficient and aligns with low-level CPU behavior.
  • The author advocates for preallocating vector capacity when possible to avoid performance penalties associated with memory reallocation.
  • The article conveys that Rust's memory management system is designed to be both safe and performant, leveraging stack memory for fixed-size data and heap memory for dynamic data structures like vectors.
The Rust Programming Language

The Rust Programming Language — Vectors — Stack Memory vs Heap Memory

Memory usage under the hood by variables vs vectors. Understand stack and heap memory once and for all.

As I said in the previous article, the usage of memory is different for arrays (and other variables) vs the vectors. This is fundamentally due to the nature of vectors being of an unknown size during the runtime of a program.

We’ll deep dive into usage of memory with the help of graphical examples below. To summarise something that we learnt a few articles ago here, memory is representation of series of bits in a sequence and we make sense out of it by clubbing those bits to a series of 1 byte (8 bits), 2 bytes (16 bits) and etc and generating a value out of those bits. See the image below:

Memory representation of bits to bytes

Let’s take an example of below code snippet:

fn increment_decrement(num: u8) {
    print_nums(num + 1, num - 1);
}

fn print_nums(x: u8, y: u8) {
    println!("{} {}", x, y);
}

increment_decrement(42);

The code is fairly simple. We have a function called increment_decrement, which as the name suggests, takes a number as an input and prints the incremented value and decremented value using print_nums method. There is a lot of memory usage going on here. There are a bunch of numbers being passed along the functions and different methods, starting with num. New numbers, incremented and decremented numbers, are being generated using num+1 and num-1. And then finally these variables are passed to print_nums via x and y.

Let’s run this code line by line and see what impact does it have in the memory. Remember that our memory, when the program runs, is like a stack. As and when a program encounters a variable, it pushes the variable to the stack and stores it in the memory.

Initial state looks like below:

Initial state of the memory

So now, when the line #9 is executed, Rust sees that that the function takes in 42 as an input and hence it decides to store 42 in the stack of the memory.

increment_decrement function execution started

So after line #9 is executed, it takes us to line #1 for the execution of the function. But now, the function needs to determine what num is. In an absolute abstract way, how does the function determine what num is? It goes to the stack and says, please get me the value on the stack_length-1 index in the stack memory. And it returns 42.

Execution of line #2

So now, num is to be passed to print_nums after increment and decrement operations on num which is 42. The evaluation results in 43 and 41. Now we need to store 43 and 41 and our stack_length becomes 3.

Execution of line #5

Now that after line #2 executes, we will now have to move to line #5 function print_nums which has a responsibility of evaluating x and y. Again, it can fetch x and y from the stack memory using the stack[stack_length-2] and stack[stack_length-1] respectively.

Now the println!() picks up x and y and prints them.

Execution of line #6

After this line, we do not have any purpose of 41 and 43. Hence we need to remove it from the memory? Incorrect, Rust Language does not go out of the way and erase the stuff from the memory. It just ignores them and makes the stack_length=1. The memory bits are still filled with 43 and 41. So it looks like something below:

After execution of line #6

As you notices, the memory bits for 43 and 41 are simply ignored. Subsequently, when the entire program completes the execution, there is no purpose of 42. So it looks like below:

After entire program has completed it’s execution

So what happens when we call the same program with a different number, say 10:

Calling the same function with a different number.
Calling the same function with a different number.
Calling the same function with a different number.
Calling the same function with a different number.
Calling the same function with a different number.
Calling the same function with a different number.

A function which returns a value:

What happens if we have a function that returns a value?

Take an example of below code snippet:

fn double_and_return(num: u8) -> u8 {
    return num * 2;
}

let x = double_and_return(30);

In the above case, double_and_return, it returns a u8. When Rust knows that the function has to return a u8 value, it goes and reserves the first index in the memory for the return value. So, the next element in the stack in our case would be 30.

Initial state before the code starts getting executed
When Line #5 is executed
When line #2 is executed
When function finally returns the value

As you can see, when the function finally returns a value, the yellow reserved space in the stack memory is filled with the return value, which is 60 in our case. Rust does not explicitly go out of the way to wipe memory in the stack. It just keeps overriding it. That’s what actually happens at the register level in the CPU as well. The bits are flipped from either 0 or 1 but never wiped off once it’s flipped.

You can imagin what would happen with a tuple, struct or an array. We would register the amount of space in the stack memory because we know that they are of finite length.

So, if a function returns a tuple of 3 elements, we go ahead and reserve 3 spaces for it in the stack memory.

fn double_thrice(num: u8) -> (u8, u8, u8) {
    return (num * 2, num * 2, num * 2);
}

let x = double_thrice(30);

So in the above case, our reserved space would look something like below:

When a function returns a tuple (or struct or array)

Here comes the pain, Vectors — vec![]:

Let’s take an example of below code snippet. A function is returning a Vector of u8.

fn double_and_return(x: u8): Vec<u8> {
    //...
}

What should be the amount of space recerved on the stack when a Vector is returned? Is it finite/infinite? It’s basically unknown. And we can’t deal with a situation when multiple functions return Vectors just using stack.

To be returnable to a stack, size must be known at the compile time. In any language.

So, the actual question is What can we actually return on the stack which is able to represent values in a vector? We can return a struct which holds the metadata about the vector. So we would not be returning all the elements inside the vector but some data which represents the values inside the Vector. We will call it VectMetadata. Since structs fixed in size during the entire lifecycle of the program, they can be returned on the stack of the memoy. Take a look at how a Vector metadata struct looks like.

struct VectMetadata {
    first_elem_index: usize
    length: usize;
    capacity: usize
}

Each and every vector holds a meta information about itself which allows us to accurately map it to a space in memory. This meta information about the vectors is absolutely crucial for Rust to determine the elements of a Vector. The 3 properties inside the struct are as follows:

first_elem_index is like a starting point of your vector in the heap memory. length is for how long is the vector. capacity is like the amount of space allocated for any future elements that will be added onto the vector.

The type of each of the property is known which is usize and hence we know that it is a finite size. So it can be stored on stack. So we’ve figured out where to store the metadata, but where to the actual u8 elements of the Vec go? It goes into the heap. Along with stack memory, a program also gets access to the Heap Memory.

A heap memory is a separate piece of memory which has it’s own sequence of bytes. So, let’s take for example our function returns a vector of [1, 2, 3]. The vector metadata is first_elem_index=2, length=3 and capacity=24. This means that, first element of the vector, i.e. 1 would be placed in the 2nd index of the heap memory. A length of 3 indicates that vector spans from 2nd index through to 4th index. And capacity is 24 bits.

So let’s see how does the vector in the memory look like when we execute a program.

Using heap for storing vectors

What if we push one more element to the vector. Now, our vector is [1, 2, 3, 4].

After pushing one more element in the vector

If we are lucky enough, we’ll get a new slot in the heap. We’ll simply update the length of the VectMetadata and the capacity and add the element to heap.

What if we push one more element to the vector, [1, 2, 3, 4, 5], but we are not lucky enough and don’t get a slot in the heap memory.

We push one more element to the vector but do not get the slot

The length does not get immediately updated. The capacity does not get immediately updated. Then comes the Memory Management System(or book management system) into picture. Memory Management System has to now go and find a slot in the heap memory which has 5 consecutive slots open. Once it finds a slot which has 5 consecutive slots available for storing the vector elements in the sequence, it goes and copies 1, 2, 3 and 4, then it updates the length to 5 and capacity to 40 and after that it goes and adds 5 to the vector. This can be really performance intensive. So, Rust suggests, if you are aware of the capacity of the vector, please do mention it. This will help in elevating the performance bottlenecks of a program running on a machine. Rust also provides us with ::with_capacity(5) on Vectors so that we can go ahead and use that to define a vector with a specific capacity and it can be stored on the stack memory rather than heap memory.

I hope you liked this article about Stack Memory vs Heap Memory. Please share your feedback in the comments section.

You can subscribe to my newsletter about The Rust Programming Language here. You can read about all the articles in this series here.

Rust
Memory Improvement
Performance
Web Development
Rust Programming Language
Recommended from ReadMedium