Async Rust Explained - Pinning

One of Rust's core strengths is its emphasis on memory safety and performance. However, certain programming patterns, such as self-referential structures and asynchronous programming, require careful handling to maintain safety guarantees. This is where pinning comes into play. In this blog post, we'll explore pinning in Rust in detail, understand why it's essential, especially in asynchronous contexts with the Tokio runtime, and learn how to use it correctly.

What Is Moving?

In Rust, moving refers to transferring ownership of data from one variable to another. When you move a value, the original variable becomes invalid, and the new variable owns the data.

let x = String::from("Hello");
let y = x; // x is now invalid; y owns the String

Moving is crucial for memory safety, preventing issues like double-free errors. However, a move is a bitwise copy to a new location, so it can change the value's memory address. That is a problem for anything that relies on a stable location, such as self-referential structs or futures that hold references into their own state.
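To make the address change concrete, here is a minimal sketch that records addresses before and after moving a String into a Box. The function name move_report is made up for this illustration. It assumes nothing beyond the standard library, and it shows that the String handle ends up at a new address while the heap buffer holding the text stays where it was.

```rust
/// Returns (handle_address_changed, heap_buffer_address_unchanged).
fn move_report() -> (bool, bool) {
    let x = String::from("Hello");
    // Address of the String handle itself (pointer, length, capacity) on the stack.
    let handle_before = &x as *const String as usize;
    // Address of the heap buffer that holds the bytes of "Hello".
    let buffer_before = x.as_ptr() as usize;

    // Moving `x` into a Box copies the handle from the stack to the heap.
    // The text buffer is not copied.
    let y = Box::new(x);
    let handle_after = &*y as *const String as usize;
    let buffer_after = y.as_str().as_ptr() as usize;

    (handle_before != handle_after, buffer_before == buffer_after)
}

fn main() {
    let (handle_moved, buffer_stayed) = move_report();
    println!("handle moved: {handle_moved}, buffer stayed: {buffer_stayed}");
}
```

Any pointer that had been taken to the handle before the move would now point at stale stack memory, which is exactly the situation pinning is designed to rule out.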

What Is Pinning?

Pinning in Rust guarantees that a value will not be moved out of its current memory location for as long as it exists, unless its type implements Unpin (covered below). It is essential for data that must stay at the same address after it has been set up.

Pinning is expressed with the Pin type, which wraps a pointer type (like &mut T or Box<T>) and restricts access to the pointee so that safe code cannot move it.

use std::pin::Pin;

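let value = Box::new(42); // i32 is Unpin, so the safe constructor Pin::new is allowed
let pinned_value = Pin::new(value);

Once a value whose type is not Unpin has been pinned, safe code can no longer move it. This is what makes self-referential structures and compiler-generated futures sound. For an Unpin type such as i32, pinning has no effect and the value can still be moved freely.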

The `Pin` and `Unpin` Traits

`Pin`

The Pin<P> type is a wrapper around a pointer type P. It provides the guarantee that the data it points to will not be moved in memory.

pub struct Pin<P> {
    pointer: P,
}

`Unpin`

The Unpin trait is an auto trait (similar to Send and Sync) that indicates whether it's safe to move a pinned value. Most types in Rust are Unpin by default, meaning they can be moved even if they are pinned.

pub auto trait Unpin {}

To opt a type out of Unpin, give it a field that is itself not Unpin. The standard library provides PhantomPinned for exactly this purpose.

Interaction Between `Pin` and `Unpin`

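If a type is Unpin, it can be safely moved even when pinned; Pin<&mut T> hands out a plain &mut T through DerefMut and Pin::get_mut.
If a type is not Unpin, safe code cannot move it after it has been pinned; obtaining a &mut T requires unsafe (Pin::get_unchecked_mut).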
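The first rule can be sketched with a plain integer. The function name unpin_round_trip is hypothetical. The sketch pins an i32 on the stack, mutates it through the pin, and then takes the value back out, all in safe code, because i32 is Unpin.

```rust
use std::pin::Pin;

/// Pins an i32, writes through the pin, then moves the value back out.
fn unpin_round_trip() -> i32 {
    let mut value = 42;
    // Pin::new is safe here because i32: Unpin.
    let mut pinned: Pin<&mut i32> = Pin::new(&mut value);

    // DerefMut is implemented for Pin<&mut T> when T: Unpin.
    *pinned = 7;

    // Pin::into_inner gives back the plain &mut i32, again only for Unpin targets.
    let inner: &mut i32 = Pin::into_inner(pinned);

    // Moving the value out of its location is allowed: the pin imposed no restriction.
    std::mem::replace(inner, 0)
}

fn main() {
    println!("value taken out of the pin: {}", unpin_round_trip());
}
```

For a type that is not Unpin, neither the assignment through the pin nor Pin::into_inner would compile.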

`Box::pin` vs `Pin::new(Box::new(...))`

When you want to pin a value on the heap, you have two options:

Using `Box::pin`

let pinned_value = Box::pin(MyStruct { /* fields */ });

Using `Pin::new` with `Box::new`

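// Compiles only if MyStruct: Unpin, because Pin::new requires an Unpin target
let boxed_value = Box::new(MyStruct { /* fields */ });
let pinned_value = Pin::new(boxed_value);

Both approaches produce a Pin<Box<MyStruct>>, but they are not interchangeable. Pin::new is only available when the pointee is Unpin, so it cannot pin a type that actually needs pinning. Box::pin works for any type and is the idiomatic choice; an existing Box<T> can be converted with Box::into_pin.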
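The difference between the constructors shows up as soon as the type is not Unpin. The following is a small sketch with two illustrative types, Movable and Immovable, which are assumptions made for this example and not part of any library. The line that would fail to compile is left commented out.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical types for illustration only.
struct Movable {
    id: u32,
}

struct Immovable {
    id: u32,
    _marker: PhantomPinned, // makes Immovable !Unpin
}

fn build() -> (Pin<Box<Movable>>, Pin<Box<Immovable>>, Pin<Box<Immovable>>) {
    // Movable is Unpin, so Pin::new accepts the Box.
    let a = Pin::new(Box::new(Movable { id: 1 }));

    // Box::pin works for any type, including !Unpin ones.
    let b = Box::pin(Immovable { id: 2, _marker: PhantomPinned });

    // An existing Box can be converted with Box::into_pin.
    let c = Box::into_pin(Box::new(Immovable { id: 3, _marker: PhantomPinned }));

    // Does not compile: Pin::new requires the pointee to be Unpin.
    // let d = Pin::new(Box::new(Immovable { id: 4, _marker: PhantomPinned }));

    (a, b, c)
}

fn main() {
    let (a, b, c) = build();
    println!("ids: {} {} {}", a.id, b.id, c.id);
}
```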

What Is `PhantomPinned` For?

PhantomPinned is a marker type used to prevent a type from automatically implementing Unpin. Since most types are Unpin by default, you need a way to indicate that your type should not be Unpin.

use std::marker::PhantomPinned;

struct MyStruct {
    // fields
    _marker: PhantomPinned,
}

By including PhantomPinned as a field, you ensure that MyStruct does not implement Unpin, and thus cannot be moved once pinned.

Pinning to Stack and Heap

Pinning on the Heap

Pinning on the heap is straightforward using Box::pin:

let pinned_value = Box::pin(MyStruct { /* fields */ });

Heap-pinned values are useful when the value must outlive the current scope or be handed to another owner, such as a spawned task: moving the Pin<Box<T>> moves only the pointer, not the pinned value.

Pinning on the Stack

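For Unpin types, you can pin a value on the stack with Pin::new(&mut value). For types that are not Unpin, that call does not compile; use the std::pin::pin! macro (stable since Rust 1.68) instead:

use std::pin::{pin, Pin};

let value = MyStruct { /* fields */ };
let pinned_value: Pin<&mut MyStruct> = pin!(value);

However, stack-pinned values are limited by the stack frame's lifetime: the Pin<&mut T> cannot be returned from the function or stored anywhere that outlives the current scope.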
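As a sketch of stack pinning for a type that is not Unpin, the example below uses the pin! macro and a method that takes self: Pin<&mut Self>. The Counter type and its bump method are invented for this illustration.

```rust
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};

// Hypothetical !Unpin type for illustration.
struct Counter {
    count: u32,
    _marker: PhantomPinned,
}

impl Counter {
    /// Increments the counter through a pinned reference.
    fn bump(self: Pin<&mut Self>) -> u32 {
        // SAFETY: we only modify an integer field in place and never move
        // the Counter out of the reference.
        let this = unsafe { self.get_unchecked_mut() };
        this.count += 1;
        this.count
    }
}

fn bump_twice() -> u32 {
    // pin! moves the value into a hidden local and returns Pin<&mut Counter>.
    let mut counter = pin!(Counter { count: 0, _marker: PhantomPinned });

    // as_mut() reborrows the pin so it can be used more than once.
    counter.as_mut().bump();
    counter.as_mut().bump()
}

fn main() {
    println!("count after two bumps: {}", bump_twice());
}
```

The pinned reference cannot leave bump_twice, which is the scope limitation described above.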

Practical Example

Let's explore a self-referential struct and see how pinning helps.

Self-Referential Struct Without Pinning

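A self-referential struct built without pinning compiles, but it leads to undefined behavior as soon as the stored pointer is used.

struct SelfReferential {
    data: String,
    ptr: *const String,
}

impl SelfReferential {
    fn new(data: String) -> Self {
        SelfReferential {
            ptr: &data, // BUG: points at the local 'data', which is moved on the next line
            data,
        }
    }
}

This code compiles, because a raw pointer carries no lifetime for the borrow checker to track. However, ptr holds the address of the local variable data, not of the struct's field, and the struct itself moves again when new returns. Dereferencing ptr afterwards is undefined behavior. Using a reference (&String) instead of a raw pointer does not help: the borrow checker then rejects new, because data cannot be moved while it is borrowed.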

Correcting with Pinning

use std::pin::Pin;
use std::marker::PhantomPinned;

struct SelfReferential {
    data: String,
    ptr: *const String,
    _marker: PhantomPinned,
}

impl SelfReferential {
    fn new(data: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfReferential {
            data,
            ptr: std::ptr::null(),
            _marker: PhantomPinned,
        });

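        // SAFETY: we only write the 'ptr' field in place and never move the
        // struct out of the mutable reference obtained here.
        let self_ref = unsafe { boxed.as_mut().get_unchecked_mut() };
        self_ref.ptr = &self_ref.data as *const String;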
        boxed
    }

    fn get_ref(&self) -> &String {
        unsafe { &*self.ptr }
    }
}

In this corrected version:

We use Box::pin to allocate the struct on the heap and pin it.
PhantomPinned ensures the struct is not Unpin.
We initialize ptr after pinning to ensure the address of data is stable.

Self-Referential Struct `mem::swap` Example

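Swapping two pinned self-referential structs shows what Pin does and does not prevent.

use std::mem;
use std::pin::Pin;

let mut a: Pin<Box<SelfReferential>> = SelfReferential::new(String::from("Hello"));
let mut b: Pin<Box<SelfReferential>> = SelfReferential::new(String::from("World"));

// Allowed: this swaps the two Pin<Box<...>> pointers; the structs stay where they are
mem::swap(&mut a, &mut b);

// ERROR: Pin<Box<SelfReferential>> does not implement DerefMut, because SelfReferential is not Unpin
// mem::swap(&mut *a, &mut *b);

Swapping the handles is harmless, since each struct keeps its heap address and its ptr stays valid. Swapping the structs themselves would move the pinned values and leave each ptr pointing at the other struct's data, so the compiler rejects it: safe code has no way to obtain a &mut SelfReferential from the pin.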
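The following self-contained sketch repeats the SelfReferential type from above and checks what happens when the two Pin<Box<...>> handles are swapped. The helper is_consistent and the function swap_handles are additions made for this illustration. Under these assumptions, swapping the handles compiles, and each struct's internal pointer still refers to its own data field afterwards.

```rust
use std::marker::PhantomPinned;
use std::mem;
use std::pin::Pin;

struct SelfReferential {
    data: String,
    ptr: *const String,
    _marker: PhantomPinned,
}

impl SelfReferential {
    fn new(data: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfReferential {
            data,
            ptr: std::ptr::null(),
            _marker: PhantomPinned,
        });
        // SAFETY: we only set a field in place; the struct is never moved.
        let this = unsafe { boxed.as_mut().get_unchecked_mut() };
        this.ptr = &this.data as *const String;
        boxed
    }

    fn get_ref(&self) -> &String {
        // SAFETY: `ptr` was set after pinning and the struct has not moved since.
        unsafe { &*self.ptr }
    }

    /// Illustration helper: does `ptr` still point at our own `data` field?
    fn is_consistent(&self) -> bool {
        std::ptr::eq(self.ptr, &self.data)
    }
}

fn swap_handles() -> (String, String, bool) {
    let mut a = SelfReferential::new(String::from("Hello"));
    let mut b = SelfReferential::new(String::from("World"));

    // Swaps the two heap pointers. Neither struct changes its address.
    mem::swap(&mut a, &mut b);

    let consistent = a.is_consistent() && b.is_consistent();
    (a.get_ref().clone(), b.get_ref().clone(), consistent)
}

fn main() {
    let (a, b, consistent) = swap_handles();
    println!("a = {a}, b = {b}, pointers consistent: {consistent}");
}
```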

Why Is Pinning Important in Async Rust?

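Asynchronous programming in Rust relies on futures. The compiler turns every async fn and async block into a state machine that stores the local variables that live across an .await. When one of those locals is a reference to another local, the state machine is self-referential, and moving it after it has started running would leave a dangling pointer.

This is why Future::poll takes self: Pin<&mut Self>. Before a future is polled for the first time it can be moved freely; once it is pinned and polled, its memory address stays stable until it is dropped. Executors like Tokio uphold this by pinning each spawned task, typically on the heap, so moving a task between worker threads transfers a pointer and does not relocate the future.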

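Example Without Explicit Pinning

async fn example() {
    let data = String::from("Hello");
    let ptr = &data;
    std::future::ready(()).await; // 'ptr' is held across this await point
    println!("Data: {}", ptr);
}

This code is safe and compiles as written. The future returned by example() stores both data and ptr, so it is self-referential, and the compiler marks it as not Unpin. Whoever polls it (an .await in another async fn, or the executor) must pin it first, which guarantees that data does not move while ptr is alive.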
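To see the pinning step that normally happens behind the scenes, the sketch below polls a compiler-generated future by hand. NoopWake and poll_once are names invented for this example, and the waker does nothing because the future completes on its first poll. The example function is changed to return the string's length so that the result can be inspected.

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A waker that does nothing; enough for a future that is ready immediately.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

async fn example() -> usize {
    let data = String::from("Hello");
    let ptr = &data; // reference to a local, stored inside the future
    std::future::ready(()).await; // `ptr` is held across this await point
    ptr.len()
}

fn poll_once() -> Poll<usize> {
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);

    // The future is !Unpin, so it has to be pinned before poll can be called.
    // Box::pin gives it a stable heap address.
    let mut fut = Box::pin(example());

    // poll takes Pin<&mut Self>; as_mut() reborrows the pinned box.
    fut.as_mut().poll(&mut cx)
}

fn main() {
    match poll_once() {
        Poll::Ready(len) => println!("ready, length = {len}"),
        Poll::Pending => println!("pending"),
    }
}
```

Calling poll on the unpinned future directly is not possible, since the method is only defined on Pin<&mut Self>.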

Writing the Future by Hand with Pinning

use std::pin::Pin;
use std::marker::PhantomPinned;

struct MyFuture {
    data: String,
    ptr: *const String,
    _marker: PhantomPinned,
}

impl MyFuture {
    fn new(data: String) -> Self {
        MyFuture {
            data,
            ptr: std::ptr::null(),
            _marker: PhantomPinned,
        }
    }
}

impl std::future::Future for MyFuture {
    type Output = ();

    fn poll(self: Pin<&mut Self>, _cx: &mut std::task::Context<'_>) -> std::task::Poll<Self::Output> {
        // SAFETY: the fields are only read and written in place; nothing is moved out of 'self'.
        let self_ref = unsafe { self.get_unchecked_mut() };
        if self_ref.ptr.is_null() {
            self_ref.ptr = &self_ref.data as *const String;
        }
        println!("Data: {}", unsafe { &*self_ref.ptr });
        std::task::Poll::Ready(())
    }
}

In this example:

MyFuture is not Unpin due to PhantomPinned.
The poll method uses Pin<&mut Self> to guarantee that self will not be moved.
We initialize ptr lazily on the first poll, when the future is guaranteed to be pinned.
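How an executor drives such a future can be sketched with a tiny busy-polling block_on. This is a toy under stated assumptions, not how Tokio is implemented: block_on, NoopWake, and run are hypothetical names, and MyFuture is adapted with a polled_once flag and a usize output so that it returns Pending once before completing.

```rust
use std::future::Future;
use std::marker::PhantomPinned;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

struct MyFuture {
    data: String,
    ptr: *const String,
    polled_once: bool, // illustration only: forces one Pending result
    _marker: PhantomPinned,
}

impl Future for MyFuture {
    type Output = usize;

    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<usize> {
        // SAFETY: fields are only accessed in place; nothing is moved out.
        let this = unsafe { self.get_unchecked_mut() };
        if this.ptr.is_null() {
            this.ptr = &this.data as *const String;
        }
        if !this.polled_once {
            this.polled_once = true;
            cx.waker().wake_by_ref(); // ask to be polled again
            return Poll::Pending;
        }
        // SAFETY: `ptr` was set while pinned, so it still points at `data`.
        Poll::Ready(unsafe { &*this.ptr }.len())
    }
}

struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: pins the future once, then polls until it is ready.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = pin!(future); // pinned for the rest of its life
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut cx) {
            return output;
        }
        std::thread::yield_now();
    }
}

fn run() -> usize {
    block_on(MyFuture {
        data: String::from("Hello"),
        ptr: std::ptr::null(),
        polled_once: false,
        _marker: PhantomPinned,
    })
}

fn main() {
    println!("future resolved to {}", run());
}
```

The future is moved into block_on before it is pinned, which is fine because ptr is still null at that point. After the pin! call it never moves again, so the pointer stored during the first poll remains valid for the second.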

Tokio Runtime and Pinning

How Tokio Handles Async Operations

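Tokio is a popular asynchronous runtime in Rust that allows you to write non-blocking, asynchronous code. Its multi-threaded scheduler uses work stealing: a task may be polled on one worker thread, suspended at an .await, and resumed on another to balance the load. This is why tokio::spawn requires the future to be Send and 'static.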

use tokio::task;

#[tokio::main]
async fn main() {
    let handle = task::spawn(async {
        // Async operations
    });

    handle.await.unwrap();
}

Why Pinning Is Important with Tokio

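When Tokio resumes a task on a different thread, it hands over a pointer to the task; the pinned future itself stays at the same heap address, so references inside it remain valid. What can go wrong is a task that refers to data it does not own: a spawned task may outlive the function that spawned it, so it must not borrow that function's locals.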

Potential Issue

async fn example() {
    let data = String::from("Hello");
    tokio::spawn(async {
        let ptr = &data; // ERROR: async block may outlive the current function, but it borrows 'data'
        // Use 'ptr' asynchronously
    });
}

In this code:

data is owned by the outer future.
The inner future holds a reference to data.
tokio::spawn requires a 'static future, because the spawned task can keep running after example() returns and data has been dropped.

Correcting with Pinning

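To handle this safely, move ownership of the data into the task with async move. If the data is also self-referential, pin it on the heap first: moving the Pin<Box<...>> into the task moves only the pointer, so the internal pointer stays valid on whichever thread runs the task.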

use std::pin::Pin;
use std::marker::PhantomPinned;
use tokio::task;

struct PinnedData {
    data: String,
    ptr: *const String,
    _marker: PhantomPinned,
}

impl PinnedData {
    fn new(data: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(PinnedData {
            data,
            ptr: std::ptr::null(),
            _marker: PhantomPinned,
        });

        let self_ref = unsafe { boxed.as_mut().get_unchecked_mut() };
        self_ref.ptr = &self_ref.data as *const String;
        boxed
    }
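}

// SAFETY: 'ptr' only ever points at this value's own 'data' field, and String is Send.
// Without this impl, the raw pointer makes PinnedData !Send and tokio::spawn rejects the task.
unsafe impl Send for PinnedData {}

#[tokio::main]
async fn main() {
    let pinned_data = PinnedData::new(String::from("Hello"));

    // 'async move' transfers ownership of the Pin<Box<PinnedData>> into the task
    let handle = task::spawn(async move {
        // SAFETY: 'ptr' was set after pinning and the pinned value has not moved
        println!("Data: {}", unsafe { &*pinned_data.ptr });
    });

    handle.await.unwrap();
}

In this corrected version:

PinnedData is pinned on the heap, ensuring its memory address is stable.
async move gives the task ownership of pinned_data, which satisfies the 'static bound of tokio::spawn.
The task can safely use pinned_data on whichever worker thread polls it, because only the Pin<Box<...>> pointer changes hands.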

Moving Futures Across Threads

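Tokio's scheduler may move tasks between worker threads to optimize resource utilization. Moving a task does not relocate its future: the future was pinned when it was spawned, and only a pointer to it travels between threads. References to the spawning function's stack variables are ruled out separately by the 'static bound on tokio::spawn.

Pinning ensures that the state a future points into stays anchored in memory from the first poll until the future is dropped, preventing undefined behavior no matter which thread polls it.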
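The same effect can be observed without a runtime. The sketch below uses std::thread::spawn as a stand-in for a scheduler handing work to another thread; the Anchored type and the function name are made up for this illustration. It compares the address of a heap-pinned value before and after its Pin<Box<...>> handle is sent to another thread.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::thread;

// Hypothetical !Unpin type for illustration. It contains no raw pointers,
// so it is Send automatically.
struct Anchored {
    value: u64,
    _marker: PhantomPinned,
}

fn address_survives_thread_hop() -> bool {
    let pinned: Pin<Box<Anchored>> = Box::pin(Anchored {
        value: 7,
        _marker: PhantomPinned,
    });
    let before = &*pinned as *const Anchored as usize;

    // Only the Pin<Box<..>> handle is moved into the new thread.
    let after = thread::spawn(move || {
        assert_eq!(pinned.value, 7);
        &*pinned as *const Anchored as usize
    })
    .join()
    .unwrap();

    before == after
}

fn main() {
    println!("same address on both threads: {}", address_survives_thread_hop());
}
```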

Conclusion

Pinning in Rust is a powerful tool that ensures the memory addresses of certain values remain constant, which is essential for:

Self-referential structs that hold pointers or references to their own data.
Asynchronous programming, where compiler-generated futures may hold references into their own state across .await points and are polled by executors like Tokio.

By understanding and correctly using Pin, Unpin, PhantomPinned, and the related patterns, you can write safe and efficient Rust code that leverages advanced features without compromising memory safety.
