class: title

# Unsafe Rust

## Scott Rixner and Alan Cox

---
layout: true

---

## Safety in Rust

* Rust guarantees memory safety and thread safety at compile time.
* The Borrow Checker ensures:
  * No dangling pointers.
  * No double frees.
  * No data races.
* These checks are statically analyzed.
* **Problem:** Static analysis is conservative. Some valid programs are rejected.

---

## What is Unsafe Rust?

* **Unsafe Rust** is a superset of Safe Rust.
* It allows you to perform operations that the compiler cannot verify as safe.
* It does **not** disable the borrow checker.
* It transfers the responsibility of safety checks from the **compiler** to the **programmer**.
* *Unsafe* doesn't mean the code *is* dangerous; it means the compiler can't prove it's safe.

---

## The 5 Unsafe Capabilities

In Unsafe Rust, you can do five things you cannot do in Safe Rust:

1. Dereference raw pointers.
2. Call unsafe functions or methods.
3. Access or modify mutable static variables.
4. Implement an unsafe trait.
5. Access fields of unions.

---

## The `unsafe` Keyword

* Used to mark:
  * **Blocks:** `unsafe { ... }` - "I promise this code is safe."
  * **Functions:** `unsafe fn foo() { ... }` - "You must read the docs to call this safely."
  * **Traits:** `unsafe trait Foo { ... }` - "Implementing this requires manual safety guarantees."

---

## Raw Pointers

* Accesses to references (`&T`, `&mut T`) are checked.
* Raw Pointers (`*const T`, `*mut T`) are similar to C pointers:
  * Ignore borrowing rules (can have both immutable and mutable pointers to the same location).
  * Are not guaranteed to point to valid memory.
  * Are allowed to be null.
  * Do not implement any automatic cleanup.

---

## Creating vs. Dereferencing

```rust
fn main() {
    let mut num = 5;

    // Creating raw pointers (SAFE)
    let r1 = &num as *const i32;
    let r2 = &mut num as *mut i32;

    // Dereferencing (UNSAFE)
    unsafe {
        println!("r1 is: {}", *r1);
        println!("r2 is: {}", *r2);
    }
}
```

* Creating raw pointers is **safe**.
* Dereferencing them is **unsafe**.

---

## Why use Raw Pointers?

* Interfacing with C code (FFI).
* Building safe abstractions that the borrow checker can't understand.
  * Example: Splitting a slice into two mutable disjoint parts.

---

## Calling Unsafe Functions

```rust
unsafe fn dangerous() {}

fn main() {
    unsafe {
        dangerous();
    }
}
```

* An unsafe function requires the caller to ensure certain preconditions.
* Unsafe functions must be called from within an `unsafe` block.

---

## Safe Abstractions

* Ideally, unsafe code is wrapped in a safe API.
* The safe wrapper ensures that no matter how it is called, no undefined behavior occurs.
* Example: `Vec::push`, `slice::split_at_mut`.

---

## Example: `split_at_mut`

```rust
// This won't compile in safe Rust!
fn split_at_mut(slice: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = slice.len();
    assert!(mid <= len);

    (&mut slice[..mid], &mut slice[mid..])
    // Error: cannot borrow *slice as mutable more than once
}
```

* Goal: Split a mutable slice into two mutable slices at an index.
* Safe Rust can't prove that the two returned slices don't overlap.

---

## Implementing `split_at_mut` with Unsafe

```rust
use std::slice;

fn split_at_mut(slice: &mut [i32], mid: usize) -> (&mut [i32], &mut [i32]) {
    let len = slice.len();
    let ptr = slice.as_mut_ptr();
    assert!(mid <= len);

    // SAFETY: `mid <= len`, so both ranges are in bounds and do not overlap.
    unsafe {
        (
            slice::from_raw_parts_mut(ptr, mid),
            slice::from_raw_parts_mut(ptr.add(mid), len - mid),
        )
    }
}
```

* Uses raw pointers and pointer arithmetic (`ptr.add`).

---
class: middle

## Question: Why Is `split_at_mut` Correct?

Why doesn't the compiler require lifetime annotations on the function signature?

Why does the compiler allow us to return two mutable references to the same underlying object?

---

## Extern Functions (FFI)

```rust
// In the 2024 edition this block must be written `unsafe extern "C"`.
extern "C" {
    fn abs(input: i32) -> i32;
}

fn main() {
    unsafe {
        println!("Absolute value of -3 according to C: {}", abs(-3));
    }
}
```

* Calling code written in other languages (usually C) is always unsafe.
* Rust can't check C's memory safety.

---

## Unions

```rust
union MyUnion {
    f1: u32,
    f2: f32,
}

fn main() {
    let u = MyUnion { f1: 10 };
    unsafe {
        // Interpreting bits of u32 as f32
        let f = u.f2;
    }
}
```

* Reading a field of a C-style union is unsafe because Rust can't know which variant is valid.
* Mainly used for FFI.

---

## Unsafe Traits

* A trait is unsafe when at least one of its methods has some invariant that the compiler cannot check.
* Example: `Send` and `Sync`.
  * `unsafe trait Send {}`
  * Implementing `Send` for a type is a promise that it is safe to send across threads.
  * If you implement it incorrectly, you get data races.

---

## Example: A Custom Unsafe Trait

```rust
// SAFETY: Implementor promises that all-zero bits are a valid value.
unsafe trait Zeroable {}

// Correct: 0 is a valid u32
unsafe impl Zeroable for u32 {}

// WRONG / Undefined Behavior:
// 0 is NOT a valid reference (null pointer)
unsafe impl Zeroable for &i32 {}
```

* Imagine a trait for types that are safe to initialize with all zeros.
* The compiler cannot verify if a type is valid with zeroed memory.
* If `&i32` claimed to be `Zeroable`, a safe function using it could create a null reference.

---

## "Safe" Unsafe Code

* An unsafe block is **sound** if it is *impossible* for safe code to cause undefined behavior.
* Unsafe code must rely on invariants that it maintains, or that are checked at runtime.
  * *Bad:* Unsafe function that crashes if you pass `0`.
  * *Good:* Safe function that returns an `Err` if you pass `0`, and otherwise calls the unsafe function and returns an `Ok`.

---

## Undefined Behavior

* If you violate Rust's safety contracts in an `unsafe` block, you get undefined behavior.
* It is **not** just a crash. You are back in the world of C, where you could get:
  * The program runs "correctly" (for now).
  * Corrupted data.
  * Security vulnerabilities.
  * Segfaults.
* The compiler optimizes your code assuming undefined behavior never happens, which can lead to wild results.
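---

## Sketch: Wrapping an Unsafe Function

A minimal sketch of the *Good* pattern from the '"Safe" Unsafe Code' slide. The names `divide_unchecked` and `divide` are hypothetical and chosen only for illustration; the point is that the safe wrapper checks the precondition at runtime, so no safe caller can trigger undefined behavior.

```rust
/// Hypothetical unsafe function, used only for illustration.
///
/// # Safety
/// The caller must guarantee that `divisor != 0`.
unsafe fn divide_unchecked(value: u32, divisor: u32) -> u32 {
    // SAFETY: the caller guarantees `divisor != 0`, so `checked_div`
    // returns `Some` and `unwrap_unchecked` is never called on `None`.
    unsafe { value.checked_div(divisor).unwrap_unchecked() }
}

/// Safe wrapper: validates the precondition, then calls the unsafe function.
fn divide(value: u32, divisor: u32) -> Result<u32, String> {
    if divisor == 0 {
        return Err(String::from("divisor must be non-zero"));
    }
    // SAFETY: we just checked that `divisor != 0`.
    Ok(unsafe { divide_unchecked(value, divisor) })
}
```

* `divide(10, 2)` returns `Ok(5)`; `divide(10, 0)` returns an `Err` instead of reaching the unsafe call.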
---

## Common Sources of Undefined Behavior

1. Dereferencing null or dangling pointers.
2. Reading uninitialized memory.
3. Breaking pointer aliasing rules (creating overlapping `&mut T`).
4. Data races.
5. Invalid values in primitive types (e.g., `bool` that is not 0 or 1).

---
class: middle

## Question: Implications?

* Safe code doesn't guarantee correctness, only the absence of Undefined Behavior.
* Unsafe code can assume the latter of the invoking safe code, but not the former.

How does this affect your approach to writing unsafe code?

---

## Checking Unsafe Code

* Miri: An interpreter for Rust's MIR (Mid-level Intermediate Representation).
  * Can detect many forms of undefined behavior that typical tests miss.
  * `cargo miri test`
* Sanitizers: ASAN (AddressSanitizer), TSAN (ThreadSanitizer), MSAN (MemorySanitizer).

---

## Unsafe Usage in the Wild

* **[Rust Foundation Report (May 2024)](https://foundation.rust-lang.org/news/unsafe-rust-in-the-wild-notes-on-the-current-state-of-unsafe-rust/):**
  * ~19.11% of significant Rust crates directly use `unsafe`.
  * ~34.35% call other crates that use `unsafe`.
* **[Is Rust Used Safely by Software Developers? (2020)](https://arxiv.org/abs/2007.00752):**
  * <30% of libraries use `unsafe` explicitly.
  * 50%+ of libraries depend on unsafe code.

---
class: middle

## Question: Why Is `unsafe` Used?

---

## Responsibility

```rust
// SAFETY: We checked that index is within bounds
unsafe { *ptr.add(index) }
```

* With `unsafe`, you opt out of the compiler's safety net.
* You must verify invariants manually.
* Keep `unsafe` blocks as small as possible.
* Isolate unsafe code within safe abstractions.
* Document *why* a block is safe with comments.

---

## Conclusion

* `unsafe` is a powerful tool, essential for systems programming and FFI.
* It enables Safe Rust to exist (by powering the standard library).
* Use it sparingly and responsibly.
* Always prefer Safe Rust solutions unless strictly necessary.
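---

## Sketch: Small, Documented `unsafe`

One way the fragment on the Responsibility slide might look in a complete function. The name `get_copy` is hypothetical; the sketch assumes the only invariant needed is `index < data.len()`, which the function checks itself before the single-expression `unsafe` block.

```rust
/// Returns a copy of `data[index]`, or `None` if `index` is out of bounds.
/// (Hypothetical helper, for illustration only.)
fn get_copy(data: &[i32], index: usize) -> Option<i32> {
    if index >= data.len() {
        return None;
    }
    let ptr = data.as_ptr();
    // SAFETY: `index < data.len()`, so `ptr.add(index)` stays inside the
    // slice and points to an initialized `i32`.
    Some(unsafe { *ptr.add(index) })
}
```

* The bounds check and the `unsafe` block live in the same function, so the safety argument can be verified locally.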
---
class: middle

## Question: The Big Picture

Do you believe that programmers will be more likely to write correct programs with Rust's unsafe than with C?