Getting Rusty

The goal of this MD book is to track my journey of getting Rusty and to cover the full spectrum of Rust. The standard way to learn Rust is through the official book, The Rust Programming Language, followed by the interactive Rustlings course. Those with a background in a statically compiled language can also start with Rust by Example or Comprehensive Rust by Google. Many examples in this book are adapted from these resources.

Rust is a multi-paradigm systems programming language focusing on safety, speed, and concurrency.

Multi-paradigm
  • Imperative Programming
  • Object Oriented Programming with struct, enum, traits, methods
  • Functional Programming: immutability, higher-order functions, pattern matching
Compile time memory safety

Several types of memory bugs are prevented at compile time.

  • No uninitialized variables.
  • No double-frees.
  • No use-after-free.
  • No NULL pointers: no issue of dereferencing null pointers, dangling pointers.
  • No forgotten locked mutexes.
  • No data races between threads.
  • No iterator invalidation.
  • No undefined runtime behavior - what a Rust statement does is never left unspecified
  • Array access is bounds checked.
  • Integer overflow is defined (panic or wrap-around).
  • Many abstractions such as iterators are zero-cost. There is no garbage collector, so a program uses only as much memory as it requires at a given time.

Furthermore, good compiler error messages make writing and debugging Rust code easier and more productive.
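
A couple of the guarantees listed above can be sketched in a few lines. This is only an illustration: nth_or_none is a hypothetical helper name, not something from the resources mentioned earlier. It shows that a possibly missing value is an Option that has to be handled, and that an out-of-bounds lookup through get yields None rather than reading invalid memory.

```rust
/// Returns the element at `index`, or `None` when the index is out of bounds.
/// `nth_or_none` is a hypothetical helper name used only for this sketch.
fn nth_or_none(values: &[i32], index: usize) -> Option<i32> {
    values.get(index).copied()
}

fn main() {
    let values = [10, 20, 30];

    // There is no null: a possibly missing value is an Option that must be handled.
    match nth_or_none(&values, 1) {
        Some(v) => println!("found {v}"),
        None => println!("nothing at that index"),
    }

    // Out-of-bounds access through `get` yields None instead of touching invalid memory.
    assert_eq!(nth_or_none(&values, 7), None);
}
```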

Rust is fast and resource efficient
  • Rust is statically compiled with rustc which uses LLVM as its backend. Performance (runtime and memory) is comparable to C/C++.
  • Full support for concurrency using OS threads with mutexes and channels. Referred to as "fearless concurrency", it relies on the compiler to ensure correctness at runtime.
  • Also provides unsafe Rust for even faster operations (see the Rustonomicon).
with expressive language features
  • Generics.
  • No-overhead foreign function interface (FFI). Function calls between Rust and C have identical performance to C function calls.
  • Built-in dependency manager: cargo.
  • Built-in support for testing.
  • Excellent Language Server Protocol support.
Other features
  • Strong, static yet expressive type system influenced by Haskell. Types allow the compiler to check for potential problems and avoid them.
  • Concurrency can be done with any technique; thread safety is ensured through the same type system that ensures memory safety
  • Cross platform: compile to different systems, embedded systems and even web as WebAssembly (WASM)
  • C interoperability, but use of C reduces memory safety
  • supports many platforms and architectures: x86, ARM, WASM, Linux, Mac, Windows, ...
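
To make the concurrency points above a bit more concrete, here is a minimal sketch using only the standard library: OS threads, a Mutex-protected counter shared through Arc, and a channel. The function name run_workers and the worker count are my own assumptions for illustration.

```rust
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

/// Spawns `workers` threads that each add 1 to a shared counter and
/// send their id over a channel. Returns the final counter and the sorted ids.
/// `run_workers` is a hypothetical name used only for this sketch.
fn run_workers(workers: usize) -> (usize, Vec<usize>) {
    let counter = Arc::new(Mutex::new(0usize));
    let (tx, rx) = mpsc::channel();

    let handles: Vec<_> = (0..workers)
        .map(|id| {
            let counter = Arc::clone(&counter);
            let tx = tx.clone();
            thread::spawn(move || {
                *counter.lock().unwrap() += 1; // the lock is released at the end of this statement
                tx.send(id).unwrap();
            })
        })
        .collect();

    drop(tx); // drop the original sender so the receiver sees the channel close

    for handle in handles {
        handle.join().unwrap();
    }

    let mut ids: Vec<usize> = rx.iter().collect();
    ids.sort();
    let total = *counter.lock().unwrap();
    (total, ids)
}

fn main() {
    let (total, ids) = run_workers(4);
    println!("counter = {total}, ids = {ids:?}");
}
```

Removing the Mutex, or trying to share the counter without Arc, is rejected by the compiler, which is the point of the "fearless" label.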

Rust has been voted the "most loved programming language" in the Stack Overflow Developer Survey since 2016, although it is not the only language to have held that title. Google gathered data from their engineers to understand the Rust learning curve, and found that they were proficient in Rust in less than 2 months.

How to use this Guide

There are several useful keyboard shortcuts in mdBook:

  • Arrow-Left: Navigate to the previous page.
  • Arrow-Right: Navigate to the next page.
  • Ctrl + Enter: Execute the code sample that has focus.
  • s: Activate the search bar.

There are code blocks like the following in this book which allows you to edit and run the code as you wish.

fn main() {
    println!("Hello 🌍!");
}

And yes, it looks a lot like C/C++: the main function, defined with the fn keyword, is the entry point of the program. Blocks are delimited by curly braces, and a statement ends with a semicolon.

Environment Setup

To get a more developer-like experience, you need to install the Rust toolchain on your machine.

Install the rustup toolchain installer. The site suggests the following command for Linux:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

For Windows, check Rust in Windows; you also need to install the Visual Studio build tools.

The rustup tool can manage multiple versions of the Rust compiler on a machine. To check the installation and update the compiler version, run: rustup update

If you want to develop inside a dev container, you can create one by following through this tutorial. In short install and configure Docker in your system, install "Dev Container" extension in VSCode and add and edit a dev container configuration file to the repo and rebuild the container to get started.

Alternatively you can try code in Online Rust Playground

IDEs and Tools

There are different IDEs available for Rust development. Visual Studio Code is the most popular one. JetBrains also provides a separate IDE, RustRover.

To set up VS Code for Rust, install the rust-analyzer extension. It provides code completion, syntax highlighting, and other features. The toolset also includes linting provided by rustc and clippy to detect issues with your code.

If you are using vim, you will have to install rust-analyzer from rustup and rust.vim.

Cargo : Rust build tool, dependency manager and package manager

rustc is the Rust compiler that allows us to compile the file, but managing multiple files and dependencies can be difficult hence we use Cargo.

  • Start a project : cargo new prj_name

    creates a folder with Cargo.toml manifest files to track metadata and dependencies and src/main.rs with hello world script

    • initialize with a new git repo as well cargo new --vcs=git prj
    • creating a binary executable project : cargo new prj_name --bin
    • creating a library project cargo new prj_name --lib --> creates lib.rs instead of main.rs
  • Quickly check code to make sure it compiles without producing an executable cargo check

  • Add Dependencies:

    • cargo add crate_name: This adds the latest version of the crate from the package registry crates.io to the Cargo.toml file. To use the library, import the items you need in your source code: use crate_name::func; All crates on crates.io automatically have their documentation built and made available on docs.rs.

    e.g. cargo add rand. To use the dependency in the code, add the following lines:

    #![allow(unused)]
    fn main() {
    use rand::Rng;
    
    let rand_num = rand::thread_rng().gen_range(1..=100);
    }
    • use --git URL --branch branch --tag tag to add crate from git URL branch and tag
    • use --dev add as dev dependency, --build to add as build dependency, --target add as target dependency to the given target platform
    • use -F features or --features features to activate additional features available in the crate
  • Build your project: cargo build

    compiles into a target/debug folder with compiled file and other build files

    • binaries can also be generated using rustc main.rs, but building with cargo is easier when there are multiple files and dependencies
    • building for release compiles with optimizations into target/release: cargo build --release
    • build also creates a Cargo.lock file that records the exact versions of dependencies that satisfy the requirements in Cargo.toml, ensuring we rebuild the same artifact every time
  • cargo update ignores the Cargo.lock file and updates crates to the newest versions that fit the specification in Cargo.toml

  • Compile and Run the project: cargo run

  • Test your project: cargo test

  • Build documentation for your project: cargo doc --open also opens the doc in a browser

    builds into target/doc, compiling doc comments /// from the code into documentation

  • Remove generated artifacts: cargo clean

    removes /target folder

  • Publish a library to crates.io: cargo publish
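
As a small illustration of what cargo doc and cargo test pick up, here is a sketch of a function with a /// doc comment and a unit test. The function circle_area and the test name are hypothetical and chosen only for this example.

```rust
/// Returns the area of a circle with radius `r`.
///
/// `circle_area` is a hypothetical function used only for this sketch;
/// `cargo doc` turns this comment into HTML documentation.
pub fn circle_area(r: f64) -> f64 {
    std::f64::consts::PI * r * r
}

// `cargo test` discovers and runs functions marked with #[test].
#[test]
fn area_of_unit_circle_is_pi() {
    assert!((circle_area(1.0) - std::f64::consts::PI).abs() < 1e-12);
}

fn main() {
    println!("area: {:.2}", circle_area(2.0));
}
```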

Cargo is extensible with sub command plugins such as:

  • Linter: Clippy: install using rustup component add clippy, run using cargo clippy, and automatically apply fixes using cargo clippy --fix
  • Formatter: Rustfmt: auto-formatting to community standards, run using cargo fmt

Other advanced features: workspaces, build scripts.

Variables and Data Types

A variable is a named abstraction of a memory (storage) location that holds data. It has a name, a data type, a value, and a memory location. Rust provides type safety via static typing, i.e. the type of every variable must be known at compile time. Hence, the type must be declared or inferred from the assigned value during compile-time type checking. Static typing helps catch type errors early in the development process, reducing the likelihood of runtime errors related to type mismatches.

Variable bindings are made with let, and the type is declared with a colon.


fn main() {
    let x: i32 = 42;  
    let y = 17;  // type i32 inferred from value during compilation 
    println!("x: {x}");
    // x = 24;    // error: cannot assign twice to immutable variable
    // println!("x: {x}");
}

Variables are immutable by default. We can make it mutable using mut keyword. We can only change its value not the type.


fn main() {
    let mut x: i32 = 42;
    println!("x: {x}");
    x = 24;
    println!("x: {x}");
}

This doesn't mean let creates constants. Constants are values bound to a name with const: they must be type annotated and cannot be made mutable with mut. Constants can be declared in any scope, and are usually defined in the global scope.

const PI: f32 = 3.14159265;

fn main(){
    let r: f32 = 2.0;
    println!("{}",PI*r.powf(2.0));
}

Even if a variable cannot be changed, the variable name can be reused by declaring it again. The first variable is said to be shadowed by the second. The new variable can even have a different type.


fn main() {
    let x = "Five"; 
    let x = x.len();  // shadowing to store different data type.

    {
        let x = x * 2;  // shadowing to create a new variable in the inner scope
        println!("x in the inner scope is: {x}");
    }

    println!("x in outer scope is: {x}");
}

Data Types

use std::io;

fn main(){
    println!("Please input your guess.");

    let mut guess = String::new();

    io::stdin()
        .read_line(&mut guess)              // read line from stdin
        .expect("Failed to read line");     // error handling

    let guess: i32 = guess.trim().parse().expect("Not a number!");  // trim end spaces and parse to annotated format

    println!("{}",guess);    
}

Basic built-in types, and the syntax for literal values of each type.

| | Types | Literals |
|---|---|---|
| Signed integers | i8, i16, i32, i64, i128, isize | -10, 0, 1_000, 123_i64 |
| Unsigned integers | u8, u16, u32, u64, u128, usize | 0, 123, 10_u16 |
| Floating point numbers | f32, f64 | 3.14, -10.0e20, 2_f32 |
| Unicode scalar values | char | 'a', 'α', '∞', '😍' |
| Booleans | bool | true, false |

The types have widths as follows:

  • iN, uN, and fN are N bits wide,
  • isize and usize are the width of a pointer depending on the computer architecture,
  • char is 32 bits wide,
  • bool is 8 bits wide.

The signed numbers are stored using two's complement.

Integer overflow occurs when a variable is assigned a value outside the range of its type, e.g. 256 for a u8, whose range is 0 to 255.

  • in debug mode, Rust includes checks for integer overflow that cause your program to panic at runtime if this behavior occurs.
  • in release mode i.e compiled with --release flag, Rust doesn't panic but performs two’s complement wrapping, i.e 256 becomes 0, 257 becomes 1,...

Relying on integer overflow’s wrapping behavior is considered an error, hence rust provides families of methods to handle them explicitly.

  • Wrap in all modes with the wrapping_* methods, such as wrapping_add.
  • Return the None value if there is overflow with the checked_* methods.
  • Return the value and a boolean indicating whether there was overflow with the overflowing_* methods.
  • Saturate at the value’s minimum or maximum values with the saturating_* methods. e.g., (a * b).saturating_add(b * c).saturating_add(c * a).
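
The four method families above can be compared side by side on a u8. This is a small sketch; add_all_ways is a hypothetical helper name, while the methods themselves are from the standard library.

```rust
/// Adds two u8 values with each of the explicit overflow strategies.
/// `add_all_ways` is a hypothetical helper name used only for this sketch.
fn add_all_ways(a: u8, b: u8) -> (u8, Option<u8>, (u8, bool), u8) {
    (
        a.wrapping_add(b),    // wraps around modulo 256
        a.checked_add(b),     // None on overflow
        a.overflowing_add(b), // wrapped value plus an overflow flag
        a.saturating_add(b),  // clamps at u8::MAX
    )
}

fn main() {
    // 250 + 10 = 260, which does not fit in a u8 (260 - 256 = 4).
    let (wrapped, checked, overflowed, saturated) = add_all_ways(250, 10);
    assert_eq!(wrapped, 4);
    assert_eq!(checked, None);
    assert_eq!(overflowed, (4, true));
    assert_eq!(saturated, u8::MAX);
    println!("{wrapped} {checked:?} {overflowed:?} {saturated}");
}
```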

Rust provides addition, subtraction, multiplication, division, and remainder operators. Here's the list of operators.

Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.

An array is a data type that stores N values of the same type T. The length N (a compile-time constant) is part of the type [T; N], hence [u8; 3] and [u8; 4] are considered two different types.

fn main() {
    let mut a: [i8; 10] = [42; 10];
    a[5] = 0;   // indexing to access array
    println!("a: {a:?}");
}

Try accessing an out-of-bounds array element. Array accesses are checked at runtime and panic when the index is beyond the array's bounds. Rust can usually optimize these checks away, and they can be avoided using unsafe Rust.

We can use literals to assign values to arrays.

The println! macro asks for the debug implementation with the ? format parameter: {} gives the default output, {:?} gives the debug output. Types such as integers and strings implement the default output, but arrays only implement the debug output. This means that we must use debug output here.

Adding #, eg {a:#?}, invokes a "pretty printing" format, which can be easier to read.
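
A quick sketch of the three forms, assuming a hypothetical Point struct that derives Debug (custom types get debug output only when they derive or implement it):

```rust
// `Point` is a hypothetical type used only for this sketch.
#[derive(Debug)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let a = [1, 2, 3];
    let p = Point { x: 1, y: 2 };

    println!("{}", p.x + p.y); // integers implement the default output
    println!("{a:?}");         // debug output: [1, 2, 3]
    println!("{p:?}");         // debug output: Point { x: 1, y: 2 }
    println!("{p:#?}");        // pretty-printed debug output, one field per line
}
```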

Like arrays, tuples have a fixed length but tuples can group together values of different types into a compound type. Fields of a tuple can be accessed by the period and the index of the value, e.g. t.0, t.1. The empty tuple () is referred to as the "unit type" and signifies Void or absence of a return value.

fn main() {
    let t: (i8, bool) = (7, true);
    println!("t.0: {}", t.0);  //indexing to access tuple
    println!("t.1: {}", t.1);
    let (x, y) = t;  // destructuring tuple to two variables x, y
    println!("{x}");
}

Type inference

Rust compiler infers types based on constraints given by variable declarations and usages. Machine code generated by such declaration is identical to the explicit declaration of a type.

fn takes_u32(x: u32) {
    println!("u32: {x}");
}

fn takes_i8(y: i8) {
    println!("i8: {y}");
}

fn main() {
    let x = 10;
    let y = 20;

    takes_u32(x);
    takes_i8(y);
    // takes_u32(y);
}

Type Conversion
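
This section is still a stub, so the following is only a rough sketch of the usual conversion tools from the standard library: From/Into for lossless conversions, as for casts that may truncate, TryFrom for checked conversions, and parse for strings. The helper name to_u8_checked is hypothetical.

```rust
use std::convert::TryFrom;

/// Converts to u8 only when the value fits.
/// `to_u8_checked` is a hypothetical helper name used only for this sketch.
fn to_u8_checked(value: i32) -> Option<u8> {
    u8::try_from(value).ok()
}

fn main() {
    let small: u8 = 200;
    let widened = u32::from(small);       // lossless, cannot fail
    let also_widened: u32 = small.into(); // the same conversion through Into
    assert_eq!(widened, also_widened);

    let big: i32 = 300;
    let truncated = big as u8;            // `as` silently truncates: 300 - 256 = 44
    assert_eq!(truncated, 44);

    assert_eq!(to_u8_checked(big), None); // the checked conversion reports the problem
    assert_eq!(to_u8_checked(44), Some(44));

    let parsed: i32 = "42".parse().expect("not a number");
    println!("{widened} {truncated} {parsed}");
}
```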

Blocks and Scopes

A block in Rust contains a sequence of expressions, enclosed by braces {}. Each block has a value and a type, which are those of the last expression of the block:

fn main() {
    let z = 13;
    let x = {
        let y = 10;
        println!("y: {y}");
        z - y  // if commented block returns ()
    };
    println!("x: {x}");
}

A variable's scope is limited to the enclosing block.

You can shadow variables, both those from outer scopes and variables from the same scope:

fn main() {
    let a = 10;
    println!("before: {a}");
    {
        let a = "Hi";
        println!("inner scope: {a}");

        let a = true;
        println!("shadowed in inner scope: {a}");
    }

    println!("after: {a}");
}

Shadowing is different from mutation, because after shadowing both variables' memory locations exist at the same time. Both are available under the same name, depending on where you use it in the code.

Memory Management & Ownership

All programs need to manage memory to use it efficiently and to prevent memory leaks, for safety and program stability. There are a few approaches to memory management:

  1. Manual memory allocation and deallocation (e.g. in C/C++) gives fine-grained control but can lead to memory leaks and segmentation faults:
    • if we forget to free memory, it is wasted,
    • if we free it too early, we are left with an invalid variable,
    • if we free it twice, we create a double-free bug.
  2. Automatic memory management using garbage collection (GC) (e.g. Java, Python) regularly checks for unused memory and frees it. Some languages use Automatic Reference Counting (ARC) (e.g. Swift), which keeps a count of references to each object and deallocates the object when the count drops to zero.
  3. Ownership and borrowing (Rust): a set of rules enforced at compile time, where each value has a single owner at a time and memory is automatically freed (i.e. the value is dropped) when the owner goes out of scope.

Stack and Heap Allocation

| | Stack | Heap |
|---|---|---|
| Structure | Contiguous block of memory, hence size known and fixed | Non-contiguous allocation, less organized; the memory allocator finds an empty spot and returns a pointer to it |
| Usage | Static memory: local variables, function call management | Dynamic memory: objects, data structures |
| Management | By the compiler: allocated when a function is called, deallocated when it exits | Manually allocated and deallocated, depending on the programming language |
| Speed | Faster alloc/dealloc and access | Slower alloc/dealloc and access due to complexity and following pointers |
| Scope | Limited to function/block | Usually global, as long as a reference exists |
| Safety | Follows LIFO to reduce memory corruption | Prone to memory leaks and fragmentation |

When your code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack. The main purpose of ownership is to manage heap data.

Most primitive data types have a known, fixed size, hence they are stored on the stack and easily popped off when their scope is over. String literals, e.g. "hello", are fixed and known at compile time. Dynamic data types such as String, whose size is unknown at compile time and might change at runtime, need to allocate memory on the heap.

{
let mut s = String::from("hello"); // Memory requested from memory allocator at runtime

s.push_str(", world!"); // push_str() appends a literal to a String

println!("{s}");
} // scope of s is over and s is no longer valid

In Rust the memory is automatically returned once the variable that owns it goes out of scope (without need of GC). Rust calls a function called drop automatically at the closing curly bracket. This is similar to Resource Acquisition Is Initialization (RAII) pattern in C++.
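
To see when drop runs, here is a minimal sketch that records each drop in a shared log. The Tracked type and the drop_order function are hypothetical names for this illustration; RefCell is used only so the Drop implementation can write to the log.

```rust
use std::cell::RefCell;

// `Tracked` is a hypothetical type that records its own drop in a shared log.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Tracked<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("drop {}", self.name));
    }
}

/// Returns the sequence of events, showing that each value is dropped
/// at the closing brace of the scope that owns it.
fn drop_order() -> Vec<String> {
    let log: RefCell<Vec<String>> = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
            log.borrow_mut().push("end of inner scope".to_string());
        } // _inner is dropped here
        log.borrow_mut().push("end of outer scope".to_string());
    } // _outer is dropped here
    log.into_inner()
}

fn main() {
    for event in drop_order() {
        println!("{event}");
    }
}
```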

let x = 5;  // data in stack
let y = x;  // a copy of the value of x is bound to y

let s1 = String::from("hello");  // String is made up of ptr to heap, len, capacity
let s2 = s1;  // String data (ptr, len, capacity) is copied, not the actual string on the heap; the same heap data is used

// println!("{s1}");  // error: s1 is invalid after the move

Since both s1 and s2 would point to the same memory, both might try to free it, causing a double-free error. To ensure memory safety, after let s2 = s1; Rust considers s1 to be moved into s2, and s1 is no longer valid.

This type of copy is called a shallow copy in other languages, but because the first variable is invalidated, it is known as a move in Rust. By design, Rust will never automatically create "deep" copies of data.

let s1 = String::from("hello");
let s2 = s1.clone();  // create deep copy of heap data

println!("s1 = {s1}, s2 = {s2}");

Rust has a Copy trait that we can place on types that are stored entirely on the stack, such as integers. If a type implements the Copy trait, variables of that type do not move but are trivially copied, so they remain valid after assignment to another variable. A type cannot implement Copy if it, or any of its parts, implements Drop.
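
As a sketch of how this looks for a custom type, assume a hypothetical Point struct whose fields are all Copy. Deriving Copy (together with Clone) makes assignment copy the value instead of moving it:

```rust
// `Point` is a hypothetical type; all of its fields are Copy, so it can be Copy too.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

fn main() {
    let p1 = Point { x: 1, y: 2 };
    let p2 = p1; // bitwise copy; p1 stays valid
    println!("p1 = {p1:?}, p2 = {p2:?}, sum of p1 = {}", p1.x + p1.y);

    let s1 = String::from("hello");
    let s2 = s1.clone(); // String is not Copy, so a deep copy has to be explicit
    println!("s1 = {s1}, s2 = {s2}");
}
```

Adding a String field to Point would make the derive fail, since String owns heap data and is not Copy.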

Passing a value to a function is similar to assigning a value to a variable.


fn main() {
    let s = String::from("hello");  // s comes into scope

    takes_ownership(s);             // s's value moves into the function...
    
    // s is no longer valid here

    let x = 5;     // x comes into scope

    makes_copy(x);   // x would move into the function,
    
    // but i32 is Copy, so it's still valid & can be used

} // Here, x goes out of scope, then s. But because s's value was moved, nothing special happens.

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{some_string}");
} //some_string goes out of scope and `drop` is called. The backing memory is freed.

fn makes_copy(some_integer: i32) { // some_integer comes into scope
    println!("{some_integer}");
} // Here, some_integer goes out of scope.

fn main() {
    let s1 = gives_ownership();         //  moves return value into s1

    let s2 = String::from("hello");     // s2 comes into scope

    let s3 = takes_and_gives_back(s2);  // s2 is moved into function  which moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 was moved, so nothing
  // happens. s1 goes out of scope and is dropped.

fn gives_ownership() -> String {      // move its return value into the function  that calls it

    let some_string = String::from("yours"); // comes into scope

    some_string    // returned and moves out to the calling  function
}

// This function takes a String and returns one
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into
                                                      // scope

    a_string  // a_string is returned and moves out to the calling function
}

When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop unless ownership of the data has been moved to another variable.

Taking ownership and then returning ownership with every function is a bit tedious. If we want to let a function use a value without transferring ownership, we can pass a reference.

References & Borrowing

A reference is like a pointer to the address where data is stored, but the data is owned by some other variable. Unlike a pointer, a reference is guaranteed to point to a valid value of a particular type for the life of that reference. The action of creating a reference is called borrowing. Borrowed references are immutable by default, and multiple immutable references can exist simultaneously. We can also create a mutable reference, but only one mutable reference to a value can exist at a time, to prevent data races. A data race occurs when two or more pointers access the same data at the same time, at least one of them is used to write to the data, and there is no mechanism to synchronize access. For the same reason, we cannot have a mutable reference while an immutable reference to the same value is still in use; it is only allowed once the immutable reference is no longer used.


fn main() {
    let mut s1 = String::from("hello");  // must be mut to allow a mutable borrow later

    let len = calculate_length(&s1);  // creates and passes a reference

    println!("The length of '{s1}' is {len}.");

    let s2 = &s1;  // immutable borrow
    println!("{s2}");  // s2 is not used after this line, so a mutable reference below is fine

    change(&mut s1);  // pass mutable reference

    println!("{s1}");
}

fn calculate_length(s: &String) -> usize {  // takes in a reference
    s.len()
}

fn change (s: &mut String){
    s.push_str(", world");
}

We can use curly brackets to create a new scope, allowing for multiple mutable references, just not simultaneous ones:

    let mut s = String::from("hello");

    {
        let r1 = &mut s;
    } // r1 goes out of scope here, so we can make a new reference with no problems.

    let r2 = &mut s;

For a contiguous sequence of elements in a collection, it might be useful to pass reference of a slice of the elements rather than the whole collection.

fn first_word(s: &String) -> usize {  // returning usize
    let bytes = s.as_bytes();  // convert strings to array of bytes

    for (i, &item) in bytes.iter().enumerate() {  // iterate over byte as indexed tuple
        if item == b' ' {  //  search for the byte representing space
            return i;
        }
    }

    s.len()  // meaningful and valid only as long as string is valid
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s); // word will get the value 5

    s.clear(); // this empties the String, making it equal to ""

    // word still has the value 5 here, but there's no more string that
    // we could meaningfully use the value 5 with. word is now totally invalid!
}
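
A string slice is a reference to part of a String, written with a range of byte indices:

    let s = String::from("hello world");

    let hello = &s[0..5];  // range of first and one-past-last index; the slice stores the start and the length
    let world = &s[6..11];
    let word = &s[..]; // both the first and last index can be dropped to take the entire string

Rewriting first_word to return a string slice ties the result to the underlying data:
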
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();

    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }

    &s[..]
}

fn main() {
    let mut s = String::from("hello world");

    let word = first_word(&s);

    s.clear(); // compiler indicates error!

    println!("the first word is: {word}");
}

The compiler will ensure the references into the String remain valid. Rust disallows the mutable reference in clear and the immutable reference in word from existing at the same time, and compilation fails. Not only has Rust made our API easier to use, but it has also eliminated an entire class of errors at compile time!

The type of string literal is &str: a slice pointing to specific point of the binary. This is also why string literals are immutable; &str is an immutable reference.

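fn first_word(s: &str) -> &str {  // &str accepts both slices of a String and string literals
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}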

A dangling pointer is a pointer that references a location in memory that may have been given to someone else; it is created by freeing some memory while preserving a pointer to that memory. In Rust, by contrast, the compiler guarantees that references will never be dangling: if you have a reference to some data, the compiler will ensure that the data will not go out of scope before the reference to the data does.

fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String { // dangle returns a reference to a String

    let s = String::from("hello"); // s is a new String

    &s // we return a reference to the String, s
} // Here, s goes out of scope, and is dropped. Its memory goes away.
  // Danger!

The solution is to return the string directly.
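
A minimal sketch of that fix, using the hypothetical name no_dangle: returning the String itself moves ownership to the caller, so nothing is dropped inside the function.

```rust
// `no_dangle` is a hypothetical name for the corrected version of `dangle`.
fn no_dangle() -> String {
    let s = String::from("hello");

    s // ownership moves out to the caller; nothing is deallocated here
}

fn main() {
    let owned = no_dangle();
    println!("{owned}");
}
```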

TODO Dereferencing with * operator

Lifetimes

Lifetimes are used to track how long references are valid. This ensures that references do not outlive the data they point to, preventing dangling references.
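
Most of the time lifetimes are inferred, but when a function returns a reference that could come from more than one argument, the relationship has to be spelled out. A minimal sketch, with longest as an illustrative function name: the annotation 'a says the returned reference is valid only as long as both inputs are.

```rust
// The returned reference lives no longer than the shorter-lived of `x` and `y`.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let a = String::from("abcd");
    let b = String::from("xyz");

    let result = longest(&a, &b);
    println!("The longest string is {result}");
    // Using `result` after `a` or `b` has gone out of scope would be
    // rejected at compile time.
}
```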

Control Flow

Conditional

Condition is expected to be a bool, other values are not automatically converted to a Boolean.

fn main() {
    let x = 42;
    if x == 0 {
        println!("zero!");
    } else if x < 100 {
        println!("small");
    } else {
        println!("large");
    }
}

We can use if as an expression, which returns the value of the last expression in the chosen block (notice the missing ; so that the value is returned).

fn main() {
    let x = 10;
    let size = if x < 20 { "small" } else { "large" }; 
    println!("number size: {}", size);
}

Note that the values in both arms must have the same data type, since the type of the size variable has to be known at compile time.

Match Case

A match expression compares a value against multiple patterns and executes code based on the first matching pattern. Patterns can be made up of

  • literal values,
  • destructured arrays, enums, structs or tuples
  • variables
  • wildcards
  • Placeholders

match VALUE {
    PATTERN => EXPRESSION,
    PATTERN => EXPRESSION,
    PATTERN => EXPRESSION,
}

#[derive(Debug)]  // required to print the state with {:?}
enum UsState {
    Alabama,
    Alaska,
    // --snip--
}

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter(UsState),  //
}

fn value_in_cents(coin: Coin) -> u8 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter(state) => {
            println!("State quarter from {state:?}!");
            25
        }
    }
}

fn main() {
    println!("quarter: {}", value_in_cents(Coin::Quarter(UsState::Alaska)));
}

Match with Option

#![allow(unused)]
fn main() {
fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        None => None,
        Some(i) => Some(i + 1),
    }
}

let five = Some(5);
let six = plus_one(five);
let none = plus_one(None);
}

Unlike if let, else if, else if let and else, match in Rust needs to be exhaustive, in the sense that all possibilities for the value in the match expression must be accounted for. One way to ensure you've covered every possibility is to have a catch-all pattern in the last arm, using either a variable name that matches any value, or _ as a placeholder that doesn't bind to a variable and ignores any other value without a warning.

    let dice_roll = 9;
    match dice_roll {
        1..=3 => add_fancy_hat(),  // match range of patterns
        5 | 7 => remove_fancy_hat(), // match multiple patterns
        other => move_player(other), // catch all other cases in variable `other`
    }

    fn add_fancy_hat() {}
    fn remove_fancy_hat() {}
    fn move_player(num_spaces: u8) {}
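
The pattern kinds listed earlier (literals, destructured tuples, variables, and the _ placeholder) can be combined in one match. A small sketch with a hypothetical describe function:

```rust
/// Describes where a point lies. `describe` is a hypothetical function
/// used only to illustrate tuple destructuring and the `_` placeholder.
fn describe(point: (i32, i32)) -> String {
    match point {
        (0, 0) => String::from("origin"),            // literal values
        (x, 0) => format!("on the x axis at {x}"),   // binds the first field to x
        (0, y) => format!("on the y axis at {y}"),   // binds the second field to y
        _ => String::from("somewhere else"),         // catch-all without binding
    }
}

fn main() {
    println!("{}", describe((0, 0)));
    println!("{}", describe((3, 0)));
    println!("{}", describe((1, 2)));
}
```

Arms are tried top to bottom, so (0, 0) has to come before the more general patterns.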

if let

if let is an expression that lets you combine if and let into a single construct to match one pattern while ignoring the rest. This means less boilerplate code and works the same way without the exhaustive checking of match.

#![allow(unused)]

fn main() {
let config_max = Some(3u8);

match config_max {
    Some(max) => println!("The maximum is configured to be {max}"),
    _ => (),  // ignore everything else i.e `None`
}

// alternatively with if let
if let Some(max) = config_max {
    println!("The maximum is configured to be {max}");
}
}

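if let with else

#![allow(unused)]

#[derive(Debug)]
enum UsState {
    Alaska,
}

enum Coin {
    Penny,
    Quarter(UsState),
}

fn main() {
    let coin = Coin::Quarter(UsState::Alaska);
    let mut count = 0;

    // with match (matching on a reference so coin can be reused below)
    match &coin {
        Coin::Quarter(state) => println!("State quarter from {state:?}!"),
        _ => count += 1,
    }

    // equivalent if let with else
    if let Coin::Quarter(state) = &coin {
        println!("State quarter from {state:?}!");
    } else {
        count += 1;
    }

    println!("non-quarter coins counted: {count}");
}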

Loops

While

A while loop is a conditional loop that runs as long as the condition is true.

fn main() {
    let mut x = 200;
    while x >= 10 {
        x = x / 2;
    }
    println!("Final x: {x}");
}

For

A for loop (for-in loop) is an iterator loop that iterates over the elements of an iterator, which can be finite or could theoretically go on indefinitely.


fn main() {
    for x in 1..5 {
        println!("x: {x}");
    }

    for x in 1..=5 {
        println!("x: {x}");
    }
    for elem in [1, 2, 3, 4, 5] {
        println!("elem: {elem}");
    }

    for (index, elem) in [1, 2, 3, 4, 5].iter().enumerate() {
        println!("index: {index}, elem: {elem}");
    }

    for i in std::iter::repeat(5) {
        println!("turns out {i} never stops being 5");
        break; // would loop forever otherwise
    }
}

Under the hood for loops use a concept called "iterators" to handle iterating over different kinds of ranges/collections.
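
Roughly speaking, a for loop obtains an iterator and keeps calling next until it returns None. The sketch below writes the same sum both ways; the function names are hypothetical, and the hand-written version is an approximation of what the compiler does, not its exact expansion.

```rust
/// Sums a slice with an ordinary for loop.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for v in values {
        total += v;
    }
    total
}

/// Sums a slice by driving the iterator by hand, which is roughly
/// what the for loop above does behind the scenes.
fn sum_with_iterator(values: &[i32]) -> i32 {
    let mut total = 0;
    let mut iter = values.iter();
    while let Some(v) = iter.next() {
        total += v;
    }
    total
}

fn main() {
    let values = [1, 2, 3];
    println!("{} {}", sum_with_for(&values), sum_with_iterator(&values));
}
```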

Loop

loop is the idiomatic way to write an infinite loop, similar to while true.

fn main() {

}
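
The example above was left empty, so here is a small sketch of my own. Since loop only ends through break, it can also hand a value back: break with an expression makes the whole loop evaluate to that value. The function name first_power_of_two_above is hypothetical, and the sketch assumes limit is small enough that doubling does not overflow a u32.

```rust
/// Returns the first power of two strictly greater than `limit`.
/// Hypothetical helper; assumes `limit` is small enough to avoid overflow.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut n = 1;
    // The loop is an expression: its value is whatever `break` carries out.
    loop {
        if n > limit {
            break n;
        }
        n *= 2;
    }
}

fn main() {
    println!("{}", first_power_of_two_above(100));
}
```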

break and continue

fn main() {
    let mut i = 0;
    loop {
        i += 1;
        if i > 5 {
            break;
        }
        if i % 2 == 0 {
            continue;
        }
        println!("{}", i);
    }
}

Both continue and break can optionally take a label argument, which is used to break out of nested loops. Labels can be used with loop, while, and for.

fn main() {
    let s = [[5, 6, 7], [8, 9, 10], [21, 15, 32]];
    let mut elements_searched = 0;
    let target_value = 10;
    'outer: for i in 0..=2 {
        for j in 0..=2 {
            elements_searched += 1;
            if s[i][j] == target_value {
                break 'outer;
            }
        }
    }
    print!("elements searched: {elements_searched}");
}

Example: Fibonacci Number


fn fib(n: u32) -> u32 {
    if n < 2 {
        todo!("Implement this");
    } else {
        todo!("Implement this");
    }
}

fn main() {
    let n = 20;
    println!("fib({n}) = {}", fib(n));
}

Solution

fn fib(n: u32) -> u32 {
    if n < 2 {
        return n;
    } else {
        return fib(n - 1) + fib(n - 2);
    }
}

fn main() {
    let n = 20;
    println!("fib({n}) = {}", fib(n));
}

Functions

The last expression in a function body (or any block) becomes the return value if the ; is omitted at the end of the expression. The return keyword can also be used for an early return. Some functions have no return value and return the 'unit type', (). The compiler will infer this if the -> () return type is omitted.

fn gcd(a: u32, b: u32) -> u32 { // parameters are annotated with their types; -> gives the return type
    if b > 0 {
        gcd(b, a % b)   // last expression in the block is returned because ; is omitted
    } else {
        a               // last expression in the block is returned because ; is omitted
    }
}

fn main() {
    println!("gcd: {}", gcd(143, 52));  // calling gcd with argument values
}

Rust code uses snake case as the conventional style for function and variable names, in which all letters are lowercase and underscores separate words.

Overloading is not supported: each function has a single implementation. A function always takes a fixed number of parameters; default arguments are not supported, though macros can be used to support variadic calls. A function also takes a single set of parameter types, but these types can be generic.

Statements are instructions that perform some action and do not return a value. Expressions evaluate to a resultant value. Rust is an expression-based language.
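A small sketch of the difference, using the hypothetical functions describe and block_value. A let binding is a statement, while an if or a { } block is an expression that produces a value.

```rust
// The whole `if` chain is an expression, so it serves as the
// function's return value without the `return` keyword.
fn describe(n: i32) -> &'static str {
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}

fn block_value() -> i32 {
    let x = 5; // statement: binds a name, produces no value
    let y = {
        let doubled = x * 2;
        doubled + 1 // no semicolon: this is the value of the block
    };
    y
}

fn main() {
    println!("{}", describe(-3));
    println!("{}", block_value());
}
```

Adding a semicolon after doubled + 1 would turn it into a statement, and the block would then evaluate to () instead of an i32.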

Structs and Enums

A structure (struct) is a custom data type that lets you package together multiple related, named values (fields) in a meaningful group, similar to an object's data attributes. We create an instance of a struct by specifying concrete values for each field as key: value pairs, and access the values using dot notation. Individual fields cannot be marked as mutable, so the entire instance must be declared mutable.


struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}

fn main() {
    let mut user1 = User {
        active: true,
        sign_in_count: 1,
        username: String::from("someone123"),
        email: String::from("someone@example.com"),
    };

    user1.sign_in_count += 1;

    println!("user1.username: {}", user1.username);

    let user2 = User {
        email: String::from("someone456@example.com"),
        ..user1   // remaining fields are taken from user1 (struct update syntax)
        // username is a String, so it is moved rather than copied:
        // user1.username can no longer be used after this point
    };

    println!("user2.sign_in_count: {}", user2.sign_in_count);
    // println!("user1.username: {}", user1.username); // error: value was moved into user2

}

// functions can use the field init shorthand syntax rather than repeating each field name
fn build_user(email: String, username: String) -> User {
    User {
        active: true,
        username,
        email,
        sign_in_count: 1,
    }
}

Special Structs

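Tuple structs are similar to tuples: the struct has a name, but its fields do not. Each tuple struct is its own type, so Color and Point below are different types even though their fields have the same types.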

struct Color(i32, i32, i32);
struct Point(i32, i32, i32);

fn main() {
    let black = Color(0, 0, 0);
    let origin = Point(0, 0, 0);
}
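A short sketch of how tuple struct fields are read, by position or by destructuring. The helper channel_sum is a hypothetical name used only for illustration.

```rust
struct Color(i32, i32, i32);
struct Point(i32, i32, i32);

// Sums the three channels; tuple struct fields are accessed by position.
fn channel_sum(color: &Color) -> i32 {
    color.0 + color.1 + color.2
}

fn main() {
    let grey = Color(10, 20, 30);
    println!("channel sum: {}", channel_sum(&grey));

    // Destructuring works as it does for plain tuples, with the type name in front.
    let Point(x, y, z) = Point(1, 2, 3);
    println!("x = {x}, y = {y}, z = {z}");

    // channel_sum(&Point(1, 2, 3)); // would not compile: Point is not Color
}
```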

Unit-like structs don't have any fields at all, similar to the unit type (). They are useful when you need to implement a trait on some type but don't have any data that you want to store in the type itself.

struct AlwaysEqual;

fn main() {
    let subject = AlwaysEqual;
}

Enum

An enumeration (enum) lets you define a type by enumerating its possible variants. An enum value is exactly one of those variants at a time, never several at once. Each variant can optionally carry data, declared by giving the variant an associated data type.

#![allow(unused)]
fn main() {
    enum IpAddr {
        V4(String),
        V6(String),
    }

    let home = IpAddr::V4(String::from("127.0.0.1"));

    let loopback = IpAddr::V6(String::from("::1"));

    fn route(ip_addr: IpAddr) {}  // can be called with either variant
}

The standard library defines its own IpAddr enum (std::net::IpAddr), whose variants wrap the structs Ipv4Addr and Ipv6Addr instead of a String.

A popular enum defined by the standard library is Option, which is included in the prelude. It encodes the common scenario in which a value could be something or nothing, and takes the place of the null value found in other languages. Because absence is expressed in the type, the compiler can check that you have handled every case, which prevents bugs related to null references. You have to convert an Option<T> to a T before you can perform T operations with it. This helps catch one of the most common issues with null: assuming that something isn't null when it actually is.

#![allow(unused)]
fn main() {
    // Defined by the standard library and already in scope via the prelude:
    // enum Option<T> {
    //     None,
    //     Some(T),
    // }

    let absent_number: Option<i32> = None;
    let some_number = Some(5);
}
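To illustrate converting an Option<T> into a T, here is a minimal sketch. The functions safe_div and describe are hypothetical names, not standard library items; match, unwrap_or and the Some/None variants are the real API.

```rust
// Returns None instead of dividing by zero.
fn safe_div(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        Some(a / b)
    }
}

// `match` forces both cases to be handled before the i32 can be used.
fn describe(result: Option<i32>) -> String {
    match result {
        Some(q) => format!("quotient is {q}"),
        None => String::from("cannot divide by zero"),
    }
}

fn main() {
    println!("{}", describe(safe_div(10, 2)));
    println!("{}", describe(safe_div(1, 0)));

    // unwrap_or supplies a fallback value when the Option is None.
    let q = safe_div(1, 0).unwrap_or(0);
    println!("with fallback: {q}");
}
```

Leaving out the None arm of the match is a compile error, which is how the compiler enforces that the "nothing" case is handled.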

Another very common enum is Result, which represents either success (Ok) or failure (Err).

Trait

Methods

All functions defined within an impl block are called associated functions.

Methods are associated functions that specify the behaviors of a type. Unlike plain functions, they are defined within the context of a struct (or an enum or a trait object). Their first parameter is always self, which represents the instance the method is being called on; it is usually taken by reference (&self or &mut self) but can also be taken by value. We don't have to repeat the type of self in every method. Keeping methods in an impl block also puts everything we can do with an instance of a type in one place, rather than scattering it across the library. Each struct is nevertheless allowed to have multiple impl blocks.


struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    fn area(&self) -> u32 { // &self is short for self: &Self, where Self is an alias for the impl's type
        self.width * self.height
    }
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };

    println!("The area of the rectangle is {}", rect1.area());
}

Unlike C/C++, which use . to call a method on an object and -> to call a method through a pointer, Rust has automatic referencing and dereferencing. When you call a method, Rust automatically adds &, &mut, or * so that the object matches the signature of the method. For example, p1.distance(&p2); is equivalent to (&p1).distance(&p2);. Given the receiver and name of a method, Rust can figure out definitively whether the method is reading (&self), mutating (&mut self), or consuming (self).
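The p1.distance(&p2) call can be sketched as follows. The Point struct and its distance and translate methods are assumptions made for this example; they are not defined elsewhere in this book.

```rust
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Reads both points, so it borrows self immutably.
    fn distance(&self, other: &Point) -> f64 {
        let dx = self.x - other.x;
        let dy = self.y - other.y;
        (dx * dx + dy * dy).sqrt()
    }

    // Mutates the point, so it borrows self mutably.
    fn translate(&mut self, dx: f64, dy: f64) {
        self.x += dx;
        self.y += dy;
    }
}

fn main() {
    let mut p1 = Point { x: 0.0, y: 0.0 };
    let p2 = Point { x: 3.0, y: 4.0 };

    // Both calls are equivalent: Rust inserts the & on the receiver for us.
    println!("{}", p1.distance(&p2));
    println!("{}", (&p1).distance(&p2));

    // Here Rust inserts &mut, because translate takes &mut self.
    p1.translate(3.0, 4.0);
    println!("{}", p1.distance(&p2));
}
```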

We can define associated functions that don't have self as their first parameter (and thus are not methods). One example is the String::from function, which is defined on the String type. Such associated functions are often used as constructors that return a new instance of the struct. They are often called new, but new isn't a special name and isn't built into the language.

impl Rectangle {
    fn square(size: u32) -> Self {
        Self {
            width: size,
            height: size,
        }
    }
}

To call this associated function, we use the :: syntax with the struct name; let sq = Rectangle::square(3); is an example. This function is namespaced by the struct.

Examples


#[derive(Debug)]
struct Fibonacci {
    current: u8,
    previous: u8,
}

impl Fibonacci {
    fn new() -> Self {
        Self {
            current: 0,
            previous: 0,
        }
    }
}

impl Iterator for Fibonacci {
    type Item = u8;

    fn next(&mut self) -> Option<Self::Item> {
        if self.current == 0 {
            self.current = 1;
            return Some(self.current);
        }

        let next_value = self.previous.checked_add(self.current)?;
        self.previous = self.current;
        self.current = next_value;
        Some(self.current)
    }
}

fn main() {
    for fb in Fibonacci::new() {
        println!("{fb}");
        if fb > 30 {
            break;
        }
    }
    println!("------");
    for fb in Fibonacci::new() {
        println!("{fb}");
    }
}

Error Handling

Rust requires you to acknowledge the possibility of an error and take some action before your code will compile. Rust groups errors into two major categories:

  1. recoverable: e.g. a file-not-found error -> report and retry without stopping the program
    • use the Result<T, E> enum with two variants, Ok(T) and Err(E)
  2. unrecoverable: e.g. symptoms of bugs such as a failed bounds check, or conditions such as running out of memory
    • use panic! to stop execution
    • use non-panicking APIs (e.g. Vec::get) if crashing is not acceptable.
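As a sketch of the non-panicking route, Vec::get (available on slices in general) returns an Option instead of panicking on an out-of-range index. The helper element_or_default is a hypothetical name used only for this example.

```rust
// Looks up `index` without risking a panic; falls back to `default`
// when the index is out of range.
fn element_or_default(values: &[i32], index: usize, default: i32) -> i32 {
    match values.get(index) {
        Some(value) => *value,
        None => default,
    }
}

fn main() {
    let v = vec![1, 2, 3];
    println!("{}", element_or_default(&v, 1, -1));
    println!("{}", element_or_default(&v, 99, -1));
    // v[99] would panic here instead of returning a fallback.
}
```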

Unrecoverable error with panic!

Rust handles fatal errors (unrecoverable and unexpected) with a "panic". A panic can be triggered

  • by taking an action that causes the code to panic, usually a symptom of a bug in the program logic
  • by explicitly calling the panic! macro

By default, a panic prints a failure message, unwinds, cleans up the stack, and quits. Unwinding means Rust walks back up the stack and cleans up the data from each function it encounters. The unwinding can be caught. Alternatively, we can abort immediately on panic without cleaning up, which produces a smaller binary, by adding the following lines to Cargo.toml.

[profile.release]
panic = 'abort'

fn main() {
    panic!("crash and burn");
}

The error message includes the location src/main.rs:2:5, which indicates the second line, fifth character of our src/main.rs file.

In many cases the call to panic! sits inside code that our code calls, possibly in a different file. The filename and line number reported are then those of the panic! call itself, not of the line in our code that eventually led to it.

fn main() {
    let v = vec![1, 2, 3];

    v[99];  // accessing invalid index 
}

In some other languages, such a buffer overread can lead to security vulnerabilities, but Rust triggers a panic and stops execution. The panic output includes a note telling us to run with RUST_BACKTRACE set to get a backtrace (a list of all the function calls leading up to the panic) showing exactly what happened to cause the error.

$ RUST_BACKTRACE=1 cargo run

Catching the panic


use std::panic;

fn main() {
    let result = panic::catch_unwind(|| "No problem here!");
    println!("{result:?}");

    let result = panic::catch_unwind(|| {
        panic!("oh no!");
    });
    println!("{result:?}");
}

Recoverable error with Result<T,E>

Many functions in Rust return a Result to state whether the operation succeeded or failed. When reading a file, for example, the file might not exist or we might not have permission to access it, which leads to different kinds of errors. The Result enum can convey this information.


use std::fs::File;       // File is the T in Result<File, std::io::Error>
use std::io::ErrorKind;  // ErrorKind classifies the std::io::Error in the Err variant

fn main() {
    let greeting_file_result = File::open("hello.txt");  // returns a result

    let greeting_file = match greeting_file_result {
        Ok(file) => file,
        Err(error) => match error.kind() { 
            ErrorKind::NotFound => match File::create("hello.txt") {
                Ok(fc) => fc,
                Err(e) => panic!("Problem creating the file: {e:?}"),
            },
            other_error => {
                panic!("Problem opening the file: {other_error:?}");
            }
        },
    };
}

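match can be verbose and does not always communicate intent well. Helper methods defined on Result can be more convenient for specific tasks, such as:

  • unwrap(): returns the value inside Ok, or calls panic! if the Result is Err.
  • expect(msg): like unwrap(), but lets us supply a more helpful panic message.
  • unwrap_or_else(closure): returns the value inside Ok, or runs the closure on the error to produce a fallback.
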
use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let greeting_file = File::open("hello.txt").unwrap_or_else(|error| {
        if error.kind() == ErrorKind::NotFound {
            File::create("hello.txt").unwrap_or_else(|error| {
                panic!("Problem creating the file: {error:?}");
            })
        } else {
            panic!("Problem opening the file: {error:?}");
        }
    });
}
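The same helpers can be tried without touching the file system. This is a minimal sketch that uses str::parse, which also returns a Result; the function parse_or_zero is a hypothetical name chosen for illustration.

```rust
// Parses an integer, falling back to 0 (and logging the error)
// instead of panicking when the text is not a valid number.
fn parse_or_zero(text: &str) -> i32 {
    text.trim().parse::<i32>().unwrap_or_else(|err| {
        eprintln!("could not parse {text:?}: {err}");
        0
    })
}

fn main() {
    // expect panics with our message if the Result is Err.
    let n: i32 = "42".parse().expect("literal should be a valid integer");
    println!("{n}");

    println!("{}", parse_or_zero(" 7 "));
    println!("{}", parse_or_zero("abc"));
}
```

unwrap and expect suit cases where failure indicates a bug, while unwrap_or_else keeps the program running with a fallback value.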

Packages, Crates and Modules

As a program grows, it is important to group related functionality and to separate code with distinct features. Code organization in Rust is done using the concepts of packages, crates, and modules.

Project code is split into multiple modules. A module often lives in its own file, although it can also be declared inline. A module contains one or more items, such as functions, structs, enums, and constants.

A package is a bundle of one or more crates, described by a Cargo.toml file. A crate is the unit the Rust compiler builds: either a binary or a library.

  • Packages: a Cargo feature that lets you build, test, and share crates
  • Crates: a tree of modules that produces a library or executable
  • Modules and use: let you control the organization, scope, and privacy of paths
  • Paths: a way of naming an item, such as a struct, function, or module
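A minimal single-file sketch of modules, paths, privacy, and use. The module names geometry and shapes and the functions inside them are hypothetical; in a real project each module would often live in its own file.

```rust
mod geometry {
    pub mod shapes {
        // Public: reachable from outside the module.
        pub fn square_area(side: u32) -> u32 {
            multiply(side, side)
        }

        // Private: only code inside `shapes` can call this.
        fn multiply(a: u32, b: u32) -> u32 {
            a * b
        }
    }
}

// `use` brings a path into scope so the short name can be used.
use geometry::shapes::square_area;

fn main() {
    // Full path and shortened name refer to the same item.
    println!("{}", geometry::shapes::square_area(3));
    println!("{}", square_area(4));

    // geometry::shapes::multiply(2, 2); // would not compile: `multiply` is private
}
```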