Getting Rusty
The goal of this mdBook is to track my journey of getting Rusty and to cover the full spectrum of Rust. The standard path for learning Rust is through the official Rust Programming Language book, followed by the interactive Rustlings course. Those with a background in statically compiled languages can also start with Rust by Example or Comprehensive Rust by Google. Many examples in this mdBook are copied or adapted from these resources.
Rust is a multi-paradigm systems programming language focusing on safety, speed, and concurrency.
Multi-paradigm
- Imperative Programming
- Object Oriented Programming with struct, enum, traits, methods
- Functional Programming: immutability, higher-order functions, pattern matching
Compile time memory safety
Different types of memory bugs are prevented at compile time.
- No uninitialized variables.
- No double-frees.
- No use-after-free.
- No NULL pointers: no issue of dereferencing null pointers, dangling pointers.
- No forgotten locked mutexes.
- No data races between threads.
- No iterator invalidation.
- No undefined runtime behavior - what a Rust statement does is never left unspecified
- Array access is bounds checked.
- Integer overflow is defined (panic or wrap-around).
- Many abstractions such as iterators are zero-cost. There is no garbage collector, so you use exactly as much memory as is required at any given time.
Furthermore, good compiler error messages make writing and debugging Rust code easier and more productive.
Rust is fast and resource efficient
- Rust is statically compiled with rustc, which uses LLVM as its backend. Performance (runtime and memory) is comparable to C/C++.
- Full support for concurrency using OS threads with mutexes and channels. This is referred to as "fearless concurrency": you rely on the compiler to ensure correctness at runtime.
- Rust also provides unsafe Rust for even faster operations (see the Nomicon).
with expressive language features
- Generics.
- No-overhead foreign function interface (FFI). Function calls between Rust and C have identical performance to C function calls.
- Built-in dependency manager: cargo.
- Built-in support for testing.
- Excellent Language Server Protocol support.
Other features
- Strong, static yet expressive type system influenced by Haskell. Types allow the compiler to detect potential problems so that they can be avoided.
- Concurrency can be done with any technique; thread safety is ensured by the same type system that ensures memory safety.
- Cross platform: compile to different systems, embedded systems and even web as WebAssembly (WASM)
- C interoperability, but use of C reduces memory safety
- supports many platforms and architectures: x86, ARM, WASM, Linux, Mac, Windows, ...
Rust has been voted the "most loved programming language" in the Stack Overflow Developer Survey since 2016. But it is not the only language that has been voted as the most loved programming language. Google gathered data from its engineers to understand the Rust learning curve and found that they were proficient in Rust in less than 2 months.
How to use this Guide
There are several useful keyboard shortcuts in mdBook:
- Arrow-Left: Navigate to the previous page.
- Arrow-Right: Navigate to the next page.
- Ctrl + Enter: Execute the code sample that has focus.
- s: Activate the search bar.
There are code blocks like the following in this book, which allow you to edit and run the code as you wish.
fn main() { println!("Hello 🌍!"); }
And yes, it looks a lot like C/C++: the main function, defined with the fn keyword, is the entry point of the program. Blocks are delimited by curly braces, and a semicolon ends a statement.
Environment Setup
To get a more developer-like experience, you need to install the Rust toolchain on your machine.
Install the Rustup toolchain installer. The site suggests the following command for Linux:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
For Windows, check Rust in Windows; you also need to install the Visual Studio build tools.
A toolchain is the combination of a compilation target (converting rust code to machine code for a platform) and a release channel (release version such as stable, beta, nightly). Target platforms are split into tiers, from “guaranteed-to-work” Tier 1 to “best-effort” Tier 3.
rustup toolchain list
will give you an overview of which toolchains are installed on your system.
The rustup tool can manage multiple versions of the Rust compiler on a machine. To check the installation and update the compiler version, run: rustup update
If you want to develop inside a dev container, you can create one by following through this tutorial. In short install and configure Docker in your system, install "Dev Container" extension in VSCode and add and edit a dev container configuration file to the repo and rebuild the container to get started.
Alternatively you can try code in Online Rust Playground
IDEs and Tools
There are different IDEs available for Rust development. Visual Studio Code is the most popular one.
To set it up for Rust, install rust-analyzer, an implementation of the Language Server Protocol for Rust, as a VS Code extension. It provides code completion, syntax highlighting, and other features. The toolset also includes linting provided by rustc and clippy to detect issues with your code.
If you are using Vim, you will have to install rust-analyzer from rustup and rust.vim.
JetBrains also provides a separate IDE, RustRover.
Cargo : Rust build tool, dependency manager and package manager
rustc is the Rust compiler that allows us to compile a file, but managing multiple files and dependencies by hand can be difficult, hence we use Cargo.
- Start a project: cargo new prj_name creates a folder with a Cargo.toml manifest file to track metadata and dependencies, and src/main.rs with a hello world program.
  - initialize with a new git repo as well: cargo new --vcs=git prj
  - creating a binary executable project: cargo new prj_name --bin
  - creating a library project: cargo new prj_name --lib creates lib.rs instead of main.rs as the crate root.
- Quickly check that the code compiles without producing an executable: cargo check
- cargo add crate_name: adds the latest version of the crate from the package registry crates.io to the Cargo.toml file. To use the library, import its items in your source code: use crate_name::func; All crates on crates.io automatically have their documentation built and available on docs.rs.
  eg: cargo add rand
  To use the dependency in the code, add the following lines (this is the rand 0.8 API; rand 0.9 renames these to rand::rng() and random_range):
  use rand::Rng;
  let rand_num = rand::thread_rng().gen_range(1..=100);
  - use --git URL --branch branch --tag tag to add a crate from a git URL, branch, and tag
  - use --dev to add it as a dev dependency, --build to add it as a build dependency, --target to add it as a dependency for the given target platform
  - use -F features or --features features to add/activate additional features available in the crate
- Build your project: cargo build compiles into the target/debug folder, which holds the compiled binary and other build files.
  - binaries can also be generated using rustc main.rs, but building with cargo is easier when there are multiple files and dependencies
  - building for release compiles with optimizations into target/release: cargo build --release
  - the build also creates a Cargo.lock file, which records the exact versions of the dependencies that fit the criteria in Cargo.toml and ensures we rebuild the same artifact every time
- cargo update ignores the Cargo.lock file and updates crates to newer versions that fit the specification in Cargo.toml.
- Compile and run the project: cargo run
- Test your project: cargo test
- Build documentation for your project: cargo doc --open builds into target/doc, compiling doc comments (///) from the code into documentation, and also opens the docs in a browser.
- Remove generated artifacts: cargo clean removes the target folder.
- Publish a library to crates.io: cargo publish
Cargo is extensible with subcommand plugins such as:
- Linter: Clippy. Install using rustup component add clippy and run (automatically applying fixes) using cargo clippy --fix
- Formatter: Rustfmt. Auto-formats code to community standards; run using cargo fmt
Other advanced features: workspaces, build scripts.
Variables and Data Types
A variable is a named abstraction of memory (a storage location) that acts as a container for data. It has:
- a unique identifier or name,
- a data type that specifies what kind of data the variable can hold,
- the value stored in the variable,
- a reference to, or the address of, the memory location where the data is stored,
- a scope, the context within which the variable is accessible. It can be local (within a function or block) or global (throughout the program).
- a lifetime during which the variable exists in memory. It can be static (exists for the entire program execution) or dynamic (exists only during certain parts of the program).
- Optionally, type qualifiers that modify the behavior of the variable. In Rust we can make a variable mutable using the keyword mut.
Variable binding associates a variable with a specific value. This is done through the let keyword.
Rust provides type safety via static typing: the type of every variable must be known at compile time. Hence, the type must either be declared (with a colon) or inferred from the assigned value during type checking at compile time. Static typing helps catch type errors early in the development process, reducing the likelihood of runtime errors related to type mismatches.
fn main() {
    let x: i32 = 42;
    let y = 17; // type i32 inferred from value during compilation
    println!("x: {x}, y: {y}");
    // x = 24; // cannot assign twice to immutable variable
    // println!("x: {x}");
}
In Rust, variables are immutable by default. We can make a variable mutable using the mut keyword. We can only change its value, not its type.
fn main() { let mut x: i32 = 42; println!("x: {x}"); x = 24; println!("x: {x}"); }
This doesn't mean let creates constants. Constants are values that are bound to a name, must be type annotated, and cannot be declared with mut. Constants can be declared in any scope but are usually defined in the global scope. Their values are fixed at compile time.
const PI: f32 = 3.14159265; fn main(){ let r: f32 = 2.0; println!("{}",PI*r.powf(2.0)); }
Even if a variable cannot be changed, its name can be reused by declaring the variable again. The first variable is said to be shadowed by the second. The new variable can even have a different type.
fn main() {
    let x = "Five";
    let x = x.len(); // shadowing to store a different data type
    {
        let x = x * 2; // shadowing to create a new variable in the inner scope
        println!("x in the inner scope is: {x}");
    }
    println!("x in outer scope is: {x}");
}
Using descriptive names for variables and functions makes the code more readable and easier to understand. The Rust community follows snake_case for variable and function names to ensure consistency and readability. Prefixing boolean variables with is_, has_, or can_ makes their purpose clear, e.g. is_even, has_permission, can_open.
Use clear and universally understood terms instead of abbreviations to enhance code readability.
Data Types and Values
use std::io;
fn main(){
println!("Please input your guess.");
let mut guess = String::new();
io::stdin()
.read_line(&mut guess) // read line from stdin
.expect("Failed to read line"); // error handling
let guess: i32 = guess.trim().parse().expect("Not a number!"); // trim end spaces and parse to annotated format
println!("{}",guess);
}
Basic built-in types, and the syntax for literal values of each type.
| | Types | Literals |
|---|---|---|
| Signed integers | i8, i16, i32, i64, i128, isize | -10, 0, 1_000, 123_i64 |
| Unsigned integers | u8, u16, u32, u64, u128, usize | 0, 123, 10_u16, 0xfa, 0b0010_1011 |
| Floating point numbers | f32, f64 | 3.14, -10.0e20, 2_f32 |
| Unicode scalar values | char | 'a', 'α', '∞', '😍' |
| Booleans | bool | true, false |
The types have widths as follows:
- iN, uN, and fN are N bits wide,
- isize and usize are the width of a pointer, depending on the computer architecture,
- char is 32 bits wide,
- bool is 8 bits wide.
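The widths listed above can be checked with std::mem::size_of. This is only a small sketch; pointer_width_bits is a hypothetical helper name, and its result depends on the target the code is compiled for.

```rust
use std::mem::size_of;

/// Width of `usize` in bits on the current target (hypothetical helper for illustration).
fn pointer_width_bits() -> usize {
    size_of::<usize>() * 8
}

fn main() {
    println!("i8:    {} byte(s)", size_of::<i8>());
    println!("u64:   {} byte(s)", size_of::<u64>());
    println!("f32:   {} byte(s)", size_of::<f32>());
    println!("char:  {} byte(s)", size_of::<char>());
    println!("bool:  {} byte(s)", size_of::<bool>());
    // isize and usize always have the same width as each other
    println!("usize: {} bits", pointer_width_bits());
}
```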
The signed numbers are stored using two's complement.
Integer overflow occurs when a value falls outside the range of its type, e.g. assigning 256 to a u8, whose range is 0 to 255.
- In debug mode, Rust includes checks for integer overflow that cause your program to panic at runtime if this behavior occurs.
- In release mode, i.e. compiled with the --release flag, Rust doesn't panic but performs two's complement wrapping, i.e. 256 becomes 0, 257 becomes 1, ...
Relying on integer overflow's wrapping behavior is considered an error, hence Rust provides families of methods to handle overflow explicitly.
- Wrap in all modes with the wrapping_* methods, such as wrapping_add.
- Return the None value if there is overflow with the checked_* methods.
- Return the value and a boolean indicating whether there was overflow with the overflowing_* methods.
- Saturate at the value's minimum or maximum values with the saturating_* methods, e.g. (a * b).saturating_add(b * c).saturating_add(c * a).
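As a rough illustration of the four method families above, the following sketch applies each of them to the same u8 addition. The helper name add_variants is made up for this example.

```rust
/// Applies the four explicit overflow-handling families to `a + b` on `u8`.
/// (`add_variants` is a hypothetical helper used only for this sketch.)
fn add_variants(a: u8, b: u8) -> (u8, Option<u8>, (u8, bool), u8) {
    (
        a.wrapping_add(b),    // wraps around on overflow
        a.checked_add(b),     // None on overflow
        a.overflowing_add(b), // (wrapped value, overflow flag)
        a.saturating_add(b),  // clamps at u8::MAX
    )
}

fn main() {
    // 250 + 10 does not fit into a u8 (maximum 255)
    let (wrapped, checked, overflowed, saturated) = add_variants(250, 10);
    println!("wrapping:    {wrapped}");
    println!("checked:     {checked:?}");
    println!("overflowing: {overflowed:?}");
    println!("saturating:  {saturated}");
}
```

Which family to pick depends on what an overflow means in the program: checked_* suits cases where overflow is an error to report, saturating_* suits counters or gauges that should simply stop at the limit.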
Rust provides addition, subtraction, multiplication, division, and remainder operators. Here's the list of operators.
Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.
An array is a data type that stores N values of the same type T. The length N (a compile-time constant) is part of the type [T; N], hence [u8; 3] and [u8; 4] are considered two different types.
fn main() {
    let mut a: [i8; 10] = [42; 10];
    a[5] = 0; // indexing to access an array element
    println!("a: {a:?}");
}
Try accessing an out-of-bounds array element. Array accesses are checked at runtime and panic when the index is beyond the array's bounds. Rust can usually optimize these checks away, and they can be avoided using unsafe Rust.
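When a panic on a bad index is not wanted, the slice method get returns an Option instead. A minimal sketch, where element_at is a hypothetical helper name:

```rust
/// Returns the element at `index`, or `None` when the index is out of bounds.
/// (`element_at` is a hypothetical helper used only for this sketch.)
fn element_at(values: &[i8; 10], index: usize) -> Option<i8> {
    values.get(index).copied()
}

fn main() {
    let a: [i8; 10] = [42; 10];
    println!("{:?}", element_at(&a, 5));
    println!("{:?}", element_at(&a, 10)); // one past the end
    // Indexing with `a[index]` while `index` is 10 would panic instead.
}
```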
We can use literals to assign values to arrays.
The println! macro asks for the debug implementation with the ? format parameter: {} gives the default output, {:?} gives the debug output. Types such as integers and strings implement the default output, but arrays only implement the debug output. This means that we must use debug output here.
Adding #, eg {a:#?}, invokes a "pretty printing" format, which can be easier to read.
Like arrays, tuples have a fixed length but tuples can group together values of different types into a compound type. Fields of a tuple can be accessed by the period and the index of the value, e.g. t.0, t.1. The empty tuple () is referred to as the "unit type" and signifies Void or absence of a return value.
fn main() {
    let t: (i8, bool) = (7, true);
    println!("t.0: {}", t.0); // indexing to access tuple fields
    println!("t.1: {}", t.1);
    let (x, y) = t; // destructuring the tuple into two variables x, y
    println!("x: {x}, y: {y}");
}
Type inference
Rust compiler infers types based on constraints given by variable declarations and usages. Machine code generated by such declaration is identical to the explicit declaration of a type.
fn takes_u32(x: u32) { println!("u32: {x}"); } fn takes_i8(y: i8) { println!("i8: {y}"); } fn main() { let x = 10; let y = 20; takes_u32(x); takes_i8(y); // takes_u32(y); }
Type Conversion
- using the as keyword
- using the try_from method, which returns a Result so that a failed conversion can be handled
- using the Wrapping type, whose arithmetic wraps around on overflow instead of panicking
use std::convert::TryFrom; use std::num::Wrapping; fn main() { let x: u8 = 10; let y: i16 = x as i16; println!("y: {y}"); let unsigned: u32 = 42; match i32::try_from(unsigned) { Ok(signed) => println!("Signed value: {}", signed), Err(e) => println!("Conversion failed: {}", e), } let unsigned: u32 = 42; let signed: i32 = Wrapping(unsigned).0 as i32; println!("Signed value: {}", signed); }
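The difference between as and try_from only shows up when the value does not fit into the target type. The sketch below uses two hypothetical helpers, lossy and checked, to compare them on the same inputs.

```rust
use std::convert::TryFrom; // already in the prelude since the 2021 edition

/// Converts with `as`, which silently truncates out-of-range values.
fn lossy(value: i32) -> u8 {
    value as u8
}

/// Converts with `try_from`; an out-of-range value is reported as `None` here.
fn checked(value: i32) -> Option<u8> {
    u8::try_from(value).ok()
}

fn main() {
    println!("300 as u8         = {}", lossy(300));
    println!("u8::try_from(300) = {:?}", checked(300));
    println!("u8::try_from(200) = {:?}", checked(200));
}
```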
Blocks and Scopes
A block in Rust contains a sequence of expressions, enclosed by braces {}. Each block has a value and a type, which are those of the last expression of the block:
fn main() {
    let z = 13;
    let x = {
        let y = 10;
        println!("y: {y}");
        z - y // if this line is commented out, the block returns ()
    };
    println!("x: {x}");
}
A variable's scope is limited to the enclosing block.
You can shadow variables, both those from outer scopes and variables from the same scope:
fn main() { let a = 10; println!("before: {a}"); { let a = "Hi"; println!("inner scope: {a}"); let a = true; println!("shadowed in inner scope: {a}"); } println!("after: {a}"); }
Shadowing is different from mutation: after shadowing, both variables' memory locations exist at the same time. Both are available under the same name, depending on where you use it in the code.
Memory Management & Ownership
All programs need to manage memory to use it efficiently and to prevent memory leaks, for safety and program stability. There are a few approaches to memory management:
- Manual memory allocation and deallocation (e.g. in C/C++) gives fine-grained control but can lead to memory leaks and segmentation faults:
  - if we forget to free memory, it is wasted,
  - if we free it too early, we are left with an invalid variable,
  - if we free it twice, we get a double-free bug.
- Automatic memory management using garbage collection (GC) (e.g. Java, Python) regularly checks for unused memory and frees it. Some languages use Automatic Reference Counting (ARC) (e.g. Swift), which keeps a count of references to each object and deallocates the object when the count drops to zero.
- Ownership and borrowing (Rust): a set of rules enforced at compile time, where each value has a single owner at a time and memory is automatically freed (i.e. the value is dropped) when the owner goes out of scope.
Stack and Heap Allocation
| | Stack | Heap |
|---|---|---|
| Structure | contiguous block of memory, hence size known and fixed | non-contiguous allocation, less organized; the memory allocator finds an empty spot and returns a pointer to it |
| Usage | static memory: local variables, function call management | dynamic memory: objects, data structures |
| Management | allocated by the compiler when a function is called, deallocated when it exits | manually allocated and deallocated, depending on the programming language |
| Speed | faster alloc/dealloc and access | slower alloc/dealloc and access due to complexity and following pointers |
| Scope | limited to function/block | usually global, as long as a reference exists |
| Safety | follows LIFO to reduce memory corruption | prone to memory leaks and fragmentation |
When your code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function’s local variables get pushed onto the stack. When the function is over, those values get popped off the stack. The main purpose of ownership is to manage heap data.
Most primitive data types have a known, fixed size, hence they are stored on the stack and easily popped off when their scope is over. String literals, e.g. "hello", are fixed and known at compile time. Dynamic data types such as String, whose size is unknown at compile time and might change during runtime, need to allocate memory on the heap.
{
    let mut s = String::from("hello"); // memory requested from the memory allocator at runtime
    s.push_str(", world!"); // push_str() appends a literal to a String
    println!("{s}");
} // scope of s is over and s is no longer valid
In Rust the memory is automatically returned once the variable that owns it goes out of scope (without need of GC). Rust calls a function called drop automatically at the closing curly bracket. This is similar to Resource Acquisition Is Initialization (RAII) pattern in C++.
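To make the automatic call to drop visible, the sketch below implements the Drop trait for a made-up type, Noisy, that records its name in a shared log when it is dropped. The RefCell is only there so the log can be written from inside drop; it is an assumption of this example, not something the text above relies on.

```rust
use std::cell::RefCell;

/// Hypothetical type that records its name in a shared log when dropped.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Noisy<'_> {
    // Rust calls this automatically when the owner goes out of scope.
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Creates values in nested scopes and returns the order in which they were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Noisy { name: "outer", log: &log };
        {
            let _inner = Noisy { name: "inner", log: &log };
        } // `_inner` is dropped here
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```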
let x = 5; // data on the stack
let y = x; // a copy of the value of x is bound to y
let s1 = String::from("hello"); // a String is made up of a pointer to the heap, a length, and a capacity
let s2 = s1; // the String data (ptr, len, capacity) is copied, not the heap contents; both would point to the same heap data
// println!("{s1}"); // error: s1 is invalid after the move
Since both s1 and s2 would point to the same memory, both might try to free it when they go out of scope, causing a double-free error. To ensure memory safety, after let s2 = s1; Rust considers s1 to have been moved to s2, and s1 is no longer valid.
This kind of copy is called a shallow copy in other languages, but because the first variable is invalidated, it is known as a move in Rust. By design, Rust will never automatically create "deep" copies of data. A deep copy must be requested explicitly with clone:
let s1 = String::from("hello"); let s2 = s1.clone(); // create deep copy of heap data println!("s1 = {s1}, s2 = {s2}");
Rust has a Copy trait that we can place on types stored entirely on the stack, such as integers. If a type implements the Copy trait, variables of that type do not move but are trivially copied, so they remain valid after assignment to another variable. A type cannot implement Copy if it, or any of its parts, implements Drop.
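For user-defined types, Copy can be derived when every field is itself Copy. The following sketch uses a hypothetical Point struct and a hypothetical shifted function to show that the original stays usable after being assigned and passed by value.

```rust
/// A small stack-only type; deriving `Copy` (which also requires `Clone`)
/// opts in to copy semantics. `Point` is a hypothetical type for this sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

/// Takes a `Point` by value; the caller keeps its own copy.
fn shifted(mut p: Point, dx: i32) -> Point {
    p.x += dx;
    p
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = a; // copied, not moved
    let c = shifted(a, 10); // `a` is copied again into the function
    println!("a = {a:?}, b = {b:?}, c = {c:?}"); // `a` is still valid
}
```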
Passing a value to a function is similar to assigning a value to a variable.
fn main() {
    let s = String::from("hello"); // s comes into scope
    takes_ownership(s); // s's value moves into the function...
    // ...so s is no longer valid here
    let x = 5; // x comes into scope
    makes_copy(x); // x would move into the function,
    // but i32 is Copy, so it's still valid & can be used
} // Here, x goes out of scope, then s. But because s's value was moved, nothing special happens.

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{some_string}");
} // some_string goes out of scope and `drop` is called. The backing memory is freed.

fn makes_copy(some_integer: i32) { // some_integer comes into scope
    println!("{some_integer}");
} // Here, some_integer goes out of scope.

Return values transfer ownership as well:

fn main() {
    let s1 = gives_ownership(); // moves return value into s1
    let s2 = String::from("hello"); // s2 comes into scope
    let s3 = takes_and_gives_back(s2); // s2 is moved into the function, which moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 was moved, so nothing
  // happens. s1 goes out of scope and is dropped.

fn gives_ownership() -> String { // moves its return value into the function that calls it
    let some_string = String::from("yours"); // comes into scope
    some_string // returned and moves out to the calling function
}

// This function takes a String and returns one
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into scope
    a_string // a_string is returned and moves out to the calling function
}
When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop unless ownership of the data has been moved to another variable.
Taking ownership and then returning it from every function is a bit tedious. To let a function use a value without transferring ownership, we pass a reference.
References & Borrowing
A reference is like a pointer: it is an address we can follow to access data stored at that address, and that data is owned by some other variable. Unlike a pointer, a reference is guaranteed to point to a valid value of a particular type for the life of that reference. The action of creating a reference is called borrowing. Borrowed references are immutable by default, and multiple immutable references can exist simultaneously. We can also create a mutable reference, but only one mutable reference to a value can exist at a time, to prevent data races. A data race occurs when two or more pointers access the same data at the same time, at least one of them is used to write to the data, and there is no mechanism to synchronize access. For the same reason, we cannot have a mutable reference while an immutable reference to the same value is in use; creating the mutable reference is only allowed after the last use of the immutable one.
fn main() {
    let mut s1 = String::from("hello"); // must be mut to be borrowed mutably later
    let len = calculate_length(&s1); // creates and passes a reference
    println!("The length of '{s1}' is {len}.");
    let s2 = &s1; // s2 is unused after this line, so the mutable reference below is fine
    change(&mut s1); // pass a mutable reference
    println!("{s1}");
}

fn calculate_length(s: &String) -> usize { // takes in a reference
    s.len()
}

fn change(s: &mut String) {
    s.push_str(", world");
}
We can use curly brackets to create a new scope, allowing for multiple mutable references, just not simultaneous ones:
let mut s = String::from("hello");
{
    let r1 = &mut s;
} // r1 goes out of scope here, so we can make a new reference with no problems.
let r2 = &mut s;
For a contiguous sequence of elements in a collection, it might be useful to pass reference of a slice of the elements rather than the whole collection.
fn first_word(s: &String) -> usize { // returning usize
    let bytes = s.as_bytes(); // convert the string to an array of bytes
    for (i, &item) in bytes.iter().enumerate() { // iterate over the bytes as (index, byte) tuples
        if item == b' ' { // search for the byte representing a space
            return i;
        }
    }
    s.len() // meaningful and valid only as long as the string is valid
}

fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s); // word will get the value 5
    s.clear(); // this empties the String, making it equal to ""
    // word still has the value 5 here, but there's no more string that
    // we could meaningfully use the value 5 with. word is now totally invalid!
}
A string slice is a reference to part of a String:
let s = String::from("hello world");
let hello = &s[0..5]; // written as start..end index, but stored as the start and length of the slice
let world = &s[6..11];
let word = &s[..]; // both indices can be dropped to take the entire string
Returning a slice instead of an index ties the result to the String it came from:
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}

fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear(); // compiler indicates error!
    println!("the first word is: {word}");
}
The compiler will ensure the references into the String remain valid. Rust disallows the mutable reference in clear and the immutable reference in word from existing at the same time, and compilation fails. Not only has Rust made our API easier to use, but it has also eliminated an entire class of errors at compile time!
The type of a string literal is &str: a slice pointing to a specific location in the binary. This is also why string literals are immutable; &str is an immutable reference.
Taking a string slice as the parameter lets the same function work on both &String and &str values:
fn first_word(s: &str) -> &str
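A short sketch of why the &str parameter is more flexible. Note that this version of first_word uses str::find rather than the byte loop shown earlier, purely to keep the example short.

```rust
/// Returns the text before the first space, or the whole input if there is none.
/// (Implemented with `str::find` here instead of the byte loop, for brevity.)
fn first_word(s: &str) -> &str {
    match s.find(' ') {
        Some(i) => &s[..i],
        None => s,
    }
}

fn main() {
    let owned = String::from("hello world");
    let literal = "goodbye world";

    println!("{}", first_word(&owned)); // &String coerces to &str
    println!("{}", first_word(&owned[6..])); // a slice of a String
    println!("{}", first_word(literal)); // a string literal is already a &str
}
```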
In languages with pointers, it is easy to create a dangling pointer by mistake: a pointer that references a location in memory that may have been given to someone else, because the memory was freed while the pointer to it was preserved. In Rust, by contrast, the compiler guarantees that references will never be dangling references: if you have a reference to some data, the compiler will ensure that the data will not go out of scope before the reference to the data does.
fn main() {
    let reference_to_nothing = dangle();
}

fn dangle() -> &String { // dangle returns a reference to a String (does not compile: missing lifetime specifier)
    let s = String::from("hello"); // s is a new String
    &s // we return a reference to the String, s
} // Here, s goes out of scope and is dropped. Its memory goes away. Danger!
The solution is to return the string directly.
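A minimal sketch of that fix; the name no_dangle is chosen here only for illustration. Ownership of the String moves out to the caller, so nothing is deallocated behind a live reference.

```rust
/// Returns the `String` itself, moving ownership out to the caller,
/// so nothing is left dangling when the function's scope ends.
fn no_dangle() -> String {
    let s = String::from("hello");
    s // moved out, not dropped
}

fn main() {
    let owned = no_dangle();
    println!("{owned}");
}
```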
TODO Dereferencing with * operator
Lifetimes
Lifetimes are used to track how long references are valid. This ensures that references do not outlive the data they point to, preventing dangling references.
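Most lifetimes are inferred, but a function that returns one of several borrowed inputs needs an explicit annotation. The sketch below is a common illustration of this; the function name longest is an assumption of the example.

```rust
/// Returns the longer of two string slices (the first one on a tie).
/// The lifetime `'a` says the result is valid only as long as both inputs are.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn main() {
    let a = String::from("ownership");
    let result;
    {
        let b = String::from("borrow");
        // Copy the result into an owned String before `b` is dropped;
        // keeping the borrowed &str past this block would be rejected.
        result = longest(&a, &b).to_string();
    }
    println!("longest: {result}");
}
```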
Control Flow
Conditional
Condition is expected to be a bool, other values are not automatically converted to a Boolean.
fn main() { let x = 42; if x == 0 { println!("zero!"); } else if x < 100 { println!("small"); } else { println!("large"); } }
We can also use if as an expression, which evaluates to the value of the last expression in the chosen block (notice the missing ; after that expression, which is what makes it the block's value).
fn main() { let x = 10; let size = if x < 20 { "small" } else { "large" }; println!("number size: {}", size); }
Note that the values in both arms must have the same data type, since the type of the size variable has to be known at compile time.
Match Case
A match expression compares a value against multiple patterns and executes the code of the first matching pattern. Patterns can be made up of:
- literal values,
- destructured arrays, enums, structs or tuples
- variables
- wildcards
- Placeholders
match VALUE {
    PATTERN => EXPRESSION,
    PATTERN => EXPRESSION,
    PATTERN => EXPRESSION,
}

#[derive(Debug)]
enum UsState {
    Alabama,
    Alaska,
    // --snip--
}

enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter(UsState),
}

fn value_in_cents(coin: Coin) -> u8 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter(state) => {
            println!("State quarter from {state:?}!");
            25
        }
    }
}

fn main() {
    println!("quarter: {}", value_in_cents(Coin::Quarter(UsState::Alaska)));
}
Unlike if let, else if, else if let, and else chains, pattern matching with match is exhaustive, i.e. the compiler ensures all possible cases are handled, which prevents bugs. Try removing one of the cases.
One way to ensure you've covered every possibility is to add a catch-all pattern as the last arm, either a variable name that matches and binds any value, or _ as a placeholder that matches any value without binding it and without producing an unused-variable warning.
fn main() {
    let dice_roll = 9;
    match dice_roll {
        1..=3 => add_fancy_hat(), // match a range of values
        5 | 7 => remove_fancy_hat(), // match multiple patterns
        other => move_player(other), // catch all other cases in the variable `other`
    }
}

fn add_fancy_hat() {}
fn remove_fancy_hat() {}
fn move_player(num_spaces: u8) {}
Match with Option<T>
#![allow(unused)] fn main() { fn plus_one(x: Option<i32>) -> Option<i32> { match x { None => None, Some(i) => Some(i + 1), } } let five = Some(5); let six = plus_one(five); let none = plus_one(None); }
if let
if let is an expression that combines if and let into a single construct to match one pattern while ignoring the rest. This means less boilerplate code; it works the same way as match but without the exhaustiveness checking. It is particularly useful when dealing with Option and Result types.
fn main() {
    let config_max = Some(3u8);
    match config_max {
        Some(max) => println!("The maximum is configured to be {max}"),
        _ => (), // ignore everything else, i.e. `None`
    }
    // alternatively, with if let
    if let Some(max) = config_max {
        println!("The maximum is configured to be {max}");
    }
}
with else
// `coin` is a value of the Coin enum defined in the match example above
let mut count = 0;
match &coin {
    Coin::Quarter(state) => println!("State quarter from {state:?}!"),
    _ => count += 1,
}

// the same logic with if let ... else
let mut count = 0;
if let Coin::Quarter(state) = &coin {
    println!("State quarter from {state:?}!");
} else {
    count += 1;
}
Loops
While
A while loop is a conditional loop that keeps running as long as its condition is true.
fn main() { let mut x = 200; while x >= 10 { x = x / 2; } println!("Final x: {x}"); }
For
A for loop (for-in loop) iterates over the elements of an iterator, which can be finite or could theoretically go on indefinitely.
fn main() {
    for x in 1..5 { // exclusive range: 1 to 4
        println!("x: {x}");
    }
    for x in 1..=5 { // inclusive range: 1 to 5
        println!("x: {x}");
    }
    for elem in [1, 2, 3, 4, 5] {
        println!("elem: {elem}");
    }
    for (index, elem) in [1, 2, 3, 4, 5].iter().enumerate() {
        println!("index: {index}, elem: {elem}");
    }
    for i in std::iter::repeat(5) {
        println!("turns out {i} never stops being 5");
        break; // would loop forever otherwise
    }
}
Under the hood for loops use a concept called "iterators" to handle iterating over different kinds of ranges/collections.
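Roughly speaking, a for loop asks the collection for an iterator and then calls next on it until it returns None. The sketch below writes that out by hand next to the equivalent for loop; both function names are made up for this example, and the manual version is a simplification of what the compiler actually generates.

```rust
/// Sums a slice by driving the iterator by hand, roughly what a `for` loop does.
fn sum_manually(values: &[i32]) -> i32 {
    let mut iter = values.iter();
    let mut total = 0;
    loop {
        match iter.next() {
            Some(value) => total += value, // another element is available
            None => break,                 // the iterator is exhausted
        }
    }
    total
}

/// The same sum written with a `for` loop.
fn sum_with_for(values: &[i32]) -> i32 {
    let mut total = 0;
    for value in values {
        total += value;
    }
    total
}

fn main() {
    let data = [1, 2, 3, 4, 5];
    println!("manual: {}", sum_manually(&data));
    println!("for:    {}", sum_with_for(&data));
}
```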
Loop
loop is the idiomatic infinite loop, similar to while true.
fn main() { }
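As a small sketch of loop in use: the loop runs until an explicit break, and because loop is itself an expression, break can hand a value back. The function halvings is a hypothetical example, not taken from the resources above.

```rust
/// Counts how many times `n` can be halved (integer division) before reaching zero.
fn halvings(mut n: u32) -> u32 {
    let mut steps = 0;
    loop {
        if n == 0 {
            break steps; // `break` with a value makes the whole `loop` evaluate to it
        }
        n /= 2;
        steps += 1;
    }
}

fn main() {
    println!("halvings(200) = {}", halvings(200));
}
```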
break and continue
fn main() { let mut i = 0; loop { i += 1; if i > 5 { break; } if i % 2 == 0 { continue; } println!("{}", i); } }
Both continue and break can optionally take a label argument, which is used to break out of or continue an outer loop from inside nested loops. Labels can be used with loop, while, and for.
fn main() { let s = [[5, 6, 7], [8, 9, 10], [21, 15, 32]]; let mut elements_searched = 0; let target_value = 10; 'outer: for i in 0..=2 { for j in 0..=2 { elements_searched += 1; if s[i][j] == target_value { break 'outer; } } } print!("elements searched: {elements_searched}"); }
Example: Fibonacci Number
fn fib(n: u32) -> u32 { if n < 2 { todo!("Implement this"); } else { todo!("Implement this"); } } fn main() { let n = 20; println!("fib({n}) = {}", fib(n)); }
Solution
fn fib(n: u32) -> u32 { if n < 2 { return n; } else { return fib(n - 1) + fib(n - 2); } } fn main() { let n = 20; println!("fib({n}) = {}", fib(n)); }
Iterators
Using iterators, map, and filter rather than loop in Rust allows for a functional programming style that can make your code more concise and expressive.
fn main() { let numbers = vec![1, 2, 3, 4, 5, 6, 7, 8, 9, 10]; let result: Vec<i32> = numbers.iter() .filter(|&&x| x % 2 == 0) // Filter even numbers .map(|&x| x * x) // Square each number .collect(); // Collect the results into a new vector println!("{:?}", result); // Output: [4, 16, 36, 64, 100] }
Functions
The last expression in a function body (or any block) becomes the return value if the ; is omitted at the end of the expression. The return keyword can also be used for an early return. Some functions have no return value and return the 'unit type', (). The compiler infers this if the -> () return type is omitted.
fn gcd(a: u32, b: u32) -> u32 { // declaring fn with parameters annotated by their type -> return type if b > 0 { gcd(b, a % b) // last expression in the block is returned because ; is omitted } else { a // last expression in the block is returned because ; is omitted } } fn main() { println!("gcd: {}", gcd(143, 52)); // calling gcd with argument values }
Rust code uses snake case as the conventional style for function and variable names, in which all letters are lowercase and underscores separate words.
Overloading is not supported: each function has a single implementation, always takes a fixed number of parameters, and always takes a single set of parameter types (these types can be generic). Default arguments are not supported. Macros can be used to support variadic functions.
Statements are instructions that perform some action and do not return a value. Expressions evaluate to a resultant value. Rust is an expression-based language.
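A small sketch of the difference, assuming a hypothetical helper named describe: let bindings are statements, while if and blocks are expressions whose value can be returned or assigned.

```rust
// Hypothetical helper: classify a number using `if` as an expression.
fn describe(n: i32) -> &'static str {
    // No `return` and no trailing `;`: the value of the `if` expression
    // is the value of the function.
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}

fn main() {
    // `let y = 6;` is a statement; the block on the right-hand side is an expression.
    let x = {
        let y = 6;
        y + 1 // no semicolon, so the block evaluates to 7
    };
    println!("x = {x}, which is {}", describe(x));
}
```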
Functions often have contracts: their behavior is only guaranteed if the inputs meet particular requirements.
Tuples, Arrays and Collections
These data types store multiple values. The built-in array and tuple types have a fixed size that is known at compile time, so they can be stored on the stack. Collections store their data on the heap, so the amount of data does not need to be known at compile time and can grow or shrink at runtime. Different kinds of collections have different capabilities and costs.
- A vector allows you to store a variable number of values next to each other.
- A string is a collection of characters.
- A hash map allows you to associate a value with a specific key. It’s a particular implementation of the more general data structure called a map.
Tuples
A tuple groups a fixed number of values, possibly of different types; its size is known at compile time.
fn main() { let tup: (i32, f64, u8) = (500, 6.4, 1); let x = tup.0; // accessing tuple values with dot notation println!("The value of x is: {x}"); let (int_num, float_num, uint_num) = tup; // Pattern Matching for Destructuring into local variables println!("The value of float is: {float_num}"); }
The patterns used here are "irrefutable", meaning that the compiler can statically verify that the value on the right of = has the same structure as the pattern. A variable name is an irrefutable pattern that always matches any value, hence why we can also use let to declare a single variable. Rust also supports using patterns in conditionals, allowing for equality comparison and destructuring to happen at the same time.
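As a sketch of patterns in conditionals, the hypothetical locate function below matches a tuple against refutable patterns, so comparison and destructuring happen in a single step.

```rust
// Hypothetical helper: describe a point using tuple patterns.
fn locate(point: (i32, i32)) -> String {
    match point {
        (0, 0) => String::from("origin"),          // pure equality comparison
        (x, 0) => format!("on the x axis at {x}"), // compare one field, bind the other
        (0, y) => format!("on the y axis at {y}"),
        (x, y) => format!("at ({x}, {y})"),        // irrefutable catch-all
    }
}

fn main() {
    println!("{}", locate((0, 0)));
    println!("{}", locate((3, 0)));
    println!("{}", locate((2, 5)));
}
```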
Arrays
Arrays store a fixed number of items of the same data type; both the size and the element type are known at compile time.
fn main() { let mut a: [i8; 10] = [42; 10]; // array [T;N] a[5] = 0; // accessing array elements with indexing println!("a: {a:?}"); }
The for statement supports iterating over arrays (but not tuples).
fn main() { let primes = [2, 3, 5, 7, 11, 13, 17, 19]; for prime in primes { for i in 2..prime { assert_ne!(prime % i, 0); } } }
This functionality uses the IntoIterator trait.
The assert_ne! macro is new here. There are also assert_eq! and assert! macros. These are always checked, while debug-only variants like debug_assert! compile to nothing in release builds.
Vectors
A vector, Vec<T>, stores multiple values of the same type (implemented using generics) next to each other in memory. Ownership and borrowing rules ensure that references into a vector remain valid.
fn main(){ let v: Vec<i32> = Vec::new(); // create an empty vector; no values are inserted, so a type annotation is required let v2 = vec![1, 2, 3]; // vec! macro, type inferred from the values // updating vector let mut v = Vec::new(); // element type is inferred from the data we push later v.push(1); // add elements to vector v.push(2); v.push(3); // Referencing values stored in vector let third: &i32 = &v[2]; // zero-based indexing; panics if the index is out of bounds println!("The third element is {third}"); let third: Option<&i32> = v.get(2); // get returns an Option, None if the index is out of bounds match third { Some(third) => println!("The third element is {third}"), None => println!("There is no third element."), } for i in &v { // iterating over the values in vector println!("{i}"); } for i in &mut v { // iterating over mutable references *i += 100; // dereference to get to the value in i } } // When the vector goes out of scope, the vector and its elements are dropped
Rust needs to know what types will be in the vector at compile time so it knows exactly how much memory on the heap will be needed to store each element. If we want to store values of different type in a vector, we can create a custom enum type and store the value.
enum SpreadsheetCell { Int(i32), Float(f64), Text(String), } let row = vec![ SpreadsheetCell::Int(3), SpreadsheetCell::Text(String::from("blue")), SpreadsheetCell::Float(10.12), ];
If you don’t know the exhaustive set of types a program will get at runtime to store in a vector, the enum technique won’t work. Instead, you can use a trait object.
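A minimal sketch of the trait-object approach, assuming a hypothetical Shape trait with two implementors: the vector stores Box<dyn Shape>, so any type implementing the trait can be added without listing the types in an enum.

```rust
// Hypothetical trait and types, used only for illustration.
trait Shape {
    fn area(&self) -> f64;
}

struct Circle {
    radius: f64,
}

struct Square {
    side: f64,
}

impl Shape for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * self.radius * self.radius
    }
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }
}

// Every element is a boxed trait object, so the concrete types may differ.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    let shapes: Vec<Box<dyn Shape>> = vec![
        Box::new(Circle { radius: 1.0 }),
        Box::new(Square { side: 2.0 }),
    ];
    println!("total area: {:.2}", total_area(&shapes));
}
```

The trade-off is dynamic dispatch: each area call goes through a vtable lookup at runtime instead of being resolved at compile time.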
Strings
String is a more complicated data structure than a string literal, which is represented as a string slice (&str). String literals are hardcoded into the program binary and accessed through a reference, while String is a growable, mutable, owned, UTF-8 encoded string whose contents are stored on the heap. String is implemented as a wrapper around a vector of bytes (Vec<u8>) with some extra guarantees, restrictions (it doesn't support indexing) and capabilities; many Vec operations are also available on it.
fn main(){ let mut s1 = String::from("नमस्ते"); // UTF-8 encoded let mut _s = String::new(); // empty string to add data later let mut s = "Hello".to_string(); // from literals s.push_str(" World"); // append a string slice s.push('!'); // push appends a single char s1.push_str(&s); // append a String to a String; push_str only borrows s, so s stays valid // concatenation with + Operator let s1 = String::from("Hello, "); let s2 = String::from("world!"); let s3 = s1 + &s2; // note s1 has been moved here and can no longer be used, &s2 reference is used, s2 will still be valid let s1 = String::from("tic"); let s2 = String::from("tac"); let s3 = String::from("toe"); let s = format!("{s1}-{s2}-{s3}"); // format! returns a new String and only borrows its arguments }
fn main(){ let s1 = String::from("नमस्ते"); let s2 = String::from("Hello"); println!("length of s1 is {} and s2 is {}",s1.len(),s2.len()); }
The Devanagari word “नमस्ते” can be viewed in the following ways:
- stored as u8 values: [224, 164, 168, 224, 164, 174, 224, 164, 184, 224, 165, 141, 224, 164, 164, 224, 165, 135]
- unicode scalar: ['न', 'म', 'स', '्', 'त', 'े']
- Grapheme clusters: ["न", "म", "स्", "ते"], as certain characters like '्' and 'े' are diacritics and don't make sense on their own
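It's unclear what the return type of a string-indexing operation should be (a byte, a character, a grapheme cluster, or a string slice), and a byte index may not always correspond to a valid character boundary.
Indexing with a single number (s[0]) is therefore not supported on String or &str. We can, however, slice a string with a byte range such as &s[0..4]; this panics at runtime if the range does not fall on character boundaries.
Indexing is expected to always take O(1) time, but it's not possible to guarantee that performance with a String, because Rust would have to walk the contents from the beginning to find the nth character.
The best way to operate on pieces of a string is to be explicit about whether we want characters or bytes.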
for c in "Зд".chars() { println!("{c}"); } for b in "Зд".bytes() { println!("{b}"); }
Crates such as unicode-segmentation make it possible to extract grapheme clusters as well.
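To make the byte-versus-character distinction concrete, here is a small sketch using only the standard library. The helper first_chars is a hypothetical name; it relies on char_indices so that the slice always ends on a character boundary.

```rust
// Hypothetical helper: return the first `n` characters (Unicode scalar values) of `s`.
fn first_chars(s: &str, n: usize) -> &str {
    // `char_indices` yields the byte offset at which each char starts,
    // so slicing at that offset is always on a valid boundary.
    match s.char_indices().nth(n) {
        Some((byte_idx, _)) => &s[..byte_idx],
        None => s, // fewer than `n` chars: return the whole string
    }
}

fn main() {
    let hello = "Здравствуйте";
    println!("{}", &hello[0..4]);          // "Зд": each of these letters takes 2 bytes
    println!("{}", first_chars(hello, 3)); // "Здр"
    println!("{:?}", hello.get(0..1));     // None: byte 1 is in the middle of 'З'
}
```

str::get is the non-panicking counterpart of range slicing: it returns None instead of panicking when the range cuts through a character.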
Hash Maps
HashMap<K, V> stores a mapping of keys of type K to values of type V using a hashing function, which determines how it places these keys and values into memory. Other programming languages call this structure an object, hash table, dictionary, or associative array. A hash map lets us look up data by key rather than by index. Like vectors, hash maps store their data on the heap, and they are homogeneous: all of the keys must have the same type, and all of the values must have the same type. Each unique key can have only one value associated with it at a time, but the number of key-value pairs can grow.
use std::collections::HashMap; fn main() { let mut scores = HashMap::new(); scores.insert(String::from("Team A"), 10); scores.insert(String::from("Team B"), 50); let team_name = String::from("Team B"); let score = scores.get(&team_name) // returns Option<&V> .copied() // get Option<i32> rather than Option<&i32> .unwrap_or(0); // return score value if None then 0 scores.insert(String::from("Team A"), 20); // overwrites the value // Adding a Key and Value Only If a Key Isn’t Present, if key present, existing value is unchanged scores.entry(String::from("Team C")) // returns Entry enum (Vacant | Occupied) with mutable reference to the value corresponding the entry .or_insert(50); // or_inserts defined on entry, inserts value if vacant //iterate over key value pair similar to vector for (key, value) in &scores { println!("{key}: {value}"); } // updating a value based on the old value let text = "hello world wonderful world"; let mut map = HashMap::new(); for word in text.split_whitespace() { let count = map.entry(word).or_insert(0); // or_insert returns mutable reference to the value (&mut V) corresponding to the key *count += 1; //dereference the mutable reference to increment the value } println!("{map:?}"); }
For types that implement the Copy trait, like i32, the values are copied into the hash map. For owned values like String, the values will be moved and the hash map will be the owner of those values.
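By default, HashMap uses a hashing function called SipHash, which provides resistance to denial-of-service (DoS) attacks involving hash tables. It is not the fastest hashing algorithm available; if it is too slow for your purposes, you can switch to a faster function by specifying a different hasher (a type that implements the BuildHasher trait).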
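As an illustration of swapping the hasher, the sketch below plugs a hand-written 64-bit FNV-1a hasher into HashMap through BuildHasherDefault. The names Fnv1a, FnvHashMap and word_counts are made up for this example, and FNV-1a is chosen only because it is short; it is not DoS-resistant, and real projects would more likely pull in a maintained hasher crate.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// A minimal 64-bit FNV-1a hasher (illustrative only, NOT DoS-resistant).
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for &byte in bytes {
            self.0 ^= u64::from(byte);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

// Hypothetical alias: a HashMap that hashes with `Fnv1a` instead of SipHash.
type FnvHashMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

fn word_counts(text: &str) -> FnvHashMap<&str, u32> {
    // `HashMap::new()` is only available for the default hasher,
    // so maps with a custom hasher are created with `default()`.
    let mut map = FnvHashMap::default();
    for word in text.split_whitespace() {
        *map.entry(word).or_insert(0) += 1;
    }
    map
}

fn main() {
    let counts = word_counts("hello world wonderful world");
    println!("{counts:?}");
}
```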
Slice
A slice, [T] (see std::slice), is a dynamically sized view into a contiguous sequence such as an array or a vector. Slices are used through references: defining a function as fn fun(num: &[u8]) and calling it as fun(&array) passes a read-only reference to the elements. The length is not part of the type; it travels with the reference at runtime. Ownership does not need to be passed to the function, since the data is only borrowed for the duration of the call.
Slices can be iterated over in the same way as vectors.
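A short sketch of the idea, with sum as a hypothetical helper: because it takes &[u8], the same function accepts a whole array, part of an array, or a vector, and the caller keeps ownership in every case.

```rust
// Hypothetical helper: borrows a read-only view, so it works for arrays and vectors alike.
fn sum(nums: &[u8]) -> u32 {
    nums.iter().map(|&n| u32::from(n)).sum()
}

fn main() {
    let array = [1u8, 2, 3, 4, 5];
    let vector = vec![10u8, 20, 30];

    println!("{}", sum(&array));       // the whole array
    println!("{}", sum(&array[1..4])); // only the elements at index 1, 2 and 3
    println!("{}", sum(&vector));      // &Vec<u8> coerces to &[u8]

    // Both collections are still owned by main and usable here.
    println!("{array:?} {vector:?}");
}
```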
Structs and Enums
A structure (struct) is a custom data type that lets you package together multiple related, named values (fields) in a meaningful group, similar to an object's data attributes. We create an instance of a struct by specifying concrete values for each field as key: value pairs, and access the values using dot notation. Individual fields cannot be marked as mutable, so the entire instance has to be declared mutable.
struct User { active: bool, username: String, email: String, sign_in_count: u64, } fn main(){ let mut user1 = User { active: true, sign_in_count: 1, username: String::from("someone123"), email: String::from("someone@example.com"), }; user1.sign_in_count += 1; println!("user1.username: {}", user1.username); let user2 = User { email: String::from("someone456@example.com"), ..user1 // remaining fields are taken from user1: username (a String) is moved, while active and sign_in_count are copied }; println!("user2.sign_in_count: {}", user2.sign_in_count); // println!("user1.username: {}", user1.username); // would not compile: username was moved into user2 } // functions can use the field init shorthand syntax rather than repeat each field name fn build_user(email: String, username: String) -> User { User { active: true, username, email, sign_in_count: 1, } }
Structure can be used to define custom types with requirement validation. Group related methods within impl blocks to organize code logically and improve readability.
pub struct Guess { value: i32, } impl Guess { pub fn new(value: i32) -> Guess { if value < 1 || value > 100 { panic!("Guess value must be between 1 and 100, got {value}."); } Guess { value } } pub fn value(&self) -> i32 { //getter self.value } }
Special Structs
Tuple structs are similar to tuples: their fields have types but no names.
struct Color(i32, i32, i32); struct Point(i32, i32, i32); fn main() { let black = Color(0, 0, 0); let origin = Point(0, 0, 0); }
Unit-like structs don't have any fields at all, similar to (), the unit type. These are useful when you need to implement a trait on some type but don’t have any data that you want to store in the type itself.
struct AlwaysEqual; fn main() { let subject = AlwaysEqual; }
Enum
An enumeration (enum) lets you define a type by enumerating its possible variants. An enum gives you a way of saying that a value is exactly one of a possible set of values, never several at the same time. We can also optionally put data directly into each enum variant by defining its associated data type.
#![allow(unused)] fn main() { enum IpAddr { V4(String), V6(String), } let home = IpAddr::V4(String::from("127.0.0.1")); let loopback = IpAddr::V6(String::from("::1")); fn route(ip: IpAddr) {} // accepts either variant route(home); route(loopback); }
The standard library defines its own IpAddr enum, whose variants hold structs (Ipv4Addr and Ipv6Addr) instead of strings.
A popular enum defined by the standard library is Option, which is included in the prelude. It encodes the common scenario in which a value could be something or could be nothing, and takes the place of the null feature found in other languages. With it, the compiler can check whether you have handled all the cases and prevent bugs related to null references. You have to convert an Option<T> to a T before you can perform T operations with it. This helps catch one of the most common issues with null: assuming that something isn’t null when it actually is.
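#![allow(unused)] fn main() { /* The standard library defines Option roughly as: enum Option<T> { None, Some(T) }. Redefining it locally would clash with the None and Some from the prelude, so the prelude version is used here. */ let absent_number: Option<i32> = None; let some_number = Some(5); }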
Another very common enum is Result, which represents either success (Ok) or failure (Err).
Methods
All functions defined within an impl block are called associated functions.
Methods are associated functions that specify the behaviors associated with a type. Unlike plain functions, they are defined within the context of a struct (or an enum or a trait). Their first parameter is always self, which represents the instance the method is being called on; it can be taken by reference (&self), by mutable reference (&mut self), or by value (self). We don't have to repeat the type of self in every method. Another benefit is organization: everything we can do with an instance of a type is placed in impl blocks rather than in various places in the library. Each struct is allowed to have multiple impl blocks.
struct Rectangle { width: u32, height: u32, } impl Rectangle { fn area(&self) -> u32 { // &self is short for self: &Self ; alias for the type self.width * self.height } } fn main() { let rect1 = Rectangle { width: 30, height: 50, }; println!("The area of the rectangle is {}", rect1.area()); }
Unlike C/C++, which uses . to call a method on an object and -> to call a method through a pointer, Rust has automatic referencing and dereferencing. When you call a method, Rust automatically adds &, &mut, or * so that the object matches the signature of the method.
p1.distance(&p2); is equivalent to (&p1).distance(&p2);
Given the receiver and name of a method, Rust can figure out definitively whether the method is reading (&self), mutating (&mut self), or consuming (self).
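The three kinds of receiver can be sketched on a small hypothetical Counter type. Note that the call sites look the same in all three cases; the method signature decides how the instance is borrowed or moved.

```rust
// Hypothetical type used only to contrast the three receiver kinds.
#[derive(Debug)]
struct Counter {
    count: u32,
}

impl Counter {
    // Reading: borrows the instance immutably.
    fn get(&self) -> u32 {
        self.count
    }

    // Mutating: borrows the instance mutably.
    fn increment(&mut self) {
        self.count += 1;
    }

    // Consuming: takes ownership, so the instance cannot be used afterwards.
    fn into_inner(self) -> u32 {
        self.count
    }
}

fn main() {
    let mut counter = Counter { count: 0 };
    counter.increment(); // Rust turns this into (&mut counter).increment()
    counter.increment();
    println!("current: {}", counter.get()); // (&counter).get()
    let total = counter.into_inner(); // counter is moved here
    println!("total: {total}");
    // println!("{counter:?}"); // would not compile: counter was consumed
}
```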
We can define associated functions that don’t have self as their first parameter (and thus are not methods); one example is the String::from function that’s defined on the String type. These associated functions are often used for constructors that return a new instance of the struct. They are often called new, but new isn’t a special name and isn’t built into the language.
impl Rectangle { fn square(size: u32) -> Self { Self { width: size, height: size, } } }
To call this associated function, we use the :: syntax with the struct name; let sq = Rectangle::square(3); is an example. This function is namespaced by the struct.
eg: Fibonacci number
#[derive(Debug)] struct Fibonacci { current: u8, previous: u8, } impl Fibonacci { fn new() -> Self { Self { current: 0, previous: 0, } } } impl Iterator for Fibonacci { type Item = u8; fn next(&mut self) -> Option<Self::Item> { if self.current == 0 { self.current = 1; return Some(self.current); } let next_value = self.previous.checked_add(self.current)?; self.previous = self.current; self.current = next_value; Some(self.current) } } fn main() { for fb in Fibonacci::new() { println!("{fb}"); if fb > 30 { break; } } println!("------"); for fb in Fibonacci::new() { println!("{fb}"); } }
Generics
Generics are abstract stand-ins for concrete types (such as i32 or String) or other properties. They reduce code duplication when defining functions, structs, enums and methods.
Generics in functions
#![allow(unused)] fn main() { use std::cmp::PartialOrd; fn largest<T: PartialOrd>(list: &[T]) -> &T { let mut largest = &list[0]; for item in list { if item > largest { // needs the PartialOrd bound: without it, > would not work for every possible T largest = item; } } largest } }
largest is generic over some type T, so we can call it with either i32 or char values. The std::cmp::PartialOrd trait bound on T is what enables the comparison.
In struct and method definitions, methods written within an impl that declares the generic type are defined on every instance of the type, no matter what concrete type ends up substituting for the generic type.
struct Point<T> { // both should be of same type x: T, y: T, } struct Point2<T, U> { // allow two different types x: T, y: U, } impl<T> Point<T> { // can have different name but its conventional fn x(&self) -> &T { &self.x } } impl Point<f32> { fn distance_from_origin(&self) -> f32 { (self.x.powi(2) + self.y.powi(2)).sqrt() } } // when generic parameters aren't always same impl<X1, Y1> Point2<X1, Y1> { fn mixup<X2, Y2>(self, other: Point2<X2, Y2>) -> Point2<X1, Y2> { Point2 { x: self.x, y: other.y, } } } fn main() { let integer = Point { x: 5, y: 10 }; let float = Point { x: 1.0, y: 4.0 }; let mixed = Point2 { x: 5, y: 4.0 }; println!("p.x = {}", integer.x()); let p1 = Point2 { x: 5, y: 10.4 }; let p2 = Point2 { x: "Hello", y: 'c' }; let p3 = p1.mixup(p2); println!("p3.x = {}, p3.y = {}", p3.x, p3.y); }
In Enum
#![allow(unused)] fn main() { enum Option<T> { Some(T), None, } enum Result<T, E> { Ok(T), Err(E), } }
Using generic types does not make a program run slower than it would with concrete types. Rust accomplishes this by performing monomorphization of the code using generics at compile time. Monomorphization is the process of turning generic code into specific code by filling in the concrete types that are used when compiled.
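Conceptually, monomorphization can be pictured as follows. The functions largest_of_i32 and largest_of_f64 are written by hand here purely for illustration; the real generated code and its symbol names are internal to the compiler.

```rust
// Generic source code: a single definition.
fn largest_of<T: PartialOrd>(a: T, b: T) -> T {
    if a > b { a } else { b }
}

// Roughly what the compiler produces for the two calls in main
// (hand-written stand-ins with made-up names).
fn largest_of_i32(a: i32, b: i32) -> i32 {
    if a > b { a } else { b }
}

fn largest_of_f64(a: f64, b: f64) -> f64 {
    if a > b { a } else { b }
}

fn main() {
    // Each generic call is resolved at compile time to a type-specific copy,
    // so it behaves exactly like the hand-written version next to it.
    println!("{} {}", largest_of(3, 7), largest_of_i32(3, 7));
    println!("{} {}", largest_of(2.5, 1.5), largest_of_f64(2.5, 1.5));
}
```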
Trait
Traits (similar to interfaces) define functionality/behaviors for a particular type in a generic way. We can use trait bounds to specify that a generic type can be any type that has certain behavior. We can combine traits with generic types to constrain a generic type to accept only those types that have a particular behavior, as opposed to just any type.
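A minimal sketch of a trait bound, assuming a hypothetical Summary trait and Tweet type: notify is generic, but the bound T: Summary restricts it to types that provide summarize.

```rust
// Hypothetical trait and type, used only for illustration.
trait Summary {
    fn summarize(&self) -> String;
}

struct Tweet {
    username: String,
    content: String,
}

impl Summary for Tweet {
    fn summarize(&self) -> String {
        format!("@{}: {}", self.username, self.content)
    }
}

// `T: Summary` is a trait bound: `notify` accepts any type that implements
// `Summary` and rejects every other type at compile time.
fn notify<T: Summary>(item: &T) -> String {
    format!("Breaking news! {}", item.summarize())
}

fn main() {
    let tweet = Tweet {
        username: String::from("rustlang"),
        content: String::from("Generics meet traits"),
    };
    println!("{}", notify(&tweet));
    // notify(&42); // would not compile: i32 does not implement Summary
}
```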
Lifetimes
Lifetimes are a variety of generics that give the compiler information about borrowed values and how references relate to each other.
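The classic illustration is a function that returns one of two borrowed strings. In this sketch the lifetime parameter 'a tells the compiler that the returned reference is valid only as long as both inputs are.

```rust
// The returned reference lives no longer than the shorter-lived of `x` and `y`.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() { x } else { y }
}

fn main() {
    let first = String::from("long string is long");
    let result;
    {
        let second = String::from("xyz");
        // Fine: `result` is only read while `second` is still alive.
        result = longest(first.as_str(), second.as_str());
        println!("The longest string is {result}");
    }
    // Using `result` here would not compile, because `second` has been dropped.
}
```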
Error Handling
Rust requires you to acknowledge the possibility of an error and take some action before your code will compile. Rust groups errors into two major categories:
- recoverable errors: e.g. a file-not-found error, which can be reported and retried without stopping the program
  - use the Result<T, E> enum with its two variants, Ok(T) and Err(E)
- unrecoverable errors: e.g. runtime failures that are symptoms of bugs, such as failed bounds checks
  - use panic! to stop execution
  - use non-panicking APIs (e.g. Vec::get) if crashing is not acceptable.
Rust handles fatal errors (unrecoverable and unexpected) with a "panic". A panic can be triggered
- by taking an action that causes the code to panic, usually as a symptom of a bug in the program logic.
- by explicitly calling the panic! macro.
By default, a panic prints a failure message, unwinds, cleans up the stack, and quits. Unwinding means Rust walks back up the stack and cleans up the data from each function it encounters. The unwinding can be caught. Alternatively, we can abort immediately on panic without cleaning up, which makes the resulting binary smaller, by adding the following lines to Cargo.toml.
[profile.release]
panic = 'abort'
fn main() { panic!("crash and burn"); }
The error message contains src/main.rs:2:5, which indicates the second line, fifth character of our src/main.rs file.
In many cases the panic! call sits in code that our code calls, possibly in a different file. The reported filename and line number then point at the panic! call itself, not at the line of our code that eventually led to it.
fn main() { let v = vec![1, 2, 3]; v[99]; // accessing invalid index }
In other languages, such a buffer overread can lead to security vulnerabilities, but Rust triggers a panic and stops execution.
Rust prints a note suggesting to set the RUST_BACKTRACE environment variable to get a backtrace (the list of all function calls leading to the panic) showing exactly what happened to cause the error.
$ RUST_BACKTRACE=1 cargo run
Catching the panic
use std::panic; fn main() { let result = panic::catch_unwind(|| "No problem here!"); println!("{result:?}"); let result = panic::catch_unwind(|| { panic!("oh no!"); }); println!("{result:?}"); }
It’s advisable to have your code panic when it’s possible that your code could end up in a bad state (when some assumption, guarantee, contract, or invariant has been unexpectedly broken, such as when invalid values, contradictory values, or missing values are passed to your code). Attempting to operate on invalid data (attempt to access an out-of-bounds memory, violation of function contracts) can expose your code to vulnerabilities. Such violations and panic should be explained in API documentations.
Recoverable error with Result<T,E>
Many functions in Rust return a Result to indicate whether the operation succeeded or failed.
fn divide(a:f64, b:f64) -> Result<f64, String> { if b == 0.0 { Err(String::from("Cannot divide by zero")) } else { Ok(a / b) } }
While reading a file, the file might not exist, or we might not have permission to access it, which leads to different types of errors. The Result enum can convey such information.
use std::fs::File; // T for file open use std::io::ErrorKind; // E variants fn main() { let greeting_file_result = File::open("hello.txt"); // returns a result let greeting_file = match greeting_file_result { Ok(file) => file, Err(error) => match error.kind() { ErrorKind::NotFound => match File::create("hello.txt") { Ok(fc) => fc, Err(e) => panic!("Problem creating the file: {e:?}"), }, other_error => { panic!("Problem opening the file: {other_error:?}"); } }, }; }
match can be verbose and might not communicate intent well. Helper methods defined on Result can be more convenient for specific tasks, such as:
- .unwrap(): returns the value inside Ok, or calls panic! if the Result is an Err.
- .expect(msg): like unwrap, but panics with the message msg if the Result is an Err.
- .unwrap_or_else(closure): returns the value inside Ok, or runs the closure on the error if the Result is an Err.
use std::fs::File; use std::io::ErrorKind; fn main() { let greeting_file = File::open("hello.txt").unwrap_or_else(|error| { if error.kind() == ErrorKind::NotFound { File::create("hello.txt").unwrap_or_else(|error| { panic!("Problem creating the file: {error:?}"); }) } else { panic!("Problem opening the file: {error:?}"); } }); }
The examples above cause the code to panic when something goes wrong, and once code panics there is normally no way to recover. Instead of handling an error within itself, a called function can return (propagate) the error to the calling function. The calling code may have better context: it could choose to attempt to recover in a way that’s appropriate for its situation, or it could decide that an Err value in this case is unrecoverable, call panic!, and turn the recoverable error into an unrecoverable one. Using unwrap or expect is still reasonable in cases where it’s logically impossible to get an Err value, such as when parsing hardcoded text.
use std::fs::File; use std::io::{self, Read}; fn read_username_from_file() -> Result<String, io::Error> { let username_file_result = File::open("hello.txt"); let mut username_file = match username_file_result { Ok(file) => file, Err(e) => return Err(e), // propagate the error }; let mut username = String::new(); match username_file.read_to_string(&mut username) { Ok(_) => Ok(username), Err(e) => Err(e), // propagate the error, returns the last expression } }
This pattern of propagating errors is so common in Rust that Rust provides the question mark operator ? to make it easier. ? returns the value inside Ok or propagates the error to the caller; we can further chain method calls after ? to make expressions shorter.
? calls the from function (from the From trait) on the error value it propagates, converting it into the error type E of the current function's Result<T, E> return type. This is useful when a function returns one error type to represent all the ways it might fail, even if parts might fail for many different reasons.
use std::fs; use std::fs::File; use std::io::{self, Read}; fn read_username_from_file() -> Result<String, io::Error> { let mut username = String::new(); File::open("hello.txt")?.read_to_string(&mut username)?; Ok(username) } // shorter and more ergonomic way to directly read string from file (renamed so both versions can live in one file) fn read_username_from_file_short() -> Result<String, io::Error> { fs::read_to_string("hello.txt") }
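The From conversion performed by ? can be sketched with a custom error type. ConfigError and parse_port are hypothetical names chosen for this example; the point is that the ParseIntError produced by parse is converted into ConfigError automatically.

```rust
use std::num::ParseIntError;

// Hypothetical application-level error type.
#[derive(Debug, PartialEq)]
enum ConfigError {
    Empty,
    BadNumber(ParseIntError),
}

// This impl is what lets `?` turn a ParseIntError into a ConfigError.
impl From<ParseIntError> for ConfigError {
    fn from(err: ParseIntError) -> Self {
        ConfigError::BadNumber(err)
    }
}

// Hypothetical helper: parse a TCP port from user-provided text.
fn parse_port(text: &str) -> Result<u16, ConfigError> {
    let trimmed = text.trim();
    if trimmed.is_empty() {
        return Err(ConfigError::Empty);
    }
    // `parse` returns Result<u16, ParseIntError>; on Err, `?` calls
    // ConfigError::from(err) and returns early with the converted error.
    let port = trimmed.parse::<u16>()?;
    Ok(port)
}

fn main() {
    println!("{:?}", parse_port(" 8080 "));
    println!("{:?}", parse_port(""));
    println!("{:?}", parse_port("eighty"));
}
```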
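The ? operator can only be used in functions whose return type is compatible with the value ? is applied to: a function returning Result<T, E> for Result values, or a function returning Option<T> for Option values (in which case ? returns None early). To use ? in main, main itself has to return a Result.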
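Both cases can be sketched briefly. The helper last_char_of_first_line follows the example in the Rust Book, and main is declared to return a Result so that ? is allowed inside it.

```rust
// `?` on an Option: if `next()` yields None, the function returns None right away.
fn last_char_of_first_line(text: &str) -> Option<char> {
    text.lines().next()?.chars().last()
}

// Because main returns a Result, `?` can be applied to Result values inside it.
fn main() -> Result<(), std::num::ParseIntError> {
    let number: i32 = "42".parse()?;
    println!("parsed {number}");
    println!("{:?}", last_char_of_first_line("Hello\nWorld"));
    Ok(())
}
```

If main returns an Err, the program exits with a nonzero status code and prints the error's Debug representation.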
Traits
Using traits in Rust helps to encapsulate shared behavior, making the code more reusable, modular and easier to understand. These are similar to interfaces in other programming languages that allow you to define a set of methods that a type must implement, promoting polymorphism.
trait Describable { fn describe(&self) -> String; } struct Dog { name: String, age: u8, } // implement the trait for the struct impl Describable for Dog { fn describe(&self) -> String { format!("Dog named {} is {} years old.", self.name, self.age) } } // Implement the trait for another struct struct Car { model: String, year: u16, } impl Describable for Car { fn describe(&self) -> String { format!("Car model {} from year {}.", self.model, self.year) } } fn main() { let my_dog = Dog { name:String::from("Buddy"), age: 3 }; let my_car = Car { model: String::from("Tesla"), year: 2020 }; // Use the shared behavior println!("{}", my_dog.describe()); println!("{}", my_car.describe()); }
Type conversion
The From and Into traits in Rust provide a standardized way to convert between types. Implementing From<A> for B automatically provides the matching Into<B> for A.
struct MyType { value: i32, } impl From<i32> for MyType { fn from(item: i32) -> Self { MyType { value: item } } } impl From<MyType> for i32 { fn from(item: MyType) -> Self { item.value } } fn main() { let num = 42; let my_value: MyType = MyType::from(num); println!("MyType value: {}", my_value.value); let original_value: i32 = my_value.into(); // works because From<MyType> is implemented for i32 println!("Original value: {}", original_value); }
Packages, Crates and Modules
As a program grows, it's important to group related functionality and to separate code with distinct features. Code organization in Rust is done using the concepts of packages, crates and modules.
Project code is split into multiple modules, which usually live in separate files. A module contains one or more items, such as functions, structs, enums, and constants.
A package is a collection of one or more crates. A crate is a binary or library that is built by the Rust compiler.
- Packages: A Cargo feature that lets you build, test, and share crates
- Crates: A tree of modules that produces a library or an executable (a package can contain both kinds). This is the primary unit of code organization which can be distributed to the community via crates.io
- Modules and use: Let you control the organization, scope, and privacy of paths
- Paths: A way of naming an item, such as a struct, function, or module
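Modules, privacy, paths and use can be sketched in a single file with inline modules. The garden and vegetables names are hypothetical; in a real project each mod would typically live in its own file.

```rust
// Inline modules; in a larger project these would usually be separate files.
mod garden {
    pub mod vegetables {
        #[derive(Debug)]
        pub struct Asparagus {
            pub stalks: u32,
        }

        // Public: reachable from outside the module through its path.
        pub fn harvest() -> Asparagus {
            Asparagus { stalks: count_stalks() }
        }

        // Private: only code inside `vegetables` can call this.
        fn count_stalks() -> u32 {
            12
        }
    }
}

// `use` brings a path into scope so it can be referred to by a shorter name.
use garden::vegetables::{self, Asparagus};

fn main() {
    let plant: Asparagus = vegetables::harvest();
    println!("harvested {plant:?}");
    // garden::vegetables::count_stalks(); // would not compile: the function is private
}
```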
A common practice in the Rust community is to create dual library/binary crates even when the primary intention of a project is to produce an executable.
Playground
Use this for testing
Type and test your code here
Random Number
Generate Random Number
Computers generate random numbers using two main methods:
- True Random Number Generators (TRNGs) rely on inherently random, unpredictable physical processes such as radioactive decay, mouse movements, or fan noise, and convert them into random numbers.
- Pseudo-Random Number Generators (PRNGs) use a deterministic algorithm that starts from an initial seed value and quickly produces a sequence of numbers that appears random. PRNGs are the ones we use.
use rand::seq::SliceRandom; use rand::Rng; fn main() { let mut rng = rand::thread_rng(); if rand::random() { // generate a boolean println!("char: {}", rand::random::<char>()); // generate a random unicode char } let n1: u8 = rng.gen(); // generate a u8 int let n2: f64 = rng.gen(); // generate a f64 between 0-1 println!("Random u8: {}", n1); println!("Random f64: {}", n2); println!("Random u32: {}", rng.gen::<u32>()); println!("Random i32: {}", rng.gen::<i32>()); println!("Random float: {}", rng.gen::<f64>()); // range println!("Integer: {}", rng.gen_range(0..10)); // [0,10) println!("Float: {}", rng.gen_range(0.0..10.0)); let mut nums: Vec<i32> = (1..100).collect(); nums.shuffle(&mut rng); // shuffle needs the SliceRandom trait in scope }
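To make the "deterministic sequence from a seed" idea concrete without any external crate, here is a toy linear congruential generator. Lcg is a hypothetical type for illustration only (the multiplier and increment are the constants of Knuth's MMIX generator); it is far weaker than the generators in rand and must not be used for passwords or anything security-related.

```rust
// A toy linear congruential generator (LCG): state = state * a + c (mod 2^64).
struct Lcg {
    state: u64,
}

impl Lcg {
    fn new(seed: u64) -> Self {
        Lcg { state: seed }
    }

    fn next_u32(&mut self) -> u32 {
        // Multiplier and increment from Knuth's MMIX generator.
        self.state = self
            .state
            .wrapping_mul(6364136223846793005)
            .wrapping_add(1442695040888963407);
        // The high bits of an LCG are of better quality than the low bits.
        (self.state >> 32) as u32
    }
}

fn main() {
    let mut first = Lcg::new(42);
    let mut second = Lcg::new(42);
    for _ in 0..3 {
        // The same seed always yields the same "random" sequence.
        println!("{} {}", first.next_u32(), second.next_u32());
    }
}
```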
By default, the rand crate samples from a uniform distribution, which we can also request explicitly as follows:
use rand::distributions::{Distribution, Uniform}; fn main() { let mut rng = rand::thread_rng(); let die = Uniform::from(1..7); loop { let throw = die.sample(&mut rng); println!("Roll the die: {}", throw); if throw == 6 { break; } } }
We can generate random numbers from other distributions using the rand_distr crate.
use rand_distr::{Distribution, Normal, NormalError}; use rand::thread_rng; fn main() -> Result<(), NormalError> { let mut rng = thread_rng(); let normal = Normal::new(1.0, 2.0)?; // (μ, σ): mean 1 and standard deviation 2, so the variance is 4 // other constructors are also available let v = normal.sample(&mut rng); println!("{} is from a N(1, 4) distribution", v); Ok(()) }
Generate random set of alphanumeric ASCII characters for example to create a password
use rand::{thread_rng, Rng}; use rand::distributions::Alphanumeric; fn main() { let rand_string: String = thread_rng() .sample_iter(&Alphanumeric) .take(30) .map(char::from) .collect(); println!("{}", rand_string); }
To include user defined bytestring in the password
fn main() { use rand::Rng; const CHARSET: &[u8] = b"ABCDEFGHIJKLMNOPQRSTUVWXYZ\ abcdefghijklmnopqrstuvwxyz\ 0123456789)(*&^%$#@!~"; const PASSWORD_LEN: usize = 30; let mut rng = rand::thread_rng(); let password: String = (0..PASSWORD_LEN) .map(|_| { let idx = rng.gen_range(0..CHARSET.len()); CHARSET[idx] as char }) .collect(); println!("{:?}", password); }
Custom Random Number Generator
use rand::Rng; use rand::distributions::{Distribution, Standard}; #[derive(Debug)] struct Point { x: i32, y: i32, } impl Distribution<Point> for Standard { //implement Distribution trait on the new type Point for Standard (generic RV dist) fn sample<R: Rng + ?Sized>(&self, rng: &mut R) -> Point { let (rand_x, rand_y) = rng.gen(); Point { x: rand_x, y: rand_y, } } } fn main() { let mut rng = rand::thread_rng(); let rand_tuple = rng.gen::<(i16, bool, f64)>(); // random tuple of our choice let rand_point: Point = rng.gen(); println!("Random tuple: {:?}", rand_tuple); println!("Random Point: {:?}", rand_point); }
Misc
Common Data Structures
Leet Code Problems
References
Search Problems
The search problems are the problems of finding a particular item (the search "key") or set of items within a collection of items.
Linear Search
Linear search sequentially checks each element of the list for the target value until a match is found or until all the elements have been searched. Hence, in the worst case, the algorithm has to check every element of the list.
use std::cmp::PartialEq; fn linear_search<T: PartialEq>(item: &T, arr: &[T]) -> Option<usize> { for (i, data) in arr.iter().enumerate() { if item == data { return Some(i); } } None } fn main(){ let arr = [81, 12, 32, 56, 17, 38, 19, 10]; let result = linear_search(&17, &arr); println!("result is {result:?}"); }
Properties
- Worst case performance: O(n)
- Best case performance: O(1)
- Average case performance: O(n)
- Worst case space complexity: O(1) iterative
Binary Search
Binary search compares the target value to the middle element of a sorted array. If they are not equal, the half in which the target cannot lie is eliminated and the search continues on the remaining half, again taking the middle element to compare to the target value, and repeating this until the target value is found. If the search ends with the remaining half being empty, the target is not in the array.
Binary search is also known as half-interval search or logarithmic search
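The halving step described above can be sketched as follows. This is a minimal illustration rather than a reference implementation; the function name binary_search and the sample array are assumptions made for the example, and the slice must already be sorted.

```rust
use std::cmp::Ordering;

/// Returns the index of `item` in the sorted slice `arr`, if present.
fn binary_search<T: Ord>(item: &T, arr: &[T]) -> Option<usize> {
    // Search the half-open range [low, high).
    let mut low = 0;
    let mut high = arr.len();
    while low < high {
        let mid = low + (high - low) / 2;
        match arr[mid].cmp(item) {
            Ordering::Equal => return Some(mid),
            // The target can only be in the upper half.
            Ordering::Less => low = mid + 1,
            // The target can only be in the lower half.
            Ordering::Greater => high = mid,
        }
    }
    // The remaining range is empty: the target is not in the slice.
    None
}

fn main() {
    let arr = [10, 12, 17, 19, 32, 38, 56, 81];
    let result = binary_search(&17, &arr);
    println!("result is {result:?}");
}
```

In practice, the standard library already offers binary_search on slices, which returns a Result carrying either the match index or the insertion point.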
Properties
- Worst case performance O(log n)
- Best case performance O(1)
- Average case performance O(log n)
- Worst case space complexity O(1)
Exponential Search
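Exponential search works on a sorted list. It first finds a range that must contain the target by checking the elements at indices 1, 2, 4, 8, ..., doubling the bound until the element at the bound is no smaller than the target or the end of the list is reached. A binary search is then performed within that range. In the properties below, i is the index of the target in the list.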
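A minimal sketch of exponential search, assuming a sorted slice: the bound is doubled until it passes the target, and the standard library's slice binary_search is then used on the bracketed range. The function name exponential_search is an assumption made for this example.

```rust
/// Exponential search on a sorted slice: find a range by doubling,
/// then binary-search inside that range.
fn exponential_search<T: Ord>(item: &T, arr: &[T]) -> Option<usize> {
    if arr.is_empty() {
        return None;
    }
    // Double the bound until arr[bound] >= item or the end is reached.
    let mut bound = 1;
    while bound < arr.len() && arr[bound] < *item {
        bound *= 2;
    }
    // The target, if present, lies between bound / 2 and min(bound, len - 1).
    let low = bound / 2;
    let high = (bound + 1).min(arr.len());
    arr[low..high]
        .binary_search(item)
        .ok()
        .map(|i| low + i)
}

fn main() {
    let arr = [10, 12, 17, 19, 32, 38, 56, 81];
    let result = exponential_search(&56, &arr);
    println!("result is {result:?}");
}
```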
Properties
- Worst case performance O(log i)
- Best case performance O(1)
- Average case performance O(log i)
- Worst case space complexity O(1)
Jump Search or Block Search
Jump search works on a sorted array L. It first checks the items L[km], where k ∈ N and m is the block size, until it finds an item that is larger than the search key. To find the exact position of the search key in the list, a linear search is then performed on the sublist L[(k-1)m, km]. For n elements, a block size of m = √n gives the O(√n) bound listed below.
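The block-then-linear strategy can be sketched as below. This is an illustrative version under the assumption that the block size is ⌊√n⌋; the function name jump_search is hypothetical, and the block boundaries are handled as half-open ranges, which differs slightly in indexing from the description above.

```rust
/// Jump search on a sorted slice, using a block size of floor(sqrt(n)).
fn jump_search<T: Ord>(item: &T, arr: &[T]) -> Option<usize> {
    let n = arr.len();
    if n == 0 {
        return None;
    }
    let step = ((n as f64).sqrt() as usize).max(1);
    // The current block is the half-open range [prev, next).
    let mut prev = 0;
    let mut next = step;
    // Skip whole blocks while the last element of the block is below the key.
    while next < n && arr[next - 1] < *item {
        prev = next;
        next += step;
    }
    let end = next.min(n);
    // Linear search inside the block that may contain the key.
    arr[prev..end]
        .iter()
        .position(|x| x == item)
        .map(|i| prev + i)
}

fn main() {
    let arr = [10, 12, 17, 19, 32, 38, 56, 81];
    let result = jump_search(&56, &arr);
    println!("result is {result:?}");
}
```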
Properties
- Worst case performance O(√n)
- Best case performance O(1)
- Average case performance O(√n)
- Worst case space complexity O(1)
Fibonacci Search
Unlike binary search, which divides the array into two equal parts, Fibonacci search divides the sorted array into two parts whose sizes are consecutive Fibonacci numbers, narrowing down the possible location of the target.
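One common way to write Fibonacci search is sketched below, as an illustration rather than the only formulation. It keeps three consecutive Fibonacci numbers and an offset counting the leading elements already ruled out; the function name fibonacci_search and the variable names are assumptions made for this example.

```rust
use std::cmp::Ordering;

/// Fibonacci search on a sorted slice.
fn fibonacci_search<T: Ord>(item: &T, arr: &[T]) -> Option<usize> {
    let n = arr.len();
    // fib2, fib1, fib are consecutive Fibonacci numbers; grow until fib >= n.
    let (mut fib2, mut fib1) = (0usize, 1usize);
    let mut fib = fib2 + fib1;
    while fib < n {
        fib2 = fib1;
        fib1 = fib;
        fib = fib2 + fib1;
    }
    // Number of leading elements already ruled out.
    let mut offset = 0usize;
    while fib > 1 {
        // Probe the element that splits the range at a Fibonacci boundary.
        let i = (offset + fib2 - 1).min(n - 1);
        match arr[i].cmp(item) {
            Ordering::Equal => return Some(i),
            Ordering::Less => {
                // Discard the front part; step one Fibonacci number down.
                fib = fib1;
                fib1 = fib2;
                fib2 = fib - fib1;
                offset = i + 1;
            }
            Ordering::Greater => {
                // Discard the back part; step two Fibonacci numbers down.
                fib = fib2;
                fib1 -= fib2;
                fib2 = fib - fib1;
            }
        }
    }
    // At most one candidate element may remain.
    if fib1 == 1 && offset < n && arr[offset] == *item {
        return Some(offset);
    }
    None
}

fn main() {
    let arr = [10, 12, 17, 19, 32, 38, 56, 81];
    let result = fibonacci_search(&56, &arr);
    println!("result is {result:?}");
}
```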
Properties
- Worst case performance O(log n)
- Best case performance O(1)
- Average case performance O(log n)
- Worst case space complexity O(1)
Sort
Queue
A queue is a linear data structure that follows the First In, First Out (FIFO) principle. Elements are added at the back (enqueue) and removed from the front (dequeue), just like people waiting in a line. Queues are used for tasks such as scheduling, buffering, and asynchronous data handling.
The idiomatic way to create a queue is VecDeque, a double-ended queue implemented with a growable ring buffer. The “default” usage of this type as a queue is to use push_back to add to the queue and pop_front to remove from the queue. extend and append push onto the back in this manner, and iterating over a VecDeque goes front to back.
use std::collections::VecDeque;

fn main() {
    // VecDeque grows on demand; the capacity is only an initial allocation
    let mut q: VecDeque<u32> = VecDeque::with_capacity(10);

    // Enqueue at the back
    q.push_back(1);
    q.push_back(2);
    q.push_back(3);

    // Dequeue from the front
    if let Some(data) = q.pop_front() {
        println!("data: {data}");
    } else {
        println!("empty queue");
    }

    println!("size: {}, empty: {}", q.len(), q.is_empty());
    println!("content: {:?}", q);
}
Linked List
Tree
Binary Tree
B-Tree
B-tree is a self-balancing tree data structure that maintains sorted data and allows searches, insertions, and deletions in logarithmic time. It’s particularly useful for systems that read and write large blocks of data, like databases and file systems.
Nodes can have more than two children (non-binary) and each node contains multiple keys, which divide its children into ranges. All leaves are at the same level, ensuring balanced height. In a B-tree of order m, every node has at most m children, and every internal node other than the root has at least ⌈m/2⌉ children.
Because each node holds many keys, a B-tree has fewer levels than a binary tree holding the same data, so a lookup touches fewer disk blocks. Minimizing I/O operations this way makes it the backbone of database indexing (e.g., MySQL and PostgreSQL indexes are built on B-tree variants such as the B+ tree).
use std::collections::BTreeMap; fn main() { let mut btree = BTreeMap::new(); btree.insert(1, "one"); btree.insert(2, "two"); btree.insert(3, "three"); // Efficiently retrieve values if let Some(value) = btree.get(&2) { println!("The value for key 2 is: {}", value); } }
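Since the standard library's BTreeMap is based on a B-tree, it also illustrates the "maintains sorted data" property: keys come back in order regardless of insertion order, and a range of keys can be queried directly. The sketch below assumes a small hypothetical helper, keys_in_range, written only for this example.

```rust
use std::collections::BTreeMap;

/// Collects the keys in the half-open range [low, high), in sorted order.
fn keys_in_range(map: &BTreeMap<i32, &'static str>, low: i32, high: i32) -> Vec<i32> {
    map.range(low..high).map(|(k, _)| *k).collect()
}

fn main() {
    let mut btree = BTreeMap::new();
    // Insert keys out of order; the map keeps them sorted.
    for (k, v) in [(3, "three"), (1, "one"), (4, "four"), (2, "two")] {
        btree.insert(k, v);
    }

    // Iteration always visits keys in ascending order.
    let keys: Vec<i32> = btree.keys().copied().collect();
    println!("all keys: {keys:?}");

    // Range query over keys 2 and 3.
    println!("keys in 2..4: {:?}", keys_in_range(&btree, 2, 4));

    // Smallest and largest entries are available directly.
    println!("first: {:?}", btree.first_key_value());
    println!("last: {:?}", btree.last_key_value());
}
```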
Graph
Heap
Box
Box is a smart pointer that owns heap-allocated data in Rust. It is useful for large data structures or when the size of the data is not known at compile time. It is often used for recursive data structures and for dynamic dispatch through trait objects, which is essential for polymorphism.
trait Shape { fn area(&self) -> f64; } struct Circle { radius: f64, } impl Shape for Circle { fn area(&self) -> f64 { std::f64::consts::PI * self.radius * self.radius } } // Function that takes a Box<dyn Shape> to allow for dynamic dispatch fn print_area(shape: Box<dyn Shape>) { println!("The area is {}", shape.area()); } fn main() { let circle = Circle { radius: 5.0 }; let boxed_circle: Box<dyn Shape> = Box::new(circle); print_area(boxed_circle); }
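The other use mentioned above, recursive data structures, can be sketched with a cons list. Without Box, the enum would contain itself directly and its size could not be computed at compile time; boxing the tail gives each node a fixed size. The List type and the sum function are illustrative names chosen for this example.

```rust
// A singly linked list: each node owns the rest of the list through a Box.
#[derive(Debug)]
enum List {
    Cons(i32, Box<List>),
    Nil,
}

use List::{Cons, Nil};

/// Recursively adds up all values in the list.
fn sum(list: &List) -> i32 {
    match list {
        Cons(value, rest) => value + sum(rest),
        Nil => 0,
    }
}

fn main() {
    let list = Cons(1, Box::new(Cons(2, Box::new(Cons(3, Box::new(Nil))))));
    println!("list: {list:?}");
    println!("sum = {}", sum(&list));
}
```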
Hashing
Trie
Segment Tree
Bit Manipulation
Dynamic Programming
Geometry
The Twelve Factor App
I. Codebase: One codebase tracked in revision control, many deploys
II. Dependencies: Explicitly declare and isolate dependencies
III. Config: Store config in the environment
IV. Backing services: Treat backing services as attached resources
V. Build, release, run: Strictly separate build and run stages
VI. Processes: Execute the app as one or more stateless processes
VII. Port binding: Export services via port binding
VIII. Concurrency: Scale out via the process model
IX. Disposability: Maximize robustness with fast startup and graceful shutdown
X. Dev/prod parity: Keep development, staging, and production as similar as possible
XI. Logs: Treat logs as event streams
XII. Admin processes: Run admin/management tasks as one-off processes
- Automated tests that run on every commit
The world of backend development is vast. The operational context largely determines the tools and practices used to tackle a problem.
- Trunk-based development suits software that is continuously deployed to the cloud.
- Gitflow is more suitable for software that is sold to customers, who host and run it on-premises.
- Cloud-native applications are expected:
  - to be distributed across multiple instances, achieving high availability while running in fault-prone environments;
  - to allow new versions to be released continuously with zero downtime;
  - to handle dynamic workloads (e.g. request volumes) by measuring the load on the system and scaling elastically, without overprovisioning.
- Desktop/on-premises applications
Logging
Logging directives are controlled with the RUST_LOG environment variable.
mod foo { mod bar { pub fn run() { log::warn!("[bar] warn"); log::info!("[bar] info"); log::debug!("[bar] debug"); } } pub fn run() { log::warn!("[foo] warn"); log::info!("[foo] info"); log::debug!("[foo] debug"); bar::run(); } } fn main() { env_logger::init(); log::warn!("[root] warn"); log::info!("[root] info"); log::debug!("[root] debug"); foo::run(); }
Run the test application with warn as the default level, info for the test::foo module, and debug for test::foo::bar:
RUST_LOG="warn,test::foo=info,test::foo::bar=debug" ./test
Dependencies
Configuration Management
Backing Services
Build, release, run
SQLite
Postgres
Processes
Port Binding
Concurrency
Concurrent programming, where different parts of a program execute independently, and parallel programming, where different parts of a program execute at the same time, are becoming increasingly important as more computers take advantage of their multiple processors. When multiple threads or tasks access the same data simultaneously, inconsistencies and data integrity issues become possible. Rust's "Fearless Concurrency" uses the ownership and type systems to catch many of these errors at compile time, preserving memory safety while multiple tasks execute. Here are different approaches to writing concurrent code in Rust:
- Mutexes and Locks: Use the std::sync module to create mutexes and locks, which allow exclusive access to shared resources. Example: Mutex::new(0) creates the lock, and lock().unwrap() acquires it so that only one thread modifies the shared data at a time.
- Channels: Implement message passing using channels. std::sync::mpsc::channel connects OS threads, while tokio::sync::mpsc::channel is an async counterpart for tasks running on Tokio’s runtime, where waiting on the channel does not block the thread.
- Atomic Reference Counting: Use Arc (Atomically Reference Counted) smart pointers to share ownership of data between threads, ensuring the data is not dropped until all threads are finished using it.
- Async/Await: Use Rust’s async/await syntax, together with a runtime such as Tokio, to write concurrent code that’s easier to read and maintain. This approach is particularly useful for I/O-bound tasks.
- Rayon: Use the Rayon crate, a data-parallelism library for Rust. It offers a high-level API (parallel iterators) for CPU-bound work and runs on its own thread pool, separate from any async runtime.
- Tokio: Use Tokio’s async runtime to write concurrent code that’s designed for I/O-bound tasks. Tokio provides a set of building blocks, including async channels, timers, and file I/O.
- Condvars: Employ condition variables (Condvars) to synchronize threads, allowing them to wait for specific conditions to be met before proceeding.
- Manual Low-Level Async: Implement low-level async programming using mio, a Rust library for non-blocking I/O. This approach requires a deeper understanding of Rust’s concurrency model and is typically used for performance-critical code.
- Parallelism: Use scoped threads (std::thread::scope) from the standard library, or parallel iterators from Rayon (the standard library itself has no parallel iterator), to parallelize tasks and take advantage of multi-core processors.
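Two of the approaches listed above, shared state behind Arc and Mutex and message passing over a standard-library channel, can be sketched using only std. The function names count_with_mutex and collect_with_channel are hypothetical helpers written for this example.

```rust
use std::sync::mpsc;
use std::sync::{Arc, Mutex};
use std::thread;

/// Increments a shared counter from `workers` threads, `per_worker` times each.
fn count_with_mutex(workers: usize, per_worker: usize) -> usize {
    // Arc shares ownership across threads; Mutex guards the value itself.
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..per_worker {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    // Wait for every worker before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

/// Sends the ids 0..workers over a channel and collects them, sorted.
fn collect_with_channel(workers: usize) -> Vec<usize> {
    let (tx, rx) = mpsc::channel();
    for id in 0..workers {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(id).unwrap();
        });
    }
    // Drop the original sender so the receiver stops once all clones are gone.
    drop(tx);
    let mut received: Vec<usize> = rx.iter().collect();
    received.sort();
    received
}

fn main() {
    println!("counter: {}", count_with_mutex(4, 1000));
    println!("received: {:?}", collect_with_channel(3));
}
```

The arrival order on the channel depends on thread scheduling, which is why the sketch sorts the received ids before returning them.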
Threads
Rust threads work similarly to threads in other languages. A spawned thread behaves like a daemon thread: the main thread does not wait for it, so the program can exit before the spawned thread finishes its work unless the JoinHandle returned by thread::spawn is joined. Thread panics are independent of each other. A panic can carry a payload, which can be unpacked with downcast_ref.
use std::thread; use std::time::Duration; fn main() { thread::spawn(|| { for i in 1..10 { println!("Count in thread: {i}!"); thread::sleep(Duration::from_millis(5)); } }); for i in 1..5 { println!("Main thread: {i}"); thread::sleep(Duration::from_millis(5)); } }
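To wait for a thread and to inspect a panic payload, the JoinHandle can be joined. The sketch below is illustrative; sum_in_thread and panic_message are hypothetical helper names, and the payload is checked for both &str and String since either can occur depending on how the panic was raised.

```rust
use std::thread;

/// Computes 1 + 2 + ... + n on another thread and waits for the result.
fn sum_in_thread(n: u32) -> u32 {
    let handle = thread::spawn(move || (1..=n).sum::<u32>());
    // join blocks until the thread finishes and yields its return value.
    handle.join().unwrap()
}

/// Spawns a thread that panics and extracts the panic message, if any.
fn panic_message() -> Option<String> {
    let handle = thread::spawn(|| {
        panic!("something went wrong");
    });
    match handle.join() {
        Ok(()) => None,
        // A panic in the thread surfaces as Err(payload) on join.
        Err(payload) => {
            if let Some(msg) = payload.downcast_ref::<&str>() {
                Some(msg.to_string())
            } else {
                payload.downcast_ref::<String>().cloned()
            }
        }
    }
}

fn main() {
    println!("total: {}", sum_in_thread(10));
    // The panic stays inside the spawned thread; main keeps running.
    println!("panic payload: {:?}", panic_message());
}
```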
Async and await
The async and await keywords in Rust enable writing asynchronous functions and handling asynchronous operations in a way that resembles synchronous code, improving readability and maintainability. An async function returns a future, which does nothing until it is awaited or otherwise polled by a runtime such as Tokio. They are typically used for file or network I/O, so that waiting on the operation does not block the thread.
use tokio::fs::File; use tokio::io::{self, AsyncReadExt}; async fn read_file_content(path: &str) -> Result<String, io::Error> { let mut file = File::open(path).await?; // Asynchronously open the file let mut content = String::new(); file.read_to_string(&mut content).await?; //Asynchronously read the file content Ok(content) // Return the content if no errors } // Main function to run the async function #[tokio::main] async fn main() { match read_file_content("example.txt").await { Ok(content) => println!("File content: {}", content), Err(e) => eprintln!("Error reading file: {}", e), } }
Disposability
Dev/Prod Parity
Logging
Admin Processes
Networking
HTTP Request
RestAPI
WebAssembly
Mathematics
Number System
Complex numbers
use std::f64::consts::PI; use num::complex::Complex; fn main() { let complex_integer = num::complex::Complex::new(10, 20); let complex_float = num::complex::Complex::new(10.1, 20.1); println!("Complex integer: {}", complex_integer); println!("Complex float: {}", complex_float); let complex_num1 = num::complex::Complex::new(10.0, 20.0); let complex_num2 = num::complex::Complex::new(3.1, -4.2); let sum = complex_num1 + complex_num2; // must be of same type println!("Sum: {}", sum); let x = Complex::new(0.0, 2.0*PI); println!("e^(2i * pi) = {}", x.exp()); }
Big Integers
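BigInt (from the num crate) is used for integer calculations whose values exceed 128 bits, the width of the largest primitive integer types (i128 and u128).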
use num::bigint::{BigInt, ToBigInt}; fn factorial(x: i32) -> BigInt { if let Some(mut factorial) = 1.to_bigint() { for i in 1..=x { factorial = factorial * i; } factorial } else { panic!("Failed to calculate factorial!"); } } fn main() { println!("{}! equals {}", 100, factorial(100)); }