Getting Rusty
The goal of this mdBook is to track my journey of getting Rusty and to cover the full spectrum of Rust. The standard way to learn Rust is through the official Rust Programming Language book, followed by the interactive Rustlings course. Those with a background in a statically compiled language can also start with Rust by Example or Comprehensive Rust by Google. Many examples in this book are adapted from these resources.
Rust is a multi-paradigm systems programming language focusing on safety, speed, and concurrency.
Multi-paradigm
- Imperative Programming
- Object Oriented Programming with struct, enum, traits, methods
- Functional Programming: immutability, higher-order functions, pattern matching
Compile time memory safety
Different types of memory bugs are prevented at compile time.
- No uninitialized variables.
- No double-frees.
- No use-after-free.
- No NULL pointers: no issue of dereferencing null pointers, dangling pointers.
- No forgotten locked mutexes.
- No data races between threads.
- No iterator invalidation.
- No undefined runtime behavior - what a Rust statement does is never left unspecified
- Array access is bounds checked.
- Integer overflow is defined (panic or wrap-around).
- Many abstractions such as iterators are zero-cost. There is no garbage collector, so a program uses only as much memory as it requires at any given time.
Furthermore, good compiler error messages make writing and debugging Rust code easier and more productive.
Rust is fast and resource efficient
- Rust is statically compiled with rustc, which uses LLVM as its backend. Performance (runtime and memory) is comparable to C/C++.
- Full support for concurrency using OS threads with mutexes and channels. This is referred to as "fearless concurrency": the compiler is relied on to ensure correctness at runtime.
- Unsafe Rust is also available for even faster operations (see the Nomicon).
with expressive language features
- Generics.
- No-overhead foreign function interface (FFI): function calls between Rust and C have identical performance to C function calls.
- Built-in dependency manager: cargo.
- Built-in support for testing.
- Excellent Language Server Protocol support.
Other features
- Strong, static yet expressive type system influenced by Haskell. Types let the compiler check for potential problems and avoid them.
- Concurrency can be done with any technique; thread safety is ensured through the same type system that ensures memory safety.
- Cross platform: compile to different systems, embedded systems and even web as WebAssembly (WASM)
- C interoperability, but use of C reduces memory safety
- supports many platforms and architectures: x86, ARM, WASM, Linux, Mac, Windows, ...
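As a minimal sketch of the thread-safety point above, several OS threads can increment one shared counter. The function name parallel_count and its parameters are made up for illustration. Arc shares ownership of the counter across threads and Mutex guards the data; with thread::spawn, leaving either one out is reported at compile time instead of becoming a data race at runtime.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Hypothetical helper: spawns `workers` threads that each add 1 to a shared
// counter `increments` times, then returns the final count.
fn parallel_count(workers: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();
    for _ in 0..workers {
        // Each thread gets its own handle to the same counter.
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..increments {
                // The lock is released at the end of this statement.
                *counter.lock().unwrap() += 1;
            }
        }));
    }
    // Wait for every worker to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}

fn main() {
    println!("total: {}", parallel_count(4, 1_000));
}
```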
Rust has been voted the "most loved programming language" in the Stack Overflow Developer Survey since 2016, although it is not the only language ever to have held that title. Google gathered data from its engineers to understand the Rust learning curve, and found that most of them felt proficient in Rust in less than 2 months.
How to use this Guide
There are several useful keyboard shortcuts in mdBook:
- Arrow-Left: Navigate to the previous page.
- Arrow-Right: Navigate to the next page.
- Ctrl + Enter: Execute the code sample that has focus.
- s: Activate the search bar.
There are code blocks like the following in this book, which you can edit and run as you wish.
fn main() {
    println!("Hello 🌍!");
}
And yes, it looks a lot like C/C++: the main function, defined with the fn keyword, is the entry point of the program. Blocks are delimited by curly braces, and a statement ends with a semicolon.
Environment Setup
To get a more developer-like experience, you need to install the Rust toolchain on your machine.
Install the Rustup toolchain installer. The site suggests the following command for Linux:
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
For Windows, check rust in windows; you also need to install the Visual Studio build tools.
The rustup tool can manage multiple versions of the Rust compiler on a machine. To check the installation and update the compiler version, run: rustup update
If you want to develop inside a dev container, you can create one by following this tutorial. In short: install and configure Docker on your system, install the "Dev Container" extension in VS Code, add and edit a dev container configuration file in the repo, and rebuild the container to get started.
Alternatively, you can try code in the online Rust Playground.
IDEs and Tools
There are different IDEs available for Rust development. Visual Studio Code is the most popular one. JetBrains also provides a dedicated IDE, RustRover.
To set up VS Code for Rust, install the rust-analyzer extension. It provides code completion, syntax highlighting, and other features. The toolset also includes linting provided by rustc and clippy to detect issues with your code.
If you are using Vim, you will have to install rust-analyzer from rustup and rust.vim.
Cargo: Rust build tool, dependency manager and package manager
rustc is the Rust compiler that lets us compile a file, but managing multiple files and dependencies by hand can be difficult, hence we use Cargo.
- Start a project: cargo new prj_name creates a folder with a Cargo.toml manifest file to track metadata and dependencies, and src/main.rs with a hello world script.
  - initialize with a new git repo as well: cargo new --vcs=git prj
  - create a binary executable project: cargo new prj_name --bin
  - create a library project: cargo new prj_name --lib (creates lib.rs instead of main.rs)
- Quickly check code to make sure it compiles without producing an executable: cargo check
- cargo add crate_name: adds the latest version of the crate from the package registry crates.io to the Cargo.toml file. To use the library you have to import it in your source code: use crate_name::func; The documentation of every crate on crates.io is automatically built and available on docs.rs.
  eg: cargo add rand
  To use the dependency in the code, add the following lines (this is the rand 0.8 API; rand 0.9 renames these to rand::rng() and random_range):
  use rand::Rng;
  let rand_num = rand::thread_rng().gen_range(1..=100);
  - use --git URL --branch branch --tag tag to add a crate from a git URL, branch and tag
  - use --dev to add as a dev dependency, --build to add as a build dependency, --target to add as a dependency for the given target platform
  - use -F features or --features features to activate additional features available in the crate
- Build your project: cargo build compiles into the target/debug folder, which holds the compiled binary and other build files.
  - binaries can also be generated using rustc main.rs, but with multiple files and dependencies it is easier to build with cargo
  - building for release compiles with optimizations into target/release: cargo build --release
  - the build also creates a Cargo.lock file that works out and stores the dependency versions that fit the criteria in Cargo.toml, ensuring we rebuild the same artifact every time
- cargo update ignores the Cargo.lock file and updates crates to the newest versions that fit the specification in Cargo.toml
- Compile and run the project: cargo run
- Test your project: cargo test
- Build documentation for your project: cargo doc --open compiles the doc comments (///) in the code into documentation under target/doc and opens it in a browser
- Remove generated artifacts: cargo clean removes the target folder
- Publish a library to crates.io: cargo publish
Cargo is extensible with subcommand plugins such as:
- Linter: Clippy: install using rustup component add clippy, run using cargo clippy, and automatically apply fixes using cargo clippy --fix
- Formatter: Rustfmt: auto-formats to community standards, run using cargo fmt
Other advanced features: workspaces, build scripting.
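Two of the commands above, cargo test and cargo doc, act on things written in the source file itself. The following is a small sketch of what they pick up; the function add and the test name are placeholders, not part of any real project.

```rust
/// Adds two numbers.
///
/// `cargo doc` turns this `///` comment into the documentation page of `add`.
pub fn add(left: u64, right: u64) -> u64 {
    left + right
}

fn main() {
    println!("2 + 2 = {}", add(2, 2));
}

// This module is compiled only when running `cargo test`.
#[cfg(test)]
mod tests {
    use super::*;

    // Every function marked #[test] is run by `cargo test`.
    #[test]
    fn adds_two_numbers() {
        assert_eq!(add(2, 2), 4);
    }
}
```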
Variables and Data Types
A variable is a named abstraction of a memory (storage) location that holds data. It has a name, a data type, a value, and a memory location. Rust provides type safety via static typing, i.e. the type of every variable must be known at compile time. Hence, the type must either be declared or inferred from the assigned value during compile-time type checking. Static typing helps catch type errors early in the development process, reducing the likelihood of runtime errors related to type mismatches.
Variable bindings are made with let, and the type is declared with a colon.
fn main() {
    let x: i32 = 42;
    let y = 17; // type i32 inferred from the value during compilation
    println!("x: {x}");
    // x = 24; // error: cannot assign twice to immutable variable
    // println!("x: {x}");
}
Variables are immutable by default. We can make it mutable using mut
keyword. We can only change its value not the type.
fn main() { let mut x: i32 = 42; println!("x: {x}"); x = 24; println!("x: {x}"); }
This doesn't mean that let creates constants. Constants are values bound to a name with the const keyword; they must be type annotated and can never be made mutable with mut. Constants can be declared in any scope, and are usually defined in the global scope.
const PI: f32 = 3.14159265; fn main(){ let r: f32 = 2.0; println!("{}",PI*r.powf(2.0)); }
Even though an immutable variable cannot be changed, its name can be reused by declaring a new variable with the same name. The first variable is said to be shadowed by the second. Shadowing can even change the type.
fn main() {
    let x = "Five";
    let x = x.len(); // shadowing to store a different data type
    {
        let x = x * 2; // shadowing to create a new variable in the inner scope
        println!("x in the inner scope is: {x}");
    }
    println!("x in outer scope is: {x}");
}
Data Types
use std::io;
fn main(){
println!("Please input your guess.");
let mut guess = String::new();
io::stdin()
.read_line(&mut guess) // read line from stdin
.expect("Failed to read line"); // error handling
let guess: i32 = guess.trim().parse().expect("Not a number!"); // trim surrounding whitespace and parse to the annotated type
println!("{}",guess);
}
Basic built-in types, and the syntax for literal values of each type.
| | Types | Literals |
|---|---|---|
| Signed integers | i8, i16, i32, i64, i128, isize | -10, 0, 1_000, 123_i64 |
| Unsigned integers | u8, u16, u32, u64, u128, usize | 0, 123, 10_u16 |
| Floating point numbers | f32, f64 | 3.14, -10.0e20, 2_f32 |
| Unicode scalar values | char | 'a', 'α', '∞', '🌍' |
| Booleans | bool | true, false |
The types have widths as follows:
- iN, uN, and fN are N bits wide,
- isize and usize are the width of a pointer, depending on the computer architecture,
- char is 32 bits wide,
- bool is 8 bits wide.
The signed numbers are stored using two's complement.
Integer overflow occurs when a value falls outside the range of its type, e.g. assigning 256 to a u8, whose range is 0 to 255.
- in debug mode, Rust includes checks for integer overflow that cause your program to panic at runtime if this behavior occurs.
- in release mode, i.e. compiled with the --release flag, Rust doesn't panic but performs two's complement wrapping, i.e. 256 becomes 0, 257 becomes 1, ...
Relying on integer overflow's wrapping behavior is considered an error, hence Rust provides families of methods to handle overflow explicitly.
- Wrap in all modes with the wrapping_* methods, such as wrapping_add.
- Return the None value if there is overflow with the checked_* methods.
- Return the value and a boolean indicating whether there was overflow with the overflowing_* methods.
- Saturate at the value's minimum or maximum values with the saturating_* methods, e.g. (a * b).saturating_add(b * c).saturating_add(c * a).
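To make the four method families concrete, here is a small sketch that applies each of them to the same u8 addition. The helper name overflow_demo is invented for this example; the methods themselves are the standard integer methods listed above.

```rust
// Hypothetical helper: applies the four explicit overflow strategies to a + b.
// Returns (wrapping, checked, overflowing, saturating) results in that order.
fn overflow_demo(a: u8, b: u8) -> (u8, Option<u8>, (u8, bool), u8) {
    (
        a.wrapping_add(b),    // wraps around modulo 256
        a.checked_add(b),     // None on overflow
        a.overflowing_add(b), // wrapped value plus an overflow flag
        a.saturating_add(b),  // clamps at u8::MAX
    )
}

fn main() {
    // 250 + 10 does not fit into a u8 (maximum 255).
    let (wrapped, checked, overflowed, saturated) = overflow_demo(250, 10);
    println!("wrapping_add:    {wrapped}");
    println!("checked_add:     {checked:?}");
    println!("overflowing_add: {overflowed:?}");
    println!("saturating_add:  {saturated}");
}
```

The behaviour is the same in debug and release builds, which is the point of spelling the strategy out.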
Rust provides addition, subtraction, multiplication, division, and remainder operators. Here's list of Operators
Compound types can group multiple values into one type. Rust has two primitive compound types: tuples and arrays.
An array is a data type that stores N values of the same type T. The length N (a compile-time constant) is part of the type [T; N], hence [u8; 3] and [u8; 4] are considered two different types.
fn main() { let mut a: [i8; 10] = [42; 10]; a[5] = 0; // indexing to access array println!("a: {a:?}"); }
Try accessing an out-of-bounds array element. Array accesses are checked at runtime and panic when the index is beyond the array's bounds. Rust can usually optimize these checks away, and they can be avoided using unsafe Rust.
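When a panic is not wanted, the get method offers a checked alternative to indexing: it returns an Option instead of panicking. A minimal sketch, where element_at is a made-up helper name:

```rust
// Hypothetical helper: returns the element at `index`,
// or None when the index is out of bounds.
fn element_at(values: &[i8; 10], index: usize) -> Option<i8> {
    // `get` yields Option<&i8>; `copied` turns it into Option<i8>.
    values.get(index).copied()
}

fn main() {
    let a: [i8; 10] = [42; 10];
    println!("index 5:  {:?}", element_at(&a, 5));
    println!("index 10: {:?}", element_at(&a, 10));
    // Indexing with a[i] where i is 10 at runtime would panic instead.
}
```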
We can use literals to assign values to arrays.
The println! macro asks for the debug implementation with the ? format parameter: {} gives the default output, {:?} gives the debug output. Types such as integers and strings implement the default output, but arrays only implement the debug output. This means that we must use debug output here.
Adding #, eg {a:#?}, invokes a "pretty printing" format, which can be easier to read.
Like arrays, tuples have a fixed length, but tuples can group together values of different types into a compound type. Fields of a tuple can be accessed by a period followed by the index of the value, e.g. t.0, t.1. The empty tuple () is referred to as the "unit type" and signifies the absence of a return value, similar to void in other languages.
fn main() { let t: (i8, bool) = (7, true); println!("t.0: {}", t.0); //indexing to access tuple println!("t.1: {}", t.1); let (x, y) = t; // destructuring tuple to two variables x, y println!("{x}"); }
Type inference
The Rust compiler infers types based on constraints given by variable declarations and usages. The machine code generated for an inferred declaration is identical to that of an explicit type declaration.
fn takes_u32(x: u32) { println!("u32: {x}"); } fn takes_i8(y: i8) { println!("i8: {y}"); } fn main() { let x = 10; let y = 20; takes_u32(x); takes_i8(y); // takes_u32(y); }
Type Conversion
Blocks and Scopes
A block in Rust contains a sequence of expressions, enclosed by braces {}. Each block has a value and a type, which are those of the last expression of the block:
fn main() {
    let z = 13;
    let x = {
        let y = 10;
        println!("y: {y}");
        z - y // if this line is commented out, the block evaluates to ()
    };
    println!("x: {x}");
}
A variable's scope is limited to the enclosing block.
You can shadow variables, both those from outer scopes and variables from the same scope:
fn main() { let a = 10; println!("before: {a}"); { let a = "Hi"; println!("inner scope: {a}"); let a = true; println!("shadowed in inner scope: {a}"); } println!("after: {a}"); }
Shadowing is different from mutation, because after shadowing both variables' memory locations exist at the same time. Both are available under the same name, depending on where you use it in the code.
Memory Management & Ownership
All programs need to manage memory: to use it efficiently and to prevent memory leaks, for safety and program stability. There are a few approaches to memory management:
- Manual memory allocation and deallocation (e.g. in C/C++) gives fine-grained control but can lead to memory leaks and segmentation faults:
  - if we forget to free memory, it is wasted,
  - if we free it too early, we create an invalid variable,
  - if we free it twice, we get a double-free bug.
- Automatic memory management using garbage collection (GC) (e.g. Java, Python) regularly checks for unused memory and frees it. Some languages use Automatic Reference Counting (ARC) (e.g. Swift), which keeps a count of references to each object in memory and deallocates the object when the count drops to zero.
- Ownership and borrowing (Rust): a set of rules enforced at compile time, where each piece of data or value has a single owner at a time and memory is automatically freed (i.e. the value is dropped) when the owner goes out of scope.
Stack and Heap Allocation
| | Stack | Heap |
|---|---|---|
| Structure | Contiguous block of memory, hence size known and fixed | Non-contiguous allocation, less organized; the memory allocator finds an empty spot and returns a pointer to it |
| Usage | Static memory: local variables, function call management | Dynamic memory: objects, data structures |
| Management | By the compiler: allocated when a function is called, deallocated when it exits | Manually allocated and deallocated, depending on the programming language |
| Speed | Faster alloc/dealloc and access | Slower alloc/dealloc and access due to complexity and following pointers |
| Scope | Limited to function/block | Usually global, as long as a reference exists |
| Safety | Follows LIFO to reduce memory corruption | Prone to memory leaks and fragmentation |
When your code calls a function, the values passed into the function (including, potentially, pointers to data on the heap) and the function's local variables get pushed onto the stack. When the function is over, those values get popped off the stack. The main purpose of ownership is to manage heap data.
Most primitive data types have a known, fixed size, hence they are stored on the stack and easily popped off when their scope is over. String literals, e.g. "hello", are fixed and known at compile time. A dynamic data type such as String, whose size is unknown at compile time and might change during runtime, needs to allocate memory on the heap.
{
    let mut s = String::from("hello"); // memory requested from the memory allocator at runtime
    s.push_str(", world!"); // push_str() appends a literal to a String
    println!("{s}");
} // scope of s is over and s is no longer valid
In Rust the memory is automatically returned once the variable that owns it goes out of scope (without need of GC). Rust calls a function called drop automatically at the closing curly bracket. This is similar to Resource Acquisition Is Initialization (RAII) pattern in C++.
let x = 5; // data on the stack
let y = x; // a copy of the value of x is bound to y
let s1 = String::from("hello"); // a String is made up of a pointer to the heap, a length and a capacity
let s2 = s1; // the String data (ptr, len, capacity) is copied, not the heap contents, so the same buffer is used
// println!("{s1}"); // error: s1 is invalid after the move
Since both s1 and s2 would point to the same memory, both might try to free it, causing a double-free error. To ensure memory safety, after let s2 = s1; Rust considers s1 to have been moved to s2, and s1 is no longer valid.
This type of copy is called a shallow copy in other languages, but because the first variable is invalidated, it is known as a move in Rust. By design, Rust will never automatically create "deep" copies of data.
let s1 = String::from("hello");
let s2 = s1.clone(); // create a deep copy of the heap data
println!("s1 = {s1}, s2 = {s2}");
Rust has a Copy trait that we can place on types that are stored on the stack, such as integers. If a type implements the Copy trait, variables that use it do not move, but rather are trivially copied, making them still valid after assignment to another variable. Copy is not allowed if the type, or any of its parts, implements Drop.
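As an illustration of the Copy rule above, a struct made only of Copy fields can opt in with a derive. The Point type and shift_right function below are invented for this sketch.

```rust
// Point holds only i32 fields, so it is allowed to derive Copy.
// Clone is required alongside Copy.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Takes the point by value; because Point is Copy, the caller keeps its own copy.
fn shift_right(mut p: Point) -> Point {
    p.x += 1;
    p
}

fn main() {
    let a = Point { x: 1, y: 2 };
    let b = shift_right(a); // `a` is copied, not moved
    println!("a = {a:?}, b = {b:?}"); // `a` is still valid here
}
```

Removing Copy from the derive list would turn the call into a move, and the final println! would no longer compile.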
Passing a value to a function is similar to assigning a value to a variable.
fn main() {
    let s = String::from("hello"); // s comes into scope
    takes_ownership(s); // s's value moves into the function...
                        // ...and so s is no longer valid here
    let x = 5; // x comes into scope
    makes_copy(x); // x would move into the function,
                   // but i32 is Copy, so it's still valid and can be used
} // Here, x goes out of scope, then s. But because s's value was moved, nothing special happens.

fn takes_ownership(some_string: String) { // some_string comes into scope
    println!("{some_string}");
} // some_string goes out of scope and `drop` is called. The backing memory is freed.

fn makes_copy(some_integer: i32) { // some_integer comes into scope
    println!("{some_integer}");
} // Here, some_integer goes out of scope.

fn main() {
    let s1 = gives_ownership(); // gives_ownership moves its return value into s1
    let s2 = String::from("hello"); // s2 comes into scope
    let s3 = takes_and_gives_back(s2); // s2 is moved into the function, which moves its return value into s3
} // Here, s3 goes out of scope and is dropped. s2 was moved, so nothing happens. s1 goes out of scope and is dropped.

fn gives_ownership() -> String { // moves its return value into the function that calls it
    let some_string = String::from("yours"); // some_string comes into scope
    some_string // returned and moves out to the calling function
}

// This function takes a String and returns one
fn takes_and_gives_back(a_string: String) -> String { // a_string comes into scope
    a_string // a_string is returned and moves out to the calling function
}
When a variable that includes data on the heap goes out of scope, the value will be cleaned up by drop unless ownership of the data has been moved to another variable.
Taking ownership and then returning ownership with every function is a bit tedious. To let a function use a value without transferring ownership, we can pass a reference instead.
References & Borrowing
A reference is like a pointer: it is an address we can follow to access data that is owned by some other variable. Unlike a pointer, a reference is guaranteed to point to a valid value of a particular type for the life of that reference. The action of creating a reference is called borrowing. Borrowed references are immutable by default, and multiple immutable references can exist simultaneously. We can also create a mutable reference, but only one mutable reference to a value can exist at a time, to prevent data races. A data race occurs when two or more pointers access the same data at the same time, at least one of them is used to write to the data, and there is no mechanism to synchronize access. For the same reason, we cannot have a mutable reference while an immutable reference to the same value is still in use; it is allowed only if the last use of the immutable reference comes before the mutable reference is created.
fn main() {
    let mut s1 = String::from("hello"); // must be mut to allow a mutable borrow later
    let len = calculate_length(&s1); // creates and passes a reference
    println!("The length of '{s1}' is {len}.");
    let s2 = &s1; // s2 is not used afterwards, so a mutable reference is fine
    change(&mut s1); // pass a mutable reference
    println!("{s1}");
}
fn calculate_length(s: &String) -> usize { // takes in a reference
    s.len()
}
fn change(s: &mut String) {
    s.push_str(", world");
}
We can use curly brackets to create a new scope, allowing for multiple mutable references, just not simultaneous ones:
let mut s = String::from("hello");
{
    let r1 = &mut s;
} // r1 goes out of scope here, so we can make a new reference with no problems.
let r2 = &mut s;
For a contiguous sequence of elements in a collection, it can be useful to pass a reference to a slice of the elements rather than to the whole collection. Without slices, a function such as first_word can only return an index:
fn first_word(s: &String) -> usize { // returns the index of the end of the first word
    let bytes = s.as_bytes(); // convert the String to a slice of bytes
    for (i, &item) in bytes.iter().enumerate() { // iterate over the bytes as (index, byte) tuples
        if item == b' ' { // search for the byte representing a space
            return i;
        }
    }
    s.len() // meaningful and valid only as long as the String is unchanged
}
fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s); // word will get the value 5
    s.clear(); // this empties the String, making it equal to ""
    // word still has the value 5 here, but there's no more string that
    // we could meaningfully use the value 5 with. word is now totally invalid!
}
let s = String::from("hello world");
let hello = &s[0..5]; // range is start..end (end exclusive); the slice stores the start position and a length
let world = &s[6..11];
let word = &s[..]; // both indices can be dropped to take the entire string
fn first_word(s: &String) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i];
        }
    }
    &s[..]
}
fn main() {
    let mut s = String::from("hello world");
    let word = first_word(&s);
    s.clear(); // compiler indicates error!
    println!("the first word is: {word}");
}
The compiler will ensure the references into the String remain valid. Rust disallows the mutable reference in clear and the immutable reference in word from existing at the same time, and compilation fails. Not only has Rust made our API easier to use, but it has also eliminated an entire class of errors at compile time!
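The type of a string literal is &str: a slice pointing to a specific point in the binary. This is also why string literals are immutable; &str is an immutable reference. Taking &str instead of &String as the parameter type lets the same function accept both string literals and slices of String values. Only the signature changes:
fn first_word(s: &str) -> &str {
    // body as in the previous version
}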
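For completeness, here is a runnable sketch of first_word with the &str signature, reusing the byte loop from the earlier version. It shows the function being called with a literal, a String slice, and a borrowed String.

```rust
// Returns the first space-delimited word of `s` as a slice into `s`.
fn first_word(s: &str) -> &str {
    let bytes = s.as_bytes();
    for (i, &item) in bytes.iter().enumerate() {
        if item == b' ' {
            return &s[0..i]; // slice up to, but excluding, the space
        }
    }
    s // no space found: the whole input is one word
}

fn main() {
    let owned = String::from("hello world");
    // A string literal is already a &str.
    println!("{}", first_word("good morning"));
    // A slice of a String is a &str as well.
    println!("{}", first_word(&owned[..]));
    // &String is converted to &str automatically (deref coercion).
    println!("{}", first_word(&owned));
}
```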
In languages with pointers, it is easy to erroneously create a dangling pointer, i.e. a pointer that references a location in memory that may have been given to someone else, by freeing some memory while preserving a pointer to that memory. In Rust, by contrast, the compiler guarantees that references will never be dangling references: if you have a reference to some data, the compiler will ensure that the data will not go out of scope before the reference to the data does.
fn main() {
    let reference_to_nothing = dangle();
}
fn dangle() -> &String { // dangle returns a reference to a String
    let s = String::from("hello"); // s is a new String
    &s // we return a reference to the String, s
} // Here, s goes out of scope, and is dropped. Its memory goes away. Danger!
This code does not compile. The solution is to return the String directly, which moves ownership out to the caller.
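A minimal sketch of that fix, with the function renamed to no_dangle for the example:

```rust
// Returns the String itself, so ownership moves to the caller
// and nothing is dropped while still referenced.
fn no_dangle() -> String {
    let s = String::from("hello");
    s // moved out; no reference to a local is returned
}

fn main() {
    let owned = no_dangle();
    println!("{owned}");
}
```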
TODO: Dereferencing with the * operator
Lifetimes
Lifetimes are used to track how long references are valid. This ensures that references do not outlive the data they point to, preventing dangling references.
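Most lifetimes are inferred, but when a function returns a reference that could come from more than one argument, the relationship has to be written down. The sketch below uses a function named longest as an illustration; the annotation 'a says the returned reference is valid only as long as both inputs are.

```rust
// The result borrows from either x or y, so all three share the lifetime 'a.
// When the lengths are equal, y is returned.
fn longest<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() > y.len() {
        x
    } else {
        y
    }
}

fn main() {
    let first = String::from("long string");
    let second = "short";
    let result = longest(first.as_str(), second);
    println!("The longest string is '{result}'");
}
```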
Control Flow
Conditional
The condition must be a bool; other values are not automatically converted to a Boolean.
fn main() { let x = 42; if x == 0 { println!("zero!"); } else if x < 100 { println!("small"); } else { println!("large"); } }
We can use if as an expression, which evaluates to the value of the last expression in the chosen block (notice the missing ; after that value).
fn main() {
    let x = 10;
    let size = if x < 20 { "small" } else { "large" };
    println!("number size: {}", size);
}
Note that the values in both arms must have the same type, since the type of the size variable has to be known at compile time.
Match Case
A match expression compares a value against multiple patterns and executes code based on the first matching pattern. Patterns can be made up of
- literal values,
- destructured arrays, enums, structs or tuples,
- variables,
- wildcards,
- placeholders.
match VALUE {
    PATTERN => EXPRESSION,
    PATTERN => EXPRESSION,
    PATTERN => EXPRESSION,
}
#[derive(Debug)] // needed so that {state:?} can print the variant
enum UsState {
    Alabama,
    Alaska,
    // --snip--
}
enum Coin {
    Penny,
    Nickel,
    Dime,
    Quarter(UsState),
}
fn value_in_cents(coin: Coin) -> u8 {
    match coin {
        Coin::Penny => 1,
        Coin::Nickel => 5,
        Coin::Dime => 10,
        Coin::Quarter(state) => {
            println!("State quarter from {state:?}!");
            25
        }
    }
}
fn main() {
    println!("quarter: {}", value_in_cents(Coin::Quarter(UsState::Alaska)));
}
Match with Option
fn plus_one(x: Option<i32>) -> Option<i32> {
    match x {
        None => None,
        Some(i) => Some(i + 1),
    }
}
fn main() {
    let five = Some(5);
    let six = plus_one(five);
    let none = plus_one(None);
}
Unlike if let, else if, else if let and else, match in Rust needs to be exhaustive: all possibilities for the value in the match expression must be accounted for. One way to ensure you've covered every possibility is to have a catch-all pattern as the last arm: either a variable name, which matches and binds any value, or _, a placeholder that matches any value without binding it to a variable, so the value is ignored without any warning.
fn main() {
    let dice_roll = 9;
    match dice_roll {
        1..=3 => add_fancy_hat(), // match a range of values
        5 | 7 => remove_fancy_hat(), // match multiple patterns
        other => move_player(other), // catch all other cases in the variable `other`
    }
}
fn add_fancy_hat() {}
fn remove_fancy_hat() {}
fn move_player(num_spaces: u8) {}
if let
if let is an expression that lets you combine if and let into a single construct to match one pattern while ignoring the rest. This means less boilerplate code; it works the same way as match, but without the exhaustiveness checking.
fn main() {
    let config_max = Some(3u8);
    match config_max {
        Some(max) => println!("The maximum is configured to be {max}"),
        _ => (), // ignore everything else, i.e. `None`
    }
    // alternatively, with if let
    if let Some(max) = config_max {
        println!("The maximum is configured to be {max}");
    }
}
with else
// `coin` is a Coin value as defined in the match example above
let mut count = 0;
match coin {
    Coin::Quarter(state) => println!("State quarter from {state:?}!"),
    _ => count += 1,
}
// alternatively, with if let ... else
let mut count = 0;
if let Coin::Quarter(state) = coin {
    println!("State quarter from {state:?}!");
} else {
    count += 1;
}
Loops
While
A while loop is a conditional loop that keeps running as long as its condition is true.
fn main() { let mut x = 200; while x >= 10 { x = x / 2; } println!("Final x: {x}"); }
For
A for loop (for-in loop) iterates over the elements of an iterator, which can be finite or could theoretically go on indefinitely.
fn main() {
    for x in 1..5 {
        println!("x: {x}");
    }
    for x in 1..=5 {
        println!("x: {x}");
    }
    for elem in [1, 2, 3, 4, 5] {
        println!("elem: {elem}");
    }
    for (index, elem) in [1, 2, 3, 4, 5].iter().enumerate() {
        println!("index: {index}, elem: {elem}");
    }
    for i in std::iter::repeat(5) {
        println!("turns out {i} never stops being 5");
        break; // would loop forever otherwise
    }
}
Under the hood for loops use a concept called "iterators" to handle iterating over different kinds of ranges/collections.
Loop
loop is the idiomatic infinite loop, similar to while true.
fn main() { }
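As a small sketch of loop in use, the function below keeps doubling a value until it passes a limit. The name first_power_of_two_above is made up for this example, and it assumes the limit is below 2^31 so that the doubling never overflows a u32. It also shows that break can hand a value back out of the loop.

```rust
// Hypothetical helper: returns the smallest power of two greater than `limit`.
// Assumes `limit` is below 2^31 so `value *= 2` cannot overflow.
fn first_power_of_two_above(limit: u32) -> u32 {
    let mut value: u32 = 1;
    loop {
        if value > limit {
            break value; // leave the loop and make `value` the loop's result
        }
        value *= 2;
    }
}

fn main() {
    println!("{}", first_power_of_two_above(100));
}
```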
break and continue
fn main() { let mut i = 0; loop { i += 1; if i > 5 { break; } if i % 2 == 0 { continue; } println!("{}", i); } }
Both continue and break can optionally take a label argument, which is used to break out of nested loops. Labels can be used with loop, while and for.
fn main() { let s = [[5, 6, 7], [8, 9, 10], [21, 15, 32]]; let mut elements_searched = 0; let target_value = 10; 'outer: for i in 0..=2 { for j in 0..=2 { elements_searched += 1; if s[i][j] == target_value { break 'outer; } } } print!("elements searched: {elements_searched}"); }
Example: Fibonacci Number
fn fib(n: u32) -> u32 { if n < 2 { todo!("Implement this"); } else { todo!("Implement this"); } } fn main() { let n = 20; println!("fib({n}) = {}", fib(n)); }
Solution
fn fib(n: u32) -> u32 { if n < 2 { return n; } else { return fib(n - 1) + fib(n - 2); } } fn main() { let n = 20; println!("fib({n}) = {}", fib(n)); }
Functions
The last expression in a function body (or any block) becomes the return value if the ; at the end of the expression is omitted. The return keyword can also be used for early return. Some functions have no return value, and return the 'unit type', (). The compiler will infer this if the -> () return type is omitted.
fn gcd(a: u32, b: u32) -> u32 { // parameters are annotated with their types, followed by -> and the return type
    if b > 0 {
        gcd(b, a % b) // the last expression in the block is returned because ; is omitted
    } else {
        a // the last expression in the block is returned because ; is omitted
    }
}
fn main() {
    println!("gcd: {}", gcd(143, 52)); // calling gcd with argument values
}
Rust code uses snake case as the conventional style for function and variable names, in which all letters are lowercase and underscores separate words.
- Overloading is not supported: each function has a single implementation.
  - It always takes a fixed number of parameters. Default arguments are not supported. Macros can be used to support variadic functions.
  - It always takes a single set of parameter types. These types can be generic.
Statements are instructions that perform some action and do not return a value. Expressions evaluate to a resultant value. Rust is an expression-based language.
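The distinction between statements and expressions shows up in everyday code. In the sketch below (area_label is an invented function), the let lines are statements, while the if and the braced block on their right-hand sides are expressions that produce the values being bound.

```rust
// Hypothetical helper that labels a rectangle by its area.
fn area_label(width: u32, height: u32) -> String {
    let area = width * height; // `let ...;` is a statement
    let size = if area > 100 { "large" } else { "small" }; // `if` is an expression
    let perimeter = {
        let half = width + height;
        half * 2 // no trailing `;`, so this is the value of the block
    };
    // The final expression is the function's return value.
    format!("{size}: area {area}, perimeter {perimeter}")
}

fn main() {
    println!("{}", area_label(3, 4));
}
```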
Structs and Enums
A structure (struct) is a custom data type that lets you package together multiple related, named values (fields) in a meaningful group, similar to an object's data attributes. We create an instance of a struct by specifying concrete values for each field as key: value pairs, and access the values using dot notation. Individual fields cannot be marked as mutable, hence the entire instance needs to be declared mutable.
struct User {
    active: bool,
    username: String,
    email: String,
    sign_in_count: u64,
}
fn main() {
    let mut user1 = User {
        active: true,
        sign_in_count: 1,
        username: String::from("someone123"),
        email: String::from("someone@example.com"),
    };
    user1.sign_in_count += 1;
    println!("user1.username: {}", user1.username);
    let user2 = User {
        email: String::from("someone456@example.com"),
        ..user1 // remaining fields are taken from user1; username is a String,
                // so it is moved rather than copied and user1.username is no longer usable
    };
    println!("user2.sign_in_count: {}", user2.sign_in_count);
    // println!("user1.username: {}", user1.username); // error: value was moved into user2
}
// functions can use the field init shorthand syntax rather than repeating each field name
fn build_user(email: String, username: String) -> User {
    User {
        active: true,
        username,
        email,
        sign_in_count: 1,
    }
}
Special Structs
Tuple structs are structs whose fields have types but no names, which makes them similar to tuples.
struct Color(i32, i32, i32);
struct Point(i32, i32, i32);

fn main() {
    let black = Color(0, 0, 0);
    let origin = Point(0, 0, 0);
}
Unit-like structs don't have any fields at all, similar to (), the unit type. These are useful when you need to implement a trait on some type but don't have any data that you want to store in the type itself.
struct AlwaysEqual;

fn main() {
    let subject = AlwaysEqual;
}
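To make the trait use case concrete, here is a minimal sketch. The trait `Greeter` and the struct `English` are hypothetical names chosen for the example; the point is only that the type carries behavior and no data.

```rust
// A hypothetical trait with one behavior and no need for stored data.
trait Greeter {
    fn greet(&self, name: &str) -> String;
}

// Unit-like struct: no fields at all.
struct English;

impl Greeter for English {
    fn greet(&self, name: &str) -> String {
        format!("Hello, {name}!")
    }
}

fn main() {
    let greeter = English;
    println!("{}", greeter.greet("Ferris"));
    // A unit-like struct occupies no memory.
    println!("size: {}", std::mem::size_of::<English>());
}
```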
Enum
An enumeration (enum) lets you define a type by enumerating its possible variants. An enum gives you a way of saying that a value is exactly one of a possible set of values, never two at the same time. We can also put data directly into each enum variant by declaring the type of its associated data.
#![allow(unused)]
fn main() {
    enum IpAddr {
        V4(String),
        V6(String),
    }

    let home = IpAddr::V4(String::from("127.0.0.1"));
    let loopback = IpAddr::V6(String::from("::1"));

    fn route(ip: IpAddr) {} // can be called with either variant
}
The standard library defines its own IpAddr enum, whose variants hold structs (Ipv4Addr and Ipv6Addr) instead of strings.
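A brief sketch of the standard library version in use. The helper `describe` is a made-up name; `std::net::IpAddr`, `Ipv4Addr::new`, and `Ipv6Addr::LOCALHOST` come from the standard library.

```rust
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

// `match` pulls the data out of whichever variant is present.
fn describe(ip: IpAddr) -> String {
    match ip {
        IpAddr::V4(v4) => format!("IPv4 {v4}"),
        IpAddr::V6(v6) => format!("IPv6 {v6}"),
    }
}

fn main() {
    let home = IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1));
    let loopback = IpAddr::V6(Ipv6Addr::LOCALHOST);
    println!("{}", describe(home));
    println!("{}", describe(loopback));
}
```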
A popular enum defined by the standard library is Option, which is included in the prelude. It encodes the common scenario in which a value could be something or nothing, and takes the place of the null value found in other languages. Because absence is part of the type, the compiler can check that you have handled every case, which prevents bugs related to null references. You have to convert an Option<T> to a T before you can perform T operations with it. This helps catch one of the most common issues with null: assuming that something isn't null when it actually is.
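#![allow(unused)]
fn main() {
    // The standard library defines Option roughly like this. Redefining it
    // here would shadow the prelude type while `None` and `Some` still
    // referred to the standard one, so the definition is shown as a comment.
    //
    // enum Option<T> {
    //     None,
    //     Some(T),
    // }

    let absent_number: Option<i32> = None;
    let some_number = Some(5);
}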
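The conversion from Option<T> to T can be sketched as follows. The helpers `first_even` and `describe` are hypothetical names for the example; `find` and `unwrap_or` are standard library methods.

```rust
// Hypothetical helper: the first even number in a slice, if there is one.
fn first_even(values: &[i32]) -> Option<i32> {
    values.iter().copied().find(|v| v % 2 == 0)
}

fn describe(values: &[i32]) -> String {
    // The compiler rejects a `match` that forgets either variant.
    match first_even(values) {
        Some(n) => format!("first even: {n}"),
        None => String::from("no even number"),
    }
}

fn main() {
    println!("{}", describe(&[1, 3, 4, 5]));
    println!("{}", describe(&[1, 3, 5]));

    // `unwrap_or` supplies a fallback value instead of matching.
    let n = first_even(&[7, 9]).unwrap_or(0);
    println!("{n}");
}
```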
Another very common enum is Result, which represents either success (Ok) or failure (Err).
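As an illustration, parsing text into a number returns a Result. The function `parse_port` is a made-up helper; `str::parse` and `ParseIntError` are from the standard library.

```rust
use std::num::ParseIntError;

// Hypothetical helper: parse a port number from text.
fn parse_port(text: &str) -> Result<u16, ParseIntError> {
    text.trim().parse::<u16>()
}

fn main() {
    for input in ["8080", "eighty", "70000"] {
        match parse_port(input) {
            Ok(port) => println!("{input:?} -> port {port}"),
            Err(e) => println!("{input:?} -> error: {e}"),
        }
    }
}
```

The last input fails because 70000 does not fit into a u16, so both malformed and out-of-range text end up in the Err variant.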
Trait
Methods
All functions defined within an impl block are called associated functions.
Methods are associated functions that specify the behaviors of a struct type. Unlike plain functions, they are defined within the context of a struct (or an enum or a trait object). Their first parameter is always self, which represents the instance the method is being called on; it is usually taken by reference. We don't have to repeat the type of self in every method. In addition, everything we can do with an instance of a type is collected in impl blocks rather than scattered across the library. Each struct is allowed to have multiple impl blocks.
struct Rectangle {
    width: u32,
    height: u32,
}

impl Rectangle {
    // `&self` is short for `self: &Self`, where `Self` is an alias for the type of the impl block.
    fn area(&self) -> u32 {
        self.width * self.height
    }
}

fn main() {
    let rect1 = Rectangle {
        width: 30,
        height: 50,
    };
    println!("The area of the rectangle is {}", rect1.area());
}
Unlike C/C++, which use . to call a method on an object and -> to call a method through a pointer, Rust has automatic referencing and dereferencing. When you call a method, Rust automatically adds &, &mut, or * so that the object matches the signature of the method.
p1.distance(&p2); is equivalent to (&p1).distance(&p2);
Given the receiver and name of a method, Rust can figure out definitively whether the method is reading (&self), mutating (&mut self), or consuming (self).
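The three kinds of receiver can be sketched on a small `Point` type. The type and its methods are assumptions made for this example and are not defined elsewhere in these notes.

```rust
#[derive(Debug)]
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    // Reading: borrows the point immutably.
    fn distance(&self, other: &Point) -> f64 {
        ((self.x - other.x).powi(2) + (self.y - other.y).powi(2)).sqrt()
    }

    // Mutating: borrows the point mutably and changes it in place.
    fn translate(&mut self, dx: f64, dy: f64) {
        self.x += dx;
        self.y += dy;
    }

    // Consuming: takes ownership, so the caller cannot use the point afterwards.
    fn into_tuple(self) -> (f64, f64) {
        (self.x, self.y)
    }
}

fn main() {
    let mut p1 = Point { x: 0.0, y: 0.0 };
    let p2 = Point { x: 3.0, y: 4.0 };

    // Both calls do the same thing; Rust inserts the `&` in the first one.
    println!("{}", p1.distance(&p2));
    println!("{}", (&p1).distance(&p2));

    p1.translate(1.0, 1.0); // borrowed as `&mut p1` automatically
    println!("{p1:?}");

    let (x, y) = p1.into_tuple(); // `p1` is moved here
    println!("{x} {y}");
}
```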
We can define associated functions that don't have self as their first parameter (and thus are not methods), e.g. the String::from function that's defined on the String type. These associated functions are often used for constructors that return a new instance of the struct. They are often called new, but new isn't a special name and isn't built into the language.
impl Rectangle {
    fn square(size: u32) -> Self {
        Self {
            width: size,
            height: size,
        }
    }
}
To call this associated function, we use the :: syntax with the struct name; let sq = Rectangle::square(3); is an example. The function is namespaced by the struct.
Examples
#[derive(Debug)]
struct Fibonacci {
    current: u8,
    previous: u8,
}

impl Fibonacci {
    fn new() -> Self {
        Self {
            current: 0,
            previous: 0,
        }
    }
}

impl Iterator for Fibonacci {
    type Item = u8;

    fn next(&mut self) -> Option<Self::Item> {
        if self.current == 0 {
            self.current = 1;
            return Some(self.current);
        }
        // checked_add returns None on u8 overflow, and `?` ends the iteration there.
        let next_value = self.previous.checked_add(self.current)?;
        self.previous = self.current;
        self.current = next_value;
        Some(self.current)
    }
}

fn main() {
    for fb in Fibonacci::new() {
        println!("{fb}");
        if fb > 30 {
            break;
        }
    }
    println!("------");
    for fb in Fibonacci::new() {
        println!("{fb}");
    }
}
Error Handling
Rust requires you to acknowledge the possibility of an error and take some action before your code will compile. Rust groups errors into two major categories:
- recoverable: e.g. a file-not-found error, which can be reported and retried without stopping the program
  - use the Result<T, E> enum with its two variants, Ok(T) and Err(E)
- unrecoverable: e.g. runtime failures such as failed bounds checks or running out of memory
  - use panic! to stop execution
  - use non-panicking APIs (e.g. Vec::get) if crashing is not acceptable
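A minimal sketch of the non-panicking route mentioned in the list: `Vec::get` (available on slices too) returns an Option, which can be turned into a Result. The helper `lookup` and its error message are assumptions for the example.

```rust
// Hypothetical helper: look up an index without risking a panic.
fn lookup(values: &[i32], index: usize) -> Result<i32, String> {
    // `get` returns None instead of panicking when the index is out of range.
    values
        .get(index)
        .copied()
        .ok_or_else(|| format!("index {index} out of range for length {}", values.len()))
}

fn main() {
    let v = vec![1, 2, 3];
    println!("{:?}", lookup(&v, 1));
    println!("{:?}", lookup(&v, 99));
}
```

Indexing with `v[99]` would panic here, while `lookup` hands the decision back to the caller.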
Unrecoverable error with panic!
Rust handles fatal errors (unrecoverable and unexpected) with a "panic". A panic can be triggered:
- by taking an action that causes the code to panic, usually as a symptom of a bug in the program logic.
- by explicitly calling the panic! macro.
By default, a panic prints a failure message, unwinds, cleans up the stack, and quits. Unwinding means Rust walks back up the stack and cleans up the data from each function it encounters. The unwinding can be caught. Alternatively, the program can abort immediately on panic without cleaning up, leaving the operating system to reclaim the memory; this makes the binary smaller. To do so, add the following lines to Cargo.toml.
[profile.release]
panic = 'abort'
fn main() {
    panic!("crash and burn");
}
Error message: src/main.rs:2:5 indicates that it's the second line, fifth character of our src/main.rs file.
In many cases the panic! call sits inside code that our code calls, possibly in a different file. The reported filename and line number then point at that panic! call, not at the line of our code that eventually led to it.
fn main() {
    let v = vec![1, 2, 3];

    v[99]; // accessing an invalid index panics
}
In languages such as C, this kind of buffer over-read can lead to security vulnerabilities, but Rust triggers a panic and stops execution.
Rust prints a note suggesting to set the RUST_BACKTRACE environment variable to get a backtrace (a list of all the function calls leading up to the panic) showing exactly what caused the error.
$ RUST_BACKTRACE=1 cargo run
Catching the panic
use std::panic;

fn main() {
    let result = panic::catch_unwind(|| "No problem here!");
    println!("{result:?}");

    let result = panic::catch_unwind(|| {
        panic!("oh no!");
    });
    println!("{result:?}");
}
Recoverable error with Result<T,E>
Many functions in Rust return a Result to state whether the operation succeeded or failed. When opening a file, for example, the file might not exist or we might not have permission to access it; these are different kinds of errors, and the Result enum can convey that information.
use std::fs::File; // File::open returns Result<File, io::Error>
use std::io::ErrorKind; // the kinds of error that io::Error can report

fn main() {
    let greeting_file_result = File::open("hello.txt"); // returns a Result

    let greeting_file = match greeting_file_result {
        Ok(file) => file,
        Err(error) => match error.kind() {
            ErrorKind::NotFound => match File::create("hello.txt") {
                Ok(fc) => fc,
                Err(e) => panic!("Problem creating the file: {e:?}"),
            },
            other_error => {
                panic!("Problem opening the file: {other_error:?}");
            }
        },
    };
}
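match can be verbose and might not communicate intent well. Helper methods defined on Result can be more convenient for specific tasks, such as:
- unwrap(): returns the value inside Ok, or calls panic! if the Result is Err.
- expect(msg): like unwrap(), but panics with the message you provide, which makes the failure easier to track down.
- unwrap_or_else(f): returns the value inside Ok, or calls the closure f with the error to produce a fallback value.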
use std::fs::File;
use std::io::ErrorKind;

fn main() {
    let greeting_file = File::open("hello.txt").unwrap_or_else(|error| {
        if error.kind() == ErrorKind::NotFound {
            File::create("hello.txt").unwrap_or_else(|error| {
                panic!("Problem creating the file: {error:?}");
            })
        } else {
            panic!("Problem opening the file: {error:?}");
        }
    });
}
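The three helpers can also be compared side by side without touching the file system. This sketch parses numbers instead; `retries_or_default` and its fallback value of 3 are made up for the illustration.

```rust
// Hypothetical helper: parse a retry count, falling back to a default of 3.
fn retries_or_default(text: &str) -> u32 {
    text.parse::<u32>().unwrap_or_else(|err| {
        eprintln!("could not parse {text:?} ({err}); using 3");
        3
    })
}

fn main() {
    // unwrap: acceptable only when failure is impossible or would be a bug.
    let a: u32 = "42".parse().unwrap();

    // expect: like unwrap, but the panic message states what was assumed.
    let b: u32 = "7".parse().expect("hard-coded literal should be a valid u32");

    // unwrap_or_else: compute a fallback from the error instead of panicking.
    let c = retries_or_default("many");

    println!("{a} {b} {c}");
}
```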
Packages, Crates and Modules
As a program grows, it is important to group related functionality and to separate code with distinct features. Code organization in Rust is built on packages, crates, and modules.
Project code is split into multiple modules. A module groups one or more items, such as functions, structs, enums, and constants; it can live in its own file or be declared inline.
A package is a collection of one or more crates. A crate is a binary or library that is built by the Rust compiler.
- Packages: a Cargo feature that lets you build, test, and share crates
- Crates: a tree of modules that produces a library or executable
- Modules and use: let you control the organization, scope, and privacy of paths
- Paths: a way of naming an item, such as a struct, function, or module
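A small sketch of modules, privacy, and paths inside a single file. The module and function names (`shapes`, `circle`, `area`, `square`) are invented for the example; in a real project each module would more likely sit in its own file.

```rust
// Inline modules forming a small tree: shapes -> circle.
mod shapes {
    pub mod circle {
        // `pub` makes the item reachable from outside the module.
        pub fn area(radius: f64) -> f64 {
            std::f64::consts::PI * square(radius)
        }

        // Private: only code inside `circle` can call this.
        fn square(x: f64) -> f64 {
            x * x
        }
    }
}

// `use` brings a path into scope so it can be written in a shorter form.
use shapes::circle;

fn main() {
    // Full path through the module tree.
    println!("{}", shapes::circle::area(1.0));
    // Shorter path through the `use` above.
    println!("{}", circle::area(2.0));
}
```

Calling `shapes::circle::square` from `main` would be rejected by the compiler, because `square` is not marked `pub`.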