Tuples, Arrays and Collections

These data types store multiple values. The built-in array and tuple types have a fixed size that is known at compile time, so they can be stored on the stack. Collections store their data on the heap, so the amount of data does not need to be known at compile time and can grow or shrink at runtime. Different kinds of collections have different capabilities and costs.

  • A vector allows you to store a variable number of values next to each other.
  • A string is a collection of characters.
  • A hash map allows you to associate a value with a specific key. It’s a particular implementation of the more general data structure called a map.

Tuples

A tuple is a collection of values that may have different types. Its size is fixed and known at compile time.

fn main() {
    let tup: (i32, f64, u8) = (500, 6.4, 1);

    let x = tup.0; // accessing tuple values with dot notation
    println!("The value of x is: {x}");

    let (int_num, float_num, uint_num) = tup;  // Pattern Matching for Destructuring into local variables
    println!("The value of float is: {float_num}");
}

The patterns used here are "irrefutable", meaning that the compiler can statically verify that the value on the right of = has the same structure as the pattern. A variable name is an irrefutable pattern that always matches any value, which is why we can also use let to declare a single variable. Rust also supports patterns in conditionals (for example in match and if let), allowing equality comparison and destructuring to happen at the same time.
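As a minimal sketch of patterns in conditionals, the example below uses match on a tuple and if let on an Option. The helper names describe and first_or_zero are hypothetical and exist only for illustration.

```rust
// Hypothetical helper: classify a point using refutable patterns in `match`.
fn describe(point: (i32, i32)) -> String {
    match point {
        (0, 0) => String::from("origin"),          // equality comparison on both fields
        (x, 0) => format!("on the x-axis at {x}"), // compare one field, bind the other
        (0, y) => format!("on the y-axis at {y}"),
        (x, y) => format!("at ({x}, {y})"),        // irrefutable arm: always matches
    }
}

// Hypothetical helper: `if let` runs its block only when the pattern matches.
fn first_or_zero(pair: Option<(i32, i32)>) -> i32 {
    if let Some((first, _)) = pair {
        first
    } else {
        0
    }
}

fn main() {
    println!("{}", describe((3, 0)));
    println!("{}", first_or_zero(Some((7, 9))));
}
```

The arms of a match are tried from top to bottom, so the more specific patterns have to come before the catch-all (x, y) arm.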

Arrays

Arrays store a fixed number of items of the same data type. Both the size and the element type are known at compile time.

fn main() {
    let mut a: [i8; 10] = [42; 10];  // array [T;N]
    a[5] = 0;   // accessing array elements with indexing
    println!("a: {a:?}");
}

The for statement supports iterating over arrays (but not tuples).

fn main() {
    let primes = [2, 3, 5, 7, 11, 13, 17, 19];
    for prime in primes {
        for i in 2..prime {
            assert_ne!(prime % i, 0);
        }
    }
}

This functionality uses the IntoIterator trait, which arrays implement.

The assert_ne! macro is new here. There are also assert_eq! and assert! macros. These are always checked, while debug-only variants like debug_assert! compile to nothing in release builds.
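The following sketch puts the assertion macros side by side. The helper is_prime is a hypothetical function introduced only so that there is something to assert on.

```rust
// Hypothetical helper used only to have something to assert on.
fn is_prime(n: u32) -> bool {
    n >= 2 && (2..n).all(|i| n % i != 0)
}

fn main() {
    assert!(is_prime(13), "13 should be prime"); // checks a boolean condition, with a custom message
    assert_eq!(7 % 2, 1);                         // checks equality and reports both sides on failure
    assert_ne!(13 % 5, 0);                        // checks inequality
    debug_assert!(!is_prime(9));                  // checked in debug builds, skipped in release builds by default
}
```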

Vectors

A vector, Vec<T>, stores multiple values of the same type (it is implemented using generics) next to each other in memory. Ownership and borrowing rules ensure that references into a vector remain valid.

fn main(){
    let v: Vec<i32> = Vec::new();  // create an empty vector; with no values inserted, a type annotation is required
    let v2 = vec![1, 2, 3];        // vec! macro; the type is inferred from the values

    // updating vector
    let mut v = Vec::new();  // element type is inferred from the values pushed later
    v.push(1);  // add elements to vector
    v.push(2);
    v.push(3);

    // Referencing values stored in vector
    let third: &i32 = &v[2];   // indexing is zero-based; panics if the index is out of bounds
    println!("The third element is {third}");

    let third: Option<&i32> = v.get(2);  // get returns an Option, so an out-of-bounds index yields None instead of a panic
    match third {
        Some(third) => println!("The third element is {third}"),
        None => println!("There is no third element."),
    }
    
    for i in &v {      // iterating over the values in vector
        println!("{i}");
    }

    for i in &mut v { // iterating over mutable references
        *i += 100;  // dereference to get to the value in i
    }
}  // When vector goes out of scope, vector and elements are dropped

Rust needs to know what types will be in the vector at compile time so it knows exactly how much memory on the heap will be needed to store each element. If we want to store values of different types in a vector, we can define a custom enum whose variants hold those types and store the enum values.

    enum SpreadsheetCell {
        Int(i32),
        Float(f64),
        Text(String),
    }

    let row = vec![
        SpreadsheetCell::Int(3),
        SpreadsheetCell::Text(String::from("blue")),
        SpreadsheetCell::Float(10.12),
    ];

If you don’t know the exhaustive set of types a program will get at runtime to store in a vector, the enum technique won’t work. Instead, you can use a trait object.
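A minimal sketch of the trait-object approach, assuming a hypothetical trait Cell with a single render method. The trait, the two struct types and the render_row function are illustrative names and are not part of the original example.

```rust
// Hypothetical trait for illustration: anything that can render itself as text.
trait Cell {
    fn render(&self) -> String;
}

struct IntCell(i32);
struct TextCell(String);

impl Cell for IntCell {
    fn render(&self) -> String {
        self.0.to_string()
    }
}

impl Cell for TextCell {
    fn render(&self) -> String {
        self.0.clone()
    }
}

// Every element is a Box<dyn Cell>, so all elements have the same size
// even though the concrete types behind the boxes differ.
fn render_row(row: &[Box<dyn Cell>]) -> Vec<String> {
    row.iter().map(|cell| cell.render()).collect()
}

fn main() {
    let row: Vec<Box<dyn Cell>> = vec![
        Box::new(IntCell(3)),
        Box::new(TextCell(String::from("blue"))),
    ];
    for text in render_row(&row) {
        println!("{text}");
    }
}
```

Unlike the enum, new types can be added later by implementing the trait, at the cost of a heap allocation per element and dynamic dispatch on each method call.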

Strings

A String is a more complicated data structure than a string literal, which has the string slice type &str. String literals are hardcoded into the program binary and accessed through a reference, whereas a String is a growable, mutable, owned, UTF-8 encoded sequence of bytes whose contents are stored on the heap. String is implemented as a wrapper around a vector of bytes (Vec<u8>) with some extra guarantees, restrictions (for example, it does not support integer indexing), and capabilities; many Vec operations are also available on String.

fn main(){
    let mut s1 = String::from("नमस्ते");  // UTF-8 encoded
    let mut _s = String::new();  // empty string to add data later
    let mut s = "Hello".to_string();  // from literals
    
    s.push_str(" World");  // append a string slice; push appends a single char instead
    s1.push_str(&s);  // append a String to a String; push_str only borrows s, so s is still usable

    // concatenation with + Operator
    let s1 = String::from("Hello, ");
    let s2 = String::from("world!");
    let s3 = s1 + &s2; // note s1 has been moved here and can no longer be used, &s2 reference is used, s2 will still be valid

    let s1 = String::from("tic");
    let s2 = String::from("tac");
    let s3 = String::from("toe");

    let s = format!("{s1}-{s2}-{s3}");  // format! returns a new String and takes ownership of none of its arguments

}

fn main(){
    let s1 = String::from("नमस्ते"); 
    let s2 = String::from("Hello");
    println!("length of s1 is {} and s2 is {}", s1.len(), s2.len());  // len() counts bytes, not characters: 18 and 5

}

The Devanagari word “नमस्ते” can be represented in the following ways:

  • bytes (u8 values): [224, 164, 168, 224, 164, 174, 224, 164, 184, 224, 165, 141, 224, 164, 164, 224, 165, 135]
  • Unicode scalar values: ['न', 'म', 'स', '्', 'त', 'े']
  • grapheme clusters: ["न", "म", "स्", "ते"], since marks such as '्' and 'े' are diacritics that don't make sense on their own

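It is unclear what the return type of a string-indexing operation should be (a byte, a scalar value, or a grapheme cluster), and a byte index does not always correspond to the start of a valid character. For these reasons, Rust does not allow indexing a String with an integer.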

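We can, however, slice a String or a string literal with a byte range such as &s[0..4]. The slice panics at runtime if the range does not fall on character boundaries.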
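A small sketch of slicing by byte range without risking a panic, using the standard str::get and str::is_char_boundary methods. The wrapper byte_prefix is a hypothetical helper added for illustration.

```rust
// Hypothetical helper: return the first `n` bytes of `s` as a slice, or None when
// `n` is out of range or falls in the middle of a multi-byte character.
fn byte_prefix(s: &str, n: usize) -> Option<&str> {
    s.get(..n) // `str::get` is the non-panicking counterpart of `&s[..n]`
}

fn main() {
    let hello = "Здравствуйте"; // each of these Cyrillic letters takes 2 bytes in UTF-8

    println!("{:?}", byte_prefix(hello, 4)); // first two letters
    println!("{:?}", byte_prefix(hello, 1)); // None: byte 1 is inside the first letter
    println!("{}", hello.is_char_boundary(1));

    // let bad = &hello[0..1]; // would panic at runtime: not a char boundary
}
```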

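Indexing operations are expected to always take O(1) time, but that performance cannot be guaranteed for a String, because Rust would have to walk through the contents from the beginning to determine how many valid characters there are.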

The best way to operate on pieces of a string is to be explicit about whether we want characters or bytes.

for c in "Зд".chars() {
    println!("{c}");
}

for b in "Зд".bytes() {
    println!("{b}");
}

Crates such as unicode-segmentation make it possible to extract grapheme clusters as well.

Hash Maps

HashMap<K, V> stores a mapping of keys of type K to values of type V using a hashing function, which determines how it places these keys and values into memory. Other programming languages call this data structure an object, hash table, dictionary, or associative array. A hash map lets us look up data by key rather than by index. Like vectors, hash maps store their data on the heap and are homogeneous: all of the keys must have the same type, and all of the values must have the same type. Each unique key can have only one value associated with it at a time, but the number of key-value pairs can grow.

use std::collections::HashMap; 

fn main() {
    let mut scores = HashMap::new(); 

    scores.insert(String::from("Team A"), 10);
    scores.insert(String::from("Team B"), 50);

    let team_name = String::from("Team B");
    let score = scores.get(&team_name)   // returns Option<&V>
                    .copied()            // get Option<i32> rather than Option<&i32>
                    .unwrap_or(0);       // unwrap the score, or use 0 if the key is absent

    scores.insert(String::from("Team A"), 20);  // overwrites the value
    
    // Add a key and value only if the key isn't present; if the key is present, the existing value is unchanged
    scores.entry(String::from("Team C"))   // returns an Entry enum (Occupied | Vacant) for the key
            .or_insert(50);                // defined on Entry: inserts the value if vacant, returns &mut V either way

    // iterate over key-value pairs, similar to a vector; the iteration order is arbitrary
    for (key, value) in &scores {
        println!("{key}: {value}");
    }

    // updating a value based on the old value
    let text = "hello world wonderful world";
    let mut map = HashMap::new();
    for word in text.split_whitespace() {
        let count = map.entry(word).or_insert(0);  // or_insert returns mutable reference to the value (&mut V) corresponding to the key
        *count += 1;  //dereference the mutable reference to increment the value
    }
    println!("{map:?}");
}

For types that implement the Copy trait, like i32, the values are copied into the hash map. For owned values like String, the values will be moved and the hash map will be the owner of those values.
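The ownership rule can be sketched as follows. The function build_map and its key and value are hypothetical and chosen only for illustration.

```rust
use std::collections::HashMap;

// Hypothetical helper showing what is moved and what is copied on insert.
fn build_map() -> HashMap<String, i32> {
    let name = String::from("Favorite number"); // owned value: will be moved
    let value = 7;                               // i32 is Copy: will be copied

    let mut map = HashMap::new();
    map.insert(name, value);

    // println!("{name}"); // would not compile: `name` was moved into the map
    println!("{value}");   // fine: `value` was copied and is still usable here

    map
}

fn main() {
    let map = build_map();
    println!("{map:?}");
}
```

Inserting references instead of owned values is also possible, but then the referenced data has to stay valid for at least as long as the hash map.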

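By default, HashMap uses a hashing function called SipHash, which provides resistance to denial-of-service (DoS) attacks involving hash tables. It is not the fastest hashing algorithm available, so if profiling shows that hashing is too slow, we can switch to a faster function by specifying a different hasher, that is, a type that implements the BuildHasher trait.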
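As a rough sketch of swapping the hasher, the example below plugs a hand-written FNV-1a hasher into HashMap through std::hash::BuildHasherDefault. The Fnv1a type, the FnvMap alias and the count_words function are assumptions made for illustration; in practice a maintained hasher crate would normally be used instead, and FNV-1a gives up the DoS resistance of the default.

```rust
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hasher};

// Hand-written 64-bit FNV-1a hasher, for illustration only.
struct Fnv1a(u64);

impl Default for Fnv1a {
    fn default() -> Self {
        Fnv1a(0xcbf29ce484222325) // FNV-1a 64-bit offset basis
    }
}

impl Hasher for Fnv1a {
    fn write(&mut self, bytes: &[u8]) {
        for byte in bytes {
            self.0 ^= u64::from(*byte);
            self.0 = self.0.wrapping_mul(0x100000001b3); // FNV 64-bit prime
        }
    }

    fn finish(&self) -> u64 {
        self.0
    }
}

// The third type parameter of HashMap selects the hasher builder.
type FnvMap<K, V> = HashMap<K, V, BuildHasherDefault<Fnv1a>>;

// Hypothetical helper: count words using the custom hasher.
fn count_words(text: &str) -> FnvMap<&str, u32> {
    let mut map = FnvMap::default(); // HashMap::new() only exists for the default hasher
    for word in text.split_whitespace() {
        *map.entry(word).or_insert(0) += 1;
    }
    map
}

fn main() {
    let counts = count_words("hello world wonderful world");
    println!("{counts:?}");
}
```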

Slice

A slice (see std::slice), written [T] and usually used through a reference &[T], is a dynamically sized view into a contiguous sequence such as an array or a vector. A function declared as fn fun(num: &[u8]) accepts a reference to a sequence whose length is only known at runtime and has read-only access to it. Calling fun(&array) passes such a reference: ownership is not transferred to the function, and the data is only borrowed for the duration of the call. Slices can be iterated over in the same way as vectors.
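A short sketch of passing slices to functions. The functions sum and double_all are hypothetical names chosen for illustration; the second one uses a mutable slice to show that in-place changes are possible without taking ownership.

```rust
// Hypothetical function: borrows any contiguous run of u8 values read-only.
fn sum(nums: &[u8]) -> u32 {
    nums.iter().map(|&n| u32::from(n)).sum()
}

// Hypothetical function: a mutable slice allows in-place changes without taking ownership.
fn double_all(nums: &mut [u8]) {
    for n in nums.iter_mut() {
        *n = n.wrapping_mul(2);
    }
}

fn main() {
    let array = [1u8, 2, 3, 4, 5];
    let mut vector = vec![10u8, 20, 30];

    println!("{}", sum(&array));       // a reference to the whole array coerces to &[u8]
    println!("{}", sum(&array[1..3])); // only the elements at index 1 and 2
    println!("{}", sum(&vector));      // &Vec<u8> coerces to &[u8] as well

    double_all(&mut vector);
    println!("{vector:?}");            // the caller still owns the data
}
```

Taking &[u8] rather than &Vec<u8> or &[u8; 5] lets the same function accept arrays of any length, vectors, and sub-ranges of either.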