Control flow patterns in Rust

rust
code
Author

Ben Ruijl

Published

April 6, 2023

Rust has some great control flow features that drastically simplify writing algorithms. In this short blog post we will consider some design patterns that use break and continue to escape nested loops and that use break on blocks. These patterns offer alternatives to the goto statement in other languages and the for-else statement in Python.

Control flow

Most of you will have used the keywords break and continue to control the innermost loop:

for x in [1,2,3,4,5] {
    if x == 2 {
        continue;
    }
}

Some of you may have also used the fact that break can return a value when used in a loop:

let mut prime_generator = PrimeGenerator(3);
let mersenne_prime = loop {
    let next_prime = prime_generator.next();
    if next_prime.is_mersenne() {
        break next_prime;
    }
};

This pattern nicely encapsulates the computation of mersenne_prime in a single let expression. Sadly you cannot break a while or for with a value directly, but these statements can always be rewritten using loop. We will soon look at an alternative that does not involve converting the iteration to a loop.

Loops can be labeled, which gives us more control. In the example code below, I mimic some code used to compute the greatest common divisor of two polynomials, for which we need to choose prime numbers and then sample integers between 1 and the chosen prime number. If we evaluate the polynomial with the sample (and apply the mod operator % dilligently) and obtain 0, we picked an unlucky prime or evaluation sample1 and we should try another sample or another prime.

1 The polynomial \(3 x\) yields \(0\) when you evaluate it for prime \(3\), as \((3 \% 3) x = 0\), regardless of \(x\). The polynomial \(3x^2-x\) yields 0 for prime \(5\) and sample \(x=2\), but not for other sample points.

let mut prime_generator = PrimeGenerator(3);
'new_prime: loop {
    let next_prime = prime_generator.next();
    let poly_finite_field = polynomial.set_prime(next_prime);
    let mut samples_tried = 0;
    'new_sample: loop {
        let sample = rnd.get_random_number(1, next_prime);
        let r = poly_finite_field.evaluate(sample);
        
        if r == 0 {
            if samples_tried < UNLUCKY_SAMPLE_THRESHOLD {
                samples_tried += 1;
                continue 'new_sample; // unlucky sample
            } else {
                continue 'new_prime;// unlucky prime
            }
        }
        
        // some more math logic here
    }
}

Writing this logic without continuing the outer loop with a continue 'new_prime, would involve breaking all inner loops (in this example there is only one inner loop, but there could be many!) and using a variable to flag that the outer loop should be continued. Using labeled loops with appropriate names makes the code shorter and more readable, as the point where the decision is made to continue to the next prime is also where the continue is located. Using a helper flag, the continue is unnaturally postponed which creates more room for bugs.

Block breaks

One relatively unknown feature is that in Rust one can also break on named blocks:

'block: {
    break 'block;
    println!("Never reached!");
}

Breaking named blocks can be used to simplify logic. For example, in the case of deeply nested if statements:

let deadline = if let Some(settings) = self.settings {
    if let Some(deadline) = settings.deadline {
        if settings.deadline_override && deadline > 0 {
            deadline
        } else {
            self.default_deadline
        }
    } else {
        self.default_deadline
    }
} else {
    self.default_deadline
};

which sadly cannot be fused into one if due to the absence of if-let chains. The above structure is rather verbose as all else blocks fall back to the same case. Thankfully, we can use named blocks to fuse the else case into one line:

let deadline = 'done: {
    if let Some(settings) = self.settings {
        if let Some(deadline) = settings.deadline {
            if settings.deadline_override && deadline > 0 {
                break 'done deadline;
            }
        }
    }
    self.default_deadline
};

Breaking on named blocks can be used to emulate the for-else syntax from Python, which allows code to be executed if the for-loop did not break:

items = [3,5,7,11,13,17]
item = 19
location = None
for i, x in enumerate(items):
    if x == item:
        location = i
        break
else:
    # break was not triggered
    location = len(items)
    items.append(item)

In Rust one could write this with a helper flag that is set when the item is found (if the for-loop does not return a value) or with an Option:

let mut items = vec![3, 5, 7, 11, 13, 17];
let item = 19;
let mut location = None;
for (i, x) in items.iter().enumerate() {
    if *x == item {
        location = Some(i);
        break;
    }
}

if location.is_none() {
    location = Some(items.len());
    items.push(item);
}

let location = location.unwrap();

The downside of the Option is that a (never-failing) unwrap is necessary.

An alternative is to use a named block:

let mut items = vec![3,5,7,11,13,17];
let item = 19;
let location = 'find_item: {
    for (i, x) in items.iter().enumerate() {
        if *x == item {
            break 'find_item i;
        }
    }

    // only reached when not breaking
    items.push(item);
    items.len() - 1
};

where any code after the for-loop will only be reached when the block has not been interrupted by a break.

Goto

Some readers will find these patterns reminiscent of the infamous goto2 statement. I would argue that these patterns are much clearer and safer than the goto’s of old. By their very nature, blocks are clearly marked and Rust’s borrow checker is still hard at work to prevent us from doing something illegal. However, it is true that there is a trade-off between if statements that make the conditions under which code is evaluated clear and the use of a break. Often, I do find code with fewer if encapsulations more readable though, as is the case for early returns:

2 Wikipedia has a section of criticism on the goto statement: “Go To Statement Considered Harmful”

fn test_if(arg: usize) {
    if arg != 0 {
        // some code
    }
}

fn test_early_return(arg: usize) {
    if arg == 0 {
        return;
    }

    // some code
}

where saving on one level of indentation makes the code more readable.

In the end it is up to the programmer to use the patterns that make the logic the clearest. I hope that with this short blog post I have shown some useful patterns that serve as another tool in the programmer’s toolbox.

Let me know what you think in the comments below (new) or on Reddit!