Enums in Rust
Rust is my second favourite programming language. There are lots of things that I love about it, and if I could take some of those features and squidge them into PHP, I would.
One of those features is Rust's powerful enumerations (enums). Yes, we have had enums in PHP since PHP 8.1, but they are more or less just glorified class constants. The only two real benefits they provide over class constants are the ability to use the name of the enum as a type and the ability to attach methods to an enum.
Rust's enums on the other hand are much more than just constants. I'd like to dig into them a little bit in this post and show you all the cool things you can do with them.
Syntax
We'll start off with the syntax. Rust's enums look very similar to enums in PHP and many other languages.
enum Role {
    Superadmin,
    Admin,
    Editor,
    User,
}
The only real difference here is the lack of an explicit case keyword like we have in PHP.
If we want to add methods to our Role enum, we can do so using an implementation (impl) block.
impl Role {
    pub fn is_admin(&self) -> bool {
        matches!(self, Role::Admin)
    }
}
This impl syntax isn't specific to enums – methods on structs are also written in this way.
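To see the method in action, here's a minimal sketch that repeats the Role enum and adds a label method. That label method is my own hypothetical addition for illustration. It shows that a match over an enum has to cover every member.

```rust
// The Role enum and is_admin method from above.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Role {
    Superadmin,
    Admin,
    Editor,
    User,
}

impl Role {
    pub fn is_admin(&self) -> bool {
        matches!(self, Role::Admin)
    }

    // Hypothetical helper, added for illustration only.
    // The match must cover every member, or the code won't compile.
    pub fn label(&self) -> &'static str {
        match self {
            Role::Superadmin => "Super admin",
            Role::Admin => "Admin",
            Role::Editor => "Editor",
            Role::User => "User",
        }
    }
}

fn main() {
    for role in [Role::Superadmin, Role::Admin, Role::Editor, Role::User] {
        println!("{} is admin? {}", role.label(), role.is_admin());
    }
}
```

If you delete one of the arms in label, the compiler refuses to build the program, which is already a step up from a switch over class constants.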
Cool, but this is pretty basic, right? There's nothing special here, it's just a plain ol' enumeration with some methods.
You're right – so let me introduce you to the magic.
Tagged Unions 101
Tagged unions, as a concept, have been around since the 1960s. Before we look at Rust's tagged unions, it's probably useful to understand what a regular union is. We'll use C as our example language.
In C, you can define a union. A union has many different fields, but the catch is that only one of those fields can be "filled" at any one time.
union Data {
    int i;
    float f;
    char str[255];
} data;
So if data.i holds a value, data.f and data.str can't hold meaningful values of their own, because every field shares the same memory location. That memory will always be big enough to store the largest member of the union.
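Rust has a union keyword of its own, although you rarely need it outside of talking to C code. Here's a rough Rust counterpart to the C union above, as a sketch to make the shared memory visible. The float_bits_via_union helper is a hypothetical name of mine, and I'm assuming #[repr(C)] so that the layout matches what C would do.

```rust
use std::mem::size_of;

// A Rust counterpart to the C union above.
// #[repr(C)] asks for a C-compatible layout.
#[repr(C)]
union Data {
    i: i32,
    f: f32,
    bytes: [u8; 255],
}

// Hypothetical helper: writes a float into the union,
// then reads the same memory back as an integer.
fn float_bits_via_union(x: f32) -> i32 {
    let data = Data { f: x };
    // Reading a union field is unsafe: the compiler can't know
    // which field was written last.
    unsafe { data.i }
}

fn main() {
    // The union is at least as big as its largest field.
    println!("size of Data: {} bytes", size_of::<Data>());
    println!("1.0 read back as an i32: {}", float_bits_via_union(1.0));
}
```

The integer that comes back is just the float's raw bits, which is exactly why only one field can be considered "filled" at a time.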
Fun fact – PHP's interpreter uses a union as part of the internal representation of every PHP value, known as a zval.
A tagged union is made up of two things – a tag, and a union. If we were to define one in C, we might do something like this:
typedef enum ValueType { Int, Float, String } ValueType;
typedef struct Value {
    ValueType tag;
    union Data {
        int i;
        float f;
        char str[255];
    } value;
} Value;
When we construct a value, we "tag" it using one of the ValueType cases, then assign some data to one of the union fields based on the tag.
Tagged unions in languages like C and C++ aren't really considered "safe". The union isn't strictly related to the tag, so there's nothing stopping us from having a value tagged as Int and then reading the str field.
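To make that problem concrete, here's a sketch of the same hand-rolled tag-plus-union pair, written with Rust's own union keyword so it can sit next to the rest of the Rust in this post. Everything here (ValueType, Data, Value, and both read_as_float functions) is illustrative and trimmed down to two fields. It isn't how you'd normally write Rust.

```rust
// A hand-rolled tagged union, mirroring the C struct above.
// All names here are illustrative.
#[derive(Debug, Clone, Copy, PartialEq)]
enum ValueType {
    Int,
    Float,
}

#[repr(C)]
union Data {
    i: i32,
    f: f32,
}

struct Value {
    tag: ValueType,
    value: Data,
}

// Nothing ties the tag to the union: this compiles and runs even when
// the value is tagged as Int, and just reinterprets the integer's bits.
fn read_as_float_unchecked(v: &Value) -> f32 {
    unsafe { v.value.f }
}

// The "correct" accessor has to remember to check the tag by hand.
fn read_as_float(v: &Value) -> Option<f32> {
    match v.tag {
        ValueType::Float => Some(unsafe { v.value.f }),
        ValueType::Int => None,
    }
}

fn main() {
    let v = Value { tag: ValueType::Int, value: Data { i: 42 } };
    println!("checked: {:?}", read_as_float(&v));
    println!("unchecked: {}", read_as_float_unchecked(&v));
}
```

The unchecked read happily returns a nonsense float. Keeping the tag and the data in sync is entirely down to discipline.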
This is where Rust takes things a step further.
Tagged Unions in Rust
Rust extends its basic enumerations by allowing us to attach data directly to each enum member. Let's recreate that Value tagged union.
enum Value {
    Int(i64),
    Float(f64),
    Str(String),
}
This is known as a "tuple variant" or "tuple member". Each of our enum members / cases would be the "tag", and the values inside of those members would be part of the union.
Constructing / instantiating one of these members would then allow us to pass in a value for the field.
let int_value = Value::Int(100);
The super-safety aspects start to show when you want to access a field. The only way to access a field is by destructuring and pattern matching the value.
match int_value {
    Value::Int(value) => {
        dbg!(value);
    }
    _ => todo!(),
}
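Here's a slightly fuller sketch of pattern matching on Value. The describe and as_int helpers are hypothetical names of mine. The first handles every member, so there's no need for a catch-all arm, and the second uses if let, which is shorthand for a match that only cares about one member.

```rust
#[derive(Debug)]
enum Value {
    Int(i64),
    Float(f64),
    Str(String),
}

// Hypothetical helper: an exhaustive match over every member.
fn describe(value: &Value) -> String {
    match value {
        Value::Int(n) => format!("int {n}"),
        Value::Float(x) => format!("float {x}"),
        Value::Str(s) => format!("string {s:?}"),
    }
}

// Hypothetical helper: only the Int member carries an i64,
// so every other member yields None.
fn as_int(value: &Value) -> Option<i64> {
    if let Value::Int(n) = value {
        Some(*n)
    } else {
        None
    }
}

fn main() {
    let values = [
        Value::Int(100),
        Value::Float(1.5),
        Value::Str("hi".to_string()),
    ];
    for v in &values {
        println!("{} (as_int: {:?})", describe(v), as_int(v));
    }
}
```

There's no way to write as_int so that it pulls an i64 out of a Str. The data only exists inside the arm that matched its member.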
Tuple variants are just one of the syntaxes that we can use. We can also use the "struct variant" syntax.
#[derive(Debug)]
enum AstNode {
    Print {
        // Box gives the recursive field a known size.
        value: Box<AstNode>,
    },
    Add {
        left: Box<AstNode>,
        right: Box<AstNode>,
    },
}
So instead of having nameless values inside of our members, we can instead give them appropriate names. This is useful if you've got more than one value in a member.
The same rules apply when we want to access a value inside of the member.
match node {
    AstNode::Print { value } => {
        dbg!(value);
    }
    AstNode::Add { left, right } => {
        dbg!(left, right);
    }
}
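To show struct variants doing some real work, here's a sketch of a tiny evaluator. Two assumptions to flag: I've added a Number member that isn't in the enum above, because a tree needs a leaf somewhere, and the recursive fields are wrapped in Box so the compiler can work out the size of AstNode. The eval function is hypothetical too.

```rust
// Assumption: a Number leaf member is added so a finite tree can be built,
// and Box gives the recursive fields a known size.
#[derive(Debug)]
enum AstNode {
    Number(i64),
    Print { value: Box<AstNode> },
    Add { left: Box<AstNode>, right: Box<AstNode> },
}

// Hypothetical evaluator: walks the tree and returns each node's value.
fn eval(node: &AstNode) -> i64 {
    match node {
        AstNode::Number(n) => *n,
        AstNode::Print { value } => {
            let result = eval(value);
            println!("{result}");
            result
        }
        AstNode::Add { left, right } => eval(left) + eval(right),
    }
}

fn main() {
    // Roughly: print(1 + 2)
    let program = AstNode::Print {
        value: Box::new(AstNode::Add {
            left: Box::new(AstNode::Number(1)),
            right: Box::new(AstNode::Number(2)),
        }),
    };
    eval(&program);
}
```

Tuple and struct variants can live side by side in the same enum, so you can pick whichever reads best for each member.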
Usage in Rust
Rust leans heavily on tagged unions in the core of the language. There are two prime examples of this – the Option and Result types.
enum Option<T> {
    Some(T),
    None,
}
enum Result<T, E> {
    Ok(T),
    Err(E),
}
These are almost perfect use cases for tagged unions: each enum has only two possible members, and generics let you change the type of the value stored inside them.
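Here's a small sketch of both types in use. The first_even and parse_port functions are hypothetical examples of mine, but everything they call comes from the standard library.

```rust
// Hypothetical helpers showing Option and Result from the standard library.

// Option: there may or may not be an even number in the slice.
fn first_even(numbers: &[i32]) -> Option<i32> {
    numbers.iter().copied().find(|n| n % 2 == 0)
}

// Result: parsing either succeeds with a u16 or fails with a message.
fn parse_port(input: &str) -> Result<u16, String> {
    input
        .trim()
        .parse::<u16>()
        .map_err(|e| format!("invalid port {input:?}: {e}"))
}

fn main() {
    match first_even(&[1, 3, 4, 5]) {
        Some(n) => println!("first even: {n}"),
        None => println!("no even numbers"),
    }

    match parse_port("8080") {
        Ok(port) => println!("port: {port}"),
        Err(message) => println!("error: {message}"),
    }
}
```

Because both are ordinary enums, the caller has to match (or otherwise handle) the None and Err members before getting at the value inside. There's no null to forget about and no exception to miss.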