Have you ever wanted to write a structurally typed function in Rust? Do you spend a lot of time and effort getting your Rust
structs just so, and want to DRY-out data access for common field paths without declaring a new
trait and implementing it for each struct (let’s say,
Dog both have a
name: String field)? If so, read on.
This post talks about how we can leverage
LabelledGeneric to build
Path traversers (functionally similar to lenses), and use them to write clean and performant structurally typed functions with all the compile-time safety that you’ve come to expect from Rust.
Re: radio silence
It’s been a while (4 years!) since I last updated this blog. Why?
- I started working on the Cloud SWE team at Elastic (we’re hiring!). I’ve been busy leading project teams, implementing features, and writing (and deleting!) lots of Scala code (no Rust though, sadly 😭)
- My kid gained sentience: Around the same time, my daughter turned 2, and it’s just been a complete whirlwind of activities, learning, viruses, emotions, etc. It’s been awesome and I wouldn’t trade it for the world, but people are DOWNRIGHT LYING if they claim having kids doesn’t change anything.
- 2020: Covid was a big one, but the whole year felt like a trainwreck in slow motion … if the train was pulling dumpster fires.
Lastly, I just didn’t have the oomph to write a post that describes
transmogrify() to follow up on the post on Struct transforms.
Transmogrifier, which allows flexibile recursive transformation between similarly-structured
structs, was added over 2.5 years ago, but writing about it was … intimidating.
Still, I recently decided to try to start writing again, so I picked a topic that’s slightly simpler, but related:
Path, which introduced zero-overhead structurally-typed functions that you could use with normal
structs to stable Rust back in Februrary of 2019 1.
Is the post late? Yes. Better than never? I hope so 🙏
- Structural typing, you say?
- Show me yours
- Quick review of
- Detour: Plucking by labelled field
- Another example
Structural typing, you say?
“Structural typing” was thrown around up there ↑, but what do we mean? To quote Wiki:
A structural type system (or property-based type system) is a major class of type system in which type compatibility and equivalence are determined by the type’s actual structure or definition and not by other characteristics such as its name or place of declaration. Structural systems are used to determine if types are equivalent and whether a type is a subtype of another. It contrasts with nominative systems, where comparisons are based on the names of the types or explicit declarations, and duck typing, in which only the part of the structure accessed at runtime is checked for compatibility.
Out-of-the-box-Rust has nominally typed functions 2 3. For the purposes of this post (and frunk), we specifically mean
structs and their fields when it comes to “structure”4, and not methods that they get from
impls of themselves or
traits. Why? Well, you can’t spell “structural typing without
struct, I’ve been mostly focused on
structs, and … simplicity 😂. Also, to my mind,
traits already enable a kind of part-way “structural typing” of methods 5.
Show me yours
I Read Somewhere ™ that giving a concrete example upfront helps people decide if they want to keep reading (if it aligns with their interests), plus there are lots of movies where the first scene you see is chronologically from the end of the story, followed by a rewinding sound and jump back to the beginning … and Hollywood knows engagement. Anyway, we’ll end up with something that allows us to do write this sort of thing:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
The objects you pass to the
print_pet_name function don’t need to know anything specific to it nor structurally typed functions in general: their struct declarations just need to
derive(LabelledGeneric) and have a structure that complies with the function’s type signature (i.e. have a
pet.name path that returns a
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
That’s it. The API is relatively clean, simple to write, read, and understand (IMO), and there are no
dyn traits anywhere (even in the implementation). And, you can still declare and treat your
structs as you normally would, passing them to nominally typed functions, implementing
traits as you normally would etc.
Still, when used with structurally typed functions like
print_pet_name, the compiler will as usual ensure that:
- The paths accessed on the generic parameter
Ainside the structurally typed function are constrained by the function’s type signature.
LabelledGenericobjects passed as arguments to the structurally typed function support the required path in the function’s type signature.
The functions themselves are not constrained to just getting values, they can also set values too (see the other example at the end of the post)
Quick review of
By adding a
#[derive(LabelledGeneric)] attribute to a struct, like so:
1 2 3 4 5
we gain the ability to turn a
Dog object into a labelled heterogenous list:
1 2 3 4 5 6 7 8 9 10 11
This ability to turn a
struct into a heterogenous List of “fields” (type-level labels and values, henceforth “labelled HList”) paves the way for us to go from nominative typing (does this type have the right name?) to structural typing (does this type have a given structure?).
For a more thorough review of HLists and
LabelledGeneric, see this post.
Detour: Plucking by labelled field
Given a labelled HList, it would be useful to be able to “pluck” a value out of it by using a type-level field name. That would allow us to have compile-time-checked access of a field in a labelled Hlist by type-level name:
1 2 3
This is the equivalent of accessing a specific
.age field on a
Dog struct in the normal Rust Way ™, but we’re doing it our own way on its labelled HList equivalent, using user-declared types and taking advantage of the type system.
The trait would look like this:
1 2 3 4 5 6 7
The implementation of this “by-name-field” Plucker shares much with the normal
Plucker mentioned in the previous post, so instead of re-explaining things like the
Index type param, I’ll simply add a link to that section and show the implementation for the exit and recursion implementations here:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33
In truth, it probably makes sense to re-write the
ByNameFieldPlucker implementation(s) in terms of
Plucker, but this felt somewhat more straightforward when I wrote it at the time for
ByNameFieldPlucker provides us with a way of accessing a field on single struct, but we want to be able to traverse multiple levels of structs. For instance, given the aformentioned
DogPerson structs, Rust allows us to get the age of his dog by doing
dog_person.pet.age, and we’d like to be able to do that structurally. Enter
1 2 3 4 5 6
Index, its second type param is
Indices to reflect the fact that we’re going to need multiple
Indexs to “pluck” by field name from. The “exit” (the last, aka no-more-dots, target field name and value type are on the current struct) and “recurse” (the last target field name and value type are in an “inner” struct) implementations of this trait are as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36
That type signature is a bit hairy.
It’s a bit “Inceptiony” to think about what the
Indices type param might look like at a given callsite, and for the most part it doesn’t matter for users (we make it the compiler’s job to fill it in or error out trying), but for the purposes of trying to understand what’s going on, it’s reasonable to imagine this as the
Indices for structurally accessing
1 2 3 4
The last piece we need is something that allows us to describe a path (e.g.
pet.age). Since the path is going to be itself a type-level thing (reminder: we pluck values by type-level field name), we can model this as a newtype wrapper around the zero-sized
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Paths basically works like “lens”, only without the target type locked down (maybe that will be a future type in frunk…), enabling this sort of thing:
1 2 3 4
That’s all fine and good. From here on though, things get a bit tricky because we need to create friendly ways to declare
T needs to be a type level path, one that needs to be easy to use and compatible with the way
LabelledGeneric encodes field names into type-level strings. Rubber, meet road.
To make declaring value and type level
Paths easy to use, we’ll need to make use of procedural macros because they allow us to take user-defined expressions and turn them into type-level paths made of type-level field names, and doing so with declarative macros is extremely difficult (I gave it a stab) if not impossible.
A core function that is reused for generating value-level and type-value
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
We then pass those through to the
build_label_type function, which translates each
Ident into a type-level name. This is also re-used by
LabelledGeneric’s derivation macro, which is important because it ensures that the way field names are encoded as types for
Paths is compatible with the way field names are encoded as types in
LabelledGeneric-produced labelled HLists.
The macro for creating a
Path value simply instantiates a
Path::new(), but with a type ascription based on what gets returned from
1 2 3 4 5 6 7 8 9 10 11 12
The macro for creating a
Path type simply splices the type returned from
1 2 3 4 5 6 7 8 9
Getting and setting ids of from
structs, without declaring a
SetId trait and implementing it for each type:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73
PathTraverser trait and
Path type build on
HList as core abstractions, which is nice because we get some more mileage out of them, and it means that there are no additional traits that you need to import nor implement (even as a macro).
As usual, it’s compile-time checked, but it’s also performant. In benchmarks, tests comparing
lens_path* (structurally typed traversal) versus
normal_path* (Rust lang built-in traversal) traversals show that they perform the same: in other words, using structural typing in this way adds zero overhead.
As usual, please give it a spin and chime in with any questions, corrections, and suggestions !
Technically, everything for writing basic structurally typed functions minus support for jumping through
.-separated fields was available in frunk since October of 2018 at the latest because
ByNamePluckerwas available already by then.↩
In Rust, macros can and have been used to approximate structural typing (macro arguments aren’t typed, so you can just do something like
$x.access.some.pathand have the compiler expand and fail it if an object at the callsite doesn’t have that path). This is fine too, but macros can be hard to read and maintain (they have no type signature, so you’ll need to look in the implementation/docs to know what it expects), and they aren’t functions; they’re code that write code. Again, The Macro Way is Fine ™; this post just offers an alternative.↩
Rust did at one point have built-in support structural records, but it was removed almost 9 years ago before 1.0 was released. I found an answer to a question on the internal Rust lang forum asking why, and the 3 reasons listed for removal at the time made sense; the
Pathimplementation described here (and implemented in frunk) addresses 1, if not 2, of the 3 issues (field order requirement and recursion IIUC), leaving the issue of field visibility, which I believe can probably be addressed as an option to the
There are some who would call this “row polymorphism”, which is maybe (more) correct, but it’s also a term that is much more niche (pronounced: “less generally known” or “less searched for”). Indeed, depending on whom you ask, “row polymorphism” is regarded as being under the “structural typing” umbrella (1, 2), but in any case, I personally find the distinction to be of questionable value in the context of Rust 🤷♂️. Having said that, feel free to substitute “row polymorphism” in place of “structural typing” when reading this post if it helps you slog through the actual important bits :)↩
traits can be adhoc and auto-implemented, and directly used as constraints in functions (though still nominally), so being structurally-typed on
traits feels a bit less of a problem that needs solving, and I get the feeling that it will be even less so with things like specialization coming down the pipeline, which will allow for more blanket and overlapping