Structural Typing in Rust
Have you ever wanted to write a structurally typed function in Rust? Do you spend a lot of time and effort getting your Rust struct
s just so, and want to DRY-out data access for common field paths without declaring a new trait
and implementing it for each struct (let’s say, Cat
and Dog
both have a name: String
field)? If so, read on.
This post talks about how we can leverage LabelledGeneric
to build Path
traversers (functionally similar to lenses), and use them to write clean and performant structurally typed functions with all the compile-time safety that you’ve come to expect from Rust.
Re: radio silence
It’s been a while (4 years!) since I last updated this blog. Why?
- I started working on the Cloud SWE team at Elastic (we’re hiring!). I’ve been busy leading project teams, implementing features, and writing (and deleting!) lots of Scala code (no Rust though, sadly 😭)
- My kid gained sentience: Around the same time, my daughter turned 2, and it’s just been a complete whirlwind of activities, learning, viruses, emotions, etc. It’s been awesome and I wouldn’t trade it for the world, but people are DOWNRIGHT LYING if they claim having kids doesn’t change anything.
- 2020: Covid was a big one, but the whole year felt like a trainwreck in slow motion … if the train was pulling dumpster fires.
Lastly, I just didn’t have the oomph to write a post that describes transmogrify()
to follow up on the post on Struct transforms. Transmogrifier
, which allows flexibile recursive transformation between similarly-structured structs
, was added over 2.5 years ago, but writing about it was … intimidating.
Still, I recently decided to try to start writing again, so I picked a topic that’s slightly simpler, but related: Path
, which introduced zero-overhead structurally-typed functions that you could use with normal struct
s to stable Rust back in Februrary of 2019 1.
Is the post late? Yes. Better than never? I hope so 🙏
Overview
- Structural typing, you say?
- Show me yours
- Quick review of
LabelledGeneric
- Detour: Plucking by labelled field
PathTraverser
Path
,path!
andPath!
- Another example
- Conclusion
Structural typing, you say?
“Structural typing” was thrown around up there ↑, but what do we mean? To quote Wiki:
A structural type system (or property-based type system) is a major class of type system in which type compatibility and equivalence are determined by the type’s actual structure or definition and not by other characteristics such as its name or place of declaration. Structural systems are used to determine if types are equivalent and whether a type is a subtype of another. It contrasts with nominative systems, where comparisons are based on the names of the types or explicit declarations, and duck typing, in which only the part of the structure accessed at runtime is checked for compatibility.
Out-of-the-box-Rust has nominally typed functions 2 3. For the purposes of this post (and frunk), we specifically mean struct
s and their fields when it comes to “structure”4, and not methods that they get from impl
s of themselves or trait
s. Why? Well, you can’t spell “structural typing without struct
, I’ve been mostly focused on struct
s, and … simplicity 😂. Also, to my mind, trait
s already enable a kind of part-way “structural typing” of methods 5.
Show me yours
I Read Somewhere ™ that giving a concrete example upfront helps people decide if they want to keep reading (if it aligns with their interests), plus there are lots of movies where the first scene you see is chronologically from the end of the story, followed by a rewinding sound and jump back to the beginning … and Hollywood knows engagement. Anyway, we’ll end up with something that allows us to do write this sort of thing:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
The objects you pass to the print_pet_name
function don’t need to know anything specific to it nor structurally typed functions in general: their struct declarations just need to derive(LabelledGeneric)
and have a structure that complies with the function’s type signature (i.e. have a pet.name
path that returns a String
):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 |
|
That’s it. The API is relatively clean, simple to write, read, and understand (IMO), and there are no unsafe
or dyn
traits anywhere (even in the implementation). And, you can still declare and treat your struct
s as you normally would, passing them to nominally typed functions, implementing trait
s as you normally would etc.
Still, when used with structurally typed functions like print_pet_name
, the compiler will as usual ensure that:
- The paths accessed on the generic parameter
A
inside the structurally typed function are constrained by the function’s type signature. - The
LabelledGeneric
objects passed as arguments to the structurally typed function support the required path in the function’s type signature.
The functions themselves are not constrained to just getting values, they can also set values too (see the other example at the end of the post)
Quick review of LabelledGeneric
By adding a #[derive(LabelledGeneric)]
attribute to a struct, like so:
1 2 3 4 5 |
|
we gain the ability to turn a Dog
object into a labelled heterogenous list:
1 2 3 4 5 6 7 8 9 10 11 |
|
This ability to turn a struct
into a heterogenous List of “fields” (type-level labels and values, henceforth “labelled HList”) paves the way for us to go from nominative typing (does this type have the right name?) to structural typing (does this type have a given structure?).
For a more thorough review of HLists and LabelledGeneric
, see this post.
Detour: Plucking by labelled field
Given a labelled HList, it would be useful to be able to “pluck” a value out of it by using a type-level field name. That would allow us to have compile-time-checked access of a field in a labelled Hlist by type-level name:
1 2 3 |
|
This is the equivalent of accessing a specific .age
field on a Dog
struct in the normal Rust Way ™, but we’re doing it our own way on its labelled HList equivalent, using user-declared types and taking advantage of the type system.
The trait would look like this:
1 2 3 4 5 6 7 |
|
The implementation of this “by-name-field” Plucker shares much with the normal Plucker
mentioned in the previous post, so instead of re-explaining things like the Index
type param, I’ll simply add a link to that section and show the implementation for the exit and recursion implementations here:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 |
|
In truth, it probably makes sense to re-write the ByNameFieldPlucker
implementation(s) in terms of Plucker
, but this felt somewhat more straightforward when I wrote it at the time for transmogrify
ing.
PathTraverser
ByNameFieldPlucker
provides us with a way of accessing a field on single struct, but we want to be able to traverse multiple levels of structs. For instance, given the aformentioned Dog
and DogPerson
structs, Rust allows us to get the age of his dog by doing dog_person.pet.age
, and we’d like to be able to do that structurally. Enter PathTraverser
:
1 2 3 4 5 6 |
|
Instead of Index
, its second type param is Indices
to reflect the fact that we’re going to need multiple Index
s to “pluck” by field name from. The “exit” (the last, aka no-more-dots, target field name and value type are on the current struct) and “recurse” (the last target field name and value type are in an “inner” struct) implementations of this trait are as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 |
|
That type signature is a bit hairy.
It’s a bit “Inceptiony” to think about what the Indices
type param might look like at a given callsite, and for the most part it doesn’t matter for users (we make it the compiler’s job to fill it in or error out trying), but for the purposes of trying to understand what’s going on, it’s reasonable to imagine this as the Indices
for structurally accessing dog_person.pet.age
:
1 2 3 4 |
|
Path
, path!
and Path!
The last piece we need is something that allows us to describe a path (e.g. pet.age
). Since the path is going to be itself a type-level thing (reminder: we pluck values by type-level field name), we can model this as a newtype wrapper around the zero-sized PhantomData<T>
type
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
|
Path
s basically works like “lens”, only without the target type locked down (maybe that will be a future type in frunk…), enabling this sort of thing:
1 2 3 4 |
|
That’s all fine and good. From here on though, things get a bit tricky because we need to create friendly ways to declare Path
s, and T
needs to be a type level path, one that needs to be easy to use and compatible with the way LabelledGeneric
encodes field names into type-level strings. Rubber, meet road.
To make declaring value and type level Path
s easy to use, we’ll need to make use of procedural macros because they allow us to take user-defined expressions and turn them into type-level paths made of type-level field names, and doing so with declarative macros is extremely difficult (I gave it a stab) if not impossible.
A core function that is reused for generating value-level and type-value Path
s is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 |
|
Where find_idents_in_expr
is a function turns a path expression like pet.age
into a vector of Ident
identifiers.
We then pass those through to the build_label_type
function, which translates each Ident
into a type-level name. This is also re-used by LabelledGeneric
’s derivation macro, which is important because it ensures that the way field names are encoded as types for Path
s is compatible with the way field names are encoded as types in LabelledGeneric
-produced labelled HLists.
Value-level
The macro for creating a Path
value simply instantiates a Path
using Path::new()
, but with a type ascription based on what gets returned from build_path_type
.
1 2 3 4 5 6 7 8 9 10 11 12 |
|
Type-level
The macro for creating a Path
type simply splices the type returned from build_path_type
.
1 2 3 4 5 6 7 8 9 |
|
Another example
Getting and setting ids of from struct
s, without declaring a GetId
or SetId
trait and implementing it for each type:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 |
|
Conclusion
The PathTraverser
trait and Path
type build on LabelledGeneric
and HList
as core abstractions, which is nice because we get some more mileage out of them, and it means that there are no additional traits that you need to import nor implement (even as a macro).
As usual, it’s compile-time checked, but it’s also performant. In benchmarks, tests comparing lens_path*
(structurally typed traversal) versus normal_path*
(Rust lang built-in traversal) traversals show that they perform the same: in other words, using structural typing in this way adds zero overhead.
As usual, please give it a spin and chime in with any questions, corrections, and suggestions !
Footnotes
-
Technically, everything for writing basic structurally typed functions minus support for jumping through
.
-separated fields was available in frunk since October of 2018 at the latest becauseByNamePlucker
was available already by then.↩ -
In Rust, macros can and have been used to approximate structural typing (macro arguments aren’t typed, so you can just do something like
$x.access.some.path
and have the compiler expand and fail it if an object at the callsite doesn’t have that path). This is fine too, but macros can be hard to read and maintain (they have no type signature, so you’ll need to look in the implementation/docs to know what it expects), and they aren’t functions; they’re code that write code. Again, The Macro Way is Fine ™; this post just offers an alternative.↩ -
Rust did at one point have built-in support structural records, but it was removed almost 9 years ago before 1.0 was released. I found an answer to a question on the internal Rust lang forum asking why, and the 3 reasons listed for removal at the time made sense; the
Path
implementation described here (and implemented in frunk) addresses 1, if not 2, of the 3 issues (field order requirement and recursion IIUC), leaving the issue of field visibility, which I believe can probably be addressed as an option to theLabelledGeneric
derive.↩ -
There are some who would call this “row polymorphism”, which is maybe (more) correct, but it’s also a term that is much more niche (pronounced: “less generally known” or “less searched for”). Indeed, depending on whom you ask, “row polymorphism” is regarded as being under the “structural typing” umbrella (1, 2), but in any case, I personally find the distinction to be of questionable value in the context of Rust 🤷♂️. Having said that, feel free to substitute “row polymorphism” in place of “structural typing” when reading this post if it helps you slog through the actual important bits :)↩
-
trait
s can be adhoc and auto-implemented, and directly used as constraints in functions (though still nominally), so being structurally-typed ontrait
s feels a bit less of a problem that needs solving, and I get the feeling that it will be even less so with things like specialization coming down the pipeline, which will allow for more blanket and overlappingimpl
s.↩