Over the years, every time I switched machines, I noticed it was getting harder and harder for me to get the blog and in particular Octopress 2.0 working on it. It was plagued by incompatibility between system versions, tooling, and dependencies on various levels (OS, Gem, Ruby, etc), but I was also getting more and more out of touch with the Ruby world, having long jumped over to other languages.
Still, I loved this Markdown-based blog, and didn’t think it was time to move to a newer version (Octopress 3.0) or another tool (I’d heard good things about Hugo). I simply didn’t have the time to upgrade or port, nor did I feel the need to: it may be using old versions of things, but at the end of the day, it was generating and deploying simple static HTML files that get served. Finally, this year, I decided to take a stab at containerising it so that I could hopefully easily keep using it for years to come (and lose another excuse to not write..).
I didn’t come up with everything from scratch and followed in the footsteps of those who already did most of the heavy lifting.
I just updated some things here and there:
Dockerfile
To start with the conclusion: the complete Dockerfile lives in the root dir of my Octopress 2.0 project.
I’ll go through the things that are different from the awesome article at Octopress in a Docker Container. Whatever I don’t call out below should be taken as unchanged from that article, so use it as a reference.
The Octopress in a Docker Container article uses Ubuntu, but at version 16.04 (required for its Ruby 2.3 install). I love Ubuntu and think it’s a great choice for an Octopress dev env, but since it’s Jan 2024, I wanted to use the latest Ubuntu LTS release, 22.04, instead (knowing full well that the next LTS is slated for release in a few months..). Hence the base image is ubuntu:22.04.
That brought interesting challenges, mostly stemming from the fact that the default apt-get install
for ruby would be too new for (my) Octopress installation’s dependencies.
There are different ways to install Ruby on a system, but I opted for ruby-build, in particular the standalone install option because it was simple.
The main thing here was installing libssl1.0-dev (I used the RVM PPA), and installing GCC-7 (otherwise I got segfaults using Ruby).
Since my goal was to get this working with an old Octopress blog and I didn’t want to mess around with version conflicts, I ADDed the Gemfile.lock file as well, before RUNning bundle install:
Compared with the reference article, we update Rubygems and lock down the bundler version.
This is mentioned in the Octopress in a Docker Container article as well, but I’ll mention it here too: in order to preview the blog, you need to make a small change to the Rakefile.
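The before/after lines themselves aren’t reproduced above. As a hedged sketch (treat the exact lines as assumptions about how the stock Octopress 2.0 Rakefile spawns its preview server), the change is making rackup bind to all interfaces so the preview is reachable from outside the container:

```ruby
# Sketch only — the exact line in your Rakefile may differ.
# Before: rackup only listens on localhost inside the container
rackupPid = Process.spawn("rackup --port #{server_port}")

# After: bind to 0.0.0.0 so the preview is reachable from the host
rackupPid = Process.spawn("rackup --host 0.0.0.0 --port #{server_port}")
```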
Since it’s 2024, I also wanted to try using a Docker Desktop alternative, and chose Rancher Desktop. Overall, the entire experience was really smooth and in my Octopress usage so far, I haven’t noticed much difference between Rancher Desktop and Docker Desktop, but I’ve only been lightly using the docker CLI.
I did notice that the auto-regenerate-based-on-changes feature of rake preview
worked better (faster, more reliably) with the VZ
emulation mode and virtiofs
volume mount type.
I added a Makefile
to make it simpler for future me to deal with building the image and working with it
This is entirely optional/subjective but I find make start-env
more manageable for starting an Octopress env that has everything mounted properly.
So that’s it: yet another containerised-Octopress-2.0 article, with this entry being the first beachape.com one that was written and published entirely using it.
…structs just so, and want to DRY-out data access for common field paths without declaring a new trait and implementing it for each struct (let’s say, Cat and Dog both have a name: String field)? If so, read on.
This post talks about how we can leverage LabelledGeneric
to build Path
traversers (functionally similar to lenses), and use them to write clean and performant structurally typed functions with all the compile-time safety that you’ve come to expect from Rust.
It’s been a while (4 years!) since I last updated this blog. Why?
Lastly, I just didn’t have the oomph to write a post that describes transmogrify() to follow up on the post on Struct transforms. Transmogrifier, which allows flexible recursive transformation between similarly-structured structs, was added over 2.5 years ago, but writing about it was … intimidating.
Still, I recently decided to try to start writing again, so I picked a topic that’s slightly simpler, but related: Path, which introduced zero-overhead structurally-typed functions that you could use with normal structs to stable Rust back in February of 2019 1.
Is the post late? Yes. Better than never? I hope so 🙏
LabelledGeneric
PathTraverser
Path, path! and Path!
“Structural typing” was thrown around up there ↑, but what do we mean? To quote Wiki:
A structural type system (or property-based type system) is a major class of type system in which type compatibility and equivalence are determined by the type’s actual structure or definition and not by other characteristics such as its name or place of declaration. Structural systems are used to determine if types are equivalent and whether a type is a subtype of another. It contrasts with nominative systems, where comparisons are based on the names of the types or explicit declarations, and duck typing, in which only the part of the structure accessed at runtime is checked for compatibility.
Out-of-the-box Rust has nominally typed functions 2 3. For the purposes of this post (and frunk), we specifically mean structs and their fields when it comes to “structure”4, and not methods that they get from impls of themselves or traits. Why? Well, you can’t spell “structural typing” without struct, I’ve been mostly focused on structs, and … simplicity 😂. Also, to my mind, traits already enable a kind of part-way “structural typing” of methods 5.
I Read Somewhere ™ that giving a concrete example upfront helps people decide if they want to keep reading (if it aligns with their interests), plus there are lots of movies where the first scene you see is chronologically from the end of the story, followed by a rewinding sound and a jump back to the beginning … and Hollywood knows engagement. Anyway, we’ll end up with something that allows us to write this sort of thing:
The objects you pass to the print_pet_name
function don’t need to know anything specific to it nor structurally typed functions in general: their struct declarations just need to derive(LabelledGeneric)
and have a structure that complies with the function’s type signature (i.e. have a pet.name
path that returns a String
):
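The original listing isn’t preserved in this extract, so here is a rough sketch of its shape. The import paths, trait bounds, and struct field layouts below are assumptions (from memory of frunk’s Path API and the names used later in this post), not copy-paste-able code:

```rust
use frunk::LabelledGeneric;        // assumed re-export locations
use frunk::path::PathTraverser;
use frunk_proc_macros::{path, Path};

#[derive(LabelledGeneric)]
struct Dog {
    name: String,
    age: usize,
}

#[derive(LabelledGeneric)]
struct DogPerson {
    name: String,
    pet: Dog,
}

// Structurally typed: works for any LabelledGeneric type that has a
// `pet.name` path yielding a &String.
fn print_pet_name<'a, T, Idx>(o: &'a T)
where
    &'a T: PathTraverser<Path!(pet.name), Idx, TargetValue = &'a String>,
{
    println!("pet name: {}", path!(pet.name).get(o));
}

fn main() {
    let person = DogPerson {
        name: "Joe".to_string(),
        pet: Dog { name: "Rex".to_string(), age: 3 },
    };
    print_pet_name(&person);
}
```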
That’s it. The API is relatively clean, simple to write, read, and understand (IMO), and there are no unsafe
or dyn
traits anywhere (even in the implementation). And, you can still declare and treat your struct
s as you normally would, passing them to nominally typed functions, implementing trait
s as you normally would etc.
Still, when used with structurally typed functions like print_pet_name, the compiler will as usual ensure that:
- Accesses inside the structurally typed function are constrained by the function’s type signature.
- LabelledGeneric objects passed as arguments to the structurally typed function support the required path in the function’s type signature.

The functions themselves are not constrained to just getting values; they can also set values too (see the other example at the end of the post).
LabelledGeneric
By adding a #[derive(LabelledGeneric)]
attribute to a struct, like so:
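Something like this (the field names here are assumptions, chosen to line up with the examples later in the post):

```rust
use frunk::LabelledGeneric; // assuming frunk's re-exported derive

#[derive(LabelledGeneric)]
struct Dog {
    name: String,
    age: usize,
}
```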
we gain the ability to turn a Dog
object into a labelled heterogenous list:
This ability to turn a struct
into a heterogenous List of “fields” (type-level labels and values, henceforth “labelled HList”) paves the way for us to go from nominative typing (does this type have the right name?) to structural typing (does this type have a given structure?).
For a more thorough review of HLists and LabelledGeneric
, see this post.
Given a labelled HList, it would be useful to be able to “pluck” a value out of it by using a type-level field name. That would allow us to have compile-time-checked access of a field in a labelled Hlist by type-level name:
This is the equivalent of accessing a specific .age
field on a Dog
struct in the normal Rust Way ™, but we’re doing it our own way on its labelled HList equivalent, using user-declared types and taking advantage of the type system.
The trait would look like this:
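The listing itself isn’t reproduced above; based on the description, the trait is roughly shaped like this (a sketch, with names following frunk’s ByNameFieldPlucker; the Field import path is an assumption):

```rust
use frunk::labelled::Field; // assumed location of the labelled Field type

/// Pluck a Field out of a labelled HList by its type-level name.
pub trait ByNameFieldPlucker<TargetKey, Index> {
    type TargetValue;
    type Remainder;

    /// Returns the plucked field plus the rest of the HList.
    fn pluck_by_name(self) -> (Field<TargetKey, Self::TargetValue>, Self::Remainder);
}
```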
The implementation of this “by-name-field” Plucker shares much with the normal Plucker mentioned in the previous post, so instead of re-explaining things like the Index type param, I’ll simply add a link to that section and show the exit and recursion implementations here:
In truth, it probably makes sense to re-write the ByNameFieldPlucker
implementation(s) in terms of Plucker
, but this felt somewhat more straightforward when I wrote it at the time for transmogrify
ing.
PathTraverser
ByNameFieldPlucker provides us with a way of accessing a field on a single struct, but we want to be able to traverse multiple levels of structs. For instance, given the aforementioned Dog and DogPerson structs, Rust allows us to get the age of his dog by doing dog_person.pet.age, and we’d like to be able to do that structurally. Enter PathTraverser:
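As a sketch consistent with the description that follows (the second type parameter is a whole list of indices rather than a single Index):

```rust
/// Walk a (possibly nested) labelled structure along `Path`, using `Indices`
/// to locate each field along the way.
pub trait PathTraverser<Path, Indices> {
    type TargetValue;

    fn get(self) -> Self::TargetValue;
}
```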
Instead of Index
, its second type param is Indices
to reflect the fact that we’re going to need multiple Index
s to “pluck” by field name from. The “exit” (the last, aka no-more-dots, target field name and value type are on the current struct) and “recurse” (the last target field name and value type are in an “inner” struct) implementations of this trait are as follows:
That type signature is a bit hairy.
It’s a bit “Inceptiony” to think about what the Indices
type param might look like at a given callsite, and for the most part it doesn’t matter for users (we make it the compiler’s job to fill it in or error out trying), but for the purposes of trying to understand what’s going on, it’s reasonable to imagine this as the Indices
for structurally accessing dog_person.pet.age
:
Path, path! and Path!
The last piece we need is something that allows us to describe a path (e.g. pet.age
). Since the path is going to be itself a type-level thing (reminder: we pluck values by type-level field name), we can model this as a newtype wrapper around the zero-sized PhantomData<T>
type
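A minimal sketch of that newtype:

```rust
use core::marker::PhantomData;

/// A zero-sized, type-level description of a path; `T` encodes the chain of
/// field names (e.g. the type-level encoding of `pet.age`).
pub struct Path<T>(PhantomData<T>);

impl<T> Path<T> {
    pub fn new() -> Path<T> {
        Path(PhantomData)
    }
}
```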
Paths basically work like “lens”, only without the target type locked down (maybe that will be a future type in frunk…), enabling this sort of thing:
That’s all fine and good. From here on though, things get a bit tricky because we need to create friendly ways to declare Path
s, and T
needs to be a type level path, one that needs to be easy to use and compatible with the way LabelledGeneric
encodes field names into type-level strings. Rubber, meet road.
To make declaring value and type level Path
s easy to use, we’ll need to make use of procedural macros because they allow us to take user-defined expressions and turn them into type-level paths made of type-level field names, and doing so with declarative macros is extremely difficult (I gave it a stab) if not impossible.
A core function that is reused for generating value-level and type-level Paths is:
Where find_idents_in_expr is a function that turns a path expression like pet.age into a vector of Ident identifiers.
We then pass those through to the build_label_type
function, which translates each Ident
into a type-level name. This is also re-used by LabelledGeneric
’s derivation macro, which is important because it ensures that the way field names are encoded as types for Path
s is compatible with the way field names are encoded as types in LabelledGeneric
-produced labelled HLists.
The macro for creating a Path
value simply instantiates a Path
using Path::new()
, but with a type ascription based on what gets returned from build_path_type
.
The macro for creating a Path
type simply splices the type returned from build_path_type
.
Getting and setting ids from structs, without declaring a GetId or SetId trait and implementing it for each type:
The PathTraverser
trait and Path
type build on LabelledGeneric
and HList
as core abstractions, which is nice because we get some more mileage out of them, and it means that there are no additional traits that you need to import nor implement (even as a macro).
As usual, it’s compile-time checked, but it’s also performant. In benchmarks, tests comparing lens_path*
(structurally typed traversal) versus normal_path*
(Rust lang built-in traversal) traversals show that they perform the same: in other words, using structural typing in this way adds zero overhead.
As usual, please give it a spin and chime in with any questions, corrections, and suggestions !
Technically, everything for writing basic structurally typed functions minus support for jumping through .
-separated fields was available in frunk since October of 2018 at the latest because ByNamePlucker
was available already by then.↩
In Rust, macros can and have been used to approximate structural typing (macro arguments aren’t typed, so you can just do something like $x.access.some.path and have the compiler expand and fail it if an object at the callsite doesn’t have that path). This is fine too, but macros can be hard to read and maintain (they have no type signature, so you’ll need to look in the implementation/docs to know what they expect), and they aren’t functions; they’re code that writes code. Again, The Macro Way is Fine ™; this post just offers an alternative.↩
Rust did at one point have built-in support for structural records, but it was removed almost 9 years ago before 1.0 was released. I found an answer to a question on the internal Rust lang forum asking why, and the 3 reasons listed for removal at the time made sense; the Path implementation described here (and implemented in frunk) addresses 1, if not 2, of the 3 issues (field order requirement and recursion IIUC), leaving the issue of field visibility, which I believe can probably be addressed as an option to the LabelledGeneric derive.↩
There are some who would call this “row polymorphism”, which is maybe (more) correct, but it’s also a term that is much more niche (pronounced: “less generally known” or “less searched for”). Indeed, depending on whom you ask, “row polymorphism” is regarded as being under the “structural typing” umbrella (1, 2), but in any case, I personally find the distinction to be of questionable value in the context of Rust 🤷♂️. Having said that, feel free to substitute “row polymorphism” in place of “structural typing” when reading this post if it helps you slog through the actual important bits :)↩
traits can be ad hoc and auto-implemented, and directly used as constraints in functions (though still nominally), so being structurally-typed on traits feels a bit less like a problem that needs solving, and I get the feeling that it will be even less so with things like specialization coming down the pipeline, which will allow for more blanket and overlapping impls.↩
This is not an objective language vs language comparison. I’ve written this post as part experience dump, part waymark for other Scala devs who are exploring or thinking of exploring Rust.
I’ve written a few Rust libraries/tools as well as Scala ones. For all intents and purposes, I’m a Scala engineer: I get paid to do it and it’s by far my strongest language. I’ve used Rust in a few of my side projects (libraries and smaller utilities).
On the Scala side, I’m the author of enumeratum, which brings flexible enums and value-enums to Scala as a library. I’ve also dabbled in writing macro-based libraries to make things like Free Monads and Tagless Final nicer to use.
On the Rust side, I’ve written frunk, a Rust functional programming toolbelt that is roughly a port of Shapeless with a bit of cats/scalaz mixed in, which does some pretty funky things with the type system that I’ve blogged about (1, 2, 3, 4). I also wrote a Rust port of requestb.in called rusqbin based on Hyper, and a small WIP async client for Microsoft Cognitive services called cogs.
The dev-environment-setup experience with Rust is amazing. The Rust community has striven to make it super easy to get started with Rust and it shows. Literally one shell command will set everything you need up.
- rustup for managing your Rust toolchains (different versions/channels of Rust)
- cargo for managing your build and for publishing to crates.io, which includes, among other things:
  - a test subcommand for running tests
  - a bench subcommand for running benchmarks
- rustfmt for formatting your code (runs on cargo projects via cargo fmt)
- rustdoc for generating beautiful documentation websites, with code examples in docs runnable as tests (via cargo test)

Coming from Scala, having all of this set up with no fuss right out of the gate is a breath of fresh air and feels like a big win for productivity. I know there are reasons for Scala’s more modular approach, but I think it would be nice if some of this rubbed off on Scala and other languages.
When I first started with Rust, I used IntelliJ and its Rust plugin, but later switched to Visual Studio Code with the Rust plugin, which interfaces very well with the Rust Language Server (installable as a rustup toolchain component). It feels very lightweight, and offers all the assistance I need.
If you lean more towards the functional programming paradigm side of Scala then you’ll probably love the following about Rust’s type system:
- a richer set of primitive numeric types (not just Int; there are i8, i16, i32, i64, isize, as well as u8, u16 …)

Essentially Rust has a lot of the good things about Scala’s type system. One thing currently missing from Rust is first class support for higher-kinded types (HKT), which, to be honest, I don’t miss too much because:
If this still sounds unacceptable, just know that you can get quite far in building reusable abstractions using Rust’s traits + associated types, and BurntSushi’s port of quickcheck is available for writing and enforcing laws.
There are a few interesting things in the pipeline as well:
Adding functionality by using Rust’s traits should be familiar territory if you’ve written typeclass-like stuff in Scala. In fact, Rust’s trait system feels a lot more similar to Haskell’s typeclass system than Scala’s, something which has its pros and cons (no scoping of implementations for a given type, for example). I’ve written an intro/guide to Rust’s trait system in another post.
Both Rust and Scala have local type inference, and overall, they work in pretty much the same way. In both of them, you need to write the types for your function parameters. In Scala, you can leave the return type off and have the compiler infer it for you, in Rust you can’t (if you leave it off, it is assumed to be ()
, unit).
The Rust macro system, while less powerful than Scala’s, is quite useful for keeping your code DRY and importantly, integrates really well with the rest of the language. It is in fact enabled and available out of the box without any additional dependencies/flags.
Compared with Scala’s macros, Rust’s macros feel like a very natural part of the language, and you’ll run into them quite often when reading/using Rust libraries. In Rust code bases, you’ll often see macros declared and used immediately for the purpose of code generation (e.g. deriving trait implementations for a list of numeric types, or for tuples up to N elements), something that Scala users have generally done “out-of-band” by hooking into SBT and using another templating or AST-based tool.
On the other hand, in Scala, the usual refrain is “don’t write macros if you don’t have to”. When I compare the approaches the two languages have taken, I feel that Scala may have been overambitious in terms of giving developers power, thus leading to deprecations of APIs that can’t be maintained due to complexity. Indeed, Scala’s metaprogramming toolkit is going through another reform with the migration to Scalameta.
Because of its simplicity (the macros work based on a series of patterns), Rust’s macro API may feel limiting at first, but if you stick with it, you’ll likely find that you can accomplish more than what you initially thought. For example, the fact that you can build/restructure macro arguments recursively (!) and call the macro again (or even call another macro) is a fairly powerful tool.
Having said that, in addition to the legacy macro system, Rust will soon be getting procedural macros, which are more similar to what Scala devs are used to seeing. You can get a peek of what procedural macros are like by looking at custom derives, which I’ve used to implement derive
for LabelledGeneric
in Rust.
I think it’s not news to anyone that Rust is fast and efficient. The home page of the official site says it runs “blazingly fast” and features “zero-cost abstractions”, and the Rust-faithful loudly trumpeted Rust’s defeat of GCC-C in k-nucleotide a few months ago. Even if you don’t completely buy into the “faster than C” part, it’s not a big jump to say that Rust performance is in the same ballpark as C, or at least, there is no reason for it not to be (yes, language and implementation are different, compilers make a difference, etc.).
I’m particularly impressed by the Rust compiler’s (though I’m not sure if it’s LLVM?) ability to compile abstractions away so that the operations they enable have zero overhead. As a personal anecdote, when I wrote LabelledGeneric in frunk, I expected there to be some performance difference between using that abstraction for conversions between structs versus writing the conversions by hand (using From
). After all, there are non-negligible differences in the Shapeless version of it in Scala land (benchmark code):
To my surprise, Rust manages to compile frunk’s LabelledGeneric-based, non-trivial, multi-step, unoptimised (other than using the stack, no effort was spent) transform between structs into a zero-cost abstraction. That is, using LabelledGeneric for conversion adds zero overhead over writing the transform by hand (benchmark code):
Note: The Rust vs Scala LabelledGeneric
benchmarks are not completely apples-to-apples (the Rust version needs to instantiate new source objects every run because of move semantics), but they illustrate the performance difference between LabelledGeneric-based vs handwritten conversion in the two languages.
Overall, Rust’s syntax is very similar to Scala’s. Sure, there are small adjustments here and there (let and let mut vs val and var, you’ll be using angle brackets instead of square ones, etc), but overall the languages feel very similar because they’re both C-like languages that are heavily inspired by ML.
Scala people will probably rejoice at things like the enum being available (coming soon to Scala via Dotty) as well as partial destructuring (e.g. assuming struct Point { x: i32, y: i32 }, you can do let Point { x, .. } = p;).
There are a handful of things that you’ll miss just from muscle memory in the beginning, but are either implemented as libraries or are done slightly differently, such as lazy values (rust-lazy or lazy-static) and methods such as Option’s foreach
(try if let Some(x) = myOption { /* use x here */ }
instead). Others are just plain missing, such as by-name parameters (not too big of a deal for me), for
/do
comprehensions, and keyword arguments (these last two hurt).
Oh, in Rust, types and traits are named the same way as in Scala, in CamelCase, but identifiers (bindings and methods) use snake_case, which I still find makes code look longer but isn’t a big problem. You’ll find references that can help if you are unsure and you’ll likely pick it up from reading library code anyways.
As with Swift, I haven’t been able to find conclusive evidence nor credit given to suggest that there was any influence from Scala on Rust …
Rust makes working with C as smooth as possible while sticking to its mantra of keeping things safe. For reference take a look at the section in the Rust book that deals with FFI.
The syntax might look familiar to those who have played around with Scala.Native.
Since calling C-code can be unsafe (wrt memory, thread-safety), Rust requires you to wrap your C-calls in unsafe. If you wish to hide this from your users, you can wrap these calls in another function.
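For illustration (this is my own minimal example, not the original snippet): declaring a function from the C standard library and hiding the unsafe call behind a safe wrapper.

```rust
// Declare a function from the C standard library.
extern "C" {
    fn abs(input: i32) -> i32;
}

// Calling into C is unsafe, so wrap it once and expose a safe API.
pub fn c_abs(input: i32) -> i32 {
    unsafe { abs(input) }
}

fn main() {
    println!("abs(-42) according to C: {}", c_abs(-42));
}
```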
Calling Rust code from C is also very smooth, something that Scala Native has yet to implement.
The current “feel” of Rust, and its community (or communities, since libraries/frameworks can have their own) is very welcoming and helpful. It’s also very difficult to quantify so I’ll just list some observations:
- … (the ? syntax for Trys).

In Scala, semicolons are optional and almost everything is an expression and therefore returns a value.
In Rust, semicolons are non-optional and are of significance. Statements that end with semicolons return ()
(unit) and those that do not get turned into expressions and thus return a value.
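An illustration of the same idea (mine, not the post’s original listing):

```rust
fn describe(n: i32) -> &'static str {
    // `if` is an expression: the branch values (no trailing semicolons)
    // become the function's return value.
    if n < 0 {
        "negative"
    } else if n == 0 {
        "zero"
    } else {
        "positive"
    }
}

fn main() {
    let x = {
        let y = 20;
        y + 22 // no semicolon: this block evaluates to 42
    };
    println!("{} is {}", x, describe(x));
}
```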
Rust’s memory/ownership model is, to me, its main killer feature; it gives you tighter control over the way your program consumes memory while maintaining memory-safety, all without having to ship a garbage collector with the runtime. You get to decide whether to pass things by value or by reference as well as mutability of bindings (including when pattern matching).
There is also the matter of where things get allocated. In Scala (and perhaps with most JVM-based languages), there are a set of rules that decide whether something gets put on the stack or on the heap (and thus incurs the future cost of garbage collection). In general, the only things that get allocated on the stack are primitives that do not escape methods as fields of objects, and references to objects, which themselves get allocated on the heap. The runtime environment might do fun tricks, like escape analysis, but overall, you don’t get to choose.
In Rust, you can choose to allocate things on the heap by instantiating them inside (or transferring ownership of them to) data structures such as Box
es or Vec
s, etc. Or you can choose to work with plain values. You get to pick your abstraction based on the cost you want to pay for the features and guarantees they offer, such as safe multi-thread access (this page is a great reference point). Either way, Rust’s ownership system will, at compile time, make sure that you won’t get data races caused by, for instance, modifying naked values in different threads with no access control.
Scala doesn’t give its users the same level of control, so naturally there is some adjustment to be made. However, contrary to the experiences of some others, I didn’t find the ownership stuff too hard to understand and get used to. Having experience with Scala’s rich type system meant that the lifetime annotation stuff was quite easy to come to grips with. Maybe doing C and C++ in CompSci courses in university helped too.
- If you find yourself reaching for .clone()s to get the compiler off your back, maybe you’re doing something not quite right.

Mutability deserves to be mentioned separately. If you’re coming from years of Scala (or pretty much any other language that stresses immutability and referential transparency as the road to enlightenment), writing your first let mut or &mut self can feel dirty.
It took me a while to get used to the idea, but hey, when in Rome, right? If it helps, remember that Rust is focused on speed and efficiency through (near, or actually) zero-cost abstractions and that, thanks to its strict ownership model, data races due to mutability are not a problem.
In Scala, most frameworks that deal with any sort of IO have embraced non-blocking IO by utilising some kind of wrapper data type, such as Future[A]
, Task[A]
, or IO[A]
(usually a Monad), that separates the description of your program from its execution, and identify, by type, the effect of talking with the scary and dirty outside world. This allows you to not block the executing thread when waiting for stuff to happen (such as data to come back) by choosing a suitable execution strategy.
In Rust land, most of the widely-used libraries that I’ve seen, such as the Redis client and Hyper (and all the various things built on it, such as Rusoto, Rocket, etc) are all blocking. While this works okay for stuff like single-user utilities, this is suboptimal for applications that are IO heavy and need to serve a large number of concurrent users because your application’s threads can get tied up just waiting for data, leaving it unable to serve other requests. Or, you end up with potentially huge thread pools (à la old school Java Servlet apps..), which seems to go against Rust’s spirit of efficiency.
Having said that I know that advances are being made in this area:
Also, as of now, it’s painful to transform and return Futures from functions because every transformation causes the concrete type of your object to get chained and tagged with an arbitrary closure type. Since writing the result type is non-optional in Rust, the current solution is to declare your return type as Box<Future<A>>
, but it’s less efficient at runtime because boxed trait objects necessitate dynamic dispatch and heap allocation. Hopefully soon “impl Trait” will be released to address this issue (tracking RFC)
In Rust there are a number of ways to represent Strings. Here are a few:
- String — a runtime string value, with its contents allocated on the heap
- &'a str — a string with a lifetime
- &'static str — a string with a static lifetime (baked into your binary)
- Vec<u8> — …
While I’ve mostly gotten used to this by now and understand the purpose of having each one, I hope the ergonomics initiative can make this situation better to understand, since strings are so ubiquitous. How? I have no idea..maybe I’m just ranting.
Obviously, Scala devs are used to compiling once and running the same binaries everywhere thanks to the JVM (mostly :p). While I don’t expect the same for Rust because it compiles to native machine code, I do wish the cross-compilation tools were better out of the box (for example, like it is in Golang).
At the moment, depending on the target platform, cross-compilation for Rust is a bit involved and there are several options:
- cross, a cargo tool that seems like it automates 2.

My use case is building for my Raspberry Pi and I’ve only tried the first 2, but that last one looks to be the winner here and it would be awesome to see something like that included by default as part of rustup or cargo.
Just a few things I still don’t quite get:
ref?

In my opinion, ref is unnecessarily confusing. From what I can tell, it’s mostly used for binding pointers during pattern matching:
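A small example of the kind of usage meant here (mine, not the original snippet):

```rust
fn main() {
    let maybe_name: Option<String> = Some("Octo".to_string());

    // `ref` binds by reference inside the pattern instead of moving the value.
    match maybe_name {
        Some(ref name) => println!("name is {}", name), // name: &String
        None => println!("no name"),
    }

    // Nothing was moved out, so the original binding is still usable.
    println!("still own: {:?}", maybe_name);
}
```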
&mut

When handing out references of something bound with let mut, why do I need to do &mut instead of just &?
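A minimal illustration of the distinction (again mine, not the original snippet):

```rust
fn bump(counter: &mut i32) {
    *counter += 1;
}

fn main() {
    let mut count = 0;

    // Even though `count` is a `mut` binding, a plain `&count` is an
    // immutable borrow; mutation through a reference needs an explicit `&mut`.
    bump(&mut count);
    println!("{}", count);
}
```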
I somehow managed to code my way into a deadlock when using RWLock
because the lifetime-scoping behaviour of {}
braces when used with pattern matching is, in my opinion, non-intuitive. If you’re interested, more about it in this issue.
I know these things are in the pipeline but I wish they were in Rust yesterday:
- Right now, if you write an implementation for A, it clashes with every other implementation you write. Specialisation should remedy that (tracking RFC)
- do or for comprehension for working with container types (there are libs out there but built-in would be nice)

That concludes my take on what it’s like to use Rust, from a Scala dev’s perspective, one year on, in 2017. Overall I’m very happy that the me a year ago decided to look into Rust. It’s been a fun and exciting ride: for a while it felt like every few months I was getting new toys that I could immediately use: type macros and custom derives were game changers because they made it ergonomic to write Hlist types by hand, and made Generic/LabelledGeneric practical, respectively.
Overall, I believe there are a lot of things in Rust for Scala engineers to like. The community is friendly and diverse so you can easily find a library that interests you to get involved in (shameless plug: contributions to frunk are always welcome). Or, you can do your own side project and write a small system utility or program a microcontroller; online resources are very easy to find. In any case, with Rust, you really can’t say it’s hard to get started !
…pluck() and sculpt(). Although each of those has impressive party tricks of their own, I’d like to share how you can use them to write a reusable, generic function that handles converting between structs that have mis-matched fields and thus different LabelledGeneric representations.
Unlike the last post, this one will be relatively light on recursion and mind-bending type-level stuff; it’s time to sit back and enjoy the fruits of our labour.
Much of this post will make use of Frunk’s types (e.g. HCons, HNil), methods, macros (esp. for describing macro types via the Hlist! type macro), and terminology.
It might be easier to follow along if you add Frunk to your project and play around with it. Frunk is published to Crates.io, so to add it to your list of dependencies, simply put this in your Cargo.toml:
Alternatively, take a look at the published Rustdocs.
Suppose we have a bunch of structs that are similar-ish in terms of their data but ultimately, not necessarily
exactly the same. This means we can’t just use the normal LabelledGeneric
convert_from
method to convert between them.
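The original declarations aren’t preserved here; below is a sketch of the shape being described. The field sets are assumptions, apart from pw_hash, which is mentioned below:

```rust
use frunk::LabelledGeneric;

#[derive(LabelledGeneric)]
struct UserFromDb {
    id: i64,
    name: String,
    email: String,
    pw_hash: String,
}

// A subset of UserFromDb's fields, in a different order.
#[derive(LabelledGeneric)]
struct PresentableUser {
    name: String,
    id: i64,
}

// Another subset, again in its own order.
#[derive(LabelledGeneric)]
struct InternalApiUser {
    email: String,
    id: i64,
    name: String,
}
```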
In our example, PresentableUser
and InternalApiUser
structs have fields that are subsets of the fields in UserFromDb
, and not in the same order either. The scenario is that UserFromDb
is a struct that we get from reading our persistence layer, and the other 2 are types that we use in our application for business logic.
Assuming a flow where we want to be able to go from UserFromDb to either PresentableUser or InternalApiUser, the idea is that we don’t want to be holding on to sensitive data like pw_hash when we don’t need to, thus lowering the risk of accidentally leaking said data (e.g. serialising it by accident, or by rendering it in debug messages, etc).
While we could go about writing Froms by hand for each of these, and for every other time a similar situation arises, that’s quite a lot of boilerplate to write and maintain. Thankfully, we can make use of Frunk’s LabelledGeneric and Sculptor to write a single, reusable generic function.
Note, for a review of:
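The listing is missing from this extract, so here is a sketch of what such a function can look like. The bounds mirror the LabelledGeneric + Sculptor machinery described elsewhere, while the exact method names and re-export paths are assumptions; frunk ships its own version, so treat this as illustrative only:

```rust
use frunk::labelled::LabelledGeneric; // assumed re-export locations
use frunk::hlist::Sculptor;

pub fn transform_from<Source, Target, Indices>(source: Source) -> Target
where
    Source: LabelledGeneric,
    Target: LabelledGeneric,
    <Source as LabelledGeneric>::Repr: Sculptor<<Target as LabelledGeneric>::Repr, Indices>,
{
    let source_repr = <Source as LabelledGeneric>::into(source); // struct -> labelled HList
    let (target_repr, _remainder) = source_repr.sculpt();        // reshape the HList
    <Target as LabelledGeneric>::from(target_repr)               // labelled HList -> struct
}
```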
Not bad. The body of the function is literally 3 lines long :) Now we can do this:
In actuality, Frunk already ships with this function so you can use it out of the box.
Oftentimes, you’ll hear that heterogeneous lists enable developers to write reusable generic functions because they abstract over arity and types, and it might not be obvious exactly what that means on a practical level. The example shown in this post just scratches the surface of what is made possible through HList and LabelledGeneric, and there are definitely more creative usages out there, such as building boilerplate-free (e.g. JSON) codecs (hint: look to Haskell and Scala libs for more).
As usual, please give it a spin and chime in with any questions, corrections, and suggestions !
Getting the type signature right was 99% of the work in implementing pluck and sculpt for HLists in Frunk.
Here’s what I’ve learnt along the way: what works, and what doesn’t work (and why).
As you may already know, Rust eschews the now-mainstream object-oriented model of programming (e.g. in Java, where behaviour for a type is added to the type/interface definition directly) in favour of a typeclass-like approach (e.g. in Haskell where you can ad-hoc add behaviour to a type separate from the type definition itself). Both approaches have their merits, and indeed, some languages, such as Scala, allow for a mix of both.
For those coming from the OOP school of programming, Rust’s system of adding behaviour to types might be daunting to come to grips with. At a glance, it might not be obvious how to get things done, especially when what you want to build goes beyond implementing Debug
or Eq
. If your abstraction has a certain degree of type-level recursiveness, it might be even harder to see the light at the end of the tunnel, and the lack of online material covering that sort of thing doesn’t help.
As a Scala guy with Haskell knowledge, I’m no stranger to typeclasses, but it took me a while and several failed attempts to figure out how to implement the following:
Of course, the type signature of the finished product can be intimidating !
In this post, I’ll briefly introduce Rust’s trait system and present my mental model for writing trait implementations that deal with type-level recursion. To do so, I will go through how pluck()
and sculpt()
were written in Frunk, as well as recount some of my failed approaches so you can learn from my mistakes.
Hopefully, by the end of it, you’ll be able to look at signatures like the one above and not go “WTF”, but rather, “FTW”.
Ok, I may be butchering/making up a term, but by “type-level recursion”, I’m referring to recursive expansions/evaluations of types at compile-time, particularly for the purpose of proving that a certain typeclass instance exists at a function call site. This is distinct from runtime “value”-level recursion that occurs when you call a function that calls itself.
If you’re having trouble understanding the difference:
In Rust, typeclass is spelt trait
, and although that word is somewhat ambiguous and overloaded with different meanings depending on context (e.g. in Scala), I’ll try to stick with it throughout this article. Subsequently, a typeclass instance is called an “implementation” (impl
in code) in Rust.
Here is a basic example of a simple trait and implementation for a type Circle
, taken from the official Rust book.
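A sketch along the lines of that example:

```rust
trait HasArea {
    fn area(&self) -> f64;
}

struct Circle {
    x: f64,
    y: f64,
    radius: f64,
}

impl HasArea for Circle {
    fn area(&self) -> f64 {
        std::f64::consts::PI * (self.radius * self.radius)
    }
}
```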
For comparison, here is the Haskell equivalent
In both of these cases, what we see is
- A trait, HasArea, which describes behaviour (must implement an area function that takes as its first argument the implementing type) for types that want to belong to, or join, it.
- A data type, Circle, which has one purpose: hold data.
- The addition of Circle to the HasArea trait by implementing an instance of the trait, fulfilling the contract by writing the area function.
Sometimes, you’ll want to write trait implementations for data types that have one or more type parameters. In these cases, your trait implementation will likely require that implementations of the trait exist for each of those type parameters.
For example
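A sketch of the kind of implementation being discussed (the original listing isn’t preserved; this version uses the inline-bound style, with a where-clause variant shown a bit further down):

```rust
use std::ops::Add;

#[derive(Debug, PartialEq)]
struct Cup<A> {
    contents: A,
}

// Adding two Cup<A>s requires that an Add implementation exists for A itself.
impl<A: Add<A>> Add<Cup<A>> for Cup<A> {
    type Output = Cup<<A as Add<A>>::Output>;

    fn add(self, other: Cup<A>) -> Self::Output {
        Cup {
            contents: self.contents + other.contents,
        }
    }
}

fn main() {
    let cup_a = Cup { contents: 1 };
    let cup_b = Cup { contents: 2 };
    assert_eq!(cup_a + cup_b, Cup { contents: 3 });
}
```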
Making Cup
part of the Add
typeclass will allow us to call cup_a + cup_b
, which is kind of neat. One thing to take note of here is the Output
associated type. Pay attention to the fact that in our implementation of Add
for Cup
, the type of Output
is Cup<< A as Add<A> >::Output>
, which means that ultimately, the output of Add
ing of 2 Cup<A>
s will depend on what the Output
of Add<A>
is. The < A as Add<A> >
part can be read as “summon the Add<A>
implementation for the type A” (the compiler will do the actual lookup work here; if one doesn’t exist, your code will fail to compile), and the ::Output
following it means “retrieve the associated type, Output, from that implementation”. Let this sink in, because it’s important in order for us to move towards the concept of type-level recursion for traits.
Here is another way to write the same thing: using where clause syntax, so that the restriction goes at the end of the initial type signature in our implementation declaration. This is useful when you have more than 2 or 3 type parameters for your typeclass instance and you have a complex set of restraints. Using where
can help cut down on initial noise.
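The same implementation, reusing the Cup from the sketch above, with the bound moved into a where clause:

```rust
impl<A> Add<Cup<A>> for Cup<A>
where
    A: Add<A>,
{
    type Output = Cup<<A as Add<A>>::Output>;

    fn add(self, other: Cup<A>) -> Self::Output {
        Cup {
            contents: self.contents + other.contents,
        }
    }
}
```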
Here’s another, more general implementation of Add
for Cup
. It’s more general because it lets us add Cup
s of different content types, provided that there exists an Add<B>
implementation for whatever concrete type is bound to A
in any given Cup<A>
.
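A sketch of that more general version (it would replace the previous impl, since the two would overlap):

```rust
impl<A, B> Add<Cup<B>> for Cup<A>
where
    A: Add<B>,
{
    type Output = Cup<<A as Add<B>>::Output>;

    fn add(self, other: Cup<B>) -> Self::Output {
        Cup {
            contents: self.contents + other.contents,
        }
    }
}
```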
By this point, we have covered most of the basic understanding required to write more complex traits and implementations. To recap, they are:
- constraints (e.g. A: Add<A> or where clauses) when writing implementations for generic types
- summoning specific trait implementations with the as syntax (e.g. <A as Display>)
- associated types (Output in the above examples)

For a more thorough introduction to Rust’s trait system, by all means refer to the official Rust docs on traits.
Before going any further, I’d like to provide you with my mental model of how to think about recursion on the type level.
Value-level (runtime) recursion: you write a function that keeps calling itself until an exit condition is met, then returns a value.
Type-level recursion: you write implementations of your trait for exit-types and work-to-be-done types. In order to prove an implementation of your trait exists for a concrete type at a function call site, the compiler will try to look up and expand types recursively until it can figure out a concrete implementation to use, or gives up with an error.
This may not make much sense at the moment, but hopefully it will soon.
Much of this post will make use of Frunk’s types (e.g. HCons, HNil), methods, macros (esp. for describing macro types via the Hlist! type macro), and terminology.
It might be easier to follow along if you add Frunk to your project and play around with it. Frunk is published to Crates.io, so to add it to your list of dependencies, simply put this in your Cargo.toml:
Alternatively, take a look at the published Rustdocs.
Given an HList, how can we write a function that allows us to pluck out a value by type (if the HList
does not contain this type, the compiler should let us know), and also return the rest of the HList
?
Suppose we call this function pluck()
, it should behave like so:
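Behaviour-wise, something along these lines (a sketch using frunk’s hlist! macro; the exact example in the original listing may differ):

```rust
use frunk::hlist;

fn main() {
    let h = hlist![1, "hello", true, 42f32];

    // Pluck the bool out by type; the compiler works out everything else.
    let (t, remainder): (bool, _) = h.pluck();

    assert!(t);
    assert_eq!(remainder, hlist![1, "hello", 42f32]);
}
```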
Our basic logic is fairly simple. Given an HList and a Target type:
1. If the head of the HList is of the Target type, return the head of the HList and the tail of the HList as the remainder in a pair (2 element tuple).
2. Otherwise, hold on to the current head as current_head and call pluck() again on the tail of the current HList with the same Target type (i.e. recursively call 1. with the tail), and store the result in a (tail_target, tail_remainder) pair.
3. Prepend current_head to the remainder from the tail. Return both in a tuple like so: (tail_target, HCons { head: current_head, tail: tail_remainder }).
First, let’s assume we’ll be working with a trait; call it Plucker. For now, let’s also assume that it will be parameterised with 1 type, the target type, and will also have an associated type, Remainder. There isn’t really a hard and fast rule for when you should use type parameters vs associated types, but if you’re interested, you can take a look at this Stackoverflow question because Matthieu offers some great advice.
Personally, I always try to use an associated type when I need to refer to the type from somewhere else (especially recursively; more on this later). However, going with a type parameter is useful when you need to have different implementations of a trait for the same type in different circumstances. We saw this with Add, where the right hand side was a type parameter, RHS, allowing you to declare different Add implementations for the same left-hand-side type and letting the compiler find the correct implementation to use at + call sites depending on the type of thing being added with.
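In sketch form, that first attempt at the trait looks something like:

```rust
// First attempt: parameterised only by the target type, with the remainder
// exposed as an associated type.
pub trait Plucker<Target> {
    type Remainder;

    fn pluck(self) -> (Target, Self::Remainder);
}
```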
The “exit-type” implementation is for when the current head of the HList contains the target type, so let’s jot that down:
Now let’s implement the second piece; the non-trivial part where the target type is not in Head
, but in the Tail
of our HList. I’ll sometimes refer to this as the “work-to-be-done” type.
Looks good right? But if you send that to the compiler, you’ll be hit with this message:
What the Rust compiler is helpfully telling us is that it can’t distinguish between our two implementations, and if we look closely at the types, that is indeed true:
The Plucker<Target>
part is exactly the same, and sure, we’ve used Target
instead of Head
in the for HCons<..>
part in the first case, but simply using different type parameters isn’t enough to distinguish between the two.
Furthermore, note that you can’t use the lack of constraints (or where
clauses) to distinguish between implementations either. This is because the current lack of an implementation for a given type parameter doesn’t mean that it can’t be added later (see this Stackoverflow questions for more details).
Welp, back to the drawing board.
What we’ve learnt is that we need to have another type parameter in order to distinguish the exit-type and the work-to-be-done-type implementations, so let’s add one to Plucker
. Intuitively, we know that we want to have a way to distinguish between “the target is here in the HList” (exit) and “the target is over there in the HList” (recursion), so let’s call our type parameter Index
.
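So the trait gains a second type parameter (sketch):

```rust
pub trait Plucker<Target, Index> {
    type Remainder;

    fn pluck(self) -> (Target, Self::Remainder);
}
```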
Then, let’s add a type to identify the index
for the exit-type implementation. We’ll use an empty enum
because we just want to have a type, and we don’t want it to be available at runtime (ensuring zero runtime cost for our type).
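A sketch of that index type and the exit-type implementation (assuming frunk’s HCons definition and the Plucker trait sketched above are in scope):

```rust
/// Index type meaning "the target is right here, in the head".
pub enum Here {}

// Exit-type implementation: the head of the HList is the target type.
impl<Target, Tail> Plucker<Target, Here> for HCons<Target, Tail> {
    type Remainder = Tail;

    fn pluck(self) -> (Target, Self::Remainder) {
        (self.head, self.tail)
    }
}
```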
What about the work-to-be-done-type? Let’s imagine a scenario where we want to pluck a Target
of type MagicType
(let’s assume it’s declared as struct MagicType
, so a type with a single element in it), and we have the following HList
s to pluck()
from; what would the Index
be?
1. HNil
Trick question: there is no Index, because our target of MagicType isn’t here. The compiler should fail to find an instance/implementation of our trait.
2. hlist![ MagicType ] (this is syntactic sugar for HCons<MagicType, HNil>)
Index would clearly be our Here enum type.
3. hlist![ Foo, MagicType ] (this is syntactic sugar for HCons<Foo, HCons<MagicType, HNil>>)
Index can’t be Here, but we know that in order for the compiler to be satisfied that it can reach our end-type in 1., Here needs to be somewhere inside the type. We can’t just use it as is, though, otherwise we’ll run into the same “conflicting implementation” error as before. So, let’s introduce a new type, There<A>, that has one type parameter. In this case, the Index should resolve to There<Here> because the target type is in the head of the tail.
4. hlist![ Foo, Foo, MagicType ]
Following from 3., Index would have to be There<There<Here>>.
5. hlist![ Foo, Foo, Foo, MagicType ]
What else could Index be but There<There<There<Here>>>?
That looks alright, so let’s give it a go. Since the new type has a type parameter but no real data to associate it with, we’ll need to use the PhantomData trick (discussed in the last post).
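A sketch of There and the work-to-be-done implementation (again assuming frunk’s HCons and the Plucker trait from the earlier sketches):

```rust
use core::marker::PhantomData;

/// Index type meaning "the target is further along, at index T in the tail".
pub struct There<T> {
    _marker: PhantomData<T>,
}

// Work-to-be-done implementation: pluck the target out of the tail, then put
// the current head back on top of the tail's remainder.
impl<Head, Tail, Target, TailIndex> Plucker<Target, There<TailIndex>> for HCons<Head, Tail>
where
    Tail: Plucker<Target, TailIndex>,
{
    type Remainder = HCons<Head, <Tail as Plucker<Target, TailIndex>>::Remainder>;

    fn pluck(self) -> (Target, Self::Remainder) {
        let (target, tail_remainder) = self.tail.pluck();
        (
            target,
            HCons {
                head: self.head,
                tail: tail_remainder,
            },
        )
    }
}
```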
And that’s it, we’ve written implementations of Plucker
for HList
. The implementation for work-to-be-done type is type-recursive in its Index
type as well as its Remainder
associated type. The cool thing is that the compiler is in charge of figuring out what the concrete types should be at any given pluck()
call-site. In fact, you can see from this example in Frunk that the compiler will also happily infer the remainder for us too.
Let’s take a step back and work through what we’ve done.
We’ve declared an implementation of Plucker
for the trivial exit-type (Target
is in the head).
We’ve also declared an implementation for the work-to-be-done type (Target is in the tail). This implementation, however, is dependent on its recursive types of Tail and TailIndex (hint: look at the where clause). Intuitively speaking, an implementation of this type only exists if the current HList’s Tail has either:
1. an exit-type implementation, because the Target type is in the head, or
2. a work-to-be-done implementation of Plucker.

This ultimately means that eventually there has to be a 1. in the tail somewhere.

Let’s try to walk through a mental model of how pluck() gets associated to the proper implementation.
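The call site being traced is along these lines (a sketch; the original listing isn’t preserved):

```rust
use frunk::hlist;

fn main() {
    let h = hlist!["hello", true, 42f32, 1i32];

    // Ask for an f32; the compiler figures out the Index and the remainder.
    let (f, _): (f32, _) = h.pluck();
    assert_eq!(f, 42f32);
}
```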
We’re ignoring the remainder and its type (Rust will figure it out if we use the underscore binding _
), because it isn’t relevant for what we’re about to do.
In the following steps, we’ll substitute concrete types into our implementations where possible; similar to how functions get bound to values during the substitution model of evaluation (normally used for evaluating runtime values). We’ll do this in steps, so it’s possible that in the earlier stages, we don’t quite know the concrete type yet, but we’ll go down the “stack”, and come back up and fill those types in, too, once we know them.
pluck() on Hlist![ &str, bool, f32, i32 ]

Since our Target type (f32) is not in the head, it doesn’t match with the Here case, so we will try to use the work-to-be-done case (Index is There<TailIndex>) and fill in as many types as we can for now. Let’s replace some type parameters with their concrete types where possible.
Concrete types:
- Head → &str
- Tail → Hlist![ bool, f32, i32 ] (remember, this is syntactic sugar for HCons<bool, HCons<f32, HCons<i32, HNil>>>)
- Target → f32 (this doesn’t change)
- Remainder → Don’t know yet, but we already know that the current Head will be in it, since it isn’t the target type. And we know the tail of Remainder will be the remainder from pluck()ing f32 from the tail, so we can reference it as HCons< &str, < Hlist![bool, f32, i32] as Plucker<f32, There<Here>> >::Remainder > for now.
- TailIndex → Don’t know yet, but we’ll find out. Let’s reference it as TailIndex1 for now.
on Hlist![bool, f32, i32]
(Tail
from 1.)
Again, f32
is not in the head of our type, so we know we aren’t going to be working with the exit-type typeclass implementation (e.g., Index
is not Here
yet.)
Concrete types:
Head
→ bool
Tail
→ Hlist![ f32, i32 ]
Target
→ f32
(again, this doesn’t change)Remainder
→ Still don’t know yet, but we do know that bool
will be in it since it isn’t our target. Similar to the previous step, we’ll tentatively call it HCons< bool, < Hlist![ f32, i32] as Plucker<f32, Here> >::Remainder >
TailIndex
→ Don’t know yet, but let’s rename it TailIndex2
for now and fill it in later.pluck()
on Hlist![ f32, i32 ]
(Tail
from 2.)
The head has type f32
and the target type is f32
, so we’ve arrived at the exit-type implementation.
Concrete types:
Head
→ f32
Tail
→ Hlist![ i32 ]
Target
→ f32
!Remainder
→ Since we’ve found our target, we know that Remainder
must be the tail, and thus Hlist![ i32 ]
, or its equivalent HCons< i32, HNil >
Index
→ Here
!Now, that we’ve finally resolved a concrete type for Index
, we can go backwards up the type-level stack and fill in our unknowns:
TailIndex2
- TailIndex2 → Here, which means that its Index is There<Here>
- Remainder → HList![ bool, i32 ]
- TailIndex1 → There<Here>, which means that its Index is There<There<Here>>
- Remainder → HList![ &str, bool, i32 ]
The compiler is thus able to find a trait implementation to pluck()
a f32
out of an Hlist![ &str, bool, f32, i32 ]
that looks like this (with all the type parameters bound to a concrete type):
Whew! That took a while, but I hope it helps illustrate how you can use a mental model similar to the substitution model of evaluation, but with types, in order to prove the existence of implementations for a given type.
By the way, by default, the compiler has a limit on how many levels of recursion/expansion this search for a typeclass instance goes. In my testing, I found this to be 64 levels and verified it to be so by looking at Rust’s source code. If you hit the limit, the compiler blows up, but will helpfully offer you a solution:
So, simply add #![recursion_limit="128"]
to your crate. If you hit the limit again, the compiler will tell you to double the limit again. Ad infinitum.
Great ! Now that we’ve finished with Plucker
, let’s go one level deeper: making use of Plucker
to do something even more interesting; sculpting HList
s !
Here is the basic idea of what we want to be able to do:
Let’s call our trait Sculptor. We should be able to re-use our Plucker trait, which means we’ll work with Targets and Indexes, but there’s more than one of each!
Intuitively, this is the kind of logic that we want:
Given TargetHList (target HList) and SourceHList (source HList), and assuming the types in the former are a subset (not necessarily in order though) of the latter:
1. Pluck a value with the head type of TargetHList from SourceHList:
   - Store the result in a (plucked, remainder) tuple
2. Call sculpt on remainder, passing the tail type of the current TargetHList as the new TargetHList type.
   - Store the result in a (sculpted_tail, sculpted_remainder) tuple
3. Return (HCons { head: plucked, tail: sculpted_tail }, sculpted_remainder)
Note that in 1. we are making use of pluck()
, and there is a recursive call to sculpt()
in 2. Since there is a recursive call to sculpt()
, it means that we need an exit-type as well. Intuitively, we’ll pencil one in:
When the target HList is empty (HNil), return a tuple
(HNil, SourceHList)
Given our logic, let’s assume we want 4 type parameters in our trait. Our trait is a bit more complicated than our Pluck
trait, but not by much. We make use of the same associated-type trick to hold the type of Remainder
to be returned as the 2nd element in our type that will be filled-in when we write instances of the trait.
The instance of Sculptor
for the exit-type should be simple, right?:
Ooops; that didn’t work; our type signature for the trait can’t be fulfilled when implementing our instance! We simply have too many type parameters in our trait, even for the exit-type implementation (try implementing for the recursion case…it’ll become more apparent)
Back to the drawing board.
Let’s collapse our target-related type parameters into a single Target
type parameter and our indices-related type parameters into a single Indices
type parameter in our Sculptor
trait declaration, and rely on the implementations to dictate (specialise) what types they should be (similar to how the Plucker
trait had no mention of There
or Here
).
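Sketched out, the reworked trait is along these lines:

```rust
pub trait Sculptor<Target, Indices> {
    type Remainder;

    fn sculpt(self) -> (Target, Self::Remainder);
}
```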
The exit-type implementation will still be when we have HNil
as the target. Thinking it through further, in the case that we don’t have a HNil
as the target, it’s obvious that Source
can then be literally anything, so we’ll rename its type parameter Source
. Since our intention for Sculptor
is for Indices
to be an HList of Here
or There<A>
(one for each type in our Target
HList), the exit Indices
must therefore be a valid Hlist. Since we don’t need an index to find an empty target, let’s make Indices
HNil
for simplicity.
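A sketch of that exit-type implementation (assuming frunk’s HNil and the Sculptor trait sketched above):

```rust
// An empty target needs nothing plucked out, so any source "sculpts" into
// HNil, and the whole source is left over as the remainder.
impl<Source> Sculptor<HNil, HNil> for Source {
    type Remainder = Source;

    fn sculpt(self) -> (HNil, Self::Remainder) {
        (HNil, self)
    }
}
```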
To figure out the type parameters needed for our work-to-be-done type, let’s work through the logic we laid out earlier.
At minimum, we know we’re writing an instance of Sculptor
for a Source of type HList, and our Target type is also an HList, so we’ll use SHead
and STail
to describe the “Source” HList (so HCons<SHead, STail>
), and THead
and TTail
to denote the “Target” HList (similarly, HCons<THead, TTail>
).
1. Pluck a value with the head type of TargetHList from SourceHList:
   - Store the result in a (plucked, remainder) tuple
Since we need to pluck()
a THead
from our Source HList, we’ll need a type parameter for the first index, so let’s name it IndexHead
. In addition, in order to pluck()
, we need a Plucker
too, so this constraint is needed somewhere in our implementation declaration:
1
|
|
2. Call sculpt() on remainder, passing the tail type of the current TargetHList as the new TargetHList type:
   - Store the result in a (sculpted_tail, sculpted_remainder) tuple
Since we want to sculpt the remainder of calling pluck()
in step 1. into type TTail
(tail of TargetHList
), we’ll need to have an HList of indices for that purpose too, so let’s call it IndexTail
. Note that we don’t need a separate type parameter for the remainder from 1 because we can take advantage of the associated type on Plucker
.
1 2 3 4 5 |
|
3. Return (HCons { head: plucked, tail: sculpted_tail }, sculpted_remainder)
What will the Remainder
type be? It should be the remainder of sculpting the remainder from plucking the head type (THead
) out of the current source HList into TTail
(yeah…)
1
|
|
Putting all these types together with the logic, we have
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
|
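To make the overall shape concrete, here is a minimal, self-contained sketch of the whole Plucker + Sculptor machinery along the lines described above. It deliberately uses standalone definitions rather than Frunk’s actual ones, so names and details differ from the real library:

```rust
use std::marker::PhantomData;

// Minimal HList building blocks
struct HNil;
struct HCons<H, T> {
    head: H,
    tail: T,
}

// Index types that steer which Plucker impl gets picked
struct Here;
struct There<T>(PhantomData<T>);

// Pluck a single value of type `Target` out of an HList.
trait Plucker<Target, Index> {
    type Remainder;
    fn pluck(self) -> (Target, Self::Remainder);
}

// Base case: the target is the head.
impl<Target, Tail> Plucker<Target, Here> for HCons<Target, Tail> {
    type Remainder = Tail;
    fn pluck(self) -> (Target, Tail) {
        (self.head, self.tail)
    }
}

// Recursive case: the target is somewhere in the tail.
impl<Head, Tail, Target, TailIndex> Plucker<Target, There<TailIndex>> for HCons<Head, Tail>
where
    Tail: Plucker<Target, TailIndex>,
{
    type Remainder = HCons<Head, <Tail as Plucker<Target, TailIndex>>::Remainder>;

    fn pluck(self) -> (Target, Self::Remainder) {
        let (target, rest) = self.tail.pluck();
        (target, HCons { head: self.head, tail: rest })
    }
}

// Sculpt a (possibly re-ordered) target HList out of a source HList.
trait Sculptor<Target, Indices> {
    type Remainder;
    fn sculpt(self) -> (Target, Self::Remainder);
}

// Exit case: an empty target can be sculpted out of anything; the whole
// source is the remainder.
impl<Source> Sculptor<HNil, HNil> for Source {
    type Remainder = Source;
    fn sculpt(self) -> (HNil, Source) {
        (HNil, self)
    }
}

// Recursive case: pluck the target's head out of the source, then sculpt the
// target's tail out of what is left over.
impl<THead, TTail, SHead, STail, IndexHead, IndexTail>
    Sculptor<HCons<THead, TTail>, HCons<IndexHead, IndexTail>> for HCons<SHead, STail>
where
    HCons<SHead, STail>: Plucker<THead, IndexHead>,
    <HCons<SHead, STail> as Plucker<THead, IndexHead>>::Remainder: Sculptor<TTail, IndexTail>,
{
    type Remainder =
        <<HCons<SHead, STail> as Plucker<THead, IndexHead>>::Remainder as Sculptor<
            TTail,
            IndexTail,
        >>::Remainder;

    fn sculpt(self) -> (HCons<THead, TTail>, Self::Remainder) {
        let (plucked, remainder): (THead, <Self as Plucker<THead, IndexHead>>::Remainder) =
            self.pluck();
        let (sculpted_tail, sculpted_remainder): (TTail, Self::Remainder) = remainder.sculpt();
        (HCons { head: plucked, tail: sculpted_tail }, sculpted_remainder)
    }
}

fn main() {
    // Source HList: [u8, &str, bool]
    let source = HCons {
        head: 1u8,
        tail: HCons { head: "hello", tail: HCons { head: true, tail: HNil } },
    };
    // Ask for [bool, &str] (a re-ordered subset); the compiler infers the Indices.
    let (target, _remainder): (HCons<bool, HCons<&str, HNil>>, _) = source.sculpt();
    println!("{} {}", target.head, target.tail.head);
}
```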
As you can see, our implementation of Sculptor
is type-recursive in an interesting way, and there are quite a few dependencies that need to be worked out between all the type parameters and the Plucker
trait as well as the Sculptor
trait itself (it appears in the where
after all). Fortunately, the Rust compiler will do that for us (and if need be, tell you to raise the #![recursion_limit]
in your crate).
If you’re not convinced this works, please by all means check out the hlist
module in Frunk, in particular the Sculptor trait.
One last thing: the Plucker
and Sculptor
things aren’t just cute exercises; Plucker
has already paid dividends when modeling Sculptor
, and Sculptor
, well, it’s instrumental in letting us do cool stuff like convert between structs with different LabelledGeneric implementations (to an extent, anyways), and other, even cooler generic functions. We’ll talk more about this in another post.
If you do a search, you’ll find a number of articles on the Interwebs that introduce Rust’s trait system, but not many that go deep into how to use it when you need to do non-trivial type-level recursion in your trait implementations (though how often this need arises is … another topic altogether). I also find that people generally don’t talk about what they did wrong, so I wanted to share my failed approaches as well.
The goal of this post is to hopefully help others who are curious, or have a need to do something similar, as well as to leave notes for myself in case I ever need to revisit this in the future. The mental models for breaking down the problem, defining types, and building up to an implementation might not work for everyone, but they’ve helped me.
Personally, I think it’s awesome that a close-to-the-metal systems programming language like Rust has a powerful enough compiler and type-system to allow for these kinds of techniques. As you can see, we’ve managed to build powerful, reusable abstractions without doing anything unsafe, and we’ve exposed an API that requires just the bare minimum of type annotations; Rust infers the rest :) In any case, I hope this post was useful, and as usual, please chime in with questions and suggestions.
* The Here and There<A> design was largely gleaned from this code. I stand on the shoulders of giants :)
** It goes without saying that these operations need to be type-safe. That is, they are verified by the compiler without using any unsafe tricks that could blow up at runtime.
LabelledGeneric? How does one encode type-level Strings in Rust? What is a labelled HList?
Hold on, let’s take a step back.
In a previous post about implementing Generic
in Rust, I briefly mentioned the fact that Generic
could cause silent failures at runtime if you have 2 structs that are identically shaped type-wise, but have certain fields swapped.
While we can work around this using wrapper types, that solution leaves something to be desired, because, well, more boilerplate adds noise and requires more maintenance.
Ideally, we want to have something like this, where the following works:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
but the following fails at compile-time because the fields are mis-matched (first_name
and last_name
have been swapped):
1 2 3 4 5 6 7 8 9 10 11 |
|
The solution to this sort of problem has been in Shapeless for some time: using HLists where each cell contains not just a value, but instead holds a named field, with each value labelled at the type level.
Let’s take a look at how Frunk implements Field
values and LabelledGeneric
in Rust :)
Frunk is published to Crates.io, so to begin, add the crate to your list of dependencies:
1 2 |
|
Generic
To illustrate the problem, observe that the following 2 structs have the exact same “shape”
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
That is, the Generic representation of their fields is simply HList![&'a str, &'a str, usize]. As a result, when we do the following:
1 2 3 4 5 6 7 8 |
|
Oh no! s_user
has first_name
and last_name
flipped :(
As explained near the end of the post introducing Generic, you can catch this sort of mistake by introducing wrapper types like FirstName<'a>(&'a str)
for each field, but that introduces more boilerplate. This sucks, because Generic
is supposed to help avoid boilerplate!
Can we have our cake and eat it too ?
LabelledGeneric
to the rescueLabelledGeneric
was introduced in v0.1.12 of Frunk to solve this exact problem. This is how you use it.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
|
There isn’t a whole lot different to using LabelledGeneric
vs using Generic
:
- Instead of deriving Generic, derive LabelledGeneric
- Instead of calling convert_from, call labelled_convert_from
These 2 changes buy you a lot more type-safety at compile time, with zero boilerplate. By the way, if you’d like the compiler to automatically “align” the generic representations so that you could instantiate a JumbledUser
from a NewUser
, then stay tuned for a later post ;)
The tl;dr version of how this works is that by deriving LabelledGeneric
, we make the struct an instance of the LabelledGeneric
typeclass. This typeclass is almost identical to the Generic
typeclass, but the derive
does something a bit different with the generic representation of the struct: it isn’t just an HList
wrapping naked values.
Instead, the generic representation will be an HList
where each cell will contain field name information, at the type-level, and conceptually has the following types:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 |
|
This difference in type-level representation is how the compiler knows that one can’t simply convert a NewUser
or SavedUser
into a JumbledUser
via labelled_convert_from
.
Field??
What is Field? It’s simply a container struct that is parameterised by 2 types, and has the following signature:
1
|
|
The first type parameter is Name
and its purpose is to contain a type-level String, and the second type parameter is Type
, which reflects the type of value contained inside the struct.
It may help to think of Field
as an ad-hoc wrapper type.
Field<Name, Type>
The full definition of Field
is currently as follows:
1 2 3 4 |
|
PhantomData
is used to allow us to bind a concrete type to the Name
type parameter in an instance of Field
without actually having it take up any space (for more details on PhantomData, refer to the official docs).
To construct a Field
, Frunk exposes a macro called field!
so that you don’t need to touch PhantomData
yourself.
1 2 3 4 5 |
|
For more information about the field!
macro, please refer to its Rustdoc page. Astute readers will notice the odd (a,g,e)
type used for naming. What is that about ???
In order to represent characters at the type level, Frunk currently uses enum
s that have zero members. This is because empty enums have distinct types, and yet cannot be instantiated at runtime and thus are guaranteed to incur zero cost.
Conceptually, we declare one enum for every character we want to represent:
1 2 3 4 5 6 7 8 9 10 11 |
|
This means that characters outside the English alphanumeric range will need to be specially encoded (the LabelledGeneric
derivation uses unicode, but more on this later), but for the most part, this should suffice for the use case of encoding field names as types.
As you may have guessed, type-level strings are then simply represented as tuple types, hence (a,g,e)
. For the sake of reducing noise, in the rest of this post, we will refer to these name-types without commas and parentheses.
Note: This type-level encoding of strings may change in the future.
Combining the Field
and HList
constructs gets us something else: Records. I believe once upon a time, Rust supported anonymous structs; well, you can get most of that functionality back with Frunk!
1 2 3 4 5 6 7 8 9 10 11 |
|
This kind of thing is sometimes called an “anonymous Record” in Scala (see scala-records, or Shapeless).
In the future, the anonymous Records API in Frunk might be improved. As it stands, it exists mostly for the purpose of LabelledGeneric
and is a bit noisy to use.
Field
and LabelledGeneric
So, what is the relationship between Field
and the LabelledGeneric
typeclass?
Quite simply, the associated Repr
type of an instance of LabelledGeneric
should have the type of an anonymous record (labelled HList
).
So, given the following
1 2 3 4 |
|
This is one possible implementation of LabelledGeneric
for Person
:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
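To make the shape of that concrete, here is a minimal, self-contained sketch of the same idea. It uses simplified stand-ins (a plain tuple instead of an HList for the Repr, plus locally defined Field and name types), so it is illustrative rather than Frunk’s generated code:

```rust
use std::marker::PhantomData;

// Type-level "characters": zero-variant enums can never be instantiated.
#[allow(non_camel_case_types)]
pub enum a {}
#[allow(non_camel_case_types)]
pub enum g {}
#[allow(non_camel_case_types)]
pub enum e {}
#[allow(non_camel_case_types)]
pub enum n {}
#[allow(non_camel_case_types)]
pub enum m {}

// A value tagged, at the type level, with its field name.
pub struct Field<Name, Type> {
    pub value: Type,
    name: PhantomData<Name>,
}

pub trait LabelledGeneric {
    type Repr;
    fn into_repr(self) -> Self::Repr;
    fn from_repr(repr: Self::Repr) -> Self;
}

pub struct Person {
    pub name: String,
    pub age: i32,
}

// The Repr spells out each field's name at the type level: (n, a, m, e) for
// `name` and (a, g, e) for `age`. A struct with those two fields swapped would
// end up with a *different* Repr type, so a generic conversion between the two
// would simply not type-check.
impl LabelledGeneric for Person {
    type Repr = (Field<(n, a, m, e), String>, Field<(a, g, e), i32>);

    fn into_repr(self) -> Self::Repr {
        (
            Field { value: self.name, name: PhantomData },
            Field { value: self.age, name: PhantomData },
        )
    }

    fn from_repr(repr: Self::Repr) -> Self {
        let (name, age) = repr;
        Person { name: name.value, age: age.value }
    }
}
```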
But writing that yourself is tedious and error-prone, so Frunk provides a derivation for you.
LabelledGeneric
derivation is generated
As illustrated earlier, you can do the following to create an instance of LabelledGeneric
for your struct:
1 2 3 4 5 |
|
It generates something conceptually similar to what we had above, so we won’t repeat that here.
That said, there is something special about the way that characters outside the range of the standard English alphabet and digits are handled. For each of those characters, we get the Unicode hexcode and use those digits, sandwiched by _uc
and uc_
delimiters, as the type-level representation.
1 2 3 4 5 6 7 8 |
|
This allows us to effectively represent virtually any legal identifier at the type level, even when the ASCII-only restriction for identifiers is lifted from stable Rust. For more details, take a look at how characters are matched to identifiers here.
In closing, I’d like to stress that all the abstractions and techniques described in this post are type-safe (no casting happening) and thus get fully verified by Rust’s compiler and its strong type system.
As far as I am aware, this is the first implementation of labelled HLists (aka anonymous Records) and LabelledGeneric
in Rust, and I hope this post did a good job of explaining what problems they solve, what they are, how they work, and why you might want to use them. As usual, please give them a go and chime in with questions, comments, ideas, or PRs!
Also, as alluded to in the section introducing LabelledGeneric
, there is a way to automatically match up out-of-order fields. We’ll go through this in another post.
Ever wanted to convert HLists into Structs, or to reuse logic across different types that are structurally identical or very similar (e.g. same data across different domains)? Generic
can help you do that with minimal boilerplate.
Generic
is a way of representing a type in … a generic way. By coding around Generic
, you can write functions that abstract over types and arity, but still have the ability to recover your original type afterwards. This can be a fairly powerful thing.
Thanks to the new Macros 1.1 infrastructure added in Rust 1.15, Frunk comes out of the box with a custom Generic
derivation so that boilerplate is kept to a minimum. Without further ado, let’s dive in to see what Generic can do for us.
Frunk is published to Crates.io, so to begin, add the crate to your list of dependencies:
1 2 |
|
Have an HList
lying around and want to turn it into a Struct with the same shape (maybe you’re using Validated)?
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
This also works the other way too; just pass a struct to into_generic
and get its generic representation.
One usecase for something like this is if you have a bunch of fields that you want to validate “simultaneously”, and you want to transform the end result into a single Struct; this is often the case when you are turning external input (e.g. data coming into your API, a web form, or fields read from a database) into domain objects, and in a previous post I introduced Validated as a way of doing that.
With the introduction of Generic
, that last step of transforming an HList
into your struct gets much simpler:
1 2 3 4 |
|
Sometimes you might have 2 or more types that are structurally the same (e.g. different domains but the same data) and you’d like to convert between them. An example of this might be when you have a model for deserialising from an external API and another one for internal application business logic, and yet another for persistence.
Generic comes with a handy convert_from
method that helps here:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
Another example of where this might be useful is if you want to use different types to represent the same data at different stages (see this post on StackOverflow).
At a glance, Generic
might look magical and dangerous, but really it is no more mysterious than the From
trait in the standard lib; the only difference (for now) is that every Generic
instance is bidirectional (can turn an A
into a Repr
and a Repr
into an A
). If you don’t believe me, just look at the type signatures.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
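If it helps, here is a minimal, self-contained sketch of the idea (deliberately not Frunk’s actual trait or function names, and using a plain tuple where Frunk uses an HList): a bidirectional Generic-style trait, plus a conversion that hops between two types through their shared Repr:

```rust
trait Generic {
    type Repr;
    fn into_repr(self) -> Self::Repr;
    fn from_repr(repr: Self::Repr) -> Self;
}

// Convert between any two types that share the same generic representation.
fn convert_from<A, B>(a: A) -> B
where
    A: Generic,
    B: Generic<Repr = A::Repr>,
{
    B::from_repr(a.into_repr())
}

// Two structurally identical structs from different "domains"...
struct ApiPerson {
    name: String,
    age: i32,
}
struct DomainPerson {
    name: String,
    age: i32,
}

// ...with hand-written (normally derived) instances over the same Repr.
impl Generic for ApiPerson {
    type Repr = (String, i32);
    fn into_repr(self) -> Self::Repr {
        (self.name, self.age)
    }
    fn from_repr(repr: Self::Repr) -> Self {
        let (name, age) = repr;
        ApiPerson { name, age }
    }
}

impl Generic for DomainPerson {
    type Repr = (String, i32);
    fn into_repr(self) -> Self::Repr {
        (self.name, self.age)
    }
    fn from_repr(repr: Self::Repr) -> Self {
        let (name, age) = repr;
        DomainPerson { name, age }
    }
}

fn main() {
    let api = ApiPerson { name: "Joe".to_string(), age: 30 };
    let domain: DomainPerson = convert_from(api);
    println!("{} is {}", domain.name, domain.age);
}
```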
Most of the magic resides in how the custom derive of Generic, made possible by the 1.15 release of Rust, is implemented. If you want to find out more, take a look at the derives
directory of Frunk on Github. In regards to the end-result though, the following:
1 2 3 4 5 6 |
|
Gets expanded at compile-time to something resembling:
1 2 3 4 5 6 7 8 9 10 11 12 |
|
To be clear, the actual expanded code is much gnarlier because we use fully qualified names for the sake of hygiene, and I’ve sugared some things up with their macro-powered equivalents to cut down on noise (namely the HList type signature, pattern matching, and construction).
Someone on Twitter raised the point that if you had mixed up the ordering of the fields in your struct declaration (e.g. last name and first name are swapped between structs), then Generic
would cause silent errors at runtime because the Structs’ shape would be the same, and that implementing From
was more typesafe. With all due respect to that individual, the same could happen even if you hand-wrote your From
implementation and got your field assignments crossed. In the worst case, you’ve now got fields that are not ordered correctly, your From
is wrong, and you’ve got more boilerplate to maintain.
Really, the only way to truly prevent this kind of fat-fingering error is to have wrapper types (like struct FirstName(String)
, etc) for all your fields, in which case Generic
conversion would be foolproof (if you got your field declaration orders wrong, you’d get a compile-time error). Ultimately, how typesafe you want to be is a choice you will need to make while weighing the risk of fat-fingering against the burden of maintaining more code.
I hope you’re now convinced that there is no dirty casting / unsafe stuff going on, so you can rest easy knowing your code is still as type-safe as it would have been if you had gone with something like From
instead.
There are probably many other ways that Generic
can be used to make code nicer (more reusable, DRYer, less noisy), so go ahead and see what you can cook up. As always, please don’t hesitate to get in touch via comments, on Github or on Gitter with suggestions, issues, questions, or PRs.
a systems programming language that runs blazingly fast, prevents segfaults, and guarantees thread safety.
Featuring:
- zero-cost abstractions
- minimal runtime
- efficient C bindings
So, it’s likely that developers who choose to program in Rust are focused on performance. You can make sure your code is efficient by writing benchmarks, but in order to prevent performance regressions, you’ll need to run benchmarks on your Pull Requests or patches and somehow compare before and after. Doing this can be tedious, especially as the changeset evolves over the course of code review or miscellaneous refactoring.
Let’s see how we can get automated benchmark comparisons across commits on Travis CI.
First off, you’ll need to have benchmarks in your codebase. There are a few ways to do this, one of which is:
- Creating a benches directory in your project root, putting your benchmarks there, and running cargo bench (this is how I’ve done it in Frunk); a minimal sketch follows below.
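For instance, using the nightly-only test harness that cargo bench picks up, a benchmark file can look roughly like this (file and function names are just illustrative):

```rust
// benches/example.rs
#![feature(test)]
extern crate test;

use test::Bencher;

#[bench]
fn bench_sum_of_1000(b: &mut Bencher) {
    // The closure passed to iter() is what gets measured.
    b.iter(|| (0..1_000u64).sum::<u64>())
}
```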
Next, in order to run benchmarks on Travis, we’ll need to make sure that your .travis.yml
file has nightly
listed as one of the Rust versions that your project is built with:
1 2 3 |
|
Then, in after_success
, we’ll want the following in order to have benchmarks run when we are on a build that uses Rust nightly
:
1 2 3 4 |
|
Some readers might be wondering why I’m not using travis-cargo
here. The reason is because travis-cargo
doesn’t support arbitrary cargo libraries/commands, which is needed in the next section ;)
So we have benchmarks running automatically on Travis, but what about the before-after comparisons that we talked about earlier? This is where the cargo-benchcmp
library comes into play. benchcmp
is:
A small utility for comparing micro-benchmarks produced by cargo bench. The utility takes as input two sets of micro-benchmarks (one “old” and the other “new”) and shows as output a comparison between each benchmark.
What we’ll want to do next is add a condition to only run these benchmarks when we’re building a Pull Request (henceforth PR), install the benchcmp
tool, and use it:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 |
|
The first conditional is simply to check that the current branch being built is not master. It’s a bit verbose because $TRAVIS_BRANCH
does not always provide the current branch name. So instead, we use ${TRAVIS_PULL_REQUEST_BRANCH:-$TRAVIS_BRANCH}
: $TRAVIS_PULL_REQUEST_BRANCH gives us the current branch if the build was triggered by a PR, and $TRAVIS_BRANCH is the fallback, which gives us the branch name of non-PR builds.
The second conditional checks that the current Travis build is using nightly
, which is a requirement for running benchmarks (as of writing).
Inside the if statement’s body, we first cd
out of our provided directory and clone our project anew. I’m not entirely sure why, but in my testing, I was unable to checkout another branch (e.g. master) otherwise. Next, we run cargo bench
on the master branch, sending the output to benches-control
. Afterwards, we checkout the commit for the current build by using TRAVIS_COMMIT
, and run cargo bench
again, sending the output to benches-variable
.
Lastly, we install and run cargo benchcmp
, passing the path of the control and variable benchmark result files as arguments, letting cargo-benchcmp
do its job.
Oh, we shouldn’t forget to add our script to the after_success
block in our Travis file.
1 2 |
|
Here is some sample output from my Rust functional programming library, Frunk.
The benchmark comparisons show up in the build log.
That’s it. Now, you can go to the Travis build log of your PRs and see how performance has been affected. Please give it a try, and send any questions or feedback. Oh, if you’re interested in a library that does this for you or if you want to turn this into some kind of a service, do let me know ;-)
Rust has a Result<T, E>
type in its standard library. For those not familiar with it, it is a union-like enum type where T
is a type parameter denoting the kind of object held in a Result
in the success case (Result::Ok<T>
), and E
is a type parameter denoting the kind of error object held in the failure case (Result::Err<E>
). In Scala, this is represented in the standard library as Either[+A, +B]
, where the success and error type params are swapped (traditionally, the one on the left stands for error and the one on the right is…well, right).
By default, Result
comes with really good support for what I call “early return on error”. That is, you can use map
, and_then
(flatMap in some other languages) to transform them, and if there’s an error at an intermediate step, the chain returns early with a Result::Err<E>
:
1 2 3 4 5 6 |
|
But .. what happens when you have multiple Result
s that are independent of each other, and you want to accumulate not only their collective success case, but also all their collective errors in the failure case?
Let’s have a look at Validated in Frunk (which is itself inspired by Validated
in Cats)
Frunk is published to Crates.io, so to begin, add the crate to your list of dependencies:
1 2 |
|
By the way, to take a dive into the deep end, jump straight to Validated’s Rustdocs.
Next, let’s add a few imports.
1 2 |
|
Suppose we have a Person
struct defined as follows:
1 2 3 4 5 6 |
|
And, we have 3 methods that produce age, name and email for us, but all could potentially fail with a Nope
error.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 |
|
In real life, these methods would probably be taking an HTML form as an argument and doing some kind of parsing/validation or making calls to a service somewhere, but for simplicity, in our example, each of them takes a single argument that will let us toggle between the success and error cases.
Having set all that up, using Validated
to accumulate our Results
is actually very simple:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
|
As you can see, all we need to do is call into_validated()
on a given Result
to kick off the validation context, and use +
to add subsequent Result
s into it. At the end, you call into_result()
on the Validated
to turn it back into a Result
and map on the HList
that is contained inside. Inside the lambda, we destructure the HList
using the hlist_pat!
macro, and then instantiate our Person
.
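Condensed into a sketch, that flow looks roughly like the following (simplified so that the error type is a plain String; the exact imports and method names are as documented in Validated’s Rustdocs):

```rust
#[macro_use]
extern crate frunk;

use frunk::validated::*;

#[derive(Debug)]
struct Person {
    age: i32,
    name: String,
}

fn get_age(valid: bool) -> Result<i32, String> {
    if valid { Ok(32) } else { Err("No age!".to_string()) }
}

fn get_name(valid: bool) -> Result<String, String> {
    if valid { Ok("Jo".to_string()) } else { Err("No name!".to_string()) }
}

fn main() {
    // Kick off the Validated context with the first Result, then add the rest.
    let validated = get_age(true).into_validated() + get_name(true);

    // Turn it back into a Result and destructure the HList inside the lambda.
    let person = validated
        .into_result()
        .map(|hlist_pat!(age, name)| Person { age, name });

    println!("{:?}", person);
}
```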
Oh, in case it isn’t obvious, the hlist
passed to the lambda when we map is statically typed in the order that your Result
s were added into the Validated
context, so your code is completely type safe. If you want to learn more about HLists in Frunk, check out this blog post.
Having said that, perhaps in the success case, not much has really changed in comparison to using naked Result
s. That is, you could have gotten here simply by chaining with map
and/or and_then
. But take a look at what happens when one or more of these fail:
1 2 3 4 5 6 7 8 9 |
|
As you can see, the failure case is more interesting because Validated
gives us the ability to accumulate all errors cleanly. For operations like parsing user input or checking parameters passed into our API, this non-early-abort behaviour is highly desirable compared with telling the user what went wrong One. Thing. At. A. Time.
Oh, Validated
s can also be appended to each other:
1 2 3 4 5 6 7 8 |
|
Please take Validated
out for a spin and send suggestions, comments, PRs ! I’ve found this abstraction to be helpful in the Scala world so I’m eager to hear impressions from Rustaceans.
Unlike the usual homogeneous collections (Vec, Slice, Array), a heterogenous list is able to hold elements of different types (hence heterogenous) and expose those types in its own type signature.
1 2 |
|
Now, you might be thinking “Isn’t that just a tuple?”. The answer is: in a way. Indeed, in terms of data structure, a given implementation of HList is usually really nothing more than deeply nested pairs (tuple of 2 elements) that each hold an element of arbitrary type in its 1st element and knows that its 2nd element is itself an HList-like thing. While it may seem convoluted, HList buys us the ability to abstract over arity, which turns out to be extremely useful, as you can see from this Stackoverflow answer by Miles Sabin, the creator of the Shapeless library, which provides an HList implementation in Scala.
Given that description and justification for the existence of HLists, let’s take a look at how to use Frunk’s implementation of HList in Rust.
Frunk is published to Crates.io, so to begin, add the crate to your list of dependencies:
1 2 |
|
By the way, to take a dive into the deep end, jump straight to HList’s Rustdocs.
Next, let’s add a few imports. In particular, note that we have a #[macro_use]
directive in order to enable the hlist!
macro, which makes declaring HList
s nicer by saving you the trouble of writing deeply nested HCon
s.
1 2 |
|
Making an HList is easy if you use the hlist!
macro:
1 2 3 4 |
|
Since HLists are a bunch of nested HCons
s, you may think that writing the type annotation for one would be a PITA. Well, it might have been if not for the type-level macros introduced in Rust 1.13.
1 2 3 4 |
|
To retrieve the head element of an HList, use the .head
accessor
1 2 |
|
To retrieve multiple elements, it’s highly recommended to use the hlist_pat!
macro to deconstruct your HList
.
1 2 3 4 5 6 7 8 |
|
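Pulling those pieces together, a small sketch (using the macros named above) looks like this:

```rust
#[macro_use]
extern crate frunk;

fn main() {
    // Each element can be a different type, and the types show up in the HList's type.
    let h = hlist![42i64, "hello", true];

    // Single-element access via .head
    let answer: i64 = h.head;

    // Destructuring via hlist_pat!
    let hlist_pat!(num, greeting, is_great) = h;
    println!("{} {} {} {}", answer, num, greeting, is_great);
}
```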
The Add<RHS>
trait is implemented for HList
so that you can simply call +
to append to an existing HList
1 2 3 4 |
|
To get the length of an HList, simply call its length()
method
1 2 |
|
It will be interesting to see what you can cook up with HList. As mentioned before, abstracting over arity allows you to do some really cool stuff, for example Frunk already uses HList to define a Validated
abstraction to help accumulate errors over many different Result<T, E>
(we’ll go through this in another post):
1 2 3 4 5 6 |
|
So please check it out, take it for a spin, and come back with any ideas, criticisms, and PRs!
Enumeratum 1.4.0 adds ValueEnum, as well as an integration with the Circe JSON library.
Points of interest:
The 1.4.0 release page on Github has a more detailed list of changes, but we’ll specifically go through:
What is a ValueEnum
? It’s an enum that represents a primitive value (e.g. Int
, Long
, Short
) instead of a String
. I may have just made up the term, but it doesn’t matter as long as you know what I mean.
1 2 3 4 5 6 7 8 |
|
This may sound mundane, since you can already build something like this yourself with the standard library’s Enumeration
(or previous versions of Enumeratum ), but sometimes the most straightforward solutions are suboptimal.
Enumeration
The standard lib’s Enumeration
comes with the notion of a customisable id: Int
on each member, which is a great starting point for implementing numbers-based enumerations.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 |
|
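Concretely, the kind of definition that triggers this looks roughly like the sketch below (member names follow the explanation that comes next):

```scala
object Things extends Enumeration {
  val First  = Value(1)
  val Second = Value(2)
  val Third  = Value(3)
  val Fourth = Value(3) // compiles fine, despite the duplicate id…
}

// …but the first access (e.g. Things.First) triggers object initialisation,
// which fails on the duplicate-id assertion; later accesses then throw
// NoClassDefFoundError.
```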
This funny behaviour is caused by the fact that Enumeration#Value
s (First
, Second
, Third
, Fourth
) are not checked for unique ids at compile time, and are instantiated when their outer Enumeration
object is lazily instantiated. When a Value
is instantiated, its id
is stuffed into a HashMap[Int, Value]
after an assertion check that the id does not already exist in the map.
What has happened in the above example is that the enumeration code compiles, but when we call Things.First
, object Things
gets instantiated, and throws an assertion error when val Fourth
is being instantiated with an id of 3, which has already been assigned to Third
and thus is already in the aforementioned HashMap
. This prevents the singleton Things
from getting instantiated, and the next time you try to use it, Scala will throw a NoClassDefFoundError
.
One way to work around this is to write tests for every such Enumeration
to make sure that no one working in the code base has fat-fingered any ids. I’m a big proponent of writing tests, but tests are also code and come with a maintenance and cognitive cost, so I would prefer not having to write tests to make sure my simple value enums can be safely initialised.
This kind of problem is not limited to Enumeration
: careless implementation of something similar may result in arguably freakier outcomes such as silent failures (2 members with the same value but only one of the members can be retrieved by value).
ValueEnum
In version 1.4.0 of Enumeratum, we’ve introduced 3 pairs of traits: IntEnum
and IntEnumEntry
, LongEnum
and LongEnumEntry
, and ShortEnum
and ShortEnumEntry
. As their names suggest, these are value enum traits that allow you to create enums that are backed by Int
, Long
and Short
respectively. Each pair extends ValueEnum
and ValueEnumEntry
. Note that this class hierarchy is a bit extensive for now, and it may be more streamlined in the future.
This is an example of how you would create a Long
based value enum with Play integration (JSON readers and writers, Query string binders, Path binders, Form formatters, etc):
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 |
|
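Stripped of the Play pieces, the core pattern looks roughly like this Int-based sketch (member names are illustrative):

```scala
import enumeratum.values._

sealed abstract class LibraryItem(val value: Int) extends IntEnumEntry

object LibraryItem extends IntEnum[LibraryItem] {

  case object Book     extends LibraryItem(1)
  case object Movie    extends LibraryItem(2)
  case object Magazine extends LibraryItem(3)

  // The macro checks at compile time that each `value` is a literal and unique.
  val values = findValues
}
```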
The findValues
method of ValueEnum
s works similarly to the findValues
method of Enumeratum’s older Enum
, except the macro will ensure that there is a literal value
member or constructor for each enum entry and fails the compilation if more than one member shares the same value.
As the above example demonstrates, there are Play (and standalone Play-JSON) integrations available for this new kind of enum, as well as for UPickle, and Circe.
~~Note that this new feature is not yet available in Scala 2.10 and in the REPL due to Macro expansion differences~~ (update: now works in the REPL and is available for 2.10.x!).
Enumeratum 1.4.0 also adds support for serialising/deserialising to JSON using Circe, an up-and-coming performant and feature-filled JSON library published for both JVM and ScalaJS.
This is how you would use Circe with Enumeratum’s Enum
(integrations for ValueEnum
also exist)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 |
|
Hopefully, Enumeratum’s new ValueEnum
implementations will make development easier and safer for engineers out there who need to use value enumerations. Since uniqueness is checked at compile-time, you can save yourself the trouble of writing a bunch of pedantic tests. Circe is a promising JSON library that was really easy to integrate with and I look forward to taking advantage of the fact that it works on both server side and on the front end.
As always, if you have any problems, questions, suggestions, or better yet, PRs, please do not hesitate to get in touch on Github.
We will build on the foundations from the previous post and continue with the usage of Akka Streams, modeling our application as a series of small transformations that are run asynchronously, with backpressure handled automatically.
Previously, our app could be represented by a somewhat trivial flow chart that nonetheless had all the elements of a useful Akka stream: a Source
, multiple transformations, and controlled side-effecting.
To build our face detector, we will add the following:
Our updated flow chart is as follows (new transformations are highlighted by a light green rectangle):
To convert a given Mat
to a greyscale Mat
, we can make use of the OpenCV method cvtColor
. The only slight niggle is that the method isn’t idempotent: if you try to convert a greyscale image to greyscale, the method will throw. No matter, we can handle that scenario ourselves by detecting the number of channels in the matrix.
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
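For reference, that conversion boils down to something like the sketch below; the exact import paths and colour-conversion constant vary across JavaCV/OpenCV versions, so treat those as assumptions:

```scala
import org.bytedeco.javacpp.opencv_core.Mat
import org.bytedeco.javacpp.opencv_imgproc.{COLOR_BGR2GRAY, cvtColor}

object Greyscaler {

  // Returns a greyscale Mat; if the input already has a single channel, hand it
  // back untouched instead of letting cvtColor throw.
  def toGreyscale(mat: Mat): Mat =
    if (mat.channels() == 1) {
      mat
    } else {
      val grey = new Mat()
      cvtColor(mat, grey, COLOR_BGR2GRAY)
      grey
    }
}
```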
However, since we want to pass the original colour image and the new greyscale image down the pipeline, we’ll make things a bit easier for ourselves by defining a simple WithGreyscale
case class to hold both:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
To find faces in the images in our video feed, we will make use of Haar feature-based cascade classifiers, which are supported directly by OpenCV. Haar Cascade classifiers define how to look at an image and quickly identify any areas in it that are of interest to us. A given classifier definition will usually contain multiple stages, so that a region is considered to test positive if all features in all stages of the definition return positive (thus cascade).
In actual usage, this relies on careful training and tuning of classifier definitions, as well as a combination of clever mathematics and pragmatic optimisation for detection. I will not cover exactly how they work in this tutorial (my understanding is dubious and there is a wealth of information online about them), but the following are a couple of links that really helped me understand the theory behind them and how they work in practice:
OpenCV’s Haar Classifier API (or perhaps JavaCV’s wrapping of it) is fairly straightforward and boils down to:
1. Instantiate a CascadeClassifier, passing in a path to a classifier definition (you can find some here) as a constructor argument
2. Instantiate a RectVector, which is aptly named because it is a wrapper for a native vector of rectangles.
3. Pass the RectVector to the CascadeClassifier’s detectMultiScale along with a greyscale image and some other options (yes, OpenCV will mutate the RectVector you pass in by adding in Rects)
method and wraps the classifier to hold constant values for the classifier options because for our purposes, those won’t be changing on the fly.
1 2 3 4 |
|
1 2 3 4 5 6 7 |
|
1 2 3 4 5 6 7 8 9 10 |
|
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 |
|
To be clear, there is really nothing face-specific in our classifier because what it detects is entirely dependent on the Haar cascade XML file passed to it on construction.
Once we have a list of rectangles that denote where our objects are in the image matrix, the last thing we need to do is draw the rectangles on the original image matrix. OpenCV provides a rectangle
method that takes a Mat
and two points denoting the top left and bottom right corners of a rectangle, and draws the rectangle onto the matrix in-place. Here again, our implementation will clone the matrix first before calling the OpenCV method so as to keep our code easy to reason about.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 |
|
Our FaceDrawer
will expose a drawFaces method that takes a WithGrey with a list of detected Faces and uses the above method to draw rectangles around each face. We’ll also make use of OpenCV’s putText
method to write the word “Face” along with a number right on top of the rectangle.
We’ll hook up all our components in a simple Swing app. To make things a little more interesting, the app will consist of 2 frames:
resources
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 |
|
Notice that once again, the code defining the Akka Flow Graph maps almost one to one to our flow chart.
We now have a face detector that uses OpenCV’s Haar cascade classifier toolbelt and draws rectangles around any identified faces, and we made it by expanding on the Akka Stream foundations laid in the previous post. As before, the code for this tutorial can be found on Github.
In the next post, we’ll expand this further by classifying the faces that we’ve detected as smiling or not using a supervised machine-learning model. We could of course continue to use Haar cascades to identify smiles in our feed (we can simply choose to load a smile Haar cascade classifier file), but what would be the fun in that ? :)
project/plugins.sbt
. Having handled the issue of getting the proper dependencies into a project, we can turn our attention to actually using the libraries to do something cool.
This post is the beginning of a series, where the end goal is to build a smile detector. Akka and OpenCV will be used, with Spark joining later on to complete the buzzwords treble.
A well-rounded and fun first step is to get a video feed from a webcam showing on our screen. To do this, we will cover a variety of things, including how to define a custom Akka Source
, how to use JavaCV, and some basic OpenCV image manipulation utilities.
Many of the OpenCV tutorials floating around on the interwebs use a procedural approach; perhaps because it better fits the programming language of the tutorial, or for performance. In this series of posts, we will instead adopt a stream processing model, specifically in the manner of Reactive Streams.
There are many benefits of using the Reactive Stream model (this blog post, and this slide deck by Roland Kuhn are great places to start reading), but the main ones I feel are relevant for us are:
Simplicity: by turning data processing into a series of simple stateless transformations, your code is easy to maintain, easy to change, and easy to understand: in other words, it becomes agile (relax: your code, not your team…).
Backpressure: Reactive Streams implementations ensure that backpressure (when downstream transforms take too long, upstream is informed so as to not overload your system) is handled automatically
Asynchronous: Reactive Streams are run asynchronously by default, leaving your main thread(s) responsive
In Scala, Akka-Streams is the defacto implementation of the Reactive Streams spec, and although it is labelled experimental, its adoption looks imminent (for example, there is already a Play integration and the innards of Play are being rewritten to use Akka-Http, which is based on Akka-Streams). Another nice Reactive Streams implementation in Scala is Monix, which offers a (subjectively) cleaner interface that is more familiar for people who come from RxScala/RxJava.
For the purposes of this tutorial, we will be using Akka-Streams because it seems to have higher chances of wide-spread adoption.
Note that this tutorial was written based on an experimental version of Akka streams.
Asides from wrapping OpenCV, JavaCV comes with a number of useful classes. One such class is CanvasFrame
, which is a hardware-accelerated Swing Frame implementation for showing images. CanvasFrame
’s .showImage
method accepts a Frame
, which is the exact same type that OpenCVFrameGrabber
(another useful JavaCV class) returns from its .grab()
method.
Before showing the image, we will flip the image so that the feed we see on screen moves in the direction we expect. This requires us to do a simple transformation to a Mat
, a wrapper type for OpenCV’s native matrix, do the actual flipping of the matrix, convert the Mat
back into a Frame
, and then show it on the CanvasFrame
.
In short, our pipeline looks something like this:
As the diagram suggests, the first thing we need is a Source
that produces Frames
; in other words, a Source[Frame]
.
The OpenCVFrameGrabber
API for grabbing frames from a webcam is fairly simple: you instantiate one passing in an Int
for the device id of the webcam (usually 0), optionally pass some settings to it, and then call start
to initialise the grabber. Afterwards, it is simply a matter of calling .grab()
to obtain a Frame
.
1 2 3 4 5 6 7 8 9 10 |
|
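For reference, the grabber part of that boils down to roughly the following (setter names as per JavaCV’s FrameGrabber API; device id 0 is typically the built-in webcam):

```scala
import org.bytedeco.javacv.{Frame, OpenCVFrameGrabber}

object GrabOneFrame extends App {
  val grabber = new OpenCVFrameGrabber(0) // device id
  grabber.setImageWidth(640)              // optional settings, before start()
  grabber.setImageHeight(480)
  grabber.start()

  val frame: Frame = grabber.grab() // blocking call; returns the next Frame
}
```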
In order to create an Akka Source[Frame]
, we will make use of the Akka-provided ActorPublisher
class, which provides helper methods that specifically make it easy to send data only when there is downstream demand (this is how backpressure is automagically handled).
In the actor’s receive method, we match on:
- the Request message type, which we use to then call emitFrames()
- the Continue object, which also calls emitFrames()
- Cancel, in order to know when to stop the actor.
The emitFrames()
method checks whether the Actor is currently active (whether it has any subscribers), and if it is, grabs a frame and sends it to the onNext
helper method from ActorPublisher
to send a piece of data. It then checks if totalDemand
(another ActorPublisher
method) is greater than 0, and sends itself a Continue
message, which invokes emitFrames()
again. This somewhat convoluted way of sending data downstream is required because grabber.grab()
is a blocking call, and we don’t want to block the Actor threadpool for too long at a time (this pattern is used by the built-in InputStreamPublisher
).
In order to make a Source[Frame]
, we instantiate an instance of our actor, pass its ActorRef
to a method that creates a Publisher[Frame]
, and then pass the publisher to a method that makes a Source[Frame]
.
For the purposes of keeping our API clean, we make it a private class and expose only a static method for creating a source.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 |
|
We’ll also define a simple Dimensions
case class to make things a bit clearer (keyword arguments FTW)
1 2 3 4 |
|
In order to begin processing our feed with OpenCV, we first need to transform our Frame
, which is a JavaCV type, into a type that works with JavaCV’s wrapping of OpenCV’s main representation of images, the matrix, aka Mat
. Fortunately, JavaCV has a OpenCVFrameConverter.ToMat
helper class that helps us do this. Since the class uses a mutable private field for holding on to temporary results, it normally isn’t advisable to use it in multithreaded code unless we make new copies of it each time, but we can make it thread safe by binding it to a ThreadLocal
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
|
Once we have our Mat
, we can use OpenCV methods to do manipulation. One thing though, is that (perhaps for efficiency) by default, these methods mutate the original object. This can cause strange issues in a multi-threaded, multi-path Flow graph, so instead of using them as is, we make use of the convenient clone
method before doing our flip so that the original matrix remains as-is.
1 2 3 4 5 6 7 8 9 10 11 12 |
|
Now that we have all our components, all we need to do is create a simple application that instantiates all our components and hooks them all together:
- an ActorSystem and Materializer
- a CanvasFrame
- a Source[Frame]
- a Graph built by using our components to transform it
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 |
|
Looking at the code, one of the rewards of using the stream processing model over the procedural approach might jump out at you: the near 1 to 1 correspondence that the graph definition has with our earlier diagram.
So, with that we should now have a very simple app that shows what your webcam sees, flipped so that when you move left, the image moves with you. We’ve done it by declaring a custom Akka Stream Source
and transforming it a little bit before shoving it onto the screen.
In the next post, we will look at how to do something a bit more complex: face detection using OpenCV.
Note the code for this post is on Github
JavaCV, written by Bytedeco, is a library that makes it more bearable to use OpenCV from JVM projects by providing a bunch of wrapper classes and logic around OpenCV (there’s a lot more to it, see their page for details).
Still, because JavaCV depends on JavaCPP for common and OpenCV C++ wrappers, and JavaCPP requires you to set your target platform (what platform you want to run on), I thought getting started could be easier still.
After taking a look at this Github project, I created an SBT plugin, SBT-OpenCV, that allows you to add just one line to your project/plugins.sbt
to begin playing around with OpenCV:
1
|
|
The following is a list of SBT setting keys that you can set in order to customise the behaviour of the plugin:
1 2 3 4 |
|
I think javaCVPlatform
is the one that will be most interesting, since you may want to compile JARs for different target platforms; for a list of supported strings, look at the classifiers supported by JavaCPP presets, or work out the different strings that can result from the JavaCPP Loader.
For example:
1
|
|
Feel free to try it out and submit issues, ideas, and PRs at the Github page :)
After looking around I began suspecting that Play comes with the ability to be slimmed down. By combining the String Interpolating Routing DSL and Compile-time dependency injection of Play 2.4, I was able to build a Scala app that would give Sinatra a run for its money in terms of the whole brevity thing.
All I did was:
$ activator new slim-play play-scala
)AppLoader.scala
file in the ./app
directory, which holds an ApplicationLoader and the router, which is
super simple:
import play.api.ApplicationLoader.Context
import play.api._
import play.api.libs.concurrent.Execution.Implicits._
import play.api.mvc.Results._
import play.api.mvc._
import play.api.routing.Router
import play.api.routing.sird._
import scala.concurrent.Future
class AppLoader extends ApplicationLoader {
def load(context: Context) = new BuiltInComponentsFromContext(context) {
/**
* Simple & fairly self-explanatory router
*/
val router = Router.from {
// Essentially copied verbatim from the SIRD example
case GET(p"/hello/$to") => Action {
Ok(s"Hello $to")
}
/*
Use Action.async to return a Future result (sqrt can be intense :P)
Note the use of double(num) to bind only numbers (built-in :)
*/
case GET(p"/sqrt/${double(num)}") => Action.async {
Future {
Ok(Math.sqrt(num).toString)
}
}
}
}.application
}
4. Add play.application.loader=AppLoader
to ./conf/application.conf
so that Play knows to load our custom app (that
contains our simple router)
The end result is a small, one-file Play app powered by a custom router and compile-time dependency injection. For more information, take a look at the slim-play repo on Github.
Play is an awesome framework; scalable, idiomatic (type-safe, threadsafe), well documented, and well supported by Typesafe and a great community. I’ve been happily using it to build various-sized apps for the better part of 2.5 years. If you want to have a well-structured app, it comes out of the box configured to provide that. However, it also has the surprising ability to shed weight and turn into a slim API-focused engine.
Ruby is fairly ubiquitous when it comes to server-side web programming. Rails aside, Sinatra has made its mark on the world and made a name for itself as the DSL to mimic, with imitators in Ruby (Cuba), Python (Bottle, Flask), PHP (Laravel), Scala (Scalatra and its wrapper Skinny), and Javascript (Express). Thanks to its simple and easy to follow DSL routing, it’s gained a large following as well.
That said, blindly copying Sinatra’s DSL in other languages may be problematic, because Sinatra’s DSL relies on the Rack execution model (one request at a time per process/thread), and embraces Ruby’s spirit of developer happiness at the cost of performance. This is especially true in Scala, where the language was designed for concurrency and the community places heavy emphasis on adhering to a non-blocking execution model, eschewing mutation of data.
For example, I filed an issue with Scalatra a few months ago that was largely caused by indiscriminate copying of Sinatra’s DSL, as well as being based on the Servlet async API (an intro to why we should move away from Servlets). Among other things, it led to route bodies being allowed to evaluate to Any as the result of a route definition, which could mean anything, including…yes, shutting down the Servlet container. In addition, it encourages you to mutate existing data (setting statuses on responses).
Enumeration
that’s provided out-of-the-box. This is especially true if you have colleagues who come from a Java background and yearn for the Java-style Enum
that gave them lots of power and flexibility.
A quick search on the internet for “Scala enumeration alternative” will yield a lot of results (perhaps on StackOverflow) where people have cooked up their own implementation of enumerations, usually built on sealed traits
. Personally, I found most of them to be either too inconvenient to use, too over-powered, or too complicated, and I really didn’t want to have to copy-paste enum-related code into all my projects.
Thus Enumeratum was born.
Enumeratum aims to be simple to use, idiomatic, small (LoC), yet flexible enough to allow Scala devs to make power enums if they so wish. It is also Mavenised for easy import into any project.
To use it, simply add it as a dependency
1 2 3 4 5 |
|
Then
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 |
|
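The basic pattern looks roughly like this sketch (following Enumeratum’s documentation; in recent versions, entries extend EnumEntry):

```scala
import enumeratum._

sealed trait Greeting extends EnumEntry

object Greeting extends Enum[Greeting] {

  // findValues is a macro that collects the members declared in this object,
  // so there is no list of values to maintain by hand.
  val values = findValues

  case object Hello   extends Greeting
  case object GoodBye extends Greeting
  case object Hi      extends Greeting
}

// Greeting.withName("Hello") // => Hello
```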
You get nice things like exhaustive match warnings at compile-time, enums with methods, no more Enum-value type erasure, and other nice stuff.
Some of the solutions for custom enums out there are based on macros that reflect at compile time using knownDirectSubclasses
to find enum values, but as of writing, there is a 2 year old bug for that method.
As a result, Enumeratum uses another method of finding enum values: looking in an enclosed object to find the enum values. The macro behind findValues
does this for you so that you don’t have to maintain your own collection of enum values, which is both error-prone and tedious.
If you want to use Enumeratum in a Play app, you may as well add enumeratum-play
as a dependency instead so that you can use the PlayEnum[A]
trait (instead of Enum[A]
), which will give you nice things like QueryStringBinders, PathBinders, form mappers, and Json Reads/Writes/Formats. To make use of this integration, just extend from PlayEnum
instead of Enum
in the above example.
This means less boilerplate in your project, which is A Good Thing, right?
There are a few limitations with Enumeratum:
Ordering
in your companion object for your sealed trait.~~val myPhone: Phone = Iphone
)withName
relies on the toString
method of the Enum values for lookup. Make sure to override this if you have specific requirements.~~Update 2016/04/22 Crossed out a bunch of limitations that no longer apply.
I hope Enumeratum can help you out of your Enumeration
woes. Have a look, play around, and send a PR or two !
In Scala, things are not so simple, but with the introduction of quasiquotes and some refinements brought by Scala 2.11, things are smoother. Still, for a guy like me, the documentation was both sparse and DRY. Since I learn best when I’m actively engaged in building something, I decided to try writing the run-of-the-mill unless-when macros in Scala.
This post aims to summarise my journey towards implementing unless-when and hopefully along the way make Scala macros accessible, at least at an introductory level, for Most People. There are already a few Scala macro blog posts out there but another one can’t hurt.
Note: this blog post aims to explore macros as they are usable in Scala 2.10+. It also focuses on implementing macros with quasiquotes, as using them is more human-friendly than manually constructing Abstract Syntax Trees (AST).
For those unfamiliar with when
and unless
: the basic idea is that when
is an if
without an else, and unless
is its opposite. The main reason for their existence is to make code more readable by adding a tiny bit of syntactic sugar. Without further ado, an example of what we want to achieve
1 2 3 4 5 6 7 8 9 10 11 12 13 |
|
Since we’re writing Scala, it would be nice if these constructs returned something useful; using the Option monad seems reasonable: If the block is run, we return the result in a Some and otherwise return a None. This tutorial is a good guide for Options in case you are unfamiliar with the concept.
Taking a look at the documentation, you will quickly notice the general pattern for implementing a simple Scala macro
1 2 3 4 5 6 7 8 9 10 |
|
What does this mean? Let’s break it down:
import scala.language.experimental.macros
and import scala.reflect.macros._
are standard Scala imports that allow us to play around with macros. What’s not listed in this example is the declaration that your project depends on scala-reflect
. You can do so by adding the following to your build.sbt:
libraryDependencies ++= Seq("org.scala-lang" % "scala-reflect" % scalaVersion.value)
def meth[A](x: A): A
this is still just normal Scala code that we would normally see. It simply declares a method belonging to the Example singleton that is parameterised on the input type, and we want to make sure that the output type matches this type (e.g. if we invoke meth
with an Int
, we expect the output to be an Int
because that is the contract of the method). For more info on parametric polymorphism, please check out this guide.
macro implRef[A]
this is where things start looking macro-ish. The macro
keyword lets the compiler know that the body of this method is going to be implemented via a macro definition, in this case implRef
.def implRef[A: c.WeakTypeTag](c: Context)(x: c.Expr[A]): c.Expr[A]
.. wow. This itself needs to be broken down:
def implRef[A: c.WeakTypeTag]
The first part def implRef
is still standard Scala(c: Context)
(we’ll cover [A: c.WeakTypeTag]
in a bit). In this part, (c: Context)
declares that the first argument passed to the macro implementation must be a Context. This is a requirement for playing around with Scala macros, and is actually passed by the compiler when it invokes macro expansion, so that you can write code that accesses the compiler API.[A: c.WeakTypeTag]
This is a bit mischievous because we combine Scala-shorthand for typeclasses with macro-magic. This probably deserves a post in and of itself, but for now, please consider this to mean “A is a type parameter passed during macro invocation, but we must ALSO have in scope a WeakTypeTag coming from the Context that is parameterised to type A, which can be written in full as c.WeakTypeTag[A]”. This WeakTypeTag business is required so that we can pass along the type parameter from meth
into the implRef
macro expansion implementation, allowing us to have a type parameterised macro definition.
(x: c.Expr[A])
means that the first non-Context parameter of the macro implementation (remember that the first one is always taken by the compiler and must be a Context) is x
and it is a c.Expr[A]
. It is important that the name of the parameter matches that used in the invoking method (see how meth
also has x
as the first parameter). c.Expr
is type of object that wraps the abstract syntax tree that represents the input to the invoking function, and it is typed to A.
c.Expr
(essentially an abstract syntax tree), any expression passed to the method meth
actually may not get invoked or evaluated even though it is not a pass-by-name parameter. In other words, while the macro is expanding, it acts like a pass-by name parameter and is “lazy”.: c.Expr[A]
all this means is that the result of the macro expansion is also a c.Expr
type parameterised to A.Quasiquotes are not a Scala-exclusive construct, and a Google search will show that they are used in other languages that support metaprogramming, like Scheme.
In short, they offer the macro programmer an easy way to manipulate or create abstract syntax trees without having to build them manually. This makes them extremely helpful in Scala because:
1. Scala syntax does not map to ASTs easily like Lisps
2. Scala is typed, which means your manually-built AST also needs typing…which wraps non-macro-land types (notice how a normal type parameter like [A]
becomes c.Expr[A]
… that’s twice as many characters !)
Quasiquotes allow us to use string-interpolation-like syntax to interpolate elements into a tree as we define it.
For example:
1 2 3 4 5 |
|
The above example was taken from the official documentation on quasiquotes, which I highly recommend you take a look at if you find the rest of this post hard to follow.
For when
, we know that we roughly want the following:
1
|
|
To expand via our macro into the following (yes we are using an inline if .. if you don’t like it, pretend we didn’t)
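In other words, roughly this one-liner (here I am assuming when evaluates to an Option, so the false branch has a sensible value):

```scala
if (p) Some(doSomething()) else None
```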
Using what we know, the following should work:
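Here is a sketch under those assumptions. The object name Preconditions is mine, the Option return type is the assumption from above, and the quasiquote syntax assumes Scala 2.11, where quasiquotes ship with scala-reflect:

```scala
import scala.language.experimental.macros
import scala.reflect.macros.blackbox.Context

object Preconditions {

  // User-facing method: note that the block f is a plain (not by-name) parameter
  def when[A](p: Boolean)(f: A): Option[A] = macro whenImpl[A]

  // Macro implementation: receives the ASTs for p and f and never evaluates them,
  // it only interpolates them into the tree we want to generate
  def whenImpl[A: c.WeakTypeTag](c: Context)(p: c.Expr[Boolean])(f: c.Expr[A]): c.Expr[Option[A]] = {
    import c.universe._
    c.Expr[Option[A]](q"if (${p.tree}) Some(${f.tree}) else None")
  }
}
```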
Implementing unless
is left as an exercise for the reader :)
Putting the above into a Scala REPL (you will probably need to use :paste
mode) will prove that it works.
For example:
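Assuming the Option-returning version sketched above, a REPL session might look roughly like this:

```
scala> when(1 + 1 == 2) { println("evaluated!"); "maths works" }
evaluated!
res0: Option[String] = Some(maths works)

scala> when(1 + 1 == 5) { println("this never runs"); "maths is broken" }
res1: Option[String] = None
```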
Also, remember that since our when is backed by a macro, the f argument (our block) passed in the second parameter list behaves "lazily" and won't execute if our predicate p returns false. This is because when when is invoked, the compiler knows to pass the entire AST for that block parameter (well, wrapped inside a c.Expr) to our macro, which interpolates it into the final tree.
For the performance-conscious, this means that we get “lazy” for free; that is, without using Scala’s call-by-name parameter feature, which, although nice to use in many cases, does incur some run-time performance penalty because it is implemented by instantiating anonymous classes (see this paper for more information about the performance cost of call-by-name parameters .. among other performance-related Scala things).
I’ve put the above into a library and included trailing variants of when
and unless
as bonuses (Rubyists should be familiar with these).
You can find the lib here on Github. It is fully tested and Mavenised for easy out-of-the-box usage.
I hope this post has been helpful in giving a simple, but full example of how to get started with macros in Scala. If you spot any errors, have questions or suggestions, please feel free to leave a comment!
]]>The original version of Schwatcher allowed you to tell a MonitorActor
what callback you want to fire when a certain type of event happened on a file path. This is fine and there are people out there using it in production as is. The limitation to this approach is that (at least by default), the events are difficult to treat as data and thus difficult to compose.
With Rx, we turn file path events into an asynchronous stream/channel. Essentially, you tell an RxMonitor object what path and event type you want to monitor and, when an event happens, it will get pushed into its observable (the stream). You can then choose to filter, map, or fold over this data stream, creating new data streams. If you wish to cause side effects, you can add one or more observers to these data streams.
Note: this blog post applies to v0.1.3 of Schwatcher, which uses v0.18.1 of RxScala. Future versions may introduce breaking changes that invalidate the examples in this blog post.
Suppose we have the following directory structure:
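Something like this (the parent directory here is arbitrary, pick whatever is convenient):

```
/tmp
└── directory1
```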
Let's set up an RxMonitor object to monitor for file creation and modification events in directory1 (note: all operations on RxMonitor objects are thread-safe). While we're at it, let's grab the base observable from the monitor as well. Note that this Observable will, according to the registerPath and unregisterPath calls made to its parent RxMonitor, push all EventAtPaths to its Observers. More on what an Observer is later, but for now, think of an Observable as a data stream and an Observer as an object that gets pushed new objects from the Observable that it is, well, observing.
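A sketch of that setup; the import path and the RxMonitor() factory follow Schwatcher's README of that era, so double-check them against the version you actually pull in:

```scala
import com.beachape.filemanagement.RxMonitor
import java.nio.file.Paths
import java.nio.file.StandardWatchEventKinds._

val monitor    = RxMonitor()
val observable = monitor.observable

val directory1 = Paths.get("/tmp/directory1") // wherever directory1 lives on your machine

// Register the directory for both creation and modification events
monitor.registerPath(ENTRY_CREATE, directory1)
monitor.registerPath(ENTRY_MODIFY, directory1)
```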
Let's create two more Observables: one called createsOnly that only cares about create events in the directory, and another called scalaSourceCreatesOnly that only cares about create events for files ending in .scala. Notice that we're composing here :)
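Assuming EventAtPath exposes its event kind and path as fields named event and path (my guess at the field names), the two new Observables are just filters over the base one:

```scala
val createsOnly            = observable.filter(_.event == ENTRY_CREATE)
val scalaSourceCreatesOnly = createsOnly.filter(_.path.toString.endsWith(".scala"))
```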
Now, let's create some basic Observers that we can pass to the subscribe method of our new Observables. An Observer at minimum implements an onNext function, which takes an element pushed to it from the Observable it subscribes to and returns nothing (Unit). It may optionally implement onError (a function that takes a Throwable as an argument and returns nothing) and onCompleted (a zero-argument function that is called when the Observable it is subscribed to is finished and will no longer send further objects):
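With RxScala you can hand the three handler functions straight to subscribe; a sketch:

```scala
val createsSubscription = createsOnly.subscribe(
  onNext      = { event => println(s"Something was created: $event") },
  onError     = { t => println(s"Something went wrong: $t") },
  onCompleted = { () => println("The monitor has shut down") }
)

val scalaSubscription = scalaSourceCreatesOnly.subscribe(
  onNext      = { event => println(s"A new Scala source file appeared: $event") },
  onError     = { t => println(s"Something went wrong: $t") },
  onCompleted = { () => println("The monitor has shut down") }
)
```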
Now let’s make stuff happen in another terminal.
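For example (using the directory from the structure above):

```
$ cd /tmp/directory1
$ touch hello.txt
$ echo "bonjour" >> hello.txt
$ touch HelloWorld.scala
```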
The following will be output:
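With the Observers sketched above, you would see output roughly like the following (the exact strings depend on your onNext handlers and on EventAtPath's toString):

```
Something was created: EventAtPath(ENTRY_CREATE,/tmp/directory1/hello.txt)
Something was created: EventAtPath(ENTRY_CREATE,/tmp/directory1/HelloWorld.scala)
A new Scala source file appeared: EventAtPath(ENTRY_CREATE,/tmp/directory1/HelloWorld.scala)
```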
Lastly, since we’re done, let’s call the stop()
method on the RxMonitor
object so that subscribed Observers
are notified and we stop the underlying MonitorActor
as well. Cleaning up is A Good Thing (TM).
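Which is simply:

```scala
monitor.stop()
```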
I hope this post has demonstrated the power of using RxScala's Observable as an abstraction that turns asynchronous events into a tangible data structure, and how using it through Schwatcher might simplify the process of building your own applications. If you have any questions or spot any mistakes, please feel free to leave a comment.
This version brings a new Observable interface that exposes a "stream" (or channel) of EventAtPaths that can be composed. Using this interface, you no longer need to register callbacks: you simply register paths and get notifications for events on them, either by subscribing to the Observable or by composing new Observables from it.
For more information on how to use Observables (especially how they compose in awesome ways), check out the Rx homepage
Example usage:
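A condensed sketch of what usage looks like, with the same caveats as before about exact package, factory, and field names:

```scala
import com.beachape.filemanagement.RxMonitor
import java.nio.file.Paths
import java.nio.file.StandardWatchEventKinds._

val monitor = RxMonitor()

// Every EventAtPath pushed by the monitor, as a composable stream
val events = monitor.observable

// Compose: only modifications to .txt files
val txtModifications = events
  .filter(_.event == ENTRY_MODIFY)
  .filter(_.path.toString.endsWith(".txt"))

val subscription = txtModifications.subscribe(
  onNext      = { event => println(s"Text file changed: $event") },
  onError     = { t => println(s"Something went wrong: $t") },
  onCompleted = { () => println("The monitor has shut down") }
)

monitor.registerPath(ENTRY_MODIFY, Paths.get("/tmp/directory1"))

// ...later, when we are done
monitor.stop()
```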
Relevant links:
- Github page with how to install and example usage
- Release page
]]>Changes:
Relevent info:
]]>