In this post I’d like to talk about some aspects of Akka persistence (still experimental at the moment of writing). The official documentation does a good job explaining the concept behind Akka persistence and how to make an actor persistent. However, I noticed one thing that confuses many people when they start dealing with akka-persistence: the Persistent View. Before we talk about it, let’s start off with a pattern that will help us understand persistent views better.
CQRS (Command Query Responsibility Segregation)
The idea of the CQRS pattern is to use one model to update data and a different model to read data, instead of the more classical approach of doing CRUD (create, read, update, delete) operations in the same place. In terms of CRUD, create, update and delete are Commands and read is a Query. One of the architectural patterns CQRS fits well with is Event Sourcing (you can read how Akka persistence implements this idea in the official documentation). Before looking at persistent views let’s take a look at a non persistent actor first.
CQRS with Non Persistent Actors
Let’s imagine that there is an actor which keeps transactions in its internal state. We can use the same actor for storing new transactions and reading them. One of the issues with such a design is the need for completely different representations of the data: for instance, we would like to show all invalid transactions on the administration panel and the current balance for a particular account. The main issue, though, is not that the code becomes more complex. While the actor is in the middle of updating its state it cannot be queried, which makes the system less responsive. To separate commands and queries we will most likely end up with more than one actor (each with its own state reflecting changes from the main actor). It will work pretty well until the main actor (with all the transactions) goes down. When this happens its state is lost forever. The other actors can still function but their data will stay out of sync.
Persistent Views and CQRS
To make it clear: the Persistent View is the Query part of CQRS. The persistent actor is used for managing state (persisting events, deleting messages, restoring from the journal, saving snapshots, etc.). The persistent views poll messages from the persistent actor’s journal directly (instead of being coupled with the persistent actor itself). All they have to know is the identifier of the persistent actor. If the persistent actor dies it does not affect the persistent views: they can still serve data from the journal. When the persistent actor is recovered the persistent views will eventually become consistent with it. Moreover, a persistent view can use other sources to optimize data presentation. Basically it’s as simple as that: use persistent actors for handling commands and persistent views for queries.
Persistent Views in Action
The working demo project can be found here. The source code consists of 3 files, where one of them is the persistent actor and the other two are the persistent views. TransactionActor is a persistent actor that persists transactions. All the machinery in its source code is about maintaining and recovering state. InvalidTransactionsView is a persistent view that shows all invalid transactions. You can see how it’s decoupled from the TransactionActor. Another persistent view is BalanceView. Instead of keeping the list of transactions it keeps a map that stores the current amount per account. In the traditional approach both persistent views would be methods of the same object. With separate views, even if the persistent actor and one of the views die, the remaining view will still be able to process queries.
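To give a rough idea of how the pieces fit together, here is a minimal sketch in the spirit of the demo (assuming the Akka 2.3-style API; the Transaction and GetBalance messages and all other details are illustrative, not the demo’s actual code):

import akka.persistence.{PersistentActor, PersistentView}

case class Transaction(account: String, amount: Long, valid: Boolean)
case class GetBalance(account: String)

// Command side: persists incoming transactions as events.
class TransactionActor extends PersistentActor {
  override def persistenceId = "transactions"

  var transactions = List.empty[Transaction]

  def receiveCommand = {
    case t: Transaction => persist(t) { e => transactions ::= e }
  }

  def receiveRecover = {
    case t: Transaction => transactions ::= t
  }
}

// Query side: builds its own representation from the same journal,
// knowing only the persistenceId of the command-side actor.
class BalanceView extends PersistentView {
  override def persistenceId = "transactions"
  override def viewId = "balance-view"

  var balances = Map.empty[String, Long].withDefaultValue(0L)

  def receive = {
    case t: Transaction if isPersistent && t.valid =>
      balances += t.account -> (balances(t.account) + t.amount)
    case GetBalance(account) =>
      sender() ! balances(account)
  }
}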
I think Persistent Views play an important role when using event sourcing. They make it easier to follow the CQRS pattern than ordinary actors do. I would even say that they are as important as the persistent actors. I hope the official documentation will become more verbose about them and that this analogy with CQRS can help to develop intuition.
The Function object has been around since Scala 1.0. It provides some utility methods for dealing with higher-order functions. Despite the simplicity and usefulness of some of them, I found that not only do many beginning developers not use them, they don’t even know they exist. In this post I would like to remind you about some of the functions I find useful.
chain
To get started let’s introduce 2 extremely simple functions from Int to Int. One of them, let’s call it inc, will increase a number by one, and the other one, double, will multiply a number by 2:
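scala> val inc = (x: Int) => x + 1
inc: Int => Int = <function1>
scala> val double = (x: Int) => x * 2
double: Int => Int = <function1>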
Note that for both functions we could use the shorter definition form, like val inc: Int => Int = _ + 1.
So what do we do when there is a sequence of these functions and we want to combine them? One of the most popular options in my practice is the following:
scala> List(inc, double, inc) reduce (_ andThen _)
This piece of code takes a sequence of functions and combines them using the andThen method, starting with the first one in the list and returning a resulting function from Int to Int. It is by no means bad code. Personally I like it. Let’s see how it can be made simpler using the chain function from the Function object:
scala> Function.chain(List(inc, double, inc))
It does exactly the same as the previous snippet but it might be more intuitive for beginning developers. If we import all the functions from the Function object first, it becomes dead simple:
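scala> import Function._
scala> chain(List(inc, double, inc))

tupled

To see where tupled helps, suppose there is an authentication function taking two arguments and a user represented as an Option of a 2-tuple (the definitions here are illustrative):

scala> val auth = (id: Int, name: String) => s"$id:$name authenticated"
scala> val user = Some((1, "John"))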
We can’t just map the auth function over the user because the former expects 2 arguments while our user is one tuple of 2 elements. One option would be extracting id and name and passing them as separate arguments to the auth function, but that requires some boilerplate code. This is where we can use the tupled function from the Function object. What it does is take a function of, let’s say, 2 arguments and convert it into a function taking a tuple of 2 elements with the types of the initial arguments. That’s exactly what we need in order to map the auth function over the user:
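scala> user map auth.tupled
res0: Option[String] = Some(1:John authenticated)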
There are tupled functions defined for tupling functions of arity from 2 to 5 inclusive. I rarely use tuples with more than 2 or 3 elements, so I find it convenient.
Technically, in the previous example the tupled function was called on the function itself (it’s defined both in the Function object and in the Function* traits). Another “trick” that can be useful is mapping over a map in what many people would call a natural way (it also demonstrates using the tupled function from the Function object rather than the one defined in the Function* traits). In Scala you can’t write code like this:
scala> val m = Map(1 -> "first", 2 -> "second")
scala> m map { (k, v) => s"$k: $v" }
<console>:12: error: missing parameter type
Note: The expected type requires a one-argument function accepting a 2-Tuple.
      Consider a pattern matching anonymous function, `{ case (k, v) => ... }`
       m map { (k, v) => s"$k: $v" }
               ^
<console>:12: error: missing parameter type
       m map { (k, v) => s"$k: $v" }
What many developers would do is something like this:
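scala> m map { case (k, v) => s"$k: $v" }

With tupled we can keep an ordinary two-argument function instead (type annotations added to help inference):

scala> m map Function.tupled((k: Int, v: String) => s"$k: $v")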
Maybe not so useful, but it helps to understand its application.
unlift
To illustrate the usage of unlift let’s write a function that takes an Int and returns Some(x) if x is greater than or equal to zero and None otherwise:
scala> val f: Int => Option[Int] = Option(_) filter (_ >= 0)
What the unlift function does is turn a function A => Option[B] into a PartialFunction[A, B]. It lets us use our function in any place where a partial function is required. To make it clear, here is how our function can be used to keep only the non-negative integers in a list:
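scala> import Function.unlift
scala> List(1, -2, 0, -4, 3) collect unlift(f)
res1: List[Int] = List(1, 0, 3)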
I don’t use this on a daily basis but there are cases like this where it comes in very handy.
Bonus: there is an opposite function defined in PartialFunction called lift. To see how it is related to the unlift function, note that the following equation is always true:
scala> f == unlift(f).lift
res8: Boolean = true
uncurried/untupled
These functions are the opposites of curried and tupled respectively. I haven’t seen them used as much as the functions described above.
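They are easy to demonstrate though (trivial examples of my own):

scala> val curriedAdd = (a: Int) => (b: Int) => a + b
scala> Function.uncurried(curriedAdd)
res9: (Int, Int) => Int = <function2>
scala> Function.untupled((t: (Int, Int)) => t._1 + t._2)
res10: (Int, Int) => Int = <function2>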
Although there is nothing new here, I find that it’s easy to forget about some useful API provided by the Scala standard library. I hope this reminder is timely and can save you a couple of lines of code now or in the future.
And what it does is just call the to_proc method on the symbol :name (which returns a Proc) and convert the proc to a block with the & operator (because map takes a block, not a proc).
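A typical use looks like this (assuming some user objects responding to name):

irb> users.map(&:name)
=> ["John", "Jane"]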
The naive implementation of the Symbol#to_proc would look like this:
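class Symbol
  def to_proc
    proc { |obj, *args| obj.send(self, *args) }
  end
end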
It’s all well understood and described all over the Internet. To understand why Symbol#to_proc is a lambadass (and what that means) let’s move on to the kinds of Ruby Procs.
Different kinds of Ruby Procs
As you know there are two kinds of Ruby Procs: procs and lambdas. Not only do they differ in how they check arity and treat the return keyword, they also look different in irb:
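irb> proc {}
=> #<Proc:0x007f944a0a2c58@(irb):1>
irb> lambda {}
=> #<Proc:0x007f944a090c50@(irb):2 (lambda)>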
You can see the (lambda) suffix displayed for lambdas only and something like a context, (irb):2, for both of them. It turns out that there is a third kind of proc, which I call lambadass, but let’s first talk about lambda scope, or context.
Scope
Procs and lambdas (which are objects of class Proc too) are closures, like blocks. The only thing important for us here is that they are evaluated in the scope where they are defined or created.
It means that any block (or proc or lambda) includes a set of bindings (local variables, instance variables, etc.) captured at the moment it was defined. A simple example demonstrating this in action:
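irb> x = 1
irb> def z(x)
irb>   lambda { x }
irb> end
irb> z(2).call
=> 2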
As a method definition is a scope gate, the only binding named x known inside method z is the method parameter. To visually grasp the context of the defined lambda you can consider the {} to be its constructor (not the lambda keyword).
It also means that a lambda defined inside a method body knows nothing about any bindings defined outside the method scope:
irb> z = 1
irb> def x
irb>   lambda { z }
irb> end
irb> x.call
NameError: undefined local variable or method `z' for main:Object
Despite the fact that the lambda was called at the top level, it was defined inside the method, where a binding named z didn’t exist. Once the scope or context is captured it remains the same inside the block no matter where it’s called from.
Lambadass
Lambadass is a proc or a lambda which looks similar to a normal proc or lambda but behaves differently.
So let’s get back to Symbol#to_proc. Usually it is used in place of a block in a method like map.
As the to_proc method returns a proc, what if we want to use it standalone like any other proc? Let’s do just that:
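irb> :name.to_proc
=> #<Proc:0x007fcfa305dca8>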
But wait a minute… Why does the returned Proc object look different? #<Proc:0x007fcfa305dca8> instead of #<Proc:0x007f944a090c50@(irb):1>? It’s still a Proc, it’s still a callable object, but it’s missing something. Looking at the object representation I would say it’s missing a context. How can we check that?
Binding
All the bindings captured from the scope where a block is defined are stored in a Binding object. We can get it by calling the Proc#binding method:
irb> lambda {}.binding
=> #<Binding:0x007fcfa30363b0>
One thing we can do with a Binding object is evaluate any binding captured by the block:
irb> x = 1
irb> eval('x', lambda {}.binding)
=> 1
or
irb> x = 1
irb> lambda {}.binding.eval 'x'
=> 1
It will raise an exception if a binding with such a name is not defined, but every block (or proc) has an associated binding object.
irb> lambda {}.binding
=> #<Binding:0x007fcfa40ab808>
irb> lambda {}.binding.eval 'y'
NameError: undefined local variable or method `y' for main:Object
Meet the Lambadass
Now let’s try to get the binding object of the proc created using Symbol#to_proc:
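irb> :name.to_proc.binding
ArgumentError: Can't create Binding from C level Proc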
Obviously there is something wrong with it. It turns out that the Symbol#to_proc method is implemented in C in MRI (Matz’s Ruby Interpreter, which is written in C). Of course it doesn’t make any sense to get the context of a C level Proc object (it would be nice though).
Let’s try other interpreters.
Rubinius
rubinius> x = 1
rubinius> lambda {}.binding.eval 'x'
=> 1
rubinius> (lambda &:name).binding.eval 'x'
NameError: undefined local variable or method `x' on name:Symbol.
We got the exception again, but it says that a binding named x is not defined. As Rubinius (at least its Symbol#to_proc) is written in Ruby itself, let’s look at its implementation:
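At the time of writing it looked roughly like this (abridged):

# kernel/common/symbol19.rb
def to_proc
  sym = self
  Proc.new do |*args, &b|
    raise ArgumentError, "no receiver given" if args.empty?
    args.shift.__send__(sym, *args, &b)
  end
end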
It looks very similar to what we defined initially. So what’s the problem? Let’s look at the error message again:
NameError: undefined local variable or method `x' on name:Symbol.
Of course there is no variable `x' on the symbol :name! The key to understanding it is that
lambda {}
is defined right here, where the {} are, but
lambda &:name
is defined inside the Symbol class, in the to_proc method, which knows nothing about any bindings defined outside the Symbol object. As a callable object it behaves correctly but the scope is absolutely different.
To understand it better let’s take a look at the Binding objects:
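The output looked roughly like this (abridged; the exact attributes depend on the Rubinius version):

rubinius> lambda {}.binding
=> #<Binding @self=main @module=Object @compiled_code=#<CompiledCode irb_binding file=(irb)> ...>
rubinius> (lambda &:name).binding
=> #<Binding @self=:name @module=Symbol @compiled_code=#<CompiledCode to_proc file=kernel/common/symbol19.rb> ...>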
You can see that in the first case the module is Object and the compiled code is a block in irb, whereas in the second output the module is Symbol and the compiled code is the to_proc method in the file kernel/common/symbol19.rb.
Of course if you wrap lambda &:name in another lambda, the scope of this top lambda will be Object because it is not defined in Symbol anymore. The scope of the inner lambda, however, will remain unchanged:
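Something like this (a sketch):

rubinius> x = 1
rubinius> outer = lambda { |u| (lambda &:name).call(u) }
rubinius> outer.binding.eval 'x'
=> 1
rubinius> (lambda &:name).binding.eval 'x'
NameError: undefined local variable or method `x' on name:Symbol.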
That’s what almost everybody I asked would expect: no errors, works identically. But if you remember the self-written to_proc method, how scope is defined in Ruby, and the Rubinius implementation, this behaviour should arguably be wrong, even if it’s the only one that works without big surprises.
Epilogue
There is a Proc. Sometimes it can be a lambda. The same object with different behaviour. Different from just a proc, but at least the same across the interpreters. They called it lambda. They even created a new syntax for it. With the current implementation of Symbol#to_proc we have a third behaviour of Proc, behaviour that differs across interpreters. I call it lambadass.
I always wanted Scala to have something like Ruby string interpolation. It’s a pretty small feature and somebody would definitely call it unnecessary syntactic sugar (e.g. Java has no string interpolation at all) but for Scala it always felt just right. Starting with Scala 2.10 there is a new mechanism for it called String Interpolation (who would have thought!). The documentation can be found here, with the corresponding SIP here. It’s quite a small overview so I would recommend reading it through. Although the documentation is clear (there is not that much to be covered, actually) I would like to highlight a few points.
String Interpolation is safe
There are three string interpolation methods out of the box: s, f and raw.
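Here is a quick taste of the other two (examples of my own):

scala> f"pi is ${math.Pi}%.2f"
res0: String = pi is 3.14
scala> raw"no \n newline"
res1: String = no \n newline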
Let’s look at the s String Interpolator in action with variables:
scala> val x = 1
scala> s"x is $x"
res1: String = x is 1
and with expressions:
scala> s"expr 1+1 equals to ${1 + 1}"
res2: String = expr 1+1 equals to 2
So it works as expected (at least for a Ruby developer :)). Let’s try to interpolate a variable that doesn’t exist:
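scala> s"y is $y"
<console>:8: error: not found: value y
       s"y is $y"
               ^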
It just doesn’t compile at all! Very nice of the compiler, isn’t it? If you have read the documentation it’s not difficult to understand why it’s safe. If not, we’ll touch on this in the next section anyway.
String Interpolation is extensible
If you’re still asking yourself what this s before the string literal is, the answer is that a processed string literal is a code transformation: the compiler transforms it into a call of the method s on an instance of StringContext. In other words, an expression like
s"x is $x"
is rewritten by the compiler to
StringContext("x is ", "").s(x)
First of all, it explains why using string interpolation is safe (see the previous section for an example): using a nonexistent variable as a parameter of the method call leads to a not found: value [nonexistent variable here] error. Secondly, it allows us to define our own string interpolators and reuse existing ones.
To see how easy it is, let’s create our own string interpolator which will work like the s interpolator but add some debug info to the resulting string:
Creation and usage of a simple Log Interpolator
import java.util.Date
import java.text.SimpleDateFormat

object Interpolation {
  implicit class LogInterpolator(val sc: StringContext) extends AnyVal {
    def log(args: Any*): String = {
      val timeFormat = new SimpleDateFormat("HH:mm:ss")
      s"[DEBUG ${timeFormat.format(new Date)}] ${sc.s(args: _*)}"
    }
  }

  val logString = "one plus one is"
  def demo = log"$logString ${1 + 1}"
}
In the code above, implicit classes and extending AnyVal (so-called Value Classes) are also new features in Scala 2.10, which we’ll talk about later. Since any interpolator is in fact a method of the StringContext class, we can easily reuse existing interpolators in our own ones (in the example we use the s method to form the resulting string, so as not to bother with implementing it in our new interpolator). The string interpolation
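log"$logString ${1 + 1}"

from the demo method above is in turn rewritten by the compiler to something like

new Interpolation.LogInterpolator(StringContext("", " ", "")).log(logString, 1 + 1)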
which is a nice combination of new Scala 2.10 features itself.
This new technique helps to write more readable code, is safe, and allows extending and combining existing functionality. The only limitation is that it doesn’t work within pattern matching statements, but that is going to be implemented in the Scala 2.11 release. I would call it String Interpolation with Batteries Included :)
If there is good news it means there should be some bad news nearby. The bad news about Unicode support in Erlang is that it’s just impossible to use Unicode string literals in source files, because the Erlang compiler assumes they are Latin-1 encoded. Therefore, in order to write something like "a∘b" in a source code file you should use "a\x{2218}b" or the even uglier [$a, 8728, $b], both of which are equal to the original string literal "a∘b". Even if you save the source file as UTF-8 the compiler still assumes it’s Latin-1 and there is no way to tell it the truth so far. Another option is keeping Unicode string literals in separate files and reading them at runtime with built-in Erlang functions. (But hey, the Swedish alphabet is covered by the Latin-1 charset and it’s definitely better than bare US-ASCII :)).
Good News (near future)
Now for the good news, and all I can do is quote the decisions affecting Erlang releases R16 and R17:
The board decided to go for a solution where comments in the code (in the same way as in Python) informs the tool chain about input file encoding formats. This means that only UTF-8 and ISO-Latin-1 encoding will be supported. All source files can be marked as containing UTF-8 encoded Unicode characters by using the same mechanism (even files read using file:consult/1), namely formalized comments in the beginning of the file.
The change to the file format will be done incrementally, so that the tools will accept Unicode input (meaning that source code can contain Unicode strings, even for binary construction), but restrictions regarding characters in atoms will remain for two releases (due to distribution compatibility). The default file encoding will be ISO-Latin-1 in R16, but will be changed to UTF-8 in R17.
Source code will need no change in R16, but adding a comment denoting ISO-Latin-1 encoding will ensure that the code can be compiled with the R17 compiler. Adding a comment denoting UTF-8 encoding will allow for Unicode characters with code points > 255 in string and character literals in R16. The same comment will allow for atoms containing any Unicode code point in R18. From this follows that function names also can contain any Unicode code point in R18.
UTF-8 BOM’s will not be handled due to their limited use.
Variable names will continue to be limited to Latin characters.
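In practice, once this lands, marking a source file as UTF-8 encoded should look something like this (a sketch based on the quoted decision; the comment format follows the Emacs convention used by the tools, and the module is my own trivial example):

%% -*- coding: utf-8 -*-
-module(compose).
-export([op/0]).

op() -> "a∘b".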
It looks like the right decision overall for those who want to use characters outside the Latin-1 character set in string literals.