3 Layer Scala Cake
I've been using Free Monad in production for a while. As the projects scale and more members contribute, the boundaries between the different layers reveal themselves very clearly.
Thanks to Matt Parsons' summary in 3 layer Haskell Cake, I found that my Scala code also happens to fall into these 3 layers.
To give future projects clearer boundaries of abstraction, I created 鸬鹚 (luci, which means cormorant) for a better taste of Free Monad mixed with MTL and ReaderT.
Here is the Scala version of the Cake, prepare your forks please.
But before cutting the cake, let's wash our hands and describe the most common program.
The Program
It reads an initial state, outputs a log, queries a database, and does a calculation.
val xa = Transactor.fromDriverManager[IO]("org.postgresql.Driver", "jdbc:postgresql:postgres", "postgres")
def program : Int = {
val initState = 1
getLogger.log(s"init state is $initState")
val valueInDatabase = try{
sql"select 32".query[Int].unique.transact(xa).unsafeRunSync()
} catch { case _: Throwable => 0}
initState + valueInDatabase
}
Your real business logic might be fancier, but basically this is the most common case:
- Something to ask
- Something to write
- Some effects on dependencies
- Some error to handle
- Some business computation
Anyway, this is a really awful piece of code to have in production.
First of all, how do you unit test such a thing? Oh, wait, maybe it's not that awful: we just mock getLogger and stub its log method, and then mock doobie's ConnectionIO and stub its transact method.
Sounds reasonable, just 2 mocks and 2 stubs.
But think about it: here there are just 2 lines of effectful operations. Count the effectful code in your production code base; how many mocks and stubs will you need then?
Such a way of unit testing is definitely not going to scale.
Let us refactor in baby steps, starting with IO.
Refactor
Take 1: IO
Modeling your problem with IO is easy, just wrap everything in IO:
def program : IO[Int] = for {
initState <- IO(1)
_ <- IO(getLogger.log(s"init state is $initState"))
valueInDatabase <- sql"select 32".query[Int].unique.transact(xa)
.handleError{_=>0}
} yield initState + valueInDatabase
How the hell is this better than the previous one?
Yes and no.
It's not better in the sense of testability, but it is Pure.
When we say something is Pure, we mean it is Referentially Transparent.
In human words, it means I can do this:
val logState = IO(getLogger.log(s"log something"))
def program : IO[Unit] = for {
_ <- logState
_ <- logState
_ <- logState
} yield ()
But you can't do:
val logState = getLogger.log(s"log something")
def program : Unit = {
logState
logState
logState
}
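Probably not the best example, but you should get the idea: I can place an IO anywhere and that won't change the semantics or behavior of the program. The first version logs three times, while the second logs only once, at the moment logState is defined.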
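To make the difference observable, here is a minimal sketch that needs no library. The IO class below is a hand-rolled stand-in for cats-effect's IO (an assumption for illustration, not the real implementation), and a counter plays the role of the logger.

```scala
object RtDemo {
  // Hand-rolled stand-in for cats-effect's IO (an assumption, not the real thing):
  // a value that only describes an effect until unsafeRunSync() is called.
  final class IO[A](val unsafeRunSync: () => A) {
    def flatMap[B](f: A => IO[B]): IO[B] = IO(f(unsafeRunSync()).unsafeRunSync())
    def map[B](f: A => B): IO[B] = IO(f(unsafeRunSync()))
  }
  object IO { def apply[A](a: => A): IO[A] = new IO(() => a) }

  // Returns (how many times the IO version logged, how many times the eager version logged).
  def counts(): (Int, Int) = {
    var pureCount = 0
    val logState: IO[Unit] = IO { pureCount += 1 } // nothing happens yet
    val program: IO[Unit] = for {
      _ <- logState
      _ <- logState
      _ <- logState
    } yield ()
    program.unsafeRunSync() // the effect runs once per occurrence

    var eagerCount = 0
    val logged: Unit = { eagerCount += 1 } // the effect runs here, exactly once
    val thrice = List(logged, logged, logged) // referring to the result does not rerun it

    (pureCount, eagerCount)
  }
}
```

Calling RtDemo.counts() gives (3, 1): the IO value can be reused as a description, the eager value cannot.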
Fine, let's move those IO values elsewhere, so we can inject them into program:
def program(getState: IO[Int], log: String => IO[Unit], queryDB: IO[Int]): IO[Int] = for {
initState <- getState
_ <- log(s"initState is: $initState")
valueInDatabase <- queryDB
} yield initState + valueInDatabase
Now it's better. In our test, we can simply test program like
program(IO(1), (msg: String)=>IO(()), IO(10)).unsafeRunSync() must_== 11
to verify that our program can do the math.
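As a self-contained sketch of that test, here is the same program run against test doubles. The IO class is again a hand-rolled stand-in for cats-effect's IO, and the messages buffer is a hypothetical helper that records what would have been logged.

```scala
object ParamInjectionDemo {
  // Hand-rolled stand-in for cats-effect's IO (an assumption, not the real thing).
  final class IO[A](val unsafeRunSync: () => A) {
    def flatMap[B](f: A => IO[B]): IO[B] = IO(f(unsafeRunSync()).unsafeRunSync())
    def map[B](f: A => B): IO[B] = IO(f(unsafeRunSync()))
  }
  object IO { def apply[A](a: => A): IO[A] = new IO(() => a) }

  // Same shape as the program above: every effect comes in as a parameter.
  def program(getState: IO[Int], log: String => IO[Unit], queryDB: IO[Int]): IO[Int] =
    for {
      initState <- getState
      _ <- log(s"initState is: $initState")
      valueInDatabase <- queryDB
    } yield initState + valueInDatabase

  // Test doubles: no real logger, no real database.
  val messages = scala.collection.mutable.ListBuffer.empty[String]
  val result: Int =
    program(IO(1), msg => IO { messages += msg; () }, IO(10)).unsafeRunSync()
}
```

No mocking framework is involved; the fakes are plain values, and both the result and the logged message can be asserted on.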
But still, it won't scale, e.g. when we need to query 3 values from the db:
def program(getState: IO[Int], log: String => IO[Unit], queryDB1: IO[Int], queryDB2: IO[Int], queryDB3: IO[Int]): IO[Int] = for {
initState <- getState
_ <- log(s"initState is: $initState")
valueInDatabase1 <- queryDB1
valueInDatabase2 <- queryDB2
valueInDatabase3 <- queryDB3
} yield initState + valueInDatabase1 + valueInDatabase2 + valueInDatabase3
Our program will end up with a very long parameter list.
Clearly IO alone isn't enough for more complex scenarios. Let's see what we can improve by adding another layer of abstraction.
Take 2: ReaderT Pattern
Dependency injection via parameters is limited and not scalable: when your program gets bigger, eventually you will need sub programs, and then you will find that the dependencies have to be passed all the way down to each sub program.
Here's the ReaderT pattern to help.
First we move all dependencies out; let's model them as a trait Env:
trait Env{
val state: Int
def log(msg: String): IO[Unit]
def query[A](c: ConnectionIO[A]): IO[A]
}
Then we can move the parameters of program out as standalone methods:
def log(msg: String): ReaderT[IO, Env, Unit] = for {
env <- Kleisli.ask[IO, Env]
_ <- Kleisli.liftF(env.log(msg))
} yield ()
def doobieQuery[A](query: ConnectionIO[A]): ReaderT[IO, Env, A] = for {
env <- Kleisli.ask[IO, Env]
res <- Kleisli.liftF(env.query(query))
} yield res
These methods just return a data type describing that they need an Env which is not provided yet, so you can put them anywhere you want, without knowing where exactly the instance of Env is.
Finally, the program, without any parameters!!!
def program: ReaderT[IO, Env, Int] = for {
env <- Kleisli.ask[IO, Env]
initState = env.state
_ <- log(s"initState is: $initState")
valueInDatabase <- doobieQuery(sql"select 32".query[Int].unique)
} yield initState + valueInDatabase
Retro
Let us retro how the type of program has evolved.
Imperative
def program: Int
I'd name this layer Bare Metal. Only raw values exist here, with 0 abstraction.
IO
def program(deps...): IO[Int]
This introduces a new layer of abstraction, IO, and I'd like to name it the VM layer.
It's better than Bare Metal, but still a low level abstraction.
When we need the value, we just run the IO layer:
program(deps...).unsafeRunSync()
Effects are now Referentially Transparent, but the way to inject and use effects is not scalable.
ReaderT
def program: ReaderT[IO, Env, Int]
ReaderT[IO, Env, Int] consists of 2 layers, IO and Reader[Env, Int]. This is the layer of Functional Programming: pure business, 0 effect, lazy.
program // <- ReaderT[IO, Env, Int]
.run(new Env{
val state = 1
def log(msg: String) = IO(getLogger.log(msg))
def query[A](c: ConnectionIO[A]) = c.transact(xa)
}) // <- IO[Int]
.unsafeRunSync() // <- Int
We need to run this layer by layer, first Reader, and then IO. The moment we run Reader is the moment we provide all the dependencies.
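The same two-step run also makes testing cheap: swap the Env and nothing else changes. Below is a library-free sketch, assuming a hand-rolled IO and a minimal ReaderT specialised to IO (both are stand-ins for the cats types, not their real implementations). The query member of Env is a simplification of running a doobie ConnectionIO.

```scala
object ReaderTDemo {
  // Hand-rolled stand-in for cats-effect's IO (an assumption, not the real thing).
  final class IO[A](val unsafeRunSync: () => A) {
    def flatMap[B](f: A => IO[B]): IO[B] = IO(f(unsafeRunSync()).unsafeRunSync())
    def map[B](f: A => B): IO[B] = IO(f(unsafeRunSync()))
  }
  object IO { def apply[A](a: => A): IO[A] = new IO(() => a) }

  // Minimal ReaderT over IO: a function from the environment to an effect.
  final case class ReaderT[E, A](run: E => IO[A]) {
    def flatMap[B](f: A => ReaderT[E, B]): ReaderT[E, B] =
      ReaderT((env: E) => run(env).flatMap(a => f(a).run(env)))
    def map[B](f: A => B): ReaderT[E, B] =
      ReaderT((env: E) => run(env).map(f))
  }
  def ask[E]: ReaderT[E, E] = ReaderT((env: E) => IO(env))

  trait Env {
    def state: Int
    def log(msg: String): IO[Unit]
    def query: IO[Int] // simplified: stands in for running a ConnectionIO
  }

  def log(msg: String): ReaderT[Env, Unit] = ReaderT((env: Env) => env.log(msg))
  val queryDb: ReaderT[Env, Int] = ReaderT((env: Env) => env.query)

  // The program has no parameters; it only describes that it needs an Env.
  val program: ReaderT[Env, Int] = for {
    env <- ask[Env]
    _ <- log(s"initState is: ${env.state}")
    valueInDatabase <- queryDb
  } yield env.state + valueInDatabase

  // A fake Env for tests: records logs in memory and returns a canned query result.
  val logs = scala.collection.mutable.ListBuffer.empty[String]
  val testEnv: Env = new Env {
    val state: Int = 1
    def log(msg: String): IO[Unit] = IO { logs += msg; () }
    val query: IO[Int] = IO(10)
  }

  // Run Reader first (provide Env), then run IO.
  val result: Int = program.run(testEnv).unsafeRunSync()
}
```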
ReaderT is a pretty good "pattern" after all:
- Pure: the effectful part is factored out of program into Env (Bare Metal), so program can be Pure and RT
- Modular: dependency injection happens in the Monad context, so it scales in the sense that it is easy to break program into smaller sub programs
- Data Type: since ReaderT is just a Data Type, you get lots of benefits for free from ReaderT's typeclass instances, such as Monoid, Applicative, MonadError …
Tagless Final is nothing but a fancy name of ReaderT
If we make a type alias for ReaderT, it's pretty much the same thing as the recently trending "design pattern", Tagless Final:
trait AlgInterp[F[_]] {
val state: F[Int]
def log(msg: String): F[Unit]
def query[A](c: ConnectionIO[A]): F[A]
}
type Alg[F[_], A] = ReaderT[F, AlgInterp[F], A]
def state[F[_]]: Alg[F, Int] = Kleisli(_.state)
def log[F[_]](msg: String): Alg[F, Unit] = Kleisli(_.log(msg))
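def doobieQuery[F[_], A](query: ConnectionIO[A]): Alg[F, A] = Kleisli(_.query(query))
// MonadError is required for both the for-comprehension and handleError
def program[F[_]](implicit F: MonadError[F, Throwable]): Alg[F, Int] = for {
initState <- state[F]
_ <- log[F](s"initState is: $initState")
valueInDatabase <- doobieQuery[F, Int](sql"select 32".query[Int].unique).handleError{_=>0}
} yield initState + valueInDatabase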
val interp = new AlgInterp[IO]{
val state = IO(1)
def log(msg: String) = IO(getLogger.log(msg))
def query[A](c: ConnectionIO[A]) = c.transact(xa)
}
program[IO].run(interp).unsafeRunSync()
If you look closely enough, there are actually 3 layers here:
- Layer 1: IO
- Layer 2: IO ~> Alg (state, log, doobieQuery)
- Layer 3: Alg (program)
                 |
       Production| Test
+--------------+ |
|   Layer 3:   | |
|     Alg      +-------------+
|              | |           |
+------+-------+ |           : run
       | run     |           |
/--------------\ |  /------------------\
|   Layer 2:   | |  |     Layer 2:     |
| AlgInterp[IO]| |  | FakeAlgInterp[IO]|
|              | |  |                  |
\------+-------/ |  \--------+---------/
       |         |           |
       v         |           :
+--------------+ |           |
|   Layer 1:   | |           |
|      IO      |<------------+
|              | |
+--------------+ |
                 |
Both layers 2 and 3 are pure, but the difference is:
- Layer 2 is just a 1-1 mapping from IO to Alg
- Layer 3 is the orchestration of Layer 2 for pure business
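The Test column of the diagram can be sketched without any library. This is a simplified illustration rather than the exact encoding above: the interpreter is passed as a plain parameter instead of through ReaderT, Monad is a minimal hand-written typeclass (not cats'), and query is reduced to a canned F[Int]. The fake interpreter uses Id, so the test needs no IO at all.

```scala
object TaglessFinalDemo {
  // Minimal Monad typeclass, hand-written for this sketch (not cats' Monad).
  trait Monad[F[_]] {
    def pure[A](a: A): F[A]
    def flatMap[A, B](fa: F[A])(f: A => F[B]): F[B]
  }

  // Identity "effect": running it is just having the value.
  type Id[A] = A
  implicit val idMonad: Monad[Id] = new Monad[Id] {
    def pure[A](a: A): Id[A] = a
    def flatMap[A, B](fa: Id[A])(f: A => Id[B]): Id[B] = f(fa)
  }

  // Layer 2: the algebra, one method per primitive effect.
  trait AlgInterp[F[_]] {
    def state: F[Int]
    def log(msg: String): F[Unit]
    def query: F[Int] // simplified: stands in for query[A](c: ConnectionIO[A])
  }

  // Layer 3: business logic written only against the algebra.
  def program[F[_]](interp: AlgInterp[F])(implicit M: Monad[F]): F[Int] =
    M.flatMap(interp.state) { initState =>
      M.flatMap(interp.log(s"initState is: $initState")) { _ =>
        M.flatMap(interp.query)(value => M.pure(initState + value))
      }
    }

  // Fake Layer 2 for tests: in-memory log, canned query result.
  val logs = scala.collection.mutable.ListBuffer.empty[String]
  val fakeInterp: AlgInterp[Id] = new AlgInterp[Id] {
    def state: Id[Int] = 1
    def log(msg: String): Id[Unit] = { logs += msg; () }
    def query: Id[Int] = 10
  }

  val result: Int = program[Id](fakeInterp)
}
```

Layer 3 stays untouched between production and test; only the Layer 2 instance is swapped.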
3 Layer Cake
We now have a solid 3 Layer Scala Cake base made of ReaderT.
But you know, a single flavor of cake won't satisfy everyone's taste.
The Needs of State
Remember the 5 factors that compose our program?
- Something to ask
- Something to write
- Some effects on dependencies
- Some error to handle
- Some business computation
There is a missing part: some state!
In a 5-line program, you won't see why state is necessary.
In the real world, there are many scenarios that need state, e.g. a user's login info.
Suppose that our program has middleware, controller and repository layers.
Usually we will need to get the user's info in the middleware, and use it in the repository layer for some database query.
So here is the case: since I want a modular code base, these 3 layers should not be one single Alg[F], but 3:
def middleware[F[_]]: Alg[F, User] = ???
def controller[F[_]]: Alg[F, Response] = ???
def repository[F[_]](user: User): Alg[F, DBResult] = ???
val program = for {
user <- middleware
dbresult <- repository(user)
response <- controller
} yield response
Someday your controller becomes bigger and bigger, and the tech lead says there should be another layer, service, between controller and repository:
val program = for {
user <- middleware
dbresult <- service(user)
response <- controller
} yield response
def service(user: User) = for {
dbresult <- repository(user)
result <- doSomthingWith(dbresult)
} yield result
Then it becomes a nightmare, because you have to pass the user all over your code base.
But that's exactly what the State Monad solves: no matter how many State monads you split the program into, every piece can always share exactly the same state.
def middleware[F[_]]: StateT[F, User, Unit] = StateT.set[F, User](User("abc"))
def controller[F[_]]: StateT[F, User, Response] = ???
def repository[F[_]]: StateT[F, User, DBResult] = for {
user <- StateT.get[F, User]
dbResult <- findResourceInDB(user)
} yield dbResult
So our program doesn't have to pass user around as a parameter everywhere:
val program = for {
_ <- middleware
dbresult <- service
response <- controller
} yield response
val service = for {
dbresult <- repository
result <- doSomthingWith(dbresult)
} yield result
The new problem introduced by StateT is that everything else (controller, repository, service) needs to be StateT as well. If we change them to StateT, then we will lose the effects of ReaderT.
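To see the state sharing in isolation, here is a minimal sketch with a hand-rolled State (a stand-in for cats' State, without the F layer of StateT). The User case class and the string results are hypothetical placeholders for the real domain types.

```scala
object StateDemo {
  // Hand-rolled State monad (an assumption for illustration, not cats' State):
  // a function from the current state to the next state plus a result.
  final case class State[S, A](run: S => (S, A)) {
    def flatMap[B](f: A => State[S, B]): State[S, B] =
      State((s: S) => { val (s1, a) = run(s); f(a).run(s1) })
    def map[B](f: A => B): State[S, B] =
      State((s: S) => { val (s1, a) = run(s); (s1, f(a)) })
  }
  def get[S]: State[S, S] = State((s: S) => (s, s))
  def set[S](s: S): State[S, Unit] = State((_: S) => (s, ()))

  final case class User(name: String)

  // middleware writes the user; nobody downstream takes it as a parameter.
  val middleware: State[User, Unit] = set(User("abc"))
  // repository reads whatever user is currently in the state.
  val repository: State[User, String] = get[User].map(u => s"rows for ${u.name}")
  // service sits in between and never mentions User at all.
  val service: State[User, String] = repository.map(_.toUpperCase)

  val program: State[User, String] = for {
    _ <- middleware
    result <- service
  } yield result

  // Run once, at the very end, with an initial state.
  val (finalUser, response) = program.run(User("anonymous"))
}
```

The user set by middleware reaches repository through service, even though service itself never touches it.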
MTL
To be able to use both ReaderT and StateT, MTL (Monad Transformer Library) is an elegant solution.
MTL is like stacking those transformers together into F. In the end, you will get something very nice:
def middleware[F[_]](implicit S: MonadState[F, User]): F[Unit] = S.set(User("abc"))
def controller[F[_]]: F[Response] = ???
def repository[F[_]](implicit S: MonadState[F, User]): F[DBResult] = for {
user <- S.get
dbResult <- findResourceInDB(user)
} yield dbResult
Such a DSL is nearly ideal:
- If you look closer at controller's type, it doesn't have any info about User or MonadState, because it shouldn't care about such things.
- middleware and repository are connected to each other via the implicit instance of MonadState[F, User], which is perfect as well: there is no need to pass state around, you request the implicit instance just in the place you need it.
At the very end, provide the monad transformer stack, so F is finally:
program[Alg[StateT[IO, User,? ], A]]
// which expands to..
program[ReaderT[StateT[IO, User,? ], AlgInterp[StateT[IO, User,? ]], A]]
There's no perfect solution for everything, only a perfect solution for a particular case. For this example, MTL is perfect. But when you have more effects, e.g. WriterT (to output something that is a Monoid) or EitherT (to provide MonadError), the MTL stack is not very easy to reason about, though the DSL is still nice.
program[ReaderT[WriterT[StateT[IO, User,? ], Chain[String], ?], AlgInterp[WriterT[StateT[IO, User,? ], Chain[String], ?]], A]]
Free Monad
Thus, the Free Monad will save your ass by providing a monad for free if you have a Functor (via the CoYoneda Lemma you just need to provide a data type (case class) F[A], and you will get the Functor for free).
The advantage of Free Monad is that it gives you the ability to create your own custom Monad Transformer, and it helps you stack them in a nicer way.
But let's try to rewrite the previous example first. Since State and Reader are already data types, we can just stack (inject) them into a Free Monad.
import Free.{liftInject => free}
type Program[A] = EitherK[Reader[AlgInterp[IO], ?], State[User, ?], A]
type ProgramF[A] = Free[Program, A]
Oh, since we have Free Monad, we don't need a reader monad to inject the interpreter; we can provide the interpreter later for foldMap.
Great, we save one effect then. Let's add another effect for demonstration, i.e. Writer:
type Program[A] = EitherK[Writer[Chain[String], ?], State[User, ?], A]
type ProgramF[A] = Free[Program, A]
Next you'll need to create two interpreters corresponding to the Writer and State effects.
def stateInterp(initState: User) = Lambda[State[User, ?] ~> IO[(Int, ?)]] { _.run(initState).value}
def writerInterp = Lambda[Writer[Chain[String], ?] ~> IO] { _.run(Chain.empty[String])}
val program = for{
initState <- free[Program](State.get[User])
_ <- free[Program](Chain.one(s"init state: $initState").tell)
_ <- free[Program](State.set(User("jcoy")))
res <- free[Program](State.get[User])
logs <- free[Program](Chain.one(s"next state: $res").tell)
} yield (res, logs)
program foldMap (writerInterp or stateInterp(User("anonymous")))
Guess what, the output will not be User("jcoy") but User("anonymous"), and the writer will only contain the second log.
Sorry, I chose a bad example for Free Monad on purpose. With Monad Transformers, we need to run StateT or WriterT only once, at the end, thus the state is maintained across all monads in program. But here each Free Monad instruction in program is mapped with the interpreter separately.
So State and Writer run many times, and each time they start with the empty value given to the interpreter.
Cats provides FreeT to cater for such scenarios, but again, transforming one effect is OK, while transforming multiple effects is a nightmare. It brings back the same problem as Monad Transformers, so what's the whole point of using Free again?
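The lost state can be reproduced in a few lines without cats. In this sketch, Free is a deliberately simplified encoding (a program is just "something that can be folded with an interpreter"), not cats' Free, and State is fixed to a hypothetical User state for brevity. The interpreter does what the stateInterp above does: it runs every instruction from the same initial state.

```scala
object FreeStateDemo {
  final case class User(name: String)

  // Hand-rolled State over User (a stand-in for cats' State[User, ?]).
  final case class State[A](run: User => (User, A)) {
    def flatMap[B](f: A => State[B]): State[B] =
      State((u: User) => { val (u1, a) = run(u); f(a).run(u1) })
    def map[B](f: A => B): State[B] =
      State((u: User) => { val (u1, a) = run(u); (u1, f(a)) })
  }
  val get: State[User] = State((u: User) => (u, u))
  def set(u: User): State[Unit] = State((_: User) => (u, ()))

  // Plays the role of a natural transformation State[User, ?] ~> Id.
  trait Interp { def apply[A](fa: State[A]): A }

  // Simplified free structure: each lifted instruction is handed to the interpreter on its own.
  trait Free[A] { self =>
    def foldMap(nt: Interp): A
    def flatMap[B](f: A => Free[B]): Free[B] = new Free[B] {
      def foldMap(nt: Interp): B = f(self.foldMap(nt)).foldMap(nt)
    }
    def map[B](f: A => B): Free[B] = new Free[B] {
      def foldMap(nt: Interp): B = f(self.foldMap(nt))
    }
  }
  def free[A](fa: State[A]): Free[A] = new Free[A] {
    def foldMap(nt: Interp): A = nt(fa)
  }

  // Every instruction starts again from init, so earlier writes are forgotten.
  def stateInterp(init: User): Interp = new Interp {
    def apply[A](fa: State[A]): A = fa.run(init)._2
  }

  val program: Free[User] = for {
    _ <- free(set(User("jcoy")))
    res <- free(get)
  } yield res

  // Interpreted instruction by instruction: the set is lost.
  val viaFree: User = program.foldMap(stateInterp(User("anonymous")))

  // Composed as one State and run once: the set is visible to the get.
  val direct: User =
    (for { _ <- set(User("jcoy")); res <- get } yield res).run(User("anonymous"))._2
}
```

Here viaFree is User("anonymous") while direct is User("jcoy"), which is the difference between interpreting each instruction separately and running one composed State at the end.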
ReaderT + MTL + Free
Now that we know every approach has its own pros and cons, to eliminate the cons and magnify the pros, the ultimate solution for abstracting effects is: why not all of them?
Recall the 3 layers from the ReaderT pattern:
- Layer 1: IO (AlgInterp)
- Layer 2: IO ~> Alg (state, log, doobieQuery)
- Layer 3: Alg (program)
The ideal solution is to apply each approach in the layer where it is good.
Free Monad is good at providing a nice DSL, so it naturally serves well in Layer 3.
We also need MTL to handle stateful effects, so its best place is Layer 2, where it provides the various combinations needed to support domain models.
Finally, instead of interpreting Free Monad into IO directly, we can interpret Free Monad into ReaderT, so we get all the pros of the ReaderT pattern to inject dependencies seamlessly.
To utilize all these approaches together, we can use luci to apply ReaderT, MTL and Free in the following 3 layers:
- Layer 1: Binary ReaderT[IO, Env, ?]
- Layer 2: Compiler Effects ~> ReaderT[IO, Env, ?]
- Layer 3: Program Free[Program, ?]
Similar to a real world program, we need to go through the same 3 layers to execute our program:
- write program in scala
- compile scala code to binary(jar file)
- run the jar in JVM with some parameters
But here we just do the same thing to our EDSL (Embedded Domain Specific Language).
To demonstrate the power of ReaderT + MTL + Free, here we compose a program that contains all 6 factors:
- Something to ask
- Something to write
- Some effects on dependencies
- Some error to handle
- Some business computation
- Something stateful
Layer 3: Business (Free)
To create an EDSL for Business, just use the Effects:
val program = for {
// 1. Something to read
config <- free[Program](Reader(identity[Config]))
// 2. Something to write
_ <- free[Program]((Chain.one("config: " + config.token)).tell)
// 3. Some effects on dependencies
response <- free[Program](
GetStatus[IO](GET(Uri.uri("https://blog.oyanglul.us"))))
// 4. Some error to handle, response is Either[Throwable, Response[IO]], hence MonadError instance
_ <- free[Program](response.ensure(new Exception("oops my website is down"))(_.status == Status.Ok))
// 6. Something stateful
_ <- free[Program](State.modify[Int](1 + _))
state <- free[Program](State.modify[Int](1 + _))
} yield state
From here we can easily tell that Program should contain the following effects:
type Program[A] = Eff5[
Http4sClient[IO, ?],
Writer[Chain[String], ?],
Reader[Config, ?],
State[Int, ?],
Either[Throwable, ?],
A
]
EffX is a predefined type alias in luci for composing multiple kinds with EitherK.
This is Layer 3, which only has business DSL and no IO (Http4sClient is an exception, since the effect itself is Tagless Final), thanks to the Abstraction Barrier provided by Layer 2.
Layer 2: MTL + Interpreter
Layer 2 is like a VM: it connects the business and the bare metal. Here we can deal with stateful effects using transformers, e.g. the state compiler:
implicit def stateCompiler[L](implicit ev: Monad[E]) =
new Compiler[State[L, ?], E] {
type Env = MonadState[E, L] :: HNil
val compile = Lambda[State[L, ?] ~> Bin](state =>
ReaderT(env =>
for {
currentState <- env.head.get
(nextState, value) = state.run(currentState).value
_ <- env.head.set(nextState)
} yield value))
}
The original MTL way is to implicitly find MonadState from context:
- implicit def stateCompiler[L](implicit ev: Monad[E]) =
+ implicit def stateCompiler[L](implicit ev: Monad[E], S: MonadState[E, L]) =
new Compiler[State[L, ?], E] {
- type Env = MonadState[E, L] :: HNil
val compile = Lambda[State[L, ?] ~> Bin](state =>
ReaderT(env =>
for {
- currentState <- env.head.get
+ currentState <- S.get
(nextState, value) = state.run(currentState).value
- _ <- env.head.set(nextState)
+ _ <- S.set(nextState)
} yield value))
}
But since we have ReaderT, I prefer to inject MonadState explicitly, to have everything explicitly managed in one place.
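The idea behind the state compiler can be sketched without luci, cats-mtl or Shapeless. Everything below is a hand-rolled assumption: IO and State are minimal stand-ins, StateRef is a hypothetical replacement for a MonadState[IO, S] instance backed by a mutable cell, and compile is a simplified version that takes the environment directly instead of going through ReaderT and HNil.

```scala
object StateCompilerDemo {
  // Hand-rolled stand-in for cats-effect's IO (an assumption, not the real thing).
  final class IO[A](val unsafeRunSync: () => A) {
    def flatMap[B](f: A => IO[B]): IO[B] = IO(f(unsafeRunSync()).unsafeRunSync())
    def map[B](f: A => B): IO[B] = IO(f(unsafeRunSync()))
  }
  object IO { def apply[A](a: => A): IO[A] = new IO(() => a) }

  // One State instruction: a pure state transition.
  final case class State[S, A](run: S => (S, A))

  // Hypothetical stand-in for MonadState[IO, S]: get and set against a mutable cell.
  final class StateRef[S](private var current: S) {
    def get: IO[S] = IO(current)
    def set(s: S): IO[Unit] = IO { current = s }
  }

  // "Compile" a State instruction into a function of its environment:
  // read the current state from env, apply the transition, write the next state back.
  def compile[S, A](state: State[S, A]): StateRef[S] => IO[A] =
    (env: StateRef[S]) =>
      env.get.flatMap { currentState =>
        val (nextState, value) = state.run(currentState)
        env.set(nextState).map(_ => value)
      }

  val increment: State[Int, Unit] = State((n: Int) => (n + 1, ()))

  // The same ref is injected into every compiled instruction, so state survives between them.
  val ref = new StateRef[Int](0)
  val program: IO[Int] = for {
    _ <- compile(increment)(ref)
    _ <- compile(increment)(ref)
    n <- ref.get
  } yield n

  val result: Int = program.unsafeRunSync()
}
```

Because the state lives in the injected environment rather than in the interpreter, two increments end up as 2.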
Layer 1: ReaderT
Once we compile the program, a binary is produced.
import us.oyanglul.luci.compilers.io._
val binary = compile(program)
Everything is automatically inferred thanks to Shapeless, so you don't need to figure out what type binary has, just follow the compiler.
Now we can safely run the binary with all dependencies provided explicitly:
val args = (httpclient ::
logRef.tellInstance ::
config ::
stateRef.stateInstance ::
Unit ::
HNil).map(coflatten)
binary.run(args)
Don't worry about types and order, since the compiler will tell you exactly where the type of the args you provided is wrong.
Anyway, it's very easy to explain and compose args though!
- binary for Http4sClient[IO, ?] needs Client[IO] to run, so here httpclient is an instance of Client[IO]
- binary for Writer[Chain[String], ?] needs FunctorTell[IO, Chain[String]] to run, provided here by meow-mtl's .tellInstance
- binary for Reader[Config, ?] needs Config to run
- binary for State[Int, ?] needs MonadState[IO, Int] to run, provided here by meow-mtl's .stateInstance
- binary for Either[Throwable, ?] needs nothing, so Unit is provided
Of course there is one more layer missing, Layer 0: if you look closely enough you will find that binary.run(args) returns IO.
                 |
       Production| Test
+--------------+ |
|   Layer 3:   | |
|   Program    +-------------+
|              | |           :
+------+-------+ |           | compile
       | compile |           v
/--------------\ |  /------------------\
|   Layer 2:   | |  |     Layer 2:     |
|   Compiler   | |  |  Fake Compiler   |
|              | |  |                  |
\------+-------/ |  \--------+---------/
       |         |           |
       v         |           :
+--------------+ |           |
|   Layer 1:   |<------------+
|    Binary    | |
|              +-------------+
+------+-------+ |           :
       |run(args)|           |
       v         |           | run(fake_args)
+--------------+ |           |
|   Layer 0:   | |           |
|      IO      |<------------+
|              | |
+--------------+ |
                 |
Run the last layer, Layer 0, with binary.run(args).unsafeRunSync(), and then all effects will be touching your bare metal.
Recap
There's never one solution that fits all problems; which one to choose really depends.
But overall, if your business isn't that complex, the 3 layer ReaderT pattern is a reasonable level of abstraction.
I also have a confession to make: I was intentionally making ReaderT + MTL look clumsy.
Actually, with the help of meow-mtl, one Ref[IO] is good enough to provide us stateful effects, just like how Free + MTL works.
ReaderT + MTL
- Something to ask: ReaderT
- Something to write: FunctorTell[IO, ?]
- Some effects on dependencies: Tagless Final style
- Some error to handle: ReaderT itself has an instance of MonadError
- Something stateful: MonadState[IO, ?]
Free + MTL + ReaderT
- Something to ask: Reader
- Something to write: Writer
- Some effects on dependencies: custom Data Type and interpreter
- Some error to handle: Either has an instance of MonadError
- Something stateful: State
There are a lot of awesome approaches that solve the exact same problem though. If you let me rank them (sorted by difficulty, ascending, based on my experience/observation):
1. ReaderT
2. Tagless Final
3. ZIO Environment
4. ReaderT + MTL
5. Free
6. Eff
7. Free + MTL + ReaderT
My preference is to choose one ranked >= 3 for medium to large projects; for small projects, where the number of effects is predictable and manageable, one ranked <= 3 is quite enough.
No matter what your choice is, keep in mind the 3 layer cake; it will always help you structure an extensible, composable, testable and scalable project.