Channel: Grails Info

New Blog: Ratpack’s execution model, part 1.


A lesser-known aspect of Ratpack is its execution model.
It’s quite well known that Ratpack uses the non-blocking/asynchronous paradigm, but its execution model is more than this.
Asynchronous programming is full of traps (like any kind of programming), and Ratpack’s execution model avoids a certain class of these traps.

Let’s quickly cover what the implication of non-blocking is for a Ratpack app.

Let’s imagine an app that uses JDBC to read from a database.
JDBC calls are blocking.
You ask for some data by calling a method, IO happens between the client library and the database, then the result is returned to the method caller.
While the IO is happening, the executing thread is blocked and can’t be used.
The alternative is asynchronous IO.
With this approach, the IO is initiated in a way where the calling thread does not wait for the IO to complete,
but instead continues on with other work and relies on the operating system to notify it in some way when the IO is complete.
The idea is that the thread can go on doing other useful work while the IO is “happening”.
The implication of this is that you don’t need very many threads because they are always busy doing work,
and this is the basis of claims that non-blocking systems are more performant than blocking ones.
Blocking apps (by far the majority) employ large pools of threads for handling requests.
Non-blocking apps employ surprisingly small thread pools (usually one or two threads per core), and the golden rule is that you must not block these threads (preferably you don’t block at all, but definitely not on these threads), which means doing things asynchronously.
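To make the contrast concrete, here is a minimal sketch in Java (as noted later, the post’s Groovy could easily be Java). The queryDb() method here is an invented stand-in for a real JDBC call, and CompletableFuture stands in for whatever asynchronous machinery a real app would use:

```java
import java.util.concurrent.CompletableFuture;

public class BlockingVsAsync {
    // Stand-in for a JDBC query; sleeps to simulate IO latency.
    static String queryDb() {
        try { Thread.sleep(100); } catch (InterruptedException e) { throw new RuntimeException(e); }
        return "value";
    }

    public static void main(String[] args) {
        // Blocking: the calling thread is parked until the IO "completes".
        String blockingResult = queryDb();
        System.out.println("blocking: " + blockingResult);

        // Asynchronous: initiate the work on another thread and register a
        // callback; the calling thread is free to do other work meanwhile.
        CompletableFuture<String> promise = CompletableFuture.supplyAsync(BlockingVsAsync::queryDb);
        promise.thenAccept(v -> System.out.println("async: " + v));
        System.out.println("calling thread keeps going");
        promise.join(); // only so the demo doesn't exit before the callback fires
    }
}
```

In the asynchronous version, “calling thread keeps going” can print before the async result arrives; that freedom is exactly what lets a small thread pool serve many requests.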

Machines fundamentally perform tasks asynchronously. Synchronous programming is just an abstraction, albeit a very prevalent and useful one. It allows programmers to reason about an operation as being continuous and deterministic. You give this up with the asynchronous/non-blocking paradigm, and that is where the problems begin. What you gain in terms of “performance” you pay for in convenience. At its worst, multithreaded asynchronous programming is completely non-deterministic and disjointed. Ratpack tries to bridge the performance/convenience gap via its execution model.

(all subsequent code in this post will be in Groovy, but could easily be Java)

The “execution model” is wrapped up in the ratpack.exec package.
Notably, it contains ExecControl, which defines the interface for performing asynchronous operations.
Ratpack request handlers work with a Context which implements ExecControl.

We’ll use the blocking(Callable) method as our asynchronous operation.

So, performing an asynchronous operation looks like this:

handlers {
  get {
    blocking {
      getValueFromDb() // returns string
    }.then {
      render it // it is the return of getValueFromDb() 
    }
  }
}

The blocking() method returns a Promise.
The then() method effectively takes a callback that gets executed when the promised value becomes available.
What’s being promised here is the result of the function (as a Groovy Closure here).
Because the function performs blocking IO, we have to execute it in this way.
What’s going to happen is that the function is actually executed on a separate thread, from the blocking thread pool.
When it completes, the value is returned back to the non-blocking thread pool and given to the “then” callback.
This is asynchronous.
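A rough model of that thread handoff can be sketched with two plain Java executors standing in for Ratpack’s blocking and compute (non-blocking) pools. This is illustrative only, not Ratpack’s actual implementation:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class PoolHandoff {
    static String run() throws Exception {
        ExecutorService compute = Executors.newSingleThreadExecutor(r -> new Thread(r, "compute"));
        ExecutorService blocking = Executors.newCachedThreadPool(r -> new Thread(r, "blocking"));
        CompletableFuture<String> rendered = new CompletableFuture<>();

        compute.submit(() ->
            // the handler runs on a compute (non-blocking) thread...
            blocking.submit(() -> {
                // ...the blocking work runs on a blocking-pool thread...
                String value = "db-value"; // stand-in for getValueFromDb()
                // ...and the result is handed back to the compute pool for then {}
                compute.submit(() ->
                    rendered.complete(Thread.currentThread().getName() + ": " + value));
            }));

        try {
            return rendered.get();
        } finally {
            blocking.shutdown();
            compute.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run()); // the value arrives back on the "compute" thread
    }
}
```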

This is pretty standard stuff from promise based asynchronous APIs.
Promises aren’t really anything special, to the extent we’ve used them so far.
This code would look fairly similar with raw callbacks.

Ok, so let’s dig deeper.

handlers {
  get {
    print "1"
    blocking {
      print "2"
      getValueFromDb() // returns string
    }.then {
      print "3"
      render it // it is the return of getValueFromDb() 
    }
    print "4"
  }
}

What’s the cumulative output of the prints here?
Based on what we know so far, and with most asynchronous APIs, you can't know; it's not deterministic.
It's probably going to be 1423, but could also be 1243 or 1234.
It’s going to depend on how long it takes for the blocking operation to start and finish, and how long it takes for the result to be given back to the non blocking thread pool.

What about this?

handlers {
  get {
    print "1"
    blocking {
      print "2"
      getValueFromDb() // returns string
    }.then {
      print "3"
      render it // it is the return of getValueFromDb() 
    }
    sleep 10000
    print "4"
  }
}

Very likely 1234, but again you can’t be sure.
You don’t know how much time it’s going to take to start the blocking operation.

Ratpack’s execution model guarantees that the output will be 1423.

Let’s look at a more interesting example.

handlers {
  handler {
    next()
    throw new Exception("ex1")
  }
  get {
    blocking {
      getValueFromDb() // returns string
    }.then {
      render it // it is the return of getValueFromDb() 
    }
  }
}

What you need to know about Ratpack here is that next()
passes control to the next handler, which is the get {} handler.
After that handler is “done”, the next() call will return.

This particular example is a little contrived, but it does represent a real problem in asynchronous programming.
A race is on.

With a lot of asynchronous APIs, when you initiate an asynchronous operation you inherently create a race because you split the execution.
Execution is going to continue with the callback, while at the same time execution continues unwinding the call stack.
If we consider this in the context of a logical operation, such as servicing a request, we now have to reason about a non deterministic outcome.
What should happen if the promised value arrives after the exception has already been thrown?
How should the exception be handled if the promised value arrived and was written to the output before the exception was thrown?
Such a logical inconsistency is not always this easy to spot, and chances are it makes itself known for the first time in production.
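The race between the unwinding call stack and the callback is easy to reproduce with a sketch of the naive approach (plain Java threads here; none of this is Ratpack API):

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

public class RaceSketch {
    // Returns whichever "segment" got there first: the async callback, or the
    // code that runs while the original call stack unwinds.
    static String race() throws InterruptedException {
        AtomicReference<String> winner = new AtomicReference<>();
        CountDownLatch done = new CountDownLatch(1);

        // The callback: fired from another thread when the async work completes.
        new Thread(() -> {
            winner.compareAndSet(null, "callback");
            done.countDown();
        }).start();

        // Code executed while unwinding the call stack (e.g. throwing "ex1").
        winner.compareAndSet(null, "unwinding");

        done.await();
        return winner.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Either outcome is possible; which one you see depends on scheduling.
        System.out.println("first: " + race());
    }
}
```

Run this in a loop and you will eventually see both outcomes, which is the non-determinism being described.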

Ratpack’s execution model guarantees that the exception being thrown is the logical outcome.
In fact, the background operation is never initiated.

The point here is that the given examples are deterministic in Ratpack, while they aren’t with most asynchronous APIs.
Ratpack serializes the segments of a logical execution to avoid races.

The given examples so far represent a single logical execution, responding to a request.
We can break it down into 3 pieces…

  1. Everything up to the asynchronous call
  2. After the asynchronous call (i.e. unravelling the call stack)
  3. The callback, initiated when the async operation completes

The problem we have been discussing is that #2 and #3 are racing because they can be executed concurrently.

Ratpack breaks this into 4 pieces:

  1. Everything up to just before the asynchronous call
  2. Everything after that point, until the call stack has unwound
  3. The asynchronous call
  4. The callback

The key difference here is that the asynchronous call is not initiated straight away.
It’s deferred until the stack has unwound and there is no more code to execute.
This avoids races.

To see how, let’s look at promise() (which is what blocking() builds on).
The promise() method is for initiating arbitrary async operations.

handlers {
  get {
    // (1)
    promise { fulfiller ->
      // (3)
      someCallbackTakingAsyncApi {
        // (4)
        fulfiller.success(it) // it is the result of the async call
      } 
    }.then {
      // (5)
      render it
    }
    // (2)
  }
}

The number comments indicate the guaranteed sequence of events.

The promise() method takes a function that initiates an asynchronous operation.
It integrates with the execution machinery to wait until the current execution segment is complete before executing it.
This effectively serializes the segments of an execution and avoids races.
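That serialization can be sketched with a toy model: a class that queues promise initiations and only drains the queue once the handler code has returned. The names here (DeferredExecution, demo()) are invented for illustration, and Ratpack’s real machinery in ratpack.exec is considerably more involved:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;

public class DeferredExecution {
    private final Deque<Runnable> pending = new ArrayDeque<>();

    // promise(): don't initiate the async work now, just remember how to.
    void promise(Consumer<Consumer<String>> initiation, Consumer<String> then) {
        pending.add(() -> initiation.accept(then));
    }

    // Run the handler segment to completion, *then* start the queued work.
    void execute(Runnable handlerSegment) {
        handlerSegment.run();            // segments 1 and 2: the whole call stack
        while (!pending.isEmpty()) {     // segments 3 and 4: initiations, now race-free
            pending.poll().run();
        }
    }

    static String demo() {
        StringBuilder out = new StringBuilder();
        DeferredExecution exec = new DeferredExecution();
        exec.execute(() -> {
            out.append("1");
            exec.promise(
                fulfiller -> fulfiller.accept("db-value"), // initiation (deferred)
                value -> out.append("3:").append(value));  // then {} callback
            out.append("2");
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints 123:db-value
    }
}
```

Because the initiation is deferred, “1” and “2” always run before the callback, matching the 1423 guarantee from the earlier example.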

There’s an assumption built in here.
We assume that the initiation of the async operation is atomic, in that once it has been initiated nothing else interesting can happen until it completes.
That is, it cannot throw an exception or make any kind of state change.
This is the same assumption that naive async APIs make about the call stack; that unwinding it will not produce errors or side effects.
Ratpack reduces the risk/scope of the assumption by initiating the async operation with a new call stack of constrained scope.
With the naive approach, there’s no real way to know what’s waiting to happen while the call stack unwinds, which is riskier.

Why does this matter? Determinism.
You want as much as you can get.
It reduces the mental burden in reasoning about what a system is doing, and more importantly will do.

What are other ways of being deterministic?

  1. Synchronous/blocking thread-per-request (e.g. traditional Servlet API)
  2. Asynchronous, single threaded (e.g. Node.js)

The problem with #1 is performance (generally speaking).
The problem with #2 is under-utilization of multi-core hardware, unless you introduce multiple processes, which introduces other problems.

I mentioned that synchronous programming allows reasoning about an operation as something that is continuous and deterministic.
We’ve discussed how Ratpack’s execution model allows asynchronous programming to be deterministic, via the promise() method.
We have not yet discussed how it supports continuousness, which refers to cumulative state, contextual error handling and resource management; that will be the subject of a follow-up post.
Another future post may cover how this all relates to streaming data while supporting determinism and continuousness.

To round things out, let’s look at a more complex example.
If you can reason that the net result of this code is that the string '1,2,3' is rendered to the response, and that the collection l does not need to be safe for concurrent access, then you’ve got it under control.

handlers {
  get {
    def l = []
    promise { f ->
      Thread.start {
        sleep 3000
        f.success("1")
      }
    }.then {
      l << it
    }
    promise { f ->
      Thread.start {
        sleep 2000
        f.success("2")
      }
    }.then {
      l << it
    }
    promise { f ->
      Thread.start {
        sleep 1000
        f.success("3")
      }
    }.then {
      l << it
      render l.join(",")
    }
  }
}

If you want to go implementation exploring, you can browse the code and tests.

The post New Blog: Ratpack’s execution model, part 1. appeared first on Grails Info.

