```
<--- Last few GCs --->

[90303:0x102801600]    90966 ms: Mark-sweep 1411.7 (1463.4) -> 1411.3 (1447.4) MB, 1388.3 / 0.0 ms  (+ 0.0 ms in 0 steps since start of marking, biggest step 0.0 ms, walltime since start of marking 1388 ms) last resort GC in old space requested
[90303:0x102801600]    92377 ms: Mark-sweep 1411.3 (1447.4) -> 1411.7 (1447.4) MB, 1410.9 / 0.0 ms  last resort GC in old space requested

<--- JS stacktrace --->

==== JS stack trace =========================================

Security context: 0x2c271c925ee1 <JSObject>
    1: clone [/Users/abhinavdhasmana/Documents/Personal/sourcecode/circuitBreaker/client/node_modules/hoek/lib/index.js:~20] [pc=0x10ea64e3ebcb](this=0x2c2775156bd9 <Object map = 0x2c276089fe19>,obj=0x2c277be1e761 <WritableState map = 0x2c27608b1329>,seen=0x2c2791b76f41 <Map map = 0x2c272c2848d9>)
    2: clone [/Users/abhinavdhasmana//circuitBreaker/client/node_modul...
```

Now, instead of one, we have two services that are not working. This failure can cascade through the system and bring the whole infrastructure down.

Why we need a circuit breaker

If serviceB is down, serviceA should still try to recover and do one of the following:

- Custom fallback: Try to get the same data from some other source. If that is not possible, use its own cached value.

- Fail fast: If serviceA knows that serviceB is down, there is no point waiting for the timeout and consuming its own resources. It should return as soon as possible, “knowing” that serviceB is down.

- Don’t crash: As we saw in this case, serviceA should not have crashed.

- Heal automatically: Periodically check if serviceB is working again.

- Keep other APIs working: All other APIs of serviceA should continue to work.

What is the circuit breaker design pattern?

The idea behind it is simple:

- Once serviceA “knows” that serviceB is down, there is no need to make requests to serviceB. serviceA should return cached data or a timeout error as soon as it can. This is the OPEN state of the circuit.

- Once serviceA “knows” that serviceB is up, we can CLOSE the circuit so that requests can be made to serviceB again.

- Periodically make fresh calls to serviceB to see if it is successfully returning results. This state is HALF-OPEN.

Circuit breaker in open position

This is what our circuit state diagram looks like:

state diagram of circuit breaker

Implementation with Circuit Breaker

Let’s implement a circuitBreaker that makes HTTP GET calls. We need three parameters for our simple circuitBreaker:

1. How many failures should happen before we OPEN the circuit.

2. The time period after which we should retry the failed service once the circuit is in the OPEN state.

3. The timeout for the API request.

With this information, we can create our circuitBreaker class.

circuitBreaker Class with its constructor
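The embedded code isn’t reproduced here, so the following is a minimal sketch of what such a constructor might look like. The parameter names and order are assumptions, inferred from the call new CircuitBreaker(3000, 5, 2000) later in the article:

```javascript
class CircuitBreaker {
  // Assumed parameter order, matching the later call:
  // new CircuitBreaker(3000, 5, 2000)
  constructor(timeout, failureThreshold, retryTimePeriod) {
    this.timeout = timeout;                   // ms before an API request is abandoned
    this.failureThreshold = failureThreshold; // failures before we OPEN the circuit
    this.retryTimePeriod = retryTimePeriod;   // ms to wait before retrying while OPEN
    this.state = 'CLOSED';                    // CLOSED | OPEN | HALF-OPEN
    this.failureCount = 0;
    this.nextAttempt = Date.now();            // earliest time a trial request is allowed
  }
}
```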

Next, let’s implement a function that calls serviceB’s API.

circuitBreaker call function
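The embedded gist with the call function is also not shown; here is one way it might work. This is assumed code, not the author’s exact implementation: it uses Node 18+’s built-in fetch, and the success/failure bookkeeping is kept minimal so the snippet runs on its own:

```javascript
class CircuitBreaker {
  constructor(timeout, failureThreshold, retryTimePeriod) {
    this.timeout = timeout;
    this.failureThreshold = failureThreshold;
    this.retryTimePeriod = retryTimePeriod;
    this.state = 'CLOSED';
    this.failureCount = 0;
    this.nextAttempt = Date.now();
  }

  async call(urlToCall) {
    if (this.state === 'OPEN') {
      if (this.nextAttempt > Date.now()) {
        // Fail fast: don't touch serviceB while the circuit is OPEN
        throw new Error('Circuit is OPEN: failing fast');
      }
      // Retry period has elapsed; let one trial request through
      this.state = 'HALF-OPEN';
    }
    // Abort the request if it exceeds the configured timeout
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), this.timeout);
    try {
      const response = await fetch(urlToCall, { signal: controller.signal });
      this.success();
      return response;
    } catch (err) {
      this.failure();
      throw err;
    } finally {
      clearTimeout(timer);
    }
  }

  // Minimal bookkeeping so this sketch is self-contained
  success() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  failure() {
    this.failureCount += 1;
    if (this.failureCount >= this.failureThreshold) {
      this.state = 'OPEN';
      this.nextAttempt = Date.now() + this.retryTimePeriod;
    }
  }
}
```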

Let’s implement all the associated functions.

circuitBreaker.js with all the function about state, failure and reset
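Since the gist isn’t included, here is one way the full class might look once the state, failure, and reset handling is filled in. All method names here are assumptions, not the author’s exact code:

```javascript
class CircuitBreaker {
  constructor(timeout, failureThreshold, retryTimePeriod) {
    this.timeout = timeout;
    this.failureThreshold = failureThreshold;
    this.retryTimePeriod = retryTimePeriod;
    this.state = 'CLOSED';
    this.failureCount = 0;
    this.nextAttempt = Date.now();
  }

  // Move to HALF-OPEN once the retry period has elapsed
  checkState() {
    if (this.state === 'OPEN' && this.nextAttempt <= Date.now()) {
      this.state = 'HALF-OPEN';
    }
  }

  // A successful response fully closes the circuit again
  reset() {
    this.failureCount = 0;
    this.state = 'CLOSED';
  }

  recordFailure() {
    this.failureCount += 1;
    if (this.failureCount >= this.failureThreshold) {
      this.state = 'OPEN';
      // Don't retry serviceB until the retry period has passed
      this.nextAttempt = Date.now() + this.retryTimePeriod;
    }
  }

  async call(urlToCall) {
    this.checkState();
    if (this.state === 'OPEN') {
      throw new Error('Circuit is OPEN: failing fast');
    }
    const controller = new AbortController();
    const timer = setTimeout(() => controller.abort(), this.timeout);
    try {
      const response = await fetch(urlToCall, { signal: controller.signal });
      this.reset();
      return response;
    } catch (err) {
      this.recordFailure();
      throw err;
    } finally {
      clearTimeout(timer);
    }
  }
}
```

Note that a single success in the HALF-OPEN state closes the circuit, while a single failure re-opens it and restarts the retry timer.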

The next step is to modify our serviceA: we wrap our call inside the circuitBreaker we just created.

Important changes to note in this code with respect to the previous code:

We are initializing the circuitBreaker: const circuitBreaker = new CircuitBreaker(3000, 5, 2000);

We are calling the API via our circuit breaker: const response = await circuitBreaker.call('http://0.0.0.0:8000/flakycall');

That’s it! Now when we run our JMeter test again, we can see that serviceA no longer crashes and our error rate has gone down significantly.

Further reading