What is concurrency?

Processes

Running multiple processes is not actually about concurrency, it’s about parallelism. Although parallelism and concurrency are often confused, they are different things. I like this simple analogy:

Concurrency: having a person juggle many balls with only 1 hand. Regardless of how it seems, the person is only catching / throwing one ball at a time.

Parallelism: having multiple people juggle their own set of balls simultaneously.

Sequential execution

Imagine we have a range of numbers that we need to convert to an array, and then find the index of a specific element:

# sequential.rb

range = 0...10_000_000

number = 8_888_888

puts range.to_a.index(number)

$ time ruby sequential.rb

8888888

ruby sequential.rb 0.41s user 0.06s system 95% cpu 0.502 total

Executing this code takes approximately 500ms and utilizes 1 CPU.

Parallel execution

We can rewrite the code above to use multiple parallel processes, each working on one part of the range. With the fork method from the standard Ruby library we can create a child process and execute the code in the block. In the parent process we then wait for the children to finish: Process.wait waits for a single child, and Process.waitall waits for all of them:

# parallel.rb

range1 = 0...5_000_000

range2 = 5_000_000...10_000_000

number = 8_888_888

puts "Parent #{Process.pid}"

fork { puts "Child1 #{Process.pid}: #{range1.to_a.index(number)}" }

fork { puts "Child2 #{Process.pid}: #{range2.to_a.index(number)}" }

Process.waitall # wait for both children; Process.wait would return as soon as the first one exits

$ time ruby parallel.rb

Parent 32771

Child2 32867: 3888888

Child1 32865:

ruby parallel.rb 0.40s user 0.07s system 153% cpu 0.309 total

Because each process works in parallel on just half of the range, the code above runs a bit faster and uses more than 1 CPU. The process tree during the execution may look like this:

# \ - 32771 ruby parallel.rb (parent process)

# | - 32865 ruby parallel.rb (child process)

# | - 32867 ruby parallel.rb (child process)
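The children above can only print what they find. If the parent needs the result, one option is to give each child a pipe to write to. The following is a minimal sketch under a few assumptions: the file name, the parallel_index helper and its workers parameter are made up for illustration, the range is an exclusive integer range, and the platform supports fork.

```ruby
# parallel_pipe.rb (hypothetical file name)

# Splits an exclusive integer range between `workers` forked children and
# returns the index of `number` within the full range (or nil if absent).
def parallel_index(range, number, workers: 2)
  chunk = (range.size / workers.to_f).ceil

  readers = workers.times.map do |i|
    first = range.begin + i * chunk
    sub_range = first...[first + chunk, range.end].min

    reader, writer = IO.pipe
    fork do
      reader.close
      local = sub_range.to_a.index(number)
      # Translate the index inside the slice into an index inside the full range.
      writer.write(Marshal.dump(local && local + (first - range.begin)))
      writer.close
    end
    writer.close # the parent only reads from this pipe
    reader
  end

  results = readers.map do |reader|
    value = Marshal.load(reader.read) # read blocks until the child closes its end
    reader.close
    value
  end

  Process.waitall # reap all children
  results.compact.first
end

puts parallel_index(0...1_000_000, 888_888) # => 888888
```

Marshal is used here only because it is in the standard library; any serialization format that both sides agree on would work.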

Pros:

Processes don’t share memory, so you can’t mutate data from one process in another. It makes it much easier to code and debug.

Processes in MRI Ruby are the only way to utilize more than a single core, since there is a GIL (global interpreter lock; find more information below in the post). This may be useful if you’re doing, let’s say, some math calculations.

Forking child processes may help avoid unwanted memory leaks. Once the process finishes, it releases all the resources.

Cons:

Since processes don’t share memory, they use a lot of it, so running hundreds of processes may be a problem. Note that since Ruby 2.0 the garbage collector is copy-on-write friendly, so a forked process can share memory pages with its parent until one of them writes to those pages.

Processes are slow to create and destroy.

Processes may require inter-process communication, for example via DRb.

Beware of orphan processes (a child process whose parent has finished or terminated) or zombie processes (a child process which completed execution but still occupies space in the process table).
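A zombie disappears once the parent collects the exit status of the child. Here is a small sketch of two ways to do that with the standard Process module; the method names are made up for illustration and fork is assumed to be available:

```ruby
# Option 1: wait for the child explicitly and look at its exit status.
def wait_for_child
  pid = fork { exit 3 }
  _pid, status = Process.wait2(pid) # blocks until this child exits
  status.exitstatus
end

# Option 2: the parent does not need the result right away, so it detaches
# the child. Ruby starts a thread that reaps the child when it exits.
def detach_child
  pid = fork { sleep 0.1 }
  Process.detach(pid) # returns the reaper thread
end

puts "child exited with status #{wait_for_child}" # => child exited with status 3

reaper = detach_child
reaper.join # only for the demo, so the script does not finish before the child
```

Orphans are the opposite case: the parent is gone first, so neither option applies and the child keeps running under a new parent process.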

Examples:

Unicorn server — it loads the application, forks the master process to spawn multiple workers which accept HTTP requests.

Resque for background processing — it runs a worker, which executes each job sequentially in a forked child process.

Threads

Even though Ruby has used native OS threads since version 1.9, only one thread can be executing Ruby code at any given time within a single process, even if you have multiple CPUs. This is because MRI has a GIL, which also exists in other language implementations such as CPython.
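A rough way to observe this is to run the same CPU-bound work first sequentially and then in threads. This is only a sketch: count_primes, LIMIT and RUNS are arbitrary choices for the demo, and the timings depend on the machine and the Ruby implementation. On MRI the two numbers are usually close to each other, because the threads take turns holding the GIL.

```ruby
require 'benchmark'

# CPU-bound work: count the primes up to `limit` by trial division.
def count_primes(limit)
  (2..limit).count do |n|
    (2..Integer.sqrt(n)).none? { |divisor| (n % divisor).zero? }
  end
end

LIMIT = 20_000 # arbitrary size, chosen only to make the work measurable
RUNS = 4

sequential = Benchmark.realtime { RUNS.times { count_primes(LIMIT) } }

threaded = Benchmark.realtime do
  RUNS.times.map { Thread.new { count_primes(LIMIT) } }.each(&:join)
end

puts format('sequential: %.2fs, threaded: %.2fs', sequential, threaded)
```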

Why does the GIL exist?

There are a few reasons, for example:

It avoids race conditions within C extensions, so their authors don’t need to worry about thread-safety.

It is easier to implement, since Ruby’s internal data structures don’t need to be made thread-safe.

Back in 2014, Matz started thinking about gradually removing the GIL, because it doesn’t actually guarantee that our Ruby code is thread-safe and it prevents us from using better concurrency.

Race-conditions

Here is a basic example with a race-condition:

# threads.rb

@executed = false

def ensure_executed

unless @executed

puts "executing!"

@executed = true

end

end

threads = 10.times.map { Thread.new { ensure_executed } }

threads.each(&:join)

$ ruby threads.rb

executing!

executing!

We create 10 threads which execute our method and call join on each of them, so the main thread waits until all other threads are finished. The code printed executing! twice because our threads share the same @executed variable. Our read (unless @executed) and set (@executed = true) operations are not atomic as a pair, meaning that after one thread reads the value, another thread can read or change it before the first thread sets the new value.
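One common way to close this gap is to wrap the read and the write in a Mutex, so that only one thread at a time can run the check-then-set section. The sketch below moves the state into a small class purely for illustration (the Once class and its runs counter are not part of the original example):

```ruby
# threads_mutex.rb (hypothetical file name)

class Once
  attr_reader :runs

  def initialize
    @lock = Mutex.new
    @executed = false
    @runs = 0
  end

  def ensure_executed
    # Only one thread at a time can be inside this block, so the
    # check and the assignment behave as a single atomic step.
    @lock.synchronize do
      unless @executed
        puts "executing!"
        @runs += 1
        @executed = true
      end
    end
  end
end

once = Once.new
threads = 10.times.map { Thread.new { once.ensure_executed } }
threads.each(&:join)

puts once.runs # => 1
```

The price is that all threads now queue up at the lock, which is fine for a short critical section like this one.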

GIL and Blocking I/O

But having a GIL, which doesn’t allow multiple threads to execute at once, doesn’t mean that threads can’t be useful. A thread releases the GIL when it hits a blocking I/O operation such as an HTTP request, a DB query, writing to / reading from disk and even sleep:

# sleep.rb

threads = 10.times.map do |i|

Thread.new { sleep 1 }

end

threads.each(&:join)

$ time ruby sleep.rb

ruby sleep.rb 0.08s user 0.03s system 9% cpu 1.130 total

As you can see, all 10 threads slept for 1 second and finished at almost the same time. When one thread hit sleep, it released the GIL and passed execution to another thread.
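The same measurement can be taken from inside Ruby with Benchmark from the standard library. This is a sketch with an invented helper name (timed_sleepers) and a shorter sleep to keep the run quick; the exact number printed will vary from machine to machine:

```ruby
require 'benchmark'

# Starts `count` threads that each sleep for `seconds`
# and returns the total wall-clock time in seconds.
def timed_sleepers(count, seconds)
  Benchmark.realtime do
    count.times.map { Thread.new { sleep seconds } }.each(&:join)
  end
end

elapsed = timed_sleepers(10, 0.2)
puts format('10 threads sleeping 0.2s each took %.2fs in total', elapsed)
```

If the threads could not overlap, the total would be about 2 seconds; because sleep releases the GIL, it should stay close to 0.2 seconds.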

Pros:

Uses less memory than processes; it’s possible to run thousands of threads. They are also fast to create and destroy.

Threads are useful when there are slow blocking I/O operations.

Threads can access memory shared with other threads if necessary.

Cons:

Requires very careful synchronization to avoid race-conditions, usually by using locking primitives, which sometimes may lead to deadlocks. All this makes it quite difficult to write, test and debug thread-safe code.

With threads you have to make sure that not only your code is thread-safe, but that any dependencies you’re using are also thread-safe.

The more threads you spawn, the more time and resources are spent on context switching, and the less time is spent doing the actual job.

Examples:

Puma server — allows you to use multiple threads in each process (clustered mode). Similarly to Unicorn, it preloads the app and forks the master process, and each child process has its own thread pool. Threads work fine in most cases because each HTTP request can be handled in a separate thread and we don’t share a lot of resources between the requests.

Sidekiq for background processing — runs a single process with 25 threads by default. Each thread processes one job at a time.
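Both servers rely on a fixed-size thread pool, which also limits the context-switching cost mentioned above. The sketch below shows the general idea with Queue from the Ruby core library; it is not how Puma or Sidekiq are actually implemented, and process_in_pool and pool_size are names made up for this example. It assumes Ruby 2.3 or newer (for Queue#close) and that no job is nil or false.

```ruby
# Runs the given block for every job using a fixed number of worker threads
# and returns the results (in completion order, not in job order).
def process_in_pool(jobs, pool_size: 5)
  queue = Queue.new
  jobs.each { |job| queue << job }
  queue.close # once closed and empty, pop returns nil instead of blocking

  results = Queue.new
  workers = pool_size.times.map do
    Thread.new do
      # Each worker handles one job at a time until the queue is drained.
      while (job = queue.pop)
        results << yield(job)
      end
    end
  end
  workers.each(&:join)

  Array.new(results.size) { results.pop }
end

squares = process_in_pool(1..20, pool_size: 4) do |n|
  sleep 0.01 # stands in for a blocking I/O call
  n * n
end

puts squares.sort.first(5).inspect # => [1, 4, 9, 16, 25]
```

Queue is thread-safe, so the workers need no explicit locking to take jobs or store results.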

EventMachine

EventMachine (aka EM) is a gem written in C++ and Ruby. It provides event-driven I/O using the Reactor pattern and can basically make your Ruby code look like Node.js :) Under the hood, EM by default uses the select() system call on each run through the event loop to check for new input on file descriptors.

One common reason to use EventMachine is the case when you have a lot of I/O operations and you don’t want to deal with threads manually. Manually handling threads can be difficult or often too expensive from a resource usage point of view. With EM you can handle multiple HTTP requests with a single thread by default.
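To make the Reactor pattern less abstract, here is a toy event loop built on IO.select from the Ruby core library. It is a sketch of the idea only, not of EventMachine’s internals; the TinyReactor class and its methods are invented for this example, and the producer thread merely stands in for a remote peer sending data.

```ruby
# A toy reactor: one loop that waits until some IO is readable
# and then calls the handler registered for it.
class TinyReactor
  def initialize
    @handlers = {}
  end

  def watch(io, &handler)
    @handlers[io] = handler
  end

  def run
    until @handlers.empty?
      readable, = IO.select(@handlers.keys) # blocks until something is readable
      readable.each do |io|
        data = io.read_nonblock(4096, exception: false)
        case data
        when :wait_readable
          next # nothing to read after all, try again on the next tick
        when nil
          @handlers.delete(io) # EOF: the other side closed the connection
          io.close
        else
          @handlers[io].call(data)
        end
      end
    end
  end
end

reader, writer = IO.pipe
received = []

reactor = TinyReactor.new
reactor.watch(reader) { |data| received << data }

producer = Thread.new do
  %w[one two three].each do |word|
    writer.write(word)
    sleep 0.05
  end
  writer.close
end

reactor.run
producer.join
puts received.join # => onetwothree
```

The handlers run one after another on the loop’s thread, which is why each of them has to return quickly.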

# em.rb

require 'eventmachine'

EM.run do

EM.add_timer(1) do

puts 'sleeping...'

EM.system('sleep 1') { puts "woke up!" }

puts 'continuing...'

end

EM.add_timer(3) { EM.stop }

end

$ ruby em.rb

sleeping...

continuing...

woke up!

The example above shows how to run asynchronous code by executing EM.system (I/O operation) and passing a block as a callback, which will be executed once the system command has finished.

Pros:

It’s possible to achieve great performance for slow networked apps such as web servers and proxies with a single thread.

It allows you to avoid complex multithreaded programming, the disadvantages of which were described above.

Cons:

Every I/O operation should support EM asynchrony. This means that you should use EM-specific versions of system, the DB adapter, the HTTP client, etc., which can result in monkey-patched versions, lack of support and limited options.

Work done within the main thread per event-loop tick should be small. It’s also possible to use EM.defer, which executes the code in a separate thread from a thread pool; however, that may lead to the multithreaded problems discussed earlier.

Hard to program complex systems because of the error handling and callbacks. Callback Hell is also possible in Ruby, but it can be prevented with Fibers, see below.

EventMachine itself is a huge dependency: 17K LOC (lines of code) in Ruby and 10K LOC in C++.

Examples:

Goliath — single threaded asynchronous server.

AMQP — RabbitMQ client. However, creators of the gem suggest using the non-EM-based version Bunny. Note that migrating tools to EM-less implementations is a general trend. For example, creators of ActionCable decided to use low-level nio4r, creator of sinatra-synchrony rewrote it with Celluloid, etc.

Fibers

Fibers are lightweight primitives in the Ruby standard library which can be paused, resumed and scheduled manually. They are pretty much the same as ES6 Generators if you’re familiar with JavaScript (we also wrote a post about Generators and Redux-Saga). It’s possible to run tens of thousands of Fibers within a single thread.
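As a quick illustration of pausing and resuming, here is a sketch of a generator-style Fiber that produces Fibonacci numbers on demand. The fibonacci_fiber helper is made up for this example:

```ruby
# Returns a Fiber that hands out the next Fibonacci number on every resume.
def fibonacci_fiber
  Fiber.new do
    a, b = 0, 1
    loop do
      Fiber.yield a # pause here and hand `a` to whoever called resume
      a, b = b, a + b
    end
  end
end

fib = fibonacci_fiber
puts 6.times.map { fib.resume }.inspect # => [0, 1, 1, 2, 3, 5]
```

Nothing runs between two calls to resume; the Fiber simply keeps its local state until it is resumed again.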

Often, Fibers are used with EventMachine to avoid callbacks and make code look synchronous. So, the following code:

require 'em-http-request'

EventMachine.run do

page = EM::HttpRequest.new('https://google.ca/').get

page.errback { puts "Google is down" }

page.callback {

url = 'https://google.ca/search?q=universe.com'

about = EM::HttpRequest.new(url).get

about.errback { ... }

about.callback { ... }

}

end

Can be rewritten like:

EventMachine.run do

Fiber.new {

page = http_get('http://www.google.com/')

if page.response_header.status == 200

about = http_get('https://google.ca/search?q=universe.com')

# ...

else

puts "Google is down"

end

}.resume

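end

# Helper used above. In a real script it has to be defined before EventMachine.run is called.

# On older Ruby versions, Fiber.current also needs require 'fiber'.

def http_get(url)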

current_fiber = Fiber.current

http = EM::HttpRequest.new(url).get

http.callback { current_fiber.resume(http) }

http.errback { current_fiber.resume(http) }

Fiber.yield

end

So, basically, Fiber.yield returns control to the context that resumed the Fiber. When the Fiber is resumed again, Fiber.yield evaluates to the value which was passed to Fiber#resume.
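The same hand-off can be tried without EventMachine. In the sketch below, fake_async_get and the PENDING array stand in for a callback-based client and its event loop; both are invented for this example, as are sync_get and run_fiber_demo.

```ruby
require 'fiber' # needed for Fiber.current on older Ruby versions

PENDING = [] # callbacks waiting to be fired by our pretend event loop

# Pretend async API: remembers the callback instead of calling it right away.
def fake_async_get(url, &callback)
  PENDING << -> { callback.call("response for #{url}") }
end

# Looks synchronous to the caller, but pauses the current Fiber until
# the callback fires and resumes it with the response.
def sync_get(url)
  fiber = Fiber.current
  fake_async_get(url) { |response| fiber.resume(response) }
  Fiber.yield # evaluates to the value passed to fiber.resume
end

def run_fiber_demo
  log = []

  Fiber.new do
    log << sync_get('/first')
    log << sync_get('/second')
  end.resume

  # The pretend event loop: fire callbacks until none are left.
  PENDING.shift.call until PENDING.empty?

  log
end

puts run_fiber_demo.inspect
# => ["response for /first", "response for /second"]
```

The code inside the Fiber reads top to bottom like blocking code, while the actual waiting happens in the loop outside of it.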

Pros:

Fibers allow you to simplify asynchronous code by replacing nested callbacks.

Cons:

Don’t really solve concurrency problems.

They are rarely used directly in application-level code.

Examples:

em-synchrony — a library, written by Ilya Grigorik, a performance engineer at Google, which integrates EventMachine with Fibers for different clients such as MySQL2, Mongo, Memcached, etc.

Conclusion

There is no silver bullet, so choose a concurrency model depending on your needs. For example, need to run CPU and memory intensive code and have enough resources — use processes. Have to execute multiple I/O operations such as HTTP requests — use threads. Need to scale up to the maximum throughput — use EventMachine.

In the second part of this series we will take a look at such concurrency models as Actors (Erlang, Scala), Communicating Sequential Processes (Go, Crystal), Software Transactional Memory (Clojure) and of course Guilds — a new concurrency model which may be implemented in Ruby 3. Stay tuned!