We’ve come to rely on integration tests as part of a balanced testing approach. They support outside-in development, they catch regressions, and they bridge the gap between Ruby and JavaScript unit tests.

However, integration tests involving Ruby and JavaScript are fraught with danger. Developers frequently complain of tests which fail erratically. Debugging these tests can be something of a mystery: the logs show records being inserted into your test database, but they somehow don’t show up on the page. These annoyances drive some developers to abandon integration tests entirely.

Why are integration tests with JavaScript so much harder?

First, a little background.

Integration tests in Rails applications work by simulating a user’s experience through the HTML interface. You load up your Rails application, and your integration test harness (such as Capybara) simulates a user clicking around on the site.

Without JavaScript, Capybara uses rack-test as its driver by default, and an interaction looks something like this:

1. You trigger user action by invoking one of Capybara’s DSL methods, such as visit.
2. Capybara tells its driver (rack-test) to load the requested URL.
3. rack-test uses the URL to generate a fake Rack request and passes it directly to the in-memory instance of your Rails application.
4. The Rack response is parsed by rack-test and saved as the current page.
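The rack-test flow above can be sketched in plain Ruby. This is a minimal illustration, not rack-test’s actual implementation: `app` stands in for your Rails application, `FakeDriver` is a hypothetical name, and the env hash is pared down to two keys, far fewer than a real Rack request carries.

```ruby
# Stand-in for a Rails application: any object that responds to `call(env)`
# and returns [status, headers, body] satisfies the Rack interface.
app = lambda do |env|
  if env["PATH_INFO"] == "/"
    [200, { "Content-Type" => "text/html" }, ["<a href='/about'>About</a>"]]
  else
    [404, { "Content-Type" => "text/html" }, ["Not found"]]
  end
end

# Hypothetical, heavily simplified driver: no sockets, no threads.
class FakeDriver
  attr_reader :status, :current_page

  def initialize(app)
    @app = app
  end

  # Build a fake request and hand it straight to the in-memory app.
  def visit(path)
    env = { "REQUEST_METHOD" => "GET", "PATH_INFO" => path }
    @status, _headers, body = @app.call(env)
    @current_page = body.join
  end
end

driver = FakeDriver.new(app)
driver.visit("/")
puts driver.current_page
```

Because `visit` returns only after the application has produced its response, a test built on this kind of driver never observes a half-loaded page.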

Other actions like click_link work similarly, and rack-test will do things like save cookies, follow redirects, and post forms to make it feel like a real browser. This simulated browser can do many of the things a real browser can, but it’s missing one killer feature: JavaScript.

JavaScript drivers work a little differently. Because browser tools like WebKit are difficult to load inside of a Ruby process, these drivers boot up an external process which can interact with the browser engine. Because the external process doesn’t have access to the in-memory instance of your Rails application, it must make actual HTTP requests.

Interactions performed using Capybara’s JavaScript drivers look something like this:

1. You trigger user action by invoking one of Capybara’s DSL methods, such as visit.
2. Capybara tells its driver (such as capybara-webkit) to load the requested URL.
3. The driver starts its external process to hold the browser engine.
4. In order to serve actual HTTP requests, Capybara boots a new instance of your Rails application in a background thread.
5. The driver’s browser process uses the URL to create a real HTTP request which is passed to your application server, such as Thin.
6. Thin translates the HTTP request into a Rack request, which is passed to your Rails application in the background thread.
7. Thin also translates your Rack response into a real HTTP response, which is accepted by the browser process.
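The moving parts in this flow can be imitated with Ruby’s standard library alone. In the sketch below, a `TCPServer` running in a background thread plays the role of Thin, `Net::HTTP` plays the role of the browser process, and `app` is a stand-in for the Rails application. The helper name `serve_one_request` is made up for this example, and the server handles exactly one request with none of the parsing or error handling a real server needs.

```ruby
require "socket"
require "net/http"

# Stand-in for the Rails application (a bare Rack-style callable).
app = ->(_env) { [200, { "Content-Type" => "text/html" }, ["<a href='/next'>Next</a>"]] }

# Stand-in for Thin: turn one HTTP request into a Rack-style call and back.
def serve_one_request(server, app)
  client = server.accept
  verb, path = client.gets.split(" ")
  nil while (line = client.gets) && line != "\r\n" # skip the request headers
  status, headers, body = app.call("REQUEST_METHOD" => verb, "PATH_INFO" => path)
  payload = body.join
  client.write("HTTP/1.1 #{status} OK\r\n")
  headers.each { |name, value| client.write("#{name}: #{value}\r\n") }
  client.write("Content-Length: #{payload.bytesize}\r\nConnection: close\r\n\r\n")
  client.write(payload)
  client.close
end

server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
port = server.addr[1]

# The application answers in a background thread, as it does under Capybara.
app_thread = Thread.new { serve_one_request(server, app) }

# A real HTTP request over a real socket; the nil disables any proxy
# that might be configured in the environment.
response = Net::HTTP.start("127.0.0.1", port, nil) { |http| http.get("/") }

app_thread.join
server.close
puts response.body
```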

There’s a lot of extra machinery in here to make this work: process forking, HTTP requests, and background threads. However, there’s really only one bump which frequently affects users: that pesky background thread.

If requests are served in a background thread, that means that your tests keep running while your application responds to simulated interactions. This provides for an endless number of race conditions, where your tests look for elements on the page which have not appeared yet.

Much of Capybara’s source code is dedicated to battling this asynchronous problem. Capybara is smart enough to understand that a page may not have loaded by the time you try to interact with it. A typical interaction might look like this:

| Test Thread | Application Thread |
| --- | --- |
| Your test invokes visit. | Waiting for a request. |
| Capybara tells the driver to load the page. | Waiting for a request. |
| The driver performs a request. | Waiting for a request. |
| Your test invokes click_link. | Your application receives the request. |
| Capybara looks for the link on the page, but it isn’t there. | Your application sends a response. |
| Capybara tries to find the element again, but it’s not there. | The driver receives the response. |
| Capybara successfully finds the element from the response. | Waiting for a request. |

As you can see, Capybara handles this interaction gracefully, even though the test starts looking for a link to click on before the page has finished loading.
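The retry behavior in that timeline boils down to a polling loop with a deadline. The sketch below shows the idea in plain Ruby; it is not Capybara’s code. `FakePage` and `wait_until` are hypothetical names, and the 0.1 second delay simulates a response arriving from the application thread.

```ruby
# Hypothetical stand-in for a page whose content arrives from another thread.
class FakePage
  def initialize
    @elements = []
    @lock = Mutex.new
  end

  def add(selector)
    @lock.synchronize { @elements << selector }
  end

  def include?(selector)
    @lock.synchronize { @elements.include?(selector) }
  end
end

# Hypothetical helper: re-run the block until it returns something truthy,
# or raise once `timeout` seconds have elapsed.
def wait_until(timeout: 2, interval: 0.01)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    result = yield
    return result if result
    if Process.clock_gettime(Process::CLOCK_MONOTONIC) > deadline
      raise "timed out after #{timeout} seconds"
    end
    sleep interval
  end
end

page = FakePage.new

# The "application thread" delivers the element a little later.
app_thread = Thread.new do
  sleep 0.1
  page.add(".active")
end

# A single immediate check races the application thread and usually loses.
found_immediately = page.include?(".active")

# Polling keeps asking until the element shows up.
found_after_wait = wait_until { page.include?(".active") }

app_thread.join
puts "immediate: #{found_immediately}, after waiting: #{found_after_wait}"
```

The tips that follow all come down to keeping your test inside a loop like this one rather than taking a single snapshot of the page.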

However, if Capybara handles these asynchronous issues for you, why is it so easy to write flapping tests with Capybara, where sometimes the tests pass and sometimes they fail?

There are a few tricks to properly using the Capybara API so as to minimize the number of possible race conditions.

Bad:

first(".active").click

If there isn’t an .active element on the page yet, first will return nil and the click will fail.

Good:

# If you want to make sure there's exactly one
find(".active").click
# If you just want the first element
find(".active", match: :first).click

Capybara will wait for the element to appear before trying to click. Note that match: :first is more brittle, because it will silently click on a different element if you introduce new elements which match.

Bad:

all(".active").each(&:click)

If there are no matching elements yet, an empty array will be returned, and no elements will be affected.

Good:

find(".active", match: :first)
all(".active").each(&:click)

Capybara will wait for the first matching element before trying to click on the rest.

Note: there is usually a better way to test things than iterating over matching elements, but that is beyond the scope of this post. Think carefully before using all.

Bad:

execute_script("$('.active').focus()")

The JavaScript expression may be evaluated before the page has finished updating, so the wrong element, or no element at all, may be affected.

Good:

find(".active")
execute_script("$('.active').focus()")

Capybara will wait until a matching element is on the page, and then dispatch a JavaScript command which interacts with it.

Note: execute_script should only be used as a last resort when running into driver limitations or other issues which make it impossible to use other Capybara methods.

Bad:

expect(find_field("Username").value).to eq("Joe")

Capybara will wait for the matching element and then immediately return its value. If the value changes from a page load or Ajax request, it will be too late.

Good:

expect(page).to have_field("Username", with: "Joe")

Capybara will wait for a matching element and then wait until its value matches, up to two seconds.

Bad:

expect(find(".user")["data-name"]).to eq("Joe")

Capybara will wait for the matching element and then immediately return the requested attribute. If the attribute is set or changed after the element appears, the assertion will already have run.

Good:

expect(page).to have_css(".user[data-name='Joe']")

Capybara will wait for the element to appear and have the correct attribute.

Bad:

it "doesn't have an active class name" do
  expect(has_active_class).to be false
end
def has_active_class
  has_css?(".active")
end

Capybara will immediately return true if the element hasn’t been removed from the page yet, causing the test to fail. It will also wait two seconds before returning false, meaning the test will be slow when it passes.

Good:

it "doesn't have an active class name" do
  expect(page).not_to have_active_class
end
def have_active_class
  have_css(".active")
end

Capybara will wait up to two seconds for the element to disappear before failing, and will pass immediately when the element isn’t on the page as expected.
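The difference between the two versions is the question being polled. The plain-Ruby sketch below imitates both; `now` and `eventually?` are hypothetical helpers rather than Capybara methods, and the shared array relies on MRI making single `Array` operations safe enough for a demonstration.

```ruby
# Monotonic clock, so elapsed time is unaffected by system clock changes.
def now
  Process.clock_gettime(Process::CLOCK_MONOTONIC)
end

# Hypothetical polling helper: true as soon as the block is truthy,
# false once `timeout` seconds have passed without that happening.
def eventually?(timeout: 2, interval: 0.01)
  deadline = now + timeout
  until yield
    return false if now > deadline
    sleep interval
  end
  true
end

elements = [".active"] # the element is still on the page

# The page update that removes the element arrives a little later.
remover = Thread.new do
  sleep 0.1
  elements.delete(".active")
end

# "Bad" shape: poll for presence, then negate the answer. Presence is
# confirmed at once, so the negation is false before the removal happens.
present = eventually? { elements.include?(".active") }
bad_result = !present

# "Good" shape: poll for absence, which keeps asking until the removal lands.
good_result = eventually? { !elements.include?(".active") }

remover.join
puts "bad: #{bad_result}, good: #{good_result}"
```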

When interacting with the page, use action methods like click_on instead of finder methods like find whenever possible. Capybara knows the most about what you’re doing with those methods, and can more intelligently handle odd edge cases.