In today’s world of complex web applications, developers often struggle to understand the nuances of user interactions when production applications fail. For example, a client-side error is reported in the wild but we have no idea how or why it happens.

So we play detective and try to piece together various clues. We rely on logs and error reporting to give us insight into how our system failed, but these data points are not interactive. So we come up with steps to reproduce the bug in our local development setup.

Why can’t our application play back what happened to trigger the error? After all, the application was able to reach the error to begin with. In the next sections we will go over how we use a support viewer to help debug client errors. We leverage the simplicity of Om Next and the Untangled framework to make this possible. This feature allows developers to step through the history of user actions, replay these actions in development builds, and even run tests against these actions.

Stepping through a user session

Debugging With Sessions

The support viewer component will use a unique URL for each user session that is saved. Simply visiting this link will load a user session you can replay, as demonstrated in the first gif. Click forward or backwards to see each user action. You can also use these URLs in admin dashboards, share them with team members, and log them in post-mortems.

While observing user sessions is insightful, the real benefit comes from debugging these sessions. For example, we once had a production error involving a chart widget incorrectly parsing a date string. Our web application plots a bunch of different sources on this chart component, and at some point we started to see inconsistencies among string formats. Soon some users ran into an edge case where our charts wouldn’t load correctly when they picked specific options for a chart.

So a developer loaded a user session in his local development environment and observed the error as the user saw it. The developer was able to modify the code until the charts started rendering properly.

Fixing a chart render bug in a user’s session.

The only prerequisite to load user sessions in a development build is to add CORS support to your production server. Then you can take a URL the support viewer provides you and change the hostname to localhost. Now you’ve got a reproduction case!
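As a small illustration of the hostname swap, here is a sketch in plain Clojure. The example URL, its path, and the local port are assumptions made up for this example; use whatever your support viewer and dev build actually produce.

```clojure
(require '[clojure.string :as str])

(defn ->local-url
  "Swap the scheme and host of a support viewer URL for a local dev server.
  `local-origin` is an assumption: use whatever your dev build serves on."
  [url local-origin]
  (str/replace-first url #"^https?://[^/]+" local-origin))

;; Hypothetical production URL and local port.
(->local-url "https://app.example.com/support?id=42" "http://localhost:3000")
;; => "http://localhost:3000/support?id=42"
```

The path and query string are left untouched, so the session id still reaches the production endpoint the viewer fetches from.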

How It Works: Om Next State Management

Om Next’s focus on declarative components and a global app state makes setting up a Support Viewer component trivial. Om Next renders the UI from a global atom via a reconciler. Moving state out of component local state means we can treat UI render as a pure function. When we stepped through the session history earlier, we were changing the value (the global app state) that was passed to the render function. When we changed the code, we were changing the render function itself.

The reconciler renders the UI using the global app state.
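To make the "render is a pure function of app state" idea concrete, here is a minimal sketch in plain Clojure. It does not use Om Next; the `render` function, the state keys, and the Hiccup-style markup are all hypothetical stand-ins.

```clojure
;; Hypothetical render function: plain data in, markup (as data) out.
(defn render [app-state]
  [:div
   [:h1 (:title app-state)]
   (when (:modal-open? app-state)
     [:div.modal "Are you sure?"])])

;; A made-up session history: two snapshots of the global app state.
(def history
  [{:title "Dashboard" :modal-open? false}
   {:title "Dashboard" :modal-open? true}])

;; Stepping through the history changes only the argument to render.
(render (first history))
;; => [:div [:h1 "Dashboard"] nil]
(render (second history))
;; => [:div [:h1 "Dashboard"] [:div.modal "Are you sure?"]]
```

Because the same state always produces the same markup, replaying a saved state reproduces what the user saw, and editing `render` changes what every saved state looks like.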

Updates to the app state are applied via reified transactions, and each successful transaction returns a new app state with a UUID. Om Next also records the last 100 app states of the application by default. This makes it trivial to traverse the history of user actions. If we save this collection of app states while a user is interacting with our application, we can later render our application with that user’s session. Further, if we save this collection when errors are thrown we have a reproduction case for bugs.

Transactions create a new App State to pass to the render function.
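The bounded history described above can be modeled in a few lines of plain Clojure. This is only a sketch of the idea, not Om Next's implementation: `record-state`, `transact`, and the shape of the history map are assumptions for illustration.

```clojure
(def history-limit 100) ; mirrors the default mentioned above

(def empty-history {:order [] :states {}})

(defn record-state
  "Store `state` under `id`, evicting the oldest entry once past `limit`."
  [{:keys [order states]} limit id state]
  (let [order  (conj order id)
        states (assoc states id state)]
    (if (> (count order) limit)
      {:order  (vec (rest order))
       :states (dissoc states (first order))}
      {:order order :states states})))

(defn transact
  "Apply a pure mutation to the latest state and record the result under `id`."
  [history limit id mutation]
  (let [current (get (:states history) (peek (:order history)) {})]
    (record-state history limit id (mutation current))))

;; Five transactions with a limit of 3 keep only the last three states.
;; `random-uuid` needs Clojure 1.11+ (or ClojureScript).
(def h
  (reduce (fn [h n] (transact h 3 (random-uuid) #(assoc % :clicks n)))
          empty-history
          (range 1 6)))

(map #(:clicks (get (:states h) %)) (:order h))
;; => (3 4 5)
```

Saving a structure like this, either continuously or at the moment an error is thrown, is what gives you a replayable session.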

Setting up the Untangled Support Viewer

The Untangled support viewer component works nicely out of the box, but there are some improvements you might consider when using it in production. We decided to implement the start-untangled-support-viewer function ourselves to add some features. The goal of this function is to mount two Om Next applications on the page. The first is our original app with the initial state it is generally loaded with. The second is a tiny support viewer app that continually swaps out the global atom with a new one as you click through the history. You can see the support viewer app in the top left corner of the first gif.

To mount the original app we provide the initial state from our main build and some additional information such as :ui/support-viewer? and :ui/no-router. These extra values turn off things like error reporting, the router, and web tracking.
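Here is one way the extra flags could be merged in and checked, sketched in plain Clojure. Only the two keywords come from our setup; `viewer-initial-state` and `side-effects-enabled?` are hypothetical helpers, and the sample state is made up.

```clojure
;; The two flags named above.
(def viewer-flags
  {:ui/support-viewer? true
   :ui/no-router       true})

(defn viewer-initial-state
  "Initial state for the replayed app: the main build's state plus the flags."
  [initial-state]
  (merge initial-state viewer-flags))

(defn side-effects-enabled?
  "Hypothetical guard: code that reports errors or tracks page views
  would check this before running."
  [app-state]
  (not (:ui/support-viewer? app-state)))

(side-effects-enabled? {:user/name "Ada"})                        ;; => true
(side-effects-enabled? (viewer-initial-state {:user/name "Ada"})) ;; => false
```

Keeping the switch in app state means a replayed session cannot report fresh errors or pollute analytics while a developer steps through it.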

The viewer app is pointed to our production endpoint, allowing the viewer to load one of the user sessions that are stored in the production database via the :started-callback. This callback runs a function to fetch the user session, passing the :id URL parameter to the mutation. You will probably want to point your viewer to your development server when you set up or debug the Untangled support viewer.

Saving Client Errors

So when do we want to save a session, so that developers can use it as a debugging tool? One approach could be to make a support ticket component that lets users submit a ticket along with their session history. You can see an example of this in the Untangled TodoMVC.

We decided to use Airbrake to save user sessions on each production error. Airbrake’s onerror handler accepts custom filters where you can add logic to run when an error is thrown. We add a custom filter to save a user session for each client error. When the app mounts, our filter gets added to the handler via the :started-callback that Untangled exposes.
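The shape of such a filter can be sketched in plain Clojure. This is a stand-in, not Airbrake's API: the notice is modeled as a map, and `current-history` and `save-session!` are hypothetical functions you would supply. The sketch assumes a filter hands the notice back so that normal error reporting continues.

```clojure
(defn session-saving-filter
  "Build a filter that receives an error notice, saves the session as a
  side effect, and returns the notice unchanged.
  `current-history` and `save-session!` are hypothetical."
  [current-history save-session!]
  (fn [notice]
    (save-session! {:error   (:message notice)
                    :history (current-history)})
    notice))

;; Usage with an in-memory stand-in for the server call.
(def saved (atom []))

(def on-error
  (session-saving-filter (fn [] [{:page :home} {:page :charts}])
                         #(swap! saved conj %)))

(on-error {:message "Invalid date"})
(count @saved) ;; => 1
```

Passing the history getter and the save function in as arguments keeps the filter easy to test without a browser or a network.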

We even take this idea a step further and set up a method to ping our team whenever a client error happens.

Sumologic is a log management tool that we use at AdStage. It has a feature that lets you continuously run a search on your server logs. You can also instruct Sumologic to send an email if the search finds any matches. Since we add a server log whenever a user session is saved, we can set up a Sumologic search to email our team messaging channel with error reports.

Our error reports include the support viewer URL mentioned earlier, which loads a user session to replay. So now when a user hits a production error, we get a notification of the incident along with a URL to reproduce the case locally. At this point, it’s hard to ignore bugs!

Flowdock messages about production errors, each with an attached user session.

Admin panel to access a User’s session.

Testing Sessions

We’ve started running invariant testing after being inspired by a talk by Sebastian Bensusan. Mostly we have been inspecting the collection of app states on the server and asserting that none get into an invalid state. If any do, we log the invalid state and message our team.

Indeed, we have caught a few regressions with this approach.
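An invariant check over a saved collection of app states might look like the following sketch in plain Clojure. The two invariants and the state keys are made up for illustration; real invariants depend on your own app state.

```clojure
;; Hypothetical invariants: each is a predicate over one app state.
(def invariants
  {:selected-tab-exists
   (fn [state] (contains? (set (:tabs state)) (:selected-tab state)))
   :no-negative-budget
   (fn [state] (not (neg? (:budget state 0))))})

(defn violations
  "Return [index invariant-label] for every state that breaks an invariant."
  [invariants states]
  (for [[idx state] (map-indexed vector states)
        [label ok?] invariants
        :when (not (ok? state))]
    [idx label]))

(violations invariants
            [{:tabs [:a :b] :selected-tab :a :budget 10}
             {:tabs [:a :b] :selected-tab :c :budget 10}])
;; => ([1 :selected-tab-exists])
```

The index tells you which step of the session first went wrong, which is a useful starting point when you open that session in the viewer.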

However, render bugs are only a subset of the class of bugs we see in a production web application. It would be interesting to also record and test the transitions between app states. After all, the majority of bugs in our applications tend to exist around state change.

Sebastian describes an approach to render an application in an automated browser and run browser/integration tests against each user action. For example, if we run a user action to close a modal, we could verify that a modal is closed. Or we could run a mutation that submits a form, and verify the application transitions to the correct page and loads the proper data from the server.
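Without a browser, the same idea can be approximated by replaying recorded actions against pure mutation functions and checking a postcondition after each one. The sketch below is plain Clojure under a loud assumption: that a list of recorded `[mutation-key params]` pairs is available. The mutation and postcondition tables are hypothetical.

```clojure
;; Hypothetical pure mutations: (state, params) -> new state.
(def mutations
  {:modal/close (fn [state _] (assoc state :modal-open? false))
   :tab/select  (fn [state {:keys [tab]}] (assoc state :selected-tab tab))})

;; Hypothetical postconditions: (before, after, params) -> boolean.
(def postconditions
  {:modal/close (fn [_ after _] (false? (:modal-open? after)))
   :tab/select  (fn [_ after {:keys [tab]}] (= tab (:selected-tab after)))})

(defn replay
  "Apply recorded [mutation-key params] pairs in order and collect every
  action whose postcondition fails."
  [mutations postconditions initial-state actions]
  (:failures
   (reduce (fn [{:keys [state failures]} [k params :as action]]
             (let [after ((mutations k) state params)
                   ok?   ((postconditions k) state after params)]
               {:state    after
                :failures (if ok? failures (conj failures action))}))
           {:state initial-state :failures []}
           actions)))

(replay mutations postconditions
        {:modal-open? true}
        [[:modal/close {}] [:tab/select {:tab :charts}]])
;; => []
```

Counting how often each mutation key shows up in the failures across many sessions would give a rough failure rate per mutation.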

A mutation with a high failure rate would suggest itself as a candidate for refactoring, which could help us make our web application more robust. Right now, Om Next does not record mutations out of the box. However, there are some open issues to add this feature in the future.