The read timeout is how long you're willing to spend reading data from the server once the connection is open. It's common for this to be set higher than the open timeout, since waiting for a server to generate an answer (run queries, fetch data, serialize it, etc.) should take longer than opening the connection.

When you see the term "timeout" on its own (not an open timeout or a read timeout), it usually means the total timeout. Faraday takes `timeout = 5` and `open_timeout = 2` to mean "the server must have the connection open within 2 seconds, and then, regardless of how long that took, it only has 5 seconds to finish responding."
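To make this concrete, here is a minimal sketch of setting both values with Ruby's standard-library Net::HTTP (the hostname is hypothetical):

```ruby
require "net/http"

http = Net::HTTP.new("api.example.com", 443)
http.use_ssl = true
http.open_timeout = 2 # seconds allowed to establish the connection
http.read_timeout = 5 # seconds to wait while reading the response
```

One caveat worth knowing: Net::HTTP's `read_timeout` applies to each individual socket read, not to the whole response, so a slow-but-steady server can take longer than 5 seconds in total without ever tripping it.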

Some must die, so that others may live

Any time spent waiting for a request that may never come is time that could be spent doing something useful. When the HTTP call is coming from a background worker, that's a worker blocked from processing other jobs. Depending on how your background workers are configured, the same threads might be shared across multiple jobs. If Job X is stuck for 30s waiting on a failing server, Job Y and Job Z will not be processed, or will be processed incredibly slowly.

That same principle applies when the HTTP call is coming from a web thread. That's a web thread that could have been handling other requests! For example, say `POST /payment_attempts` makes an HTTP call from the web thread. That call is usually super quick, but unfortunately, some service it talks to is now blocking it for 30s. Other endpoints usually respond in 100ms, and they will continue to respond so long as there are threads available in the various workers... but if the dependency's performance issues continue, every time a user hits `POST /payment_attempts`, another thread becomes unavailable for that 30s.

Let's do a bit of math. Each stuck thread is blocked for 30s, and most requests complete in 100ms, so that's 300 potential requests not being handled (30,000ms / 100ms). 300 requests lost because of a single endpoint. There will be fewer and fewer available workers, and given enough traffic to that payment endpoint, there might be zero workers left to handle traffic to any other endpoint.

Setting that timeout to 10s instead would free each stuck thread 20s earlier, allowing roughly 200 more requests to be handled.
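The arithmetic can be checked directly (integer milliseconds keep it exact):

```ruby
stuck_ms   = 30_000                 # thread blocked for 30s
request_ms = 100                    # a typical request takes 100ms

lost_at_30s = stuck_ms / request_ms      # requests missed per stuck thread
lost_at_10s = 10_000 / request_ms        # with a 10s timeout instead
recovered   = lost_at_30s - lost_at_10s  # requests won back per stuck thread
```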

As general advice: please try to avoid making requests from the web thread; put them in background jobs whenever possible. Sometimes it is unavoidable, but go to painstaking lengths to avoid it.

Timing out early isn't really about delivering a faster error to the caller. The most important benefit of failing fast is freeing the thread, giving other workers the chance to work.

Figuring Out What Timeout to Set

If the server belongs to a third-party company, you might have a service-level agreement stating: "Our API will always respond within 150ms". Great, set it to 150ms (and retry on failure if the call is important.)
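A retry can be as simple as a small helper wrapped around the call. This `with_retries` helper is hypothetical, and it's only safe when the request is idempotent (retrying a POST that charges a card is not!):

```ruby
require "net/http"

# Hypothetical helper: run a block, retrying on timeout errors.
# Only use this when repeating the request is safe.
def with_retries(attempts: 3)
  yield
rescue Net::OpenTimeout, Net::ReadTimeout
  attempts -= 1
  retry if attempts > 0
  raise
end
```

You wrap the HTTP call itself in the block; any attempt that raises `Net::OpenTimeout` or `Net::ReadTimeout` is re-run, up to the given number of attempts, before the error propagates.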

If the service is in-house, try to get access to New Relic, CA APM, or whatever monitoring tool is in use. Looking at the response times will give you an idea of what's acceptable. Be careful, though: do not look only at the average.
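A few slow outliers can leave the average looking healthy while the tail is far worse, and the tail is what your timeout has to budget for. A rough sketch (the sample data below is invented):

```ruby
# Sketch: pick a timeout from a high percentile of observed response
# times rather than the mean. Sample data is invented.
def percentile(samples, pct)
  sorted = samples.sort
  sorted[((pct / 100.0) * (sorted.length - 1)).round]
end

times_s = [0.08, 0.09, 0.10, 0.11, 0.12, 0.35, 0.90]
mean = times_s.sum / times_s.length # ~0.25s: looks fine
p95  = percentile(times_s, 95)      # 0.90s: the tail you must budget for
```

A timeout sized off the 0.25s mean would cut off a meaningful slice of legitimate traffic here; the 95th or 99th percentile is a much safer starting point.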