Timeout - don't wait for your order forever

Real world analogy

Imagine you're at a restaurant. The waiter comes and takes your order. Then comes the waiting time. You expect the food to arrive within 30 to 45 minutes. If the waiter doesn't return with your order by then, you take action: you either check on the status or decide to leave. The thing is, you don't wait forever. At some point, you act.

Problem

Many HTTP clients and network libraries set no timeout by default, so when a system makes a network request, it can wait for the response indefinitely. If the response never arrives, the system does not proceed with code execution.

Example of the problem

Blocking code execution wastes limited resources. It can lead to performance issues or even system crashes.

For instance, databases support a finite number of concurrent connections. If all of them are in use, new incoming requests are forced to wait or fail.
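As a rough illustration of the connection limit, here is a minimal sketch. `ConnectionPool` is a hypothetical class, not a real database driver; it only counts free slots using Python's `threading.BoundedSemaphore`. The short wait on `acquire` is there so the example terminates instead of hanging.

```python
import threading


class ConnectionPool:
    """Hypothetical pool that hands out at most `size` connections at a time."""

    def __init__(self, size: int) -> None:
        self._slots = threading.BoundedSemaphore(size)

    def acquire(self, timeout_seconds: float) -> bool:
        # Wait for a free slot, but give up after `timeout_seconds`.
        return self._slots.acquire(timeout=timeout_seconds)

    def release(self) -> None:
        self._slots.release()


pool = ConnectionPool(size=2)

# Two callers take both connections and, in this scenario, never return them
# because they are stuck waiting on responses that will not arrive.
pool.acquire(timeout_seconds=0.1)
pool.acquire(timeout_seconds=0.1)

# A third caller finds no free connection. If it waited without a limit,
# it would block here for as long as the first two stay stuck.
got_connection = pool.acquire(timeout_seconds=0.1)
print("third caller got a connection:", got_connection)
```

The third caller is refused only because the first two never release their slots, which is what happens when requests holding a connection wait without limit.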

In user interfaces, blocking code can freeze the UI, negatively impacting the user experience and often causing users to leave the application.

Solution

A solution is to set a timeout. A timeout is a mechanism that lets the system stop waiting for a response once a configured maximum waiting time for the request is exceeded.
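What this can look like in practice, as a minimal sketch using only Python's standard library: the helper name `fetch` and its parameters are chosen for illustration, and other HTTP clients expose timeouts differently.

```python
import socket
import urllib.error
import urllib.request


def fetch(url: str, timeout_seconds: float) -> bytes:
    """Return the response body, or raise TimeoutError if the wait is too long."""
    try:
        # `timeout` bounds each blocking socket operation (connect, read),
        # not the total duration of the request.
        with urllib.request.urlopen(url, timeout=timeout_seconds) as response:
            return response.read()
    except socket.timeout as exc:
        # Raised directly when the server accepts the connection but stays silent.
        raise TimeoutError(f"no response within {timeout_seconds}s") from exc
    except urllib.error.URLError as exc:
        # A timeout while connecting arrives wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            raise TimeoutError(f"no response within {timeout_seconds}s") from exc
        raise
```

A call such as `fetch("https://example.com/", timeout_seconds=5)` then either returns the body or raises `TimeoutError`, so the caller always regains control.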

Setting timeout value

Timeout value

Different use cases require different timeout values. Choosing a correct value involves a mix of experience and testing the API.

For example, a Google reCAPTCHA verification request usually takes less than 1 second. A 5-second timeout should be enough.

Setting it to 30 seconds would be too high. Users will not wait 30 seconds just to see that the system hit a timeout error. They will leave the page much sooner.

However, setting the timeout value too low is even worse. Most or all requests will fail with a timeout error even though the service is healthy, leaving the system effectively broken.
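One possible way to turn "testing the API" into a number is to measure real response times and leave generous headroom above the slow ones. The sketch below is an assumption, not an established rule: the function name `suggest_timeout`, the 99th percentile, the 3x headroom factor, and the sample latencies are all made up for illustration.

```python
import math


def suggest_timeout(latencies_seconds: list[float], percentile: float = 0.99,
                    headroom: float = 3.0) -> float:
    """Suggest a timeout: a high percentile of observed latency times a margin."""
    if not latencies_seconds:
        raise ValueError("need at least one measurement")
    ordered = sorted(latencies_seconds)
    # Nearest-rank percentile: the smallest sample that has at least
    # `percentile` of all measurements at or below it.
    rank = max(1, math.ceil(percentile * len(ordered)))
    return ordered[rank - 1] * headroom


# Hypothetical measurements (seconds) from test calls to an API.
samples = [0.21, 0.25, 0.30, 0.28, 0.90, 0.24, 0.27, 0.33, 0.26, 0.29]
print(suggest_timeout(samples))
```

With these samples the slowest call (0.90 s) drives the suggestion, so the result stays well above typical latency while remaining far below a value like 30 seconds.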

Timeout value affecting system stability

If you cannot decide on a value, set one that is higher than the worst-case response time you can imagine.

You don't want to set it too low. But setting it too high is still better than not setting it at all.

Error handling

When a timeout occurs, the system usually throws an error.

Example of a timeout error

A timeout error does not mean that the request was not processed. It means that no confirmation of processing was received. The request may or may not have been executed.

To handle a timeout error, you need to understand what the request does. For example, retrying a request that adds a new post may create duplicates, so it must not be retried without first checking whether the post was added. Retrying a request that fetches order history is fine, because extra reads typically don't cause issues. Remember, there is no silver bullet. Each use case must be thought through carefully to be handled correctly.
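The difference between the two cases can be sketched as a small retry helper. Everything here is hypothetical: `call_with_retry`, the `already_applied` check, and the retry counts are illustrative assumptions, and a real system would need its own way to verify whether a write went through.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retry(
    request: Callable[[], T],
    already_applied: Optional[Callable[[], bool]] = None,
    attempts: int = 3,
    delay_seconds: float = 0.5,
) -> Optional[T]:
    """Retry `request` when it raises TimeoutError.

    `already_applied` is a hypothetical check for requests with side effects.
    It returns True if an earlier, timed-out attempt actually went through.
    Leave it as None for read-only requests, which are safe to repeat.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return request()
        except TimeoutError:
            if already_applied is not None and already_applied():
                # The write succeeded; only the confirmation was lost.
                return None
            if attempt == attempts:
                raise
            time.sleep(delay_seconds)
    return None
```

A read such as fetching order history would be passed on its own, while a write such as adding a post would also pass a check (for example, a lookup of the post by a client-generated ID) so that it is not sent twice.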

To learn more about handling network errors, see the "TODO" page.

Final Thoughts

Timeouts are a fundamental mechanism for maintaining system stability.

For instance, AWS API Gateway, a cloud service for building APIs, enforces a default integration timeout of roughly 30 seconds (29 seconds for REST APIs, 30 seconds for HTTP APIs).

The likely reasoning behind this limit is that if a response takes longer than about 30 seconds, it's generally not worth waiting any further, even if a response eventually arrives.

Think about this. What's worse than waiting a long time for ordered food? Waiting for a long time, only to receive a call saying it won't be delivered.

Remember to always set a timeout.