On May 22 at 8:07 UTC, an issue began affecting all customers of UiPath Automation Cloud Public Sector. Users were unable to sign in, received timeouts, or waited longer than a minute for sign-in to complete. The issue was resolved by 10:00 UTC.
To ensure the health of our products, UiPath runs automated tests against all of our services. At the time of the issue, a feature flag was activated. The automated tests then generated a large volume of data, which exceeded the capacity of our licensing service. In particular, many expensive SQL queries maxed out the resources of a critical SQL database. Consequently, all new authentications had degraded performance.
Our automated monitoring system detected the issue within minutes, and our engineers immediately started investigating.
Engineers quickly narrowed the problem down to the SQL database, then identified the specific query that was causing the performance impact. They performed the following actions in parallel:
After these actions were performed, the database returned to a healthy state and users could sign in normally.
We are continuing to analyze this incident to identify all possible improvements: