Expired TLS cert resulted in hour-long Spotify outage

The massive hour-long outage affecting music streaming service Spotify on Wednesday occurred because the company reportedly failed to renew a TLS certificate before it expired.

Hundreds of millions of music fans across the globe were caught off guard on Wednesday when Spotify went offline for more than an hour, starting at 13:00 BST. "We’re aware of some issues right now and are checking them out! We’ll keep you posted," the streaming giant said, without revealing the cause of the outage.

Spotify's services resumed after a little more than an hour, with the company tweeting "Good news! Everything is good to go and looking happy. Still having issues? Give @SpotifyCares a tweet." Given the company's silence on what exactly caused the outage, many feared the worst.

Louis Poinsignon, a network engineer at Cloudflare, offered a clue as to what happened inside Spotify's systems. According to him, the company apparently failed to renew a TLS certificate in time, and the certificate's expiry led to the outage. Spotify's services came back online soon after the certificate was renewed.

"Spotify’s outage as a result of an expired certificate once again raises important questions around certificate lifecycle management. This episode is only the most recent in a long litany of major service outages owing themselves to failed certificate renewals," says Tim Callan, Senior Fellow at Sectigo.

"Problems like these can be prevented by implementing certificate automation for deployment, renewal, and discovery of certificates in your environment. IT professionals have a strong set of automation options available to them today, so outages of this nature need not continue to occur."
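As an illustration of the kind of renewal check Callan describes, the following is a minimal sketch in Python, not a description of how Spotify or Sectigo handle certificates. The function names and the 30-day renewal window are assumptions made for the example; only the standard-library `ssl`, `socket` and `datetime` modules are used.

```python
import socket
import ssl
from datetime import datetime, timezone


def days_until_expiry(not_after: str, now: datetime) -> float:
    """Return the number of days between `now` and a certificate's notAfter date.

    `not_after` uses the format returned by ssl.getpeercert(),
    e.g. "Jun  1 12:00:00 2020 GMT".
    """
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).total_seconds() / 86400


def needs_renewal(not_after: str, now: datetime, threshold_days: int = 30) -> bool:
    """Flag a certificate that expires within the (assumed) renewal window."""
    return days_until_expiry(not_after, now) <= threshold_days


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Read the notAfter field of the certificate a live server presents.

    With the default context, an already expired certificate makes the
    handshake fail with ssl.SSLCertVerificationError, which is itself a
    signal worth alerting on.
    """
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]
```

A scheduled job could call `fetch_not_after` for each host in an inventory and raise an alert, or trigger an automated renewal, whenever `needs_renewal` returns true. In practice, dedicated certificate lifecycle tooling would replace a script like this.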

Pratik Salva, senior security engineer at Venafi, told Teiss that expired certificates can make sites and services inaccessible and have been the cause of various incidents over the last several years. When a certificate expires on a high-impact service such as Spotify, millions of users can be affected.

“In addition, many large organisations often don’t know all the certificates they own and where they are all deployed because they don’t have an accurate and proper asset inventory. These issues can lead to certain certificates slipping under the radar and expiring at any point, which leads to outage incidents,” he added.

Last year, a survey commissioned by Keyfactor found that the average organisation stands to lose up to £51.5 million over the next two years to downtime and outages that could result from poor digital identity management practices and a lack of visibility over the keys and certificates it owns.

Of the 500 IT and IT security professionals interviewed for the survey, more than 70% said their organisation did not know how many keys and certificates it had, and 74% said that digital certificates have caused, and still cause, unanticipated downtime or outages.

"We know that many organisations struggle with properly and efficiently managing certificates and there’s a clear gap in understanding how critical it is, especially at the executive level. Unfortunately, digital identity management is often siloed and assumed to be a pure IT function. This report should empower PKI and infosec teams to ask for the resources they need to fully manage and secure every digital identity," said Chris Hickman, Chief Security Officer at Keyfactor.

"The study shows that organisations are spending an average of $18.2 million (£13.9 million) on IT security annually and only 14% of that is allocated to PKI. Yet the average company is managing upwards of 83,000 digital certificates to encrypt data and authenticate servers and secure data on IoT devices. The burden of PKI should be offset by technology that reduces risk and operational costs, improves efficiencies and automates certificate lifecycle management," he added.

Copyright Lyonsdown Limited 2020