Teal Station Server: Hardware Issue

Uptime Impact: 2 hours, 31 minutes, and 6 seconds
Resolved

Reason for Outage – Teal Station Server

On the evening of June 5th, the Teal station server experienced a hardware failure that resulted in approximately 2.5 hours of downtime for some stations.

The fault was traced to an electrical issue, which in turn affected the physical motherboard and CPU. These required replacement by our service provider. Although hardware issues like this are rare, we know any extended interruption is frustrating and appreciate your patience while the issue was resolved.

Following the repair, a secondary issue was identified with one of the server's hard drives. Thanks to our RAID 5 setup, no data has been lost, but the degraded drive requires further investigation.

As a result, we’ll be bringing forward a planned migration to newer hardware to improve long-term reliability. This migration may involve a short period of scheduled downtime, and we’ll be contacting affected customers directly via email with details.

More broadly, we’re in the final stages of a significant infrastructure upgrade designed to make outages like this far easier to recover from. Once live, it will allow us to move stations between servers far more quickly and with minimal interruption. These improvements form part of our wider investment in Radio.co’s reliability and scalability, ensuring a more resilient platform as we continue to grow.

Thanks again for bearing with us.

Radio.co Incidents Team.

Resolved

We've now resolved the incident. Thanks for your patience.

Recovering

Stations are now back online.

We would like to apologise for any inconvenience caused during this time. An RFO report will be available within 72 hours.

This incident will be closed once we have confirmation from the datacentre that the work on the server has been completed.

For further assistance please reach out to support.

Recovering

Our datacentre provider has fixed the core issue, and we are waiting for services to recover.

Identified

Verification continues. We unfortunately do not have an ETA from the datacentre at this time.

Identified

The hardware component has been replaced and the server is being checked for integrity.

Identified

Following investigation, our service provider has determined a hardware replacement is needed. This process has started. Further updates to follow.

Identified

Our service provider continues to work on restoring the service.

Investigating

We are working with our server provider to determine the cause of this issue. Once we have more information we will update this incident.

Investigating

Our monitoring software has detected potential downtime on a Station Server.

Our DevOps team has been informed and is investigating the problem. Further updates will be provided in due course.

If your station has stopped broadcasting we apologise for the inconvenience caused.

Began at:

Affected components
  • Station Servers