Go live button in backstage session is not working
Incident Report for Swapcard
Postmortem

I. Executive Summary:

This post-mortem covers a backstage service disruption that affected Swapcard customers on Thursday, April 4th, 2024. The incident was linked to an outage at our video backstage infrastructure provider, 100ms: https://status.100ms.live/incidents/ww76zjqsj6j9. The issue affected rooms located in the Europe region.

RTMP streaming was started through a legacy, cluster-specific endpoint. Before this incident, the session and the recording process could run in separate clusters, and recording and streaming still succeeded. A recent update changed this behavior: the recording process now terminates if it does not detect the session within the same cluster, which caused the recorders to fail.
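The failure mode described above can be illustrated with a toy model. This is a minimal sketch, not the provider's actual logic: the `Session` class, the `recorder_keeps_running` function, the `same_cluster_required` flag, and the cluster names `eu-a` and `eu-b` are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Session:
    room_id: str
    cluster: str  # cluster hosting the live session (hypothetical name)


def recorder_keeps_running(session, recorder_cluster, same_cluster_required):
    """Toy model of the recorder's behavior; not the provider's real logic."""
    if not same_cluster_required:
        # Behavior before the update: a cross-cluster recorder still worked.
        return True
    # Behavior after the update: the recorder terminates unless it finds
    # the session in its own cluster.
    return session.cluster == recorder_cluster


# A session in one cluster, with the recorder started in another.
session = Session(room_id="room-1", cluster="eu-a")

before = recorder_keeps_running(session, "eu-b", same_cluster_required=False)
after = recorder_keeps_running(session, "eu-b", same_cluster_required=True)
print(before, after)  # True False
```

In this model the same cross-cluster setup that worked before the update fails after it, which matches the recorder failures seen during the incident.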

II. Impact Analysis:

  • User Impact:
    Backstage streaming capabilities were affected for rooms located in the Europe region.
  • Service Impact:
    Streaming through Swapcard's backstage was failing: the "Go Live" button reset immediately after activation because the recording and streaming processes were failing.

III. Mitigation Deployment:

To address the issue, our video infrastructure provider introduced a new API endpoint with intelligent routing. This endpoint is designed to improve the stability and reliability of our streaming and recording functionalities.
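The report does not describe how the new endpoint routes requests. As a rough sketch, assuming "intelligent routing" means starting the stream in whichever cluster hosts the session, the difference from the legacy endpoint could look like this. The function names, the `session_clusters` lookup table, and the cluster names are hypothetical and do not come from the provider's API.

```python
def legacy_target_cluster(room_id, pinned_cluster):
    """Legacy, cluster-specific endpoint: always uses its own cluster."""
    return pinned_cluster


def routed_target_cluster(room_id, session_clusters):
    """Hypothetical routed endpoint: uses the cluster hosting the session."""
    return session_clusters[room_id]


# Hypothetical lookup of which cluster hosts each room's session.
session_clusters = {"room-1": "eu-a"}

legacy = legacy_target_cluster("room-1", pinned_cluster="eu-b")
routed = routed_target_cluster("room-1", session_clusters)

# The legacy call can land in a different cluster than the session;
# the routed call lands in the session's cluster by construction.
print(legacy == session_clusters["room-1"])  # False
print(routed == session_clusters["room-1"])  # True
```

Under this assumption, the recorder and the session always share a cluster, so the same-cluster check introduced by the update no longer fails.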

IV. Forward Planning:

Consistent with our dedication to maintaining high-quality service, Swapcard is thoroughly reviewing its communication with our external video backstage infrastructure provider to avoid similar issues in the future. This incident has led us to improve our processes and establish stronger measures to prevent a recurrence. We are grateful for your patience and ongoing support.

We deeply regret any inconvenience this disruption may have caused. Should you need further assistance or additional information, please feel free to contact our support team. Your understanding and cooperation are highly appreciated.

Posted Apr 05, 2024 - 14:55 UTC

Resolved
This incident has been resolved.
Posted Apr 04, 2024 - 21:52 UTC
Investigating
We are currently investigating an issue with our backstage provider 100ms that is impacting our backstage functionality. The issue seems related to today's incident on 100ms infrastructure in Europe: https://status.100ms.live/incidents/ww76zjqsj6j9
Posted Apr 04, 2024 - 20:46 UTC
This incident affected: Backstage (Backstage - Overall).