History

June
No incidents reported
May
No incidents reported
April
No incidents reported
March
19
Thu
App Studio Analytics Outage
12:21 PM
Summary

Approximately two weeks ago, users began experiencing disruption when accessing analytics data within App Studio. Initial investigation did not immediately identify the root cause; a deeper analysis ultimately traced the issue to a routine infrastructure consolidation that was capping database capacity below the threshold required to handle specific event analytics queries spanning long date ranges. Full service has since been restored following a targeted increase in database resources.

Timeline
  • 2026-03-11 19:32 UTC — Incident created. Initial investigation launched following a rise in user-reported loading issues on analytics, despite no prior performance degradation. The initial investigation did not surface a clear root cause at this stage.

  • 2026-03-18 10:45 UTC — Automated monitoring triggered alerts for elevated database error rates, corroborated by incoming user reports of analytics failures.

  • 2026-03-19 09:30 UTC — Deeper analysis by the Engineering team identified the root cause: high latency and query failures isolated to specific event analytics queries over long date ranges, linked to the earlier infrastructure consolidation.

  • 2026-03-19 09:40 UTC — Remediation plan executed: database resources scaled up to meet query load demands.

  • 2026-03-19 10:05 UTC — Database performance stabilized; analytics functionality verified as fully restored.

Resolution

To restore service, database resources were immediately scaled up to accommodate the throughput required by analytics queries. Once additional capacity was provisioned, query timeouts ceased and application stability was reestablished. The system remained under active monitoring until no further degradation was observed, at which point the incident was marked resolved.

Root Cause Analysis

The incident was caused by a capacity mismatch introduced during a routine infrastructure consolidation. Such consolidations are a standard part of our infrastructure lifecycle and are conducted with active monitoring in place to ensure continued performance across all services. In this instance, however, the capacity allocated to the Analytics Database was capped at a level that, while adequate for standard workloads, could not sustain the computational demands of specific event analytics queries executed over long date ranges. When these queries were executed, they exceeded available processing capacity, resulting in database timeouts and repeated failures of the analytics service. These conditions persisted until capacity was restored. The initial investigation did not immediately surface this connection; only deeper analysis identified the consolidation as the underlying cause.

The investigation also surfaced a broader pattern: a gradual uptick in user-reported loading issues over the preceding two weeks, despite analytics having previously performed reliably. This indicated the capacity cap had been quietly eroding performance well ahead of the full outage.
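A gradual erosion of this kind can be caught by comparing recent query latency against a known-good baseline. The following is a minimal sketch of that idea, not a description of the monitoring actually in place: the function names, the choice of the 95th percentile, and the 1.5x threshold are all assumptions made for illustration.

```python
from statistics import quantiles


def p95(samples_ms):
    """Return the 95th percentile of a list of latency samples (milliseconds)."""
    # quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
    return quantiles(samples_ms, n=20)[-1]


def latency_degraded(baseline_ms, recent_ms, max_ratio=1.5):
    """Return True when recent p95 latency exceeds baseline p95 by max_ratio.

    baseline_ms: samples from a period known to be healthy (assumed input).
    recent_ms:   samples from the window being checked (assumed input).
    max_ratio:   hypothetical alert threshold, to be tuned per workload.
    """
    return p95(recent_ms) > max_ratio * p95(baseline_ms)
```

Evaluating such a check per query type (for example, long date-range queries separately from standard ones) would keep a slow minority of queries from being hidden by a healthy overall average.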

Conclusion

We are strengthening our processes to prevent recurrence. Monitoring coverage applied during database consolidations will be extended to detect query latency degradation earlier, and capacity planning will be refined to better account for the demands of specific event analytics queries over long date ranges.

Looking further ahead, the team is actively working on delivering a fully revamped analytics pipeline to production later this year. This is not an incremental improvement; the new pipeline is being built to deliver deeper, richer metrics that give event organizers a comprehensive view of their event performance.

13
Fri
Speaker & Attendee Profile Visibility Issue
2:02 AM
Summary

On March 11, 2026, we experienced a service disruption affecting the visibility of speaker and attendee profiles across multiple events. The incident was triggered by a timing issue during a scheduled infrastructure update — a data refresh was initiated before new field definitions had fully propagated to our Search Engine, causing the indexation system to be built from an outdated configuration.

Our Search Engine relies on an indexation system to serve profile data quickly across events. When the refresh completed against the stale configuration, new fields were rejected due to strict schema enforcement, leaving profiles absent from search results and listings.

The goal of this post-mortem is to share our assessment and the steps taken to resolve the issue, while providing full transparency to our customers.

Timeline

  • Mar 11, 11:04 UTC: Infrastructure update deployed to production. Changes confirmed as applied.

  • Mar 11, 13:21 UTC: Data refresh initiated as part of the planned release process. Previous indexation remained fully functional during this period.

  • Mar 11, 13:42 UTC: Release step complete. Data refresh running, expected duration ~3–4 hours.

  • Mar 11, 14:49 UTC: Configuration mismatch errors first logged internally.

  • Mar 11, 15:33 UTC: Data refresh reported as complete. Spot checks passed. Final release step initiated.

  • Mar 11, 16:52 UTC: Reports of missing profiles across multiple events surfaced. Incident response team immediately activated.

  • Mar 11, 17:29 UTC: Root cause identified about 37 minutes after the first reports: the indexation system had been built before the new fields were fully available. Internal alerts corroborated the finding.

  • Mar 11, 18:08 UTC: Analysis completed, confirming that the refresh did not produce an indexation with the expected new fields.

  • Mar 11, 18:39 UTC: Corrective data refresh triggered with the correct configuration fully in place, about 1 hour 47 minutes after the initial reports.

  • Mar 11, 18:43 UTC: New indexation confirmed to contain the correct fields. Full refresh in progress.

Resolution

Once the root cause was identified, the team immediately initiated a new data refresh against the correct, fully propagated configuration. Profile visibility and search behavior were monitored throughout the process. Once the refresh completed, all profiles returned to normal search and listing behavior.

Root Cause Analysis

The root cause was a race condition between the deployment of new Search Engine field definitions and the triggering of a data refresh. The refresh started before the new configuration had fully propagated, causing the indexation system to be built from a stale schema. Due to strict schema enforcement, subsequent write operations referencing the new fields were rejected, leaving profiles absent from search results and listings.
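One common way to close this kind of race is to gate the refresh on the new configuration being visible, rather than on the deployment step having finished. The sketch below illustrates that pattern under stated assumptions: `fetch_live_fields` is a hypothetical callable that returns the field names the Search Engine currently reports, and the timeout and polling interval are placeholder values. It does not reflect the actual release tooling.

```python
import time


def wait_for_schema(fetch_live_fields, expected_fields,
                    timeout_s=300.0, poll_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until every expected field is visible; return False on timeout.

    fetch_live_fields: hypothetical callable returning the live field names.
    expected_fields:   field names the new configuration should contain.
    sleep / clock:     injectable so the gate can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        missing = set(expected_fields) - set(fetch_live_fields())
        if not missing:
            return True   # configuration has propagated; safe to refresh
        if clock() >= deadline:
            return False  # still stale; do not start the refresh
        sleep(poll_s)
```

A release script would start the data refresh only when this returns True, and stop for manual review otherwise.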

Several factors contributed to this incident:

  • Ambiguous deployment signal: A pipeline warning created uncertainty around whether the deployment had completed cleanly.

  • Insufficient spot checks: Post-refresh verification passed at a surface level, masking the underlying schema mismatch until live traffic exposed it.

Short-term improvements:

  • Add a mandatory pre-refresh validation step to confirm the Search Engine schema matches the expected configuration before any refresh is initiated.

  • Update the release process to include explicit schema validation checkpoints for sub-fields as a required step.
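A validation step of the kind listed above could be as simple as comparing the expected field paths, including nested sub-fields, against what the Search Engine reports. This is a minimal sketch only: it assumes both schemas can be represented as nested dictionaries, and the function names and example field names are hypothetical.

```python
def field_paths(schema, prefix=""):
    """Flatten a nested schema dict into a set of dotted field paths."""
    paths = set()
    for name, sub in schema.items():
        path = f"{prefix}{name}"
        paths.add(path)
        if isinstance(sub, dict):
            # Recurse so that sub-fields are validated, not just top-level ones.
            paths |= field_paths(sub, prefix=path + ".")
    return paths


def missing_fields(expected, live):
    """Return the sorted field paths present in expected but absent from live."""
    return sorted(field_paths(expected) - field_paths(live))


# Hypothetical example: the live schema lacks a newly added sub-field.
expected = {"profile": {"name": "string", "company": "string"}}
live = {"profile": {"name": "string"}}
problems = missing_fields(expected, live)
if problems:
    print("Refresh blocked, missing fields:", problems)
```

An empty result would be the precondition for starting the refresh; any non-empty result blocks it and names the fields that have not yet propagated.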

Conclusion

This incident stemmed from a timing edge case in our release process. The improvements listed above reflect our commitment to preventing similar incidents and maintaining a reliable platform for all our customers.

For any questions or concerns regarding this incident, please reach out to our support team.