maestra - Notice history

All systems operational

Admin panel - Operational

100% uptime (Feb 2026: 100.0% · Mar 2026: 100.0% · Apr 2026: 100.0%)

Site personalization - Operational

100% uptime (Feb 2026: 100.0% · Mar 2026: 100.0% · Apr 2026: 100.0%)

Loyalty - Operational

100% uptime (Feb 2026: 100.0% · Mar 2026: 100.0% · Apr 2026: 100.0%)

Mass campaigns - Operational

100% uptime (Feb 2026: 100.0% · Mar 2026: 100.0% · Apr 2026: 100.0%)

Transactional messages - Operational

100% uptime (Feb 2026: 100.0% · Mar 2026: 100.0% · Apr 2026: 100.0%)

Notice history

Apr 2026

Degraded access to admin panel
  • Postmortem

    What happened
    On April 8, about 1.5% of admin panel traffic started failing. API endpoints were not affected. The issue was fully resolved on April 9.

    Why it happened
    We recently rolled out a new version of HAProxy, the software that routes user traffic to our services, across our fleet of six load balancers. The rollout had been validated in a test deployment and initially looked healthy in production as well. However, about 7 hours after the upgrade, one of the six load balancers started failing to reach backend services, and that is what caused the errors users experienced.

    The root cause is a bug in the new HAProxy version that only shows up under a very specific, still-unidentified set of conditions. Our test deployment and the other five production balancers kept working fine, which is why the issue slipped past our pre-rollout checks.
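As an illustration of why a single bad instance can hide in an otherwise healthy fleet, per-balancer error rates can be compared against the fleet median rather than averaged together. This is a minimal sketch; the instance names, rates, and threshold are hypothetical, not maestra's actual monitoring:

```python
# Hypothetical sketch: flag a single unhealthy load balancer by comparing
# per-instance error rates against the fleet median. Instance names,
# rates, and the threshold are illustrative.
from statistics import median

def unhealthy_balancers(error_rates, threshold=0.01):
    """Return balancers whose error rate exceeds the fleet median
    by more than an absolute margin of `threshold`."""
    fleet_median = median(error_rates.values())
    return [name for name, rate in sorted(error_rates.items())
            if rate - fleet_median > threshold]

# Six balancers; lb3 is the one hit by the bug.
rates = {"lb1": 0.0001, "lb2": 0.0002, "lb3": 0.0900,
         "lb4": 0.0001, "lb5": 0.0003, "lb6": 0.0002}
print(unhealthy_balancers(rates))  # -> ['lb3']
```

Because the other five balancers stayed healthy, a fleet-wide average can stay low while one instance degrades; comparing instances against each other surfaces the outlier.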

    Why it took us a while to notice
    When a request failed, our system automatically retried it, and the retry usually succeeded. From the outside, things therefore looked slightly slow or occasionally flaky rather than clearly broken. Automatic retries are normally a good thing: they hide small, transient glitches from users. In this case, though, they also hid the growing problem from our monitoring long enough to delay detection.
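The masking effect described above can be sketched as a retry wrapper that records first-attempt failures separately, so monitoring sees the underlying error rate rather than only the post-retry success rate. All names here are illustrative, not maestra's actual code:

```python
# Hypothetical sketch: retries hide failures from success-rate dashboards
# unless first-attempt failures are counted separately for alerting.
first_attempt_failures = 0
total_requests = 0

def request_with_retry(send, retries=1):
    """Call `send()`, retrying on ConnectionError. Counts first-attempt
    failures so the underlying error rate stays visible to monitoring."""
    global first_attempt_failures, total_requests
    total_requests += 1
    for attempt in range(retries + 1):
        try:
            return send()
        except ConnectionError:
            if attempt == 0:
                first_attempt_failures += 1  # visible to alerting
            if attempt == retries:
                raise  # retries exhausted: user-visible failure

# Simulate a backend that fails once, then recovers on the retry.
attempts = {"n": 0}
def flaky_send():
    attempts["n"] += 1
    if attempts["n"] == 1:
        raise ConnectionError("backend unreachable")
    return "ok"

print(request_with_retry(flaky_send))           # -> ok (user sees success)
print(first_attempt_failures, total_requests)   # -> 1 1 (monitoring sees failure)
```

Alerting on the ratio of first-attempt failures to total requests would surface a degrading balancer even while retries keep the user-visible success rate high.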

    What we did
    - Rolled back the affected balancer to the previous HAProxy version, which immediately restored normal traffic.
    - Paused all further HAProxy upgrades across the fleet until we understand the trigger.

    What's next
    We're working to reproduce the bug in a controlled test environment so we can pinpoint the trigger, confirm a fix (either a patched version or a config change), and safely resume the rollout.

  • Resolved

    This incident has been resolved. We'll follow up with details at a later date.

  • Investigating

    We are currently investigating elevated error rates affecting approximately 1.5% of traffic to some internal web services. API endpoints are operating normally and are not affected by this incident.

Feb 2026

No notices reported this month
