Our Engineering and Operations teams are continuing to investigate the root cause of the outage.
At this time, we know the outage correlates to the IP change of the DNS A record for the AWS ELB used to route traffic to the clustered master nodes in one of our Kubernetes clusters. While we cannot state conclusively this to be the cause of the outage, logging does not reveal any other contributing factors. We are working with the Kubernetes community to correctly identify and address the issue.
While investigating the root cause, our Operations team has changed the deployment architecture to bypass the ELB, removing the suspect component. This is intended as a short-term measure.