Amazon Web Services (AWS) has added a new outage simulator to its Fault Injection Service (FIS) to help customers see how resilient their companies are to major outages.
Announced during the company’s AWS re:Invent event, the feature lets customers “put chaos engineering into practice at scale” by simulating either a complete power interruption in an AWS Availability Zone or a loss of connectivity to another AWS Region.
Amazon says engineers can do this to get a better understanding of their direct and indirect dependencies and to test recovery time after an outage.
AWS outage simulator levels up
Though cloud services have, on the whole, proven reliable, rising geopolitical tensions have left enterprises worried about potential outages and the effect they could have on their business. That is before counting some pretty embarrassing recent blunders, including a simple typo that caused an hours-long Azure outage in Brazil.
Among the new additions to FIS is “AZ Availability: Power Interruption.” Amazon says this scenario simulates “pull[ing] the plug” on a targeted set of resources in an Availability Zone, including “EC2 instances (including those in EKS and ECS clusters), EBS volumes, Auto Scaling Groups, VPC subnets, Amazon ElastiCache for Redis clusters, and Amazon Relational Database Service (RDS) clusters.”
Another test, “Cross-Region: Connectivity,” blocks applications from accessing resources in a target Region, including traffic from “EC2 instances, ECS tasks, EKS pods, Lambda functions attached to a VPC… traffic flowing across Transit Gateways and VPC peering connections, as well as cross-region S3 and DynamoDB replication.”
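For engineers who want to try one of these scenarios, a rough sketch of launching an experiment and waiting for it to finish might look like the following. It assumes an experiment template has already been created in FIS and that `fis_client` behaves like a boto3 FIS client; the helper name `run_experiment` and the set of terminal states are illustrative assumptions, not something taken from Amazon's announcement.

```python
import time
import uuid

# Assumed set of states after which an experiment no longer changes.
TERMINAL_STATES = {"completed", "stopped", "failed", "cancelled"}


def run_experiment(fis_client, template_id, poll_seconds=30.0):
    """Start an FIS experiment from an existing template and wait for it to end.

    `fis_client` is expected to behave like boto3.client("fis").
    Returns the final status string reported for the experiment.
    """
    started = fis_client.start_experiment(
        clientToken=str(uuid.uuid4()),  # idempotency token for the request
        experimentTemplateId=template_id,
    )
    experiment_id = started["experiment"]["id"]

    # Poll until the experiment reaches one of the terminal states.
    while True:
        current = fis_client.get_experiment(id=experiment_id)
        status = current["experiment"]["state"]["status"]
        if status in TERMINAL_STATES:
            return status
        time.sleep(poll_seconds)
```

With real credentials, the helper would be called with `boto3.client("fis")` and the ID of a template you own. Targets, stop conditions, and the IAM role all live in the template rather than in this code.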
Amazon has confirmed that these tests will be available in all commercial AWS Regions where FIS is already available, and that customers will be billed for the action-minutes their experiments consume.
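To get a feel for what action-minute billing means in practice, here is a back-of-the-envelope sketch. The function name, the action durations, and the 0.10 USD rate are all assumptions for illustration; check the current FIS pricing page for the real per-action-minute price.

```python
def estimate_experiment_cost(action_durations_minutes, price_per_action_minute):
    """Estimate the cost of one experiment run.

    Each action is billed for the minutes it runs, so the total is the
    sum of all action durations multiplied by the per-action-minute price.
    """
    total_action_minutes = sum(action_durations_minutes)
    return total_action_minutes * price_per_action_minute


# Hypothetical experiment: three actions running for 30, 30 and 10 minutes,
# priced at an assumed 0.10 USD per action-minute.
cost = estimate_experiment_cost([30, 30, 10], 0.10)
print(f"Estimated cost: {cost:.2f} USD")
```

Because the bill scales with both the number of actions and how long each one runs, a scenario that touches many resource types for a long window costs more than a short, narrowly targeted one.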