Chaos faults for GCP

Last updated on Jun 22, 2026

Introduction

GCP faults disrupt resources that run on Google Cloud Platform: Compute Engine VM instances, persistent disks, and managed Cloud SQL instances. Each fault calls the GCP API (using a service account JSON key uploaded as a File Secret in Harness Secret Manager, or Workload Identity on GKE) to inject the disruption, then reverses it cleanly at the end of the configured duration. Go to Authentication options to set up credentials, and GCP IAM integration to use Workload Identity.

GCP SQL instance failover

Trigger a failover on a high-availability Cloud SQL instance so you can test how the application behaves when the primary node fails over to its standby.


GCP VM disk loss by label

Detach a percentage of non-boot persistent disks selected by label from GCP VM instances for a configurable duration, then reattach them.


GCP VM disk loss

Detach one or more named non-boot persistent disks from GCP VM instances for a configurable duration, then reattach them.


GCP VM instance stop by label

Stop a percentage of Compute Engine VMs selected by label for a configurable duration, then start them again (or rely on the MIG auto-healer).


GCP VM instance stop

Stop one or more named Compute Engine VMs for a configurable duration, then start them again (or rely on the MIG auto-healer).



GCP SQL instance failover

GCP SQL instance failover triggers a failover on a high-availability Cloud SQL instance (SQL_INSTANCE_NAME in GCP_PROJECT_ID). The standby node becomes the new primary; the original primary becomes the new standby once the failover completes.

Use cases

Test that application connection pools reconnect cleanly when the primary fails over.
Validate that in-flight transactions surface clean rollback errors rather than data corruption.
Confirm that the failover time (typically 30 to 90 seconds for Cloud SQL HA) fits within the application's SLO.
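
The failover-time check in the last use case can be approximated from the client side while the fault runs. The sketch below is an illustration only and is not part of the fault: `measure_outage` and the `probe` callable are hypothetical names, and `probe` is assumed to return True when the database accepts a connection. The clock and sleep functions are injectable so the logic can be exercised without waiting.

```python
import time
from typing import Callable, Optional


def measure_outage(
    probe: Callable[[], bool],
    interval: float = 1.0,
    timeout: float = 300.0,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return the seconds between the first failed probe and the next
    successful one. Returns None if no complete outage (failure followed
    by recovery) is observed before the timeout."""
    start = clock()
    down_since: Optional[float] = None
    while clock() - start < timeout:
        ok = probe()
        now = clock()
        if not ok and down_since is None:
            down_since = now  # first failure observed
        elif ok and down_since is not None:
            return now - down_since  # recovered
        sleep(interval)
    return None
```

The measured value is only as precise as the probe interval, so a shorter interval gives a tighter bound when comparing against the SLO.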


GCP VM disk loss

GCP VM disk loss detaches one or more named non-boot persistent disks (DISK_VOLUME_NAMES in ZONES/GCP_PROJECT_ID) from their attached VMs for TOTAL_CHAOS_DURATION seconds, then reattaches them on the same device path. Boot disks are excluded by design.
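
To reattach a disk on the same device path, the attachment details have to be recorded before the detach. The following sketch shows one way to build such a record from an in-memory inventory. It is not the fault's implementation: `Attachment` and `plan_detach` are hypothetical names, and silently skipping boot disks (rather than rejecting the request) is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Attachment:
    """Hypothetical record of where a disk is attached."""
    instance: str
    disk: str
    device_name: str
    boot: bool


def plan_detach(attachments: Iterable[Attachment],
                disk_names: Iterable[str]) -> List[Attachment]:
    """Select the attachments to detach: only the named disks, never boot
    disks. The returned records keep the device name needed for reattach."""
    wanted = set(disk_names)
    return [a for a in attachments if a.disk in wanted and not a.boot]


inventory = [
    Attachment("vm-1", "boot-1", "persistent-disk-0", boot=True),
    Attachment("vm-1", "data-1", "persistent-disk-1", boot=False),
    Attachment("vm-2", "data-2", "persistent-disk-1", boot=False),
]
plan = plan_detach(inventory, ["data-1", "boot-1"])
```

After the chaos duration, iterating over `plan` gives each disk's original instance and device name, which is the information a reattach step would need.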

Use cases

Test how a stateful workload (Postgres, MySQL, Cassandra) handles a brief storage outage.
Validate that filesystems remount cleanly when the disk returns.
Confirm DR snapshot strategies cover sudden volume loss.


GCP VM disk loss by label

GCP VM disk loss by label resolves the set of non-boot persistent disks matching DISK_VOLUME_LABEL in ZONES/GCP_PROJECT_ID, picks DISK_AFFECTED_PERCENTAGE of them, detaches them from their attached VMs for TOTAL_CHAOS_DURATION seconds, then reattaches them.
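
How the fault rounds the percentage and which matching disks it picks are not stated here. As an illustration of percentage-based selection, the sketch below assumes the count is rounded up (so at least one target is chosen when any match exists) and that targets are sampled at random. `pick_targets` is a hypothetical helper.

```python
import random
from typing import List, Optional, Sequence


def pick_targets(candidates: Sequence[str],
                 percentage: int,
                 rng: Optional[random.Random] = None) -> List[str]:
    """Pick `percentage` percent of the candidates, rounding up.
    Assumption: the percentage is an integer between 1 and 100."""
    if not 1 <= percentage <= 100:
        raise ValueError("percentage must be between 1 and 100")
    if not candidates:
        return []
    rng = rng or random.Random()
    # Integer ceiling of len * percentage / 100, avoiding float rounding.
    count = (len(candidates) * percentage + 99) // 100
    return rng.sample(list(candidates), count)
```

Passing a seeded `random.Random` makes the selection repeatable, which helps when comparing runs of the same experiment.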

Use cases

Test how replicated stateful workloads survive losing a labeled subset of storage.
Validate DR procedures for losing a labeled subset of disks across zones.


GCP VM instance stop

GCP VM instance stop stops one or more Compute Engine VMs listed in VM_INSTANCE_NAMES (in ZONES/GCP_PROJECT_ID) for TOTAL_CHAOS_DURATION seconds, then starts them again. With MANAGED_INSTANCE_GROUP=enable, recovery is driven by the MIG auto-healer.
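
The stop, wait, and start sequence described above can be sketched as follows. This is a simplified model rather than the fault's code: `ComputeClient`, `pair_targets`, and `run_instance_stop` are hypothetical names, and two details are assumptions: that VM_INSTANCE_NAMES and ZONES are comma-separated lists paired by position, and that any MANAGED_INSTANCE_GROUP value other than `enable` means the fault restarts the VMs itself.

```python
import time
from typing import Callable, List, Protocol, Tuple


class ComputeClient(Protocol):
    """Hypothetical interface; a real one would wrap the Compute Engine API."""

    def stop(self, project: str, zone: str, instance: str) -> None: ...
    def start(self, project: str, zone: str, instance: str) -> None: ...


def pair_targets(instance_names: str, zones: str) -> List[Tuple[str, str]]:
    """Pair each instance name with the zone at the same position (assumed)."""
    names = [n.strip() for n in instance_names.split(",") if n.strip()]
    zone_list = [z.strip() for z in zones.split(",") if z.strip()]
    if len(names) != len(zone_list):
        raise ValueError("expected one zone per instance name")
    return list(zip(names, zone_list))


def run_instance_stop(
    client: ComputeClient,
    project: str,
    targets: List[Tuple[str, str]],
    duration_seconds: float,
    managed_instance_group: str = "disable",  # assumed default value
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Stop every target, wait for the chaos duration, then start the
    targets again unless recovery is left to the MIG auto-healer."""
    for name, zone in targets:
        client.stop(project, zone, name)
    sleep(duration_seconds)
    if managed_instance_group == "enable":
        return  # the MIG auto-healer is expected to bring the VMs back
    for name, zone in targets:
        client.start(project, zone, name)
```

Keeping the client behind a small interface lets the sequence be exercised with a recording fake before pointing it at real infrastructure.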

Use cases

Validate that managed instance groups recreate VMs inside the alerting SLA.
Test GKE node-down handling when a worker VM disappears.
Confirm that clients connected to stopped VMs fail over to surviving instances cleanly.


GCP VM instance stop by label

GCP VM instance stop by label resolves Compute Engine VMs matching INSTANCE_LABEL in ZONES/GCP_PROJECT_ID, picks INSTANCE_AFFECTED_PERCENTAGE of them, stops them for TOTAL_CHAOS_DURATION seconds, then starts them again (unless MANAGED_INSTANCE_GROUP=enable).

Use cases

Test how the workload survives losing a labeled subset of VMs across zones.
Validate cluster-level resilience when multiple labeled VMs disappear at once.

