
VMware network rate limit


VMware network rate limit is a VMware chaos fault that caps egress bandwidth on the network interface NETWORK_INTERFACE of the Linux VM VM_NAME at NETWORK_BANDWIDTH for TOTAL_CHAOS_DURATION seconds, then removes the cap. The optional BURST, LIMIT, PEAK_RATE, and MIN_BURST tunables adjust how the cap is enforced. The fault uses VMware Tools (Guest Operations API) to apply the rule inside the guest as VM_USER_NAME.

Use this fault to test how a workload on a VMware-hosted VM behaves when bandwidth is constrained: whether streaming/transfer workloads degrade gracefully, whether retries amplify the slowdown, and whether monitoring detects the regression within the alerting SLA.

Run your first experiment

If you have not configured the chaos infrastructure yet, go to Quickstart to install the chaos infrastructure and run an experiment end to end.


Use cases

  • Constrained bandwidth: When bandwidth is throttled, does the workload degrade inside the SLA?
  • Replication backlog: Does database replication catch up after the cap is removed?
  • Backup window: Does a backup job complete within the maintenance window when bandwidth halves?

Prerequisites


Supported environments

Platform | Support status
Linux VMs hosted on vSphere / vCenter (any distro with VMware Tools and tc) | Supported
Windows VMs | Not supported

Permissions required

On vCenter. Map GOVC_USERNAME to the chaos role described in VMware permissions. The role needs Guest Operations (Program execution, Modifications, Queries).

On the guest OS. VM_USER_NAME must be able to run tc qdisc on NETWORK_INTERFACE.


Authentication

Layer | Tunables
vCenter | GOVC_URL, GOVC_USERNAME, GOVC_PASSWORD, GOVC_INSECURE
Guest OS | VM_USER_NAME, VM_PASSWORD

Store each credential as a text secret in Harness Secret Manager and reference the secret identifier when configuring the experiment.
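Before storing the values as secrets, it can help to confirm locally that none of the vCenter tunables is empty. The following is a minimal sketch; the function name `missing_govc_vars` is hypothetical and is not part of the fault or of Harness tooling.

```shell
# Sketch only: prints the name of each vCenter tunable that is unset or empty.
# missing_govc_vars is a hypothetical helper, not part of the fault.
missing_govc_vars() {
    for name in GOVC_URL GOVC_USERNAME GOVC_PASSWORD; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

An empty result means all three values are present. The check says nothing about whether the credentials are valid.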


Fault tunables

Required parameters

Tunable | Description | Default
VM_NAME | Name of the target VM as it appears in vCenter. | (required)
VM_USER_NAME | OS user account on the target VM. | (required)
VM_PASSWORD | Password for VM_USER_NAME. | (required)

Rate-limit parameters

Tunable | Description | Default
NETWORK_INTERFACE | Name of the interface to throttle. | eth0
NETWORK_BANDWIDTH | Bandwidth cap with unit (for example 1mbit, 512kbit). | 1mbit
BURST | Maximum burst size with unit (for example 2kb). | ""
LIMIT | Queue limit as a size with unit (for example 20mb). | ""
PEAK_RATE | Peak rate for the bucket (for example 1mb). | ""
MIN_BURST | Minimum chunk size, in bytes unless a unit is given (for example 1540). | ""

Chaos parameters

Tunable | Description | Default
TOTAL_CHAOS_DURATION | Total duration of the fault in seconds. | 30
CHAOS_INTERVAL | Delay in seconds between iterations. | 10
SEQUENCE | parallel or serial. | parallel
RAMP_TIME | Wait period in seconds before and after the fault. | 0

vCenter authentication

Tunable | Description | Default
GOVC_URL | vCenter server URL. | ""
GOVC_USERNAME | vCenter user mapped to the chaos role. | ""
GOVC_PASSWORD | Password for GOVC_USERNAME. | ""
GOVC_INSECURE | Skip SSL certificate verification when set to true. | true

Tunables that apply to every fault are documented in common tunables for all faults.


Fault execution in brief

The fault authenticates to vCenter and opens a Guest Operations session on VM_NAME as VM_USER_NAME. It then installs a token-bucket queueing discipline on NETWORK_INTERFACE that caps egress bandwidth at NETWORK_BANDWIDTH (applying BURST, LIMIT, PEAK_RATE, and MIN_BURST when set), holds the cap for TOTAL_CHAOS_DURATION seconds, and removes the rule.
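To make the token-bucket rule concrete, the sketch below assembles a tc tbf command from the rate-limit tunables. It is an illustration, not the fault's actual implementation: the mapping of tunables to tbf options and the fallback values for BURST and LIMIT are assumptions, and `build_tbf_cmd` is a hypothetical name.

```shell
# Sketch only: builds (but does not run) a tc token-bucket filter command.
# The tunable-to-option mapping and the fallbacks below are assumptions.
build_tbf_cmd() {
    iface="${NETWORK_INTERFACE:-eth0}"
    rate="${NETWORK_BANDWIDTH:-1mbit}"
    # tbf requires a burst and a limit (or latency); these fallbacks are hypothetical.
    burst="${BURST:-2kb}"
    limit="${LIMIT:-20mb}"
    cmd="tc qdisc add dev $iface root tbf rate $rate burst $burst limit $limit"
    # tc accepts peakrate only together with mtu/minburst, so add them as a pair.
    if [ -n "${PEAK_RATE:-}" ] && [ -n "${MIN_BURST:-}" ]; then
        cmd="$cmd peakrate $PEAK_RATE minburst $MIN_BURST"
    fi
    printf '%s\n' "$cmd"
}
```

Printing the command instead of executing it keeps the sketch safe to run anywhere. On a disposable test VM, the printed line can be run with sudo to reproduce a comparable cap by hand.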


Expected behavior during fault execution

  • Egress throughput from VM_NAME is capped at NETWORK_BANDWIDTH.
  • Large transfers slow down; replication may fall behind.
  • After the duration ends, the cap is removed and throughput returns to baseline.

When the fault ends

The chaos pod removes the tc qdisc rule via Guest Operations. Throughput returns to baseline within seconds.

Signals to watch

  • Throughput: Use a Prometheus probe on node_network_transmit_bytes_total.
  • Replication lag: Use a command probe that reads the replication lag from your database.
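For a quick check without Prometheus, the same throughput signal can be approximated from the guest's kernel counters. The sketch below assumes a Linux guest that exposes /sys/class/net; the function names are hypothetical.

```shell
# Sketch only: derive egress throughput from two samples of the tx byte counter.
# With node_exporter, a comparable PromQL expression would be
#   rate(node_network_transmit_bytes_total{device="eth0"}[1m]) * 8

# Reads the kernel's transmit byte counter for an interface (Linux only).
read_tx_bytes() {
    cat "/sys/class/net/$1/statistics/tx_bytes"
}

# Converts two counter samples taken <interval> seconds apart into bits per second.
tx_bits_per_sec() {
    first="$1"; second="$2"; interval="$3"
    echo $(( (second - first) * 8 / interval ))
}
```

A typical use is to call `read_tx_bytes eth0`, sleep for a few seconds, call it again, and pass both values and the sleep length to `tx_bits_per_sec`. During the fault the result should stay near NETWORK_BANDWIDTH.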

Verify the fault execution effect

  1. Inspect the qdisc on the guest.

    sudo tc qdisc show dev eth0

    Look for the token-bucket filter with the configured rate.

  2. Run an iperf/scp transfer from the VM.

    Throughput should be capped at NETWORK_BANDWIDTH during the window.
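The qdisc inspection in step 1 can be scripted. This sketch assumes tc prints one line per qdisc beginning with `qdisc <kind>`, which is its usual output format; the helper names are hypothetical.

```shell
# Sketch only: both helpers read `tc qdisc show` output on stdin.

# Succeeds when a token-bucket filter (tbf) qdisc is present.
has_tbf_qdisc() {
    grep -q '^qdisc tbf '
}

# Prints the rate reported on the tbf line, for example 1Mbit.
tbf_rate() {
    sed -n 's/^qdisc tbf .* rate \([^ ]*\).*/\1/p'
}
```

For example, `sudo tc qdisc show dev eth0 | tbf_rate` prints the active rate while the fault runs and prints nothing once the rule is removed. Note that tc may display the rate in a different notation than the one configured.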


Recovery and cleanup

  • End of duration: The chaos pod removes the rule.
  • Abort: Stopping the experiment also removes the rule.
  • Manual recovery: sudo tc qdisc del dev <NETWORK_INTERFACE> root.

Limitations

  • Egress only: The cap applies to egress traffic only.
  • Single interface per run: Repeat the fault for additional interfaces.
  • Unit format: NETWORK_BANDWIDTH must include a tc-compatible unit (kbit, mbit, etc).
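A malformed NETWORK_BANDWIDTH value can be caught before the experiment starts. The check below is a sketch: `is_valid_rate` is a hypothetical helper, and its unit list is an illustrative subset of what tc accepts rather than the complete set.

```shell
# Sketch only: succeeds when the value is <number><unit> with a common tc rate unit.
# In tc, the *bit units are bits per second and the *bps units are BYTES per second.
is_valid_rate() {
    printf '%s\n' "$1" |
        grep -Eq '^[0-9]+(\.[0-9]+)?(bit|kbit|mbit|gbit|tbit|bps|kbps|mbps|gbps|tbps)$'
}
```

For example, `is_valid_rate 512kbit` succeeds, while `is_valid_rate 512` fails because the unit is missing.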

Troubleshooting

VMware network rate limit has no observable effect in Harness Chaos Engineering

Verify that NETWORK_INTERFACE matches the active interface inside the guest (ip a). Verify that the workload transmits more than NETWORK_BANDWIDTH at baseline; if it does not, the cap never takes effect. Verify that VM_USER_NAME can run tc with sudo.
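The interface and sudo checks can be run inside the guest as VM_USER_NAME with a short script. This is a sketch under assumptions: the function names and the SYS_NET_DIR override are hypothetical, and the source does not state whether the fault needs passwordless sudo, so a failing `can_sudo_tc` is a hint rather than proof of a problem.

```shell
# Sketch only: preflight checks to run inside the guest.
# SYS_NET_DIR is a hypothetical override so the check can be tried without a real NIC.

# Succeeds when the named interface exists on this host.
iface_exists() {
    [ -e "${SYS_NET_DIR:-/sys/class/net}/$1" ]
}

# Succeeds when the current user can run tc through sudo without a password prompt.
can_sudo_tc() {
    sudo -n tc qdisc show >/dev/null 2>&1
}
```

For example, `iface_exists eth0 && can_sudo_tc && echo ready` prints `ready` only when both checks pass.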

tc rule remains after the experiment

Run sudo tc qdisc del dev <NETWORK_INTERFACE> root inside the guest to remove lingering rules.