VMware HTTP reset peer

VMware HTTP reset peer is a VMware chaos fault that resets TCP connections to the HTTP service listening on TARGET_SERVICE_PORT inside the Linux VM VM_NAME after RESET_TIMEOUT milliseconds. The fault inserts an HTTP proxy on PROXY_PORT (on interface NETWORK_INTERFACE) that affects a TOXICITY percentage of traffic for TOTAL_CHAOS_DURATION seconds, then restores normal routing. The proxy is launched via VMware Tools (Guest Operations API) as VM_USER_NAME.

Use this fault to test how callers behave when a service abruptly drops connections: whether the caller distinguishes a connection reset from a clean response, whether retries kick in, whether circuit breakers trip, whether monitoring detects the regression within the alerting SLA, and whether on-call alerts fire correctly.

Run your first experiment

If you have not configured the chaos infrastructure yet, go to Quickstart to install the chaos infrastructure and run an experiment end to end.


Use cases

  • Connection reset by peer: When the service resets connections, does the caller retry inside the SLO budget?
  • Half-open connection handling: Do load balancers detect resets and remove the upstream?
  • Alert fidelity: Do downstream alerts fire correctly when reset rate spikes?
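
The first use case can be pictured as a caller that retries only on connection resets and stops once a fixed attempt budget is spent. The sketch below is illustrative, not part of the fault: `retry_on_reset` and `RETRY_DELAY` are hypothetical names, and it assumes the caller is `curl`, which exits with code 56 when receiving data fails (for example, on a reset).

```shell
# Hypothetical helper: retry a command only when it exits with curl's
# code 56 (failure receiving network data, e.g. connection reset by peer).
# Usage: retry_on_reset MAX_ATTEMPTS command [args...]
# RETRY_DELAY (seconds, default 1) is an assumed knob for the pause between tries.
retry_on_reset() (
  max=$1
  shift
  attempt=1
  while :; do
    rc=0
    "$@" || rc=$?
    # Success, or any failure other than a reset, goes back to the caller unchanged.
    if [ "$rc" -ne 56 ]; then
      exit "$rc"
    fi
    # Give up once the attempt budget is spent.
    if [ "$attempt" -ge "$max" ]; then
      exit "$rc"
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY:-1}"
  done
)
```

For example, `retry_on_reset 3 curl -sS -o /dev/null http://<VM_IP>:<TARGET_SERVICE_PORT>/health` makes at most three attempts. A fixed pause keeps the sketch short; a production caller would more likely use backoff with jitter.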

Prerequisites

  • Kubernetes version: 1.21 or later for the chaos infrastructure cluster.
  • VMware Tools running on the guest: Verify with vmware-toolbox-cmd -v.
  • HTTP proxy binary installed inside the guest: Go to VMware Linux binary installation to install the HTTP chaos prerequisite.
  • Free port: PROXY_PORT is not already in use on NETWORK_INTERFACE.
  • Capability for the port: VM_USER_NAME can bind PROXY_PORT (ports below 1024 require sudo or CAP_NET_BIND_SERVICE).
  • Traffic redirected to the proxy: The fault requires iptables (or equivalent) on the guest to route service traffic through PROXY_PORT.
  • vCenter chaos role: GOVC_USERNAME is mapped to the chaos role per VMware permissions.
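
The free-port prerequisite can be checked on the guest before the run. The following is a minimal sketch, assuming `ss` from iproute2 is available and prints the local address and port in the fourth column; `port_is_listening` is a hypothetical helper name.

```shell
# Hypothetical helper: exit 0 if a TCP listener on the given port appears in
# `ss -ltn`-style output read from stdin, exit 1 otherwise.
# Assumes the local address:port is the 4th column of each LISTEN row.
port_is_listening() {
  awk -v port="$1" '
    $1 == "LISTEN" {
      n = split($4, parts, ":")   # works for 0.0.0.0:80 and [::]:80
      if (parts[n] == port) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}
```

On the guest, `ss -ltn | port_is_listening 8080 && echo "PROXY_PORT is already in use"` reports a conflict for the default PROXY_PORT. Reading from stdin keeps the parsing separate from the `ss` call, so the check can be exercised against saved output.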

Supported environments

| Platform | Support status |
| --- | --- |
| Linux VMs hosted on vSphere / vCenter (any distro with VMware Tools, iptables, and the HTTP chaos binary) | Supported |
| Windows VMs | Not supported |

Permissions required

On vCenter. Map GOVC_USERNAME to the chaos role described in VMware permissions. The role needs Guest Operations (Program execution, Modifications, Queries).

On the guest OS. VM_USER_NAME must be able to launch the HTTP chaos binary, bind PROXY_PORT, and update iptables rules for traffic redirection.


Authentication

| Layer | Tunables |
| --- | --- |
| vCenter | GOVC_URL, GOVC_USERNAME, GOVC_PASSWORD, GOVC_INSECURE |
| Guest OS | VM_USER_NAME, VM_PASSWORD |

Store each credential as a text secret in Harness Secret Manager and reference the secret identifier when configuring the experiment.


Fault tunables

Required parameters

| Tunable | Description | Default |
| --- | --- | --- |
| VM_NAME | Name of the target VM as it appears in vCenter. | (required) |
| VM_USER_NAME | OS user account on the target VM. | (required) |
| VM_PASSWORD | Password for VM_USER_NAME. | (required) |
| RESET_TIMEOUT | Time after which the chaos proxy resets the TCP connection (milliseconds). | 2000 |
| TARGET_SERVICE_PORT | Port of the target HTTP service on the guest. | 80 |

HTTP chaos parameters

| Tunable | Description | Default |
| --- | --- | --- |
| NETWORK_INTERFACE | Interface where the proxy is inserted. | ens160 |
| PROXY_PORT | Port the chaos proxy listens on. | 8080 |
| TOXICITY | Percentage of intercepted requests affected (0-100). | 100 |

Chaos parameters

| Tunable | Description | Default |
| --- | --- | --- |
| TOTAL_CHAOS_DURATION | Total duration of the fault in seconds. | 30 |
| CHAOS_INTERVAL | Delay in seconds between iterations. | 10 |
| SEQUENCE | parallel or serial. | parallel |
| RAMP_TIME | Wait period in seconds before and after the fault. | 0 |

vCenter authentication

| Tunable | Description | Default |
| --- | --- | --- |
| GOVC_URL | vCenter server URL. | "" |
| GOVC_USERNAME | vCenter user mapped to the chaos role. | "" |
| GOVC_PASSWORD | Password for GOVC_USERNAME. | "" |
| GOVC_INSECURE | Skip SSL certificate verification when set to true. | true |

Tunables that apply to every fault are documented in common tunables for all faults.
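
The documented ranges and units can be checked before an experiment is submitted. This is a sketch of such a pre-flight check, not something the fault runs itself; `is_uint` and `validate_tunables` are hypothetical names, and the 1-65535 port range is the general TCP range rather than a limit stated by the fault.

```shell
# Hypothetical helper: succeed only for a non-empty string of digits.
is_uint() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical pre-flight check for the tunables documented above.
# Args: TOXICITY PROXY_PORT TARGET_SERVICE_PORT RESET_TIMEOUT TOTAL_CHAOS_DURATION
# Prints one line per problem and returns non-zero if any check fails.
validate_tunables() (
  toxicity=$1
  proxy_port=$2
  target_port=$3
  reset_timeout_ms=$4
  duration_s=$5
  status=0
  fail() { echo "invalid: $1"; status=1; }

  is_uint "$toxicity" && [ "$toxicity" -le 100 ] || fail "TOXICITY must be 0-100"
  for p in "$proxy_port" "$target_port"; do
    is_uint "$p" && [ "$p" -ge 1 ] && [ "$p" -le 65535 ] || fail "port $p must be 1-65535"
  done
  is_uint "$reset_timeout_ms" || fail "RESET_TIMEOUT must be a whole number of milliseconds"
  is_uint "$duration_s" || fail "TOTAL_CHAOS_DURATION must be a whole number of seconds"
  exit "$status"
)
```

With the defaults from the tables, `validate_tunables 100 8080 80 2000 30` prints nothing and returns 0.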


Fault execution in brief

The fault authenticates to vCenter and opens a Guest Operations session on VM_NAME as VM_USER_NAME. It starts an HTTP chaos proxy on PROXY_PORT of NETWORK_INTERFACE and redirects traffic destined for TARGET_SERVICE_PORT through the proxy. For TOTAL_CHAOS_DURATION seconds, the proxy resets the TCP connection after RESET_TIMEOUT milliseconds for TOXICITY percent of requests. The fault then removes the redirection and stops the proxy.


Expected behavior during fault execution

  • A configurable share of TCP connections to TARGET_SERVICE_PORT are reset by the chaos proxy.
  • Callers see ECONNRESET / connection reset by peer errors mid-request.
  • After the duration ends, the redirection is removed and connections complete normally.

When the fault ends

The chaos pod removes the traffic redirection and stops the proxy via Guest Operations. HTTP behavior returns to baseline within seconds.

Signals to watch

  • HTTP error rate: Use an HTTP probe and assert connection-reset errors stay inside the SLO budget.
  • Caller retry behavior: Use a Prometheus probe on caller-side retry metrics.

Verify the fault execution effect

  1. Send an HTTP request to the target service during the chaos window.

    curl -v http://<VM_IP>:<TARGET_SERVICE_PORT>/health

    Roughly TOXICITY percent of requests should fail with curl: (56) Recv failure: Connection reset by peer, about RESET_TIMEOUT milliseconds after the connection opens.

  2. Inspect iptables rules on the guest.

    sudo iptables -t nat -L -n

    You should see the chaos redirection during the window and it should be removed afterwards.
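
A single request says little about a percentage, so step 1 can be repeated and tallied. The sketch below assumes a reset surfaces as curl exit code 56, matching the message in step 1; `probe` and `reset_rate` are hypothetical helper names, and the 10-second `--max-time` is an arbitrary choice.

```shell
# Hypothetical helper: read one curl exit code per line on stdin and print
# the percentage (rounded down) that were 56, curl's receive-failure code.
reset_rate() {
  awk '
    NF { total++; if ($1 == 56) resets++ }
    END { if (total) printf "%d\n", resets * 100 / total; else print 0 }
  '
}

# Hypothetical helper: send COUNT requests to URL and print each curl exit
# code on its own line.
probe() (
  url=$1
  count=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    rc=0
    curl -s -o /dev/null --max-time 10 "$url" || rc=$?
    echo "$rc"
    i=$((i + 1))
  done
)
```

During the chaos window, `probe "http://<VM_IP>:<TARGET_SERVICE_PORT>/health" 20 | reset_rate` prints the observed share of resets, which can be compared with TOXICITY. With a small request count the observed share can differ noticeably from the configured value.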


Recovery and cleanup

  • End of duration: The chaos pod removes the redirection and stops the proxy.
  • Abort: Stopping the experiment also removes the redirection.
  • Manual recovery: If the redirection remains, SSH into the VM, remove the leftover iptables rule, and kill the chaos process listening on PROXY_PORT.
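
For manual recovery, the leftover rule first has to be found. The sketch below assumes the redirection was installed as a REDIRECT rule in the nat table, which this page does not confirm; if the rule has a different shape, inspect the `iptables -t nat -S` output by hand. `print_redirect_deletes` is a hypothetical helper that only prints commands and executes nothing.

```shell
# Hypothetical helper: read `iptables -t nat -S` output on stdin and print a
# matching delete command for every rule that redirects to the given port.
# Assumes REDIRECT rules, which iptables lists with `--to-ports <port>`.
print_redirect_deletes() {
  awk -v port="$1" '
    $1 == "-A" {
      for (i = 2; i < NF; i++) {
        if ($i == "--to-ports" && $(i + 1) == port) {
          sub(/^-A/, "-D")               # turn the append spec into a delete
          print "sudo iptables -t nat " $0
          break
        }
      }
    }
  '
}
```

On the guest, `sudo iptables -t nat -S | print_redirect_deletes 8080` lists candidate delete commands for the default PROXY_PORT; review them before running any. To find the proxy process, `sudo ss -ltnp` shows the PID that owns each listening port.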

Limitations

  • HTTP only: The fault affects HTTP traffic. HTTPS requires the proxy to terminate TLS or the client to trust the proxy CA.
  • Single port per run: Each fault run targets one TARGET_SERVICE_PORT.
  • VMware Tools required: Without VMware Tools, the fault cannot run.

Troubleshooting

VMware HTTP reset peer has no observable effect in Harness Chaos Engineering

Verify that traffic is actually flowing through the chaos proxy (sudo iptables -t nat -L -n on the guest). Confirm TARGET_SERVICE_PORT matches the live service port and the workload talks HTTP, not HTTPS.

VMware HTTP reset peer fails with address already in use

Another process is already listening on PROXY_PORT. Either stop that process or choose a different PROXY_PORT in the experiment tunables.