Heapster scalability testing #5880

Closed
vishh opened this issue Mar 24, 2015 · 18 comments
Labels
priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release.

Comments

@vishh
Contributor

vishh commented Mar 24, 2015

Heapster must be tested to ensure that it meets our v1.0 scalability goals: 100-node clusters (#3876) with each node running 30-50 pods (#4188). A soak test might also be very helpful.
Some of the interesting signals to track include:

  • Heapster uptime
  • Kubelet stats API latency
  • API server watch errors
  • Monitoring Backend (GCM) write errors, QPS
  • Total number of metrics being handled
  • Heapster housekeeping latency

Heapster needs to expose some metrics to aid in scalability testing.

cc @vmarmol
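
To make the last point concrete, here is a minimal sketch of what exposing a few of the signals listed above could look like using Go's standard expvar package. The metric names, port, and housekeeping loop are illustrative assumptions, not Heapster's actual implementation.

```go
package main

import (
	"expvar"
	"log"
	"net/http"
	"time"
)

var (
	startTime = time.Now()

	// Hypothetical metric names, purely for illustration.
	housekeepLatencyMs = expvar.NewFloat("heapster_housekeeping_latency_ms")
	metricsHandled     = expvar.NewInt("heapster_metrics_handled_total")
	gcmWriteErrors     = expvar.NewInt("heapster_gcm_write_errors_total")
)

func main() {
	// Uptime is computed on demand each time /debug/vars is scraped.
	expvar.Publish("heapster_uptime_seconds", expvar.Func(func() interface{} {
		return time.Since(startTime).Seconds()
	}))

	// Pretend housekeeping loop: record how long each pass takes and how
	// many metrics it handled.
	go func() {
		for {
			start := time.Now()
			// ... scrape kubelets, push to the monitoring backend ...
			housekeepLatencyMs.Set(float64(time.Since(start)) / float64(time.Millisecond))
			metricsHandled.Add(1)
			time.Sleep(10 * time.Second)
		}
	}()

	// expvar registers itself on /debug/vars of the default mux;
	// :8082 is just an example port.
	log.Fatal(http.ListenAndServe(":8082", nil))
}
```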

@vishh vishh added priority/backlog Higher priority than priority/awaiting-more-evidence. team/cluster labels Mar 24, 2015
@vishh vishh added this to the v1.0 milestone Mar 24, 2015
@brendandburns brendandburns modified the milestones: v1.0, v1.0-post Apr 28, 2015
@vishh
Contributor Author

vishh commented May 11, 2015

cc @dchen1107

@vishh vishh added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/backlog Higher priority than priority/awaiting-more-evidence. labels May 11, 2015
@dchen1107 dchen1107 modified the milestones: v1.0-candidate, v1.0-post May 13, 2015
@roberthbailey roberthbailey modified the milestones: v1.0, v1.0-candidate May 19, 2015
@vishh
Contributor Author

vishh commented May 27, 2015

I made some changes to heapster to make it scale to 100 nodes without any load.
Next steps:

  1. Measure resource usage of heapster in a 100 node cluster with and without load.
  2. Soak test for heapster to quantify its reliability.

cc @saad-ali @dchen1107

@thockin
Member

thockin commented Jun 5, 2015

@vishh Plans for this? Is there someone available to take this right now?

@vishh
Contributor Author

vishh commented Jun 5, 2015

@saad-ali: Will you be able to look at heapster scalability?

@saad-ali
Member

Abhi (@ArtfulCoder) and I will work on this together.

Ideally we'd like to measure the following metrics:

  • Heapster to API Server
    • QPS
    • Latency
    • Read/write error rate
    • Total number of events being pulled
  • Heapster to Kubelet(s)
    • QPS
    • Latency
    • Read/write error rate
    • Total number of metrics being pulled
  • Heapster to backends (GCM, GCL, InfluxDB)
    • QPS
    • Latency
    • Read/write error rate
    • Total number of events/metrics being pushed

These internal metrics require implementing a Heapster instrumentation infrastructure that doesn't yet exist, so we'll treat this as lower priority and likely a post-v1 task.

Instead we'll focus on getting a baseline of the following basic process metrics for Heapster:

  • Uptime
  • Memory usage
  • CPU usage
  • Network bandwidth usage

These are available today because Heapster is run in a container.

For a first stab at this, we plan on doing the following:

  1. On a cluster of 4-5 nodes, start many (100s?) of containers that essentially do nothing (but will still generate metrics).
  2. Make sure Heapster is running on the cluster, pulling metrics and events, and pushing them to InfluxDB and GCM.
  3. Pull the basic container metrics (CPU, memory, uptime, network usage) for Heapster via a custom script periodically (every 30 sec? 1 min? 5 min? 15 min?); see the sketch after this list.
  4. Let the cluster run for 24-72 hours.
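
A rough sketch of the collection script from step 3, assuming a Docker CLI new enough to support `docker stats --no-stream`; the container name, output path, and polling interval are placeholders:

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"time"
)

func main() {
	const (
		container = "heapster"                // placeholder: the Heapster container name or ID
		outPath   = "/tmp/heapster_stats.log" // where snapshots accumulate
		interval  = 30 * time.Second          // 30s, 1m, 5m, 15m, ...
	)

	out, err := os.OpenFile(outPath, os.O_APPEND|os.O_CREATE|os.O_WRONLY, 0644)
	if err != nil {
		panic(err)
	}
	defer out.Close()

	for {
		// One non-streaming snapshot of CPU %, memory, and network I/O.
		snap, err := exec.Command("docker", "stats", "--no-stream", container).CombinedOutput()
		if err != nil {
			fmt.Fprintf(out, "%s error: %v\n", time.Now().Format(time.RFC3339), err)
		} else {
			fmt.Fprintf(out, "%s\n%s", time.Now().Format(time.RFC3339), snap)
		}
		time.Sleep(interval)
	}
}
```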

@vishh
Contributor Author

vishh commented Jun 11, 2015

Thanks for picking this up!
Step 4 can be tackled by using a real monitoring backend like GCM.

@dchen1107
Member

@saad-ali The plan SGTM. Thanks!

@saad-ali
Member

Abhi and I set up a GCE cluster with 4 nodes yesterday. We scheduled 275 pods (1 container each) on the cluster. Within an hour Heapster stopped sending data to GCM because we hit quota limits:

W0612 02:21:55.875194       1 driver.go:388] [GCM] Push attempt 2 failed: request &{Method:POST URL:https://www.googleapis.com/cloudmonitoring/v2beta2/projects/saads-vms2/timeseries:write Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[Content-Type:[application/json] Authorization:[Bearer ya29.kAFTeh9ov-tLTYj7_5tq1JRyx86qTRlLNRwDEykK0ZvjKSZRX6pBRxDm-B3XpP68TuzZteCrxGLUDQ]] Body:{Reader:0x4c209d730e0} ContentLength:13935 TransferEncoding:[] Close:false Host:www.googleapis.com Form:map[] PostForm:map[] MultipartForm:<nil> Trailer:map[] RemoteAddr: RequestURI: TLS:<nil>} failed with status "403 Forbidden" and response: &{Status:403 Forbidden StatusCode:403 Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[Vary:[Origin X-Origin] Date:[Fri, 12 Jun 2015 02:21:55 GMT] Expires:[Fri, 12 Jun 2015 02:21:55 GMT] Cache-Control:[private, max-age=0] X-Frame-Options:[SAMEORIGIN] Alternate-Protocol:[443:quic,p=1] Content-Type:[application/json; charset=UTF-8] X-Content-Type-Options:[nosniff] X-Xss-Protection:[1; mode=block] Server:[GSE]] Body:0x4c209126780 ContentLength:-1 TransferEncoding:[chunked] Close:false Trailer:map[] Request:0x4c209d36d00 TLS:0x4c208217980}, Body: "{
 "error": {
  "errors": [
   {
    "domain": "usageLimits",
    "reason": 
"quotaExceeded",
    "message": "Request would exceed timeseries quota of 20000"
   }
  ],
  "code": 403,
  "message": "Request would exceed timeseries quota of 20000"
 }
}
"

By morning the error had switched to:

 failed: request &{Method:POST URL:https://www.googleapis.com/cloudmonitoring/v2beta2/projects/saads-vms2/timeseries:write Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[Content-Type:[application/json] Authorization:[Bearer ya29.kAEbkiec73bWCBbuKTM7XPJ7mofV40aDG_7uK196GIR2WKEJsvw-Kef3yq100fHSBR_qoYIOvQQl-A]] Body:{Reader:0x4c20a4db830} ContentLength:3012 TransferEncoding:[] Close:false Host:www.googleapis.com Form:map[] PostForm:map[] MultipartForm:<nil> Trailer:map[] RemoteAddr: RequestURI: TLS:<nil>} failed with status "403 Forbidden" and response: &{Status:403 Forbidden StatusCode:403 Proto:HTTP/1.1 ProtoMajor:1 ProtoMinor:1 Header:map[X-Content-Type-Options:[nosniff] Vary:[Origin X-Origin] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 12 Jun 2015 17:27:18 GMT] Expires:[Fri, 12 Jun 2015 17:27:18 GMT] Cache-Control:[private, max-age=0] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[1; mode=block] Server:[GSE] Alternate-Protocol:[443:quic,p=1]] Body:0x4c209ce58c0 ContentLength:-1 TransferEncoding:[chunked] Close:false Trailer:map[] Request:0x4c209d37380 TLS:0x4c209336a80}, Body:
 "{
 "error": {
  "errors": [
   {
    "domain": "usageLimits",
    "reason": "dailyLimitExceeded",
    "message": "Daily Limit Exceeded"

   }
  ],
  "code": 403,
  "message": "Daily Limit Exceeded"
 }
}
"

To bypass this I will try to get the quota increased; in the meantime, I'll set up a script to scrape docker stats off the machine directly.

CC @erickhan @dchen1107

@vishh
Contributor Author

vishh commented Jun 12, 2015

The quota issue is expected.


@dchen1107
Member

We are filing a request to ask for more quota. Is there an easy way to get that quota, and do we have an estimate of how much quota is needed for a given cluster size (number of nodes, number of pods, etc.)? cc @roberthbailey @a-robinson too.
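
For a rough sense of scale (not an official estimate), the number of distinct timeseries grows roughly as nodes x pods per node x containers per pod x metrics per container; the per-container metric count below is an assumption:

```go
package main

import "fmt"

func main() {
	// All of these are assumptions for a back-of-envelope estimate, not
	// measured values.
	nodes := 100              // v1.0 scalability target
	podsPerNode := 50         // upper end of the 30-50 pods/node goal
	containersPerPod := 1     // plus system containers in practice
	metricsPerContainer := 10 // CPU, memory, network, filesystem, ...
	quota := 20000            // limit from the error message above

	timeseries := nodes * podsPerNode * containersPerPod * metricsPerContainer
	fmt.Printf("estimated distinct timeseries: %d (current quota: %d)\n", timeseries, quota)
}
```

Even with conservative assumptions, a 100-node cluster at the 30-50 pods/node target lands well above the 20,000 timeseries limit from the error above, so any quota request probably needs to scale with cluster size.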

@saad-ali
Member

Here is the initial set of results.

Test Setup

  • Cluster characteristics:
    • 4 nodes
    • n1-standard-1 (1 vCPU, 3.75 GB memory)
  • Test duration: 20 hours total
    • 17 hours under "no load" to get a baseline
      • No additional pods/containers other than default
    • 3 hours under "high load"
      • 135 additional pods, with 10 containers each (1350 extra containers)
      • The container used was a small, statically-linked C program that just sleeps for 28 days, with a container size of 877.6 kB (a rough stand-in is sketched below).
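
A rough Go stand-in for that sleeper workload (the actual test used a statically-linked C binary, which keeps the image far smaller):

```go
package main

import "time"

func main() {
	// Do nothing: just exist as a running container that produces metrics.
	time.Sleep(28 * 24 * time.Hour)
}
```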

Results

Memory

Heapster and InfluxDB memory usage grows proportionally with the number of pods/containers.

Heapster Memory Usage

  • At "no load" memory usage slowly creeped up to 50 MB.
  • At "high load" spiked to around 230MB.
    heapster_mem_1350containers

InfluxDB Memory Usage

  • At "no load" creeped up from 25MB to 50MB.
  • At "high load" spiked to around 354MB.
    influxdb_mem_1350containers

Fluentd/ElasticSearch Memory Usage

  • At "no load" stable around 47MB
  • At "high load" stable around 51MB
    fluentdelasticsearch_mem_1350containers

CPU

The reported CPU usage is cumulative and thus increases over time, so the rate of increase is what matters. The most interesting bit is the relative rate of use over time: in particular, InfluxDB appears to use almost an order of magnitude more CPU than ElasticSearch and even Heapster (I saw InfluxDB consuming, at times, 80% of the CPU on the machine it was on).
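
To make the cumulative-vs-rate point concrete, here is a small sketch of turning two cumulative CPU samples (as reported by cgroups/cAdvisor) into an average utilization over the interval; the sample values are made up:

```go
package main

import (
	"fmt"
	"time"
)

// cpuSample is one reading of a container's cumulative CPU time (total CPU
// consumed since the container started), as exposed by cgroups/cAdvisor.
type cpuSample struct {
	at    time.Time
	total time.Duration
}

// cpuPercent converts two cumulative samples into average utilization over
// the interval, i.e. the "rate of increase". Assumes a single core
// (n1-standard-1); divide by the core count otherwise.
func cpuPercent(prev, cur cpuSample) float64 {
	wall := cur.at.Sub(prev.at)
	if wall <= 0 {
		return 0
	}
	return 100 * float64(cur.total-prev.total) / float64(wall)
}

func main() {
	prev := cpuSample{at: time.Unix(0, 0), total: 90 * time.Second}
	cur := cpuSample{at: time.Unix(60, 0), total: 138 * time.Second}
	fmt.Printf("average CPU over the interval: %.1f%%\n", cpuPercent(prev, cur)) // 80.0%
}
```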

Heapster CPU Usage
heapster_cpu_1350containers

InfluxDB CPU Usage
influxdb_cpu_1350containers

Fluentd/ElasticSearch CPU Usage
fluentdelasticsearch_cpu_1350contianers

Restarts

The Heapster, ElasticSearch, InfluxDB, and Grafana containers never restarted during the test.

@dchen1107
Member

@saad-ali After you restart Heapster, does the memory usage go down? It looks like there may be some memory leak issues.

Edit: I misread @saad-ali's comment above. More load was added later, which is when the spike happened.

@saad-ali
Member

@dchen1107 Yes, once the extra containers are removed, Heapster memory usage drops back down:
heapster_mem_0extracontainers

As does InfluxDB memory usage (though not nearly back to the original levels):
influxdb_mem_0extracontainers

@saad-ali
Member

Here are the results from a 100-node, 5000-pod run.

Test Setup

  • Cluster characteristics:
    • 100 nodes
    • n1-standard-1 (1 vCPU, 3.75 GB memory)
  • Test duration: 20 hours total
    • 3 hours under "no load" to get a baseline
      • No additional pods/containers other than default
    • 17 hours under "high load"
      • 5000 additional pods, with 2 containers each
        • Only 4983 pods could be scheduled (17 pods remained in a waiting state)
      • The container used was a small, statically-linked C program that just sleeps for 28 days, with a container size of 877.6 kB.

Results

Memory

Reconfirmed that Heapster and InfluxDB memory usage grows proportionally with the number of nodes/pods/containers.

Heapster Memory Usage

  • At "no load" memory was around 600 MB.
  • At "high load" memory was stable around 2.4 GB.
    heapster_mem_usage_100nodes_4983pods

InfluxDB Memory Usage

  • At "no load" creeped up to 200 MB.
  • At "high load" up to 1.9 GB (I noticed spikes as high as 3.2 GB while removing pods).
    influxdb_mem_usage_100nodes_4983pods

Fluentd/ElasticSearch Memory Usage

  • At "no load" stable around 45 MB
  • At "high load" between 50-60 MB
    fluentd_mem_usage_100nodes_4983pods

CPU

The CPU usage here is the derivative of the cumulative usage over time and thus shows the rate of change. InfluxDB was consistently high (pegged around 80-95%). Heapster would spike up and down (as high as 99%).

Heapster CPU Usage
heapstercpuusage_100nodes_4983pods

InfluxDB CPU Usage
influxdb_cpu_usage_100nodes_4983pods

Fluentd/ElasticSearch CPU Usage
fluentd_cpu_usage_100nodes_4983pods

Restarts

The Heapster, ElasticSearch, InfluxDB, and Grafana containers never restarted during the test because of crashes, but Heapster did appear to get rescheduled a couple of times (onto the same machine).

@dchen1107
Member

Nice work, @saad-ali. I am closing this one since @vishh filed a separate issue for configuring those addon pods: #10256

dchen1107 added a commit to dchen1107/kubernetes-1 that referenced this issue Jun 23, 2015
dchen1107 added a commit to dchen1107/kubernetes-1 that referenced this issue Jun 25, 2015
spiffxp added a commit to spiffxp/kraken-services that referenced this issue Jun 30, 2015
These appear to be the numbers google is using to hit their 100-node 1.0
goal, per perf testing done under kubernetes/kubernetes#5880

This looks like 12X less data, and I've been finding influx unresponsive
somewhere between 10-20 nodes, so maybe this is all the breathing room we
need.
spiffxp added a commit to spiffxp/kraken-services that referenced this issue Jul 1, 2015
These appear to be the numbers google is using to hit their 100-node 1.0
goal, per perf testing done under kubernetes/kubernetes#5880

The defaults are 10s poll interval, 5s resolution, so this should back
off load by about an order of magnitude.

TODO:
- drop the verbose flag once finished debugging
spiffxp added a commit to spiffxp/kraken-services that referenced this issue Jul 1, 2015
These appear to be the numbers google is using to hit their 100-node 1.0
goal, per perf testing done under kubernetes/kubernetes#5880

The defaults are 10s poll interval, 5s resolution, so this should back
off load by about an order of magnitude.
spiffxp added a commit to spiffxp/kraken-services that referenced this issue Jul 1, 2015
These appear to be the numbers google is using to hit their 100-node 1.0
goal, per perf testing done under kubernetes/kubernetes#5880

The defaults are 10s poll interval, 5s resolution, so this should back
off load by about an order of magnitude.

We're using `avoidColumns=true` to force heapster to avoid additional
columns and instead append all metadata into the series names.  It makes
the series name ugly and hard to aggregate on the grafana side, but it
wildly reduces CPU load.  I guess that's why influxdb docs recommend
more series with fewer points over fewer series with more points.

Grafana's kraken dashboard updated to use the new series.
@jeremyeder

@saad-ali did you ever push this further in total pod count? We're seeing failures after 12k...

@saad-ali
Member

saad-ali commented Nov 4, 2016

This is pretty old. Check out http://blog.kubernetes.io/2016/07/kubernetes-updates-to-performance-and-scalability-in-1.3.html

Check out https://github.com/kubernetes/community/blob/master/sig-scalability/README.md; the folks there should be able to give you the current information and plans, and address any issues you are having with the published numbers.

@jeremyeder

Found out last week that Heapster is being deprecated in favor of a metrics server and other components.
