
Compress recurring events into a single event to optimize etcd storage #4306

Merged
merged 1 commit into kubernetes:master on Feb 12, 2015

Conversation

saad-ali
Member

This implements the first and third items from the design proposal in #4073, compressing duplicate events.

This PR modifies the code that “records/writes” events:

  • It introduces a hash table to track previously fired events:
    • The key is event/object information minus timestamps/count/transient fields
    • The value is an object containing the number of times the event has previously been seen, the first time it was seen, its name, and its resource version (all the fields necessary to issue future updates on the event).
  • So for each new event:
    • If the event matches a previously fired event (found in the hash table), the new PUT (update) event API is called instead of the POST (create) API to update the existing event entry in etcd with the new last-seen timestamp and count, and the entry in the local hash table is updated as well.
    • If the event does not match a previously fired event, the existing behavior kicks in: the POST (create) event API is called to create a new entry in etcd for the event, and on success a new entry is added to the local hash table (see the sketch after this list).
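
Below is a minimal, self-contained sketch of that flow. The type and function names are illustrative only, not the PR's actual code; the real implementation issues real POST/PUT calls where this sketch just prints.

```go
// Illustrative sketch only (names and types are not the PR's actual code).
// The key omits timestamps, count, and other transient fields so that
// recurrences of the same event collide in the hash table.
package main

import (
	"fmt"
	"sync"
	"time"
)

// aggregationKey is the hash-table key: event/object identity minus
// timestamps, count, and other transient fields.
type aggregationKey struct {
	Kind, Namespace, Name, Subobject, Reason, Source, Message string
}

// seenEvent is the hash-table value: everything needed to issue a future
// update (PUT) against the event object that was already created.
type seenEvent struct {
	Count           int
	FirstTimestamp  time.Time
	LastTimestamp   time.Time
	EventName       string // name of the previously written event object
	ResourceVersion string // resource version needed for the update
}

type eventCache struct {
	mu   sync.Mutex
	seen map[aggregationKey]seenEvent
}

func newEventCache() *eventCache {
	return &eventCache{seen: make(map[aggregationKey]seenEvent)}
}

// record chooses between PUT (update the existing entry) and POST (create a
// new one); the Printf calls stand in for the actual API requests.
func (c *eventCache) record(key aggregationKey, now time.Time) {
	c.mu.Lock()
	defer c.mu.Unlock()

	if prev, ok := c.seen[key]; ok {
		// Duplicate: bump count and last-seen time, update etcd via PUT,
		// and refresh the local hash table entry.
		prev.Count++
		prev.LastTimestamp = now
		c.seen[key] = prev
		fmt.Printf("PUT  %-40s count=%d lastSeen=%s\n", prev.EventName, prev.Count, now.Format(time.RFC1123))
		return
	}

	// First occurrence: create a new event object via POST, then remember it.
	entry := seenEvent{
		Count:          1,
		FirstTimestamp: now,
		LastTimestamp:  now,
		EventName:      key.Name + "." + key.Reason,
	}
	c.seen[key] = entry
	fmt.Printf("POST %-40s count=1 firstSeen=%s\n", entry.EventName, now.Format(time.RFC1123))
}

func main() {
	cache := newEventCache()
	key := aggregationKey{
		Kind: "Pod", Namespace: "default", Name: "skydns-gziey",
		Reason: "failedScheduling", Source: "{scheduler }",
		Message: "Error scheduling: no minions available to schedule pods",
	}
	// The same event fired three times collapses into one POST and two PUTs.
	for i := 0; i < 3; i++ {
		cache.record(key, time.Now())
	}
}
```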

This PR also modifies kubectl to output the new timestamps/count fields for events, sorted by the LastSeenTimestamp.

Sample kubectl get events output:

FIRSTTIME                         LASTTIME                          COUNT               NAME                                          KIND                SUBOBJECT                           REASON              SOURCE                                                  MESSAGE
Wed, 11 Feb 2015 01:24:48 +0000   Wed, 11 Feb 2015 01:24:48 +0000   1                   kubernetes-minion-3.c.saad-dev-vms.internal   Minion                                                  starting            {kubelet kubernetes-minion-3.c.saad-dev-vms.internal}   Starting kubelet.
Wed, 11 Feb 2015 01:24:48 +0000   Wed, 11 Feb 2015 01:24:48 +0000   1                   kubernetes-minion-2.c.saad-dev-vms.internal   Minion                                                  starting            {kubelet kubernetes-minion-2.c.saad-dev-vms.internal}   Starting kubelet.
Wed, 11 Feb 2015 01:24:49 +0000   Wed, 11 Feb 2015 01:24:49 +0000   1                   kubernetes-minion-1.c.saad-dev-vms.internal   Minion                                                  starting            {kubelet kubernetes-minion-1.c.saad-dev-vms.internal}   Starting kubelet.
Wed, 11 Feb 2015 01:24:55 +0000   Wed, 11 Feb 2015 01:24:55 +0000   1                   kubernetes-minion-4.c.saad-dev-vms.internal   Minion                                                  starting            {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Starting kubelet.
Wed, 11 Feb 2015 01:24:46 +0000   Wed, 11 Feb 2015 01:25:01 +0000   5                   elasticsearch-logging-controller-fplln        Pod                                                     failedScheduling    {scheduler }                                            Error scheduling: no minions available to schedule pods
Wed, 11 Feb 2015 01:24:46 +0000   Wed, 11 Feb 2015 01:25:01 +0000   5                   kibana-logging-controller-ls6k1               Pod                                                     failedScheduling    {scheduler }                                            Error scheduling: no minions available to schedule pods
Wed, 11 Feb 2015 01:24:46 +0000   Wed, 11 Feb 2015 01:25:01 +0000   5                   monitoring-heapster-controller-0133o          Pod                                                     failedScheduling    {scheduler }                                            Error scheduling: no minions available to schedule pods
Wed, 11 Feb 2015 01:24:46 +0000   Wed, 11 Feb 2015 01:25:01 +0000   5                   monitoring-influx-grafana-controller-oh43e    Pod                                                     failedScheduling    {scheduler }                                            Error scheduling: no minions available to schedule pods
Wed, 11 Feb 2015 01:24:46 +0000   Wed, 11 Feb 2015 01:25:01 +0000   5                   skydns-gziey                                  Pod                                                     failedScheduling    {scheduler }                                            Error scheduling: no minions available to schedule pods
Wed, 11 Feb 2015 01:25:17 +0000   Wed, 11 Feb 2015 01:25:17 +0000   1                   elasticsearch-logging-controller-fplln        BoundPod            implicitly required container POD   pulled              {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Successfully pulled image "kubernetes/pause:latest"
Wed, 11 Feb 2015 01:25:17 +0000   Wed, 11 Feb 2015 01:25:17 +0000   1                   monitoring-influx-grafana-controller-oh43e    Pod                                                     scheduled           {scheduler }                                            Successfully assigned monitoring-influx-grafana-controller-oh43e to kubernetes-minion-2.c.saad-dev-vms.internal
Wed, 11 Feb 2015 01:25:17 +0000   Wed, 11 Feb 2015 01:25:17 +0000   1                   monitoring-heapster-controller-0133o          BoundPod            implicitly required container POD   pulled              {kubelet kubernetes-minion-1.c.saad-dev-vms.internal}   Successfully pulled image "kubernetes/pause:latest"
Wed, 11 Feb 2015 01:25:17 +0000   Wed, 11 Feb 2015 01:25:17 +0000   1                   monitoring-heapster-controller-0133o          Pod                                                     scheduled           {scheduler }                                            Successfully assigned monitoring-heapster-controller-0133o to kubernetes-minion-1.c.saad-dev-vms.internal
Wed, 11 Feb 2015 01:25:17 +0000   Wed, 11 Feb 2015 01:25:17 +0000   1                   skydns-gziey                                  Pod                                                     scheduled           {scheduler }                                            Successfully assigned skydns-gziey to kubernetes-minion-4.c.saad-dev-vms.internal
Wed, 11 Feb 2015 01:25:17 +0000   Wed, 11 Feb 2015 01:25:17 +0000   1                   monitoring-influx-grafana-controller-oh43e    BoundPod            implicitly required container POD   pulled              {kubelet kubernetes-minion-2.c.saad-dev-vms.internal}   Successfully pulled image "kubernetes/pause:latest"
Wed, 11 Feb 2015 01:25:17 +0000   Wed, 11 Feb 2015 01:25:17 +0000   1                   elasticsearch-logging-controller-fplln        Pod                                                     scheduled           {scheduler }                                            Successfully assigned elasticsearch-logging-controller-fplln to kubernetes-minion-4.c.saad-dev-vms.internal
Wed, 11 Feb 2015 01:25:18 +0000   Wed, 11 Feb 2015 01:25:18 +0000   1                   kibana-logging-controller-ls6k1               Pod                                                     scheduled           {scheduler }                                            Successfully assigned kibana-logging-controller-ls6k1 to kubernetes-minion-3.c.saad-dev-vms.internal
Wed, 11 Feb 2015 01:25:18 +0000   Wed, 11 Feb 2015 01:25:18 +0000   1                   kibana-logging-controller-ls6k1               BoundPod            implicitly required container POD   pulled              {kubelet kubernetes-minion-3.c.saad-dev-vms.internal}   Successfully pulled image "kubernetes/pause:latest"
Wed, 11 Feb 2015 01:25:18 +0000   Wed, 11 Feb 2015 01:25:18 +0000   1                   elasticsearch-logging-controller-fplln        BoundPod            implicitly required container POD   created             {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Created with docker id 8da68b3aba1cf35f95137a49a8d892b4d4203ad57e2e66dc0f9ab8104f78e0bd
Wed, 11 Feb 2015 01:25:18 +0000   Wed, 11 Feb 2015 01:25:18 +0000   1                   monitoring-influx-grafana-controller-oh43e    BoundPod            implicitly required container POD   created             {kubelet kubernetes-minion-2.c.saad-dev-vms.internal}   Created with docker id 0d8b2a85801d3ca95c49713aec401de6be7e3b808f9f1ba197f53470ec6ec036
Wed, 11 Feb 2015 01:25:18 +0000   Wed, 11 Feb 2015 01:25:18 +0000   1                   monitoring-heapster-controller-0133o          BoundPod            implicitly required container POD   created             {kubelet kubernetes-minion-1.c.saad-dev-vms.internal}   Created with docker id afa839778d7e395b900ceb4df9dac35edb9945e0d1e8717614cf383b7864e3fd
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   kibana-logging-controller-ls6k1               BoundPod            implicitly required container POD   created             {kubelet kubernetes-minion-3.c.saad-dev-vms.internal}   Created with docker id ace18ac48bb8c50349f8eb0ecbafb065a9edc2e6f12edcb641e5f3f88857e743
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   elasticsearch-logging-controller-fplln        BoundPod            implicitly required container POD   started             {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Started with docker id 8da68b3aba1cf35f95137a49a8d892b4d4203ad57e2e66dc0f9ab8104f78e0bd
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   monitoring-heapster-controller-0133o          BoundPod            implicitly required container POD   started             {kubelet kubernetes-minion-1.c.saad-dev-vms.internal}   Started with docker id afa839778d7e395b900ceb4df9dac35edb9945e0d1e8717614cf383b7864e3fd
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   kibana-logging-controller-ls6k1               BoundPod            implicitly required container POD   started             {kubelet kubernetes-minion-3.c.saad-dev-vms.internal}   Started with docker id ace18ac48bb8c50349f8eb0ecbafb065a9edc2e6f12edcb641e5f3f88857e743
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   monitoring-influx-grafana-controller-oh43e    BoundPod            implicitly required container POD   started             {kubelet kubernetes-minion-2.c.saad-dev-vms.internal}   Started with docker id 0d8b2a85801d3ca95c49713aec401de6be7e3b808f9f1ba197f53470ec6ec036
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   skydns-gziey                                  BoundPod            implicitly required container POD   pulled              {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Successfully pulled image "kubernetes/pause:latest"
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   skydns-gziey                                  BoundPod            implicitly required container POD   created             {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Created with docker id 06b3ec65bf77023547246bfa64f8b3555217dae287c5a218d18353d34a1ad802
Wed, 11 Feb 2015 01:25:19 +0000   Wed, 11 Feb 2015 01:25:19 +0000   1                   skydns-gziey                                  BoundPod            implicitly required container POD   started             {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Started with docker id 06b3ec65bf77023547246bfa64f8b3555217dae287c5a218d18353d34a1ad802
Wed, 11 Feb 2015 01:25:31 +0000   Wed, 11 Feb 2015 01:25:31 +0000   1                   skydns-gziey                                  BoundPod            spec.containers{etcd}               pulled              {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Successfully pulled image "quay.io/coreos/etcd:latest"
Wed, 11 Feb 2015 01:25:32 +0000   Wed, 11 Feb 2015 01:25:32 +0000   1                   skydns-gziey                                  BoundPod            spec.containers{etcd}               created             {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Created with docker id 5c7bae48454bb4141899e032b0f11d86c592946943c6289c307fceeaa4677a2c
Wed, 11 Feb 2015 01:25:32 +0000   Wed, 11 Feb 2015 01:25:32 +0000   1                   skydns-gziey                                  BoundPod            spec.containers{etcd}               started             {kubelet kubernetes-minion-4.c.saad-dev-vms.internal}   Started with docker id 5c7bae48454bb4141899e032b0f11d86c592946943c6289c307fceeaa4677a2c
Wed, 11 Feb 2015 01:26:00 +0000   Wed, 11 Feb 2015 01:26:00 +0000   1                   kibana-logging-controller-ls6k1               BoundPod            spec.containers{kibana-logging}     created             {kubelet kubernetes-minion-3.c.saad-dev-vms.internal}   Created with docker id 07ef5621b6eefab5b5e9a1f287376de43fbadc3743f347e43e8268a75682b7d0
Wed, 11 Feb 2015 01:26:00 +0000   Wed, 11 Feb 2015 01:26:00 +0000   1                   kibana-logging-controller-ls6k1               BoundPod            spec.containers{kibana-logging}     pulled              {kubelet kubernetes-minion-3.c.saad-dev-vms.internal}   Successfully pulled image "kubernetes/kibana:1.0"
Wed, 11 Feb 2015 01:26:00 +0000   Wed, 11 Feb 2015 01:26:00 +0000   1                   kibana-logging-controller-ls6k1               BoundPod            spec.containers{kibana-logging}     started             {kubelet kubernetes-minion-3.c.saad-dev-vms.internal}   Started with docker id 07ef5621b6eefab5b5e9a1f287376de43fbadc3743f347e43e8268a75682b7d0

Because we keep track of event history in memory on the kubelet, compression is best effort. That means compression will not occur across kubelet restarts. Also, as noted in the proposal, if we decide in the future to age out events from the kubelet events hash table, then events will only be compressed until they age out of the hash table, at which point any new instance of the event will create a new entry in etcd.
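
As a purely hypothetical illustration of that aging-out behavior (it is not part of this PR), an expired entry would simply be treated as a cache miss, so the next recurrence of the event would start a fresh etcd entry:

```go
// Hypothetical sketch (not part of this PR): if entries are aged out of the
// kubelet's events hash table, an expired entry is treated as a cache miss,
// so the next recurrence of the event creates a fresh etcd entry.
package main

import (
	"fmt"
	"time"
)

// agingCache maps a simplified aggregation key to the last time the event
// was recorded; entries older than ttl no longer compress.
type agingCache struct {
	ttl      time.Duration
	lastSeen map[string]time.Time
}

// stillCompressible reports whether a recurrence of key should be compressed
// (PUT against the existing event) or recorded as a brand-new event (POST).
func (c *agingCache) stillCompressible(key string, now time.Time) bool {
	seen, ok := c.lastSeen[key]
	if !ok || now.Sub(seen) > c.ttl {
		// Unknown or aged out: start over with a new event entry.
		c.lastSeen[key] = now
		return false
	}
	c.lastSeen[key] = now
	return true
}

func main() {
	c := &agingCache{ttl: 10 * time.Minute, lastSeen: map[string]time.Time{}}
	now := time.Now()
	fmt.Println(c.stillCompressible("Pod/skydns-gziey/failedScheduling", now))                  // false: first sighting
	fmt.Println(c.stillCompressible("Pod/skydns-gziey/failedScheduling", now.Add(time.Minute))) // true: within TTL
	fmt.Println(c.stillCompressible("Pod/skydns-gziey/failedScheduling", now.Add(time.Hour)))   // false: aged out
}
```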

E2E is clean and green:
Ran 20 of 20 Specs in 1308.916 seconds
SUCCESS! -- 20 Passed | 0 Failed | 0 Pending | 0 Skipped
I0210 15:06:40.338927 21318 driver.go:83] All tests pass

@roberthbailey
Contributor

Assigning to @davidopp. /cc @thockin

@saad-ali
Member Author

Fixed the v1.3 issues. Please take a look.

E2E still green:
Ran 20 of 20 Specs in 1350.714 seconds
SUCCESS! -- 20 Passed | 0 Failed | 0 Pending | 0 Skipped
I0211 13:41:08.444144 13635 driver.go:83] All tests pass

// created with the "" namespace. Update also requires the ResourceVersion to be set in the event
// object.
func (e *events) Update(event *api.Event) (*api.Event, error) {
	if e.namespace != "" && event.Namespace != e.namespace {
@smarterclayton
Contributor

This check should be done on the server - there's really no need to do it in the client (it indicates a coding error in most cases).

@saad-ali
Member Author

Removed check.
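
For context, here is a standalone sketch of what the client-side update amounts to after dropping the check: the client PUTs the event, carrying its Name and ResourceVersion, and leaves namespace validation to the server. The URL path, Event fields, and helper below are assumptions for illustration, not the actual client code.

```go
// Standalone illustration only (not the PR's actual client code): update an
// existing event with a bare HTTP PUT and let the server validate it. The
// Event fields and URL path below are assumptions for illustration.
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

type Event struct {
	Name            string `json:"name"`
	Namespace       string `json:"namespace"`
	ResourceVersion string `json:"resourceVersion"`
	Count           int    `json:"count"`
	LastTimestamp   string `json:"lastTimestamp"`
}

// updateEvent PUTs the event to the API server. No client-side namespace
// check: a mismatch is the server's job to reject.
func updateEvent(apiBase string, ev *Event) error {
	if ev.ResourceVersion == "" {
		return fmt.Errorf("event %q must carry a ResourceVersion to be updated", ev.Name)
	}
	body, err := json.Marshal(ev)
	if err != nil {
		return err
	}
	req, err := http.NewRequest(http.MethodPut, apiBase+"/events/"+ev.Name, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 300 {
		return fmt.Errorf("update of event %q failed: %s", ev.Name, resp.Status)
	}
	return nil
}

func main() {
	err := updateEvent("http://localhost:8080/api/v1beta1", &Event{
		Name: "skydns-gziey.failedScheduling", ResourceVersion: "42", Count: 5,
	})
	if err != nil {
		fmt.Println(err)
	}
}
```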

@saad-ali
Member Author

Thanks Clayton! Feedback addressed. PTAL

@smarterclayton
Contributor

LGTM - do you have a clean e2e run?

@saad-ali
Member Author

I've rebased and run e2e a few times and I have a couple of tests failing in each run.

Most recent run had these failures:

• Failure [242.909 seconds]
Pods
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/pods.go:418
  should be restarted with a /healthz http liveness probe [It]
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/pods.go:417

  Did not see the restart count of pod liveness-http in namespace e2e-test-3db8f43b-b2f3-11e4-9a0e-f0921cde18c1 increase from 0 during the test

  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/pods.go:79

• Failure [20.924 seconds]
Pods
/go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/pods.go:418
  should contain environment variables for services [It]
  /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/test/e2e/pods.go:364

  "FOOSERVICE_SERVICE_HOST=" in client env vars
  Expected
      <string>: Internal Error: pod "client-envvars-248dd495-b2f4-11e4-9a0e-f0921cde18c1.default.api" is not in 'Running' state - State: "Succeeded"

  to contain substring
      <string>: FOOSERVICE_SERVICE_HOST=

Looks like e2e is unstable at the moment; looking at the most recent Jenkins run (http://kubekins.dls.corp.google.com/job/kubernetes-e2e-gce/2164/testReport/), the same tests are failing for the same reasons.

@smarterclayton
Contributor

Ok, we can keep an eye on it. Rerunning Travis then merging.

smarterclayton added a commit that referenced this pull request Feb 12, 2015
Compress recurring events into a single event to optimize etcd storage
@smarterclayton smarterclayton merged commit 0b3162b into kubernetes:master Feb 12, 2015
@saad-ali saad-ali deleted the eventCompression1 branch February 13, 2015 01:07