Compress duplicate events #4073

Closed
dchen1107 opened this issue Feb 3, 2015 · 5 comments
Labels: area/introspection, priority/important-soon, sig/node

Comments

@dchen1107
Member

Fork from #3329 ...

When something goes wrong, the system generates a flood of events that are identical except for their timestamps. For example, when a pod references a nonexistent image, the Kubelet keeps generating image_not_existing and container_is_waiting events until an upstream component corrects the image. When this happens, the entire event mechanism becomes useless. It also causes memory pressure for etcd (#3853).

@dchen1107 added the priority/important-soon and area/introspection labels Feb 3, 2015
@saad-ali
Member

saad-ali commented Feb 3, 2015

Design Proposal for Compressing Events

  1. At the point where the kubelet code “records/writes” the event (StartRecording in pkg/client/record/event.go):
    • Introduce a hash table to track previously fired events (where the key is “event, minus timestamps and count” and the value is “number of times previously seen”).
    • For each new event:
      • If the event matches a previously fired event (found in the hash table), increment its count in the hash table and call the new PUT (update) event API, instead of the POST (create) event API, to update the existing event entry with the new last-seen timestamp and count.
      • If the event does not match a previously fired event, create an entry for it in the hash table with an initial count of 1, and continue with the existing behavior (call the POST/create event API) to create a new entry for the event.
  2. In pkg/api/types.go modify the Event struct to:
    • Change Timestamp to InitialTimestamp
    • Add a LastSeenTimeStamp field (by default it will have the same value as InitialTimestamp)
    • Add a Count field (by default it will be 1)
  3. Update kubectl to correctly print out the new/modified Event struct fields.
  4. Extend the Kubernetes API to add “Update Event” support, so that an existing instance of an Event can be replaced with a new instance, instead of always creating a new instance.
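
To make step 1 concrete, here is a minimal Go sketch of the create-or-update flow. It is not the actual code in pkg/client/record/event.go. The `Recorder`, `Sink`, `countingSink`, and `eventKey` names are hypothetical, as are the `InvolvedObject`, `Reason`, and `Message` fields. One deliberate deviation from the proposal: the map stores the whole event rather than just a count, so the initial timestamp is available when building the update. The sketch is not safe for concurrent use.

```go
package main

import (
	"fmt"
	"time"
)

// Event is a simplified, hypothetical stand-in for the Event struct in
// pkg/api/types.go. Only the timestamp and count fields come from the proposal.
type Event struct {
	InvolvedObject    string // hypothetical field: what the event is about
	Reason            string // hypothetical field, e.g. "image_not_existing"
	Message           string // hypothetical field
	InitialTimestamp  time.Time
	LastSeenTimeStamp time.Time
	Count             int
}

// eventKey is "the event, minus timestamps and count". It is comparable,
// so it can be used directly as a map key.
type eventKey struct {
	InvolvedObject, Reason, Message string
}

// Sink stands in for the event API: Create for the existing POST call,
// Update for the proposed PUT call. Both method names are hypothetical.
type Sink interface {
	Create(e Event)
	Update(e Event)
}

// countingSink is a toy Sink that only counts calls.
type countingSink struct{ creates, updates int }

func (s *countingSink) Create(Event) { s.creates++ }
func (s *countingSink) Update(Event) { s.updates++ }

// Recorder remembers previously fired events so duplicates become updates.
type Recorder struct {
	seen map[eventKey]Event
	sink Sink
	now  func() time.Time // injectable clock
}

func NewRecorder(sink Sink) *Recorder {
	return &Recorder{seen: make(map[eventKey]Event), sink: sink, now: time.Now}
}

// Record creates the event on first sight and updates it on every repeat.
func (r *Recorder) Record(object, reason, message string) Event {
	key := eventKey{object, reason, message}
	now := r.now()
	if prev, ok := r.seen[key]; ok {
		// Duplicate: bump the count and last-seen time, then update in place.
		prev.Count++
		prev.LastSeenTimeStamp = now
		r.seen[key] = prev
		r.sink.Update(prev)
		return prev
	}
	// First occurrence: both timestamps start equal and the count starts at 1.
	e := Event{
		InvolvedObject: object, Reason: reason, Message: message,
		InitialTimestamp: now, LastSeenTimeStamp: now, Count: 1,
	}
	r.seen[key] = e
	r.sink.Create(e)
	return e
}

func main() {
	sink := &countingSink{}
	r := NewRecorder(sink)
	var last Event
	for i := 0; i < 3; i++ {
		last = r.Record("pod/web", "image_not_existing", "image not found")
	}
	// Prints: count=3 creates=1 updates=2
	fmt.Printf("count=%d creates=%d updates=%d\n", last.Count, sink.creates, sink.updates)
}
```

Three identical events result in one create and two updates, so the stored event carries the repetition in its count instead of occupying three entries.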

Gotchas to watch out for:

  • Hash table clean up.
    • If kubelet runs for a long period of time and fires a ton of unique events, the hash table could grow very large in memory.
    • Future consideration: remove entries from the hash table that are older than some specified time.
  • Hash table not preserved across Kubelet restarts:
    • If Kubelet restarts, the hash table is cleared, thus, compression will not occur for events across Kubelet restarts. This shouldn't be a problem.
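
The clean-up idea above could look roughly like the following Go sketch, which drops entries that have not been seen within a maximum age. Everything here is an assumption for illustration: the `seenTable` type, its methods, the string key, and the one-hour limit are not part of the proposal, and how often `prune` would run is left open.

```go
package main

import (
	"fmt"
	"time"
)

// entry is what the table remembers per distinct event key.
type entry struct {
	count    int
	lastSeen time.Time
}

// seenTable is a hypothetical hash table with age-based clean-up.
type seenTable struct {
	maxAge  time.Duration
	entries map[string]entry
}

func newSeenTable(maxAge time.Duration) *seenTable {
	return &seenTable{maxAge: maxAge, entries: make(map[string]entry)}
}

// observe records one sighting of key at time now and returns the new count.
func (t *seenTable) observe(key string, now time.Time) int {
	e := t.entries[key] // zero value if the key is absent
	e.count++
	e.lastSeen = now
	t.entries[key] = e
	return e.count
}

// prune deletes entries last seen more than maxAge before now and returns
// how many were removed. Deleting from a map while ranging over it is safe in Go.
func (t *seenTable) prune(now time.Time) int {
	removed := 0
	for key, e := range t.entries {
		if now.Sub(e.lastSeen) > t.maxAge {
			delete(t.entries, key)
			removed++
		}
	}
	return removed
}

// pruneExample observes two keys 50 minutes apart, then prunes at the
// 90-minute mark with a one-hour limit.
func pruneExample() (removed, remaining int) {
	start := time.Date(2015, time.February, 3, 12, 0, 0, 0, time.UTC)
	t := newSeenTable(time.Hour)
	t.observe("image_not_existing", start)
	t.observe("container_is_waiting", start.Add(50*time.Minute))
	removed = t.prune(start.Add(90 * time.Minute))
	return removed, len(t.entries)
}

func main() {
	removed, remaining := pruneExample()
	// Prints: removed=1 remaining=1
	fmt.Printf("removed=%d remaining=%d\n", removed, remaining)
}
```

A pruned event that fires again would start over at a count of 1 and be created rather than updated, which is the same behavior the proposal accepts for Kubelet restarts.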

@smarterclayton
Contributor

Does sort order for events change to "last seen"?

@saad-ali
Member

saad-ali commented Feb 4, 2015

Good question. I was planning on leaving it as "Initial Time", but now that I think about it, it would make more sense to sort on "Last Seen Time".
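
As a rough illustration of the difference, the Go sketch below sorts a few events by last-seen time. The trimmed-down `Event` struct, the `sortByLastSeen` and `exampleEvents` helpers, the sample data, and the ascending (oldest first) direction are all assumptions. The thread does not say which direction kubectl would use.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// Event is a trimmed-down, hypothetical version of the proposed struct.
type Event struct {
	Reason            string
	InitialTimestamp  time.Time
	LastSeenTimeStamp time.Time
	Count             int
}

// sortByLastSeen orders events in place, least recently seen first.
// The stable sort keeps the existing order for events with equal times.
func sortByLastSeen(events []Event) {
	sort.SliceStable(events, func(i, j int) bool {
		return events[i].LastSeenTimeStamp.Before(events[j].LastSeenTimeStamp)
	})
}

// exampleEvents returns made-up sample data, listed in initial-time order.
func exampleEvents() []Event {
	base := time.Date(2015, time.February, 3, 12, 0, 0, 0, time.UTC)
	at := func(minutes int) time.Time {
		return base.Add(time.Duration(minutes) * time.Minute)
	}
	return []Event{
		{Reason: "image_not_existing", InitialTimestamp: at(0), LastSeenTimeStamp: at(10), Count: 40},
		{Reason: "container_is_waiting", InitialTimestamp: at(1), LastSeenTimeStamp: at(9), Count: 35},
		{Reason: "started", InitialTimestamp: at(5), LastSeenTimeStamp: at(5), Count: 1}, // hypothetical reason
	}
}

func main() {
	events := exampleEvents()
	sortByLastSeen(events)
	for _, e := range events {
		fmt.Printf("%s x%d\n", e.Reason, e.Count)
	}
}
```

With this sample data the one-off event moves to the front and the still-repeating image_not_existing event moves to the end, whereas sorting on the initial time would list image_not_existing first even though it is the most recent activity.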

@dchen1107 added the sig/node label Feb 4, 2015
@thockin
Member

thockin commented Feb 6, 2015

We should convert this doc into something we can actually check in to the codebase under docs/design. It doesn't have to be long, but it should explain the design and actually be current.

@saad-ali
Member

saad-ali commented Feb 6, 2015

Great point, will do.
