Kubelet should report available resources for node #4441

Closed
pravisankar opened this issue Feb 13, 2015 · 4 comments
Labels
area/kubelet area/nodecontroller priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. sig/node Categorizes an issue or PR as relevant to SIG Node. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling.

Comments

@pravisankar

Currently the Node 'resources' field is optional. This can cause a few issues:

  1. When the resources field is omitted for some nodes but not for others during node creation, the scheduler does not prefer the nodes with empty resources, so pods are unevenly distributed. (One of my team members ran into this issue while testing the release.)
  2. When the resources field is omitted for all nodes in the cluster, the scheduler load balances the pods. This works well if all nodes are homogeneous (i.e. same cpu/memory limits); otherwise some nodes will be underutilized.
  3. When the resources field is present but has incorrect values (fat finger), it can lead to over/under utilization on the node. We are not doing any resource validation in the platform.

A few options to avoid the above issues:

  1. Make the resources field mandatory for the node and validate the limits.
  2. Auto-generate the node resources if the field is omitted by the user; if the field is defined by the user, validate the limits (user-given limits must be <= the actual limits on the node). If we prefer this option, kubelet can provide a GET '/facts' resource that lists the current cpu/memory limits on the node, and POST /nodes on the master can fetch the node facts and validate or populate the resources field before creating the node representation.
    Either option 1 or 2 might need a migration tool to patch existing nodes in the running cluster.
  3. The scheduler can treat an empty resources field for a node as infinite resources, so the node is still considered for scheduling pods.

If we think the scheduler will at some point be smart enough to reject pod creation based on node resource limits, I feel option 2 will be foolproof and give a better experience to the user.
Please let me know your thoughts on this: @bgrant0607 @ddysher @abhgupta
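
Option 2 above could be sketched roughly as follows. This is only an illustration of the "populate if empty, otherwise validate" rule, not existing Kubernetes code: the `Resources` type, `ReconcileResources`, and `ErrExceedsActual` are hypothetical names, and `actual` stands in for whatever the proposed GET '/facts' endpoint would return.

```go
package main

import (
	"errors"
	"fmt"
)

// Resources is a hypothetical, simplified stand-in for a node's cpu/memory limits.
type Resources struct {
	MilliCPU    int64 // CPU in millicores
	MemoryBytes int64 // memory in bytes
}

// IsZero reports whether the user declared no resources at all.
func (r Resources) IsZero() bool {
	return r.MilliCPU == 0 && r.MemoryBytes == 0
}

// ErrExceedsActual is returned when declared limits are larger than what the node reports.
var ErrExceedsActual = errors.New("declared resources exceed actual node resources")

// ReconcileResources implements the rule from option 2:
//   - if the user omitted the field, populate it from the node's reported facts;
//   - otherwise require declared <= actual for every resource.
func ReconcileResources(declared, actual Resources) (Resources, error) {
	if declared.IsZero() {
		return actual, nil
	}
	if declared.MilliCPU > actual.MilliCPU || declared.MemoryBytes > actual.MemoryBytes {
		return Resources{}, fmt.Errorf("%w: declared %+v, actual %+v",
			ErrExceedsActual, declared, actual)
	}
	return declared, nil
}

func main() {
	// Assumed facts for one node: 4 cores, 8 GiB.
	actual := Resources{MilliCPU: 4000, MemoryBytes: 8 << 30}

	// Field omitted by the user: it is filled in from the facts.
	filled, _ := ReconcileResources(Resources{}, actual)
	fmt.Printf("populated: %+v\n", filled)

	// Fat-fingered value (8 cores on a 4-core node): rejected.
	_, err := ReconcileResources(Resources{MilliCPU: 8000, MemoryBytes: 1 << 30}, actual)
	fmt.Println("rejected:", err != nil)
}
```

The sketch treats the field as either fully omitted or fully specified; a partially filled field (for example cpu only) would need a per-resource rule, which is left open here.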

@derekwaynecarr
Member

I would vote for 2.

I also want to get to a state where a pod will always let me know the resources it could consume, or where we always apply the same defaults.


@bgrant0607 bgrant0607 added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. team/master area/kubelet area/nodecontroller labels Feb 14, 2015
@bgrant0607 bgrant0607 changed the title Minimize potential problems with node 'resources' field Kubelet should report available resources for node Feb 14, 2015
@bgrant0607
Member

In general, Kubelet should report node status. See #4135 for more discussion.

I'm happy with Kubelet reporting total/nominally available resources in status, also. If you want to implement this soon, we can discuss what the API should be exactly.

As discussed in https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/resources.md, however, usage data should not be in node status. We're working on a stats collector that will provide that info (#4057).

cc @rjnagal @vishh @dchen1107 @ddysher

@bgrant0607 bgrant0607 added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed team/master labels Feb 14, 2015
@bgrant0607
Member

@derekwaynecarr Please file a separate issue if you want to discuss defaulting and/or auto-sizing pod resource requests/limits.

@bgrant0607 bgrant0607 added the priority/awaiting-more-evidence Lowest priority. Possibly useful, but not yet enough support to actually get it done. label Feb 14, 2015
@pravisankar
Author

Fixed in #5030.
The node 'resources' field has moved from node spec to node status in v1beta3 and is no longer a user-configurable parameter. The node controller or Kubelet (depending on the --sync_node_status flag) populates the 'resources' field and updates it periodically as part of node status.
