At v1.0, Kubernetes supports clusters up to 100 nodes with 30 pods per node and 1-2 containers per pod.
A cluster is a set of nodes (physical or virtual machines) running Kubernetes agents, managed by a "master" (the cluster-level control plane).
Normally the number of nodes in a cluster is controlled by the value `NUM_MINIONS` in the platform-specific `config-default.sh` file (for example, see GCE's `config-default.sh`).
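For instance, assuming a GCE deployment, the value can typically be overridden from the environment before running the setup script. This is a sketch only; the default-handling shown reflects the common `NUM_MINIONS=${NUM_MINIONS:-4}` pattern in `config-default.sh`, and the exact mechanism may differ on other providers:

```sh
# Bring up a 50-node cluster by overriding the platform default.
# config-default.sh typically reads NUM_MINIONS=${NUM_MINIONS:-4},
# so an exported value takes precedence over the built-in default.
export NUM_MINIONS=50
cluster/kube-up.sh
```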
Simply changing that value to something very large, however, may cause the setup script to fail for many cloud providers. A GCE deployment, for example, will run into quota issues and fail to bring the cluster up.
When setting up a large Kubernetes cluster, the following issues must be considered.
To avoid running into cloud provider quota issues, when creating a cluster with many nodes, consider:
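* Increasing the quota for resources such as CPUs, VM instances, total persistent disk reserved, in-use IP addresses, firewall rules, forwarding rules, routes, and target pools (the exact set varies by provider; the examples here are GCE's).
* Gating the setup script so that it brings up new node VMs in smaller batches with waits in between, because some cloud providers rate limit the creation of VMs.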
To prevent memory leaks or other resource issues in cluster addons from consuming all the resources available on a node, Kubernetes sets resource limits on addon containers to bound the CPU and memory they can consume (see PRs #10653 and #10778).
For example:
```yaml
containers:
- image: gcr.io/google_containers/heapster:v0.15.0
  name: heapster
  resources:
    limits:
      cpu: 100m
      memory: 200Mi
```
These limits, however, are based on data collected from addons running on 4-node clusters (see #10335). The addons consume significantly more resources when running on large clusters (see #5880). So, if a large cluster is deployed without adjusting these values, the addons may be continuously killed because they keep hitting the limits.
To avoid running into cluster addon resource issues, when creating a cluster with many nodes, consider the following:
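* Scale the memory and CPU limits for each of the following addons, if used, along with the size of the cluster. There is one replica of each handling the entire cluster, so its resource usage tends to grow proportionally with cluster size and load: Heapster, InfluxDB and Grafana, kube2sky and skydns, Kibana.
* Scale the number of replicas for the following addons, if used, along with the size of the cluster, and, since load per replica also increases slightly, consider raising their CPU and memory limits as well: elasticsearch.
* Slightly increase the memory and CPU limits for each of the following addons, if used. There is one replica per node, but per-replica usage still grows slightly with cluster size and load: FluentD with ElasticSearch plugin, FluentD with GCP plugin.

As a concrete illustration, on a larger cluster the Heapster limits shown above might be raised along these lines. This is a sketch only; the values are illustrative assumptions, not tuned recommendations, and should be scaled to your cluster's observed usage:

```yaml
containers:
- image: gcr.io/google_containers/heapster:v0.15.0
  name: heapster
  resources:
    limits:
      # Illustrative values for a larger cluster; tune to observed usage.
      cpu: 500m
      memory: 1Gi
```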
For directions on how to detect if addon containers are hitting resource limits, see the Troubleshooting section of Compute Resources.
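As a quick check, a pattern like the following can reveal an addon container that is being killed for exceeding its memory limit. This is a minimal sketch; the pod name placeholder and the assumption that addons run in the `kube-system` namespace depend on your deployment:

```sh
# List the addon pods and look for high restart counts.
kubectl get pods --namespace=kube-system

# Inspect a suspect pod; a last termination reason of "OOM Killed"
# indicates the container is hitting its memory limit.
kubectl describe pod <addon-pod-name> --namespace=kube-system
```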