2026/5/31 7:54:38
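Taints and tolerations are the Kubernetes mechanism for controlling which nodes a Pod may be scheduled onto. A taint is a mark on a node that repels Pods that do not tolerate it; a toleration is a property of a Pod that allows it to be scheduled onto a node carrying a matching taint.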
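```mermaid
flowchart TD
    subgraph ControlPlane["Control plane"]
        A[Scheduler] --> B[Taint manager]
        A --> C[Node controller]
    end
    subgraph Nodes["Node layer"]
        D[Node A] --> E["Taint: dedicated=special:NoSchedule"]
        F[Node B] --> G["Taint: node-role.kubernetes.io/control-plane:NoSchedule"]
        H[Node C] --> I[No taints]
    end
    subgraph Pods["Pod layer"]
        J[Pod1] --> K["Toleration: dedicated=special"]
        L[Pod2] --> M["Toleration: node-role.kubernetes.io/control-plane"]
        N[Pod3] --> O[No tolerations]
    end
    A --> D
    A --> F
    A --> H
    J --> A
    L --> A
    N --> A
```

| Component | Description | Role |
|---|---|---|
| Taint | Mark placed on a node | Repels Pods that do not tolerate it |
| Toleration | Property of a Pod | Allows scheduling onto nodes with matching taints |
| NodeSelector | Node selector | Selects nodes that carry specific labels |
| Affinity | Affinity configuration | Controls Pod scheduling preferences |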
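The table lists taints next to NodeSelector because the two work in opposite directions: a taint keeps Pods away from a node, while a node selector pulls a Pod toward nodes with given labels. A toleration alone never attracts a Pod to a tainted node. The following is a minimal sketch of the label-matching half only; the node names and labels are hypothetical, and `selector_matches` is an illustrative helper, not a Kubernetes API.

```python
def selector_matches(node_labels: dict, node_selector: dict) -> bool:
    """Return True when every nodeSelector key/value pair is present on the node."""
    return all(node_labels.get(key) == value for key, value in node_selector.items())


# Hypothetical node labels, for illustration only.
nodes = {
    "node-a": {"disktype": "ssd", "zone": "us-east-1a"},
    "node-b": {"disktype": "hdd", "zone": "us-east-1a"},
}

selector = {"disktype": "ssd"}
eligible = [name for name, labels in nodes.items() if selector_matches(labels, selector)]
print(eligible)
```

For a dedicated node pool, the usual pattern is to combine both: taint the nodes so other Pods stay off, and label them so the intended Pods are steered there.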
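| Effect | Behavior | Typical use |
|---|---|---|
| NoSchedule | Pods without a matching toleration are not scheduled onto the node | Dedicated nodes, control plane nodes |
| PreferNoSchedule | The scheduler tries to avoid the node but may still use it | Soft scheduling preferences |
| NoExecute | Blocks new Pods and evicts running Pods that do not tolerate the taint | Node maintenance, failed nodes |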
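The matching rule behind these effects can be written down in a few lines. The sketch below is a simplified model, not the scheduler's actual code: the `Taint` and `Toleration` classes and the `blocks_scheduling` helper are hypothetical names introduced here. It assumes the documented rules that an empty toleration effect matches every effect, an empty key with `Exists` matches every key, and `Equal` is the default operator.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    effect: str
    value: str = ""


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key (with Exists) matches every taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect matches every effect


def tolerates(tol: Toleration, taint: Taint) -> bool:
    """Return True when the toleration matches the taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    if tol.operator == "Equal":
        return tol.value == taint.value
    return False


def blocks_scheduling(taints, tolerations) -> bool:
    """True when any NoSchedule/NoExecute taint is left untolerated.

    PreferNoSchedule is only a soft preference, so it never blocks.
    """
    return any(
        taint.effect in ("NoSchedule", "NoExecute")
        and not any(tolerates(tol, taint) for tol in tolerations)
        for taint in taints
    )


node_taints = [Taint(key="dedicated", value="special", effect="NoSchedule")]
pod1 = [Toleration(key="dedicated", operator="Equal", value="special", effect="NoSchedule")]
pod3 = []  # no tolerations

print(blocks_scheduling(node_taints, pod1))
print(blocks_scheduling(node_taints, pod3))
```

The model deliberately ignores `tolerationSeconds` and eviction timing; it only answers whether a node is ruled out at scheduling time.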
```shell
# Add a taint to a node
kubectl taint nodes node-1 dedicated=special:NoSchedule

# Show the taints on a node
kubectl describe node node-1 | grep Taints

# Remove the taint (note the trailing "-")
kubectl taint nodes node-1 dedicated=special:NoSchedule-
```

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: special-pod
spec:
  tolerations:
    - key: "dedicated"
      operator: "Equal"
      value: "special"
      effect: "NoSchedule"
  containers:
    - name: nginx
      image: nginx
```

Toleration operators:
```yaml
# Exists operator: tolerate the taint whenever the key is present, whatever its value
tolerations:
  - key: "dedicated"
    operator: "Exists"
    effect: "NoSchedule"
---
# Equal operator: the taint value must match exactly
tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "special"
    effect: "NoSchedule"
---
# Tolerate a NoExecute taint for a limited time (evicted after 300 seconds)
tolerations:
  - key: "node.kubernetes.io/unreachable"
    operator: "Exists"
    effect: "NoExecute"
    tolerationSeconds: 300
```

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: affinity-pod
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: "disktype"
                operator: "In"
                values:
                  - "ssd"
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 1
          preference:
            matchExpressions:
              - key: "zone"
                operator: "In"
                values:
                  - "us-east-1a"
  containers:
    - name: nginx
      image: nginx
```

```mermaid
flowchart LR
    A[Pod created] --> B[Scheduler filters nodes]
    B --> C{Node has taints?}
    C -->|No| D[Schedule onto the node]
    C -->|Yes| E{Pod has tolerations?}
    E -->|Yes| F{Tolerations match?}
    F -->|Yes| D
    F -->|No| G[Skip the node]
    E -->|No| G
```

```yaml
# Taint a dedicated GPU node (a taint has only key, value and effect; it has no operator field)
apiVersion: v1
kind: Node
metadata:
  name: gpu-node-1
  labels:
    node-role.kubernetes.io/gpu: ""
spec:
  taints:
    - key: "nvidia.com/gpu"
      effect: "NoSchedule"
---
# Allow a GPU Pod to be scheduled onto that node
apiVersion: v1
kind: Pod
metadata:
  name: gpu-pod
spec:
  tolerations:
    - key: "nvidia.com/gpu"
      operator: "Exists"
      effect: "NoSchedule"
  containers:
    - name: gpu-workload
      image: nvidia/cuda:11.0-base
      resources:
        limits:
          nvidia.com/gpu: 1
```

```shell
# Mark a node for maintenance (evicts every Pod that does not tolerate the taint)
kubectl taint nodes node-1 node.kubernetes.io/unschedulable:NoExecute
```

```yaml
# Let a specific Pod keep running for a while during maintenance
apiVersion: v1
kind: Pod
metadata:
  name: critical-pod
spec:
  tolerations:
    - key: "node.kubernetes.io/unschedulable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 3600  # evicted after 1 hour
  containers:
    - name: critical-service
      image: my-critical-service
```

| Challenge | Cause | Solution |
|---|---|---|
| Complex configuration | Many combinations of taints and tolerations | Templated configuration |
| Scheduling conflicts | Multiple constraints contradict each other | Priority-based scheduling |
| Wasted resources | Dedicated nodes are underused | Elastic scheduling |
| Hard to maintain | Large numbers of nodes to manage | Automated management |
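One way to approach the "templated configuration" row is to generate toleration entries from a small set of named profiles instead of hand-writing them in every manifest. The sketch below is only an illustration under that assumption: `toleration`, `PROFILES` and `tolerations_for` are hypothetical names, and the profile contents simply reuse the taints from the examples above. The only Kubernetes rule it relies on is that `tolerationSeconds` is valid solely with the `NoExecute` effect.

```python
def toleration(key, effect, value=None, seconds=None):
    """Build one toleration entry as a plain dict, ready to place in a Pod spec."""
    entry = {"key": key, "effect": effect}
    if value is None:
        entry["operator"] = "Exists"   # no value given: match any value for this key
    else:
        entry["operator"] = "Equal"
        entry["value"] = value
    if seconds is not None:
        if effect != "NoExecute":
            raise ValueError("tolerationSeconds only applies to the NoExecute effect")
        entry["tolerationSeconds"] = seconds
    return entry


# Hypothetical profiles that mirror the examples in this article.
PROFILES = {
    "dedicated": [toleration("dedicated", "NoSchedule", value="special")],
    "gpu": [toleration("nvidia.com/gpu", "NoSchedule")],
    "maintenance": [
        toleration("node.kubernetes.io/unschedulable", "NoExecute", seconds=3600)
    ],
}


def tolerations_for(*profile_names):
    """Merge the tolerations of several profiles, dropping exact duplicates."""
    merged = []
    for name in profile_names:
        for entry in PROFILES[name]:
            if entry not in merged:
                merged.append(dict(entry))
    return merged


print(tolerations_for("gpu", "maintenance"))
```

The resulting list can be dumped to YAML or passed to a client library as the `tolerations` field, so each workload only names the profiles it needs.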
```python
from kubernetes import client, config


class TaintManager:
    def __init__(self):
        # Load credentials from the local kubeconfig
        config.load_kube_config()
        self.api = client.CoreV1Api()

    def add_taint(self, node_name, key, value, effect):
        """Add a taint to a node."""
        taint = client.V1Taint(key=key, value=value, effect=effect)
        node = self.api.read_node(node_name)
        if node.spec.taints is None:
            node.spec.taints = []
        # A taint is identified by key and effect together; skip it if already present
        existing_taints = [
            t for t in node.spec.taints if t.key == key and t.effect == effect
        ]
        if not existing_taints:
            node.spec.taints.append(taint)
            self.api.patch_node(node_name, node)

    def remove_taint(self, node_name, key):
        """Remove every taint with the given key from a node."""
        node = self.api.read_node(node_name)
        if node.spec.taints:
            node.spec.taints = [t for t in node.spec.taints if t.key != key]
            self.api.patch_node(node_name, node)

    def get_nodes_with_taint(self, key):
        """Return the names of the nodes that carry a taint with the given key."""
        nodes = self.api.list_node().items
        return [
            n.metadata.name
            for n in nodes
            if n.spec.taints and any(t.key == key for t in n.spec.taints)
        ]


# Usage example
if __name__ == "__main__":
    manager = TaintManager()
    manager.add_taint('node-1', 'maintenance', 'true', 'NoSchedule')
    tainted_nodes = manager.get_nodes_with_taint('maintenance')
    print(f"Nodes under maintenance: {tainted_nodes}")
```

Taints and tolerations are a key mechanism for controlling which nodes Pods land on: by matching node taints against Pod tolerations, they give fine-grained control over scheduling. As clusters grow and mix dedicated, GPU, and control plane nodes, that control matters more.
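In practice, this means paying attention to requirements analysis, policy design, deployment configuration, and day-to-day operations. With a suitable combination of taints, tolerations, node selectors, and affinity rules, along with templated and automated management, you can build an efficient and reliable scheduling setup.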