通过本文的指导,读者可以了解如何通过二进制的方式部署 Kubernetes 1.27.3 版本集群。二进制部署可以加深对 Kubernetes 各组件的理解,可以灵活地将各个组件部署到不同的机器,以满足自身的要求。但是需要注意的是,二进制部署需要手动配置各个组件,需要一定的技术水平和经验。
二进制部署 k8s 集群 1.27.3 版本
1. 环境准备
虽然 kubeadm、kops、kubespray 以及 rke、kubesphere 等工具可以快速部署 k8s 集群,但是依然会有很多人热衷于使用二进制方式部署 k8s 集群。
二进制部署可以加深对 k8s 各组件的理解,可以灵活地将各个组件部署到不同的机器,以满足自身的要求。还可以生成超长有效期的自签证书,比如 99 年,免去因忘记更新证书导致证书过期而带来的生产事故。
1.1 书写约定
命令行输入,均以 ➜ 符号表示
注释使用 # 或 // 表示
执行命令输出结果,以空行分隔
如无特殊说明,命令需要在全部集群节点执行
1.2 机器规划
1.2.1 操作系统
Rocky-8.6-x86_64-minimal.iso
1.2.2 集群节点
| 角色 | 主机名 | IP | 组件 |
| --- | --- | --- | --- |
| master | master1 | 10.128.170.21 | etcd, kube-apiserver, kube-controller-manager, kubelet, kube-proxy, kube-scheduler |
| worker | worker1 | 10.128.170.131 | kubelet, kube-proxy |
| worker | worker2 | 10.128.170.132 | kubelet, kube-proxy |
| worker | worker3 | 10.128.170.133 | kubelet, kube-proxy |
1.2.3 测试节点(可选)
| 角色 | 主机名 | IP | 组件 |
| --- | --- | --- | --- |
| registry | registry | 10.128.170.235 | docker |
1.3 环境配置
1.3.1 基础设置
1.3.2 k8s 环境设置
关闭防火墙
➜ systemctl stop firewalld
➜ systemctl disable firewalld
关闭 selinux
# 临时
➜ setenforce 0
# 永久
➜ sed -i 's/enforcing/disabled/' /etc/selinux/config
关闭 swap
# 临时
➜ swapoff -a
# 永久
➜ sed -ri 's/.*swap.*/#&/' /etc/fstab
使用 -r 选项可以使用扩展正则表达式,这提供了一种更强大和灵活的方式来匹配文本中的模式。
使用正则表达式 .*swap.* 匹配包含 swap 字符串的行,并在行首添加 # 符号,& 表示匹配到的整个字符串。
设置文件描述符限制
# 临时
➜ ulimit -SHn 65535
# 永久
➜ echo "* - nofile 65535" >>/etc/security/limits.conf
ulimit -SHn 65535 用于设置当前会话的最大文件描述符数限制,具体来说,它会将软限制(-S)和硬限制(-H)都设置为 65535。
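下面是一个简单的验证示例(仅作演示,按需执行),用于确认软、硬限制以及持久化配置是否已生效:

# 查看当前会话的软限制
➜ ulimit -Sn
# 查看当前会话的硬限制
➜ ulimit -Hn
# 确认持久化配置已写入(重新登录后对新会话生效)
➜ grep nofile /etc/security/limits.conf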
时间同步
# 设置时区
➜ timedatectl set-timezone Asia/Shanghai
# 安装时间同步服务
➜ yum install -y chrony
➜ systemctl enable --now chronyd
创建 kubernetes 证书存放目录
➜ mkdir -p /etc/kubernetes/pki
1.3.3 网络环境设置
转发 IPv4 并让 iptables 看到桥接流量
# 加载 br_netfilter 和 overlay 模块
➜ cat > /etc/modules-load.d/k8s.conf << EOF
overlay
br_netfilter
EOF
➜ modprobe overlay
➜ modprobe br_netfilter

# 设置所需的 sysctl 参数,参数在重新启动后保持不变
➜ cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF

# 应用 sysctl 参数而不重新启动
➜ sysctl --system
通过以下命令确认 br_netfilter 和 overlay 模块被加载:
➜ lsmod | grep overlay
➜ lsmod | grep br_netfilter
通过以下命令确认 net.bridge.bridge-nf-call-iptables、net.bridge.bridge-nf-call-ip6tables 和 net.ipv4.ip_forward 系统变量在你的 sysctl 配置中被设置为 1:
➜ sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward
加载 ipvs 模块
➜ yum install -y ipset ipvsadm
➜ cat > /etc/sysconfig/modules/ipvs.modules << "EOF"
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
# modprobe -- nf_conntrack_ipv4
modprobe -- nf_conntrack
EOF
➜ chmod +x /etc/sysconfig/modules/ipvs.modules
➜ /bin/bash /etc/sysconfig/modules/ipvs.modules
# ➜ lsmod | grep -e ip_vs -e nf_conntrack_ipv4
➜ lsmod | grep -e ip_vs -e nf_conntrack
modprobe -- ip_vs: 加载 ip_vs 内核模块,该模块提供了 Linux 内核中的 IP 负载均衡功能。
modprobe -- ip_vs_rr: 加载 ip_vs_rr 内核模块,该模块提供了基于轮询算法的 IP 负载均衡策略。
modprobe -- ip_vs_wrr: 加载 ip_vs_wrr 内核模块,该模块提供了基于加权轮询算法的 IP 负载均衡策略。
modprobe -- ip_vs_sh: 加载 ip_vs_sh 内核模块,该模块提供了基于哈希算法的 IP 负载均衡策略。
modprobe -- nf_conntrack/nf_conntrack_ipv4: 加载 nf_conntrack/nf_conntrack_ipv4 内核模块,该模块提供了 Linux 内核中的网络连接跟踪功能,用于跟踪网络连接的状态。
这些命令通常用于配置 Linux 系统中的负载均衡和网络连接跟踪功能。在加载这些内核模块之后,就可以使用相应的工具和命令来配置和管理负载均衡和网络连接跟踪。例如,可以使用 ipvsadm 命令来配置 IP 负载均衡,使用 conntrack 命令来查看和管理网络连接跟踪表。
如果提示如下错误:
1 "modprobe: FATAL: Module nf_conntrack_ipv4 not found in directory /lib/modules/4.18.0-372.9.1.el8.x86_64"
则需要将 nf_conntrack_ipv4
修改为 nf_conntrack
,然后重新执行命令,因为在高版本内核中已经把 nf_conntrack_ipv4 替换为 nf_conntrack。
nf_conntrack_ipv4 和 nf_conntrack 都是 Linux 内核中的网络连接跟踪模块,用于跟踪网络连接的状态。它们的区别在于:
nf_conntrack_ipv4 模块只能跟踪 IPv4 协议的网络连接,而 nf_conntrack 模块可以跟踪 IPv4 和 IPv6 协议的网络连接。
nf_conntrack_ipv4 模块是 nf_conntrack 模块的一个子模块,它提供了 IPv4 协议的网络连接跟踪功能。因此,如果要使用 nf_conntrack_ipv4 模块,必须先加载 nf_conntrack 模块。
这两个模块通常用于 Linux 系统中的网络安全和网络性能优化。它们可以被用于防火墙、负载均衡、网络流量分析等场景中,以便对网络连接进行跟踪、监控和控制。例如,可以使用 iptables 命令和 nf_conntrack 模块来实现基于连接状态的防火墙规则,或者使用 ipvsadm 命令和 nf_conntrack 模块来实现 IP 负载均衡。
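下面给出一个简单的示意性示例(假设系统中已安装 iptables,规则仅作演示,请勿直接照搬到生产环境):

# 基于连接状态的防火墙规则示例:放行已建立连接及其关联连接的入站流量
➜ iptables -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
# 查看内核当前跟踪的连接数
➜ cat /proc/sys/net/netfilter/nf_conntrack_count
# 查看 ipvs 当前的虚拟服务和转发规则(kube-proxy 以 ipvs 模式运行后,这里会出现对应条目)
➜ ipvsadm -Ln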
1.3.4 重启系统
1.4 下载二进制包
https://kubernetes.io/zh-cn/releases/download/
从官方发布地址下载二进制包,下载 Server Binaries 即可,这个包含了所有所需的二进制文件。
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md
解压后,复制二进制文件 kube-apiserver、kube-controller-manager、kubectl、kubelet、kube-proxy、kube-scheduler 到 master 节点 /usr/local/bin 目录下,复制二进制文件 kubelet、kube-proxy 到 worker 节点 /usr/local/bin 目录下。
在 master1 节点执行以下命令:
➜ cd ~/Downloads
# 下载
➜ wget -c https://dl.k8s.io/v1.27.3/kubernetes-server-linux-amd64.tar.gz
# 解压
➜ tar -zxf kubernetes-server-linux-amd64.tar.gz
# 复制到 master1 节点 /usr/local/bin 目录
➜ cp kubernetes/server/bin/{kubeadm,kube-apiserver,kube-controller-manager,kubectl,kubelet,kube-proxy,kube-scheduler} /usr/local/bin
# 查看复制结果
➜ ls -lh /usr/local/bin/kube*

-rwxr-xr-x 1 root root  46M Jul 10 14:28 /usr/local/bin/kubeadm
-rwxr-xr-x 1 root root 112M Jul 10 14:28 /usr/local/bin/kube-apiserver
-rwxr-xr-x 1 root root 104M Jul 10 14:28 /usr/local/bin/kube-controller-manager
-rwxr-xr-x 1 root root  47M Jul 10 14:28 /usr/local/bin/kubectl
-rwxr-xr-x 1 root root 102M Jul 10 14:28 /usr/local/bin/kubelet
-rwxr-xr-x 1 root root  51M Jul 10 14:28 /usr/local/bin/kube-proxy
-rwxr-xr-x 1 root root  52M Jul 10 14:28 /usr/local/bin/kube-scheduler

# 复制到 worker1 节点 /usr/local/bin 目录
➜ scp kubernetes/server/bin/{kubelet,kube-proxy} root@worker1:/usr/local/bin
# 复制到 worker2 节点 /usr/local/bin 目录
➜ scp kubernetes/server/bin/{kubelet,kube-proxy} root@worker2:/usr/local/bin
# 复制到 worker3 节点 /usr/local/bin 目录
➜ scp kubernetes/server/bin/{kubelet,kube-proxy} root@worker3:/usr/local/bin
1.5 查看镜像版本
在 master1 节点执行以下命令:
# 查看依赖的镜像版本
➜ kubeadm config images list

registry.k8s.io/kube-apiserver:v1.27.3
registry.k8s.io/kube-controller-manager:v1.27.3
registry.k8s.io/kube-scheduler:v1.27.3
registry.k8s.io/kube-proxy:v1.27.3
registry.k8s.io/pause:3.9
registry.k8s.io/etcd:3.5.7-0
registry.k8s.io/coredns/coredns:v1.10.1
2. 容器运行时
本节概述了使用 containerd 作为 CRI 运行时的必要步骤。
https://blog.51cto.com/lajifeiwomoshu/5428345
2.1 安装 containerd
参考 Getting started with containerd 在各节点安装 containerd。
设置 repository
# 安装 yum-utils
➜ yum install -y yum-utils
# 添加 repository
➜ yum-config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
或者
# 添加 repository
➜ dnf config-manager --add-repo https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
查看当前镜像源中支持的 containerd 版本
➜ yum list containerd.io --showduplicates
安装特定版本的 containerd
➜ yum install -y --setopt=obsoletes=0 containerd.io-1.6.21
添加 containerd 配置
https://kubernetes.io/docs/setup/production-environment/container-runtimes/#containerd
https://github.com/containerd/containerd/blob/main/docs/cri/config.md
This document provides the description of the CRI plugin configuration. The CRI plugin config is part of the containerd config (default path: /etc/containerd/config.toml).
See here for more information about containerd config.
Note that the [plugins."io.containerd.grpc.v1.cri"] section is specific to CRI, and not recognized by other containerd clients such as ctr, nerdctl, and Docker/Moby.
# 创建 containerd 配置文件
➜ cat > /etc/containerd/config.toml << EOF
# https://github.com/containerd/containerd/blob/main/docs/cri/config.md
disabled_plugins = []
imports = []
version = 2

[plugins."io.containerd.grpc.v1.cri"]
  sandbox_image = "registry.k8s.io/pause:3.9"

  # https://github.com/containerd/containerd/issues/6964
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
    runtime_type = "io.containerd.runc.v2"

    [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
      SystemdCgroup = true

  [plugins."io.containerd.grpc.v1.cri".registry]
    config_path = "/etc/containerd/certs.d"
EOF
# 创建镜像仓库配置目录
➜ mkdir -p /etc/containerd/certs.d
启动 containerd
➜ systemctl daemon-reload
➜ systemctl enable --now containerd
# 查看 containerd 状态
➜ systemctl status containerd
查看当前 containerd 使用的配置
➜ containerd config dump
测试 containerd(在任意一个集群节点测试即可)
# 拉取 redis 镜像
➜ ctr images pull docker.io/library/redis:alpine
# 创建 redis 容器并运行
➜ ctr run docker.io/library/redis:alpine redis
# 删除 redis 镜像
➜ ctr images delete docker.io/library/redis:alpine
2.2 安装 containerd cli 工具
There are several command line interface (CLI) projects for interacting with containerd:
2.2.1 ctr
While the ctr tool is bundled together with containerd, it should be noted the ctr tool is solely made for debugging containerd. The nerdctl tool provides stable and human-friendly user experience.
2.2.2 crictl
crictl 是 CRI 兼容的容器运行时命令行接口,可以使用它来检查和调试 k8s 节点上的容器运行时和应用程序。主要是用于 kubernetes, 默认操作的命名空间是 k8s.io,而且看到的对象是 pod。
如果是 k8s 环境的话,可以参考下方链接,在每个节点部署 containerd 的时候也部署下 crictl 工具。
kubernetes-sigs/cri-tools: CLI and validation tools for Kubelet Container Runtime Interface (CRI) .
在 master1 节点执行以下命令:
➜ cd ~/Downloads
# 下载
➜ wget -c https://hub.gitmirror.com/https://github.com/kubernetes-sigs/cri-tools/releases/download/v1.27.1/crictl-v1.27.1-linux-amd64.tar.gz
# 解压到 master1 节点 /usr/local/bin 目录
➜ tar -zxvf crictl-v1.27.1-linux-amd64.tar.gz -C /usr/local/bin
# 复制到 worker1 节点 /usr/local/bin 目录
➜ scp /usr/local/bin/crictl root@worker1:/usr/local/bin
# 复制到 worker2 节点 /usr/local/bin 目录
➜ scp /usr/local/bin/crictl root@worker2:/usr/local/bin
# 复制到 worker3 节点 /usr/local/bin 目录
➜ scp /usr/local/bin/crictl root@worker3:/usr/local/bin
创建 crictl 配置文件:
➜ cat > /etc/crictl.yaml << EOF
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
timeout: 30
debug: false
EOF
➜ scp /etc/crictl.yaml root@worker1:/etc
➜ scp /etc/crictl.yaml root@worker2:/etc
➜ scp /etc/crictl.yaml root@worker3:/etc
测试 crictl 工具:
➜ crictl info --output go-template --template '{{.config.sandboxImage}}'

registry.k8s.io/pause:3.9

➜ crictl inspecti --output go-template --template '{{.status.pinned}}' registry.k8s.io/pause:3.9

FATA[0000] no such image "registry.k8s.io/pause:3.9" present
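作为补充,下面列出几个日常排查时常用的 crictl 命令(仅为示例,在后续集群组件部署完成并有 Pod 运行之后输出才有意义):

# 查看节点上的 Pod(crictl 默认操作 k8s.io 命名空间)
➜ crictl pods
# 查看节点上的容器
➜ crictl ps -a
# 查看节点上已拉取的镜像
➜ crictl images
# 查看某个容器的日志(CONTAINER_ID 为示例占位符)
# ➜ crictl logs CONTAINER_ID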
2.2.3 nerdctl(推荐)
https://github.com/containerd/nerdctl
对于单机节点来说,推荐使用 nerdctl,其用法和 docker 类似,基本没有学习成本,把 docker 换成 nerdctl 即可。对于普通用户来说,这是比较友好的工具;如果需要以编程方式(API)访问 containerd,则可以选择其他客户端。
nerdctl 有两种版本:
Minimal (nerdctl-1.4.0-linux-amd64.tar.gz): nerdctl only
Full (nerdctl-full-1.4.0-linux-amd64.tar.gz): Includes dependencies such as containerd, runc, and CNI
这里我选择的是 Minimal 的版本,直接到 https://github.com/containerd/nerdctl/releases 下载。
在 master1 节点执行以下命令:
➜ cd ~/Downloads
# 下载
➜ wget -c https://hub.gitmirror.com/https://github.com/containerd/nerdctl/releases/download/v1.4.0/nerdctl-1.4.0-linux-amd64.tar.gz
# 解压到 master1 节点 /usr/local/bin 目录
➜ tar Cxzvvf /usr/local/bin nerdctl-1.4.0-linux-amd64.tar.gz
# 复制到 worker1 节点 /usr/local/bin 目录
➜ scp /usr/local/bin/{containerd-rootless-setuptool.sh,containerd-rootless.sh,nerdctl} root@worker1:/usr/local/bin
# 复制到 worker2 节点 /usr/local/bin 目录
➜ scp /usr/local/bin/{containerd-rootless-setuptool.sh,containerd-rootless.sh,nerdctl} root@worker2:/usr/local/bin
# 复制到 worker3 节点 /usr/local/bin 目录
➜ scp /usr/local/bin/{containerd-rootless-setuptool.sh,containerd-rootless.sh,nerdctl} root@worker3:/usr/local/bin
测试 nerdctl(在任意一个集群节点测试即可)
# 拉取镜像
➜ nerdctl image pull redis:alpine
# 查看镜像
➜ nerdctl image ls
# 删除镜像
➜ nerdctl image rm redis:alpine
# 运行 nginx 服务(需要 CNI plugin)
# ➜ nerdctl run -d --name nginx -p 80:80 nginx:alpine
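需要注意,nerdctl 默认操作的是 default 命名空间;如果想查看后续 kubernetes(CRI)使用的镜像和容器,需要显式指定 k8s.io 命名空间,例如下面这个简单示例:

# 查看 kubernetes(CRI)使用的镜像
➜ nerdctl --namespace k8s.io image ls
# 查看 kubernetes(CRI)创建的容器
➜ nerdctl --namespace k8s.io ps -a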
2.3 设置公共仓库镜像源(可选)
https://github.com/containerd/containerd/blob/main/docs/cri/registry.md
https://github.com/containerd/containerd/blob/main/docs/hosts.md
由于某些因素,在国内拉取公共镜像仓库的速度是极慢的,为了节约拉取时间,需要为 containerd 配置镜像仓库的 mirror。
containerd 的镜像仓库 mirror 与 docker 相比有两个区别:
拉取镜像时,默认都是从 docker hub 上拉取,如果镜像名前不加 registry 地址的话,默认会给你加上 docker.io/library。
需要注意的是:
如果 hosts.toml 文件中的 capabilities 中不加 resolve 的话,无法加速镜像
配置无需重启服务,即可生效
要配保底的加速站点,否则可能会导致下载失败
2.3.1 docker.io
https://yeasy.gitbook.io/docker_practice/install/mirror
# 创建 docker.io 目录
➜ mkdir -p /etc/containerd/certs.d/docker.io
# 创建 docker.io 仓库配置文件
➜ cat > /etc/containerd/certs.d/docker.io/hosts.toml << EOF
# https://blog.csdn.net/qq_44797987/article/details/112681224
server = "https://docker.io"

[host."https://dockerproxy.com"]
  capabilities = ["pull", "resolve"]

[host."https://ccr.ccs.tencentyun.com"]
  capabilities = ["pull", "resolve"]

[host."https://hub-mirror.c.163.com"]
  capabilities = ["pull", "resolve"]

[host."https://mirror.baidubce.com"]
  capabilities = ["pull", "resolve"]

[host."https://registry-1.docker.io"]
  capabilities = ["pull", "resolve", "push"]
EOF
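配置完成后,可以参考下面的方式验证 mirror 是否生效(示例命令,在任意节点执行即可):

# ctr 需要显式指定 --hosts-dir 才会使用 mirror 配置
➜ ctr --debug images pull --hosts-dir "/etc/containerd/certs.d" docker.io/library/redis:alpine
# crictl 走 CRI,会自动使用 config.toml 中 config_path 指向的 mirror 配置
➜ crictl pull redis:alpine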
2.3.2 registry.k8s.io
# 创建 registry.k8s.io 目录
➜ mkdir -p /etc/containerd/certs.d/registry.k8s.io
# 创建 registry.k8s.io 仓库配置文件
➜ cat > /etc/containerd/certs.d/registry.k8s.io/hosts.toml << EOF
server = "https://registry.k8s.io"

[host."https://registry.aliyuncs.com/v2/google_containers"]
  capabilities = ["pull", "resolve"]
  override_path = true
EOF
registry.aliyuncs.com/google_containers 这个镜像仓库站点不是 registry.k8s.io 的 mirror,只是有 registry.k8s.io 的镜像,这就是为什么 registry.k8s.io 有些镜像在 registry.aliyuncs.com/google_containers 没有的原因。
通过执行以下命令并观察输出可以知道,从 "registry.aliyuncs.com/google_containers" 拉取 pause 镜像正确的请求是 "https://registry.cn-hangzhou.aliyuncs.com/v2/google_containers/pause/manifests/3.9 "。
# 在 master1 节点进行测试
➜ ctr --debug images pull -k registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.9

...
DEBU[0000] do request host=registry.cn-hangzhou.aliyuncs.com request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=containerd/1.6.21 request.method=HEAD url="https://registry.cn-hangzhou.aliyuncs.com/v2/google_containers/pause/manifests/3.9"
...
使用 ctr 命令拉取 "registry.k8s.io/pause:3.9" 镜像,如果最终拼接出的请求不是 "https://registry.cn-hangzhou.aliyuncs.com/v2/google_containers/pause/manifests/3.9 ",则会导致拉取失败。
根据文档 https://github.com/containerd/containerd/blob/main/docs/hosts.md 可以知道:
pull [registry_host_name|IP address][:port][/v2][/org_path]<image_name>[:tag|@DIGEST]
拉取请求格式中的 /v2 部分指的是分发(distribution)API 的版本。如果拉取请求中未包含 /v2,则默认情况下,所有符合上述链接中分发规范的客户端都会自动添加 /v2。这可能会导致与不符合该规范的 OCI registry 配合使用时出现问题。
以拉取 "registry.k8s.io/pause:3.9" 镜像为例,在 master1 节点上进行测试 ,分析几种配置情况下实际的拉取请求 URL 及拉取结果:
2.3.2.1 配置一(错误)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 # 设置 registry.k8s.io 配置 ➜ cat > /etc/containerd/certs.d/registry.k8s.io/hosts.toml << EOF server = "https://registry.k8s.io" [host."https://registry.aliyuncs.com/google_containers"] capabilities = ["pull", "resolve"] EOF # 测试镜像拉取 ➜ ctr --debug images pull --hosts-dir "/etc/containerd/certs.d" registry.k8s.io/pause:3.9 DEBU[0000] fetching image="registry.k8s.io/pause:3.9" DEBU[0000] loading host directory dir=/etc/containerd/certs.d/registry.k8s.io DEBU[0000] resolving host=registry.aliyuncs.com DEBU[0000] do request host=registry.aliyuncs.com request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=containerd/1.6.21 request.method=HEAD url="https://registry.aliyuncs.com/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] fetch response received host=registry.aliyuncs.com response.header.content-length=19 response.header.content-type="text/plain; charset=utf-8" response.header.date="Wed, 12 Jul 2023 14:40:38 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.x-content-type-options=nosniff response.status="404 Not Found" url="https://registry.aliyuncs.com/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io" INFO[0000] trying next host - response was http.StatusNotFound host=registry.aliyuncs.com ...
由输出可知,因为实际请求的 URL "https://registry.aliyuncs.com/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io " 不正确,所以拉取失败。
2.3.2.2 配置二(错误)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 # 设置 registry.k8s.io 配置 ➜ cat > /etc/containerd/certs.d/registry.k8s.io/hosts.toml << EOF server = "https://registry.k8s.io" [host."https://registry.aliyuncs.com/v2/google_containers"] capabilities = ["pull", "resolve"] EOF # 测试镜像拉取 ➜ ctr --debug images pull --hosts-dir "/etc/containerd/certs.d" registry.k8s.io/pause:3.9 DEBU[0000] fetching image="registry.k8s.io/pause:3.9" DEBU[0000] loading host directory dir=/etc/containerd/certs.d/registry.k8s.io DEBU[0000] resolving host=registry.aliyuncs.com DEBU[0000] do request host=registry.aliyuncs.com request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=containerd/1.6.21 request.method=HEAD url="https://registry.aliyuncs.com/v2/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] fetch response received host=registry.aliyuncs.com response.header.content-length=169 response.header.content-type="application/json; charset=utf-8" response.header.date="Wed, 12 Jul 2023 15:02:21 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.www-authenticate="Bearer realm=\"https://dockerauth.cn-hangzhou.aliyuncs.com/auth\",service=\"registry.aliyuncs.com:cn-hangzhou:26842\",scope=\"repository:google_containers/v2/pause:pull\"" response.status="401 Unauthorized" url="https://registry.aliyuncs.com/v2/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] Unauthorized header="Bearer realm=\"https://dockerauth.cn-hangzhou.aliyuncs.com/auth\",service=\"registry.aliyuncs.com:cn-hangzhou:26842\",scope=\"repository:google_containers/v2/pause:pull\"" host=registry.aliyuncs.com DEBU[0000] do request host=registry.aliyuncs.com request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=containerd/1.6.21 request.method=HEAD url="https://registry.aliyuncs.com/v2/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] fetch response received host=registry.aliyuncs.com response.header.content-length=169 response.header.content-type="application/json; charset=utf-8" response.header.date="Wed, 12 Jul 2023 15:02:21 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.www-authenticate="Bearer realm=\"https://dockerauth.cn-hangzhou.aliyuncs.com/auth\",service=\"registry.aliyuncs.com:cn-hangzhou:26842\",scope=\"repository:google_containers/v2/pause:pull\",error=\"insufficient_scope\"" response.status="401 Unauthorized" url="https://registry.aliyuncs.com/v2/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] Unauthorized header="Bearer realm=\"https://dockerauth.cn-hangzhou.aliyuncs.com/auth\",service=\"registry.aliyuncs.com:cn-hangzhou:26842\",scope=\"repository:google_containers/v2/pause:pull\",error=\"insufficient_scope\"" host=registry.aliyuncs.com ...
由输出可知,因为实际请求的 URL "https://registry.aliyuncs.com/v2/google_containers/v2/pause/manifests/3.9?ns=registry.k8s.io " 不正确,所以拉取失败。
2.3.2.3 配置三(正确)
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 # 设置 registry.k8s.io 配置 ➜ cat > /etc/containerd/certs.d/registry.k8s.io/hosts.toml << EOF server = "https://registry.k8s.io" [host."https://registry.aliyuncs.com/v2/google_containers"] capabilities = ["pull", "resolve"] override_path = true EOF # 测试镜像拉取 ➜ ctr --debug images pull --hosts-dir "/etc/containerd/certs.d" registry.k8s.io/pause:3.9 DEBU[0000] fetching image="registry.k8s.io/pause:3.9" DEBU[0000] loading host directory dir=/etc/containerd/certs.d/registry.k8s.io DEBU[0000] resolving host=registry.aliyuncs.com DEBU[0000] do request host=registry.aliyuncs.com request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=containerd/1.6.21 request.method=HEAD url="https://registry.aliyuncs.com/v2/google_containers/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] fetch response received host=registry.aliyuncs.com response.header.content-length=166 response.header.content-type="application/json; charset=utf-8" response.header.date="Wed, 12 Jul 2023 15:04:06 GMT" response.header.docker-distribution-api-version=registry/2.0 response.header.www-authenticate="Bearer realm=\"https://dockerauth.cn-hangzhou.aliyuncs.com/auth\",service=\"registry.aliyuncs.com:cn-hangzhou:26842\",scope=\"repository:google_containers/pause:pull\"" response.status="401 Unauthorized" url="https://registry.aliyuncs.com/v2/google_containers/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] Unauthorized header="Bearer realm=\"https://dockerauth.cn-hangzhou.aliyuncs.com/auth\",service=\"registry.aliyuncs.com:cn-hangzhou:26842\",scope=\"repository:google_containers/pause:pull\"" host=registry.aliyuncs.com DEBU[0000] do request host=registry.aliyuncs.com request.header.accept="application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.list.v2+json, application/vnd.oci.image.manifest.v1+json, application/vnd.oci.image.index.v1+json, */*" request.header.user-agent=containerd/1.6.21 request.method=HEAD url="https://registry.aliyuncs.com/v2/google_containers/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] fetch response received host=registry.aliyuncs.com response.header.content-length=2405 response.header.content-type=application/vnd.docker.distribution.manifest.list.v2+json response.header.date="Wed, 12 Jul 2023 15:04:07 GMT" response.header.docker-content-digest="sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097" response.header.docker-distribution-api-version=registry/2.0 response.header.etag="\"sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097\"" response.status="200 OK" url="https://registry.aliyuncs.com/v2/google_containers/pause/manifests/3.9?ns=registry.k8s.io" DEBU[0000] resolved desc.digest="sha256:7031c1b283388d2c2e09b57badb803c05ebed362dc88d84b480cc47f72a21097" host=registry.aliyuncs.com ...
由输出可知,因为实际请求的 URL "https://registry.aliyuncs.com/v2/google_containers/pause/manifests/3.9?ns=registry.k8s.io " 正确,所以拉取成功。
https://github.com/containerd/containerd/blob/main/docs/hosts.md#override_path-field
override_path is used to indicate the host's API root endpoint is defined in the URL path rather than by the API specification. This may be used with non-compliant OCI registries which are missing the /v2 prefix. (Defaults to false)
2.3.3 quay.io
# 创建 quay.io 目录
➜ mkdir -p /etc/containerd/certs.d/quay.io
# 创建 quay.io 仓库配置文件
➜ cat > /etc/containerd/certs.d/quay.io/hosts.toml << EOF
server = "https://quay.io"

[host."https://quay-mirror.qiniu.com"]
  capabilities = ["pull", "resolve"]
EOF
2.4 设置私有仓库(可选)
2.4.1 部署私有仓库
在用于测试的 registry 节点部署私有仓库,以测试 containerd 使用私有仓库的场景。
2.4.1.1 安装 docker
参考文档 Install Docker Engine on CentOS ,在 registry 节点安装 docker:
2.4.1.2 部署 registry
https://docs.docker.com/registry/deploying/
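原文此处未给出具体命令,下面是一个最小化的部署示意(假设 registry 节点已按官方文档安装好 docker,监听端口与后文 hosts.toml 中的 registry.local:5000 对应):

# 在 registry 节点上运行一个本地镜像仓库(registry:2 为 Docker 官方镜像)
➜ docker run -d --restart=always --name registry -p 5000:5000 registry:2
# 验证仓库可访问
➜ curl http://127.0.0.1:5000/v2/_catalog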
2.4.2 registry.local
# 创建 registry.local 目录
➜ mkdir -p /etc/containerd/certs.d/registry.local
# 创建 registry.local 仓库配置文件
➜ cat > /etc/containerd/certs.d/registry.local/hosts.toml << EOF
server = "https://registry.local"

[host."http://registry.local:5000"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF
# 追加 /etc/hosts 配置
➜ cat >> /etc/hosts << EOF
10.128.170.235 registry.local
EOF
2.4.3 测试私有仓库
在任意一个集群节点测试即可:
# 拉取镜像
➜ ctr --debug images pull --hosts-dir "/etc/containerd/certs.d" registry.local/redis:alpine
需要注意的是:
ctr image pull 的时候需要注意镜像名称需要完整,否则无法拉取,格式如下:
[registry_host_name|IP address][:port][/v2][/org_path]<image_name>[:tag|@DIGEST]
因为 ctr 不使用 CRI,所以默认不会使用 config.toml 中 cri 的配置,如果拉取镜像时希望使用 mirror,则需要指定 --hosts-dir。
3. 创建 ca 证书
证书操作如无特殊说明,只需在 master1 节点执行即可。
3.1 安装 cfssl
cfssl 是一款证书签署工具,使用 cfssl 可以大大简化证书签署过程,方便颁发自签证书。
CloudFlare's distributes cfssl source code on github page and binaries on cfssl website .
Our documentation assumes that you will run cfssl on your local x86_64 Linux host.
https://github.com/cloudflare/cfssl/releases/tag/v1.6.4
# 下载并重命名
➜ curl -L -o /usr/local/bin/cfssl https://download.nuaa.cf/cloudflare/cfssl/releases/download/v1.6.4/cfssl_1.6.4_linux_amd64
➜ curl -L -o /usr/local/bin/cfssljson https://download.nuaa.cf/cloudflare/cfssl/releases/download/v1.6.4/cfssljson_1.6.4_linux_amd64
# 赋予可执行权限
➜ chmod +x /usr/local/bin/{cfssl,cfssljson}
离线安装的情况,直接把两个文件下载下来重命名即可。
3.2 创建 ca 证书
创建的证书统一放到 /etc/kubernetes/ssl 目录,创建后复制到 /etc/kubernetes/pki 目录。
# 创建 /etc/kubernetes/ssl 目录
➜ mkdir -p /etc/kubernetes/ssl
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# ca 证书创建申请
➜ cat > ca-csr.json << EOF
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "k8s",
      "OU": "system"
    }
  ],
  "ca": {
    "expiry": "87600h"
  }
}
EOF

# 创建 ca 证书
➜ cfssl gencert -initca ca-csr.json | cfssljson -bare ca

# 验证结果,会生成两个证书文件
➜ ls -lh ca*pem

-rw------- 1 root root 1.7K Jul 13 12:38 ca-key.pem
-rw-r--r-- 1 root root 1.4K Jul 13 12:38 ca.pem

# 复制 ca 证书到 /etc/kubernetes/pki
➜ cp ca*pem /etc/kubernetes/pki
ca-csr.json 这个文件是 Kubernetes 集群中使用的根证书的签名请求 (CSR) 配置文件,用于定义 CA 证书的签名请求配置。
在这个配置文件中,CN 字段指定了证书的通用名称为 "kubernetes",key 字段指定了证书的密钥算法为 RSA,密钥长度为 2048 位。names 字段定义了证书的其他信息,如国家、省份、城市、组织和组织单位等。ca 字段指定了证书的过期时间为 87600 小时(即 10 年)。
这个配置文件用于创建 Kubernetes 集群中的 CA 证书,以便对集群中的其他证书进行签名和认证。
CN(Common Name): kube-apiserver 从证书中提取该字段作为请求的用户名 (User Name)
names[].O(Organization): kube-apiserver 从证书中提取该字段作为请求用户所属的组 (Group)
由于这里是 CA 证书,是签发其它证书的根证书,这个证书密钥不会分发出去作为 client 证书,所有组件使用的 client 证书都是由 CA 证书签发而来,所以 CA 证书的 CN 和 O 的名称并不重要,后续其它签发出来的证书的 CN 和 O 的名称才是有用的。
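签发完成后,可以用 openssl 或 cfssl 查看证书中的 CN、O 等字段以及有效期,作为一个简单的检查示例:

# 查看 ca.pem 的 subject(包含 CN/O/OU 等)和有效期
➜ openssl x509 -in /etc/kubernetes/ssl/ca.pem -noout -subject -dates
# 也可以使用 cfssl 以 JSON 形式查看证书内容
➜ cfssl certinfo -cert /etc/kubernetes/ssl/ca.pem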
3.3 创建签发配置文件
由于各个组件都需要配置证书,并且依赖 CA 证书来签发证书,所以我们首先要生成好 CA 证书以及后续的签发配置文件。
创建用于签发其它证书的配置文件:
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# 证书签发配置文件
➜ cat > ca-config.json << EOF
{
  "signing": {
    "default": {
      "expiry": "87600h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "87600h"
      }
    }
  }
}
EOF
ca-config.json 这个文件是签发其它证书的配置文件,用于定义签名配置和证书配置。其中,signing 字段定义了签名配置,profiles 字段定义了不同场景下的证书配置。
在这个配置文件中,default 配置指定了默认的证书过期时间为 87600 小时(即 10 年),profiles 配置定义了一个名为 "kubernetes" 的证书配置,它指定了证书的用途(签名、密钥加密、服务器认证和客户端认证)和过期时间。
这个配置文件用于创建 Kubernetes 集群中的证书和密钥,以便对集群进行安全认证和加密通信。
signing:定义了签名配置,包括默认的签名过期时间和各个证书配置的签名过期时间。
profiles:定义了不同场景下的证书配置,包括证书的用途、过期时间和其他属性。
在使用 cfssl gencert 命令生成证书时,可以使用 -config 参数指定配置文件,以便根据配置文件中的规则生成符合要求的证书。如果不指定 -config 参数,则 cfssl gencert 命令将使用默认的配置。
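如果想了解"默认的配置"长什么样,可以用 cfssl 自带的 print-defaults 子命令打印出来对照(仅作参考):

# 打印 cfssl 默认的签发配置(对应不指定 -config 时的行为)
➜ cfssl print-defaults config
# 打印默认的证书签署申请(csr)模板,可作为编写 *-csr.json 文件的参考
➜ cfssl print-defaults csr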
4. 部署 etcd
根据 kubeadm 获取的信息,在 Kubernetes 1.27.3 版本中 etcd 使用的版本是 3.5.7,只需在 master 节点(也即 master1 节点)部署即可 。
https://github.com/etcd-io/etcd/releases/tag/v3.5.7
4.1 颁发证书
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# etcd 证书签署申请
# hosts 字段中,IP 为所有 etcd 集群节点地址,这里可以做好规划,预留几个 IP,以备后续扩容。
➜ cat > etcd-csr.json << EOF
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "10.128.170.21",
    "10.128.170.22",
    "10.128.170.23",
    "localhost",
    "master1",
    "master2",
    "master3",
    "master1.local",
    "master2.local",
    "master3.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "k8s",
      "OU": "system"
    }
  ]
}
EOF

# 签署 etcd 证书
➜ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes etcd-csr.json | cfssljson -bare etcd

# 验证结果,会生成两个证书文件
➜ ls -lh etcd*pem

-rw------- 1 root root 1.7K Jul 13 23:35 etcd-key.pem
-rw-r--r-- 1 root root 1.6K Jul 13 23:35 etcd.pem

# 复制 etcd 证书到 /etc/kubernetes/pki
➜ cp etcd*pem /etc/kubernetes/pki
4.2 部署 etcd
下载二进制包 https://github.com/etcd-io/etcd/releases/tag/v3.5.7 并解压,将二进制程序 etcd 和 etcdctl 复制到 /usr/local/bin 目录下。
➜ cd ~/Downloads
# 下载
➜ wget -c https://hub.gitmirror.com/https://github.com/etcd-io/etcd/releases/download/v3.5.7/etcd-v3.5.7-linux-amd64.tar.gz
# 解压
➜ tar -zxf etcd-v3.5.7-linux-amd64.tar.gz
# 复制到 master1 节点 /usr/local/bin 目录
➜ cp etcd-v3.5.7-linux-amd64/{etcd,etcdctl} /usr/local/bin
# 查看复制结果
➜ ls -lh /usr/local/bin/etcd*

-rwxr-xr-x 1 root root 22M Jul 13 23:49 /usr/local/bin/etcd
-rwxr-xr-x 1 root root 17M Jul 13 23:49 /usr/local/bin/etcdctl
编写服务配置文件:
➜ mkdir /etc/etcd
➜ cat > /etc/etcd/etcd.conf << EOF
ETCD_NAME="etcd1"
ETCD_DATA_DIR="/var/lib/etcd/default.etcd"
ETCD_LISTEN_PEER_URLS="https://10.128.170.21:2380"
ETCD_LISTEN_CLIENT_URLS="https://10.128.170.21:2379"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.128.170.21:2380"
ETCD_ADVERTISE_CLIENT_URLS="https://10.128.170.21:2379"
ETCD_INITIAL_CLUSTER="etcd1=https://10.128.170.21:2380"
ETCD_INITIAL_CLUSTER_TOKEN="etcd-cluster"
ETCD_INITIAL_CLUSTER_STATE="new"
EOF
配置文件解释:
ETCD_NAME:节点名称,集群中唯一
ETCD_DATA_DIR:数据保存目录
ETCD_LISTEN_PEER_URLS:集群内部通信监听地址
ETCD_LISTEN_CLIENT_URLS:客户端访问监听地址
ETCD_INITIAL_ADVERTISE_PEER_URLS:集群通告地址
ETCD_ADVERTISE_CLIENT_URLS:客户端通告地址
ETCD_INITIAL_CLUSTER:集群节点地址列表
ETCD_INITIAL_CLUSTER_TOKEN:集群通信 token
ETCD_INITIAL_CLUSTER_STATE:加入集群的当前状态,new 是新集群,existing 表示加入已有集群
编写服务启动脚本:
# 创建数据目录
➜ mkdir -p /var/lib/etcd
# 创建系统服务
➜ cat > /lib/systemd/system/etcd.service << "EOF"
[Unit]
Description=etcd server
After=network.target
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/etc/etcd/etcd.conf
WorkingDirectory=/var/lib/etcd
ExecStart=/usr/local/bin/etcd \
  --cert-file=/etc/kubernetes/pki/etcd.pem \
  --key-file=/etc/kubernetes/pki/etcd-key.pem \
  --trusted-ca-file=/etc/kubernetes/pki/ca.pem \
  --peer-cert-file=/etc/kubernetes/pki/etcd.pem \
  --peer-key-file=/etc/kubernetes/pki/etcd-key.pem \
  --peer-trusted-ca-file=/etc/kubernetes/pki/ca.pem \
  --peer-client-cert-auth \
  --client-cert-auth
Restart=on-failure
RestartSec=5
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF
启动 etcd 服务:
➜ systemctl daemon-reload
➜ systemctl enable --now etcd

# 验证结果
➜ systemctl status etcd

# 查看日志
➜ journalctl -u etcd
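服务启动后,还可以用 etcdctl 结合上面签发的证书做一次健康检查,下面是一个简单示例(证书路径与前文一致):

➜ ETCDCTL_API=3 etcdctl \
  --endpoints=https://10.128.170.21:2379 \
  --cacert=/etc/kubernetes/pki/ca.pem \
  --cert=/etc/kubernetes/pki/etcd.pem \
  --key=/etc/kubernetes/pki/etcd-key.pem \
  endpoint health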
5. 部署 kube-apiserver
只需在 master 节点(也即 master1 节点)部署即可。
5.1 颁发证书
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# kube-apiserver 证书签署申请
# hosts 字段中,IP 为所有 kube-apiserver 节点地址,这里可以做好规划,预留几个 IP,以备后续扩容。
# 10.96.0.1 是 service 网段的第一个 IP
# kubernetes.default.svc.cluster.local 是 kube-apiserver 的 service 域名
➜ cat > kube-apiserver-csr.json << EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "10.128.170.21",
    "10.128.170.22",
    "10.128.170.23",
    "10.96.0.1",
    "localhost",
    "master1",
    "master2",
    "master3",
    "master1.local",
    "master2.local",
    "master3.local",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "k8s",
      "OU": "system"
    }
  ]
}
EOF

# 签署 kube-apiserver 证书
➜ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-apiserver-csr.json | cfssljson -bare kube-apiserver

# 验证结果,会生成两个证书文件
➜ ls -lh kube-apiserver*pem

-rw------- 1 root root 1.7K Jul 14 00:07 kube-apiserver-key.pem
-rw-r--r-- 1 root root 1.8K Jul 14 00:07 kube-apiserver.pem

# 复制 kube-apiserver 证书到 /etc/kubernetes/pki
➜ cp kube-apiserver*pem /etc/kubernetes/pki
5.2 部署 kube-apiserver
编写服务配置文件:
# 可以使用 kubeadm 生成示例配置文件,然后进行修改
# ➜ kubeadm init phase control-plane apiserver --dry-run -v 4
# ...
# [control-plane] Creating static Pod manifest for "kube-apiserver"
# ...
# I0717 00:13:59.260175 6660 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-apiserver" to "/etc/kubernetes/tmp/kubeadm-init-dryrun365914376/kube-apiserver.yaml"
# ...
➜ cat > /etc/kubernetes/kube-apiserver.conf << EOF
KUBE_APISERVER_OPTS="--enable-admission-plugins=NamespaceLifecycle,NodeRestriction,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \
  --anonymous-auth=false \
  --bind-address=0.0.0.0 \
  --secure-port=6443 \
  --authorization-mode=Node,RBAC \
  --runtime-config=api/all=true \
  --enable-bootstrap-token-auth \
  --service-cluster-ip-range=10.96.0.0/16 \
  --token-auth-file=/etc/kubernetes/token.csv \
  --service-node-port-range=30000-32767 \
  --tls-cert-file=/etc/kubernetes/pki/kube-apiserver.pem \
  --tls-private-key-file=/etc/kubernetes/pki/kube-apiserver-key.pem \
  --client-ca-file=/etc/kubernetes/pki/ca.pem \
  --kubelet-client-certificate=/etc/kubernetes/pki/kube-apiserver.pem \
  --kubelet-client-key=/etc/kubernetes/pki/kube-apiserver-key.pem \
  --service-account-key-file=/etc/kubernetes/pki/ca-key.pem \
  --service-account-signing-key-file=/etc/kubernetes/pki/ca-key.pem \
  --service-account-issuer=https://kubernetes.default.svc.cluster.local \
  --etcd-cafile=/etc/kubernetes/pki/ca.pem \
  --etcd-certfile=/etc/kubernetes/pki/etcd.pem \
  --etcd-keyfile=/etc/kubernetes/pki/etcd-key.pem \
  --etcd-servers=https://10.128.170.21:2379 \
  --allow-privileged=true \
  --apiserver-count=1 \
  --audit-log-maxage=30 \
  --audit-log-maxbackup=3 \
  --audit-log-maxsize=100 \
  --audit-log-path=/var/log/kube-apiserver-audit.log \
  --event-ttl=1h \
  --v=4"
EOF
如果 etcd 是一个集群,则 --etcd-servers 可以添加多个,例如:--etcd-servers=https://10.128.170.21:2379,https://10.128.170.22:2379,https://10.128.170.23:2379
生成 token 文件:
➜ cat > /etc/kubernetes/token.csv << EOF
$(head -c 16 /dev/urandom | od -An -t x | tr -d ' '),kubelet-bootstrap,10001,"system:node-bootstrapper"
EOF
在这个命令中,head -c 16 /dev/urandom | od -An -t x | tr -d ' ' 生成了一个 16 字节的随机字符串,并将其转换为十六进制格式,这个字符串将作为令牌的值。
kubelet-bootstrap 是令牌的用户名
10001 是令牌的 UID
system:node-bootstrapper 是令牌的组名
这些值将用于 kubelet 节点的身份验证和授权。
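生成后的 token.csv 内容类似下面这样,其中第一列的 token 值是随机生成的,这里仅为示意:

➜ cat /etc/kubernetes/token.csv

0123456789abcdef0123456789abcdef,kubelet-bootstrap,10001,"system:node-bootstrapper"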
编写服务启动脚本:
➜ cat > /usr/lib/systemd/system/kube-apiserver.service << "EOF"
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/kubernetes/kubernetes
After=network.target network-online.target
Wants=network-online.target

[Service]
Type=notify
EnvironmentFile=/etc/kubernetes/kube-apiserver.conf
ExecStart=/usr/local/bin/kube-apiserver $KUBE_APISERVER_OPTS
Restart=on-failure
RestartSec=5
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF
启动 kube-apiserver 服务:
➜ systemctl daemon-reload
➜ systemctl enable --now kube-apiserver

# 验证结果
➜ systemctl status kube-apiserver

# 查看日志
➜ journalctl -u kube-apiserver
6. 配置 kubectl
部署完 kube-apiserver 后,就可以配置 kubectl 了,因为 kubectl 可以验证 kube-apiserver 是否已经正常工作了。
只需在 master 节点(也即 master1 节点)配置即可。
6.1 颁发证书
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# kubectl 证书签署申请
# O 参数的值必须为 system:masters,因为这是 kube-apiserver 一个内置好的角色,拥有集群管理的权限
➜ cat > kubectl-csr.json << EOF
{
  "CN": "clusteradmin",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "system:masters",
      "OU": "system"
    }
  ]
}
EOF

# 签署 kubectl 证书
➜ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kubectl-csr.json | cfssljson -bare kubectl

# 验证结果,会生成两个证书文件
➜ ls -lh kubectl*pem

-rw------- 1 root root 1.7K Jul 14 00:45 kubectl-key.pem
-rw-r--r-- 1 root root 1.4K Jul 14 00:45 kubectl.pem
6.2 生成配置文件
➜ kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.128.170.21:6443 --kubeconfig=kube.config
➜ kubectl config set-credentials clusteradmin --client-certificate=kubectl.pem --client-key=kubectl-key.pem --embed-certs=true --kubeconfig=kube.config
➜ kubectl config set-context kubernetes --cluster=kubernetes --user=clusteradmin --kubeconfig=kube.config
➜ kubectl config use-context kubernetes --kubeconfig=kube.config
➜ mkdir ~/.kube
➜ cp kube.config ~/.kube/config
以上命令用于在本地创建一个 Kubernetes 配置文件 kube.config,并将其复制到 ~/.kube/config 文件中,以便使用 kubectl 命令与 Kubernetes 集群进行交互。
kubectl config set-cluster 命令设置了一个名为 kubernetes 的集群,指定了以下参数:
--certificate-authority=ca.pem:指定 CA 证书文件的路径。
--embed-certs=true:将 CA 证书嵌入到配置文件中。
--server=https://10.128.170.21:6443:指定 API Server 的地址和端口。
--kubeconfig=kube.config:指定要写入的配置文件路径。
这些参数将用于创建一个名为 kubernetes 的集群配置,并将其写入到 kube.config 文件中。
kubectl config set-credentials 命令设置了一个名为 clusteradmin 的用户,指定了以下参数:
--client-certificate=kubectl.pem:指定客户端证书文件的路径。
--client-key=kubectl-key.pem:指定客户端私钥文件的路径。
--embed-certs=true:将客户端证书和私钥嵌入到配置文件中。
--kubeconfig=kube.config:指定要写入的配置文件路径。
这些参数将用于创建一个名为 clusteradmin 的用户配置,并将其写入到 kube.config 文件中。
kubectl config set-context 命令设置了一个名为 kubernetes 的上下文,指定了以下参数:
--cluster=kubernetes:指定要使用的集群。
--user=clusteradmin:指定要使用的用户。
--kubeconfig=kube.config:指定要写入的配置文件路径。
这些参数将用于创建一个名为 kubernetes 的上下文配置,并将其写入到 kube.config 文件中。
kubectl config use-context 命令将当前上下文设置为 kubernetes,指定了以下参数:
--kubeconfig=kube.config:指定要使用的配置文件路径。
这个命令将当前上下文设置为 kubernetes,以便 kubectl 命令可以使用 kube.config 文件与 Kubernetes 集群进行交互。
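配置完成后,可以用下面的命令检查 kubeconfig 的内容和当前上下文(仅作示例):

# 查看当前生效的 kubeconfig(证书内容会被隐藏显示)
➜ kubectl config view
# 查看上下文列表,带 * 的为当前上下文
➜ kubectl config get-contexts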
6.3 获取集群信息
➜ kubectl cluster-info

Kubernetes control plane is running at https://10.128.170.21:6443
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.

➜ kubectl get all -A

NAMESPACE   NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
default     service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   22m

➜ kubectl get cs

Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
controller-manager   Unhealthy   Get "https://127.0.0.1:10257/healthz": dial tcp 127.0.0.1:10257: connect: connection refused
scheduler            Unhealthy   Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
etcd-0               Healthy     {"health":"true","reason":""}
6.4 设置 kubectl 自动补全
查看 kubectl 命令自动补全帮助:
➜ kubectl completion --help

安装 bash-completion:

➜ yum install -y bash-completion
设置 kubectl 自动补全配置:
➜ echo "source <(kubectl completion bash)" >> ~/.bashrc

使配置生效:

➜ source ~/.bashrc
7. 部署 kube-controller-manager
只需在 master 节点(也即 master1 节点)部署即可。
7.1 颁发证书
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# kube-controller-manager 证书签署申请
# hosts 字段中,IP 为所有节点地址,这里可以做好规划,预留几个 IP,以备后续扩容。
➜ cat > kube-controller-manager-csr.json << EOF
{
  "CN": "system:kube-controller-manager",
  "hosts": [
    "127.0.0.1",
    "10.128.170.21",
    "10.128.170.22",
    "10.128.170.23",
    "10.128.170.131",
    "10.128.170.132",
    "10.128.170.133",
    "10.128.170.134",
    "10.128.170.135",
    "localhost",
    "master1",
    "master2",
    "master3",
    "worker1",
    "worker2",
    "worker3",
    "worker4",
    "worker5",
    "master1.local",
    "master2.local",
    "master3.local",
    "worker1.local",
    "worker2.local",
    "worker3.local",
    "worker4.local",
    "worker5.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "system:kube-controller-manager",
      "OU": "system"
    }
  ]
}
EOF

# 签署 kube-controller-manager 证书
➜ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-controller-manager-csr.json | cfssljson -bare kube-controller-manager

# 验证结果,会生成两个证书文件
➜ ls -lh kube-controller-manager*pem

-rw------- 1 root root 1.7K Jul 14 00:55 kube-controller-manager-key.pem
-rw-r--r-- 1 root root 1.8K Jul 14 00:55 kube-controller-manager.pem

# 复制 kube-controller-manager 证书到 /etc/kubernetes/pki
➜ cp kube-controller-manager*pem /etc/kubernetes/pki
system:kube-controller-manager 是 Kubernetes 中的一个预定义 RBAC 角色,用于授权 kube-controller-manager 组件对 Kubernetes API 的访问。详细介绍请参考官方文档:https://kubernetes.io/zh-cn/docs/reference/access-authn-authz/rbac/#default-roles-and-role-bindings
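可以用下面的命令确认该内置角色及其绑定确实存在(仅作示例):

# 查看内置的 system:kube-controller-manager 集群角色
➜ kubectl get clusterrole system:kube-controller-manager
# 查看对应的集群角色绑定
➜ kubectl get clusterrolebinding system:kube-controller-manager -o wide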
7.2 部署 kube-controller-manager
编写服务配置文件:
# 可以使用 kubeadm 生成示例配置文件,然后进行修改
# ➜ kubeadm init phase control-plane controller-manager --dry-run -v 4
# ...
# [control-plane] Creating static Pod manifest for "kube-controller-manager"
# ...
# I0717 00:18:23.798277 6694 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-controller-manager" to "/etc/kubernetes/tmp/kubeadm-init-dryrun963226442/kube-controller-manager.yaml"
# ...
➜ cat > /etc/kubernetes/kube-controller-manager.conf << EOF
KUBE_CONTROLLER_MANAGER_OPTS="--secure-port=10257 \
  --kubeconfig=/etc/kubernetes/kube-controller-manager.kubeconfig \
  --service-cluster-ip-range=10.96.0.0/16 \
  --cluster-name=kubernetes \
  --cluster-signing-cert-file=/etc/kubernetes/pki/ca.pem \
  --cluster-signing-key-file=/etc/kubernetes/pki/ca-key.pem \
  --cluster-signing-duration=87600h \
  --tls-cert-file=/etc/kubernetes/pki/kube-controller-manager.pem \
  --tls-private-key-file=/etc/kubernetes/pki/kube-controller-manager-key.pem \
  --service-account-private-key-file=/etc/kubernetes/pki/ca-key.pem \
  --root-ca-file=/etc/kubernetes/pki/ca.pem \
  --leader-elect=true \
  --controllers=*,bootstrapsigner,tokencleaner \
  --use-service-account-credentials=true \
  --horizontal-pod-autoscaler-sync-period=10s \
  --allocate-node-cidrs=true \
  --cluster-cidr=10.240.0.0/12 \
  --v=4"
EOF
生成 kubeconfig:
➜ kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.128.170.21:6443 --kubeconfig=kube-controller-manager.kubeconfig
➜ kubectl config set-credentials kube-controller-manager --client-certificate=kube-controller-manager.pem --client-key=kube-controller-manager-key.pem --embed-certs=true --kubeconfig=kube-controller-manager.kubeconfig
➜ kubectl config set-context default --cluster=kubernetes --user=kube-controller-manager --kubeconfig=kube-controller-manager.kubeconfig
➜ kubectl config use-context default --kubeconfig=kube-controller-manager.kubeconfig
➜ cp kube-controller-manager.kubeconfig /etc/kubernetes/
编写服务启动脚本:
➜ cat > /usr/lib/systemd/system/kube-controller-manager.service << "EOF"
[Unit]
Description=Kubernetes controller manager
Documentation=https://github.com/kubernetes/kubernetes
After=network.target network-online.target
Wants=network-online.target

[Service]
EnvironmentFile=/etc/kubernetes/kube-controller-manager.conf
ExecStart=/usr/local/bin/kube-controller-manager $KUBE_CONTROLLER_MANAGER_OPTS
Restart=on-failure
RestartSec=5
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF
启动 kube-controller-manager 服务:
➜ systemctl daemon-reload
➜ systemctl enable --now kube-controller-manager

# 验证结果
➜ systemctl status kube-controller-manager

# 查看日志
➜ journalctl -u kube-controller-manager
查看组件状态:
➜ kubectl get cs

Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
scheduler            Unhealthy   Get "https://127.0.0.1:10259/healthz": dial tcp 127.0.0.1:10259: connect: connection refused
controller-manager   Healthy     ok
etcd-0               Healthy     {"health":"true","reason":""}
8. 部署 kube-scheduler
只需在 master 节点(也即 master1 节点)部署即可。
8.1 颁发证书
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# kube-scheduler 证书签署申请
# hosts 字段中,IP 为所有节点地址,这里可以做好规划,预留几个 IP,以备后续扩容。
➜ cat > kube-scheduler-csr.json << EOF
{
  "CN": "system:kube-scheduler",
  "hosts": [
    "127.0.0.1",
    "10.128.170.21",
    "10.128.170.22",
    "10.128.170.23",
    "10.128.170.131",
    "10.128.170.132",
    "10.128.170.133",
    "10.128.170.134",
    "10.128.170.135",
    "localhost",
    "master1",
    "master2",
    "master3",
    "worker1",
    "worker2",
    "worker3",
    "worker4",
    "worker5",
    "master1.local",
    "master2.local",
    "master3.local",
    "worker1.local",
    "worker2.local",
    "worker3.local",
    "worker4.local",
    "worker5.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "system:kube-scheduler",
      "OU": "system"
    }
  ]
}
EOF

# 签署 kube-scheduler 证书
➜ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-scheduler-csr.json | cfssljson -bare kube-scheduler

# 验证结果,会生成两个证书文件
➜ ls -lh kube-scheduler*pem

-rw------- 1 root root 1.7K Jul 14 01:06 kube-scheduler-key.pem
-rw-r--r-- 1 root root 1.8K Jul 14 01:06 kube-scheduler.pem

# 复制 kube-scheduler 证书到 /etc/kubernetes/pki
➜ cp kube-scheduler*pem /etc/kubernetes/pki
8.2 部署 kube-scheduler
编写服务配置文件:
# 可以使用 kubeadm 生成示例配置文件,然后进行修改
# ➜ kubeadm init phase control-plane scheduler --dry-run -v 4
# ...
# [control-plane] Creating static Pod manifest for "kube-scheduler"
# ...
# I0717 00:26:08.548412 6903 manifests.go:154] [control-plane] wrote static Pod manifest for component "kube-scheduler" to "/etc/kubernetes/tmp/kubeadm-init-dryrun1609159078/kube-scheduler.yaml"
# ...
➜ cat > /etc/kubernetes/kube-scheduler.conf << EOF
KUBE_SCHEDULER_OPTS="--bind-address=127.0.0.1 \
  --kubeconfig=/etc/kubernetes/kube-scheduler.kubeconfig \
  --leader-elect=true \
  --v=4"
EOF
生成 kubeconfig
➜ kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.128.170.21:6443 --kubeconfig=kube-scheduler.kubeconfig
➜ kubectl config set-credentials kube-scheduler --client-certificate=kube-scheduler.pem --client-key=kube-scheduler-key.pem --embed-certs=true --kubeconfig=kube-scheduler.kubeconfig
➜ kubectl config set-context default --cluster=kubernetes --user=kube-scheduler --kubeconfig=kube-scheduler.kubeconfig
➜ kubectl config use-context default --kubeconfig=kube-scheduler.kubeconfig
➜ cp kube-scheduler.kubeconfig /etc/kubernetes/
编写服务启动脚本:
➜ cat > /usr/lib/systemd/system/kube-scheduler.service << "EOF"
[Unit]
Description=Kubernetes scheduler
Documentation=https://github.com/kubernetes/kubernetes
After=network.target network-online.target
Wants=network-online.target

[Service]
EnvironmentFile=/etc/kubernetes/kube-scheduler.conf
ExecStart=/usr/local/bin/kube-scheduler $KUBE_SCHEDULER_OPTS
Restart=on-failure
RestartSec=5
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF
启动 kube-scheduler 服务:
➜ systemctl daemon-reload
➜ systemctl enable --now kube-scheduler

# 验证结果
➜ systemctl status kube-scheduler

# 查看日志
➜ journalctl -u kube-scheduler
查看组件状态:
➜ kubectl get cs

Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE                         ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health":"true","reason":""}
9. 部署 kubelet
先在 master 节点完成部署,后续添加 worker 节点时从 master 节点复制配置并调整即可。
master 节点上部署 kubelet 是可选的,一旦部署 kubelet,master 节点也可以运行 Pod,如果不希望 master 节点上运行 Pod,则可以给 master 节点打上污点。
master 节点部署 kubelet 是有好处的,一是可以通过诸如 kubectl get node 等命令查看节点信息,二是可以在上面部署监控系统、日志采集系统等。
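如果确实不希望 master 节点运行业务 Pod,可以参考下面的示例,在节点注册成功后(见本节后续步骤)为其打上污点(污点键名为常见约定,可按需调整):

# 给 master1 打上 NoSchedule 污点,阻止普通 Pod 调度到该节点
➜ kubectl taint nodes master1 node-role.kubernetes.io/control-plane=:NoSchedule
# 如需撤销,在污点末尾加 "-" 即可
# ➜ kubectl taint nodes master1 node-role.kubernetes.io/control-plane=:NoSchedule-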
9.1 授权 kubelet 允许请求证书
授权 kubelet-bootstrap 用户允许请求证书:
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

➜ kubectl create clusterrolebinding kubelet-bootstrap \
  --clusterrole=system:node-bootstrapper \
  --user=kubelet-bootstrap

clusterrolebinding.rbac.authorization.k8s.io/kubelet-bootstrap created
9.2 部署 kubelet
编写服务配置文件:
# 可以使用 kubeadm 生成示例配置文件,然后进行修改
# ➜ kubeadm init phase kubelet-start --dry-run
# [kubelet-start] Writing kubelet environment file with flags to file "/etc/kubernetes/tmp/kubeadm-init-dryrun3282605628/kubeadm-flags.env"
# [kubelet-start] Writing kubelet configuration to file "/etc/kubernetes/tmp/kubeadm-init-dryrun3282605628/config.yaml"
# https://github.com/kubernetes-sigs/sig-windows-tools/issues/323
# https://github.com/kubernetes/kubernetes/pull/118544
➜ cat > /etc/kubernetes/kubelet.conf << EOF
KUBELET_OPTS="--bootstrap-kubeconfig=/etc/kubernetes/kubelet-bootstrap.kubeconfig \
  --config=/etc/kubernetes/kubelet.yaml \
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \
  --cert-dir=/etc/kubernetes/pki \
  --container-runtime-endpoint=unix:///var/run/containerd/containerd.sock \
  --pod-infra-container-image=registry.k8s.io/pause:3.9 \
  --v=4"
EOF

➜ cat > /etc/kubernetes/kubelet.yaml << EOF
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
address: 0.0.0.0
port: 10250
readOnlyPort: 0
authentication:
  anonymous:
    enabled: false
  webhook:
    cacheTTL: 2m0s
    enabled: true
  x509:
    clientCAFile: /etc/kubernetes/pki/ca.pem
authorization:
  mode: Webhook
  webhook:
    cacheAuthorizedTTL: 5m0s
    cacheUnauthorizedTTL: 30s
cgroupDriver: systemd
clusterDNS:
- 10.96.0.10
clusterDomain: cluster.local
healthzBindAddress: 127.0.0.1
healthzPort: 10248
rotateCertificates: true
evictionHard:
  imagefs.available: 15%
  memory.available: 100Mi
  nodefs.available: 10%
  nodefs.inodesFree: 5%
maxOpenFiles: 1000000
maxPods: 110
EOF
生成 kubeconfig:
➜ kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.128.170.21:6443 --kubeconfig=kubelet-bootstrap.kubeconfig
➜ kubectl config set-credentials kubelet-bootstrap --token=$(awk -F, '{print $1}' /etc/kubernetes/token.csv) --kubeconfig=kubelet-bootstrap.kubeconfig
➜ kubectl config set-context default --cluster=kubernetes --user=kubelet-bootstrap --kubeconfig=kubelet-bootstrap.kubeconfig
➜ kubectl config use-context default --kubeconfig=kubelet-bootstrap.kubeconfig
➜ cp kubelet-bootstrap.kubeconfig /etc/kubernetes/
编写服务启动脚本:
➜ cat > /usr/lib/systemd/system/kubelet.service << "EOF"
[Unit]
Description=Kubernetes kubelet
After=network.target network-online.target containerd.service
Requires=containerd.service

[Service]
EnvironmentFile=/etc/kubernetes/kubelet.conf
ExecStart=/usr/local/bin/kubelet $KUBELET_OPTS
Restart=on-failure
RestartSec=5
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF
启动 kubelet 服务:
➜ systemctl daemon-reload
➜ systemctl enable --now kubelet

# 验证结果
➜ systemctl status kubelet

# 查看日志
➜ journalctl -u kubelet
批准节点加入集群:
➜ kubectl get csr

NAME        AGE   SIGNERNAME                                    REQUESTOR           REQUESTEDDURATION   CONDITION
csr-h92vn   59s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   <none>              Pending

➜ kubectl certificate approve csr-h92vn

certificatesigningrequest.certificates.k8s.io/csr-h92vn approved

➜ kubectl get csr

NAME        AGE     SIGNERNAME                                    REQUESTOR           REQUESTEDDURATION   CONDITION
csr-h92vn   2m27s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   <none>              Approved,Issued
查看节点:
➜ kubectl get node

NAME      STATUS     ROLES    AGE   VERSION
master1   NotReady   <none>   71s   v1.27.3

# 此时节点状态还是 NotReady,因为还没有安装网络插件,正确安装网络插件后,状态会变为 Ready.
10. 部署 kube-proxy
先在 master 节点完成部署,后续添加 worker 节点时从 master 节点复制配置并调整即可。
10.1 颁发证书
# 进入 /etc/kubernetes/ssl 目录
➜ cd /etc/kubernetes/ssl

# kube-proxy 证书签署申请
➜ cat > kube-proxy-csr.json << EOF
{
  "CN": "system:kube-proxy",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "GuangDong",
      "L": "ShenZhen",
      "O": "k8s",
      "OU": "system"
    }
  ]
}
EOF

# 签署 kube-proxy 证书
➜ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy

# 验证结果,会生成两个证书文件
➜ ls -lh kube-proxy*pem

-rw------- 1 root root 1.7K Jul 15 18:36 kube-proxy-key.pem
-rw-r--r-- 1 root root 1.4K Jul 15 18:36 kube-proxy.pem

# 复制 kube-proxy 证书到 /etc/kubernetes/pki
➜ cp kube-proxy*pem /etc/kubernetes/pki
10.2 部署 kube-proxy
编写服务配置文件:
➜ cat > /etc/kubernetes/kube-proxy.conf << EOF
KUBE_PROXY_OPTS="--config=/etc/kubernetes/kube-proxy.yaml \
  --v=4"
EOF

➜ cat > /etc/kubernetes/kube-proxy.yaml << EOF
kind: KubeProxyConfiguration
apiVersion: kubeproxy.config.k8s.io/v1alpha1
clientConnection:
  kubeconfig: /etc/kubernetes/kube-proxy.kubeconfig
bindAddress: 0.0.0.0
clusterCIDR: 10.240.0.0/12
healthzBindAddress: 0.0.0.0:10256
metricsBindAddress: 0.0.0.0:10249
mode: ipvs
ipvs:
  scheduler: "rr"
EOF
生成 kubeconfig:
➜ kubectl config set-cluster kubernetes --certificate-authority=ca.pem --embed-certs=true --server=https://10.128.170.21:6443 --kubeconfig=kube-proxy.kubeconfig
➜ kubectl config set-credentials kube-proxy --client-certificate=kube-proxy.pem --client-key=kube-proxy-key.pem --embed-certs=true --kubeconfig=kube-proxy.kubeconfig
➜ kubectl config set-context default --cluster=kubernetes --user=kube-proxy --kubeconfig=kube-proxy.kubeconfig
➜ kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
➜ cp kube-proxy.kubeconfig /etc/kubernetes/
编写服务启动脚本:
➜ cat > /usr/lib/systemd/system/kube-proxy.service << "EOF"
[Unit]
Description=Kubernetes Proxy
Documentation=https://github.com/kubernetes/kubernetes
After=network.target network-online.target
Wants=network-online.target

[Service]
EnvironmentFile=-/etc/kubernetes/kube-proxy.conf
ExecStart=/usr/local/bin/kube-proxy $KUBE_PROXY_OPTS
Restart=on-failure
RestartSec=5
LimitNOFILE=65535

[Install]
WantedBy=multi-user.target
EOF
启动 kube-proxy 服务:
➜ systemctl daemon-reload
➜ systemctl enable --now kube-proxy

# 验证结果
➜ systemctl status kube-proxy

# 查看日志
➜ journalctl -u kube-proxy
11. 部署集群网络
只需在 master 节点(也即 master1 节点)部署即可。
11.1 部署 calico
参考文档 Install Calico networking and network policy for on-premises deployments
➜ cd ~/Downloads
➜ curl -k https://cdn.jsdelivr.net/gh/projectcalico/calico@v3.26.1/manifests/calico.yaml -O

# 找到 CALICO_IPV4POOL_CIDR 变量,取消注释并修改 Pod IP 地址段
➜ sed -i 's/# \(- name: CALICO_IPV4POOL_CIDR\)/\1/' calico.yaml
➜ sed -i 's/# value: "192.168.0.0\/16"/ value: "10.240.0.0\/12"/' calico.yaml

➜ kubectl apply -f calico.yaml

poddisruptionbudget.policy/calico-kube-controllers created
serviceaccount/calico-kube-controllers created
serviceaccount/calico-node created
serviceaccount/calico-cni-plugin created
configmap/calico-config created
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgpfilters.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/caliconodestatuses.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/ipreservations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org created
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org created
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrole.rbac.authorization.k8s.io/calico-node created
clusterrole.rbac.authorization.k8s.io/calico-cni-plugin created
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers created
clusterrolebinding.rbac.authorization.k8s.io/calico-node created
clusterrolebinding.rbac.authorization.k8s.io/calico-cni-plugin created
daemonset.apps/calico-node created
deployment.apps/calico-kube-controllers created

# 查看网络 pod
➜ kubectl -n kube-system get pod

NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-85578c44bf-7v87p   1/1     Running   0          25m
calico-node-v924z                          1/1     Running   0          25m

# 查看 node 状态
➜ kubectl get node

NAME      STATUS   ROLES    AGE   VERSION
master1   Ready    <none>   30m   v1.27.3

# 查看 ipvs 模式
➜ ipvsadm -Ln

IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  10.96.0.1:443 rr
  -> 10.128.170.21:6443           Masq    1      5          0
If the node status is still NotReady, it is almost always because the images are still being pulled or the pull has failed. If pulls keep failing after a while, try pulling the images manually, as sketched below.
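A minimal sketch for pulling the Calico images by hand on the affected node, assuming containerd is the container runtime and crictl is installed and configured to talk to it; the image tags are taken from the calico.yaml v3.26.1 manifest and should be adjusted if you use a different version:

# Run on the node whose pods are stuck
➜ crictl pull docker.io/calico/cni:v3.26.1
➜ crictl pull docker.io/calico/node:v3.26.1
➜ crictl pull docker.io/calico/kube-controllers:v3.26.1

# Confirm the images are now present locally
➜ crictl images | grep calico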
11.2 Authorize kube-apiserver to access the kubelet
Using RBAC Authorization
Use case: commands such as kubectl exec/run/logs
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 ➜ cd ~/Downloads ➜ cat > apiserver-to-kubelet-rbac.yaml << EOF apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: annotations: rbac.authorization.kubernetes.io/autoupdate: "true" labels: kubernetes.io/bootstrapping: rbac-defaults name: system:kube-apiserver-to-kubelet rules: - apiGroups: - "" resources: - nodes/proxy - nodes/stats - nodes/log - nodes/spec - nodes/metrics - pods/log verbs: - "*" --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: system:kube-apiserver namespace: "" roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:kube-apiserver-to-kubelet subjects: - apiGroup: rbac.authorization.k8s.io kind: User name: kubernetes EOF ➜ kubectl apply -f apiserver-to-kubelet-rbac.yaml clusterrole.rbac.authorization.k8s.io/system:kube-apiserver-to-kubelet created clusterrolebinding.rbac.authorization.k8s.io/system:kube-apiserver created ➜ kubectl -n kube-system logs calico-kube-controllers-85578c44bf-7v87p
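To double-check that the binding takes effect, you can impersonate the kubernetes user (assumed here to match the CN of the client certificate that kube-apiserver presents to the kubelet in this setup) and query its permissions. This is only a quick sanity check:

# Both commands should print "yes" once the ClusterRoleBinding is applied
➜ kubectl auth can-i get nodes --subresource=proxy --as=kubernetes
➜ kubectl auth can-i get pods --subresource=log --as=kubernetes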
11.3 Deploy CoreDNS
https://github.com/coredns/deployment/blob/master/kubernetes/coredns.yaml.sed
The original coredns.yaml.sed file is reproduced in the appendix, section "16.1 coredns.yaml.sed"; that manifest pins the CoreDNS image to version 1.9.4.
➜ cd ~/Downloads

# Download the yaml file
➜ curl https://raw.kgithub.com/coredns/deployment/master/kubernetes/coredns.yaml.sed -o coredns.yaml
Modify the configuration:
# Replace "coredns/coredns:1.9.4" with "coredns/coredns:1.10.1"
➜ sed -i 's/coredns\/coredns:1.9.4/coredns\/coredns:1.10.1/g' coredns.yaml
# Replace "CLUSTER_DOMAIN" with "cluster.local"
➜ sed -i 's/CLUSTER_DOMAIN/cluster.local/g' coredns.yaml
# Replace "REVERSE_CIDRS" with "in-addr.arpa ip6.arpa"
➜ sed -i 's/REVERSE_CIDRS/in-addr.arpa ip6.arpa/g' coredns.yaml
# Replace "UPSTREAMNAMESERVER" with "/etc/resolv.conf" (or the DNS server used on the current network)
➜ sed -i 's/UPSTREAMNAMESERVER/\/etc\/resolv.conf/g' coredns.yaml
# Replace "STUBDOMAINS" with ""
➜ sed -i 's/STUBDOMAINS//g' coredns.yaml
# Replace "CLUSTER_DNS_IP" with "10.96.0.10" (must match the clusterDNS setting in kubelet.yaml)
➜ sed -i 's/CLUSTER_DNS_IP/10.96.0.10/g' coredns.yaml
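Before applying the manifest, it is worth confirming that every placeholder was actually replaced; a simple grep sketch:

# No placeholder names should remain in the file
➜ grep -E 'CLUSTER_DOMAIN|REVERSE_CIDRS|UPSTREAMNAMESERVER|STUBDOMAINS|CLUSTER_DNS_IP' coredns.yaml || echo "all placeholders replaced"

# Spot-check the substituted values
➜ grep -nE 'cluster.local|in-addr.arpa|resolv.conf|10.96.0.10|coredns:1.10.1' coredns.yaml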
Install:
➜ kubectl apply -f coredns.yaml

serviceaccount/coredns created
clusterrole.rbac.authorization.k8s.io/system:coredns created
clusterrolebinding.rbac.authorization.k8s.io/system:coredns created
configmap/coredns created
deployment.apps/coredns created
service/kube-dns created
Verify (if the Calico pods are not ready, check whether the image pull is still in progress or has failed):
1 2 3 4 5 6 7 8 9 10 11 12 13 ➜ kubectl -n kube-system get deploy,pod,svc NAME READY UP-TO-DATE AVAILABLE AGE deployment.apps/calico-kube-controllers 1/1 1 1 23h deployment.apps/coredns 1/1 1 1 43s NAME READY STATUS RESTARTS AGE pod/calico-kube-controllers-85578c44bf-7v87p 1/1 Running 0 23h pod/calico-node-v924z 1/1 Running 0 23h pod/coredns-db5667c87-zg9s8 1/1 Running 0 41s NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE service/kube-dns ClusterIP 10.96.0.10 <none> 53/UDP,53/TCP,9153/TCP 43s
Test with dig:
➜ yum -y install bind-utils
➜ dig -t A www.baidu.com @10.96.0.10 +short

www.a.shifen.com.
14.119.104.254
14.119.104.189
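You can also resolve a cluster-internal name directly against the CoreDNS service. Assuming the defaults used in this document (cluster domain cluster.local, kubernetes service at 10.96.0.1), kubernetes.default should resolve like this:

➜ dig -t A kubernetes.default.svc.cluster.local @10.96.0.10 +short

10.96.0.1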
Test from a pod:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 ➜ kubectl run -it --rm --image=busybox:1.28.3 -- sh If you don't see a command prompt, try pressing enter. / # cat /etc/resolv.conf search default.svc.cluster.local svc.cluster.local cluster.local nameserver 10.96.0.10 options ndots:5 / # nslookup kubernetes.default Server: 10.96.0.10 Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local Name: kubernetes.default Address 1: 10.96.0.1 kubernetes.default.svc.cluster.local / # ping -c 4 www.baidu.com PING www.baidu.com (14.119.104.254): 56 data bytes 64 bytes from 14.119.104.254: seq=0 ttl=127 time=14.193 ms 64 bytes from 14.119.104.254: seq=1 ttl=127 time=12.848 ms 64 bytes from 14.119.104.254: seq=2 ttl=127 time=18.553 ms 64 bytes from 14.119.104.254: seq=3 ttl=127 time=23.581 ms --- www.baidu.com ping statistics --- 4 packets transmitted, 4 packets received, 0% packet loss round-trip min/avg/max = 12.848/17.293/23.581 ms
12. Add worker nodes
A worker node runs two components: kubelet and kube-proxy.
Run on the master node: copy the following files from the master node to the worker node:
➜ scp /etc/kubernetes/pki/ca.pem \
     /etc/kubernetes/pki/kube-proxy.pem \
     /etc/kubernetes/pki/kube-proxy-key.pem \
     root@worker1:/etc/kubernetes/pki/

➜ scp /etc/kubernetes/kubelet.conf \
     /etc/kubernetes/kubelet.yaml \
     /etc/kubernetes/kubelet-bootstrap.kubeconfig \
     /etc/kubernetes/kube-proxy.conf \
     /etc/kubernetes/kube-proxy.yaml \
     /etc/kubernetes/kube-proxy.kubeconfig \
     root@worker1:/etc/kubernetes/

➜ scp /usr/lib/systemd/system/kubelet.service \
     /usr/lib/systemd/system/kube-proxy.service \
     root@worker1:/usr/lib/systemd/system/

➜ scp /usr/local/bin/kubelet \
     /usr/local/bin/kube-proxy \
     root@worker1:/usr/local/bin/
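The commands above only provision worker1. If worker2 and worker3 from the machine plan are being added as well, a small loop avoids repeating the scp invocations. This sketch assumes the worker hostnames resolve (e.g. via /etc/hosts) and that root SSH to the workers is acceptable:

➜ for node in worker1 worker2 worker3; do
    scp /etc/kubernetes/pki/{ca.pem,kube-proxy.pem,kube-proxy-key.pem} root@${node}:/etc/kubernetes/pki/
    scp /etc/kubernetes/{kubelet.conf,kubelet.yaml,kubelet-bootstrap.kubeconfig,kube-proxy.conf,kube-proxy.yaml,kube-proxy.kubeconfig} root@${node}:/etc/kubernetes/
    scp /usr/lib/systemd/system/{kubelet.service,kube-proxy.service} root@${node}:/usr/lib/systemd/system/
    scp /usr/local/bin/{kubelet,kube-proxy} root@${node}:/usr/local/bin/
  done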
Run on the worker node: start the kube-proxy service:
➜ systemctl daemon-reload
➜ systemctl enable --now kube-proxy

# Verify the result
➜ systemctl status kube-proxy

# Check the logs
➜ journalctl -u kube-proxy
Run on the worker node: start the kubelet service:
➜ systemctl daemon-reload
➜ systemctl enable --now kubelet

# Verify the result
➜ systemctl status kubelet

# Check the logs
➜ journalctl -u kubelet
(Run on the master node) Approve the worker node's request to join the cluster:
➜ kubectl get csr

NAME        AGE     SIGNERNAME                                    REQUESTOR           REQUESTEDDURATION   CONDITION
csr-9twkl   85s     kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   <none>              Pending

➜ kubectl certificate approve csr-9twkl

certificatesigningrequest.certificates.k8s.io/csr-9twkl approved

➜ kubectl get csr

NAME        AGE     SIGNERNAME                                    REQUESTOR           REQUESTEDDURATION   CONDITION
csr-9twkl   2m15s   kubernetes.io/kube-apiserver-client-kubelet   kubelet-bootstrap   <none>              Approved,Issued
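When several workers join at once, each kubelet creates its own CSR. Instead of approving them one by one, the pending ones can be approved in a single pass; a minimal sketch:

# Approve every CSR whose CONDITION column is still Pending
➜ kubectl get csr --no-headers | awk '$NF=="Pending" {print $1}' | xargs -r kubectl certificate approve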
Run on the master node: list the nodes:
➜ kubectl get node

NAME      STATUS   ROLES    AGE   VERSION
master1   Ready    <none>   47h   v1.27.3
worker1   Ready    <none>   26m   v1.27.3
If worker1 is still NotReady, check whether the image pull is still in progress or has failed.
13. Prevent the master node from running Pods
At this point, a binary-deployed k8s cluster with 1 master and 1 worker is up and running.
You can also label the nodes with roles so that node listings are easier to read:
# Label the master node with the master and etcd roles
➜ kubectl label node master1 node-role.kubernetes.io/master=true node-role.kubernetes.io/etcd=true

# Label the worker node with the worker role
➜ kubectl label node worker1 node-role.kubernetes.io/worker=true

# Show labels
➜ kubectl get node --show-labels

# Remove a label
# ➜ kubectl label node master1 node-role.kubernetes.io/etcd-
If you do not want the master node to run Pods, taint it:
# Add the taint
➜ kubectl taint node master1 node-role.kubernetes.io/master=true:NoSchedule

# Check the taint
➜ kubectl describe node master1 | grep Taints

Taints:             node-role.kubernetes.io/master=true:NoSchedule

# Show the taints of all nodes
# ➜ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\n"}{.spec.taints}{"\n\n"}{end}'

# Remove the taint
# ➜ kubectl taint node master1 node-role.kubernetes.io/master-
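To confirm the taint behaves as intended, you can schedule a throwaway Pod and check which node it lands on; with the taint in place it should be placed on worker1 rather than master1. This is only an illustrative check:

➜ kubectl run taint-test --image=nginx:latest --restart=Never
➜ kubectl get pod taint-test -o wide

# Clean up
➜ kubectl delete pod taint-test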
Later on, you can add 2 more etcd nodes to form an etcd cluster and 2 more control-plane nodes to avoid a single point of failure.
14. Test application deployment
Create a namespace:
➜ kubectl create namespace dev

namespace/dev created

➜ kubectl get namespace

NAME              STATUS   AGE
default           Active   2d
dev               Active   7m
kube-node-lease   Active   2d
kube-public       Active   2d
kube-system       Active   2d
Create a Deployment:
➜ mkdir -p /etc/kubernetes/demo

➜ cat > /etc/kubernetes/demo/nginx-deployment.yaml << EOF
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: dev
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-pod
  template:
    metadata:
      labels:
        app: nginx-pod
    spec:
      containers:
      - name: nginx
        image: nginx:latest
EOF

➜ kubectl apply -f /etc/kubernetes/demo/nginx-deployment.yaml

deployment.apps/nginx-deployment created

➜ kubectl -n dev get pod

NAME                                READY   STATUS    RESTARTS   AGE
nginx-deployment-6d76bcb866-zl52j   1/1     Running   0          23s
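As a quick follow-up you can scale the Deployment and watch the extra replicas get scheduled across the nodes; a simple sketch:

➜ kubectl -n dev scale deployment nginx-deployment --replicas=3
➜ kubectl -n dev get pod -o wide

# Scale back down afterwards
➜ kubectl -n dev scale deployment nginx-deployment --replicas=1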
Create a Service:
➜ cat > /etc/kubernetes/demo/nginx-service.yaml << EOF
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
  namespace: dev
spec:
  selector:
    app: nginx-pod
  type: NodePort
  ports:
  - port: 80
    targetPort: 80
    nodePort: 30001
EOF

➜ kubectl apply -f /etc/kubernetes/demo/nginx-service.yaml

service/nginx-service created

➜ kubectl -n dev get svc

NAME            TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)        AGE
nginx-service   NodePort   10.96.195.221   <none>        80:30001/TCP   13s
Test access to the service:
➜ curl 10.128.170.21:30001 -I

HTTP/1.1 200 OK
Server: nginx/1.25.1
Date: Tue, 18 Jul 2023 16:56:37 GMT
Content-Type: text/html
Content-Length: 615
Last-Modified: Tue, 13 Jun 2023 15:08:10 GMT
Connection: keep-alive
ETag: "6488865a-267"
Accept-Ranges: bytes
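Since kube-proxy opens the NodePort on every node, the service should also answer on a worker's address, and the service name should resolve from inside the cluster. Both checks below are sketches that assume the IPs and names used earlier in this document:

# NodePort on a worker node
➜ curl 10.128.170.131:30001 -I

# Service DNS name tested from a temporary pod in the dev namespace
➜ kubectl -n dev run web-test --rm -it --restart=Never --image=busybox:1.28.3 -- wget -qO- http://nginx-service | head -n 4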
15. Deploy the Dashboard
The Kubernetes community maintains a popular Dashboard project that provides a web UI for inspecting the state of the cluster. With Kubernetes Dashboard, users can deploy containerized applications, monitor application status, troubleshoot problems, and manage Kubernetes resources.
Official reference documentation:
Expose the Dashboard outside the cluster via a NodePort service, using port 30443.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 ➜ cd ~/Downloads # 下载相关 yaml 文件 # https://github.com/kubernetes/dashboard/blob/v2.7.0/aio/deploy/recommended.yaml ➜ curl https://fastly.jsdelivr.net/gh/kubernetes/dashboard@v2.7.0/aio/deploy/recommended.yaml -o kubernetes-dashboard.yaml # 修改 Service 部分 ➜ vim kubernetes-dashboard.yaml kind: Service apiVersion: v1 metadata: labels: k8s-app: kubernetes-dashboard name: kubernetes-dashboard namespace: kubernetes-dashboard spec: type: NodePort # 新增 ports: - port: 443 targetPort: 8443 nodePort: 30443 # 新增 selector: k8s-app: kubernetes-dashboard # 部署 ➜ kubectl apply -f kubernetes-dashboard.yaml namespace/kubernetes-dashboard created serviceaccount/kubernetes-dashboard created service/kubernetes-dashboard created secret/kubernetes-dashboard-certs created secret/kubernetes-dashboard-csrf created secret/kubernetes-dashboard-key-holder created configmap/kubernetes-dashboard-settings created role.rbac.authorization.k8s.io/kubernetes-dashboard created clusterrole.rbac.authorization.k8s.io/kubernetes-dashboard created rolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created clusterrolebinding.rbac.authorization.k8s.io/kubernetes-dashboard created deployment.apps/kubernetes-dashboard created service/dashboard-metrics-scraper created deployment.apps/dashboard-metrics-scraper created # 查看 kubernetes-dashboard 下的资源 ➜ kubectl -n kubernetes-dashboard get deploy NAME READY UP-TO-DATE AVAILABLE AGE dashboard-metrics-scraper 1/1 1 1 5m26s kubernetes-dashboard 1/1 1 1 5m28s ➜ kubectl -n kubernetes-dashboard get pod NAME READY STATUS RESTARTS AGE dashboard-metrics-scraper-5cb4f4bb9c-8qpbc 1/1 Running 0 6m22s kubernetes-dashboard-6967859bff-fvhkv 1/1 Running 0 6m22s ➜ kubectl get svc -n kubernetes-dashboard NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE dashboard-metrics-scraper ClusterIP 10.96.6.178 <none> 8000/TCP 6m37s kubernetes-dashboard NodePort 10.96.8.70 <none> 443:30443/TCP 6m41s
If the resources in the kubernetes-dashboard namespace stay unready, check whether the images are still being pulled or keep failing to pull.
For example:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 ➜ kubectl -n kubernetes-dashboard describe pod kubernetes-dashboard-546cbc58cd-hzvhr ... Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 6m20s default-scheduler Successfully assigned kubernetes-dashboard/kubernetes-dashboard-546cbc58cd-hzvhr to worker1 Normal Pulling 6m20s kubelet Pulling image "kubernetesui/dashboard:v2.5.0" ➜ kubectl -n kubernetes-dashboard describe pod kubernetes-dashboard-546cbc58cd-hzvhr Events: Type Reason Age From Message ---- ------ ---- ---- ------- Normal Scheduled 10m default-scheduler Successfully assigned kubernetes-dashboard/kubernetes-dashboard-546cbc58cd-hzvhr to worker1 Warning Failed 2m1s kubelet Failed to pull image "kubernetesui/dashboard:v2.5.0": rpc error: code = Unknown desc = dial tcp 104.18.124.25:443: i/o timeout Warning Failed 2m1s kubelet Error: ErrImagePull Normal SandboxChanged 2m kubelet Pod sandbox changed, it will be killed and re-created. Normal BackOff 118s (x3 over 2m) kubelet Back-off pulling image "kubernetesui/dashboard:v2.5.0" Warning Failed 118s (x3 over 2m) kubelet Error: ImagePullBackOff Normal Pulling 106s (x2 over 10m) kubelet Pulling image "kubernetesui/dashboard:v2.5.0" Normal Pulled 25s kubelet Successfully pulled image "kubernetesui/dashboard:v2.5.0" in 1m21.608630166s Normal Created 22s kubelet Created container kubernetes-dashboard Normal Started 21s kubelet Started container kubernetes-dashboard
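As with Calico, a slow or failing pull can be worked around by pulling the images manually on the node the Pod was scheduled to (worker1 in the events above). A sketch assuming containerd/crictl and the image tags from the v2.7.0 recommended.yaml:

➜ crictl pull docker.io/kubernetesui/dashboard:v2.7.0
➜ crictl pull docker.io/kubernetesui/metrics-scraper:v1.0.8
➜ crictl images | grep kubernetesui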
Create a service account and bind it to the built-in cluster-admin cluster role:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 # 下面创建了一个叫 admin-user 的服务账号,放在 kubernetes-dashboard 命名空间下,并将 cluster-admin 角色绑定到 admin-user 账户,这样 admin-user 账户就有了管理员的权限。 # 默认情况下,kubeadm 创建集群时已经创建了 cluster-admin 角色,我们直接绑定即可。 ➜ cat > dashboard-admin-user.yaml << EOF apiVersion: v1 kind: ServiceAccount metadata: name: admin-user namespace: kubernetes-dashboard --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: name: admin-user roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: cluster-admin subjects: - kind: ServiceAccount name: admin-user namespace: kubernetes-dashboard --- # https://github.com/kubernetes/kubernetes/issues/110113 apiVersion: v1 kind: Secret type: kubernetes.io/service-account-token metadata: name: admin-user namespace: kubernetes-dashboard annotations: kubernetes.io/service-account.name: admin-user EOF # 应用资源配置清单 ➜ kubectl apply -f dashboard-admin-user.yaml serviceaccount/admin-user created clusterrolebinding.rbac.authorization.k8s.io/admin-user created secret/admin-user created
Retrieve the token of the admin-user account:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ➜ kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}') Name: admin-user Namespace: kubernetes-dashboard Labels: <none> Annotations: kubernetes.io/service-account.name: admin-user kubernetes.io/service-account.uid: 630430bb-4ea5-4026-81ef-9d4c39089bca Type: kubernetes.io/service-account-token Data ==== ca.crt: 1318 bytes namespace: 20 bytes token: eyJhbGciOiJSUzI1NiIsImtpZCI6IlRpS0VJZ0pkRW5va3Bsb2lKOUxVRXVtM3l6RFNtaFNzUkFFLW1zcXBHS2sifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJrdWJlcm5ldGVzLWRhc2hib2FyZCIsImt1YmVybmV0ZXMuaW8vc2VydmljZWFjY291bnQvc2VjcmV0Lm5hbWUiOiJhZG1pbi11c2VyIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluLXVzZXIiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC51aWQiOiI2MzA0MzBiYi00ZWE1LTQwMjYtODFlZi05ZDRjMzkwODliY2EiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6a3ViZXJuZXRlcy1kYXNoYm9hcmQ6YWRtaW4tdXNlciJ9.C-HqHea6OsWhQ9-yjPo0DGLhrgvtQ1cdaXOeGBOZqDKbPU4s-8VO31Ihw9Fbxo6vQnLJUyzFvRVB45eKr_95sJUht1lnD4pZOJHqnvSAa9SzkHbt4FcylHHG723wplLJc3fvnyKr1u3g74hHRUfLAE3q_VghMVwHi6hRyOalYN3KiFzQXKLVyovCxxAGwaEwJg9ftiawYMkDSzxLKkI17BBwrU_zt_xAKrLn229f9eEKsTeBMju0QMyhoWKCSVbV0chfw-sbJSUMAj7a8Ff5-uY1tru-QqUGI6RzlSKlI4E5hpsUVEFuU0HIHzrwxElTmNJnLZtcotFTLrsdHIXj2w
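On Kubernetes 1.24 and later, the long-lived Secret is optional: a short-lived token for the same service account can also be issued on demand with kubectl create token, for example:

# Issue a token valid for 24 hours (the duration flag is optional)
➜ kubectl -n kubernetes-dashboard create token admin-user --duration=24h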
Use the token obtained above to log in to the Dashboard:
https://10.128.170.21:30443
16. Appendix
16.1 coredns.yaml.sed
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 apiVersion: v1 kind: ServiceAccount metadata: name: coredns namespace: kube-system --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRole metadata: labels: kubernetes.io/bootstrapping: rbac-defaults name: system:coredns rules: - apiGroups: - "" resources: - endpoints - services - pods - namespaces verbs: - list - watch - apiGroups: - discovery.k8s.io resources: - endpointslices verbs: - list - watch --- apiVersion: rbac.authorization.k8s.io/v1 kind: ClusterRoleBinding metadata: annotations: rbac.authorization.kubernetes.io/autoupdate: "true" labels: kubernetes.io/bootstrapping: rbac-defaults name: system:coredns roleRef: apiGroup: rbac.authorization.k8s.io kind: ClusterRole name: system:coredns subjects: - kind: ServiceAccount name: coredns namespace: kube-system --- apiVersion: v1 kind: ConfigMap metadata: name: coredns namespace: kube-system data: Corefile: | .:53 { errors health { lameduck 5s } ready kubernetes CLUSTER_DOMAIN REVERSE_CIDRS { fallthrough in-addr.arpa ip6.arpa } prometheus :9153 forward . UPSTREAMNAMESERVER { max_concurrent 1000 } cache 30 loop reload loadbalance }STUBDOMAINS --- apiVersion: apps/v1 kind: Deployment metadata: name: coredns namespace: kube-system labels: k8s-app: kube-dns kubernetes.io/name: "CoreDNS" app.kubernetes.io/name: coredns spec: strategy: type: RollingUpdate rollingUpdate: maxUnavailable: 1 selector: matchLabels: k8s-app: kube-dns app.kubernetes.io/name: coredns template: metadata: labels: k8s-app: kube-dns app.kubernetes.io/name: coredns spec: priorityClassName: system-cluster-critical serviceAccountName: coredns tolerations: - key: "CriticalAddonsOnly" operator: "Exists" nodeSelector: kubernetes.io/os: linux affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: matchExpressions: - key: k8s-app operator: In values: ["kube-dns" ] topologyKey: kubernetes.io/hostname containers: - name: coredns image: coredns/coredns:1.9.4 imagePullPolicy: IfNotPresent resources: limits: memory: 170Mi requests: cpu: 100m memory: 70Mi args: [ "-conf" , "/etc/coredns/Corefile" ] volumeMounts: - name: config-volume mountPath: /etc/coredns readOnly: true ports: - containerPort: 53 name: dns protocol: UDP - containerPort: 53 name: dns-tcp protocol: TCP - containerPort: 9153 name: metrics protocol: TCP securityContext: allowPrivilegeEscalation: false capabilities: add: - NET_BIND_SERVICE drop: - all readOnlyRootFilesystem: true livenessProbe: httpGet: path: /health port: 8080 scheme: HTTP initialDelaySeconds: 60 timeoutSeconds: 5 successThreshold: 1 failureThreshold: 5 readinessProbe: httpGet: path: /ready port: 8181 scheme: HTTP dnsPolicy: Default volumes: - name: config-volume configMap: name: coredns items: - key: Corefile path: Corefile --- apiVersion: v1 kind: Service metadata: name: kube-dns namespace: kube-system annotations: 
prometheus.io/port: "9153" prometheus.io/scrape: "true" labels: k8s-app: kube-dns kubernetes.io/cluster-service: "true" kubernetes.io/name: "CoreDNS" app.kubernetes.io/name: coredns spec: selector: k8s-app: kube-dns app.kubernetes.io/name: coredns clusterIP: CLUSTER_DNS_IP ports: - name: dns port: 53 protocol: UDP - name: dns-tcp port: 53 protocol: TCP - name: metrics port: 9153 protocol: TCP
16.2 Install CoreDNS with Helm
https://github.com/coredns/helm
Install Helm:
➜ cd ~/Downloads
➜ curl -O https://mirrors.huaweicloud.com/helm/v3.12.2/helm-v3.12.2-linux-amd64.tar.gz
➜ tar -zxvf helm-v3.12.2-linux-amd64.tar.gz
➜ cp linux-amd64/helm /usr/local/bin
Add the chart repository:
➜ helm repo add coredns https://coredns.github.io/helm
List the installable chart versions:
➜ helm search repo -l

NAME              CHART VERSION   APP VERSION   DESCRIPTION
coredns/coredns   1.24.1          1.10.1        CoreDNS is a DNS server that chains plugins and...
coredns/coredns   1.24.0          1.10.1        CoreDNS is a DNS server that chains plugins and...
coredns/coredns   1.23.0          1.10.1        CoreDNS is a DNS server that chains plugins and...
coredns/coredns   1.22.0          1.10.1        CoreDNS is a DNS server that chains plugins and...
coredns/coredns   1.21.0          1.10.1        CoreDNS is a DNS server that chains plugins and...
coredns/coredns   1.20.2          1.9.4         CoreDNS is a DNS server that chains plugins and...
coredns/coredns   1.20.1          1.9.4         CoreDNS is a DNS server that chains plugins and...
coredns/coredns   1.20.0          1.9.4         CoreDNS is a DNS server that chains plugins and...
...
Install CoreDNS:
➜ helm --namespace=kube-system install coredns coredns/coredns
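If you want the Helm release to reproduce the settings used earlier in this document (service IP 10.96.0.10, CoreDNS 1.10.1), values can be overridden at install time. The value names below are assumptions based on the coredns chart and should be confirmed against the chart version you install:

# Inspect all tunable values first if unsure
# ➜ helm show values coredns/coredns

➜ helm --namespace=kube-system install coredns coredns/coredns \
    --set service.clusterIP=10.96.0.10 \
    --set image.tag=1.10.1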
References
K8s 高可用集群架构(二进制)部署及应用
二进制部署 k8s 集群 1.23.1 版本
部署一套完整的企业级k8s集群