Bringing up a single-node K8s cluster with kubeadm (new edition)

Overview

Because of the restricted network environment in mainland China, bringing up a K8s cluster with kubeadm takes a bit of technique.

This article describes how to bring up a single-node K8s cluster with kubeadm while working around those network restrictions.

It mainly follows the official docs: https://kubernetes.io/docs/setup/


1. Install the container runtime

References

  • Container runtimes: https://kubernetes.io/zh/docs/setup/production-environment/container-runtimes/

Bottom line: we go with containerd and the systemd cgroup driver, and we express no preference between cgroup v1 and v2.

1.1 Set up the Docker repo source

sudo apt-get update && sudo apt-get install -y \
    ca-certificates \
    curl \
    gnupg \
    lsb-release
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update

1.2 Install containerd

Check the available containerd versions, and optionally pin the package so it is not upgraded automatically.

apt-cache madison containerd.io
sudo apt-mark hold containerd.io

Install containerd.

sudo apt install -y containerd.io=1.4.11-1

Check that containerd can pull an image and run a container.

sudo ctr image pull docker.io/library/alpine:latest
sudo ctr run -t docker.io/library/alpine:latest test sh
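
Exiting the shell stops the task. If you want to tidy up afterwards, you can remove the test container (and optionally the image); test is simply the container name chosen above:

# Remove the test container created above
sudo ctr container rm test
# Optionally remove the pulled image too
sudo ctr image rm docker.io/library/alpine:latest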

1.3 Prerequisite configuration for K8s

cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF

sudo modprobe overlay
sudo modprobe br_netfilter

# Set the required sysctl parameters; these persist across reboots.
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables  = 1
net.ipv4.ip_forward                 = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF

# Apply the sysctl parameters without rebooting
sudo sysctl --system
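
As a quick sanity check (not part of the official steps), you can confirm that the modules are loaded and the sysctls report 1:

# The two modules should appear in the output
lsmod | grep -E 'overlay|br_netfilter'
# All three values should be 1
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward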

Generate containerd's default config and restart the service:

containerd config default | sudo tee /etc/containerd/config.toml
sudo systemctl restart containerd

Configure containerd (the runc runtime) to use the systemd cgroup driver.

In /etc/containerd/config.toml, set:

[plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
  ...
  [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
    SystemdCgroup = true

[plugins]
  [plugins."io.containerd.grpc.v1.cri"]
    sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6"

Note that this also configures containerd not to use the k8s.gcr.io/pause image:

sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.6"

Then restart containerd.

sudo systemctl restart containerd

Note: when using kubeadm, you must also manually configure the kubelet's cgroup driver; this is covered in the kubeadm section below.

2. Install kubectl

Reference: https://kubernetes.io/docs/tasks/tools/install-kubectl-linux/

  1. Download kubectl
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
  2. Install the binary
sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
  3. Check that kubectl works
kubectl version --client
  4. Enable bash completion for kubectl
echo 'source <(kubectl completion bash)' >>~/.bashrc
echo 'complete -F __start_kubectl k' >>~/.bashrc
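
The second completion line above only pays off if k is actually an alias for kubectl; that alias is not set up anywhere in this article, so if you want the shorthand, add it as well:

# Optional: make `k` an alias for kubectl so the completion rule above applies to it
echo 'alias k=kubectl' >>~/.bashrc
source ~/.bashrc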

2.1 Install the kubectl convert plugin (optional)

The kubectl convert plugin converts manifests that use deprecated APIs into currently supported API versions, which looks like it will come in handy.

curl -LO https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl-convert
curl -LO "https://dl.k8s.io/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl-convert.sha256"
echo "$(<kubectl-convert.sha256) kubectl-convert" | sha256sum --check
sudo install -o root -g root -m 0755 kubectl-convert /usr/local/bin/kubectl-convert
kubectl convert --help
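
A quick usage sketch, assuming some manifest old-deployment.yaml (a hypothetical file name) that still uses a deprecated API version:

# Rewrite the manifest to a currently supported API version (file name is only an example)
kubectl convert -f old-deployment.yaml --output-version apps/v1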

3. Install K8s for production

References:

Installing and configuring a production cluster with kubeadm.

3.1 Install kubeadm/kubelet

References:

3.1.1 The officially recommended method (requires working around the firewall)

Install via apt, yum, etc.: https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/

3.1.2 Manually download the release packages (restricted network)

This approach is for when the official repositories are blocked.

The goal: install kubeadm, kubectl, and kubelet; kubectl was already installed in the previous step.

  • Since the repositories are blocked here, download the release package from GitHub instead.

URL: https://github.com/kubernetes/kubernetes/releases

  • After unpacking it, run the bundled ./cluster/get-kube-binaries.sh to download the individual components.

You end up with client/kubernetes-client-linux-amd64.tar.gz and server/kubernetes-server-linux-amd64.tar.gz.

Unpack the server tarball to get the binaries of the key server components, as sketched below.
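
A minimal sketch of that unpacking step, staging the binaries into the ./k8s_bin directory used by the install commands below (the directory name and the path inside the tarball are this article's assumptions):

# Unpack the server tarball and stage the binaries we need
tar -xzf server/kubernetes-server-linux-amd64.tar.gz
mkdir -p ./k8s_bin
cp kubernetes/server/bin/kubelet kubernetes/server/bin/kubeadm ./k8s_bin/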

  • Install kubelet and kubeadm
sudo install -o root -g root -m 0755 ./k8s_bin/kubelet /usr/local/bin/kubelet
sudo install -o root -g root -m 0755 ./k8s_bin/kubeadm /usr/local/bin/kubeadm

In the official instructions kubelet is installed via apt-get, so systemd starts it automatically. Since we install it by hand here, we also have to install the kubelet systemd service ourselves.

  • Install the systemd unit file for kubelet
cat <<EOF | sudo tee /lib/systemd/system/kubelet.service
[Unit]
Description=kubelet: The Kubernetes Node Agent
Documentation=http://kubernetes.io/docs/
Wants=network-online.target
After=network-online.target

[Service]
ExecStart=/usr/local/bin/kubelet
Restart=always
StartLimitInterval=0
RestartSec=10

[Install]
WantedBy=multi-user.target
EOF

Once started, kubelet will keep restarting in a crash loop; this is normal until kubeadm init has provided its configuration.

  • Override kubelet's default configuration with the kubeadm drop-in

Because kubeadm here was not installed by apt-get, the accompanying service drop-in file was never installed, so we write the kubeadm-provided drop-in for kubelet by hand to override kubelet's defaults. Write the following into /etc/systemd/system/kubelet.service.d/10-kubeadm.conf (note the quoted heredoc delimiter, which keeps the $KUBELET_* variables literal instead of expanding them):

sudo mkdir -p /etc/systemd/system/kubelet.service.d/
cat <<'EOF' | sudo tee /etc/systemd/system/kubelet.service.d/10-kubeadm.conf
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This file is generated at runtime by "kubeadm init" and "kubeadm join", dynamically populating the KUBELET_KUBEADM_ARGS variable
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use as a last resort to override the kubelet args.
# Users should prefer the .NodeRegistration.KubeletExtraArgs object in the configuration file instead.
# KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/local/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
EOF

This drop-in tells kubelet the default locations of all the files managed by kubeadm.

  • The KubeConfig file used for TLS bootstrapping is /etc/kubernetes/bootstrap-kubelet.conf, but it is only used when /etc/kubernetes/kubelet.conf does not yet exist.
  • The KubeConfig file with the kubelet's unique identity is /etc/kubernetes/kubelet.conf. The kubelet's component configuration lives in /var/lib/kubelet/config.yaml.
  • The file providing the dynamic environment variable KUBELET_KUBEADM_ARGS is /var/lib/kubelet/kubeadm-flags.env.
  • The file holding user-specified flag overrides, KUBELET_EXTRA_ARGS, is sourced from /etc/default/kubelet (for DEBs) or /etc/sysconfig/kubelet (for RPMs). KUBELET_EXTRA_ARGS comes last in the flag chain and has the highest priority when settings conflict.
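
As an illustration of that last override file, a hypothetical /etc/default/kubelet could look like this; the --node-ip flag and its value are just an example, not something this setup requires:

# /etc/default/kubelet (example only): extra kubelet flags, applied last in the flag chain
KUBELET_EXTRA_ARGS="--node-ip=103.242.173.68"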

Finally, remember to reload systemd and restart/enable kubelet:

sudo systemctl daemon-reload && sudo systemctl restart kubelet && sudo systemctl enable kubelet
sudo swapoff -a

You also have to turn off swap with swapoff -a; K8s requires this.
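
Note that swapoff -a only lasts until the next reboot. One common way (an extra step, not in the original text) to make it persistent is to comment out the swap entries in /etc/fstab:

# Comment out swap entries so swap stays disabled after a reboot
sudo sed -i '/\sswap\s/ s/^/#/' /etc/fstab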

  • Install crictl, the CLI client for CRI container runtimes, which kubeadm relies on to talk to containerd
VERSION="v1.22.0"
wget https://github.com/kubernetes-sigs/cri-tools/releases/download/$VERSION/crictl-$VERSION-linux-amd64.tar.gz
sudo tar zxvf crictl-$VERSION-linux-amd64.tar.gz -C /usr/local/bin
rm -f crictl-$VERSION-linux-amd64.tar.gz
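
crictl also needs to know which runtime socket to talk to. A small /etc/crictl.yaml (optional, using the containerd socket path assumed elsewhere in this article) takes care of that:

# Point crictl at containerd's CRI socket so `sudo crictl ps` etc. work without extra flags
cat <<EOF | sudo tee /etc/crictl.yaml
runtime-endpoint: unix:///var/run/containerd/containerd.sock
image-endpoint: unix:///var/run/containerd/containerd.sock
EOF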

3.2 Prepare the images offline

References:

By default kubeadm pulls its images from the k8s.gcr.io registry, which is unreachable from mainland China, so we prepare the required images ahead of time for an offline-style install.

The images kubeadm depends on can be listed with:

kubeadm config images list

The image list is:

k8s.gcr.io/kube-apiserver:v1.22.3
k8s.gcr.io/kube-controller-manager:v1.22.3
k8s.gcr.io/kube-scheduler:v1.22.3
k8s.gcr.io/kube-proxy:v1.22.3
k8s.gcr.io/pause:3.5
k8s.gcr.io/etcd:3.5.0-0
k8s.gcr.io/coredns/coredns:v1.8.4

None of these images can be pulled from here, so we have to point kubeadm at alternative registries.

First, print the config that kubeadm init uses by default with kubeadm config print init-defaults; then change the K8s version, the imageRepository field, the containerd (CRI plugin) socket path, the etcd and coredns registries and tags, and so on. We end up with the kubeadm config below.

Note: this kubeadm config is modified several more times in later steps; see Appendix 4.1 for the final, complete version.

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: <Node-IP>
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: node
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.3
networking:
  dnsDomain: <Hostname>
  serviceSubnet: 10.96.0.0/12
scheduler: {}

Here the K8s control-plane components come from the Aliyun mirror registry, while etcd and coredns come from the docker.io registry.

Then pull the images.

sudo kubeadm config images pull --config=kubeadm-init-config.yaml

Wait for the pulls to finish, then check that the images actually arrived.

sudo ctr -n k8s.io image ls -q

Note: the same kubeadm config file must also be passed to kubeadm init.

3.3 Choose a network plugin and set the kubeadm pod subnet

References:

There are plenty of network plugins; Flannel, Calico, and Cilium all look like solid choices right now. Flannel is the oldest and simplest, while Cilium uses BPF and has the best performance. To balance stability with flexibility, this project goes with Calico for now; which plugin is ultimately best still needs further investigation.

Calico expects the pod subnet to be 192.168.0.0/16:

sudo kubeadm init --pod-network-cidr=192.168.0.0/16

Equivalently, add the following to the kubeadm config above:

networking:
  ......
  podSubnet: 192.168.0.0/16

We will install the Calico components once the cluster is up.

3.4 Configure kubelet to use the systemd cgroup driver

References:

  • Configuring the kubelet cgroup driver: https://kubernetes.io/zh/docs/tasks/administer-cluster/kubeadm/configure-cgroup-driver/#%E9%85%8D%E7%BD%AE-kubelet-%E7%9A%84-cgroup-%E9%A9%B1%E5%8A%A8

Add the following to the kubeadm config:

kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd
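
After kubeadm init in the next section has run, you can double-check that this setting actually landed in the kubelet configuration kubeadm wrote out; this is just a quick sanity check, not a required step:

# Should print: cgroupDriver: systemd
sudo grep cgroupDriver /var/lib/kubelet/config.yaml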

3.5 Bring up the cluster with kubeadm init

Time to bring up the cluster; run:

sudo kubeadm init --config=kubeadm-init-config.yaml
  1. The first attempt failed with:
[init] Using Kubernetes version: v1.22.3
[preflight] Running pre-flight checks
	[WARNING FileExisting-socat]: socat not found in system path
	[WARNING Hostname]: hostname "node" could not be reached
	[WARNING Hostname]: hostname "node": lookup node on 127.0.0.53:53: server misbehaving
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR FileExisting-conntrack]: conntrack not found in system path
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

A few problems:

  • socat and conntrack were not installed: install them (see the sketch right after this list).
  • The hostname in the config file was wrong: change it to this machine's real hostname, ringcloud.
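
Installing the two missing tools is straightforward, assuming the same apt-based Ubuntu setup used throughout this article:

# Install the missing preflight dependencies
sudo apt-get install -y socat conntrack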
  2. The second attempt failed too. The kubelet logs showed that nodeName was wrong, and so was the advertiseAddress, which was still set to the placeholder 1.2.3.4; both need to be the real node name and IP. Roll back the changes first:
sudo kubeadm reset
  3. The third attempt failed with the errors below.
Nov 08 08:28:14 ringcloud kubelet[231529]: I1108 08:28:14.648371  231529 server.go:199] "Warning: For remote container runtime, --pod-infra-container-image is ignored in kubelet, which should be set in that remote runtime instead"
Nov 08 08:28:15 ringcloud kubelet[231529]: I1108 08:28:15.595136  231529 server.go:440] "Kubelet version" kubeletVersion="v1.22.3"
Nov 08 08:28:15 ringcloud kubelet[231529]: I1108 08:28:15.595657  231529 server.go:868] "Client rotation is on, will bootstrap in background"
Nov 08 08:28:15 ringcloud kubelet[231529]: I1108 08:28:15.601897  231529 dynamic_cafile_content.go:155] "Starting controller" name="client-ca-bundle::/etc/kubernetes/pki/ca.crt"
Nov 08 08:28:15 ringcloud kubelet[231529]: E1108 08:28:15.618957  231529 certificate_manager.go:471] kubernetes.io/kube-apiserver-client-kubelet: Failed while requesting a signed certificate from the control plane: cannot create certificate signing request: Post "https://103.242.173.68:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp 103.242.173.68:6443: connect: connection refused
Nov 08 08:28:17 ringcloud kubelet[231529]: E1108 08:28:17.743843  231529 certificate_manager.go:471] kubernetes.io/kube-apiserver-client-kubelet: Failed while requesting a signed certificate from the control plane: cannot create certificate signing request: Post "https://103.242.173.68:6443/apis/certificates.k8s.io/v1/certificatesigningrequests": dial tcp 103.242.173.68:6443: connect: connection refused
Nov 08 08:28:20 ringcloud kubelet[231529]: I1108 08:28:20.654012  231529 server.go:687] "--cgroups-per-qos enabled, but --cgroup-root was not specified.  defaulting to /"
Nov 08 08:28:20 ringcloud kubelet[231529]: I1108 08:28:20.654341  231529 container_manager_linux.go:280] "Container manager verified user specified cgroup-root exists" cgroupRoot=[]
Nov 08 08:28:20 ringcloud kubelet[231529]: I1108 08:28:20.654433  231529 container_manager_linux.go:285] "Creating Container Manager object based on Node Config" nodeConfig={RuntimeCgroupsName: SystemCgroupsName: KubeletCgroupsName: ContainerRuntime:remote CgroupsPerQOS:true CgroupRoot:/ CgroupDriver:systemd KubeletRootDir:/var/lib/kubelet ProtectKernelDefaults:false NodeAllocatableConfig:{KubeReservedCgroupName: SystemReservedCgroupName: ReservedSystemCPUs: EnforceNodeAllocatable:map[pods:{}] KubeReserved:map[] SystemReserved:map[] HardEvictionThresholds:[{Signal:nodefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.1} GracePeriod:0s MinReclaim:<nil>} {Signal:nodefs.inodesFree Operator:LessThan Value:{Quantity:<nil> Percentage:0.05} GracePeriod:0s MinReclaim:<nil>} {Signal:imagefs.available Operator:LessThan Value:{Quantity:<nil> Percentage:0.15} GracePeriod:0s MinReclaim:<nil>} {Signal:memory.available Operator:LessThan Value:{Quantity:100Mi Percentage:0} GracePeriod:0s MinReclaim:<nil>}]} QOSReserved:map[] ExperimentalCPUManagerPolicy:none ExperimentalCPUManagerPolicyOptions:map[] ExperimentalTopologyManagerScope:container ExperimentalCPUManagerReconcilePeriod:10s ExperimentalMemoryManagerPolicy:None ExperimentalMemoryManagerReservedMemory:[] ExperimentalPodPidsLimit:-1 EnforceCPULimits:true CPUCFSQuotaPeriod:100ms ExperimentalTopologyManagerPolicy:none}
Nov 08 08:28:20 ringcloud kubelet[231529]: I1108 08:28:20.654570  231529 topology_manager.go:133] "Creating topology manager with policy per scope" topologyPolicyName="none" topologyScopeName="container"
....
Nov 08 08:28:20 ringcloud kubelet[231529]: E1108 08:28:20.864633  231529 kubelet.go:2412] "Error getting node" err="node \"ringcloud\" not found"
Nov 08 08:28:20 ringcloud kubelet[231529]: E1108 08:28:20.965296  231529 kubelet.go:2412] "Error getting node" err="node \"ringcloud\" not found"

The key message is actually this line:

Nov 08 08:28:14 ringcloud kubelet[231529]: I1108 08:28:14.648371  231529 server.go:199] "Warning: For remote container runtime, --pod-infra-container-image is ignored in kubelet, which should be set in that remote runtime instead"

containerd ignores kubelet's --pod-infra-container-image flag, and in the containerd logs we can indeed see it repeatedly trying to pull the k8s.gcr.io/pause image.

Fix this in containerd's config file /etc/containerd/config.toml with the following line:

sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"

Then restart containerd.

After that, the init finally succeeded:

zhangwei@ringcloud:~/program/k8s_install$ sudo kubeadm init --config=kubeadm-init-config.yaml
[init] Using Kubernetes version: v1.22.3
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.ringcloud.com ringcloud] and IPs [10.96.0.1 103.242.173.68]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost ringcloud] and IPs [103.242.173.68 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost ringcloud] and IPs [103.242.173.68 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[kubeconfig] Writing "admin.conf" kubeconfig file
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 14.504580 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.22" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Skipping phase. Please see --upload-certs
[mark-control-plane] Marking the node ringcloud as control-plane by adding the labels: [node-role.kubernetes.io/master(deprecated) node-role.kubernetes.io/control-plane node.kubernetes.io/exclude-from-external-load-balancers]
[mark-control-plane] Marking the node ringcloud as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 103.242.173.68:6443 --token xxx \
	--discovery-token-ca-cert-hash sha256:xxx

At this point the network is not configured yet, so the coredns pods will stay in Pending indefinitely.

3.6 Bring up the network

References:

  • Calico installation: https://docs.projectcalico.org/getting-started/kubernetes/quickstart
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/tigera-operator.yaml
kubectl create -f https://raw.githubusercontent.com/projectcalico/calico/v3.26.1/manifests/custom-resources.yaml
watch kubectl get pods -n calico-system

Once all the pods in the calico-system namespace are up, you're done.

$ kubectl get pods -n calico-system
NAME                                       READY   STATUS    RESTARTS   AGE
calico-kube-controllers-845d5d6747-bl8vm   1/1     Running   0          3m12s
calico-node-b6l2c                          1/1     Running   0          3m13s
calico-typha-84dbb8d5b5-68mp7              1/1     Running   0          3m13s
csi-node-driver-jvckc                      2/2     Running   0          3m12s

$ kubectl -n kube-system get pod
NAME                                READY   STATUS    RESTARTS   AGE
coredns-7d89d9b6b8-9rcmw            1/1     Running   0          29m
coredns-7d89d9b6b8-k565x            1/1     Running   0          29m
etcd-ringcloud                      1/1     Running   0          30m
kube-apiserver-ringcloud            1/1     Running   0          30m
kube-controller-manager-ringcloud   1/1     Running   0          30m
kube-proxy-5szw8                    1/1     Running   0          29m
kube-scheduler-ringcloud            1/1     Running   0          30m

All the Calico components, as well as coredns, are now in the Running state.

3.7 Allow scheduling pods on the master node (optional)

In production this is discouraged for safety reasons. Since we only have a single test node here, we can allow pods to be scheduled on the master.

kubectl taint nodes --all node-role.kubernetes.io/control-plane-
kubectl taint nodes --all node-role.kubernetes.io/master-
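
To confirm the taints were removed, you can inspect the node; ringcloud is the node name used throughout this article:

# The Taints field should now show <none>
kubectl describe node ringcloud | grep -i taints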

3.8 Start a simple pod as a test

# pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - name: busybox
    image: busybox
    command: ['sleep', "1000"]

Start it:

kubectl create -f pod.yaml

The container should come up and run successfully.
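
To verify a bit more than the Running status, you can check the pod and do a rough in-cluster DNS lookup from inside it; DNS lookups from busybox images can be flaky, so treat this as a loose smoke test:

# Confirm the pod reached Running
kubectl get pod test -o wide
# Rough in-cluster DNS check from inside the pod
kubectl exec test -- nslookup kubernetes.default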

Delete it:

kubectl delete -f pod.yaml

4. Appendix

4.1 The complete kubeadm init config

apiVersion: kubeadm.k8s.io/v1beta3
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 103.242.173.68
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/containerd/containerd.sock
  imagePullPolicy: IfNotPresent
  name: ringcloud
  taints: null
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta3
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns: {}
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.cn-hangzhou.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: 1.22.3
networking:
  dnsDomain: ringcloud.com
  serviceSubnet: 10.96.0.0/12
  podSubnet: 192.168.0.0/16
scheduler: {}
---
kind: KubeletConfiguration
apiVersion: kubelet.config.k8s.io/v1beta1
cgroupDriver: systemd

4.2 The complete containerd config

File: /etc/containerd/config.toml

version = 2
root = "/var/lib/containerd"
state = "/run/containerd"
plugin_dir = ""
disabled_plugins = []
required_plugins = []
oom_score = 0

[grpc]
  address = "/run/containerd/containerd.sock"
  tcp_address = ""
  tcp_tls_cert = ""
  tcp_tls_key = ""
  uid = 0
  gid = 0
  max_recv_message_size = 16777216
  max_send_message_size = 16777216

[ttrpc]
  address = ""
  uid = 0
  gid = 0

[debug]
  address = ""
  uid = 0
  gid = 0
  level = ""

[metrics]
  address = ""
  grpc_histogram = false

[cgroup]
  path = ""

[timeouts]
  "io.containerd.timeout.shim.cleanup" = "5s"
  "io.containerd.timeout.shim.load" = "5s"
  "io.containerd.timeout.shim.shutdown" = "3s"
  "io.containerd.timeout.task.state" = "2s"

[plugins]
  [plugins."io.containerd.gc.v1.scheduler"]
    pause_threshold = 0.02
    deletion_threshold = 0
    mutation_threshold = 100
    schedule_delay = "0s"
    startup_delay = "100ms"
  [plugins."io.containerd.grpc.v1.cri"]
    disable_tcp_service = true
    stream_server_address = "127.0.0.1"
    stream_server_port = "0"
    stream_idle_timeout = "4h0m0s"
    enable_selinux = false
    selinux_category_range = 1024
    sandbox_image = "registry.cn-hangzhou.aliyuncs.com/google_containers/pause:3.5"
    stats_collect_period = 10
    systemd_cgroup = false
    enable_tls_streaming = false
    max_container_log_line_size = 16384
    disable_cgroup = false
    disable_apparmor = false
    restrict_oom_score_adj = false
    max_concurrent_downloads = 3
    disable_proc_mount = false
    unset_seccomp_profile = ""
    tolerate_missing_hugetlb_controller = true
    disable_hugetlb_controller = true
    ignore_image_defined_volumes = false
    [plugins."io.containerd.grpc.v1.cri".containerd]
      snapshotter = "overlayfs"
      default_runtime_name = "runc"
      no_pivot = false
      disable_snapshot_annotations = true
      discard_unpacked_layers = false
      [plugins."io.containerd.grpc.v1.cri".containerd.default_runtime]
        runtime_type = ""
        runtime_engine = ""
        runtime_root = ""
        privileged_without_host_devices = false
        base_runtime_spec = ""
      [plugins."io.containerd.grpc.v1.cri".containerd.untrusted_workload_runtime]
        runtime_type = ""
        runtime_engine = ""
        runtime_root = ""
        privileged_without_host_devices = false
        base_runtime_spec = ""
      [plugins."io.containerd.grpc.v1.cri".containerd.runtimes]
        [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc]
          runtime_type = "io.containerd.runc.v2"
          runtime_engine = ""
          runtime_root = ""
          privileged_without_host_devices = false
          base_runtime_spec = ""
          [plugins."io.containerd.grpc.v1.cri".containerd.runtimes.runc.options]
            SystemdCgroup = true
    [plugins."io.containerd.grpc.v1.cri".cni]
      bin_dir = "/opt/cni/bin"
      conf_dir = "/etc/cni/net.d"
      max_conf_num = 1
      conf_template = ""
    [plugins."io.containerd.grpc.v1.cri".registry]
      [plugins."io.containerd.grpc.v1.cri".registry.mirrors]
        [plugins."io.containerd.grpc.v1.cri".registry.mirrors."docker.io"]
          endpoint = ["https://registry-1.docker.io"]
    [plugins."io.containerd.grpc.v1.cri".image_decryption]
      key_model = ""
    [plugins."io.containerd.grpc.v1.cri".x509_key_pair_streaming]
      tls_cert_file = ""
      tls_key_file = ""
  [plugins."io.containerd.internal.v1.opt"]
    path = "/opt/containerd"
  [plugins."io.containerd.internal.v1.restart"]
    interval = "10s"
  [plugins."io.containerd.metadata.v1.bolt"]
    content_sharing_policy = "shared"
  [plugins."io.containerd.monitor.v1.cgroups"]
    no_prometheus = false
  [plugins."io.containerd.runtime.v1.linux"]
    shim = "containerd-shim"
    runtime = "runc"
    runtime_root = ""
    no_shim = false
    shim_debug = false
  [plugins."io.containerd.runtime.v2.task"]
    platforms = ["linux/amd64"]
  [plugins."io.containerd.service.v1.diff-service"]
    default = ["walking"]
  [plugins."io.containerd.snapshotter.v1.devmapper"]
    root_path = ""
    pool_name = ""
    base_image_size = ""
    async_remove = false