I was reading a cloud-native article this evening and suddenly felt like deploying a Kubernetes cluster myself, so I just went for it. Compared with the old all-in-one OpenStack deployments, Kubernetes is far simpler. This post covers deploying the Master node directly with kubeadm; in the end every component came up healthy, with no major problems along the way.
First, the environment:
VMware Fusion: a CentOS 7 VM with a Minimal install; anything else gets installed as needed
Network: NAT; Host-Only should work fine too
Resources: 2 vCPUs, 4 GB RAM, 20 GB disk; the Mac can spare that much, and if this grows into a multi-node cluster later I can reallocate
Next, a few prerequisites:
1. Disable the swap partition by commenting out the following line in /etc/fstab
#/dev/mapper/centos_2020-swap swap swap defaults 0 0
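Commenting out the fstab entry only takes effect on the next boot; to turn swap off for the current session as well, the standard swapoff call works (this step is my addition, not part of the original run):

[root@2020 lihui]# swapoff -a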
2. Disable SELinux by changing the following line in /etc/selinux/config
SELINUX=disabled
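Likewise, the config file change only applies after a reboot; setenforce drops SELinux to permissive mode immediately (also my addition):

[root@2020 lihui]# setenforce 0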
3. Tune the iptables-related kernel parameters and enable IP forwarding. If all that feels like a hassle during deployment, just stop firewalld for now; otherwise there is a pile of ports to open.
[root@2020 lihui]# cat /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward=1
vm.swappiness=0
[root@2020 lihui]# sysctl --system
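One caveat I'd add for a Minimal install: the net.bridge.* keys only exist once the br_netfilter kernel module is loaded, so sysctl --system can fail to apply them on a fresh box. Loading the module first is a safe bet (an assumption for fresh installs, not something hit in this run):

[root@2020 lihui]# modprobe br_netfilter
[root@2020 lihui]# sysctl --system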
Now for the real work: configure the yum repos and install everything directly.
The docker-ce repo:
[root@2020 lihui]# cd /etc/yum.repos.d/
[root@2020 yum.repos.d]# wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo
[root@2020 yum.repos.d]# yum clean all
[root@2020 yum.repos.d]# yum repolist
The kubernetes repo:
[root@2020 lihui]# cat /etc/yum.repos.d/kubernetes.repo
[kubernetes]
name=kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64/
gpgcheck=0
enabled=1
Install them:
yum install docker-ce kubeadm
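kubeadm pulls in kubelet and kubectl as dependencies, which is why they don't need to be listed. If you'd rather pin everything to the version this post targets (the version pins are my addition, matching the v1.17.0 used below), something like this should work:

yum install -y docker-ce kubeadm-1.17.0 kubelet-1.17.0 kubectl-1.17.0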
Start the services:
systemctl start docker
systemctl start kubelet
At this point docker and kubelet are installed and running. The remaining components are deployed as containers, but for well-known reasons the corresponding docker images cannot be pulled directly, so you have to fetch them through some other channel and then retag them. Before doing that you need to know the component versions, otherwise there is nothing to pull; the two steps below print them.
[root@2020 ~]# kubeadm config print init-defaults --kubeconfig MasterConfiguration > kubeadm.yaml
W0108 22:36:32.862368    3395 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0108 22:36:32.862424    3395 validation.go:28] Cannot validate kubelet config - no validator is available
[root@2020 ~]# vim kubeadm.yaml
[root@2020 ~]# kubeadm config images list --config kubeadm.yaml
W0108 22:36:51.939616    3416 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0108 22:36:51.939670    3416 validation.go:28] Cannot validate kubelet config - no validator is available
k8s.gcr.io/kube-apiserver:v1.17.0
k8s.gcr.io/kube-controller-manager:v1.17.0
k8s.gcr.io/kube-scheduler:v1.17.0
k8s.gcr.io/kube-proxy:v1.17.0
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.4.3-0
k8s.gcr.io/coredns:1.6.5
With the versions in hand, a simple script handles the downloading:
#!/bin/bash
# Pull each control-plane image from the Aliyun mirror, retag it with the
# k8s.gcr.io name that kubeadm expects, then drop the mirror tag.
MIRROR=registry.cn-hangzhou.aliyuncs.com/google_containers
IMAGES="kube-apiserver:v1.17.0
        kube-controller-manager:v1.17.0
        kube-scheduler:v1.17.0
        kube-proxy:v1.17.0
        pause:3.1
        etcd:3.4.3-0
        coredns:1.6.5"
for IMAGE in $IMAGES; do
    docker pull $MIRROR/$IMAGE
    docker tag $MIRROR/$IMAGE k8s.gcr.io/$IMAGE
    docker rmi $MIRROR/$IMAGE
done
After running it, the pulled images are all there:
[root@2020 lihui]# docker image ls
REPOSITORY                           TAG       IMAGE ID       CREATED        SIZE
k8s.gcr.io/kube-proxy                v1.17.0   7d54289267dc   4 weeks ago    116MB
k8s.gcr.io/kube-apiserver            v1.17.0   0cae8d5cc64c   4 weeks ago    171MB
k8s.gcr.io/kube-controller-manager   v1.17.0   5eb3b7486872   4 weeks ago    161MB
k8s.gcr.io/kube-scheduler            v1.17.0   78c190f736b1   4 weeks ago    94.4MB
k8s.gcr.io/coredns                   1.6.5     70f311871ae1   2 months ago   41.6MB
k8s.gcr.io/etcd                      3.4.3-0   303ce5db0e90   2 months ago   288MB
k8s.gcr.io/pause                     3.1       da86e6ba6ca1   2 years ago    742kB
Next, I edited the YAML file generated by the redirect above, making a few basic changes:
advertiseAddress: set to the IP address that the NAT network assigned to this machine
serviceSubnet: I set this to the docker bridge's subnet, guessing purely from the word "service" in the field name; if that turns out wrong and services misbehave, I'll change it
imageRepository: switched to the Aliyun registry
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 172.16.247.132
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: "2020"
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers
kind: ClusterConfiguration
kubernetesVersion: v1.17.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 172.17.0.0/16
scheduler: {}
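A side note: since imageRepository now points at the Aliyun registry, kubeadm should be able to fetch all the images by itself, which would make the pull-and-tag script above optional. I didn't take this route, but the command exists for exactly this purpose:

kubeadm config images pull --config kubeadm.yaml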
Then run init, which prints a pile of warnings:
[root@2020 ~]# kubeadm init --config kubeadm.yaml
W0108 23:10:08.023913   15723 validation.go:28] Cannot validate kube-proxy config - no validator is available
W0108 23:10:08.023961   15723 validation.go:28] Cannot validate kubelet config - no validator is available
[init] Using Kubernetes version: v1.17.0
[preflight] Running pre-flight checks
	[WARNING Firewalld]: firewalld is active, please ensure ports [6443 10250] are open or your cluster may not function correctly
	[WARNING Service-Docker]: docker service is not enabled, please run 'systemctl enable docker.service'
	[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/
	[WARNING Service-Kubelet]: kubelet service is not enabled, please run 'systemctl enable kubelet.service'
Some complain about unopened ports, others about services not enabled at boot. In that case let's take one step back: a reset undoes everything init did.
[root@2020 ~]# kubeadm reset
[reset] Reading configuration from the cluster...
[reset] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[reset] WARNING: Changes made to this host by 'kubeadm init' or 'kubeadm join' will be reverted.
[reset] Are you sure you want to proceed? [y/N]: y
[preflight] Running pre-flight checks
[reset] Removing info for node "2020" from the ConfigMap "kubeadm-config" in the "kube-system" Namespace
W0108 23:17:01.884181   18453 removeetcdmember.go:61] [reset] failed to remove etcd member: error syncing endpoints with etc: etcdclient: no available endpoints. Please manually remove this etcd member using etcdctl
[reset] Stopping the kubelet service
[reset] Unmounting mounted directories in "/var/lib/kubelet"
[reset] Deleting contents of config directories: [/etc/kubernetes/manifests /etc/kubernetes/pki]
[reset] Deleting files: [/etc/kubernetes/admin.conf /etc/kubernetes/kubelet.conf /etc/kubernetes/bootstrap-kubelet.conf /etc/kubernetes/controller-manager.conf /etc/kubernetes/scheduler.conf]
[reset] Deleting contents of stateful directories: [/var/lib/etcd /var/lib/kubelet /var/lib/dockershim /var/run/kubernetes /var/lib/cni]

The reset process does not clean CNI configuration. To do so, you must remove /etc/cni/net.d

The reset process does not reset or clean up iptables rules or IPVS tables.
If you wish to reset iptables, you must do so manually by using the "iptables" command.

If your cluster was setup to utilize IPVS, run ipvsadm --clear (or similar)
to reset your system's IPVS tables.

The reset process does not clean your kubeconfig files and you must remove them manually.
Please, check the contents of the $HOME/.kube/config file.
Next, just stop the firewall entirely; if needed, the ports can be opened once the deployment is done. Also enable the services at boot:
[root@2020 ~]# systemctl stop firewalld
[root@2020 ~]# iptables -S
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
[root@2020 ~]# systemctl enable docker.service
Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.
[root@2020 ~]# systemctl enable kubelet.service
Created symlink from /etc/systemd/system/multi-user.target.wants/kubelet.service to /usr/lib/systemd/system/kubelet.service.
Run init again:
[root@2020 ~]# kubeadm init --config kubeadm.yaml
It finishes with the magic words:
Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 172.16.247.132:6443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:8bb43d2d4c8aa40f19831be3cf0ae7b8a6a4e78bf40a7d53f20e93db6079f499
The output above also tells us that To start using your cluster we have to run the commands below, so let's just follow along. This is because access to a Kubernetes cluster is encrypted and authenticated; by default the credentials under this directory are used to reach the cluster.
[root@2020 ~]# mkdir -p .kube
[root@2020 ~]# cp -i /etc/kubernetes/admin.conf .kube/config
[root@2020 ~]# chown $(id -u):$(id -g) .kube/config
[root@2020 ~]# ls -l .kube/config
-rw-------. 1 root root 5450 Jan  8 23:25 .kube/config
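As a quick sanity check that kubectl can now reach the API server (not in my original notes, but cheap to run):

[root@2020 ~]# kubectl cluster-info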
Now we can check the node status:
[root@2020 ~]# kubectl get nodes
NAME   STATUS     ROLES    AGE     VERSION
2020   NotReady   master   5m55s   v1.17.0
But the status is NotReady. Checking the pods on this node, coredns is Pending:
[root@2020 ~]# kubectl get pods -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
coredns-9d85f5447-p9jbh        0/1     Pending   0          11m
coredns-9d85f5447-tvnd2        0/1     Pending   0          11m
etcd-2020                      1/1     Running   0          11m
kube-apiserver-2020            1/1     Running   0          11m
kube-controller-manager-2020   1/1     Running   0          11m
kube-proxy-4bs8c               1/1     Running   0          11m
kube-scheduler-2020            1/1     Running   0          11m
describe shows the cause: no pod network has been deployed yet, which is also why CoreDNS, which depends on the network, is stuck in Pending:
[root@2020 ~]# kubectl describe node 2020
Name:               2020
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=2020
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 08 Jan 2020 23:20:30 +0800
Taints:             node-role.kubernetes.io/master:NoSchedule
                    node.kubernetes.io/not-ready:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  2020
  AcquireTime:     <unset>
  RenewTime:       Wed, 08 Jan 2020 23:29:34 +0800
Conditions:
  Type             Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----             ------  -----------------                 ------------------                ------                       -------
  MemoryPressure   False   Wed, 08 Jan 2020 23:26:05 +0800   Wed, 08 Jan 2020 23:20:26 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure     False   Wed, 08 Jan 2020 23:26:05 +0800   Wed, 08 Jan 2020 23:20:26 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure      False   Wed, 08 Jan 2020 23:26:05 +0800   Wed, 08 Jan 2020 23:20:26 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready            False   Wed, 08 Jan 2020 23:26:05 +0800   Wed, 08 Jan 2020 23:20:26 +0800   KubeletNotReady              runtime network not ready: NetworkReady=false reason:NetworkPluginNotReady message:docker: network plugin is not ready: cni config uninitialized
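The CoreDNS pods tell the same story from their side; their events show what the scheduler is waiting for (the k8s-app=kube-dns label comes from the stock CoreDNS deployment):

[root@2020 ~]# kubectl -n kube-system describe pods -l k8s-app=kube-dns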
As for the network, a single kubectl apply takes care of it; I went with Weave Net:
kubectl apply -f "https://cloud.weave.works/k8s/net?k8s-version=$(kubectl version | base64 | tr -d '\n')"
After running it you have to wait a while; the status stayed unchanged for so long that I started to think something was wrong.
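Instead of polling by hand, you can block until the plugin's DaemonSet finishes rolling out (the weave-net name comes from the manifest applied above):

[root@2020 ~]# kubectl -n kube-system rollout status daemonset/weave-net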
Check the component health:
[root@2020 ~]# kubectl get cs
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
And look, everything is green now:
[root@2020 ~]# kubectl get pods -n kube-system
NAME                           READY   STATUS    RESTARTS   AGE
coredns-9d85f5447-p9jbh        1/1     Running   0          9h
coredns-9d85f5447-tvnd2        1/1     Running   0          9h
etcd-2020                      1/1     Running   0          9h
kube-apiserver-2020            1/1     Running   0          9h
kube-controller-manager-2020   1/1     Running   0          9h
kube-proxy-4bs8c               1/1     Running   0          9h
kube-scheduler-2020            1/1     Running   0          9h
weave-net-jcn82                2/2     Running   0          9h
You can also see the newly created Pod weave-net-jcn82: that is the container network plugin's control component on this node.
Finally, the node status itself is back to normal:
[root@2020 ~]# kubectl get nodes
NAME   STATUS   ROLES    AGE   VERSION
2020   Ready    master   9h    v1.17.0
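One optional tweak for a single-node playground like this: remove the master taint so ordinary workloads can be scheduled on this node as well (straight from the kubeadm docs; I left mine as is):

[root@2020 ~]# kubectl taint nodes --all node-role.kubernetes.io/master-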
The complete picture of the Master node now looks like this:
[root@2020 ~]# kubectl describe node 2020
Name:               2020
Roles:              master
Labels:             beta.kubernetes.io/arch=amd64
                    beta.kubernetes.io/os=linux
                    kubernetes.io/arch=amd64
                    kubernetes.io/hostname=2020
                    kubernetes.io/os=linux
                    node-role.kubernetes.io/master=
Annotations:        kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
                    node.alpha.kubernetes.io/ttl: 0
                    volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp:  Wed, 08 Jan 2020 23:20:30 +0800
Taints:             node-role.kubernetes.io/master:NoSchedule
Unschedulable:      false
Lease:
  HolderIdentity:  2020
  AcquireTime:     <unset>
  RenewTime:       Thu, 09 Jan 2020 08:59:04 +0800
Conditions:
  Type                 Status  LastHeartbeatTime                 LastTransitionTime                Reason                       Message
  ----                 ------  -----------------                 ------------------                ------                       -------
  NetworkUnavailable   False   Thu, 09 Jan 2020 00:03:34 +0800   Thu, 09 Jan 2020 00:03:34 +0800   WeaveIsUp                    Weave pod has set this
  MemoryPressure       False   Thu, 09 Jan 2020 08:54:20 +0800   Wed, 08 Jan 2020 23:20:26 +0800   KubeletHasSufficientMemory   kubelet has sufficient memory available
  DiskPressure         False   Thu, 09 Jan 2020 08:54:20 +0800   Wed, 08 Jan 2020 23:20:26 +0800   KubeletHasNoDiskPressure     kubelet has no disk pressure
  PIDPressure          False   Thu, 09 Jan 2020 08:54:20 +0800   Wed, 08 Jan 2020 23:20:26 +0800   KubeletHasSufficientPID      kubelet has sufficient PID available
  Ready                True    Thu, 09 Jan 2020 08:54:20 +0800   Thu, 09 Jan 2020 00:03:44 +0800   KubeletReady                 kubelet is posting ready status
Addresses:
  InternalIP:  172.16.247.132
  Hostname:    2020
Capacity:
  cpu:                2
  ephemeral-storage:  17394Mi
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             4026228Ki
  pods:               110
Allocatable:
  cpu:                2
  ephemeral-storage:  16415037823
  hugepages-1Gi:      0
  hugepages-2Mi:      0
  memory:             3923828Ki
  pods:               110
System Info:
  Machine ID:                 4f699d9051a64357b561f19ea582feb9
  System UUID:                86764D56-2F54-C137-9462-7C851A6592CE
  Boot ID:                    3e40ef3c-3d70-40c0-8915-9c7d0b3b1cf6
  Kernel Version:             3.10.0-1062.el7.x86_64
  OS Image:                   CentOS Linux 7 (Core)
  Operating System:           linux
  Architecture:               amd64
  Container Runtime Version:  docker://19.3.5
  Kubelet Version:            v1.17.0
  Kube-Proxy Version:         v1.17.0
Non-terminated Pods:          (8 in total)
  Namespace                   Name                            CPU Requests  CPU Limits  Memory Requests  Memory Limits  AGE
  ---------                   ----                            ------------  ----------  ---------------  -------------  ---
  kube-system                 coredns-9d85f5447-p9jbh         100m (5%)     0 (0%)      70Mi (1%)        170Mi (4%)     9h
  kube-system                 coredns-9d85f5447-tvnd2         100m (5%)     0 (0%)      70Mi (1%)        170Mi (4%)     9h
  kube-system                 etcd-2020                       0 (0%)        0 (0%)      0 (0%)           0 (0%)         9h
  kube-system                 kube-apiserver-2020             250m (12%)    0 (0%)      0 (0%)           0 (0%)         9h
  kube-system                 kube-controller-manager-2020    200m (10%)    0 (0%)      0 (0%)           0 (0%)         9h
  kube-system                 kube-proxy-4bs8c                0 (0%)        0 (0%)      0 (0%)           0 (0%)         9h
  kube-system                 kube-scheduler-2020             100m (5%)     0 (0%)      0 (0%)           0 (0%)         9h
  kube-system                 weave-net-jcn82                 20m (1%)      0 (0%)      0 (0%)           0 (0%)         9h
Allocated resources:
  (Total limits may be over 100 percent, i.e., overcommitted.)
  Resource           Requests     Limits
  --------           --------     ------
  cpu                770m (38%)   0 (0%)
  memory             140Mi (3%)   340Mi (8%)
  ephemeral-storage  0 (0%)       0 (0%)
Events:              <none>
OVER