kubeadm2ha
A set of scripts and documentation for adding redundancy (etcd cluster, multiple masters) to a cluster set up with kubeadm 1.8 and above. This code is intended to demonstrate and simplify the creation of redundant-master setups while still using kubeadm, which does not yet provide this functionality itself. See kubernetes/kubeadm/issues/546 for discussion on this.
This code largely follows the instructions published in cookeem/kubeadm-ha; it contributes only minor changes for K8s 1.8 compatibility and automates the process.
This repository contains a set of Ansible scripts to do this. The following playbooks are provided:
- cluster-setup.yaml sets up a complete cluster including the HA setup. See below for more details.
- cluster-load-balanced.yaml sets up an NGINX load balancer for the apiserver.
- cluster-uninstall.yaml removes data and configuration files to a point that cluster-setup.yaml can be used again.
- cluster-dashboard.yaml sets up the dashboard including influxdb/grafana.
- etcd-operator.yaml sets up the etcd-operator.
- cluster-images.yaml prefetches all images needed for Kubernetes operations and transfers them to the target hosts.
- local-access.yaml fetches a patched admin.conf file to /tmp/MY-CLUSTER-NAME-admin.conf. After copying it to ~/.kube/config, remote kubectl access via V-IP / load balancer can be tested.
- uninstall-dashboard.yaml removes the dashboard.
- cluster-upgrade.yaml upgrades a cluster.
Ansible version 2.4 or higher is required. Older versions will not work.
In order to use the ansible scripts, at least two files need to be configured:
- Either edit my-cluster.inventory or create your own. The inventory must define the following groups: primary-master (a single machine on which kubeadm will be run), secondary-masters (the other masters), masters (all masters), minions (the worker nodes), nodes (all nodes), etcd (all machines on which etcd is installed, usually the masters). A minimal inventory sketch follows this list.
- Either edit group_vars/my-cluster.yaml to your needs or create your own (named after the group defined in the inventory you want to use). Override settings from group_vars/all.yaml where necessary.
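For orientation, a minimal inventory satisfying these group requirements might look like the sketch below. The host names are purely hypothetical, and the my-cluster.inventory shipped with the repository may be organised differently; only the group names matter.
```
# my-cluster.inventory (sketch) -- hypothetical host names, adjust to your environment
[primary-master]
master-1.example.com

[secondary-masters]
master-2.example.com
master-3.example.com

[masters:children]
primary-master
secondary-masters

[minions]
worker-1.example.com
worker-2.example.com

[nodes:children]
masters
minions

# etcd usually runs on the masters
[etcd:children]
masters
```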
The cluster-setup.yaml playbook does the following:
- Set up an etcd cluster with self-signed certificates on all hosts in group etcd.
- Set up a keepalived cluster on all hosts in group masters.
- Set up a master instance on the host in group primary-master using kubeadm.
- Set up master instances on all hosts in group secondary-masters by copying and patching (replace the primary master's host name and IP) the configuration created by kubeadm and have them join the cluster.
- Configure kube-proxy to use the V-IP / load balancer URL and scale kube-dns according to the master nodes' cardinality.
- Use kubeadm to join all hosts in the group minions.
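Once the playbook has finished, a quick sanity check from a host with working kubectl access (e.g. the primary master) could look like this; the exact output will of course depend on your environment:
```
# all masters and minions should eventually report Ready
kubectl get nodes -o wide

# the control-plane components, kube-dns and kube-proxy should be Running
kubectl -n kube-system get pods -o wide
```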
The cluster-load-balanced.yaml playbook does the following:
- Add an NGINX-based load balancer to the cluster. After this, the apiserver will be available through the virtual IP on port 8443. Note that this is a round-robin load balancer that will interfere with watch actions, like kubectl logs -f from a remote host (see #4). A quick check of this setup is sketched after the next item.

The etcd-operator.yaml playbook does the following:
- Add etcd-operator for use with applications running in the cluster. This is an add-on purely because I happen to need it.
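As a quick check of the load-balancer setup mentioned above, you can query the apiserver's health endpoint through the virtual IP (replace <V-IP> with your configured address; depending on your RBAC configuration an anonymous request may be rejected, but even a 401/403 proves that the load balancer answers):
```
# -k because your workstation will not trust the cluster's CA by default
curl -k https://<V-IP>:8443/healthz
```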
The cluster-images.yaml playbook pre-fetches and transfers Kubernetes images. This is useful for systems without Internet access. It does the following (a manual equivalent of these steps is sketched after the list):
- Pull all required images locally (hence you need to have Docker installed on the host from which you run Ansible).
- Export the images to tar files.
- Copy the tar files over to the target hosts.
- Import the images from the tar files on the target hosts.
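For reference, a manual equivalent of these four steps for a single image might look like the sketch below. The image name and target host are examples only; the playbook iterates over the full list of images required for your configured Kubernetes version.
```
# on the host running ansible: pull and export one image
docker pull gcr.io/google_containers/kube-apiserver-amd64:v1.9.2
docker save gcr.io/google_containers/kube-apiserver-amd64:v1.9.2 -o kube-apiserver.tar

# copy the tar file to a target host and import it there
scp kube-apiserver.tar root@<target-host>:/tmp/
ssh root@<target-host> docker load -i /tmp/kube-apiserver.tar
```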
The cluster-dashboard.yaml playbook does the following:
- Install the influxdb, grafana and dashboard components.
- Scale the number of instances to the number of master nodes.
- Expose the instances via NodePort, so that they can be accessed through the V-IP.
- Set up a service account 'admin-user' and a cluster role binding for the role 'cluster-admin', so that the dashboard can be accessed with root-like privileges.
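To verify the result, the following commands should show the dashboard service exposed as a NodePort (port 30443, see below) and the 'admin-user' service account; the service name and namespace are taken from the proxy URL used later in this document:
```
# dashboard service, exposed via NodePort
kubectl -n kube-system get svc kubernetes-dashboard

# service account and its cluster role binding
kubectl -n kube-system get serviceaccount admin-user
kubectl get clusterrolebinding | grep admin-user
```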
For accessing the dashboard in this configuration there are two options:
- Use https://<V-IP>:30443, i.e. connect to the remote IP directly. You will get a certificate warning though, because the cluster's certificates will be unknown to your browser.
- Run kubectl proxy on your local host (which requires kubectl to be configured for your local host; see Configuring local access below for automating this), then access the dashboard via http://localhost:8001/api/v1/namespaces/kube-system/services/https:kubernetes-dashboard:/proxy/
The dashboard will ask you to authenticate. Again, there are several options:
Use the token of an existing service account with sufficient privileges. On many clusters this command works for root-like access:
kubectl -n kube-system describe secrets `kubectl -n kube-system get secrets | awk '/clusterrole-aggregation-controller/ {print $1}'` | awk '/token:/ {print $2}'
Use the token of the 'admin-user' service account (if it exists):
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
Use the local-access.yaml playbook to generate a configuration file. That file can be copied to ~/.kube/config for local kubectl access. It can also be uploaded as a kubeconfig file in the dashboard's login dialogue.
Running the local-access.yaml playbook creates a file /tmp/MY-CLUSTER-NAME-admin.conf that can be used as ~/.kube/config. If the dashboard has been installed (see above), the file will contain the 'admin-user' service account's token, so that root-like access is possible both for kubectl and the dashboard. If that service account does not exist, the client-side certificate will be used instead, which is OK for testing environments but generally not recommended, because the client-side certificates are not supposed to leave their master host.
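A typical sequence, assuming the example inventory name used earlier (the actual file name contains your cluster's name):
```
# generate the patched admin.conf on the ansible host
ansible-playbook -i my-cluster.inventory local-access.yaml

# use it for local kubectl access via the V-IP / load balancer
cp /tmp/MY-CLUSTER-NAME-admin.conf ~/.kube/config
kubectl get nodes
```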
For upgrading a cluster several steps are needed:
- Find out which software versions to upgrade to.
- Set the ansible variables to the new software versions.
- Run the cluster-images.yaml playbook if the cluster has no Internet access.
- Run the cluster-upgrade.yaml playbook.
Note: Never upgrade a production cluster without having tried the upgrade on a reference system first.
To find out which software versions to upgrade to, you will need to run a more recent version of kubeadm:
```
export VERSION=$(curl -sSL https://dl.k8s.io/release/stable.txt) # or manually specify a released Kubernetes version
export ARCH=amd64 # or: arm, arm64, ppc64le, s390x
curl -sSL https://dl.k8s.io/release/${VERSION}/bin/linux/${ARCH}/kubeadm > /tmp/kubeadm
chmod a+rx /tmp/kubeadm
```
Copy this file to /tmp on your primary master if necessary. Now run this command for checking prerequisites and determining the versions you'd get:
/tmp/kubeadm upgrade plan
If the prerequisites are met you'll get a summary of the software versions kubeadm would upgrade to, like this:
```
Upgrade to the latest stable version:

COMPONENT            CURRENT   AVAILABLE
API Server           v1.8.3    v1.9.2
Controller Manager   v1.8.3    v1.9.2
Scheduler            v1.8.3    v1.9.2
Kube Proxy           v1.8.3    v1.9.2
Kube DNS             1.14.5    1.14.7
Etcd                 3.2.7     3.1.11
```
Note that upgrading etcd is not supported here because we are running it externally; hence it has to be upgraded according to etcd's own upgrade instructions, which is beyond the scope of this document.
We will always use the same version for the Kubernetes base software installed on your OS (kubelet, kubectl, kubeadm) and the self-hosted core components (API Server, Controller Manager, Scheduler, Kube Proxy). Hence the "v1.9.2" listed in the kubeadm output will go into the KUBERNETES_VERSION Ansible variable. Edit either group_vars/all.yaml to change this globally or group_vars/<your-environment>.yaml for your environment only. The same applies to the Kube DNS version, which corresponds to the KUBERNETES_DNS_VERSION Ansible variable.
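Using the example versions from the kubeadm upgrade plan output above, the relevant part of the group_vars file might look like the sketch below; check group_vars/all.yaml for the exact value format used in your copy of the repository.
```
# group_vars/<your-environment>.yaml (sketch) -- values taken from the example plan output above
KUBERNETES_VERSION: v1.9.2
KUBERNETES_DNS_VERSION: 1.14.7
```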
Having configured this, you may now want to fetch and install the new images for your to-be-upgraded cluster if it has no Internet access. If it does have Internet access, you may want to do this anyway to make the upgrade more seamless.
To do so, run the following command:
ansible-playbook -f <good-number-of-concurrent-processes> -i <your-environment>.inventory cluster-images.yaml
I usually set the number of concurrent processes manually: if a cluster consists of more than 5 nodes (the default), picking a higher value here significantly speeds up the process.
You may want to back up /etc/kubernetes on all your master machines. Do this before running the upgrade.
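One possible way to do this is an ad-hoc Ansible command along the lines of the sketch below (same inventory and -u root convention as in the reinstall sequence further down; adjust the archive path to taste):
```
# create a dated tarball of /etc/kubernetes on every master
ansible -u root -i <your-environment>.inventory masters -m command \
  -a "tar czf /root/kubernetes-backup-$(date +%F).tar.gz /etc/kubernetes"
```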
The actual upgrade is automated. Run the following command:
ansible-playbook -f <good-number-of-concurrent-processes> -i <your-environment>.inventory cluster-upgrade.yaml
See the comment above on setting the number of concurrent processes.
The upgrade is not fully free of disruptions:
- While kubeadm applies the changes on a master, it restarts a number of services, hence they may be unavailable for a short time.
- Containers running on the minions that keep local data have to take care of rebuilding it when they are relocated to different minions during the upgrade process (i.e. local data is ignored).

If any of this is unacceptable, a fully automated upgrade process does not really make sense, because deep knowledge of the applications running in the respective cluster is required to work around these issues. Hence, in that case a manual upgrade process is recommended.
After the upgrade the NGINX load balancer will not be in use. To re-enable it, simply rerun the cluster-load-balanced.yaml playbook.
If the upgrade fails, the situation afterwards depends on the phase in which things went wrong.
If kubeadm failed to upgrade the cluster, it will try to perform a rollback. Hence if that happened on the first master, chances are pretty good that the cluster is still intact. In that case all you need to do is start docker, kubelet and keepalived on the secondary masters and then uncordon them (kubectl uncordon <secondary-master-fqdn>) to be back where you started from.
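In shell terms, the recovery looks roughly like this, assuming systemd-managed services as on RHEL-like systems (replace the FQDN placeholder accordingly):
```
# on each secondary master: bring the services back up
systemctl start docker kubelet keepalived

# then, from a host with kubectl access: make the node schedulable again
kubectl uncordon <secondary-master-fqdn>
```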
If kubeadm on one of the secondary masters failed, you still have a working, upgraded cluster, but without the secondary masters, which are left in a somewhat undefined condition. In some cases kubeadm fails if the cluster is still busy after having upgraded the previous master node, so that waiting a bit and re-running kubeadm upgrade apply v<VERSION> may even succeed. Otherwise you will have to find out what went wrong and join the secondaries manually. Once this has been done, finish the automatic upgrade process by processing the second half of the playbook only:
ansible-playbook -f <good-number-of-concurrent-processes> -i <your-environment>.inventory cluster-upgrade.yaml --tags nodes
If upgrading the software packages (i.e. the second half of the playbook) failed, you still have a working cluster. You may try to fix the problems and continue manually. See the .yaml files under roles/upgrade-nodes/tasks for what you need to do.
If you are trying out the upgrade on a reference system, you may have to downgrade at some point to start again. See the sequence for reinstalling a cluster below for instructions on how to do this (hint: it is important to erase the base software packages before setting up a new cluster based on a lower Kubernetes version).
To run one of the playbooks (e.g. to set up a cluster), run ansible like this:
ansible-playbook -i <your-inventory-file>.inventory cluster-setup.yaml
You might want to adapt the number of parallel processes to your number of hosts using the -f option.
A sane sequence of playbooks for a complete setup would be:
- cluster-setup.yaml
- etcd-operator.yaml
- cluster-dashboard.yaml
- cluster-load-balanced.yaml
The following playbooks can be used as needed:
- cluster-uninstall.yaml
- local-access.yaml
- uninstall-dashboard.yaml
Sequence for reinstalling a cluster:
```
INVENTORY=<your-inventory-file>
NODES=<number-of-nodes>
ansible-playbook -f $NODES -i $INVENTORY cluster-uninstall.yaml
sleep 3m
# if you want to downgrade your kubelet, kubectl, ... packages you need to uninstall them first
# if this is not the issue here, you can skip the following line
ansible -u root -f $NODES -i $INVENTORY nodes -m command -a "rpm -e kubelet kubectl kubeadm kubernetes-cni"
for i in cluster-setup.yaml etcd-operator.yaml cluster-dashboard.yaml ; do
  ansible-playbook -f $NODES -i $INVENTORY $i || break
  sleep 15s
done
```
This is a preview in order to obtain early feedback. It is not done yet. Known limitations are:
- There could be more error checking.
- The code has been tested almost exclusively in a Redhat-like (RHEL) environment. More testing on other distros is needed.
Currently the code is in a "works for me" state. In order to make a release, more feedback from others is needed. I still expect some more bugs to be reported and fixed thereafter. Once this phase has ended there will be a first release.