Getting started with Kubernetes using Ansible and Terraform

Brad Downey · Published in ITNEXT · Mar 8, 2018 · 12 min read


So, you want to start playing around with Kubernetes because, hey, why not? You could always run Minikube locally on your desktop, but that gives you only a single-node cluster. I wanted to run a multi-node cluster, and for me it was about seeing it and feeling it; I learn better that way. Building and running Kubernetes clusters is not my day job, so I needed a way to understand what all the hype is about.

Warning: This is NOT intended to be a production deployment. There are many shortcomings to this installation as compared to a full production deployment.

Tip: Most of this is based on running from my Mac. Everything used below is platform-independent, but you may need to tweak the scripts that run locally.

Here is the basic flow (I’m going to go through all of these steps in more detail below):

  1. Sign up for an AWS account, and create an access key.
  2. Download Terraform and clone my git repo. Run this to provision 3 hosts.
  3. For a very basic install use this Ansible script. I have a more complex one that installs some sample applications and uses Contiv for container segmentation. My Ansible scripts are here, and I am going to walk through them.

The TL;DR for this post is here.

git clone https://github.com/magic7s/terraform_aws_spot_instance.git
cd terraform_aws_spot_instance
terraform init
terraform apply
cd ..
git clone https://github.com/magic7s/ansible-kubeadm-contiv.git
cd ansible-kubeadm-contiv
# Edit inventory file with public ip addresses from terraform output
ansible-playbook -i inventory site.yml
[Diagram: basic layout of the AWS environment to be provisioned]

Infrastructure Provisioning using Terraform

There are probably a dozen ways to do the same thing. I chose Terraform to provision my virtual machines in AWS. Why? I’m lazy and didn’t want to keep clicking through the AWS console to spin up three clean VMs. I had to do it about 100 times because I always wanted to start fresh when deploying my cluster. Also, I wanted to destroy (delete) my VMs when I was not using them, so I didn’t have to pay.

I’m using spot instances because they are up to 90% off the normal price. I can get an m3.medium (3.75 GB RAM) for $0.0067/hr. There are some drawbacks: if Amazon raises the price above your bid price, your instance will be terminated. Also, you cannot power it off; you can only reboot it.
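
If you want to sanity-check current spot pricing before you bid, the AWS CLI can show recent price history. A small sketch, using the same instance type and region that appear later in this guide (adjust both to taste):

# Show the last few spot prices for m3.medium Linux instances in us-west-2
aws ec2 describe-spot-price-history \
    --instance-types m3.medium \
    --product-descriptions "Linux/UNIX" \
    --region us-west-2 \
    --max-items 5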

Installing Terraform is pretty simple. It comes as a compiled binary for your platform. Download it from https://www.terraform.io/downloads.html and follow the instructions. I copied the file to /usr/local/bin/terraform so I can run it from any directory on the command line.
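
As a rough sketch of what that looks like on a Mac (the version below is just an example from around the time of writing; grab whatever the downloads page lists for your platform):

# Download the zip, extract the binary, and move it onto your PATH
curl -LO https://releases.hashicorp.com/terraform/0.11.3/terraform_0.11.3_darwin_amd64.zip
unzip terraform_0.11.3_darwin_amd64.zip
sudo mv terraform /usr/local/bin/terraform
terraform version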

Download my git repo with the Terraform script to provision three hosts in a security group: https://github.com/magic7s/terraform_aws_spot_instance

Review the README.MD file for additional information on how to customize the script for your environment.

You will need to have your AWS access_key and secret_key available. I recommend you put them in ~/.aws/credentials; it should look like this:

[default]
aws_access_key_id = AAABBBBCCCDDDEEEFFF
aws_secret_access_key = ABC123456%$^&*WWWMMMCCC33658

If you installed the AWS CLI, you may have already done this.
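
If you have the AWS CLI, it can also create that file for you interactively instead of editing it by hand:

# Prompts for access key, secret key, default region, and output format,
# then writes ~/.aws/credentials and ~/.aws/config
aws configure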

Once this is done, you can run terraform init.

macbook:terraform_aws_spot_instance brad$ terraform init
Initializing provider plugins...
.... [lots of output omitted]
Terraform has been successfully initialized!
.... [lots of output omitted]
macbook:terraform_aws_spot_instance brad$

Check that the AWS region is correct in k8s.tf:

variable "aws_region_name" { default = "us-west-2" }

You will need to pre-upload or create your SSH public key in the AWS region you are working in. Here are the instructions for doing so. You will need the name of the key pair as listed on the AWS console.

variable "ssh_key_name" {default = "braddown-csicolaptop"}

Review the rest of the k8s.tf file for any changes you may want.

Run terraform plan

macbook:terraform_aws_spot_instance brad$ terraform plan
Refreshing Terraform state in-memory prior to plan...
The refreshed state will be used to calculate this plan, but will not be
persisted to local or remote state storage.
data.external.myipaddr: Refreshing state...
data.aws_ami.ubuntu: Refreshing state...
------------------------------------------------------------------------
.... [lots of output omitted]
Plan: 7 to add, 0 to change, 0 to destroy.
------------------------------------------------------------------------
Note: You didn't specify an "-out" parameter to save this plan, so Terraform
can't guarantee that exactly these actions will be performed if
"terraform apply" is subsequently run.
macbook:terraform_aws_spot_instance brad$

If you get any errors about data.external.myipaddr, check that ./myipaddr.sh runs in your local directory. It should produce something like {"ip": "68.231.198.92"} without error. If that doesn't work, try commenting it out and uncommenting the other line.

data "external" "myipaddr" {# Pick one or the other. The second one requires an external script but uses DNS instead of https.#program = ["bash", "-c", "curl -s 'https://api.ipify.org?format=json'"]program = ["bash", "${path.module}/myipaddr.sh"]}

If all is successful, run terraform apply and answer yes when prompted.

macbook:terraform_aws_spot_instance brad$ terraform apply
data.external.myipaddr: Refreshing state...
aws_security_group.k8s_sg: Refreshing state... (ID: sg-00dfd8ca76fe4e4f3)
data.aws_ami.ubuntu: Refreshing state...
aws_security_group_rule.allow_all_myip: Refreshing state... (ID: sgrule-2156153638)
aws_security_group_rule.allow_SG_any: Refreshing state... (ID: sgrule-104774467)
aws_security_group_rule.allow_all_egress: Refreshing state... (ID: sgrule-1753581355)
.... [lots of output omitted]
Apply complete! Resources: 3 added, 0 changed, 0 destroyed.
Outputs:
master_ip = 54.218.25.205
worker_ips = [
54.202.109.45,
34.211.115.248
]
macbook:terraform_aws_spot_instance brad$

Congratulations, you now have three spot instances running the latest Ubuntu 16.04 AMI. The public IP addresses are printed at the bottom of the output. We will need these as we move on to Ansible and installing Kubernetes.

[Diagram: configuration deployed via Ansible]

Installing Kubernetes with Ansible

Assuming you have already installed Python, you can install Ansible with pip. You should be running at least Ansible 2.4. Check which version you are on with ansible --version.
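
For example, on a machine that already has Python and pip:

# Install (or upgrade to) a 2.4+ release of Ansible and confirm the version
pip install --upgrade "ansible>=2.4"
ansible --version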

If you haven’t already cloned my git repo with all the Ansible scripts, do that now from https://github.com/magic7s/ansible-kubeadm-contiv

You should now have the following files locally:

.
├── deploy_contiv_network_config.yml
├── deploy_istio.yml
├── deploy_sample_apps.yml
├── group_vars
│ └── all
├── inventory
├── LICENSE
├── README.md
├── roles
│ ├── common
│ │ └── tasks
│ │ └── main.yml
│ ├── contiv
│ │ └── tasks
│ │ └── main.yml
│ ├── contiv_network_cfg
│ │ └── tasks
│ │ └── main.yml
│ ├── docker
│ │ └── tasks
│ │ └── main.yml
│ ├── istio
│ │ └── tasks
│ │ └── main.yml
│ ├── kubeadm
│ │ └── tasks
│ │ └── main.yml
│ ├── master
│ │ └── tasks
│ │ └── main.yml
│ ├── sample_apps
│ │ ├── tasks
│ │ │ ├── main.yml
│ │ │ ├── PHPGuestbook.yml
│ │ │ └── WordPressSQL.yml
│ │ ├── templates
│ │ │ ├── guestbook.yaml
│ │ │ └── wordpresssql.yaml
│ │ └── vars
│ │ └── main.yml
│ └── worker
│ └── tasks
│ └── main.yml
└── site.yml
22 directories, 22 files

site.yml is the main playbook that we will run to install everything. However, after the main installation (Docker, Kubernetes, Contiv, etc.) you can run deploy_contiv_network_config.yml, deploy_sample_apps.yml, or deploy_istio.yml as separate playbooks. By default everything will be installed in a single run.
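
For example, to re-run just the sample apps later against an existing cluster, the invocation should look something like this (same inventory file as the main run):

# Re-deploy only the sample applications
ansible-playbook -i inventory deploy_sample_apps.yml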

You will need to review and update two files: group_vars/all and inventory.

  1. kubeadm_token is a randomly generated token used to authenticate worker nodes into the cluster. (See the note after the variables file below for one way to generate a fresh one.)
  2. If for some reason your network interface is not eth0, you will need to update the master_ip variable.
  3. If you do not want to install the sample apps or the Istio service mesh (which is just another sample app), change the values to false in the vars file.
  4. contiv_nets are the subnets used within the cluster; they do not need to be routable to the outside. Pods (groups of containers) will be accessed via port translation on the host's IP address (inbound) or NAT (outbound).
## group_vars/all
kubeadm_token: db85f7.cff657b31b20eed5
master_ip: "{{ hostvars['master1']['ansible_eth0']['ipv4']['address'] }}"
ansible_remote_user: ubuntu
kubeadm_reset_before_init: true
delete_kube_dns: false
deploy_sample_apps: true
deploy_istio: true
contiv_nets:
  # This network name is required for host to access pods.
  contivh1:
    net_type: -n infra
    net_sub: -s 192.0.2.0/24
    net_gw: -g 192.0.2.1
  # This is used for pods that do not have a io.contiv.network label
  default-net:
    net_type: -n data
    net_sub: -s 172.16.10.10-172.16.10.250/24
    net_gw: -g 172.16.10.1
  blue:
    net_type: -n data
    net_sub: -s 172.17.0.5-172.17.15.250/20
    net_gw: -g 172.17.0.1
  green:
    net_type: -n data
    net_sub: -s 172.18.0.5-172.18.15.250/20
    net_gw: -g 172.18.0.1
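
If you want to replace kubeadm_token with your own, it just needs to match kubeadm's <6 chars>.<16 chars> format. One quick way to generate one (kubeadm token generate does the same thing if you happen to have kubeadm installed somewhere):

# Generate a token in the 6.16 hex-character format kubeadm expects
echo "$(openssl rand -hex 3).$(openssl rand -hex 8)"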

If you still have the output from the terraform apply command, the public IP addresses were printed at the bottom; otherwise run terraform output.

macbook:terraform_aws_spot_instance brad$ terraform output
master_ip = 54.218.113.71
worker_ips = [
54.190.7.158,
34.211.12.22
]
  1. Change the IP addresses to the public IP addresses of your hosts. Be careful not to copy and paste the comma between the worker IP addresses above.
# inventory
[master]
master1 ansible_ssh_host=54.218.113.71
[worker]
54.190.7.158
34.211.12.22
[master:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no'
[worker:vars]
ansible_ssh_common_args='-o StrictHostKeyChecking=no'

After this, you should be able to run the ansible playbook.

macbook:ansible-kubeadm-contiv brad$ ansible-playbook -i inventory site.yml
.... [lots of output omitted]
PLAY RECAP *********************************************************************************************************************************************************
34.217.135.160 : ok=16 changed=13 unreachable=0 failed=0
54.213.221.194 : ok=16 changed=13 unreachable=0 failed=0
master1 : ok=51 changed=41 unreachable=0 failed=0

Some tasks are expected to fail and are configured to be ignored; those can safely be disregarded. If the play stops and any host shows failed=1, that needs to be investigated.

For the most part you can run the playbook a second time and it will only update what is needed or delete old configuration before re-applying. This is not 100% foolproof, and in some cases it is easier to run terraform destroy and terraform apply to delete and spin up new hosts. (Remember to update the IP addresses in the inventory file.)
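
In practice the reset cycle looks like this (both commands prompt for confirmation; the relative path assumes the two repos were cloned side by side as in the TL;DR):

# Tear everything down, build fresh hosts, then grab the new public IPs for the inventory file
cd ../terraform_aws_spot_instance
terraform destroy
terraform apply
terraform output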

Accessing and testing the kubernetes cluster

If all went well above, you should be able to access the newly built cluster.

SSH into the master node. If your laptop's IP address has changed, run terraform apply again; it will re-check your IP address and update the AWS security group.
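
Something like this should get you onto the master (Ubuntu AMIs use the ubuntu user; substitute your own key and the master_ip from the Terraform output):

# SSH to the master using the public IP from terraform output
ssh -i ~/.ssh/id_rsa ubuntu@54.218.113.71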

Check if the cluster is in a Ready state:

ubuntu@ip-172-31-9-150:~$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-172-31-13-102 Ready <none> 9m v1.9.3
ip-172-31-8-222 Ready <none> 9m v1.9.3
ip-172-31-9-150 Ready master 17m v1.9.3

Check if all the pods are running:

ubuntu@ip-172-31-9-150:~$ kubectl get pods --all-namespaces
NAMESPACE NAME READY STATUS RESTARTS AGE
default frontend-5b697d64fb-bvl6l 1/1 Running 0 9m
default frontend-5b697d64fb-jck77 1/1 Running 0 9m
default frontend-5b697d64fb-zfr56 1/1 Running 0 9m
default redis-master-54c674b96-kn76j 1/1 Running 0 9m
default redis-slave-58d5cbd65d-9msk6 1/1 Running 0 9m
default redis-slave-58d5cbd65d-dj2tn 1/1 Running 0 9m
default wordpress-77d578745-7w8lv 1/1 Running 0 8m
default wordpress-mysql-5fbdd6545b-qqb8t 1/1 Running 0 8m
kube-system contiv-etcd-wzckp 1/1 Running 0 14m
kube-system contiv-netmaster-cwv2k 1/1 Running 0 14m
kube-system contiv-netplugin-6wk6j 1/1 Running 0 10m
kube-system contiv-netplugin-9s7kj 1/1 Running 0 9m
kube-system contiv-netplugin-pjcc2 1/1 Running 0 14m
kube-system contiv-ovs-cq29l 2/2 Running 0 10m
kube-system contiv-ovs-hmgw8 2/2 Running 0 9m
kube-system contiv-ovs-knl6h 2/2 Running 0 14m
kube-system etcd-ip-172-31-9-150 1/1 Running 0 17m
kube-system kube-apiserver-ip-172-31-9-150 1/1 Running 0 17m
kube-system kube-controller-manager-ip-172-31-9-150 1/1 Running 0 17m
kube-system kube-proxy-fhr4z 1/1 Running 0 10m
kube-system kube-proxy-kdk6g 1/1 Running 0 17m
kube-system kube-proxy-kz6rj 1/1 Running 0 9m
kube-system kube-scheduler-ip-172-31-9-150 1/1 Running 0 17m

Check the service IP address and ports the sample apps are running:

ubuntu@ip-172-31-9-150:~$ kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
frontend NodePort 10.96.175.217 <none> 80:31887/TCP 12m
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 21m
redis-master ClusterIP 10.107.116.88 <none> 6379/TCP 12m
redis-slave ClusterIP 10.105.113.208 <none> 6379/TCP 12m
wordpress LoadBalancer 10.104.186.47 <pending> 80:30606/TCP 12m
wordpress-mysql ClusterIP 10.108.198.177 <none> 3306/TCP 12m

You can now access the sample apps via the NodePort. Every host in the cluster exposes a random high port that is mapped to the application; see the lookup sketch after the examples below.

Example:

  • http://54.213.139.136:31887 would access the Guestbook app
  • http://54.213.139.136:30606 would access the Wordpress app
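
If you would rather not read the ports out of the kubectl get svc table, a jsonpath query can pull them directly. A sketch, using the service names deployed by the sample apps:

# Look up the NodePorts assigned to the Guestbook frontend and the WordPress service
kubectl get svc frontend -o jsonpath='{.spec.ports[0].nodePort}'; echo
kubectl get svc wordpress -o jsonpath='{.spec.ports[0].nodePort}'; echo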

Contiv Network Configuration

Contiv is the container networking (CNI) plugin, and we have deployed three data networks plus an infra network.

ubuntu@ip-172-31-9-150:~$ netctl net ls
Tenant Network Nw Type Encap type Packet tag Subnet Gateway IPv6Subnet IPv6Gateway Cfgd Tag
------ ------- ------- ---------- ---------- ------- ------ ---------- ----------- ---------
default blue data vxlan 0 172.17.0.5-172.17.15.250/20 172.17.0.1
default contivh1 infra vxlan 0 192.0.2.0/24 192.0.2.1
default default-net data vxlan 0 172.16.10.10-172.16.10.250/24 172.16.10.1
default green data vxlan 0 172.18.0.5-172.18.15.250/20 172.18.0.1
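
The -n / -s / -g flags in the group_vars file map straight onto netctl, so adding another network by hand should look roughly like this (a sketch; flags are assumed from the variables above):

# Create an additional "purple" data network using the same flag style as the playbook
netctl net create -n data -s 172.19.0.5-172.19.15.250/20 -g 172.19.0.1 purple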

All of our pods are deployed into the “blue” network. We can see that by the IP addresses of the pods.

ubuntu@ip-172-31-9-150:~$ kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE
frontend-5b697d64fb-c9bzn 1/1 Running 0 2m 172.17.0.6 ip-172-31-8-222
frontend-5b697d64fb-ddlc5 1/1 Running 0 2m 172.17.0.5 ip-172-31-13-102
frontend-5b697d64fb-xmwfd 1/1 Running 0 2m 172.17.0.7 ip-172-31-9-150
redis-master-54c674b96-zz6nn 1/1 Running 0 2m 172.17.0.8 ip-172-31-13-102
redis-slave-58d5cbd65d-46gq2 1/1 Running 0 2m 172.17.0.9 ip-172-31-8-222
redis-slave-58d5cbd65d-qpd8h 1/1 Running 0 2m 172.17.0.10 ip-172-31-13-102
wordpress-75d9d5f86b-f7d8r 1/1 Running 0 1m 172.17.0.12 ip-172-31-13-102
wordpress-mysql-5fbdd6545b-xfcgq 1/1 Running 0 1m 172.17.0.11 ip-172-31-8-222
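
To see a pod land in a different network, you can label it with io.contiv.network (the label the default-net comment in group_vars refers to). A quick throwaway test might look like this (a sketch; pod name and image are arbitrary):

# Launch a test pod into the "green" network and check that its IP comes from the 172.18.x.x range
kubectl run nettest --image=busybox --restart=Never --labels="io.contiv.network=green" -- sleep 3600
kubectl get pod nettest -o wide
kubectl delete pod nettest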

Let’s look at some of the group policy commands in Contiv. Here is a single endpoint group (the Guestbook web tier). We can see it contains three endpoints that have been labeled with the tag io.contiv.net-group = epg-blue-guestbook-web

ubuntu@ip-172-31-9-150:~$ netctl group inspect epg-blue-guestbook-web
{
"Config": {
"key": "default:epg-blue-guestbook-web",
"groupName": "epg-blue-guestbook-web",
"networkName": "blue",
"policies": [
"policy-blue-guestbook-web"
],
"tenantName": "default",
"link-sets": {
"MatchRules": {
"default:policy-blue-guestbook-db:10": {
"type": "rule",
"key": "default:policy-blue-guestbook-db:10"
}
},
"Policies": {
"default:policy-blue-guestbook-web": {
"type": "policy",
"key": "default:policy-blue-guestbook-web"
}
}
},
"links": {
"AppProfile": {},
"NetProfile": {},
"Network": {
"type": "network",
"key": "default:blue"
},
"Tenant": {
"type": "tenant",
"key": "default"
}
}
},
"Oper": {
"endpoints": [
{
"containerName": "frontend-5b697d64fb-ddlc5",
"endpointGroupId": 6,
"endpointGroupKey": "epg-blue-guestbook-web:default",
"endpointID": "0268f0a30adbf76b30c6b0d5910f1f78d5719afd9761507f57a455813a827604",
"homingHost": "ip-172-31-13-102",
"ipAddress": [
"172.17.0.5",
""
],
"labels": "map[]",
"macAddress": "02:02:ac:11:00:05",
"network": "blue.default",
"serviceName": "epg-blue-guestbook-web"
},
{
"containerName": "frontend-5b697d64fb-c9bzn",
"endpointGroupId": 6,
"endpointGroupKey": "epg-blue-guestbook-web:default",
"endpointID": "24764b44e343c9352d96da863d2f9a788060487707c13a6a7093e5fb609a5261",
"homingHost": "ip-172-31-8-222",
"ipAddress": [
"172.17.0.6",
""
],
"labels": "map[]",
"macAddress": "02:02:ac:11:00:06",
"network": "blue.default",
"serviceName": "epg-blue-guestbook-web"
},
{
"containerName": "frontend-5b697d64fb-xmwfd",
"endpointGroupId": 6,
"endpointGroupKey": "epg-blue-guestbook-web:default",
"endpointID": "2ec34a609b8819553f714c3cf348d96910053a9deee51b6b577a95a691a2cf6c",
"homingHost": "ip-172-31-9-150",
"ipAddress": [
"172.17.0.7",
""
],
"labels": "map[]",
"macAddress": "02:02:ac:11:00:07",
"network": "blue.default",
"serviceName": "epg-blue-guestbook-web"
}
],
"externalPktTag": 1,
"groupTag": "epg-blue-guestbook-web.default",
"numEndpoints": 3,
"pktTag": 1
}
}
ubuntu@ip-172-31-9-150:~$

We can see all of the configured groups and the policies associated with each group, then all of the policies, and then the rule set associated with each policy.

ubuntu@ip-172-31-9-150:~$ netctl group ls
Tenant Group Network IP Pool CfgdTag Policies Network profile
------ ----- ------- ------- ------- -------- ---------------
default epg-blue-guestbook-web blue policy-blue-guestbook-web
default epg-blue-guestbook-db blue policy-blue-guestbook-db
default epg-blue-wordpress-web blue policy-blue-wordpress-web
default epg-blue-wordpress-db blue policy-blue-wordpress-db
ubuntu@ip-172-31-9-150:~$ netctl policy ls
Tenant Policy
------ ------
default policy-blue-guestbook-web
default policy-blue-guestbook-db
default policy-blue-wordpress-web
default policy-blue-wordpress-db
ubuntu@ip-172-31-9-150:~$ netctl policy rule-ls policy-blue-guestbook-web
Incoming Rules:
Rule Priority From EndpointGroup From Network From IpAddress To IpAddress Protocol Port Action
---- -------- ------------------ ------------ --------- ------------ -------- ---- ------
10 1 tcp 80 allow
Outgoing Rules:
Rule Priority To EndpointGroup To Network To IpAddress Protocol Port Action
---- -------- ---------------- ---------- --------- -------- ---- ------
ubuntu@ip-172-31-9-150:~$ netctl policy rule-ls policy-blue-guestbook-db
Incoming Rules:
Rule Priority From EndpointGroup From Network From IpAddress To IpAddress Protocol Port Action
---- -------- ------------------ ------------ --------- ------------ -------- ---- ------
20 1 icmp 0 allow
10 1 epg-blue-guestbook-web 0 allow
Outgoing Rules:
Rule Priority To EndpointGroup To Network To IpAddress Protocol Port Action
---- -------- ---------------- ---------- --------- -------- ---- ------
ubuntu@ip-172-31-9-150:~$

These groups and policies are all configured via Ansible in roles/sample_apps/vars/main.yml

Istio Service Mesh

If you have deploy_istio: true in the group_vars/all file, you should have Istio and the Bookinfo sample application installed.

You should be able to access the Bookinfo app via the istio-ingress service.

ubuntu@ip-172-31-9-150:~$ kubectl get service -n istio-system
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
istio-ingress LoadBalancer 10.110.99.208 <pending> 80:32398/TCP,443:31125/TCP 26m
istio-mixer ClusterIP 10.105.165.158 <none> 9091/TCP,15004/TCP,9093/TCP,9094/TCP,9102/TCP,9125/UDP,42422/TCP 26m
istio-pilot ClusterIP 10.100.147.149 <none> 15003/TCP,8080/TCP,9093/TCP,443/TCP 26m

Because the port assignment is random, your port numbers will be different. Mine would be: http://54.213.139.136:32398/productpage
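
A jsonpath query can also pull the HTTP NodePort out of the istio-ingress service so you don't have to read it from the table (a sketch; port order is assumed):

# Grab the NodePort mapped to port 80 on istio-ingress, then browse to /productpage on any node's public IP
kubectl get svc istio-ingress -n istio-system -o jsonpath='{.spec.ports[0].nodePort}'; echo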

Follow the sample guide at https://istio.io/docs/tasks/traffic-management/request-routing.html for more sample configuration tasks. All the Istio files are located in /tmp/istio-0.6.0/

Enjoy!

Please provide feedback and/or suggestions. I used this process to learn (and troubleshoot) various new technologies. I am happy to accept pull requests on any of the code via GitHub. If you notice any errors or things that are not explained in this guide, please let me know and I will update it.


A lover of technology. I am a Strategic Account Leader at GitLab; the opinions expressed in this blog are my own views and not those of GitLab.