Terraform Provider for Cluster Creation on Direct-Connected Hosts

Background

Creating a Kubernetes cluster on direct-connected hosts requires a few steps, and the Nirmata Terraform provider gives users an easy way to drive those steps through the Nirmata APIs. Here are the steps that the Nirmata Terraform provider automates:

  1. Create a direct-connect host group.
  2. Install the Nirmata agent on the hosts and connect them to Nirmata.
  3. Create a Kubernetes cluster with a specific cluster policy through the Nirmata API.

Prerequisites

  1. Infrastructure with DNS and DHCP servers that can be readily configured.
  2. Obtain the necessary certificates for the Nirmata installation. For an HA configuration, set up a load balancer for the base Kubernetes cluster as well as for Nirmata access.
  3. Linux hosts (Ubuntu 16.04, Ubuntu 18.04, or CentOS 7) with internet access to download Nirmata images from Docker Hub and Google Container Registry.
  4. For the basic install, a host with the following spec: 8 vCPU, 32 GB memory, 200 GB SSD storage.
  5. For the HA install, three hosts, each with the following spec: 8 vCPU, 32 GB memory, 200 GB SSD storage.
  6. Configure SELinux in permissive mode.
  7. Configure sysctl net.ipv4.conf.all.forwarding=1
  8. Configure sysctl net.bridge.bridge-nf-call-iptables=1
  9. Disable swap with sudo swapoff -a and remove any swap entries from /etc/fstab. (A consolidated script for items 6-9 and 16 appears after this list.)
  10. Install Docker Engine version 18.09.2. Instructions are available in the Docker documentation.
  11. Terraform (tested with 0.12.24)
  12. Go (tested with 1.14)
  13. The Nirmata Terraform provider source code
  14. Nirmata API key (go to Settings at the bottom left of the UI; see the API Key image at the end of this doc)
  15. Direct Connect cluster policy (see the images under Cluster Policy)
  16. Clean up iptables rules: iptables --flush
  17. To verify node readiness, run the k8s_test.sh script. Download it from GitHub: https://raw.githubusercontent.com/nirmata/custom-scripts/master/k8_test.sh

Run: k8s_test.sh --local
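
Items 6-9 and 16 above can be applied in one pass on each node. A minimal sketch, assuming a systemd-based Ubuntu or CentOS host; the sed patterns are assumptions, so review them before running:

# Put SELinux in permissive mode (CentOS; not present on stock Ubuntu)
sudo setenforce 0
sudo sed -i 's/^SELINUX=enforcing/SELINUX=permissive/' /etc/selinux/config
# Load br_netfilter so the bridge sysctl exists, then set both sysctls
sudo modprobe br_netfilter
sudo sysctl -w net.ipv4.conf.all.forwarding=1
sudo sysctl -w net.bridge.bridge-nf-call-iptables=1
# Disable swap now and drop swap entries so it stays off after a reboot
sudo swapoff -a
sudo sed -i '/\sswap\s/d' /etc/fstab
# Clean up iptables rules
sudo iptables --flush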

Using the Nirmata Terraform provider to create a Direct Connect cluster

Step-0: Compile and install the Nirmata plugin
$ go clean
$ go build
go: downloading github.com/nirmata/go-client v1.0.0
$ cp terraform-provider-nirmata ~/.terraform/plugins/linux_amd64  # or %APPDATA%\terraform.d\plugins on Windows, or ~/.terraform.d/plugins
Step-1: Create a new directory and a Terraform file
$ mkdir my-nirmata-tf ; cd my-nirmata-tf
Create the file nirmata.tf with the following content:

provider "nirmata" {
  // Set NIRMATA_TOKEN with your API Key
  // You can also set NIRMATA_URL with the Nirmata URL address
  // NIRMATA_URL=https://nirmata.local terraform <command>
}

resource "nirmata_host_group_direct_connect" "dc-host-group" {
  // This host group must not already exist in Nirmata!
  name = "my-hg-1"
}
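
Before running Terraform, export the environment variables referenced in the provider comments (the URL value below is a placeholder; use your own Nirmata address):

$ export NIRMATA_TOKEN=<your API key>
$ export NIRMATA_URL=https://nirmata.io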
Step-2: Test host group create and delete
$ terraform init 
$ terraform apply
(In the UI, check Host Groups -> Direct Connect; my-hg-1 should appear.)
$ terraform destroy 
(Check Host Groups -> Direct Connect again; my-hg-1 should be gone.)
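
Optionally, you can add an output to nirmata.tf to surface the agent install command used in Step-3 (the output name agent_install_command is just an example):

output "agent_install_command" {
  // curl_script holds the full agent install command for this host group
  value = nirmata_host_group_direct_connect.dc-host-group.curl_script
}

After terraform apply, terraform output agent_install_command prints the command, which you can also run by hand on a host.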
Step-3: Add node provisioning
  • This is site and cloud provider dependent.
  • When creating nodes, be sure they are Kubernetes compatible (see the prerequisites above).
  • You must run the Nirmata agent script with the options for this host group. The dc-host-group resource exposes this command in the attribute nirmata_host_group_direct_connect.dc-host-group.curl_script.

Add to nirmata.tf:

// Example: simple SSH provisioning of a single Ubuntu 19.10 host

(see samples/ssh/ssh.tf for a full SSH example)

resource "null_resource" "node" {
    depends_on = [ nirmata_host_group_direct_connect.dc-host-group ]
    provisioner "remote-exec" {
      inline = [ "sudo apt-get update",
        "sudo apt-get install -y docker.io",
        "${nirmata_host_group_direct_connect.dc-host-group.curl_script}"]
    }
  connection {
    type        = "ssh"
    user        = "ubuntu"
    //password    = "pass1234"
    private_key = file("~/.ssh/terraform")
    host        = "10.18.0.12"
  }
}
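
If you are bringing up several nodes, the same pattern extends with count; a sketch, assuming a hypothetical node_ips variable that lists your host addresses:

variable "node_ips" {
  type    = list(string)
  default = ["10.18.0.12", "10.18.0.13", "10.18.0.14"]
}

resource "null_resource" "nodes" {
  count      = length(var.node_ips)
  depends_on = [nirmata_host_group_direct_connect.dc-host-group]

  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y docker.io",
      nirmata_host_group_direct_connect.dc-host-group.curl_script,
    ]
  }

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("~/.ssh/terraform")
    host        = var.node_ips[count.index]
  }
}

With this variant, the cluster resource in Step-4 would depend on null_resource.nodes instead of null_resource.node.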
Step-4: Add a cluster resource that depends on the node resource above

Add to nirmata.tf:

resource "nirmata_cluster_direct_connect" "dc-cluster-1" {
  name = "my-cluster-1"
  policy = "my-policy"
  host_group = nirmata_host_group_direct_connect.dc-host-group.name
  // This depends_on must reference the cloud provider resource(s) that create the nodes.
  depends_on = [ null_resource.node ]
}

At this point you should be able to run the following commands.

$ terraform init
$ terraform plan
$ terraform apply
$ terraform show
$ terraform destroy

Note: Be sure to set NIRMATA_URL in your shell (or in the provider block) before running Terraform. (export NIRMATA_URL=https://mynirmata.local)

Note 2: If you use the example null_resource, terraform destroy will not clean up your nodes.
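
If node cleanup matters for your site, one option is a destroy-time provisioner on the node resource. A sketch, assuming that stopping the agent (the stop command reported by the agent install output below) is sufficient cleanup for your environment:

resource "null_resource" "node" {
  depends_on = [nirmata_host_group_direct_connect.dc-host-group]

  provisioner "remote-exec" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y docker.io",
      nirmata_host_group_direct_connect.dc-host-group.curl_script,
    ]
  }

  // Runs when this resource is destroyed, over the same connection below.
  // Assumption: stopping the agent is enough cleanup for your environment.
  provisioner "remote-exec" {
    when   = destroy
    inline = ["sudo systemctl stop nirmata-agent"]
  }

  connection {
    type        = "ssh"
    user        = "ubuntu"
    private_key = file("~/.ssh/terraform")
    host        = "10.18.0.12"
  }
}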

Example run

ssilbory@tower:~/my-nirmata-cluster$ terraform init

Initializing the backend...

Initializing provider plugins...
- Checking for available provider plugins...
- Downloading plugin for provider "null" (hashicorp/null) 2.1.2...

The following providers do not have any version constraints in configuration,
so the latest version was installed.

To prevent automatic upgrades to new major versions that may contain breaking
changes, it is recommended to add version = "..." constraints to the
corresponding provider blocks in configuration, with the constraint strings
suggested below.

* provider.null: version = "~> 2.1"

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.

ssilbory@tower:~/my-nirmata-cluster$ terraform apply 
provider.nirmata.token
  Nirmata API Access Token

  Enter a value: xxxxxxxxxxxxxxxxxxxx


An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # nirmata_cluster_direct_connect.dc-cluster-1 will be created
  + resource "nirmata_cluster_direct_connect" "dc-cluster-1" {
      + host_group = "baremetal-hg-1"
      + id         = (known after apply)
      + name       = "baremetal-cluster-1"
      + policy     = "sam-test-1.15"
      + state      = (known after apply)
      + status     = (known after apply)
    }

  # nirmata_host_group_direct_connect.dc-host-group will be created
  + resource "nirmata_host_group_direct_connect" "dc-host-group" {
      + curl_script = (known after apply)
      + id          = (known after apply)
      + name        = "baremetal-hg-1"
      + state       = (known after apply)
      + status      = (known after apply)
    }

  # null_resource.node will be created
  + resource "null_resource" "node" {
      + id = (known after apply)
    }

Plan: 3 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

nirmata_host_group_direct_connect.dc-host-group: Creating...
nirmata_host_group_direct_connect.dc-host-group: Creation complete after 0s [id=40316453-3720-44fe-9e13-e94dea7953bd]
null_resource.node: Creating...
null_resource.node: Provisioning with 'remote-exec'...
null_resource.node (remote-exec): Connecting to remote host via SSH...
null_resource.node (remote-exec):   Host: 10.18.0.159
null_resource.node (remote-exec):   User: ssilbory
null_resource.node (remote-exec):   Password: true
null_resource.node (remote-exec):   Private key: false
null_resource.node (remote-exec):   Certificate: false
null_resource.node (remote-exec):   SSH Agent: true
null_resource.node (remote-exec):   Checking Host Key: false
null_resource.node (remote-exec): Connected!
null_resource.node (remote-exec): Detected OS: ubuntu
null_resource.node (remote-exec): Using interface as wlx74ee2af2afa4
null_resource.node (remote-exec): Host IP address:  10.18.0.159
null_resource.node (remote-exec): Detected Docker version 19.03.8
null_resource.node (remote-exec): Docker Validation Passed
null_resource.node (remote-exec): Getting Nirmata agent ...
null_resource.node (remote-exec): Create dir ...
null_resource.node (remote-exec): Detecting host id
null_resource.node (remote-exec): curl  --insecure -so /opt/nirmata/bin/get-id.sh https://www.nirmata.io/nirmata-host-agent/get-id.sh
null_resource.node (remote-exec):  --insecure -so /opt/nirmata/bin/report-nirmata-agent-log.sh https://www.nirmata.io/nirmata-host-agent/report-nirmata-agent-log.sh
null_resource.node (remote-exec): Provider: other
null_resource.node (remote-exec): Installing nirmata-agent service ...
null_resource.node (remote-exec): Starting nirmata agent. Please wait...
null_resource.node (remote-exec): Created symlink /etc/systemd/system/multi-user.target.wants/nirmata-agent.service → /etc/systemd/system/nirmata-agent.service.
null_resource.node (remote-exec): ● nirmata-agent.service - Nirmata Host Agent Service
null_resource.node (remote-exec):    Loaded: loaded (/etc/systemd/system/nirmata-agent.service; enabled; vendor preset: enabled)
null_resource.node (remote-exec):    Active: active (running) since Fri 2020-04-03 08:34:59 PDT; 23ms ago
null_resource.node (remote-exec):   Process: 15880 ExecStartPre=/opt/nirmata/bin/start-nirmata-agent.sh (code=exited, status=0/SUCCESS)
null_resource.node (remote-exec):  Main PID: 16083 (docker)
null_resource.node (remote-exec):     Tasks: 1 (limit: 4915)
null_resource.node (remote-exec):    Memory: 7.5M
null_resource.node (remote-exec):    CGroup: /system.slice/nirmata-agent.service
null_resource.node (remote-exec):            └─16083 /usr/bin/docker wait nirmata-agent

null_resource.node (remote-exec): Apr 03 08:34:57 tower start-nirmata-agent.sh[15880]: nirmata-agent
null_resource.node (remote-exec): Apr 03 08:34:57 tower start-nirmata-agent.sh[15880]: + '[' '!' -f /opt/nirmata/conf/nirmata-agent.ini ']'
null_resource.node (remote-exec): Apr 03 08:34:57 tower start-nirmata-agent.sh[15880]: + '[' true == false ']'
null_resource.node (remote-exec): Apr 03 08:34:57 tower start-nirmata-agent.sh[15880]: + docker run -d --log-opt max-size=2m --log-opt max-file=5 --label com.nirmata.container-type=system -e https_proxy= -v /var/run:/var/run -v /opt/nirmata:/opt/nirmata/ -v /sys:/sys:ro -v /:/rootfs:ro -v /var/lib/docker/:/var/lib/docker:ro -v /etc:/etc/ --name=nirmata-agent nirmata/nirmata-host-agent:running --hostIP=10.18.0.159 --host-interface=wlx74ee2af2afa4 --hostgroup=40316453-3720-44fe-9e13-e94dea7953bd --url=wss://www.nirmata.io/host-gateway/manager --cprov=other --registry= '--registry-login={LOGIN_NAME:-}' '--registry-password={LOGIN_PASSWORD:=}' --insecure --domain=local
null_resource.node (remote-exec): Apr 03 08:34:57 tower start-nirmata-agent.sh[15880]: 4d7bf38b83111bc4aa40a933ebda9103736af04d329c8ac4d67825c730b8023f
null_resource.node (remote-exec): Apr 03 08:34:57 tower start-nirmata-agent.sh[15880]: + sleep 2
null_resource.node (remote-exec): Apr 03 08:34:59 tower start-nirmata-agent.sh[15880]: + echo 'Started nirmata agent ... '
null_resource.node (remote-exec): Apr 03 08:34:59 tower start-nirmata-agent.sh[15880]: Started nirmata agent ...
null_resource.node (remote-exec): Apr 03 08:34:59 tower start-nirmata-agent.sh[15880]: + set +x
null_resource.node (remote-exec): Apr 03 08:34:59 tower systemd[1]: Started Nirmata Host Agent Service.
null_resource.node (remote-exec): Getting Latest agent image ...
null_resource.node (remote-exec): Nirmata agent started successfully
null_resource.node (remote-exec): -------------------------------------
null_resource.node (remote-exec): Start command: systemctl start nirmata-agent
null_resource.node (remote-exec): Stop command : systemctl stop nirmata-agent
null_resource.node (remote-exec): -------------------------------------
null_resource.node: Creation complete after 29s [id=8075449753195635996]
nirmata_cluster_direct_connect.dc-cluster-1: Creating...
nirmata_cluster_direct_connect.dc-cluster-1: Creation complete after 1s [id=3ece5a4b-8e65-469f-a3b9-2502b801112c]

Apply complete! Resources: 3 added, 0 changed, 0 destroyed.

ssilbory@tower:~/my-nirmata-cluster$ terraform show
# nirmata_cluster_direct_connect.dc-cluster-1:
resource "nirmata_cluster_direct_connect" "dc-cluster-1" {
    host_group = "baremetal-hg-1"
    id         = "3ece5a4b-8e65-469f-a3b9-2502b801112c"
    name       = "baremetal-cluster-1"
    policy     = "sam-test-1.15"
    state      = "pendingCreate"
    status     = []
}

# nirmata_host_group_direct_connect.dc-host-group:
resource "nirmata_host_group_direct_connect" "dc-host-group" {
    curl_script = "sudo curl -sSL https://nirmata.io/nirmata-host-agent/setup-nirmata-agent.sh | sudo sh -s -- --cloud other --hostgroup 40316453-3720-44fe-9e13-e94dea7953bd"
    id          = "40316453-3720-44fe-9e13-e94dea7953bd"
    name        = "baremetal-hg-1"
    status      = []
}

# null_resource.node:
resource "null_resource" "node" {
    id = "8075449753195635996"
}