CrateDB and Docker are a great match thanks to CrateDB’s horizontally scalable shared-nothing architecture that lends itself well to containerization.
Kubernetes is an open-source container orchestration system for the deployment, management, and scaling of containerized applications.
Together, Docker and Kubernetes are a fantastic way to deploy and scale CrateDB.
This miniseries introduces you to Kubernetes and walks you through the process of setting up your first CrateDB cluster on your local machine.
Internally, Kubernetes is a system of processes distributed across every node that makes up the Kubernetes cluster. This cluster must have at least one master node and may have any number of non-master (i.e., worker) nodes.
Containers (that run the actual software you want to deploy) are then distributed intelligently across the Kubernetes nodes.
A Kubernetes node has different Kubernetes components running on it, depending on the function of the node.
A master node runs three unique components:

- kube-apiserver, which exposes the Kubernetes API and acts as the frontend for the cluster's shared state
- kube-controller-manager, which runs the control loops that move the cluster towards its desired state
- kube-scheduler, which assigns pods to nodes
A non-master node runs two unique components:

- kubelet, which starts and supervises the containers that run on the node
- kube-proxy, which maintains the network rules that route traffic to containers
This way, Kubernetes can be thought of, not as a single thing, but as multiple instances of these five components coordinating across multiple machines.
Now that we’ve established what Kubernetes is, we can look at the concepts that are used to define the state of a Kubernetes cluster.
Arguably the most important of these concepts is a pod.
For Kubernetes, a pod is the foundational building block of any system.
A pod represents a single unit of computing. This unit of computing may be one container or several tightly-coupled containers.
If you are deploying a web application, for example, a pod would run a single instance of your application.
You can scale pods horizontally. If you want to scale out, you can add replica pods. If you want to scale in, you can remove replica pods.
Most simple web application instances only require one container. However, more complex applications might require more than one. For example, one container might serve client requests, and one container might perform asynchronous background jobs.
Containers in a pod share a common network interface and each container has access to all storage volumes assigned to the pod. (Containers outside the pod do not have access to assigned storage volumes.)
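To make this concrete, here is a minimal sketch of a single-container pod definition. The pod name and image are hypothetical and not part of this tutorial's setup:

kind: Pod
apiVersion: v1
metadata:
  name: hello-pod
spec:
  containers:
    # A single nginx container. The pod's network interface and any
    # assigned volumes would be shared if more containers were added here.
    - name: web
      image: nginx:1.15
      ports:
        - containerPort: 80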
The official CrateDB Docker image works well as a single-container pod. These CrateDB pods can then be combined to produce a CrateDB cluster of any size.
In summary, pods are ephemeral, disposable environments that run one or more related containers. They are the foundational Kubernetes building block, and Kubernetes primarily concerns itself with pod scheduling, node assignment, and monitoring.
You do not usually create pods by hand. Instead, you use a controller to create them.
A controller is an entity within Kubernetes that manages a set of pods for you according to your specification.
Kubernetes has several controllers for different purposes.
For example, a DaemonSet is a controller that ensures that all nodes in the Kubernetes cluster run a single copy of the defined pod. DaemonSets are useful for node-level agents, such as the fluentd log collection daemon.
Another commonly used controller is the ReplicaSet, which ensures that a specified number of pod replicas are running at any given time.
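For illustration, a minimal ReplicaSet specification might look like this. The names and image here are hypothetical:

kind: ReplicaSet
apiVersion: apps/v1
metadata:
  name: web-replicas
spec:
  # Keep three identical, interchangeable pods running at all times.
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.15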
You could run a CrateDB cluster with a ReplicaSet, but you would quickly run into the problem of state.
Ideally, containers should be stateless. That is, they should not have a fixed identity within your system. It shouldn't matter if a container is destroyed, recreated, moved to another node, or whatever—as long as the desired number of containers are running.
Stateless containers are suitable for web applications that persist state to an external database. However, databases themselves require persistent storage. You don't want your data to vanish because a container was rescheduled!
Moreover, a CrateDB cluster requires stable network identities so that CrateDB can manage shard allocation.
Fortunately, Kubernetes provides the StatefulSet controller which gives each pod a fixed identity and storage that persists across restarts and rescheduling. That is, the controller creates all of the pods within a StatefulSet from the same template, but they are not interchangeable.
That's just the ticket for our purposes.
So far we've got pods, which house one or more related containers. Also, we have controllers, which create and destroy pods according to our specifications. We can even have pods with persistent storage.
However, because the pods can be stopped, started, and rescheduled to any Kubernetes node, assigned IP addresses may change over time. Changing IP addresses are the sort of thing we don't want client applications to have to handle.
To solve this problem, we use services.
A service provides a static interface to access one or more pods.
The three most commonly used service types are:

- ClusterIP, which exposes the service on an IP address that is only reachable from within the cluster
- NodePort, which exposes the service on a static port on every cluster node
- LoadBalancer, which exposes the service externally via a load balancer
For our purposes, we want to create a LoadBalancer service that balances incoming queries across the whole CrateDB cluster.
Enough of the theory! Let's dive in and put this knowledge into practice.
By the time you've finished this section, you should have a simple three-node CrateDB cluster running on Kubernetes.
For this tutorial you need to install:

- A hypervisor, such as VirtualBox
- Minikube, a tool that runs a single-node Kubernetes cluster inside a virtual machine
- kubectl, the Kubernetes command-line tool
This tutorial uses these three pieces of software to create and interact with a Kubernetes cluster running inside a virtual machine.
Once you have all three installed, you can continue.
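If you want to confirm that everything is installed correctly, each tool can report its version (your version numbers will differ, and the last command assumes VirtualBox is your hypervisor):

$ minikube version
$ kubectl version --client
$ VBoxManage --version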
You don't need to worry about downloading disk images and setting up virtual machines manually. As long as you have a compatible hypervisor like VirtualBox installed on your system, Minikube detects it and sets the whole thing up for you automatically.
Get started by running the following command:
$ minikube start --memory 4096
Starting local Kubernetes v1.10.0 cluster...
Starting VM...
Downloading Minikube ISO
160.27 MB / 160.27 MB [======================================] 100.00% 0s
Getting VM IP address...
Moving files into cluster...
Setting up certs...
Connecting to cluster...
Setting up kubeconfig...
Starting cluster components...
Kubectl is now configured to use the cluster.
Loading cached images from config file.
Here, we passed in --memory 4096 because Minikube assigns 1 GB of memory to the virtual machine by default, and we want more than that for our cluster. Feel free to change this value if you need to.
Minikube automatically configures kubectl to use the newly created Kubernetes cluster. To verify this, and to ensure the cluster is ready to use, run this command:
$ kubectl get nodes
NAME STATUS ROLES AGE VERSION
minikube Ready master 4m v1.10.0
Great! The cluster is up and running.
Kubernetes objects exist inside namespaces.
Namespaces provide a shared environment for objects, and these environments are isolated from other namespaces.
Kubernetes configures a new cluster with some initial namespaces.
You can view the initial namespaces like so:
$ kubectl get namespaces
NAME STATUS AGE
default Active 12m
kube-public Active 12m
kube-system Active 12m
Let's look at these namespaces in detail:

- default: The default namespace, used when no other namespace is specified.
- kube-public: This namespace is for resources that are publicly available.
- kube-system: The namespace used by the Kubernetes internals.

If you inspect the list of pods running in the kube-system namespace, you can see the Kubernetes components we covered earlier:
$ kubectl get pods --namespace kube-system
NAME READY STATUS RESTARTS AGE
etcd-minikube 1/1 Running 0 30m
kube-addon-manager-minikube 1/1 Running 0 31m
kube-apiserver-minikube 1/1 Running 0 30m
kube-controller-manager-minikube 1/1 Running 0 31m
kube-dns-86f4d74b45-hl6v9 3/3 Running 0 31m
kube-proxy-j7c59 1/1 Running 0 31m
kube-scheduler-minikube 1/1 Running 0 30m
kubernetes-dashboard-5498ccf677-wbbsm 1/1 Running 0 31m
storage-provisioner 1/1 Running 0 31m
Before we continue, let's create a namespace for our CrateDB cluster resources.
Technically, we don't require namespaces for what we're doing. However, it's worthwhile using namespaces here to familiarize yourself with them.
Create a namespace like this:
$ kubectl create namespace crate
namespace/crate created
The newly created namespace should show up, like this:
$ kubectl get namespaces
NAME STATUS AGE
crate Active 59s
default Active 32m
kube-public Active 31m
kube-system Active 32m
From now on, we pass --namespace crate to kubectl when creating or modifying resources used by our CrateDB cluster.
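As an aside, if you'd rather not type --namespace crate every time, you can make it the default for your current kubectl context, like so:

$ kubectl config set-context $(kubectl config current-context) --namespace=crate

This tutorial keeps the explicit flag so that each command is self-contained.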
For CrateDB to work, each CrateDB node should be able to discover and communicate with the other nodes in the cluster.
Create a new file named crate-internal-service.yaml and paste in the following configuration:
kind: Service
apiVersion: v1
metadata:
  name: crate-internal-service
  labels:
    app: crate
spec:
  # A static IP address is assigned to this service. This IP address is
  # only reachable from within the Kubernetes cluster.
  type: ClusterIP
  ports:
    # Port 4300 for inter-node communication.
    - port: 4300
      name: crate-internal
  selector:
    # Apply this to all pods with the `app:crate` label.
    app: crate
Let's break that down:

- This configuration defines a ClusterIP service named crate-internal-service that maps onto all pods with the app:crate label. In a subsequent step, we configure all of our CrateDB pods to use this label.
- Kubernetes creates SRV records for each matching pod. We can use those records in a subsequent step to configure CrateDB unicast host discovery.
Save the file and then create the service, like so:
$ kubectl create -f crate-internal-service.yaml --namespace crate
service/crate-internal-service created
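If you're curious, you can inspect these SRV records yourself later, once the CrateDB pods are running, by spinning up a throwaway DNS utility pod. The dnsutils image used here is just one possibility:

$ kubectl run -it --rm dns-test --restart=Never --image=tutum/dnsutils \
    --namespace crate -- \
    dig SRV _crate-internal._tcp.crate-internal-service.crate.svc.cluster.local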
We need to expose our pods externally so that clients may query CrateDB.
Create a new file named crate-external-service.yaml and paste in the following configuration:
kind: Service
apiVersion: v1
metadata:
  name: crate-external-service
  labels:
    app: crate
spec:
  # Create an externally reachable load balancer.
  type: LoadBalancer
  ports:
    # Port 4200 for HTTP clients.
    - port: 4200
      name: crate-web
    # Port 5432 for PostgreSQL wire protocol clients.
    - port: 5432
      name: postgres
  selector:
    # Apply this to all pods with the `app:crate` label.
    app: crate
Let's break that down:

- This configuration defines a LoadBalancer service named crate-external-service that maps onto all pods with the app:crate label, as before.
- Usually, a LoadBalancer service is only available from a hosted Kubernetes solution. In those situations, Kubernetes provisions an external load balancer using the load balancing service the hosting platform provides.
Fortunately, however, Minikube mocks out a load balancer for us. So we can happily use LoadBalancer services locally.
Save the file and then create the service, like so:
$ kubectl create -f crate-external-service.yaml --namespace crate
service/crate-external-service created
So far, we've created two services:

- crate-internal-service, which CrateDB uses for inter-node communication and discovery
- crate-external-service, which external clients use to query CrateDB
However, these are just the interfaces to our CrateDB cluster. We still need to define a controller that knows how to assemble and manage our desired CrateDB cluster. This controller is the heart of our setup.
Create a new file named crate-controller.yaml and paste in the following configuration:
kind: StatefulSet
apiVersion: "apps/v1"
metadata:
  # This is the name used as a prefix for all pods in the set.
  name: crate
spec:
  serviceName: "crate-set"
  # Our cluster has three nodes.
  replicas: 3
  selector:
    matchLabels:
      # The pods in this cluster have the `app:crate` label.
      app: crate
  template:
    metadata:
      labels:
        app: crate
    spec:
      # InitContainers run before the main containers of a pod are
      # started, and they must terminate before the primary containers
      # are initialized. Here, we use one to set the correct memory
      # map limit.
      initContainers:
        - name: init-sysctl
          image: busybox
          imagePullPolicy: IfNotPresent
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
      # This final section is the core of the StatefulSet configuration.
      # It defines the container to run in each pod.
      containers:
        - name: crate
          # Use the CrateDB 3.0.5 Docker image.
          image: crate:3.0.5
          # Pass in configuration to CrateDB via command-line options.
          # Notice that we are configuring CrateDB unicast host discovery
          # using the SRV records provided by Kubernetes.
          command:
            - /docker-entrypoint.sh
            - -Ccluster.name=${CLUSTER_NAME}
            - -Cdiscovery.zen.minimum_master_nodes=2
            - -Cdiscovery.zen.hosts_provider=srv
            - -Cdiscovery.srv.query=_crate-internal._tcp.crate-internal-service.${NAMESPACE}.svc.cluster.local
            - -Cgateway.recover_after_nodes=2
            - -Cgateway.expected_nodes=${EXPECTED_NODES}
            - -Cpath.data=/data
          volumeMounts:
            # Mount the `/data` directory as a volume named `data`.
            - mountPath: /data
              name: data
          resources:
            limits:
              # How much memory each pod gets.
              memory: 512Mi
          ports:
            # Port 4300 for inter-node communication.
            - containerPort: 4300
              name: crate-internal
            # Port 4200 for HTTP clients.
            - containerPort: 4200
              name: crate-web
            # Port 5432 for PostgreSQL wire protocol clients.
            - containerPort: 5432
              name: postgres
          # Environment variables passed through to the container.
          env:
            # This variable is detected by CrateDB.
            - name: CRATE_HEAP_SIZE
              value: "256m"
            # The rest of these variables are used in the command-line
            # options.
            - name: EXPECTED_NODES
              value: "3"
            - name: CLUSTER_NAME
              value: "my-crate"
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
      volumes:
        # Use a RAM drive for storage, which is fine for testing, but must
        # not be used for production setups!
        - name: data
          emptyDir:
            medium: "Memory"
Phew! That's a lot. Let's break it down:

- The StatefulSet is named crate, and this name is used as a prefix for all pods in the set, which are named crate-0, crate-1, crate-2, and so on.
- The set's governing service is named crate-set. This set must have three CrateDB pods with a fixed identity and persistent storage.
- All pods get the app:crate label, flagging them for use with the services we defined previously.
- The crate-internal-service service provides SRV records. We use those records to configure CrateDB unicast host discovery via command-line options.

Save the file and then create the controller, like so:
$ kubectl create -f crate-controller.yaml --namespace crate
statefulset.apps/crate created
The StatefulSet controller brings up the CrateDB pods one by one. You can monitor the progress with this command:
$ kubectl get pods --namespace crate
NAME READY STATUS RESTARTS AGE
crate-0 0/1 PodInitializing 0 36s
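Alternatively, you can ask kubectl to stream updates as the pods come up:

$ kubectl get pods --namespace crate --watch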
Either way, keep checking until you see something like this:
$ kubectl get pods --namespace crate
NAME READY STATUS RESTARTS AGE
crate-0 1/1 Running 0 2m
crate-1 1/1 Running 0 1m
crate-2 1/1 Running 0 1m
Once you see this, all three pods are running, and your CrateDB cluster is ready!
Before you can access CrateDB, you need to inspect the running services:
$ kubectl get service --namespace crate
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
crate-external-service LoadBalancer 10.96.227.26 <pending> 4200:31159/TCP,5432:31316/TCP 44m
crate-internal-service ClusterIP 10.101.192.101 <none> 4300/TCP 44m
We're only interested in the crate-external-service service. The PORT(S) column tells us that Kubernetes port 31159 is mapped to CrateDB port 4200 (HTTP) and Kubernetes port 31316 is mapped to CrateDB port 5432 (PostgreSQL wire protocol).
Because of a Minikube quirk, the external IP is always listed as pending. To get around this, we can ask Minikube to list the services instead:
$ minikube service list --namespace crate
|------------|------------------------|--------------------------------|
| NAMESPACE | NAME | URL |
|------------|------------------------|--------------------------------|
| crate      | crate-external-service | http://192.168.99.100:31159    |
|            |                        | http://192.168.99.100:31316    |
| crate      | crate-internal-service | No node port                   |
|------------|------------------------|--------------------------------|
As shown, the crate-external-service service has two ports exposed on 192.168.99.100.
Because of a Minikube issue, for now, the http:// prefix is added regardless of the actual network protocol in use. This happens to be right for the CrateDB HTTP port, but is not correct for the PostgreSQL wire protocol port.
The port we're interested in for this tutorial is the HTTP port, which is 31159. You can verify that this port is open and functioning as expected by issuing a simple HTTP request via the command line:
$ curl 192.168.99.100:31159
{
"ok" : true,
"status" : 200,
"name" : "Regenstein",
"cluster_name" : "my-crate",
"version" : {
"number" : "3.0.5",
"build_hash" : "89703701b45084f7e17705b45d1ce76d95fbc7e5",
"build_timestamp" : "2018-07-31T06:18:44Z",
"build_snapshot" : false,
"es_version" : "6.1.4",
"lucene_version" : "7.1.0"
}
}
Great! This is the HTTP API response we expect.
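Port 31316 maps to CrateDB's PostgreSQL wire protocol port, so if you happen to have psql installed, you can also connect that way, using CrateDB's default crate superuser and the doc schema as the database name (again, your address may differ):

$ psql -h 192.168.99.100 -p 31316 -U crate doc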
Copy and paste the same network address (192.168.99.100:31159 in the example above, but your address is probably different) into your browser.
You should see the CrateDB Admin UI:
Select the Cluster screen from the left-hand navigation menu, and you should see something that looks like this:
Here you can see that our CrateDB cluster has three nodes, as expected.
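You can cross-check the node count from the command line by querying the sys.nodes system table via CrateDB's HTTP endpoint, substituting your own address:

$ curl -sS -H 'Content-Type: application/json' \
    -X POST '192.168.99.100:31159/_sql' \
    -d '{"stmt": "SELECT name FROM sys.nodes"}'

The response should list three node names.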
If this is your first time using CrateDB, you should check out the Getting Started guide for help importing test data and running your first query.
When you're done, you can stop Minikube, like so:
$ minikube stop
Stopping local Kubernetes cluster...
Machine stopped.
In this post, we went over the basics of Kubernetes and created a simple three-node CrateDB cluster with Kubernetes on a single virtual machine.
This setup is a good starting point for a local testing environment. However, we can improve it. An excellent place to start is with storage.
Our CrateDB nodes write their data to a RAM drive, which is no good if we want our data to survive a power cycle.
Part two of this miniseries shows you how to configure non-volatile storage and how to scale your cluster.