What is Kubernetes?

Kubernetes is the operating system for analytic applications in the cloud. Here are best practices for deploying on Kubernetes that every analytic application developer should know. Contact us to find out more. 

Looking for the Kubernetes Operator? Go to the GitHub project.

Kubernetes orchestrates container-based applications

Containers bundle software releases plus their dependencies into self-contained images that are easy to distribute and run. Kubernetes offers a powerful mechanism to define container-based applications and takes care of scheduling containers across fleets of hosts. This model hides much of the operational complexity from developers. Analytic developers should have hands-on experience with the following Kubernetes capabilities.

  • Resource definition – How to define requested compute, storage, and networking requirements for container-based applications. 
  • Scheduling – How to control where containers run. For example, nodeSelector properties can tie containers to particular VM types. 
  • Development, debugging, and monitoring – How to build containers, deploy Kubernetes applications locally using Minikube (or other Kubernetes distributions that can run on dev hosts), and diagnose problems.
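
As a minimal sketch of the first two capabilities, the Python snippet below builds a Pod manifest that requests CPU and memory and uses nodeSelector to tie the Pod to a particular VM type. The Pod name, resource amounts, and instance type are illustrative assumptions, not recommendations. Because kubectl accepts JSON as well as YAML, the printed output can be saved to a file and applied with kubectl apply -f.

```python
import json


def build_pod_manifest(name, image, cpu, memory, instance_type):
    """Return a Pod manifest (as a dict) with resource requests and a nodeSelector."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Scheduling: only run on nodes carrying this well-known label value.
            "nodeSelector": {"node.kubernetes.io/instance-type": instance_type},
            "containers": [
                {
                    "name": name,
                    "image": image,
                    # Resource definition: what the scheduler must reserve.
                    "resources": {
                        "requests": {"cpu": cpu, "memory": memory},
                        "limits": {"memory": memory},
                    },
                }
            ],
        },
    }


# Hypothetical values, chosen only for illustration.
pod = build_pod_manifest(
    name="analytics-demo",
    image="clickhouse/clickhouse-server",
    cpu="2",
    memory="8Gi",
    instance_type="m6i.xlarge",
)

print(json.dumps(pod, indent=2))
```

If no node in the cluster carries the requested label value, the Pod stays in the Pending state, which is often the first scheduling problem developers learn to diagnose.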

These skills are useful for any type of application on Kubernetes, not just analytics. Kubernetes is now the dominant runtime for container-based applications.

CNCF statistics from 2021 estimated that 5.4M developers already had experience using Kubernetes.

Want more resources on Kubernetes?

Check out the Kubernetes Analytic Developer FAQ, Kubernetes Concepts, and the Kubernetes Tutorial.

Managed Kubernetes services simplify operation

Kubernetes is relatively easy to use but hard to operate. Managed Kubernetes services offload most operational tasks, such as installing and upgrading Kubernetes or ensuring availability. The extra cost of running a Kubernetes cluster in a managed service is very low: AWS, GCP, and Azure all start at $0.10 per hour per cluster.

Users are still responsible for the compute, storage, and networking that Kubernetes applications need, but they would have to pay these costs either way. By delegating basic Kubernetes management to the cloud provider, developers can instead focus on building applications and applying Kubernetes features that help drive down overall cloud costs.

Managed Kubernetes is now an established best practice across the entire Kubernetes user community. Datadog statistics show that the growing adoption of managed Kubernetes has enabled users to upgrade more quickly to new Kubernetes versions. Most Altinity customers who run Kubernetes in public clouds also use managed Kubernetes services. 

Kubernetes scales cloud resources dynamically

Kubernetes can automatically allocate compute and storage to meet changing needs of applications. This is a powerful capability that allows developers to dial up application performance for faster response or dial down to reduce costs. Here are two important scaling capabilities of Kubernetes that every developer should understand. 

  • VM provisioning – Managed Kubernetes services can use resource pools to provision additional VMs when more compute and RAM are required, then release them when no longer needed. Each service has its own way to do this. For example, AWS EKS has managed node groups or Karpenter, while GKE has node pools.
  • Storage provisioning – Managed Kubernetes services can similarly scale cloud storage such as EBS volumes. Storage classes allocate storage using calls to CSI drivers, which are responsible for fulfilling the request. Modern CSI drivers can extend storage size and throughput while applications are running.
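
The storage side can be sketched as follows. This Python snippet builds a StorageClass that delegates provisioning to the AWS EBS CSI driver and allows volume expansion, plus a PersistentVolumeClaim that asks for storage from that class. The resource names and sizes are hypothetical; the example assumes the EBS CSI driver is already installed in the cluster.

```python
import json

# StorageClass: tells Kubernetes which CSI driver fulfills storage requests.
storage_class = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "gp3-expandable"},  # hypothetical name
    "provisioner": "ebs.csi.aws.com",        # AWS EBS CSI driver
    "parameters": {"type": "gp3"},
    "allowVolumeExpansion": True,            # permit growing volumes later
    "volumeBindingMode": "WaitForFirstConsumer",
}


def build_pvc(name, storage_class_name, size):
    """Return a PersistentVolumeClaim requesting `size` from the given class."""
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "storageClassName": storage_class_name,
            "resources": {"requests": {"storage": size}},
        },
    }


# Initial claim (hypothetical name and size).
pvc = build_pvc("analytics-data", "gp3-expandable", "100Gi")

# Growing the volume later means re-applying the same claim with a larger size.
resized_pvc = build_pvc("analytics-data", "gp3-expandable", "200Gi")

# A v1 List lets both resources be applied from a single JSON file.
bundle = {"apiVersion": "v1", "kind": "List", "items": [storage_class, pvc]}
print(json.dumps(bundle, indent=2))
```

The application never calls the cloud storage API directly. It only changes the requested size in the claim, and the CSI driver is responsible for carrying out the resize.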

The big benefit for developers is that Kubernetes applications can trigger autoscaling simply by asking for more resources. Scaling happens transparently, and code and scaling techniques are the same across clouds. This portability is one of the most powerful features of Kubernetes for managing data.

Learn more about dynamic cloud resource scaling with Kubernetes:

Autoscaling Compute in Kubernetes | Storage Management for Analytic Applications

Kubernetes is the platform of choice for analytic databases

It was once common to deploy only stateless applications like web servers on Kubernetes. No longer: Kubernetes is now an outstanding platform for data, especially analytic databases. The following capabilities changed the playing field. 

  • Operators – Thanks to the Kubernetes Operator Pattern, most databases now have operators that define database cluster properties in a short resource file and provide critical management capabilities like upgrade, scaling, backup, security, and monitoring. 
  • High performance storage – The Container Storage Interface (CSI) gives Kubernetes applications access to sophisticated cloud storage capabilities, including lower costs, online volume extension, encryption, storage snapshots and many other features. 
  • Independent scaling of compute and storage – Analytic databases often change compute and storage at different times and in different directions. Operators like the Altinity Kubernetes Operator for ClickHouse can do this with minimal or even zero disruption to running database servers. 
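
To give a sense of how short an operator resource file can be, the Python sketch below builds a ClickHouseInstallation resource of the kind consumed by the Altinity Kubernetes Operator for ClickHouse. The installation and cluster names and the shard and replica counts are hypothetical, and the field layout reflects the operator's published examples as we understand them; check the operator documentation for the version you run. It also assumes the operator is already installed in the cluster.

```python
import json


def build_clickhouse_installation(name, cluster, shards, replicas):
    """Return a ClickHouseInstallation resource (as a dict) for the operator."""
    return {
        "apiVersion": "clickhouse.altinity.com/v1",
        "kind": "ClickHouseInstallation",
        "metadata": {"name": name},
        "spec": {
            "configuration": {
                "clusters": [
                    {
                        "name": cluster,
                        # The operator creates the pods needed for this layout.
                        "layout": {
                            "shardsCount": shards,
                            "replicasCount": replicas,
                        },
                    }
                ]
            }
        },
    }


# Hypothetical two-shard cluster without replication.
chi = build_clickhouse_installation("demo", "analytics", shards=2, replicas=1)

# Scaling out is a change to the same resource, not a new deployment procedure.
chi_scaled = build_clickhouse_installation("demo", "analytics", shards=4, replicas=1)

print(json.dumps(chi, indent=2))
```

Running more than one replica per shard typically also requires a ZooKeeper or ClickHouse Keeper configuration, which is omitted here to keep the sketch short.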

Datadog statistics show that databases are now the most common container workload, thanks in large part to growing Kubernetes support for data management. There’s even a Data on Kubernetes community with over 4,000 Slack members that is focused on defining best practices for running databases on Kubernetes.

Altinity helps customers succeed on Kubernetes

Any development team with sufficient time and experience can build efficient analytic stacks using Kubernetes. However, it requires both skill and time to gain the full advantages of performance, portability, and cost efficiency that Kubernetes enables. A good vendor can help free up dev teams to focus on solving problems that bring more value to users. 

Altinity operates Altinity.Cloud, a managed cloud platform for ClickHouse built on Kubernetes that manages hundreds of ClickHouse clusters across AWS, GCP, and Azure. We also support many self-managed ClickHouse users running in Kubernetes. Besides deep experience in Kubernetes itself, we’re also authors of innovative open source software like the Altinity Kubernetes Operator for ClickHouse and the Altinity clickhouse-backup project. We also offer Altinity Stable Builds for ClickHouse.

No matter how you are using Kubernetes, Altinity has software and answers. We can manage ClickHouse fully on Kubernetes in our cloud, manage it in your Kubernetes, or help you manage everything yourself. Ask us how. We’re here to help.