Introducing the Kubernetes Operator for TiDB

August 24, 2018

Learn how to use the TiDB Operator to deploy, monitor, and manage the distributed, MySQL-compatible TiDB database on Kubernetes clusters.

The rise of Kubernetes has significantly simplified the deployment and operation of cloud-native applications. An important part of that experience is the ease of running a cloud-native distributed database like TiDB. TiDB is an open-source, MySQL-compatible “NewSQL” database that supports hybrid transactional and analytical processing (HTAP).

In this tutorial, we’ll discuss how to use the TiDB Operator, a new open-source project by PingCAP that leverages Kubernetes to deploy the entire TiDB Platform and all of its components. The TiDB Operator also lets you monitor a TiDB deployment in a Kubernetes cluster and provides a gateway to administrative tasks.

At this point, it’s perhaps a foregone conclusion that Kubernetes is the de facto orchestration engine of cloud-native applications, the “Linux of the cloud,” as Jim Zemlin, executive director of the Linux Foundation, put it. Kubernetes is not just a mature and useful technology; it also holds strategic value for the IT operations of many large companies. At least 54% of the Fortune 500 were hiring for Kubernetes skills in 2017.

Inspired by the concept and pattern popularized by CoreOS’s Operator Framework, we began building the TiDB Operator roughly a year ago. Back then, Kubernetes was far less stable and feature-rich, so we had to implement a lot of workarounds to make our Operator… well, operate. Given Kubernetes’ dramatic growth in the last year, we refactored our old code to align it with the standards and style of present-day Kubernetes before open-sourcing the TiDB Operator on GitHub.

Kubernetes’ growing popularity has spawned a large ecosystem of cloud-native applications, as evidenced by the large number of cloud-native projects assembled by the Cloud Native Computing Foundation (CNCF). So where does TiDB fit in all this? Most of these applications can be considered stateless, occupying some core parts of any cloud-native architecture—microservices, service meshes, messaging/tracing/monitoring, etc. However, there is also a place for stateful applications (such as a persistent distributed database). That’s where TiDB fits in.

How to use the TiDB Operator

Let’s dive into how to deploy TiDB using Kubernetes on your laptop, though the same steps work on any Kubernetes cluster. Note that this local deployment is only meant to give you a taste of the TiDB Operator for testing and experimentation; it is not for production use. The Operator is still undergoing testing by our team and the open-source community, and we encourage you to participate.

First, let’s do a quick overview of what is in a TiDB cluster and how that fits with the Kubernetes architecture. Every TiDB cluster has three components:

  1. The TiDB stateless SQL layer;
  2. TiKV, a distributed transactional key-value storage layer where data is persisted;
  3. Placement Driver (PD), a metadata cluster that controls TiKV.

In the context of Kubernetes, TiKV and PD maintain database state on disk, so each is mapped to a StatefulSet with a Persistent Volume Claim. The TiDB stateless SQL layer is also mapped to a StatefulSet, but it does not make any Persistent Volume Claims.
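
To make this mapping concrete, here is a minimal Python sketch that builds one StatefulSet manifest per component and attaches a Persistent Volume Claim template only to the stateful ones. It is an illustration of the idea, not what the TiDB Operator actually generates: the cluster name, replica counts, storage sizes, image names, mount path, and the helper functions build_statefulset and build_cluster are all assumptions made for this example.

```python
# Illustrative component table. Replica counts and storage sizes are
# made-up values for this sketch, not TiDB Operator defaults.
COMPONENTS = {
    "pd": {"replicas": 3, "storage": "1Gi"},
    "tikv": {"replicas": 3, "storage": "10Gi"},
    "tidb": {"replicas": 2, "storage": None},  # stateless: no PVC
}


def build_statefulset(cluster, component, replicas, storage=None):
    """Return a minimal StatefulSet manifest as a plain dict (hypothetical helper)."""
    name = f"{cluster}-{component}"
    labels = {"app": name}
    # Image name is an assumption for illustration only.
    container = {"name": component, "image": f"pingcap/{component}"}
    manifest = {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [container]},
            },
        },
    }
    if storage is not None:
        # Stateful components (PD, TiKV) get one Persistent Volume Claim per pod.
        container["volumeMounts"] = [{"name": "data", "mountPath": "/data"}]
        manifest["spec"]["volumeClaimTemplates"] = [
            {
                "metadata": {"name": "data"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            }
        ]
    return manifest


def build_cluster(cluster):
    """Build manifests for all three components of one cluster (hypothetical helper)."""
    return {
        component: build_statefulset(cluster, component, **settings)
        for component, settings in COMPONENTS.items()
    }


if __name__ == "__main__":
    for component, manifest in build_cluster("demo").items():
        has_pvc = "volumeClaimTemplates" in manifest["spec"]
        print(f"{component}: StatefulSet {manifest['metadata']['name']}, PVC: {has_pvc}")
```

All three components end up as StatefulSets, but only PD and TiKV carry a volumeClaimTemplates entry, which is the distinction the paragraph above describes.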