A Step-by-Step Guide to Building and Deploying a Kubernetes Operator in Go
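A Kubernetes Operator is a way to extend the Kubernetes API so that deploying, configuring, and managing an application can be automated. It is built from custom resources, declared through Custom Resource Definitions (CRDs), and a custom controller. This article walks you through developing a simple Kubernetes Operator in Go and deploying it to a Kubernetes cluster.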
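1. Prerequisites
Go environment: make sure Go is installed (Go 1.16+).
Kubernetes cluster: you need a working Kubernetes cluster. Minikube, Kind, or any other Kubernetes distribution will do.
kubectl: make sure the kubectl command-line tool is installed and configured for that cluster.
Operator SDK: Operator SDK is a framework for building Kubernetes Operators. One way to install it is the command below; if it fails in your environment, the Operator SDK installation guide also describes installing a release binary: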
go install github.com/operator-framework/operator-sdk/cmd/operator-sdk@latest
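2. Create the Operator project
Use Operator SDK to create a new Operator project:
operator-sdk init --domain=example.com --repo=github.com/your-username/my-operator
--domain: the domain for the Operator's API groups. example.com is used here as a placeholder; replace it with your own domain.
--repo: the Git repository path of the Operator project, which is also used as the Go module path. github.com/your-username/my-operator is a placeholder; replace it with your own repository.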
3. Define the Custom Resource Definition (CRD)
A CRD declares a new resource type that extends the Kubernetes API. We will create a CRD named AppService that describes a simple application service.
operator-sdk create api --group=app --version=v1alpha1 --kind=AppService --resource=true --controller=true
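--group: the CRD's group. Here it is app.
--version: the CRD's version. Here it is v1alpha1.
--kind: the CRD's kind. Here it is AppService.
--resource: whether to generate the resource (API type) files. Here it is true.
--controller: whether to generate the controller files. Here it is true.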
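The flags above combine into the identifiers that appear later in RBAC markers and manifests. The following is a minimal sketch of that composition, assuming the full API group is the --group value joined to the --domain value with a dot. The GroupVersionKind type and fullGroup helper are hypothetical names for illustration, not types from k8s.io/apimachinery.

```go
package main

import "fmt"

// GroupVersionKind is a hypothetical stand-in for the identifiers produced
// by the scaffolding flags; it is not the apimachinery type of the same name.
type GroupVersionKind struct {
	Group   string // full API group, e.g. "app.example.com"
	Version string
	Kind    string
}

// fullGroup joins the --group value with the --domain value given to init.
func fullGroup(group, domain string) string {
	if domain == "" {
		return group
	}
	return group + "." + domain
}

// APIVersion returns the string used in the apiVersion field of a manifest.
func (g GroupVersionKind) APIVersion() string {
	return g.Group + "/" + g.Version
}

func main() {
	gvk := GroupVersionKind{
		Group:   fullGroup("app", "example.com"),
		Version: "v1alpha1",
		Kind:    "AppService",
	}
	fmt.Println(gvk.APIVersion(), gvk.Kind) // prints "app.example.com/v1alpha1 AppService"
}
```

This is why the manifest in section 6 uses apiVersion: app.example.com/v1alpha1 and the RBAC markers in the controller use groups=app.example.com.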
After running the command above, Operator SDK generates the CRD type definitions and the controller scaffolding. You define the Spec and Status of the AppService resource in api/v1alpha1/appservice_types.go.
package v1alpha1

import (
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// AppServiceSpec defines the desired state of AppService
type AppServiceSpec struct {
	// Size is the desired number of replicas of the AppService
	Size int32 `json:"size"`
	// Image is the Docker image to use for the AppService
	Image string `json:"image"`
}

// AppServiceStatus defines the observed state of AppService
type AppServiceStatus struct {
	// Nodes holds the names of the Pods that run the AppService
	// (the controller fills it with Pod names, not cluster node names)
	Nodes []string `json:"nodes"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status
// AppService is the Schema for the appservices API
type AppService struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   AppServiceSpec   `json:"spec,omitempty"`
	Status AppServiceStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true
// AppServiceList contains a list of AppService
type AppServiceList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []AppService `json:"items"`
}

func init() {
	SchemeBuilder.Register(&AppService{}, &AppServiceList{})
}
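AppServiceSpec defines the desired state of an AppService resource: Size (the number of replicas) and Image (the Docker image). AppServiceStatus defines the observed state of an AppService resource: Nodes, which despite its name is filled by the controller with the names of the Pods running the AppService.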
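The struct tags are what connect these Go fields to the size and image keys a user writes in a manifest. As a rough illustration, the sketch below decodes the JSON form of a spec using only the standard library. AppServiceSpec is re-declared locally so the example compiles without the generated API package, and decodeSpec is a hypothetical helper, not part of the scaffolding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// AppServiceSpec mirrors the spec fields from the article so that this
// example is self-contained.
type AppServiceSpec struct {
	Size  int32  `json:"size"`
	Image string `json:"image"`
}

// decodeSpec parses the JSON representation of a spec. It is a hypothetical
// helper for illustration only.
func decodeSpec(raw []byte) (AppServiceSpec, error) {
	var spec AppServiceSpec
	err := json.Unmarshal(raw, &spec)
	return spec, err
}

func main() {
	spec, err := decodeSpec([]byte(`{"size": 3, "image": "nginx:latest"}`))
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("size=%d image=%s\n", spec.Size, spec.Image) // prints "size=3 image=nginx:latest"

	// Encoding uses the same tags, in struct field order.
	out, err := json.Marshal(spec)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(string(out)) // prints {"size":3,"image":"nginx:latest"}
}
```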
4. Implement the controller logic
The controller reconciles AppService resources, making sure the actual state matches the desired state. You implement this in the controller's Reconcile method in controllers/appservice_controller.go.
package controllers

import (
	"context"
	"reflect" // needed for reflect.DeepEqual in Reconcile

	appv1alpha1 "github.com/your-username/my-operator/api/v1alpha1"
	appsv1 "k8s.io/api/apps/v1"
	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/api/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/apimachinery/pkg/types"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/log"
)
// AppServiceReconciler reconciles an AppService object
type AppServiceReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}
//+kubebuilder:rbac:groups=app.example.com,resources=appservices,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=app.example.com,resources=appservices/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=app.example.com,resources=appservices/finalizers,verbs=update
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch
// Reconcile is part of the main kubernetes reconciliation loop which aims to
// move the current state of the cluster closer to the desired state.
// TODO(user): Modify the Reconcile function to compare the state specified by
// the AppService object against the actual cluster state, and then
// perform operations to make the cluster state reflect the state specified by
// the user.
//
// For more details, check Reconcile and its Result here:
// - https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.11.0/pkg/reconcile
func (r *AppServiceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)
	// 1. Fetch the AppService instance
	appService := &appv1alpha1.AppService{}
	err := r.Get(ctx, req.NamespacedName, appService)
	if err != nil {
		if errors.IsNotFound(err) {
			// AppService not found, could have been deleted after reconcile request.
			// Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
			// Return and don't requeue
			log.Info("AppService resource not found. Ignoring since object must be deleted")
			return ctrl.Result{}, nil
		}
		// Error reading the object - requeue the request.
		log.Error(err, "Failed to get AppService")
		return ctrl.Result{}, err
	}
	// 2. Define a new Deployment object
	deployment := r.deploymentForAppService(appService)
	// 3. Check if the Deployment already exists, if not create a new one
	found := &appsv1.Deployment{}
	err = r.Get(ctx, types.NamespacedName{Name: deployment.Name, Namespace: deployment.Namespace}, found)
	if err != nil {
		if errors.IsNotFound(err) {
			log.Info("Creating a new Deployment", "Deployment.Namespace", deployment.Namespace, "Deployment.Name", deployment.Name)
			err = r.Create(ctx, deployment)
			if err != nil {
				log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", deployment.Namespace, "Deployment.Name", deployment.Name)
				return ctrl.Result{}, err
			}
			// Deployment created successfully - return and requeue
			return ctrl.Result{Requeue: true}, nil
		}
		log.Error(err, "Failed to get Deployment")
		return ctrl.Result{}, err
	}
	// 4. Ensure the deployment size is the same as the spec.
	// Compare against the Deployment read from the cluster (found), not the
	// freshly built desired object, whose replica count always equals the spec.
	size := appService.Spec.Size
	if found.Spec.Replicas == nil || *found.Spec.Replicas != size {
		log.Info("Updating Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
		found.Spec.Replicas = &size
		err = r.Update(ctx, found)
		if err != nil {
			log.Error(err, "Failed to update Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
			return ctrl.Result{}, err
		}
		// Spec updated - return and requeue
		return ctrl.Result{Requeue: true}, nil
	}
	// 5. Update the AppService status with the pod names
	podList := &corev1.PodList{}
	listOpts := []client.ListOption{
		client.InNamespace(req.Namespace),
		client.MatchingLabels(map[string]string{"app": appService.Name}),
	}
	if err = r.List(ctx, podList, listOpts...); err != nil {
		log.Error(err, "Failed to list pods", "AppService.Namespace", appService.Namespace, "AppService.Name", appService.Name)
		return ctrl.Result{}, err
	}
	podNames := getPodNames(podList.Items)
	// Only write the status when the set of pod names actually changed
	if !reflect.DeepEqual(podNames, appService.Status.Nodes) {
		appService.Status.Nodes = podNames
		err := r.Status().Update(ctx, appService)
		if err != nil {
			log.Error(err, "Failed to update AppService status")
			return ctrl.Result{}, err
		}
	}
	return ctrl.Result{}, nil
}
// deploymentForAppService returns a deployment object for the AppService.
func (r *AppServiceReconciler) deploymentForAppService(appService *appv1alpha1.AppService) *appsv1.Deployment {
	ls := labelsForAppService(appService.Name)
	replicas := appService.Spec.Size
	deployment := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      appService.Name,
			Namespace: appService.Namespace,
			Labels:    ls,
		},
		Spec: appsv1.DeploymentSpec{
			Replicas: &replicas,
			Selector: &metav1.LabelSelector{
				MatchLabels: ls,
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: ls,
				},
				Spec: corev1.PodSpec{
					Containers: []corev1.Container{{
						Image:           appService.Spec.Image,
						Name:            "app-service",
						ImagePullPolicy: corev1.PullIfNotPresent,
					}},
				},
			},
		},
	}
	// Set AppService instance as the owner and controller.
	// The returned error is not handled in this example.
	ctrl.SetControllerReference(appService, deployment, r.Scheme)
	return deployment
}
// labelsForAppService returns the labels for selecting the resources
// belonging to the given AppService CR name.
func labelsForAppService(name string) map[string]string {
	return map[string]string{"app": name}
}

// getPodNames returns the pod names of the array of pods passed in
func getPodNames(pods []corev1.Pod) []string {
	var podNames []string
	for _, pod := range pods {
		podNames = append(podNames, pod.Name)
	}
	return podNames
}

// SetupWithManager sets up the controller with the Manager.
func (r *AppServiceReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&appv1alpha1.AppService{}).
		Owns(&appsv1.Deployment{}).
		Complete(r)
}
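This Reconcile method implements the following logic:
- Fetch the AppService instance: look up the AppService resource by the namespace and name in the request.
- Define the Deployment object: build a Deployment that runs the application described by the AppService resource.
- Check whether the Deployment exists: if it does not, create it.
- Ensure the Deployment size matches the Spec: if the Deployment's replica count differs from the AppService resource's Spec.Size, update the Deployment.
- Update the AppService status: record the Pod names in the AppService resource's status.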
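The shape of this loop can be tried without a cluster. The sketch below is a simplified, in-memory illustration of the create-or-scale part of the steps above, not the controller itself. The fakeCluster type, the desired struct, and the returned action strings are all hypothetical names introduced for the example.

```go
package main

import "fmt"

// desired is a hypothetical, simplified view of AppService.Spec.
type desired struct {
	Size  int32
	Image string
}

// fakeCluster is an in-memory stand-in for the API server. It maps a
// Deployment name to its replica count and exists only for this sketch.
type fakeCluster struct {
	replicas map[string]int32
}

// reconcile performs at most one corrective action per call and reports
// what it did, similar to how the controller returns after each change.
func reconcile(c *fakeCluster, name string, want desired) string {
	got, exists := c.replicas[name]
	switch {
	case !exists:
		c.replicas[name] = want.Size
		return "created"
	case got != want.Size:
		c.replicas[name] = want.Size
		return "scaled"
	default:
		return "in sync"
	}
}

func main() {
	c := &fakeCluster{replicas: map[string]int32{}}
	want := desired{Size: 3, Image: "nginx:latest"}

	fmt.Println(reconcile(c, "my-appservice", want)) // prints "created"
	fmt.Println(reconcile(c, "my-appservice", want)) // prints "in sync"

	want.Size = 5 // the user edits spec.size
	fmt.Println(reconcile(c, "my-appservice", want)) // prints "scaled"
	fmt.Println(reconcile(c, "my-appservice", want)) // prints "in sync"
}
```

The point of the design is that calling reconcile repeatedly with the same desired state is harmless: once the observed state matches, further calls change nothing.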
5. Build and deploy the Operator
Build the Operator image:
make docker-build IMG=your-docker-registry/my-operator:v0.0.1
Replace your-docker-registry/my-operator:v0.0.1 with your own Docker image repository address and tag.
Push the Operator image to the Docker registry:
docker push your-docker-registry/my-operator:v0.0.1
Deploy the CRD:
make install
This command installs the CRD into the Kubernetes cluster.
Deploy the controller:
make deploy IMG=your-docker-registry/my-operator:v0.0.1
This command deploys the controller to the Kubernetes cluster.
6. Create an AppService instance
Create a YAML file named appservice.yaml that defines an AppService instance:
apiVersion: app.example.com/v1alpha1
kind: AppService
metadata:
  name: my-appservice
spec:
  size: 3
  image: nginx:latest
size: sets the number of application replicas to 3.
image: sets the Docker image to nginx:latest.
Use kubectl to create the AppService instance:
kubectl apply -f appservice.yaml
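7. Verify the Operator
Use kubectl to check that the Deployment was created:
kubectl get deployments
You should see a Deployment named my-appservice. You can also use kubectl to check that the Pods were created:
kubectl get pods
You should see 3 nginx Pods running. You can also use kubectl to read the status of the AppService resource:
kubectl get appservices my-appservice -o yaml
You should see that the Status field of the AppService resource has been updated: its nodes list contains the names of the Pods running the AppService.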
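The names in that status come from selecting Pods by the app label and writing the status only when the list changed. The sketch below illustrates that selection and comparison with standard-library types only. The pod struct, the matchingPodNames and statusNeedsUpdate helpers, and the Pod names are hypothetical. It also sorts the names before comparing, which is an addition not present in the controller above, on the assumption that the order of a list result should not be relied on.

```go
package main

import (
	"fmt"
	"reflect"
	"sort"
)

// pod is a hypothetical, trimmed-down stand-in for corev1.Pod.
type pod struct {
	Name   string
	Labels map[string]string
}

// matchingPodNames returns the sorted names of the pods whose labels
// contain every key/value pair in selector.
func matchingPodNames(pods []pod, selector map[string]string) []string {
	names := []string{}
	for _, p := range pods {
		matched := true
		for k, v := range selector {
			if got, ok := p.Labels[k]; !ok || got != v {
				matched = false
				break
			}
		}
		if matched {
			names = append(names, p.Name)
		}
	}
	sort.Strings(names)
	return names
}

// statusNeedsUpdate reports whether the recorded names differ from the
// observed ones, using the same reflect.DeepEqual check as the controller.
func statusNeedsUpdate(recorded, observed []string) bool {
	return !reflect.DeepEqual(recorded, observed)
}

func main() {
	pods := []pod{
		{Name: "my-appservice-b", Labels: map[string]string{"app": "my-appservice"}},
		{Name: "other", Labels: map[string]string{"app": "something-else"}},
		{Name: "my-appservice-a", Labels: map[string]string{"app": "my-appservice"}},
	}
	observed := matchingPodNames(pods, map[string]string{"app": "my-appservice"})
	fmt.Println(observed)                              // prints [my-appservice-a my-appservice-b]
	fmt.Println(statusNeedsUpdate(nil, observed))      // prints true
	fmt.Println(statusNeedsUpdate(observed, observed)) // prints false
}
```

Note that reflect.DeepEqual treats a nil slice and an empty non-nil slice as different, so a comparison like this can report a change when no Pods exist yet.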
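8. Summary
This article showed how to develop a simple Kubernetes Operator in Go and deploy it to a Kubernetes cluster. With this example you should have a basic understanding of the Operator development workflow. You can extend this Operator to fit your own needs and implement more complex behavior.
9. Complete code
The complete code can be found in the following GitHub repository:
https://github.com/your-username/my-operator (replace your-username with your GitHub username)
Note: the code above is only an example and may need to be adapted to your own situation.
I hope this article helps you get started with Kubernetes Operator development!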