
A Hands-On Guide to Building and Deploying a Kubernetes Operator with Go


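A Kubernetes Operator is a way of extending the Kubernetes API to automate the deployment, configuration, and management of an application. It is built from custom resources, whose types are declared by Custom Resource Definitions (CRDs), and a custom controller that acts on those resources. This article walks you through developing a simple Kubernetes Operator in Go and deploying it to a Kubernetes cluster.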

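1. Prerequisites

  • Go: Make sure Go is installed (Go 1.16+; newer Operator SDK releases may require a more recent Go version).

  • Kubernetes cluster: You need a working Kubernetes cluster. Minikube, Kind, or any other Kubernetes distribution will do.

  • kubectl: Make sure the kubectl command-line tool is installed and configured.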

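  • Operator SDK: Operator SDK is a framework for building Kubernetes Operators. You can install it with the following command; if go install fails for the release you want (the Operator SDK module may declare replace directives, which go install rejects), use a release binary or Homebrew as described in the official installation guide: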

    go install github.com/operator-framework/operator-sdk/cmd/operator-sdk@latest
    

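2. Create the Operator Project

Use Operator SDK to create a new Operator project:

operator-sdk init --domain=example.com --repo=github.com/your-username/my-operator
  • --domain: the domain used as the suffix of the Operator's API groups. example.com is only an example; replace it with your own domain.
  • --repo: the Go module path of the project, usually its Git repository address. github.com/your-username/my-operator is only an example; replace it with your own repository address.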

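3. Define the Custom Resource Definition (CRD)

A CRD declares a new resource type that extends the Kubernetes API. We will create a CRD for a kind named AppService, which describes a simple application service.

operator-sdk create api --group=app --version=v1alpha1 --kind=AppService --resource=true --controller=true
  • --group: the API group of the CRD. Here it is app, which is combined with the domain to form app.example.com.
  • --version: the API version of the CRD. Here it is v1alpha1.
  • --kind: the Kind of the CRD. Here it is AppService.
  • --resource: whether to generate the resource (API type) files. Here it is set to true.
  • --controller: whether to generate the controller files. Here it is set to true.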
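
To make the relationship between these flags and the identifiers used later in this article concrete, here is a minimal sketch of how the group, domain, and version combine. The helper names (apiGroup, apiVersion, crdName) are hypothetical and are not part of Operator SDK; the plural resource name is passed in explicitly rather than derived.

```go
package main

import "fmt"

// apiGroup joins the --group and --domain values into the full API group.
func apiGroup(group, domain string) string {
    return group + "." + domain
}

// apiVersion builds the apiVersion string that manifests for this group use.
func apiVersion(group, domain, version string) string {
    return apiGroup(group, domain) + "/" + version
}

// crdName builds the name of the CRD object, which has the form <plural>.<group>.
// The plural is supplied by the caller; this sketch does not try to pluralize.
func crdName(plural, group, domain string) string {
    return plural + "." + apiGroup(group, domain)
}

func main() {
    fmt.Println(apiVersion("app", "example.com", "v1alpha1"))
    fmt.Println(crdName("appservices", "app", "example.com"))
}
```

With the flag values used in this article, the first line printed is the apiVersion that appears in the AppService manifest, and the group part matches the one in the controller's RBAC markers.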

After the command finishes, Operator SDK will have generated the CRD type definitions and the scaffolding for the Controller. You define the Spec and Status of the AppService resource in api/v1alpha1/appservice_types.go.

package v1alpha1

import (
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// AppServiceSpec defines the desired state of AppService
type AppServiceSpec struct {
    // Size is the desired size of the AppService
    Size int32 `json:"size"`
    // Image is the Docker image to use for the AppService
    Image string `json:"image"`
}

// AppServiceStatus defines the observed state of AppService
type AppServiceStatus struct {
    // Nodes are the names of the pods that back the AppService
    Nodes []string `json:"nodes"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// AppService is the Schema for the appservices API
type AppService struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   AppServiceSpec   `json:"spec,omitempty"`
    Status AppServiceStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// AppServiceList contains a list of AppService
type AppServiceList struct {
    metav1.TypeMeta `json:",inline"`
    metav1.ListMeta `json:"metadata,omitempty"`
    Items           []AppService `json:"items"`
}

func init() {
    SchemeBuilder.Register(&AppService{}, &AppServiceList{})
}
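  • AppServiceSpec defines the desired state of an AppService resource: Size (the number of replicas) and Image (the container image).
  • AppServiceStatus defines the observed state of an AppService resource: Nodes (despite the field name, the controller below fills it with the names of the Pods that back the AppService).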
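
The json struct tags decide which keys these fields use when the resource is serialized. The following sketch shows that mapping with only the standard library. It uses simplified local copies of the two types (an assumption made for illustration; the real types live in the generated api/v1alpha1 package), and toJSON is a hypothetical helper.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// AppServiceSpec is a simplified local copy of the spec type.
type AppServiceSpec struct {
    Size  int32  `json:"size"`
    Image string `json:"image"`
}

// AppServiceStatus is a simplified local copy of the status type.
type AppServiceStatus struct {
    Nodes []string `json:"nodes"`
}

// toJSON marshals a value and returns the result as a string.
func toJSON(v interface{}) (string, error) {
    out, err := json.Marshal(v)
    return string(out), err
}

func main() {
    spec, err := toJSON(AppServiceSpec{Size: 3, Image: "nginx:latest"})
    if err != nil {
        panic(err)
    }
    fmt.Println(spec) // {"size":3,"image":"nginx:latest"}

    status, err := toJSON(AppServiceStatus{Nodes: []string{"pod-a", "pod-b"}})
    if err != nil {
        panic(err)
    }
    fmt.Println(status) // {"nodes":["pod-a","pod-b"]}
}
```

The keys size, image, and nodes are the ones you write and read in the YAML form of the resource.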

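4. Implement the Controller Logic

The Controller reconciles AppService resources: it makes sure that the actual state of the cluster matches the desired state. You implement this in the Controller's Reconcile method in controllers/appservice_controller.go (newer Operator SDK versions may scaffold this file under internal/controller/ instead).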

package controllers

import (
    "context"
    "reflect"

    appv1alpha1 "github.com/your-username/my-operator/api/v1alpha1"
    appsv1 "k8s.io/api/apps/v1"
    corev1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/errors"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
    "k8s.io/apimachinery/pkg/runtime"
    "k8s.io/apimachinery/pkg/types"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
    "sigs.k8s.io/controller-runtime/pkg/log"
)

// AppServiceReconciler reconciles an AppService object
type AppServiceReconciler struct {
    client.Client
    Scheme *runtime.Scheme
}

//+kubebuilder:rbac:groups=app.example.com,resources=appservices,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=app.example.com,resources=appservices/status,verbs=get;update;patch
//+kubebuilder:rbac:groups=app.example.com,resources=appservices/finalizers,verbs=update
//+kubebuilder:rbac:groups=apps,resources=deployments,verbs=get;list;watch;create;update;patch;delete
//+kubebuilder:rbac:groups=core,resources=pods,verbs=get;list;watch

// Reconcile is part of the main kubernetes reconciliation loop which aims to
// move the current state of the cluster closer to the desired state.
// TODO(user): Modify the Reconcile function to compare the state specified by
// the AppService object against the actual cluster state, and then
// perform operations to make the cluster state reflect the state specified by
// the user.
//
// For more details, check Reconcile and its Result here:
// - https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.11.0/pkg/reconcile
func (r *AppServiceReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    log := log.FromContext(ctx)

    // 1. Fetch the AppService instance
    appService := &appv1alpha1.AppService{}
    err := r.Get(ctx, req.NamespacedName, appService)
    if err != nil {
        if errors.IsNotFound(err) {
            // AppService not found, could have been deleted after reconcile request.
            // Owned objects are automatically garbage collected. For additional cleanup logic use finalizers.
            // Return and don't requeue
            log.Info("AppService resource not found. Ignoring since object must be deleted")
            return ctrl.Result{}, nil
        }
        // Error reading the object - requeue the request.
        log.Error(err, "Failed to get AppService")
        return ctrl.Result{}, err
    }

    // 2. Define the desired Deployment object
    deployment := r.deploymentForAppService(appService)

    // 3. Check if the Deployment already exists, if not create a new one.
    // The live object is read into found so that step 4 can compare against it.
    found := &appsv1.Deployment{}
    err = r.Get(ctx, types.NamespacedName{Name: deployment.Name, Namespace: deployment.Namespace}, found)
    if err != nil {
        if errors.IsNotFound(err) {
            log.Info("Creating a new Deployment", "Deployment.Namespace", deployment.Namespace, "Deployment.Name", deployment.Name)
            err = r.Create(ctx, deployment)
            if err != nil {
                log.Error(err, "Failed to create new Deployment", "Deployment.Namespace", deployment.Namespace, "Deployment.Name", deployment.Name)
                return ctrl.Result{}, err
            }
            // Deployment created successfully - return and requeue
            return ctrl.Result{Requeue: true}, nil
        }
        log.Error(err, "Failed to get Deployment")
        return ctrl.Result{}, err
    }

    // 4. Ensure the live Deployment's size is the same as the spec
    size := appService.Spec.Size
    if found.Spec.Replicas == nil || *found.Spec.Replicas != size {
        log.Info("Updating Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
        found.Spec.Replicas = &size
        err = r.Update(ctx, found)
        if err != nil {
            log.Error(err, "Failed to update Deployment", "Deployment.Namespace", found.Namespace, "Deployment.Name", found.Name)
            return ctrl.Result{}, err
        }
        // Spec updated - return and requeue
        return ctrl.Result{Requeue: true}, nil
    }

    // 5. Update the AppService status with the pod names
    podList := &corev1.PodList{}
    listOpts := []client.ListOption{
        client.InNamespace(req.Namespace),
        client.MatchingLabels(map[string]string{"app": appService.Name}),
    }
    if err = r.List(ctx, podList, listOpts...); err != nil {
        log.Error(err, "Failed to list pods", "AppService.Namespace", appService.Namespace, "AppService.Name", appService.Name)
        return ctrl.Result{}, err
    }
    podNames := getPodNames(podList.Items)
    if !reflect.DeepEqual(podNames, appService.Status.Nodes) {
        appService.Status.Nodes = podNames
        err := r.Status().Update(ctx, appService)
        if err != nil {
            log.Error(err, "Failed to update AppService status")
            return ctrl.Result{}, err
        }
    }

    return ctrl.Result{}, nil
}

// deploymentForAppService returns a deployment object for the AppService.
func (r *AppServiceReconciler) deploymentForAppService(appService *appv1alpha1.AppService) *appsv1.Deployment {
    ls := labelsForAppService(appService.Name)
    replicas := appService.Spec.Size

    deployment := &appsv1.Deployment{
        ObjectMeta: metav1.ObjectMeta{
            Name:      appService.Name,
            Namespace: appService.Namespace,
            Labels:    ls,
        },
        Spec: appsv1.DeploymentSpec{
            Replicas: &replicas,
            Selector: &metav1.LabelSelector{
                MatchLabels: ls,
            },
            Template: corev1.PodTemplateSpec{
                ObjectMeta: metav1.ObjectMeta{
                    Labels: ls,
                },
                Spec: corev1.PodSpec{
                    Containers: []corev1.Container{{
                        Image:           appService.Spec.Image,
                        Name:            "app-service",
                        ImagePullPolicy: corev1.PullIfNotPresent,
                    }},
                },
            },
        },
    }
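    // Set AppService instance as the owner and controller so that the Deployment
    // is garbage collected together with it. The returned error is discarded here
    // to keep the example short; real code should return it to the caller.
    _ = ctrl.SetControllerReference(appService, deployment, r.Scheme)
    return deployment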
}

// labelsForAppService returns the labels for selecting the resources
// belonging to the given AppService CR name.
func labelsForAppService(name string) map[string]string {
    return map[string]string{"app": name}
}

// getPodNames returns the pod names of the array of pods passed in
func getPodNames(pods []corev1.Pod) []string {
    var podNames []string
    for _, pod := range pods {
        podNames = append(podNames, pod.Name)
    }
    return podNames
}

// SetupWithManager sets up the controller with the Manager.
func (r *AppServiceReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).For(&appv1alpha1.AppService{}).Owns(&appsv1.Deployment{}).Complete(r)
}

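The Reconcile method implements the following logic:

  1. Fetch the AppService instance: look up the AppService resource by the namespace and name in the request.
  2. Define the Deployment object: build the Deployment that will run the application described by the AppService resource.
  3. Check whether the Deployment exists: if it does not, create it.
  4. Ensure the Deployment size matches the Spec: if the replica count of the Deployment in the cluster differs from Spec.Size of the AppService resource, update the Deployment.
  5. Update the AppService status: record the names of the Pods in the status of the AppService resource.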
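
Steps 3 and 4 boil down to a decision made from two inputs: the desired size and what was actually observed in the cluster. The sketch below isolates that decision as a pure function so it can be read and tested without a cluster. The names action, observed, and decide are hypothetical and exist only for this illustration; the important point is that the comparison uses the replica count read from the cluster, not the one in the freshly built Deployment object.

```go
package main

import "fmt"

// action names the step the reconciler should take next.
type action string

const (
    actionCreate action = "create"
    actionScale  action = "scale"
    actionNone   action = "none"
)

// observed is a stripped-down stand-in for the Deployment found in the cluster.
type observed struct {
    exists   bool
    replicas int32
}

// decide compares the desired size with the observed state and returns
// the single action needed to move the cluster toward the desired state.
func decide(desiredSize int32, obs observed) action {
    switch {
    case !obs.exists:
        return actionCreate
    case obs.replicas != desiredSize:
        return actionScale
    default:
        return actionNone
    }
}

func main() {
    fmt.Println(decide(3, observed{}))                          // create
    fmt.Println(decide(3, observed{exists: true, replicas: 1})) // scale
    fmt.Println(decide(3, observed{exists: true, replicas: 3})) // none
}
```

Because the function looks only at the current state and not at what changed, running it repeatedly is harmless: once the cluster matches the spec it keeps returning none.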

5. Build and Deploy the Operator

  1. Build the Operator image:

    make docker-build IMG=your-docker-registry/my-operator:v0.0.1
    
    • Replace your-docker-registry/my-operator:v0.0.1 with your own image registry address and tag.
  2. Push the Operator image to your image registry:

    docker push your-docker-registry/my-operator:v0.0.1
    
  3. Install the CRD:

    make install
    

    This command installs the CRD into the Kubernetes cluster.

  4. Deploy the Controller:

    make deploy IMG=your-docker-registry/my-operator:v0.0.1
    

    This command deploys the Controller to the Kubernetes cluster.

6. Create an AppService Instance

Create a YAML file named appservice.yaml that defines an AppService instance:

apiVersion: app.example.com/v1alpha1
kind: AppService
metadata:
  name: my-appservice
spec:
  size: 3
  image: nginx:latest
  • size: sets the number of application replicas to 3.
  • image: sets the container image to nginx:latest.
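
If you want to sanity-check a manifest before applying it, the fields above can be decoded and verified in a few lines. The sketch below is only an illustration under some assumptions: it reads the JSON form of the manifest (kubectl accepts JSON as well as YAML, and Go's standard library has no YAML parser), and the manifest type, the parseManifest function, and the specific checks are hypothetical rather than anything Operator SDK generates.

```go
package main

import (
    "encoding/json"
    "errors"
    "fmt"
)

// manifest mirrors only the fields used in this article's example.
type manifest struct {
    APIVersion string `json:"apiVersion"`
    Kind       string `json:"kind"`
    Metadata   struct {
        Name string `json:"name"`
    } `json:"metadata"`
    Spec struct {
        Size  int32  `json:"size"`
        Image string `json:"image"`
    } `json:"spec"`
}

// parseManifest decodes a JSON manifest and applies a few basic checks.
func parseManifest(data []byte) (manifest, error) {
    var m manifest
    if err := json.Unmarshal(data, &m); err != nil {
        return m, fmt.Errorf("decode manifest: %w", err)
    }
    switch {
    case m.Kind != "AppService":
        return m, fmt.Errorf("unexpected kind %q", m.Kind)
    case m.Metadata.Name == "":
        return m, errors.New("metadata.name is required")
    case m.Spec.Size < 0:
        return m, errors.New("spec.size must not be negative")
    case m.Spec.Image == "":
        return m, errors.New("spec.image is required")
    }
    return m, nil
}

func main() {
    data := []byte(`{
        "apiVersion": "app.example.com/v1alpha1",
        "kind": "AppService",
        "metadata": {"name": "my-appservice"},
        "spec": {"size": 3, "image": "nginx:latest"}
    }`)
    m, err := parseManifest(data)
    if err != nil {
        panic(err)
    }
    fmt.Println(m.Metadata.Name, m.Spec.Size, m.Spec.Image)
}
```

In a real project, checks like these are more commonly expressed as validation rules on the CRD itself, so that the API server rejects invalid objects.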

Use kubectl to create the AppService instance:

kubectl apply -f appservice.yaml

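7. Verify the Operator

Use kubectl to check that the Deployment was created:

kubectl get deployments

You should see a Deployment named my-appservice. You can also use kubectl to check that the Pods were created:

kubectl get pods

You should see 3 nginx Pods running. Finally, you can use kubectl to inspect the status of the AppService resource:

kubectl get appservices my-appservice -o yaml

You should see that the Status field of the AppService resource has been updated and lists the names of the Pods that back the AppService (the field is called nodes, but the controller stores Pod names in it).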
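
The status you see here is the result of step 5 of Reconcile: select the Pods by label, collect their names, and write them to the status only when they changed. The sketch below reproduces that logic with plain Go types so it can run without a cluster. The pod type and the helpers matches, sortedNames, and needsStatusUpdate are hypothetical stand-ins, and sorting the names and treating nil and empty lists as equal are suggestions of this sketch, not something the controller above does.

```go
package main

import (
    "fmt"
    "reflect"
    "sort"
)

// pod is a stripped-down stand-in for corev1.Pod.
type pod struct {
    name   string
    labels map[string]string
}

// matches reports whether every key/value pair in selector is present in labels.
func matches(selector, labels map[string]string) bool {
    for k, want := range selector {
        if got, ok := labels[k]; !ok || got != want {
            return false
        }
    }
    return true
}

// sortedNames returns the sorted names of the pods that match selector.
// It always returns a non-nil slice.
func sortedNames(pods []pod, selector map[string]string) []string {
    names := []string{}
    for _, p := range pods {
        if matches(selector, p.labels) {
            names = append(names, p.name)
        }
    }
    sort.Strings(names)
    return names
}

// needsStatusUpdate reports whether the recorded names differ from the current
// ones. Two empty lists count as equal even if one is nil and the other is not.
func needsStatusUpdate(recorded, current []string) bool {
    if len(recorded) == 0 && len(current) == 0 {
        return false
    }
    return !reflect.DeepEqual(recorded, current)
}

func main() {
    selector := map[string]string{"app": "my-appservice"}
    pods := []pod{
        {name: "my-appservice-b", labels: map[string]string{"app": "my-appservice"}},
        {name: "other-pod", labels: map[string]string{"app": "other"}},
        {name: "my-appservice-a", labels: map[string]string{"app": "my-appservice"}},
    }
    current := sortedNames(pods, selector)
    fmt.Println(current)
    fmt.Println(needsStatusUpdate(nil, current))
    fmt.Println(needsStatusUpdate(current, current))
}
```

Sorting matters because reflect.DeepEqual is order-sensitive: without it, the same set of Pods returned in a different order would look like a change and trigger an unnecessary status update.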

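8. Summary

This article showed how to develop a simple Kubernetes Operator in Go and deploy it to a Kubernetes cluster. With this example you should now have a basic understanding of the Operator development workflow. You can extend this Operator to suit your own needs and implement more complex behavior.

9. Complete Code

The complete code can be found in the following GitHub repository:

https://github.com/your-username/my-operator (replace your-username with your GitHub username)

Note: The code above is only an example and may need to be adapted to your own situation.

I hope this article helps you get started with Kubernetes Operator development!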

Operator 小白 · Kubernetes Operator · Go · CRD
