Version: v26.03

Cluster Lifecycle Management

Feature Introduction

Installation and deployment is the starting point of business on the PaaS platform and an important measure of product capability. The tool implements system installation and deployment based on Cluster-API, managing multiple K8s clusters through the K8s declarative API style. This simplifies K8s cluster lifecycle management, streamlines the cluster installation process and management details, and makes installation more automated. This document describes how to install business clusters after the management cluster has been installed, and how to perform cluster lifecycle management operations.

Application Scenarios

Users create the openFuyao management plane and clusters with the installation and deployment tool, completing the installation and deployment of the container platform.

Capability Scope

The tool installs and deploys the entire container platform and supports lifecycle management of business clusters, including creation, deletion, scale-out, scale-in, and upgrade.

Key Features

Based on Cluster-API, the tool manages multiple K8s clusters through the K8s declarative API style, which simplifies K8s cluster lifecycle management and makes the installation process more automated.

Implementation Principle

  • Overall Process
  1. Download the BKE installation tool and perform bootstrap node initialization.

  2. The bootstrap node pushes Agent to management cluster nodes, deploys the K8s cluster and Cluster-API, completing management cluster creation.

  3. Complete creation, uninstallation, dynamic scale-out/scale-in, and K8s version upgrade of business clusters on the openFuyao management plane (including dynamic scale-out/scale-in of the management cluster itself).

Note:

  • The bootstrap node is responsible for building the management cluster, the management cluster is responsible for building business clusters, and business clusters are responsible for running customer programs.
  • In the current installation and deployment tool, users initialize the bootstrap node and build the management cluster from the bootstrap node through the command line. Business clusters are created by the management cluster and are managed through the openFuyao cluster lifecycle management interface on the management cluster.

Figure 1 Installation and Deployment Flowchart

The BKE components shown in the diagram are bkeadm, bkeagent, and cluster-api, described as follows:

  • bkeadm

The BKE installation tool, used only as a binary file on the bootstrap node. Main functions: starting the image registry and yaml registry, starting the K3s cluster and installing cluster-api, and submitting the management cluster configuration.

  • bkeagent

A cloud-native host management tool that listens for kube-apiserver instructions and executes them on the host. Main functions: listening to CRD resources and reporting execution results; switching the K8s cluster it listens to; and running built-in tasks such as node environment checks, docker installation, and shell script execution.

  • cluster-api

A controller whose main functions are: distributing and starting bkeagent on all nodes; creating instructions that direct the work of bkeagent; and aggregating the results of cluster deployment and updating them to bkecluster events.

Figure 2 System Logical View

After the user fills in or modifies the cluster configuration options in the frontend, the configuration file is submitted to the backend for processing. The backend calls APIs to submit it to the corresponding K8s API server, which executes the cluster management actions.


Code links:

https://gitcode.com/openfuyao/installer-website

https://gitcode.com/openfuyao/installer-service

Installation

Installation consists of management cluster installation and business cluster installation. For details on management cluster installation, see Online Installation and Offline Installation. Business cluster installation is described in the following sections.

Logging into the openFuyao Management Plane

Enter "https://<bootstrap node login IP address>:<cluster lifecycle management web service port>" in the browser, then enter the username and password to log in to the cluster lifecycle management page.

Note:

  • The password must be changed on first login.
  • The cluster lifecycle management web service port defaults to 30010.
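
The login address above can be assembled and probed from a terminal before opening the browser. The following is a minimal sketch, not part of the openFuyao tooling: the function names are hypothetical, 192.0.2.10 is a documentation-only placeholder address, and the `-k` option assumes the service may present a self-signed certificate.

```shell
#!/usr/bin/env bash
# Build the login URL for the cluster lifecycle management page.
# The port falls back to the documented default, 30010.
login_url() {
    local ip="$1"
    local port="${2:-30010}"
    printf 'https://%s:%s\n' "$ip" "$port"
}

# Probe the page and print the HTTP status code returned by the service.
# -k skips certificate verification (assumption: self-signed certificate).
probe_login_page() {
    curl -k -sS -o /dev/null -w '%{http_code}\n' --connect-timeout 5 "$(login_url "$@")"
}

# Example (placeholder address):
#   probe_login_page 192.0.2.10
```

If the probe times out, check that the bootstrap node address and the web service port are correct before attempting to log in.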

Adding a Cluster

This section describes how to add a cluster.

Prerequisites

  • Bootstrap node has been configured.

  • Management cluster installation is complete.

Background

Create business clusters on the cluster lifecycle management page.

Restrictions

  • Confirm that the node IP address, username, and password are correct.

  • Creating a high availability cluster requires confirming that the load balancer IP address is correct.
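
Node and load balancer addresses can be sanity-checked before they are typed into the interface. The sketch below is illustrative only: the function names are hypothetical, and port 22 is an assumed SSH default, so pass the port that is actually used for the node. It checks the address format and TCP reachability only and does not verify the username or password.

```shell
#!/usr/bin/env bash
# Return 0 if the argument is a well-formed dotted-quad IPv4 address.
is_ipv4() {
    local addr="$1" old_ifs octet
    case "$addr" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;   # empty, bad characters, stray dots
    esac
    old_ifs=$IFS
    IFS=.
    set -- $addr                                # split the address on dots
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Return 0 if a TCP connection to host:port succeeds within 3 seconds.
# Uses bash's /dev/tcp redirection; the default port 22 is an assumption.
port_open() {
    local host="$1" port="${2:-22}"
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Example (placeholder address):
#   is_ipv4 192.0.2.21 && port_open 192.0.2.21 22 && echo "node reachable"
```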

Procedure

  1. Select "Cluster Lifecycle Management" in the left navigation bar of the openFuyao platform. The interface displays cluster lifecycle management list information, including "Cluster Name", "Status", "Node Count", etc.

  2. Click "+ Add Cluster" in the upper right of the list to enter the "Add Cluster" interface.

    Figure 3 Add Cluster

  3. Fill in the interface content.

Table 1 Interface Content Description

Area Name | Description
Basic Information | Create a name for the cluster.
Master Information | Control plane node related information for the business cluster. Click to continue adding control plane nodes. Click the icon in the "Action" column to delete added nodes.
Node Information | Node information for business cluster nodes.
Advanced Configuration | Defaults to the configured image registry address and image registry IP address, as well as the Chart repository address. For online installation, the default address is the online default repository address; for offline installation, the default address is the bootstrap node address configured during offline installation.
  4. Click "Create" at the bottom. A "Cluster Information" dialog box appears.

  5. After the cluster information finishes scrolling, click "OK".

  6. Return to the list interface. The "Status" column displays "Installing".

  7. After the "Status" column updates, the cluster has been added.

Note:
If the "Cluster Information" dialog box displays an error, further diagnosis and resolution are needed.

Deleting a Cluster

This section describes how to delete a cluster.

Prerequisites

Created clusters exist in the cluster lifecycle management list interface.

Background

Delete business clusters on the cluster lifecycle management page.

Restrictions

Currently, only successfully created clusters can be deleted from the interface. For a cluster whose creation failed, troubleshoot the failure first, then log in to each node of the cluster and delete it manually. The manual deletion steps are as follows:

  1. Download and automatically install bkeadm on each node of the cluster.

    bash
    # Method 1: Quick download
    curl -sfL https://openfuyao.obs.cn-north-4.myhuaweicloud.com/openFuyao/bkeadm/releases/download/1.2.2/download.sh | bash
    bash
    # Method 2: Download and verify the files
    ## Download the download.sh script
    curl -LO https://openfuyao.obs.cn-north-4.myhuaweicloud.com/openFuyao/bkeadm/releases/download/1.2.2/download.sh
    ## Download the checksum file for download.sh and verify it (optional).
    ## Successful verification prints "-: OK". If verification fails, contact the openFuyao community maintainers.
    curl -LO https://openfuyao.obs.cn-north-4.myhuaweicloud.com/openFuyao/bkeadm/releases/download/1.2.2/download.sh.sha256
    sha256sum -c <(cat download.sh.sha256) < download.sh
    ## Run download.sh to download the bke installation tool. The tool's checksum is verified during execution.
    chmod +x download.sh && ./download.sh
  2. Reset the node to its initial configuration.

    bash
    bke reset --all --mount

    Note: Currently, the bke reset operation may leave residual container mount paths. Use df -h to check for overlay entries. If any exist, run bke reset again, or reboot the node to remove them.

  3. Delete BKE files.

    bash
    rm -rf /bke
    which bke | xargs rm -rf
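
The residual-mount check mentioned in the note under step 2 can be scripted. This is a minimal sketch, assuming that residual mounts show up in `df -h` output with `overlay` in the first column; the function name is hypothetical.

```shell
#!/usr/bin/env bash
# Filter `df -h`-style input on stdin and print only overlay filesystem lines.
# Exit status is 0 when at least one overlay line is found, 1 otherwise.
residual_overlay_mounts() {
    grep -E '^overlay[[:space:]]'
}

# Example: report residual mounts after the reset in step 2.
#   if df -h | residual_overlay_mounts; then
#       echo "Residual overlay mounts found: run bke reset again or reboot." >&2
#   fi
```

Reading from standard input keeps the filter separate from the `df -h` call, so it can also be run against saved output collected from several nodes.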

Procedure

  1. In the "Cluster Lifecycle Management" list interface, click the button in the "Action" column and select "Delete".

  2. In the dialog box, check "Understood that this deletion is a deep deletion and cannot be undone."

  3. Click "Delete". The interface prompts "Deleting, will take some time please wait patiently", and the "Status" column displays "Deleting".

  4. After the cluster information is cleared from the list, the cluster deletion operation is complete.

Note:
Deletion generally takes 10 to 20 minutes, depending on network conditions.

Upgrading a Cluster

This section describes how to upgrade a cluster.

Prerequisites

Created clusters exist in the cluster lifecycle management list interface.

Background

Perform in-place K8s upgrade of business clusters on the cluster lifecycle management page.

Restrictions

  • Only upgrades to a higher K8s version are supported. K8s version rollback is not supported.

  • The K8s version installed by default in the current release is v1.28.8, which can be upgraded to v1.29.1.
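
The "higher version only" rule can be expressed as a version comparison. The sketch below is illustrative and is not how the platform itself validates upgrades; the function name is hypothetical, and it relies on `sort -V` (version sort, as provided by GNU coreutils) being available on the host.

```shell
#!/usr/bin/env bash
# Return 0 if the target version is strictly newer than the current version.
# A leading "v" is stripped so that v1.28.8 and 1.28.8 compare the same way.
is_upgrade() {
    local current="${1#v}" target="${2#v}"
    [ "$current" != "$target" ] &&
        [ "$(printf '%s\n%s\n' "$current" "$target" | sort -V | tail -n 1)" = "$target" ]
}

# Example:
#   is_upgrade v1.28.8 v1.29.1 && echo "upgrade allowed"
#   is_upgrade v1.29.1 v1.28.8 || echo "rollback is not supported"
```

Version sort is used instead of plain string comparison so that, for example, 1.28.10 is ordered after 1.28.8.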

Precautions

  • Upgrade is an irreversible operation. Proceed with caution.

  • Before upgrading a cluster, ensure that it is in healthy status. Performing other operations on the cluster during the upgrade is not recommended.

  • The upgrade is performed in place, and business is temporarily unavailable during the upgrade process.

Procedure

  1. Entry 1: In the "Cluster Lifecycle Management" list interface, click the button in the "Action" column and select "Upgrade".

    Entry 2: In the "Cluster Details" interface upper right corner, click "Upgrade Cluster".

  2. The "Upgrade Cluster" window appears. Select the target version from the dropdown.

  3. Confirm that the information is correct and click "OK". The interface automatically returns to the list, and the "Status" column displays "Upgrading". After the "Status" column updates, the cluster upgrade is complete.

    After completing the upgrade operation, you can see the current cluster's latest K8s version in the "Kubernetes Version" column of the cluster list interface.

Scaling Out a Cluster

This section describes how to scale out a cluster.

Prerequisites

Business cluster has been successfully created on the cluster lifecycle management page, and the cluster status is "Healthy".

Background

Scale out business clusters on the cluster lifecycle management page.

Restrictions

When scaling out a cluster, ensure that the nodes to be added are not in use. Only worker nodes can be added during scale-out.

Procedure

  1. In the "Cluster Lifecycle Management" list interface, click any "Cluster Name" to enter the "Node Details" interface.

Note:
Node list information in the node details interface is arranged with control plane nodes first, followed by worker nodes.

  2. In the upper right of the list information, click "+ Add Node".

Note:
All newly added nodes are worker nodes.

  3. Fill in the dialog box information, such as "Node Name", "Port", "IP Address", "Username", and "Password".
  • Click to continue adding nodes.

  • Click the icon in the "Action" column to delete added nodes.

  4. When complete, click "Next".

  5. The "Cluster Information" window appears. After the information finishes scrolling, click "OK". The interface returns to the cluster lifecycle management list, and the corresponding cluster's "Status" column displays "Installing".

  6. After the corresponding cluster's "Status" column updates, cluster scale-out is complete.

    • Scale-out failed: corresponding cluster's "Status" column is "Installation Failed".
    • Scale-out successful: corresponding cluster's "Status" column is "Healthy".

Note:
Until scale-out succeeds, the cluster is not stable. Do not perform other operations on the cluster.

Scaling In a Cluster

This section describes how to scale in a cluster.

Prerequisites

Business cluster has been successfully created on the cluster lifecycle management page, and the cluster status is "Healthy".

Background

Scale in business clusters on the cluster lifecycle management page.

Restrictions

Scale-in supports deleting worker nodes only. Deleted nodes cannot be recovered, so proceed with caution.

Procedure

  1. In the "Cluster Lifecycle Management" list interface, click any "Cluster Name" to enter the "Node Details" interface.

  2. Select deletion method.

  • Batch Delete
  1. Click the checkbox before "Node Name" to select nodes to delete.

  2. Click "Delete" above the list.

  • Single Delete
  1. Select the node to delete.

  2. Click the icon in the "Action" column.

  3. In the dialog box, check "Understood that this deletion cannot be undone".

  4. Click "Delete" in the dialog box to complete deletion of the selected node.

Status Description

This section describes the status changes displayed in the cluster "Status" column on the page during cluster operations (installation, scale-out, upgrade).

Installation

During installation, the overall status flow changes as follows:

  • When a cluster has only 1 worker node, the overall status changes are: Installing, Abnormal, Healthy.

  • When a cluster has multiple worker nodes, the overall status changes are: Installing, Scaling Out, Abnormal, Healthy.

Scale-out

During cluster scale-out, the overall status changes are: Installing, Scaling Out, Abnormal, Healthy.

Upgrade

During cluster upgrade, the overall status changes are: Upgrading, Abnormal, Healthy.
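
The status sequences listed above can be captured in a small lookup, for example to compare against what the "Status" column shows while an operation is running. This is a sketch only: the function name and the operation keywords are hypothetical, and treating a worker count of 0 like a single worker is an assumption that the source does not cover.

```shell
#!/usr/bin/env bash
# Print the documented "Status" sequence for an operation.
# Arguments: operation (install | scale-out | upgrade) and, for install,
# the number of worker nodes (defaults to 1).
expected_status_flow() {
    local operation="$1" workers="${2:-1}"
    case "$operation" in
        install)
            if [ "$workers" -gt 1 ]; then
                echo "Installing,Scaling Out,Abnormal,Healthy"
            else
                # Assumption: 0 workers is treated like the single-worker case.
                echo "Installing,Abnormal,Healthy"
            fi
            ;;
        scale-out) echo "Installing,Scaling Out,Abnormal,Healthy" ;;
        upgrade)   echo "Upgrading,Abnormal,Healthy" ;;
        *)
            echo "unknown operation: $operation" >&2
            return 1
            ;;
    esac
}

# Example:
#   expected_status_flow install 3
```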

Actual Actions for Each "Status"

Table 2 Status and Description

Status | Description
Installing | Node environment preparation, ClusterAPI docking, cluster certificate creation, cluster entry configuration, control plane node initialization and joining.
Scaling Out | Worker nodes joining the cluster.
Upgrading | Control plane nodes and worker nodes upgrading.
Abnormal | A cluster health check was performed during installation, scale-out, or upgrade, and not all Pods are ready.
Healthy | Cluster is in healthy status.