Version: v25.09

Version Overview

Version Change Description

New Features

AI inference optimization: provides an end-to-end acceleration solution for the AI inference scenario, incorporating the intelligent routing, inference backend, global KV cache management, and PD disaggregation modules. Compared with a round-robin (polling) routing baseline, inference throughput is improved by 55% and latency is reduced by 40%.
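The intelligent routing module is only described at a high level above. As a rough, hypothetical illustration of the idea (none of the names, fields, or weights below are part of the openFuyao API), the following Python sketch scores candidate inference backends by how much of a request's prompt prefix is already resident in their KV cache versus their current queue depth, and routes to the best-scoring one instead of polling them in turn.

```python
# Hypothetical sketch of KV-cache-aware routing; illustrative names and weights only.
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    cached_prefix_tokens: int  # prompt-prefix tokens already in this backend's KV cache
    queued_requests: int       # current queue depth (load signal)

def score(backend: Backend, prompt_tokens: int, cache_weight: float = 0.7) -> float:
    """Higher is better: reward KV cache reuse, penalize load."""
    reuse_ratio = backend.cached_prefix_tokens / max(prompt_tokens, 1)
    load_bonus = 1.0 / (1.0 + backend.queued_requests)
    return cache_weight * reuse_ratio + (1.0 - cache_weight) * load_bonus

def route(backends: list[Backend], prompt_tokens: int) -> Backend:
    """Pick the backend with the best combined cache/load score."""
    return max(backends, key=lambda b: score(b, prompt_tokens))

if __name__ == "__main__":
    candidates = [
        Backend("prefill-0", cached_prefix_tokens=512, queued_requests=3),
        Backend("prefill-1", cached_prefix_tokens=0, queued_requests=1),
    ]
    # Prints "prefill-0": cache reuse outweighs its longer queue.
    print(route(candidates, prompt_tokens=1024).name)
```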

AI inference software suite: provides an integrated solution for AI appliances, supporting full-stack integration of foundation-model (LLM) inference, including DeepSeek. It works out of the box, is scalable, and adapts to certain NPU and GPU hardware models.

openFuyao Kubernetes: upgrades Kubernetes to 1.33 with multiple enhancements.

  • High-density deployment: More than 1,000 Pods can be deployed on each node.
  • Startup acceleration: kubelet supports CPU scale-up during service startup to accelerate Java program startup.
  • Log enhancement: fuyao-log-runner supports log rotation and reliability enhancement.
  • Certificate management: Kubernetes certificates support hot loading.
  • PVC capacity expansion: The PVC template of a StatefulSet supports capacity expansion (see the sketch after this list).
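The exact mechanism behind the PVC capacity expansion enhancement is not detailed here. As a point of reference, the sketch below expands a PVC that a StatefulSet created from its volume claim template using the standard Kubernetes Python client; the PVC name, namespace, and target size are assumptions, and the StorageClass must allow volume expansion.

```python
# Illustrative only: expand a PVC generated from a StatefulSet's volumeClaimTemplates.
# Requires a StorageClass with allowVolumeExpansion: true; names and sizes are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a Pod
core = client.CoreV1Api()

patch = {"spec": {"resources": {"requests": {"storage": "20Gi"}}}}
core.patch_namespaced_persistent_volume_claim(
    name="data-my-statefulset-0",  # <template>-<statefulset>-<ordinal>
    namespace="default",
    body=patch,
)
```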

Multi-layer shelf: provides a consistency check tool.

Inherited Features

Table 1 Inherited features of openFuyao v25.09

Installation and deployment
  1. A service cluster node can be co-deployed with a bootstrap node, reducing resource dependencies.
  2. The new-user management function on the bootstrap node management page is optimized, improving permission management.
  3. Offline installation package creation is optimized: you can select the extensions to include in the offline installation package as required.

Colocation
  1. QoS assurance improvements: multiple Rubik capabilities are supported, such as elastic throttling and asynchronous tiered memory reclamation. In addition, memset restrictions are applied to LS QoS-class Pods when NUMA affinity is configured.
  2. Code refactoring: some colocation repositories are merged, and the deployment structure of the colocation component suite is optimized.
  3. Kubelet metrics are now collected over the HTTPS port, improving security (see the sketch after this table).
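As background for the kubelet metrics change noted above, the sketch below shows one common way to read kubelet metrics over the authenticated HTTPS port (10250) rather than an insecure read-only port. The node address is hypothetical, the caller needs RBAC permission on the nodes/metrics subresource, and the certificate used to verify the kubelet may differ per cluster; this is not the openFuyao collection pipeline itself.

```python
# Illustrative only: scrape kubelet /metrics over HTTPS with a service account token.
import requests

NODE_ADDRESS = "10.0.0.12"  # hypothetical node IP
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

with open(TOKEN_PATH) as f:
    token = f.read().strip()

resp = requests.get(
    f"https://{NODE_ADDRESS}:10250/metrics",
    headers={"Authorization": f"Bearer {token}"},
    # The kubelet serving certificate may not be signed by the cluster CA;
    # point verify at the appropriate CA bundle for your cluster.
    verify="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
    timeout=10,
)
resp.raise_for_status()
print(resp.text[:200])
```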

Deleted Features

In v25.09, the deployment package for the Installer installation mode is no longer updated. Use the Cluster API installation mode instead.

Interface Change Description

None

Introduction to Features

Table 2 describes the main features of openFuyao v25.09. For details about the features, see the User Guide.

Table 2 openFuyao feature list

Basic platform functions
  • Installation and deployment: Interconnects with the standard Cluster API installation and deployment tool and supports one-click installation of service clusters. The management cluster provides multi-scenario, interactive service cluster deployment capabilities on the unified management plane, including single-node or multi-node installation (including HA), online or offline installation, cluster scaling, and in-place Kubernetes upgrade.
  • Container orchestration core: Provides openFuyao Kubernetes, which is compatible with Kubernetes 1.33 and offers enhancements such as high-density deployment, startup acceleration, log enhancement, and certificate management enhancement.
  • Management plane: Provides an out-of-the-box console that offers the application market and supports functions such as application management, extension management, resource management, repository management, monitoring, alerting, user management, and command line interaction.
  • Authentication and authorization: Provides a built-in OAuth2 server that supports the OAuth 2.0 protocol and enables functions such as application authentication, authorization, password reset, and password policies. It also provides a unified authentication and authorization access solution for frontend and non-frontend applications.
  • User management: Provides cross-cluster multi-user management and supports binding users at the platform or cluster level to roles such as administrator, operator, and observer.
  • Multi-cluster management: Allows you to upgrade a cluster to a management cluster to implement management federation of multiple clusters.
  • Command line interaction: Provides a command-line interaction pop-up for cluster administrators on the cluster management plane, enabling administrators to manage clusters directly by running backend kubectl commands in the console.

Component installation and management
  • Application market: Supports browsing, searching, and deploying Helm-centric extensions and applications, and provides computing acceleration suites to improve computing performance.
  • Application management: Integrates Helm v3 (chart manager) to quickly deploy, upgrade, roll back, and uninstall applications. You can view Helm chart details, resources, logs, events, and monitoring information.
  • Repository management: Provides a built-in Harbor repository that supports uploading and managing Helm charts. You can add and remove remote Harbor repositories and synchronize Helm charts from them.
  • Extension management: Provides a dynamic, pluggable framework built on the ConsolePlugin custom resource definition (CRD), supporting seamless integration of extension WebUIs into the openFuyao management plane. Helm charts enable quick deployment of extensions and facilitate operations such as upgrade, rollback, WebUI enabling and disabling, and uninstallation. Extensions can also be easily connected to the platform's authentication and authorization system to ensure security, achieving plug-and-play extensions.

Kubernetes native resource management
  • Resource management: Includes all core Kubernetes resources and CRDs, facilitating resource management (addition, deletion, query, and modification) for users.
  • Event management: Reflects changes of Kubernetes native resources such as Pods, Deployments, and StatefulSets.
  • RBAC management: Implements permission control on cluster resources by setting service accounts, roles, and role bindings (see the sketch after this table).

Computing power scheduling optimization
  • Colocation: Supports colocation, ensuring that online workloads are prioritized in scheduling during peak hours while offline workloads are throttled, and enabling offline workloads to use oversold resources during off-peak hours of online workloads. This feature improves overall cluster resource utilization by 30% to 50% without significantly affecting QoS, with jitter lower than 5%.
  • NUMA-aware scheduling: Implements hardware NUMA topology awareness at the cluster and node levels and schedules applications based on NUMA affinity to improve application performance, raising average throughput by about 30% (for example, Redis performance improves by 30% on average).
  • Multi-core scheduling: Implements service type–based anti-affinity scheduling and multidimensional resource scoring at the cluster level, improving container deployment density by 10% with performance deterioration of less than 5%.
  • Ray: Provides a solution that offers high usability, high performance, and high compute resource utilization for Ray in cloud-native scenarios. It supports full lifecycle management of Ray clusters and jobs, reduces O&M costs, enhances cluster observability, fault locating, and optimization practices, and implements efficient computing power scheduling and management.

Automatic hardware management
  • KAE-Operator: Implements minute-level automatic management of Kunpeng Accelerator Engine (KAE) hardware, including KAE hardware feature discovery and automatic management and installation of components such as drivers, firmware, and hardware device plug-ins. Full deployment to a usable state can be achieved within 5 minutes.
  • NPU operator: Implements minute-level automatic management of Ascend NPU hardware, including NPU hardware feature discovery and automatic management and installation of components such as drivers, firmware, hardware device plug-ins, metric collection, and cluster scheduling. Full deployment to a usable state can be achieved within 10 minutes.

Observability
  • Monitoring: Provides out-of-the-box metric collection and visualized display capabilities, supports monitoring of resources such as clusters, nodes, and workloads, and provides out-of-the-box dashboards.
  • Custom dashboard: Allows users to customize monitoring metrics based on their service requirements to implement precise data observation and analysis.
  • Logging: Collects various types of logs in the cluster, allows users to view and download logs, and reports alerts based on preset alert rules.
  • Alerting: Monitors the cluster status and triggers alerts when specific conditions are met, so that issues can be detected in a timely manner and necessary measures can be taken to ensure system stability and reliability.

AI inference
  • AI inference optimization: Provides an end-to-end acceleration solution for the AI inference scenario, incorporating the intelligent routing, global KV cache management, and PD disaggregation modules to improve inference throughput and reduce latency.
  • AI inference software suite: Provides an integrated solution for AI appliances, supporting full-stack integration of foundation-model (LLM) inference, including DeepSeek. It works out of the box, is scalable, and adapts to certain NPU and GPU hardware models.
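The RBAC management feature above works with standard Kubernetes RBAC objects. As a minimal sketch (all names, namespaces, and rules are illustrative and not openFuyao-specific), the following grants a service account read-only access to Pods in one namespace using the Kubernetes Python client.

```python
# Minimal sketch: read-only Pod access for a service account via standard RBAC objects.
from kubernetes import client, config

config.load_kube_config()
rbac = client.RbacAuthorizationV1Api()

namespace = "demo"  # hypothetical namespace

role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": namespace},
    "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}],
}

binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "pod-reader-binding", "namespace": namespace},
    "subjects": [{"kind": "ServiceAccount", "name": "observer-sa", "namespace": namespace}],
    "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": "pod-reader"},
}

rbac.create_namespaced_role(namespace=namespace, body=role)
rbac.create_namespaced_role_binding(namespace=namespace, body=binding)
```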