Version: v25.09

Version Overview

Version Change Description

New Features

AI inference optimization: provides an end-to-end acceleration solution for the AI inference scenario, incorporating the intelligent routing, inference backend, global KV cache management, and PD disaggregation modules. Compared with a round-robin (polling) routing baseline, inference throughput is improved by 55% and latency is reduced by 40%.
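The intelligent routing module is only described at a high level above. As a rough, hypothetical illustration of the idea (none of the names, fields, or weights below are part of the openFuyao API), the following Python sketch scores candidate inference backends by how much of a request's prompt prefix is already resident in their KV cache versus their current queue depth, and routes to the best-scoring one instead of polling them in turn.

```python
# Hypothetical sketch of KV-cache-aware routing; illustrative names and weights only.
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    cached_prefix_tokens: int  # prompt-prefix tokens already in this backend's KV cache
    queued_requests: int       # current queue depth (load signal)

def score(backend: Backend, prompt_tokens: int, cache_weight: float = 0.7) -> float:
    """Higher is better: reward KV cache reuse, penalize load."""
    reuse_ratio = backend.cached_prefix_tokens / max(prompt_tokens, 1)
    load_bonus = 1.0 / (1.0 + backend.queued_requests)
    return cache_weight * reuse_ratio + (1.0 - cache_weight) * load_bonus

def route(backends: list[Backend], prompt_tokens: int) -> Backend:
    """Pick the backend with the best combined cache/load score."""
    return max(backends, key=lambda b: score(b, prompt_tokens))

if __name__ == "__main__":
    candidates = [
        Backend("prefill-0", cached_prefix_tokens=512, queued_requests=3),
        Backend("prefill-1", cached_prefix_tokens=0, queued_requests=1),
    ]
    # Prints "prefill-0": cache reuse outweighs its longer queue.
    print(route(candidates, prompt_tokens=1024).name)
```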

AI inference software suite: provides an integrated solution for AI appliances, supporting full-stack integration of foundation-model (LLM) inference, including DeepSeek. It works out of the box, is scalable, and adapts to certain NPU and GPU hardware models.

openFuyao Kubernetes: upgrades Kubernetes to 1.33 with multiple enhancements.

  • High-density deployment: More than 1,000 Pods can be deployed on each node.
  • Startup acceleration: kubelet supports CPU scale-up during service startup to accelerate Java program startup.
  • Log enhancement: fuyao-log-runner supports log rotation and reliability enhancement.
  • Certificate management: Kubernetes certificates support hot loading.
  • PVC capacity expansion: The PVC template of a StatefulSet supports capacity expansion (see the sketch after this list).
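The exact mechanism behind the PVC capacity expansion enhancement is not detailed here. As a point of reference, the sketch below expands a PVC that a StatefulSet created from its volume claim template using the standard Kubernetes Python client; the PVC name, namespace, and target size are assumptions, and the StorageClass must allow volume expansion.

```python
# Illustrative only: expand a PVC generated from a StatefulSet's volumeClaimTemplates.
# Requires a StorageClass with allowVolumeExpansion: true; names and sizes are hypothetical.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a Pod
core = client.CoreV1Api()

patch = {"spec": {"resources": {"requests": {"storage": "20Gi"}}}}
core.patch_namespaced_persistent_volume_claim(
    name="data-my-statefulset-0",  # <template>-<statefulset>-<ordinal>
    namespace="default",
    body=patch,
)
```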

Multi-layer shelf: provides a consistency check tool.

Inherited Features

Table 1 Inherited features of openFuyao v25.09

Installation and deployment
  1. A service cluster node can be co-deployed with a bootstrap node, reducing resource dependencies.
  2. The new-user management function on the bootstrap node management page is optimized, improving permission management.
  3. Offline installation package creation is optimized: you can select the extensions to include in the offline installation package as required.

Colocation
  1. QoS assurance improvements: multiple Rubik capabilities are supported, such as elastic throttling and asynchronous tiered memory reclamation. In addition, memset restrictions are applied to LS QoS-class Pods when NUMA affinity is configured.
  2. Code refactoring: some colocation repositories are merged, and the deployment structure of the colocation component suite is optimized.
  3. Kubelet metrics are now collected over the HTTPS port, improving security (see the sketch after this table).
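As background for the kubelet metrics change noted above, the sketch below shows one common way to read kubelet metrics over the authenticated HTTPS port (10250) rather than an insecure read-only port. The node address is hypothetical, the caller needs RBAC permission on the nodes/metrics subresource, and the certificate used to verify the kubelet may differ per cluster; this is not the openFuyao collection pipeline itself.

```python
# Illustrative only: scrape kubelet /metrics over HTTPS with a service account token.
import requests

NODE_ADDRESS = "10.0.0.12"  # hypothetical node IP
TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

with open(TOKEN_PATH) as f:
    token = f.read().strip()

resp = requests.get(
    f"https://{NODE_ADDRESS}:10250/metrics",
    headers={"Authorization": f"Bearer {token}"},
    # The kubelet serving certificate may not be signed by the cluster CA;
    # point verify at the appropriate CA bundle for your cluster.
    verify="/var/run/secrets/kubernetes.io/serviceaccount/ca.crt",
    timeout=10,
)
resp.raise_for_status()
print(resp.text[:200])
```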

Deleted Features

In v25.09, the deployment package for the Installer installation mode is no longer updated. Use the Cluster API installation mode instead.

Interface Change Description

None

Introduction to Features

Table 2 describes the main features of openFuyao v25.09. For details about the features, see the User Guide.

Table 2 openFuyao feature list

Basic platform functions
  • Installation and deployment: Interconnects with the standard Cluster API installation and deployment tool and supports one-click installation of service clusters. The management cluster provides multi-scenario, interactive service cluster deployment capabilities on the unified management plane, including single-node or multi-node installation (including HA), online or offline installation, cluster scaling, and in-place Kubernetes upgrade.
  • Container orchestration core: Provides openFuyao Kubernetes, which is compatible with Kubernetes 1.33 and offers enhancements such as high-density deployment, startup acceleration, log enhancement, and certificate management enhancement.
  • Management plane: Provides an out-of-the-box console that offers the application market and supports functions such as application management, extension management, resource management, repository management, monitoring, alerting, user management, and command line interaction.
  • Authentication and authorization: Provides a built-in OAuth2 server that supports the OAuth 2.0 protocol and enables functions such as application authentication, authorization, password reset, and password policies. It also provides a unified authentication and authorization access solution for frontend and non-frontend applications.
  • User management: Provides cross-cluster multi-user management and supports binding users at the platform or cluster level to roles such as administrator, operator, and observer.
  • Multi-cluster management: Allows you to upgrade a cluster to a management cluster to implement management federation of multiple clusters.
  • Command line interaction: Provides a command-line interaction pop-up for cluster administrators on the cluster management plane, enabling administrators to manage clusters directly by running backend kubectl commands in the console.

Component installation and management
  • Application market: Supports browsing, searching, and deploying Helm-centric extensions and applications, and provides computing acceleration suites to improve computing performance.
  • Application management: Integrates Helm v3 (chart manager) to quickly deploy, upgrade, roll back, and uninstall applications. You can view Helm chart details, resources, logs, events, and monitoring information.
  • Repository management: Provides a built-in Harbor repository that supports uploading and managing Helm charts. You can add and remove remote Harbor repositories and synchronize Helm charts from them.
  • Extension management: Provides a dynamic, pluggable framework built on the ConsolePlugin custom resource definition (CRD), supporting seamless integration of extension WebUIs into the openFuyao management plane. Helm charts enable quick deployment of extensions and facilitate operations such as upgrade, rollback, WebUI enabling and disabling, and uninstallation. Extensions can also be easily connected to the platform's authentication and authorization system to ensure security, achieving plug-and-play extensions.

Kubernetes native resource management
  • Resource management: Includes all core Kubernetes resources and CRDs, facilitating resource management (addition, deletion, query, and modification) for users.
  • Event management: Reflects changes of Kubernetes native resources such as Pods, Deployments, and StatefulSets.
  • RBAC management: Implements permission control on cluster resources by setting service accounts, roles, and role bindings (see the sketch after this table).

Computing power scheduling optimization
  • Colocation: Supports colocation, ensuring that online workloads are prioritized in scheduling during peak hours while offline workloads are throttled, and enabling offline workloads to use oversold resources during off-peak hours of online workloads. This feature improves overall cluster resource utilization by 30% to 50% without significantly affecting QoS, with jitter lower than 5%.
  • NUMA-aware scheduling: Implements hardware NUMA topology awareness at the cluster and node levels and schedules applications based on NUMA affinity to improve application performance, raising average throughput by about 30% (for example, Redis performance improves by 30% on average).
  • Multi-core scheduling: Implements service type–based anti-affinity scheduling and multidimensional resource scoring at the cluster level, improving container deployment density by 10% with performance deterioration of less than 5%.
  • Ray: Provides a solution that offers high usability, high performance, and high compute resource utilization for Ray in cloud-native scenarios. It supports full lifecycle management of Ray clusters and jobs, reduces O&M costs, enhances cluster observability, fault locating, and optimization practices, and implements efficient computing power scheduling and management.

Automatic hardware management
  • KAE-Operator: Implements minute-level automatic management of Kunpeng Accelerator Engine (KAE) hardware, including KAE hardware feature discovery and automatic management and installation of components such as drivers, firmware, and hardware device plug-ins. Full deployment to a usable state can be achieved within 5 minutes.
  • NPU operator: Implements minute-level automatic management of Ascend NPU hardware, including NPU hardware feature discovery and automatic management and installation of components such as drivers, firmware, hardware device plug-ins, metric collection, and cluster scheduling. Full deployment to a usable state can be achieved within 10 minutes.

Observability
  • Monitoring: Provides out-of-the-box metric collection and visualized display capabilities, supports monitoring of resources such as clusters, nodes, and workloads, and provides out-of-the-box dashboards.
  • Custom dashboard: Allows users to customize monitoring metrics based on their service requirements to implement precise data observation and analysis.
  • Logging: Collects various types of logs in the cluster, allows users to view and download logs, and reports alerts based on preset alert rules.
  • Alerting: Monitors the cluster status and triggers alerts when specific conditions are met, so that issues can be detected in a timely manner and necessary measures can be taken to ensure system stability and reliability.

AI inference
  • AI inference optimization: Provides an end-to-end acceleration solution for the AI inference scenario, incorporating the intelligent routing, global KV cache management, and PD disaggregation modules to improve inference throughput and reduce latency.
  • AI inference software suite: Provides an integrated solution for AI appliances, supporting full-stack integration of foundation-model (LLM) inference, including DeepSeek. It works out of the box, is scalable, and adapts to certain NPU and GPU hardware models.
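The RBAC management feature above works with standard Kubernetes RBAC objects. As a minimal sketch (all names, namespaces, and rules are illustrative and not openFuyao-specific), the following grants a service account read-only access to Pods in one namespace using the Kubernetes Python client.

```python
# Minimal sketch: read-only Pod access for a service account via standard RBAC objects.
from kubernetes import client, config

config.load_kube_config()
rbac = client.RbacAuthorizationV1Api()

namespace = "demo"  # hypothetical namespace

role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": namespace},
    "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list", "watch"]}],
}

binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "pod-reader-binding", "namespace": namespace},
    "subjects": [{"kind": "ServiceAccount", "name": "observer-sa", "namespace": namespace}],
    "roleRef": {"apiGroup": "rbac.authorization.k8s.io", "kind": "Role", "name": "pod-reader"},
}

rbac.create_namespaced_role(namespace=namespace, body=role)
rbac.create_namespaced_role_binding(namespace=namespace, body=binding)
```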