Version Overview
Version Change Description
New Features
AI inference optimization: provides an end-to-end acceleration solution for AI inference scenarios, incorporating the intelligent routing module, inference backend module, global KV cache management module, and PD (prefill/decode) disaggregation module. Compared with the round-robin (polling) baseline, inference throughput is improved by 55% and latency is reduced by 40%.
AI inference software suite: provides an integrated solution for AI appliances, supporting integration of the full foundation-LLM inference stack, including DeepSeek. It is out-of-the-box and scalable, and adapts to certain NPU and GPU hardware models.
openFuyao Kubernetes: upgrades Kubernetes to 1.33 with multiple enhancements.
- High-density deployment: More than 1,000 Pods can be deployed on each node.
- Startup acceleration: kubelet supports temporary CPU scale-up while a service is starting, accelerating the startup of Java programs.
- Log enhancement: fuyao-log-runner supports log rotation and reliability enhancement.
- Certificate management: Kubernetes certificates support hot loading.
- PVC capacity expansion: The PVC template of a StatefulSet supports capacity expansion.
Multi-layer shelf: provides a consistency check tool.
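The StatefulSet PVC expansion enhancement above follows the standard Kubernetes volume resize flow. A minimal sketch using plain kubectl; the StorageClass name `fast-ssd`, PVC name `data-web-0` (as produced by a StatefulSet's volumeClaimTemplate), and target size `20Gi` are illustrative assumptions, and the StorageClass must have `allowVolumeExpansion: true`:

```shell
# Confirm the StorageClass permits online expansion (assumed name: fast-ssd)
kubectl get storageclass fast-ssd -o jsonpath='{.allowVolumeExpansion}'

# Grow the claim created from the StatefulSet's volumeClaimTemplate
# (assumed PVC name: data-web-0; assumed target size: 20Gi)
kubectl patch pvc data-web-0 --type merge \
  -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'

# Watch the resize until the capacity in status matches the new request
kubectl get pvc data-web-0 -w
```

Depending on the storage driver, the filesystem resize may complete only after the Pod using the volume is restarted.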
Inherited Features
Table 1 Inherited features of openFuyao v25.09
| Feature Name | Change Description |
|---|---|
| Installation and deployment | 1. A service cluster node can be co-deployed with a bootstrap node, reducing resource dependencies. 2. The new-user management function is optimized on the bootstrap node management page, improving permission management. 3. Offline installation package creation is optimized. You can select the extensions to be contained in the offline installation package as required. |
| Colocation | 1. QoS assurance capability improvement: Multiple Rubik capabilities are supported, such as elastic throttling and asynchronous tiered memory reclamation. In addition, memory-node (memset) restrictions are applied to LS QoS-class Pods when NUMA affinity is configured. 2. Code refactoring: Some colocation repositories are merged, and the deployment structure of the colocation component suite is optimized. 3. The HTTPS port is now used to collect kubelet indicators, improving security. |
Deleted Features
In v25.09, the deployment package for the Installer installation mode is no longer updated. Use the Cluster API installation mode instead.
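For reference, the upstream Cluster API workflow that the recommended installation mode builds on is driven by the clusterctl CLI. A hedged sketch follows; the cluster name, Kubernetes version, and machine counts are illustrative assumptions, and openFuyao's packaged tooling may wrap or replace these steps:

```shell
# Initialize Cluster API controllers in the management cluster
clusterctl init

# Render a workload (service) cluster manifest
# (cluster name, version, and node counts are illustrative)
clusterctl generate cluster demo-cluster \
  --kubernetes-version v1.33.0 \
  --control-plane-machine-count 3 \
  --worker-machine-count 3 > demo-cluster.yaml

# Create the service cluster from the management plane
kubectl apply -f demo-cluster.yaml
```

The generated manifest can be version-controlled, which makes service cluster creation repeatable across environments.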
Interface Change Description
None
Introduction to Features
Table 2 describes the main features of openFuyao v25.09. For details, see the User Guide.
Table 2 openFuyao feature list
| Category | Feature Name | Description |
|---|---|---|
| Basic platform functions | Installation and deployment | Interconnects with the standard Cluster API installation and deployment tool and supports one-click installation of service clusters. The management cluster provides multi-scenario interactive service cluster deployment capabilities on the unified management plane, including single-node or multi-node installation (including HA), online or offline installation, cluster scaling, and in-place Kubernetes upgrade. |
| | Container orchestration core | Provides openFuyao Kubernetes, which is compatible with Kubernetes 1.33 and provides enhanced functions such as high-density deployment, startup acceleration, log enhancement, and certificate management enhancement. |
| | Management plane | Provides an out-of-the-box console, which offers the application market and supports functions such as application management, extension management, resource management, repository management, monitoring, alerting, user management, and command line interaction. |
| | Authentication and authorization | Provides a built-in OAuth2 server, which supports the OAuth 2.0 protocol and enables functions such as application authentication, authorization, password reset, and password policies. It also provides a unified authentication and authorization access solution for frontend and non-frontend applications. |
| | User management | Provides cross-cluster multi-user management capabilities and supports binding of users at the platform or cluster level to roles such as administrators, operators, and observers. |
| | Multi-cluster management | Allows you to upgrade a cluster to a management cluster to implement management federation of multiple clusters. |
| | Command line interaction | Provides a command-line interaction pop-up for cluster administrators on the cluster management plane, enabling the administrators to manage clusters directly by running backend kubectl commands in the console. |
| Component installation and management | Application market | Supports browsing, searching, and deploying Helm-centric extensions and applications, and provides computing acceleration suites to maximize computing performance. |
| | Application management | Integrates Helm v3 (chart manager) to quickly deploy, upgrade, roll back, and uninstall applications. You can view Helm chart details, resources, logs, events, and monitoring information. |
| | Repository management | Provides a built-in Harbor repository to support upload and management of Helm charts. You can add and remove remote Harbor repositories and synchronize Helm charts from remote Harbor repositories. |
| | Extension management | Provides a dynamic, pluggable framework developed based on the ConsolePlugin custom resource definition (CRD), supporting the seamless integration of extension WebUIs into the openFuyao management plane. Helm charts enable quick deployment of extensions and facilitate operations such as upgrades, rollbacks, WebUI enabling and disabling, and uninstallation. Additionally, extensions can also be easily connected to the authentication and authorization system of the platform to ensure security, achieving a plug-and-play capability for extensions. |
| Kubernetes native resource management | Resource management | Includes all core resources and CRDs of Kubernetes, facilitating resource management (addition, deletion, query, and modification) for users. |
| | Event management | Reflects changes of Kubernetes native resources such as Pods, Deployments, and StatefulSets. |
| | RBAC management | Implements permission control on cluster resources by setting service accounts, roles, and role bindings. |
| Computing power scheduling optimization | Colocation | Supports colocation, ensuring that online workloads are prioritized in scheduling during peak hours and offline workloads are throttled, while enabling offline workloads to use oversold resources during off-peak hours of online workloads. This feature improves the overall cluster resource utilization by 30% to 50%, without significantly affecting QoS and with jitter lower than 5%. |
| | NUMA-aware scheduling | Implements hardware NUMA topology awareness at the cluster and node levels and performs NUMA-aware scheduling for applications based on NUMA affinity to improve application performance. Average throughput is improved by 30% (for example, Redis performance improves by 30% on average). |
| | Multi-core scheduling | Implements service type–based anti-affinity scheduling and multidimensional resource scoring at the cluster level. Container deployment density is improved by 10%, with performance degradation of less than 5%. |
| | Ray | Provides a solution that offers high usability, high performance, and high compute resource utilization for Ray in cloud-native scenarios. It supports full lifecycle management of Ray clusters and jobs, reduces O&M costs, enhances cluster observability, fault locating, and optimization practices, and implements efficient computing power scheduling and management. |
| Automatic hardware management | KAE-Operator | Implements minute-level automatic management of Kunpeng Accelerator Engine (KAE) hardware, including KAE hardware feature discovery, and automatic management and installation of components such as drivers, firmware, and hardware device plug-ins. Full deployment to a usable state can be achieved within 5 minutes. |
| | NPU operator | Implements minute-level automatic management of Ascend NPU hardware, including NPU hardware feature discovery, and automatic management and installation of components such as drivers, firmware, hardware device plug-ins, indicator collection, and cluster scheduling. Full deployment to a usable state can be achieved within 10 minutes. |
| Observability | Monitoring | Provides out-of-the-box indicator collection and visualized display capabilities, supports monitoring of resources such as clusters, nodes, and workloads, and provides out-of-the-box dashboards. |
| | Custom dashboard | Allows users to customize monitoring indicators based on their service requirements to implement precise data observation and analysis. |
| | Logging | Collects various types of logs in the cluster, allows users to view and download logs, and reports alerts based on preset alert rules. |
| | Alerting | Monitors the cluster status and triggers alerts when specific conditions are met. This way, issues can be detected in a timely manner, and necessary measures can be taken to ensure system stability and reliability. |
| AI inference | AI inference optimization | Provides an end-to-end acceleration solution for AI inference scenarios, incorporating the intelligent routing module, global KV cache management module, and PD (prefill/decode) disaggregation module, improving inference throughput and reducing latency. |
| | AI inference software suite | Provides an integrated solution for AI appliances, supporting integration of the full foundation-LLM inference stack, including DeepSeek. It is out-of-the-box and scalable, and adapts to certain NPU and GPU hardware models. |
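As a concrete illustration of the RBAC management feature in the table above, the standard Kubernetes primitives involved (service account, role, role binding) can be created imperatively. A minimal read-only sketch; the namespace and object names are illustrative:

```shell
# Namespace and names below are illustrative
kubectl create namespace demo
kubectl create serviceaccount pod-reader-sa -n demo

# A Role granting read-only access to Pods in the namespace
kubectl create role pod-reader \
  --verb=get,list,watch --resource=pods -n demo

# Bind the Role to the ServiceAccount
kubectl create rolebinding pod-reader-binding \
  --role=pod-reader --serviceaccount=demo:pod-reader-sa -n demo

# Verify the resulting permission
kubectl auth can-i list pods -n demo \
  --as=system:serviceaccount:demo:pod-reader-sa
```

The same objects can equally be managed declaratively as YAML manifests through the platform's resource management pages.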