Multi-Core Scheduling
Feature Overview
The rapid evolution of data center server chips to multi-core architectures (256 cores or more) is increasingly exacerbating core resource contention in multi-service hybrid deployment scenarios. Typical challenges are as follows:
- Intensified lock contention: When the number of cores exceeds 256, frequent access to shared kernel resources (such as file systems, network stacks, and system calls) causes heavy global contention for synchronization primitives (such as mutexes and semaphores), incurring large-scale context switching overheads.
- Resource mismatch: While core density grows, memory bandwidth and I/O throughput do not scale proportionally. In high-density hybrid deployment scenarios, computing-intensive services and memory-sensitive services are prone to resource interference.
- Deployment density bottleneck: To maintain QoS for critical services, the container deployment density falls below 20% in ultra-large-scale core scenarios under traditional scheduling policies. As a result, hardware resource utilization deteriorates severely.
In a multi-core cluster, resources are unevenly allocated among nodes: some nodes experience heavy I/O contention while others face memory contention. To address this, balanced scheduling is applied at the cluster level to spread services of the same type across different nodes.
This feature supports explicit labeling of service characteristics (I/O-intensive, memory-sensitive, and computing-intensive) for workloads. The scheduler uses a multi-dimensional weighted scoring algorithm to dynamically select the target node with the least resource contention. The scoring dimensions include the proportion of resource requests from same-type services and the available resources on the node.
Applicable Scenarios
If a service type is configured during workload deployment, the workload is scheduled to the most appropriate node based on the service type.
Supported Capabilities
- Error request interception using webhook: A foolproof mechanism intercepts requests whose service type field is not in the trustlist and returns an error message (see the validation sketch after this list).
- Balanced scheduling: Workloads are scheduled to the most appropriate nodes based on service types.
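To make the trustlist check concrete, the following minimal Go sketch shows the kind of validation such a webhook performs. The allowedTypes trustlist and the validateBusinessType function are illustrative assumptions based on the service types documented below, not the actual webhook source.
package main

import (
	"fmt"
	"strings"
)

// Trustlist of valid service types; an assumption mirroring the types
// documented in this section.
var allowedTypes = map[string]bool{
	"io-intensive":      true,
	"memory-sensitive":  true,
	"compute-intensive": true,
}

// validateBusinessType checks the business.workload/type annotation value,
// which may list several comma-separated service types.
func validateBusinessType(value string) error {
	for _, t := range strings.Split(value, ",") {
		if !allowedTypes[strings.TrimSpace(t)] {
			return fmt.Errorf("invalid service type %q: allowed values are io-intensive, memory-sensitive, compute-intensive", t)
		}
	}
	return nil
}

func main() {
	fmt.Println(validateBusinessType("io-intensive,memory-sensitive")) // <nil>
	fmt.Println(validateBusinessType("gpu-heavy"))                     // error returned to the requester
}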
Highlights
Anti-affinity scheduling is performed based on service types.
Implementation Principles
Workload Type Configuration
You can explicitly specify the service type through annotations in the YAML file of a workload. The system supports the following service types:
- I/O-intensive
- Memory-sensitive
- Computing-intensive
Configuration example:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Core business type label
    business.workload/type: "io-intensive" # Options: io-intensive/memory-sensitive/compute-intensive
In addition, this policy supports configuring multiple service types for a workload. The following is an example:
apiVersion: v1
kind: Pod
metadata:
  annotations:
    business.workload/type: "io-intensive,memory-sensitive" # Separate multiple service types with commas.
This scheduling policy applies a weighted scoring mechanism based on the service type to select the node with the fewest workloads of the same service type. For each service type, the scheduler prioritizes a different resource dimension:
- I/O-intensive: type.score = 0.6 × (1 – DiskIOUsage) + 0.2 × (1 – CPUUsage) + 0.2 × (1 – MemUsage)
- Memory-sensitive: type.score = 0.6 × (1 – MemUsage) + 0.2 × (1 – CPUUsage) + 0.2 × (1 – DiskIOUsage)
- Computing-intensive: type.score = 0.6 × (1 – CPUUsage) + 0.2 × (1 – MemUsage) + 0.2 × (1 – DiskIOUsage)
Variable description:
- DiskIOUsage: disk I/O usage
- MemUsage: memory usage
- CPUUsage: CPU usage
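To make the scoring concrete, the following minimal Go sketch computes type.score from the formulas above. The NodeUsage struct and the weights table are assumptions introduced for this example; the actual plugin implementation lives in the many-core-scheduler repository.
package main

import "fmt"

// NodeUsage holds a node's normalized (0.0-1.0) resource usage.
type NodeUsage struct {
	CPUUsage, MemUsage, DiskIOUsage float64
}

// weights maps each service type to its (CPU, memory, disk I/O) weights,
// matching the formulas above.
var weights = map[string][3]float64{
	"io-intensive":      {0.2, 0.2, 0.6},
	"memory-sensitive":  {0.2, 0.6, 0.2},
	"compute-intensive": {0.6, 0.2, 0.2},
}

// typeScore computes type.score for one service type on one node.
func typeScore(serviceType string, u NodeUsage) float64 {
	w := weights[serviceType]
	return w[0]*(1-u.CPUUsage) + w[1]*(1-u.MemUsage) + w[2]*(1-u.DiskIOUsage)
}

func main() {
	node := NodeUsage{CPUUsage: 0.3, MemUsage: 0.5, DiskIOUsage: 0.8}
	// 0.2*(1-0.3) + 0.2*(1-0.5) + 0.6*(1-0.8) = 0.36
	fmt.Printf("io-intensive type.score: %.2f\n", typeScore("io-intensive", node))
}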
Hybrid Scheduling Calculation for Multiple Service Types
If multiple service type labels are configured for a workload, the total score is the weighted sum of the scores from each configured policy:
Balanced.score = Σ (type.weight × type.score)
NOTE
type.weight: policy weight. The default value is 1.
Scheduling Decision-Making
The system schedules the workload to the node with the highest Balanced.score.
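Continuing the previous sketch (same package, with "strings" added to the imports), the aggregation and decision steps could look as follows. The sum-based balancedScore function and the pickNode helper are assumptions consistent with the formula above, not the verbatim plugin code.
// Continuation of the previous sketch (same package; also import "strings").

// balancedScore aggregates per-type scores for a workload whose
// business.workload/type annotation may list several comma-separated types.
// type.weight defaults to 1 for every policy, as noted above.
func balancedScore(annotation string, u NodeUsage) float64 {
	const typeWeight = 1.0
	total := 0.0
	for _, t := range strings.Split(annotation, ",") {
		total += typeWeight * typeScore(strings.TrimSpace(t), u)
	}
	return total
}

// pickNode returns the node with the highest Balanced.score.
func pickNode(annotation string, nodes map[string]NodeUsage) string {
	bestName, bestScore := "", -1.0
	for name, u := range nodes {
		if s := balancedScore(annotation, u); s > bestScore {
			bestName, bestScore = name, s
		}
	}
	return bestName
}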
Related Features
This feature is an extended scheduling policy of Volcano and must be used with Volcano.
Instances
Code link: https://gitcode.com/openFuyao/many-core-scheduler
Installation
This feature is an extended scheduling policy of Volcano and is installed with volcano-config-service during deployment.
Prerequisites
- Kubernetes 1.21 or later has been deployed.
- containerd 1.7 or later has been deployed.
Procedure
- In the left navigation pane of the openFuyao platform, choose Application Market > Applications. The Applications page is displayed.
- Select Extension in the Type filter on the left to view all extensions. Alternatively, you can enter volcano in the search box to search for the component.
- Click the volcano-config-service card. The details page for the scheduling extension is displayed.
- Click Deploy. The Deploy page is displayed.
- Enter the application name and select the desired installation version and namespace.
- Enter the values to be deployed in Values.yaml.
- Click Deploy.
- Select Extension in the Type filter on the left to view all extensions. Alternatively, enter many-core in the search box to search for the component.
- Click the many-core-scheduler card. The details page for the scheduling extension is displayed.
- Click Deploy. The Deploy page is displayed.
- Enter the application name and select the desired installation version and namespace.
- Enter the values to be deployed in Values.yaml.
- Click OK. The components related to balanced scheduling are deployed.
- In the left navigation pane, click Extension Management to manage the scheduling component.
Standalone Deployment
In addition to installation and deployment through the application market, this component also supports standalone deployment. The procedure is as follows:
- Pull the Helm charts.
helm pull oci://harbor.openfuyao.com/openfuyao-catalog/charts/volcano-config-service --version xxx
helm pull oci://harbor.openfuyao.com/openfuyao-catalog/charts/many-core-scheduler --version xxx
Replace xxx with the version of the Helm chart to be pulled, for example, 0.0.0-latest.
- Decompress the installation packages.
tar -zxvf volcano-config-service-xxx.tgz
tar -zxvf many-core-scheduler-xxx.tgz
- Disable openFuyao and OAuth.
vim volcano-config-service/charts/volcano-config-website/values.yaml
Set the enableOAuth and openFuyao options to false.
- Install the components.
helm install volcano-config-service ./volcano-config-service/
helm install many-core-scheduler ./many-core-scheduler/
Enabling the Scheduling Policy
Prerequisites
None.
Context
You need to modify the scheduling policies of a scheduling extension that has already been deployed.
Restrictions
A scheduling extension that supports scheduling policy configuration must have been deployed.
Procedure
- In the left navigation pane, choose Resource Management > Configuration and Keys > ConfigMap to add the balance policy by configuring the YAML file.
apiVersion: v1
kind: ConfigMap
metadata:
  name: volcano-scheduler-configmap # Adjust the name and namespace to match your Volcano deployment.
  namespace: volcano-system
data:
  volcano-scheduler.conf: |
    actions: "enqueue, allocate, backfill"
    tiers:
    - plugins:
      - name: balance # Add the balance policy.
      - name: priority
      - name: gang
        enablePreemptable: false
      - name: conformance
    - plugins:
      - name: overcommit
      - name: drf
        enablePreemptable: false
      - name: predicates
      - name: proportion
      - name: nodeorder
      - name: binpack
- In the left navigation pane, choose Resource Management > Workloads > Pod and restart the Volcano scheduler pod for the new scheduling configuration to take effect.
- In the left navigation pane, choose Resource Management > Workloads > Pod to create a pod by configuring the YAML file. The following is an example; add the service type label to annotations and specify Volcano as the scheduler.
apiVersion: v1
kind: Pod
metadata:
  annotations:
    # Core business type label
    business.workload/type: "io-intensive"
spec:
  schedulerName: volcano
  containers: xxx # Replace with your container spec.
- Use the YAML file to create the workload; it is then scheduled based on the configured service type.