CN115242786A - Multi-mode big data job scheduling system and method based on container cluster - Google Patents

Multi-mode big data job scheduling system and method based on container cluster

Info

Publication number
CN115242786A
Authority
CN
China
Prior art keywords
big data
container cluster
middleware
resource
job
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210491445.9A
Other languages
Chinese (zh)
Other versions
CN115242786B (en)
Inventor
谢冬鸣
黄进军
廖子南
黄林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongyun Ruilian Wuhan Computing Technology Co ltd
Original Assignee
Dongyun Ruilian Wuhan Computing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongyun Ruilian Wuhan Computing Technology Co ltd
Priority to CN202210491445.9A
Publication of CN115242786A
Application granted
Publication of CN115242786B
Active
Anticipated expiration

Abstract

The invention provides a container-cluster-based multi-mode big data job scheduling system and method. The system comprises at least one logic middleware, a resource management middleware, and a container cluster. By introducing a container cluster orchestration intermediate layer, big data jobs and clusters of multiple modes are scheduled through a unified scheduling mechanism and managed with a unified life cycle. This solves the technical problem that, under existing multi-mode clusters, each big data job scheduling platform and service is independent and the clusters are difficult to manage uniformly.

Description

Multi-mode big data job scheduling system and method based on container cluster
Technical Field
The embodiment of the invention relates to the technical field of computers, in particular to a multi-mode big data job scheduling system and method based on a container cluster.
Background
In recent years, big data analysis and processing techniques have been widely applied across industries. Typical approaches include conventional Hadoop and Spark batch processing, Spark Streaming stream processing, and Hive-based ad hoc query. These job modes differ in purpose, working manner, and principle, so companies or organizations often face the following problems: under a multi-mode cluster, each big data job scheduling platform and service is independent, making unified cluster management difficult; job management and scheduling across modes are inconsistent, and the job life cycles differ between modes; resource utilization is low and the cluster idle rate is high; large-scale distributed jobs hit communication performance bottlenecks, since communication protocols based on TCP/IP networks are usually limited by Ethernet card bandwidth.
Disclosure of Invention
In order to solve the above technical problems mentioned in the background, the present invention provides a container-cluster-based multi-mode big data job scheduling method and apparatus, aiming to improve on the scheduling of traditional multi-mode big data computation frameworks.
The invention provides a multi-mode big data job scheduling system based on a container cluster, which comprises at least one logic middleware, a resource management middleware and a container cluster;
the logic middleware is used for receiving a client's big data job request through a unified API (application programming interface), wherein the big data job request comprises basic information of the big data job;
the resource management middleware is used for performing mode-related adaptation and format processing according to the basic information of the big data job, converting the request into a resource description file supported by the container cluster, and submitting the resource description file to the container cluster;
the container cluster is used for scheduling the job of the corresponding mode using a pre-deployed Operator.
Preferably, the logic middleware is used for providing a unified programming interface for the multi-mode big data frameworks; the logic middleware is also used for providing a unified abstraction of the life cycle of multi-mode big data framework jobs.
Preferably, the resource management middleware is configured to provide resource-layer configuration for the multi-mode big data frameworks;
the resource management middleware is responsible for scheduling adaptation to the container cluster for the multi-mode big data frameworks;
the resource management middleware is also used for managing the differences in the life cycles of multi-mode big data framework jobs, converting them into the unified abstraction, and feeding the result back to the logic middleware.
Preferably, the container cluster is used for providing a containerized running environment for the multi-mode big data jobs, wherein schedulers supporting multiple big data framework modes are deployed on the container cluster.
Preferably, the resource management middleware is further configured to provide high-performance network resource allocation for big data framework jobs;
accordingly, each worker node of the container cluster includes an RDMA network card, and the container cluster provides high-performance network communication between containers in a high-performance network mode.
In addition, to achieve the above object, the present invention provides a container-cluster-based multi-mode big data job scheduling method, comprising the following steps:
the logic middleware receives a client's big data job request through a unified API (application programming interface), wherein the big data job request comprises basic information of the big data job;
the resource management middleware performs mode-related adaptation and format processing according to the basic information of the big data job, converts the request into a resource description file supported by the container cluster, and submits it to the container cluster;
the container cluster schedules the job of the corresponding mode using a pre-deployed Operator.
Preferably, the step in which the resource management middleware performs mode-related adaptation and format processing according to the basic information of the big data job, converts it into a resource description file supported by the container cluster, and submits it to the container cluster includes:
the logic middleware converts the request into a unified abstract model and submits it to the resource management middleware;
the resource management middleware converts the unified job abstract model into the resource configuration of the corresponding big data scheduling plug-in on the container cluster according to the concrete big data framework;
the resource management middleware sends the converted job resource request to the container cluster;
correspondingly, the step in which the container cluster schedules the job of the corresponding mode using a pre-deployed Operator specifically includes:
the container cluster creates a containerized job with the corresponding Operator according to the parameters in the job resource request;
and the resource manager queries the container cluster for the corresponding job running status, converts the job state into the unified abstract job lifecycle state, and feeds it back to the logic middleware.
Optionally, the scheduling method further includes:
the logic middleware receives a client's request to stop a big data framework job through the unified API;
the logic middleware converts the stop request into a unified abstract model and submits it to the resource management middleware;
the resource management middleware converts the unified job model into the resource configuration of the corresponding big data scheduling plug-in on the container cluster according to the concrete big data framework abstract model;
the resource management middleware sends the converted job resource request to the container cluster;
the container cluster stops the corresponding big data job with the corresponding Operator according to the parameters in the job resource request, and releases the container cluster resources;
and the resource management middleware obtains the stop result of the big data job from the container cluster, converts it into the unified abstract job lifecycle state, and feeds it back to the logic middleware.
Optionally, the scheduling method further includes:
the logic middleware receives a client's query request for the big data framework job state through the unified API;
the logic middleware converts the query request into a unified abstract model and submits it to the resource management middleware;
the resource management middleware converts the unified job model into the resource configuration of the corresponding big data scheduling plug-in on the container cluster according to the concrete big data framework;
the resource management middleware sends the converted job resource request to the container cluster to obtain the resource state;
and the resource management middleware converts the job state obtained from the container cluster into the unified abstract job lifecycle state and feeds it back to the logic middleware (a sketch of such a state mapping follows).
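To make the idea of a unified abstract job lifecycle state concrete, the block below is a minimal sketch of how framework-specific states might be mapped onto one common set of states. The state names and groupings are illustrative assumptions, not the mapping defined in the patent.

```yaml
# Hypothetical mapping from framework-specific job states to a unified lifecycle
# (state names and groupings are assumptions for illustration only)
unifiedStates: [PENDING, RUNNING, SUCCEEDED, FAILED, STOPPED]
mappings:
  spark:            # states reported by a Spark batch job
    SUBMITTED: PENDING
    RUNNING: RUNNING
    COMPLETED: SUCCEEDED
    FAILED: FAILED
  flink:            # states reported by a Flink streaming job
    CREATED: PENDING
    RUNNING: RUNNING
    FINISHED: SUCCEEDED
    FAILED: FAILED
    CANCELED: STOPPED
  hive:             # states of an ad hoc query job
    QUEUED: PENDING
    EXECUTING: RUNNING
    FINISHED: SUCCEEDED
    ERROR: FAILED
```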
Preferably, the resource manager sets high-performance network parameters in the mode-dependent job resource configuration; each worker node of the container cluster comprises an RDMA network card;
correspondingly, the scheduling method further comprises the following steps:
the logic middleware provides the client with unified job template parameters to declare the requirement for high-performance network resources;
the resource management middleware configures high-performance network resources for the job and submits them to the container cluster;
and the container cluster allocates the corresponding high-performance network resources to the job when the job runs.
The invention has the following beneficial effects: by introducing a container cluster orchestration intermediate layer, multi-mode big data jobs and clusters are scheduled through a unified mechanism and managed with a unified life cycle, which addresses the difficulty of uniformly managing clusters when each big data job scheduling platform and service is independent.
Drawings
FIG. 1 is a schematic diagram of an architecture of a container cluster-based multi-mode big data job scheduling method and system according to an embodiment of the present application;
FIG. 2 is a schematic overall workflow diagram of a container-based multi-mode big data job scheduling system according to an embodiment of the present application;
fig. 3 is a schematic diagram of a job management logic middleware according to an embodiment of the present application.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
In order to make the technical scheme of the invention more clear, the invention is further described in detail with reference to the attached drawings and the embodiment. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention.
Noun explanation
In the context of the high-performance network described herein, the following terms are used as described below.
RDMA (Remote Direct Memory Access): a remote direct memory access technique; a high-performance network communication protocol characterized by high bandwidth and low latency. A network card supporting this protocol is called an RDMA network card.
InfiniBand: a generic term for a class of high-performance network hardware (network cards, switches, communication links, etc.) and the corresponding software stack. This network supports the RDMA protocol and is therefore one type of RDMA network.
RoCE (RDMA over Converged Ethernet): an RDMA network over converged Ethernet; a class of high-performance network hardware (network cards, switches, communication links, etc.) and the corresponding software stack. This network supports the RDMA protocol and is therefore one type of RDMA network.
RDMA communication memory: memory space that satisfies the technical requirements of RDMA data communication over an RDMA network. It is physically equivalent to ordinary memory space, but must meet constraints such as page locking.
Kubernetes Operator: a method of packaging, deploying, and managing a Kubernetes application; at the same time it is an application-specific controller that can extend the Kubernetes API and use custom resources to create, configure, and manage complex containerized applications.
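As background for how an Operator is driven, the snippet below is a minimal sketch of the kind of custom resource an Operator watches. The kind "BigDataJob" and its fields are invented here purely for illustration; they are not a real CRD and do not come from the patent.

```yaml
# A hypothetical custom resource of the kind an Operator watches and reconciles
# ("BigDataJob" and its fields are invented for illustration)
apiVersion: example.org/v1alpha1
kind: BigDataJob
metadata:
  name: demo-job
spec:
  mode: spark-batch      # which big data framework mode to run
  image: spark:3.1.1     # container image used for the job
  replicas: 2            # number of worker containers
```

A deployed Operator registers such a kind via a CustomResourceDefinition and then creates, monitors, and tears down the underlying Pods whenever an instance of the resource is created, updated, or deleted.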
It can be appreciated that conventional multi-mode big data computation scheduling mechanisms are generally as follows:
1. The big data framework of each mode runs as an independent cluster, for example a standalone Spark cluster and a standalone Flink cluster.
2. The life cycle of a job in each mode is different and is defined by the mode itself.
3. Traditional big data frameworks are typically deployed and run on physical machines and virtual machine nodes.
4. The network layer of a traditional big data framework usually works over a TCP/IP network.
The mechanism provided by the invention comprises a container-cluster-based multi-mode big data job scheduling system and method. The container-cluster-based multi-mode big data job scheduling system comprises at least one container cluster, a logic middleware, and a resource management middleware; preferably, the system may include a high-performance network such as RDMA. The container-cluster-based multi-mode big data job scheduling method comprises at least a job start method, a job stop method, and a job state query method.
Referring to fig. 1, fig. 1 is a schematic architecture diagram of the container-cluster-based multi-mode big data job scheduling method and system according to an embodiment of the present disclosure. The user submits a unified big data job request through the REST API provided by the logic middleware, where the job request includes basic information of the job. The resource management middleware performs mode-related adaptation and format processing according to the job information carried by the request, converts it into a resource description file supported by the container cluster, and submits it to the container cluster; the container cluster then schedules the job of the corresponding mode using a pre-deployed Operator.
Referring to fig. 1, an important feature of the embodiment of the present application is the use of a container cluster to run multi-mode big data jobs. In this embodiment, batch, streaming, and ad hoc query big data jobs, represented by Spark, Flink, and Hive respectively, are executed on a Kubernetes container cluster.
Referring to fig. 2, fig. 2 is a schematic diagram of the overall workflow of a container-based multi-mode big data job scheduling system according to an embodiment of the present application; the overall workflow is as follows (an illustrative request payload is sketched after the list):
1. The client submits a REST request for scheduling a big data framework job of a certain mode to the logic middleware.
2. The logic middleware obtains the parameters from the request and converts them into the unified big data framework job model.
3. The logic middleware submits the job request to the resource management middleware via its REST API.
4. The resource management middleware parses the job mode and job parameters from the received job request.
5. The resource manager converts the parsed job parameters into the Yaml format of the corresponding mode's job.
6. The resource manager submits the Yaml to the Kubernetes container cluster.
7. The container cluster uses the pre-deployed Operator component to process the resource request corresponding to the Yaml and schedule the job, starting the job containers.
8. The resource management middleware requests the running state of the submitted job from the container cluster.
9. The resource management middleware feeds the obtained job state back to the logic middleware.
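The following is a minimal sketch of what the unified job submission payload in step 1 could look like. The endpoint path, field names, and values are assumptions made for illustration; the patent only states that the request carries basic job information in a unified abstract format.

```yaml
# Hypothetical POST body for submitting a Spark batch job through the logic middleware
# (endpoint and field names are invented for illustration)
# POST /api/v1/jobs
job:
  name: sales-daily-etl
  mode: spark-batch            # e.g. spark-batch | spark-streaming | flink | hive-query
  image: spark:3.1.1
  mainApplicationFile: local:///opt/jobs/etl.jar
  resources:
    driver:   { cpu: 1, memory: 1Gi }
    executor: { cpu: 2, memory: 4Gi, instances: 3 }
```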
Referring to fig. 3, fig. 3 is a schematic diagram of the job management logic middleware according to an embodiment of the present application. The logic middleware in this embodiment is a Java middleware developed with Spring Boot; it exposes an access interface to end users in the form of a REST API, and its main functions include:
(1) The unified REST API provides the client with management of multi-mode big data jobs; specifically, the following interfaces and functions are realized in the embodiment of the application:
-providing a REST API that submits jobs in a uniform abstract data format;
-providing a REST API for obtaining job status in a uniform abstract data format;
-providing a REST API that stops jobs in a uniform abstract data format;
-format conversion of external requests to internal unified job model is handled internally;
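A sketch of how these three interfaces might be laid out as REST endpoints follows; the paths reuse the hypothetical /api/v1/jobs prefix introduced above and are not taken from the patent.

```yaml
# Hypothetical REST surface of the logic middleware (illustrative only)
endpoints:
  - method: POST
    path: "/api/v1/jobs"            # submit a job in the unified abstract format
  - method: GET
    path: "/api/v1/jobs/{jobId}"    # get job status in the unified abstract format
  - method: DELETE
    path: "/api/v1/jobs/{jobId}"    # stop a job
```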
(2) A unified job abstraction and life cycle abstraction are defined internally; concretely, the job abstraction realized by the embodiment of the application is as follows. It should be understood that in practical applications, the uniformly abstracted big data job format may include, but is not limited to, the following:
[Unified job abstraction field listing: provided as an image in the original publication and not reproduced in this text.]
the specific uniform abstract format for deep learning in the embodiment of the present application is described in JSON as follows:
[JSON listing of the unified abstract format: provided as images in the original publication and not reproduced in this text.]
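Because the JSON listing is only available as an image, the block below is a speculative reconstruction of what a unified abstract job description could contain, written as YAML for readability. All field names are assumptions made for illustration, not the patent's schema.

```yaml
# Speculative sketch of a unified abstract job description (field names invented for illustration)
name: demo-training-job
mode: deep-learning            # job mode identifier
image: example/dl-runtime:1.0  # container image (placeholder)
entrypoint: python train.py
resources:
  worker:
    replicas: 2
    cpu: 4
    memory: 8Gi
    gpu: 1
network:
  type: rdma                   # optional declaration of high-performance network needs
inputs:
  - dataset: /data/train
outputs:
  - model: /data/output/model
```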
(3) And calling a resource management middleware interface, and sending the unified operation request to the resource management middleware.
Please refer to the schematic diagram of the resource management middleware provided in an embodiment of the present application. The resource management middleware in this embodiment is a Java middleware developed with Spring Boot; it exposes an access interface to the job logic management middleware in the form of a REST API, and its main functions are:
1. Provide REST APIs that manage the job life cycle in the unified abstract big data job format, including but not limited to creating jobs, getting job status, and stopping jobs.
2. Convert the unified abstract big data job format into the resource description of the corresponding mode; specifically, in the embodiment of the application, it is converted into the corresponding big data job format supported by the container cluster. The converted batch-type job Yaml and ad hoc query-type job Yaml in the embodiment of the present application are as follows:
Spark batch job:
[Spark batch job Yaml: provided as images in the original publication and not reproduced in this text.]
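For orientation, the following is a sketch of a SparkApplication resource in the format used by the community spark-on-k8s-operator, which the embodiment states it deploys. The concrete values (image, class, sizes) are illustrative and are not the Yaml shown in the patent.

```yaml
# Sketch of a spark-on-k8s-operator SparkApplication (values are illustrative)
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi
  namespace: bigdata
spec:
  type: Scala
  mode: cluster
  image: gcr.io/spark-operator/spark:v3.1.1
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.1.1.jar
  sparkVersion: "3.1.1"
  restartPolicy:
    type: Never
  driver:
    cores: 1
    memory: 512m
    serviceAccount: spark
  executor:
    instances: 2
    cores: 1
    memory: 512m
```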
Hive instant query job
[Hive ad hoc query job Yaml: provided as images in the original publication and not reproduced in this text.]
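Because there is no single standard Operator for Hive, the sketch below shows one plausible containerized form of an ad hoc query job: a plain Kubernetes Job that runs a query through beeline. This is an assumption for illustration, not the Yaml used in the patent; the image and connection string are placeholders.

```yaml
# Hypothetical containerized Hive ad hoc query as a Kubernetes Job (illustrative only)
apiVersion: batch/v1
kind: Job
metadata:
  name: hive-adhoc-query
  namespace: bigdata
spec:
  backoffLimit: 0
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: hive-client
        image: apache/hive:3.1.3           # placeholder image
        command: ["beeline"]
        args:
        - "-u"
        - "jdbc:hive2://hive-server:10000" # placeholder HiveServer2 address
        - "-e"
        - "SELECT count(*) FROM demo_table"
```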
3. The converted mode-related job Yaml is submitted to the Kubernetes container cluster, and the container cluster starts scheduling and running the job.
4. The resource management middleware of the embodiment obtains the state of the submitted job through the API provided by the Kubernetes container cluster, converts it into the unified abstraction, and feeds it back to the logic middleware.
An important feature of the embodiment of the present application is that a container cluster is used to run the big data jobs; the container cluster in this embodiment adopts Kubernetes, the container orchestration platform that is mainstream in the industry. Kubernetes is open-source container orchestration platform software and provides good extension mechanisms, such as the Operator mechanism, for extending its own capabilities. In the embodiment of the application, several community open-source Operator plug-ins are deployed, giving the cluster the ability to adapt to multi-mode big data framework jobs. Specifically, the method comprises the following steps:
1. The Kubernetes cluster of this embodiment is provided with the spark-on-k8s-operator open-sourced by Google and two kinds of Operator plug-ins open-sourced by Huawei, so as to drive the corresponding multi-mode big data framework jobs respectively.
2. After receiving a job request in Yaml format submitted by the resource management middleware, the Kubernetes container cluster of this embodiment matches the job request to an installed Operator for task scheduling, and that Operator starts scheduling the corresponding big data framework job task according to the parameters in the specific Yaml.
3. After the Operator finishes scheduling, Kubernetes creates a big data framework job cluster in containers and runs the job; after the job finishes running, the containers are destroyed.
Preferably, the embodiment also deploys high-performance network support components. In this embodiment, the work content and flow of this part are as follows:
1. Deploy and configure high-performance network hardware and plug-ins on the container cluster:
(1) In the embodiment of the application, the nodes of the container cluster are equipped with high-performance network cards; this embodiment is equipped with RoCE and InfiniBand network cards.
(2) In the container cluster of the embodiment, a self-developed Kubernetes device plugin is deployed to communicate with the high-performance network card on each node.
2. When a user submits a job, the required high-performance hardware is declared in the unified abstract format; specifically, the Yaml definition by which a big data job declares high-performance network resources in the embodiment of the present application is as follows:
[Yaml declaration of high-performance network resources: provided as images in the original publication and not reproduced in this text.]
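As a stand-in for the missing listing, here is a speculative sketch of how a job template might declare its need for a high-performance network in the unified abstract format; the field names (network.type, network.hca) are invented for illustration.

```yaml
# Hypothetical unified job template fragment declaring RDMA resources (field names invented)
job:
  name: spark-etl-rdma
  mode: spark-batch
  network:
    type: rdma        # request a high-performance (RDMA) network
    hca: 1            # number of RDMA host channel adapters per worker
```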
3. When the resource management middleware renders the resource template, it converts the high-performance network resources declared in the unified abstraction into a concrete container cluster resource configuration; specifically, the configuration in the embodiment of the present application is as follows:
[Container cluster resource configuration for the high-performance network: provided as images in the original publication and not reproduced in this text.]
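The following sketch shows one common way such a rendered configuration can look on Kubernetes: the pod requests an extended resource exposed by an RDMA device plugin and the IPC_LOCK capability needed to pin memory. The resource name rdma/hca and the other values are assumptions; the actual names depend on the deployed device plugin and are not given in this text.

```yaml
# Hypothetical rendered pod spec fragment requesting RDMA resources
# (resource name and values are assumptions; they depend on the deployed device plugin)
apiVersion: v1
kind: Pod
metadata:
  name: spark-executor-rdma
spec:
  containers:
  - name: executor
    image: spark:3.1.1
    resources:
      limits:
        cpu: "4"
        memory: 8Gi
        rdma/hca: 1          # extended resource advertised by the RDMA device plugin
    securityContext:
      capabilities:
        add: ["IPC_LOCK"]    # allow page-locking memory for RDMA communication
```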
The beneficial effects of this embodiment are:
Unified management of multi-mode big data job clusters is achieved through containerized cluster orchestration and Operator management, effectively reducing operation and maintenance costs.
Consistent scheduling management of multi-mode big data jobs is achieved through a unified abstraction mechanism, shielding users from the differences between the life cycles of multi-mode big data jobs; end users can submit jobs with a unified UI or API.
Based on the unified abstraction and Operator-driven implementation, it is easier to extend support for big data framework jobs of other modes in the future; only the corresponding Operator needs to be implemented and deployed.
By using container cluster orchestration, the cluster environment for running a big data job does not need to be deployed in advance; instead, it is bound to the life cycle of the big data job and deleted when the job is done, greatly improving resource utilization.
The system provides high-performance network support for big data jobs in a manner transparent to users: a user only needs to declare the need for a high-performance network when submitting a job, and the system automatically performs driver-layer configuration and allocates the related resources, resolving the communication bottleneck in large-scale distributed job scenarios.
It is to be noted that the foregoing description is only exemplary of the invention and that the principles of the technology may be employed. Those skilled in the art will appreciate that the present invention is not limited to the particular embodiments described herein, and that various obvious changes, rearrangements and substitutions will now be apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A multi-mode big data job scheduling system based on a container cluster is characterized in that the system comprises at least one logic middleware, a resource management middleware and a container cluster;
the logic middleware is used for receiving a client big data job request by using a uniform API (application programming interface), wherein the big data job request comprises basic information of the big data job;
the resource management middleware is used for carrying out mode-related adaptation and format processing according to the basic information of the big data operation, converting the mode-related adaptation and format processing into a resource description file supported by the container cluster and submitting the resource description file to the container cluster;
the container cluster is used for processing the scheduling of the operation of the corresponding mode by using an Operator which is deployed in advance.
2. The scheduling system of claim 1, wherein the logic middleware is configured to provide a unified programming interface for the multi-mode big data framework jobs; the logic middleware is also configured to provide a unified abstraction of the life cycle of the multi-mode big data framework jobs.
3. The scheduling system of claim 1 wherein the resource management middleware is to provide resource layer configuration for multi-mode big data frame jobs;
the resource management middleware is used for being responsible for scheduling adaptation to the container cluster aiming at multi-mode big data frame operation;
the resource management middleware is also used for managing the difference, configuration and conversion of the life cycle of the multi-mode big data frame operation to be unified abstraction, and feeding back the unified abstraction to the logic middleware.
4. The scheduling system of claim 1 wherein the container cluster is configured to provide a containerized runtime environment for multi-mode big data jobs, wherein a scheduler supporting multiple big data framework modes is deployed on the container cluster.
5. The scheduling system of claim 4 wherein the resource management middleware is further operable to provide resource configuration for high performance networks for big data frame jobs;
accordingly, each working node of the container cluster comprises an RDMA network card, and the container cluster is used for providing high-performance network communication among containers in a high-performance network mode.
6. A multi-mode big data job scheduling method based on a container cluster is characterized by comprising the following steps:
the method comprises the steps that a logic middleware receives a client big data job request through a uniform API (application programming interface), wherein the big data job request comprises basic information of a big data job;
performing mode-related adaptation and format processing by the resource management middleware according to the basic information of the big data operation, converting the mode-related adaptation and format processing into a resource description file supported by the container cluster, and submitting the resource description file to the container cluster;
and processing the scheduling of the corresponding mode job by the container cluster by using a pre-deployed Operator.
7. The scheduling method according to claim 6, wherein the step of performing, by the resource management middleware, mode-dependent adaptation and format processing according to the basic information of the big data job, converting the result into the resource description file supported by the container cluster, and submitting the resource description file to the container cluster includes:
converting the request into a uniform abstract model by the logic middleware and submitting the uniform abstract model to the resource management middleware;
converting the uniform operation abstract model into resource allocation of a corresponding big data scheduling plug-in on a container cluster by the resource management middleware according to concrete big data frame operation information;
the resource management middleware sends the converted job resource request to the container cluster;
correspondingly, the step of scheduling, by the container cluster, the big data job in the corresponding mode by using a pre-deployed Operator specifically includes:
creating a container operation job by the container cluster according to the parameters in the job resource request by using a corresponding Operator;
and querying, by the resource manager, the container cluster for the job running status, converting the job state into the unified abstract job lifecycle state, and feeding it back to the logic middleware.
8. The scheduling method of claim 7, wherein the scheduling method further comprises:
the logic middleware receives a stop request of a client for big data frame operation through the uniform API;
the logic middleware converts the stop request into a uniform abstract model and submits the uniform abstract model to the resource management middleware;
converting, by the resource management middleware, the unified big data job model into the resource configuration of the corresponding big data scheduling plug-in on the container cluster according to the concrete big data job abstraction;
sending, by the resource management middleware, the converted job resource request to the container cluster;
the container cluster stops corresponding big data operation by using a corresponding Operator according to parameters in the operation resource request, and releases container cluster resources;
and the resource management middleware acquires a stopping result of the big data operation from the container cluster, converts the stopping result into a life state of uniform operation abstraction and feeds the life state back to the logic middleware.
9. The scheduling method of claim 7, wherein the scheduling method further comprises:
the logic middleware receives a query request of a client for a big data operation state through the uniform API;
converting the query request into a uniform abstract model by the logic middleware and submitting the uniform abstract model to the resource management middleware;
the resource management middleware converts a uniform operation model into resource allocation of a corresponding big data scheduling plug-in on a container cluster according to specific big data operation;
the resource management middleware sends the converted job resource request to the container cluster to acquire the state of a job container;
and converting the operation container state acquired from the container cluster into a life state of unified operation abstraction by the resource management middleware, and feeding back the life state to the logic middleware.
10. The scheduling method of claim 6, wherein the resource manager sets high-performance network parameters in the mode-dependent big data job resource configuration; each worker node of the container cluster comprises an RDMA network card;
correspondingly, the scheduling method further comprises the following steps:
the logic middleware provides the client with unified big data job template parameters to declare the requirement for high-performance network resources;
configuring, by the resource management middleware, high-performance network resources for the job, and submitting them to the container cluster;
and allocating, by the container cluster, the corresponding high-performance network resources to the job when the big data job runs.
CN202210491445.9A · 2022-05-07 · 2022-05-07 · Multi-mode big data job scheduling system and method based on container cluster · Active · CN115242786B (en)

Priority Applications (1)

Application Number · Priority Date · Filing Date · Title
CN202210491445.9A (CN115242786B (en)) · 2022-05-07 · 2022-05-07 · Multi-mode big data job scheduling system and method based on container cluster

Applications Claiming Priority (1)

Application Number · Priority Date · Filing Date · Title
CN202210491445.9A (CN115242786B (en)) · 2022-05-07 · 2022-05-07 · Multi-mode big data job scheduling system and method based on container cluster

Publications (2)

Publication Number · Publication Date
CN115242786A (en) · 2022-10-25
CN115242786B (en) · 2024-01-12

Family

ID=83668078

Family Applications (1)

Application Number · Priority Date · Filing Date · Title
CN202210491445.9A (Active, CN115242786B (en)) · 2022-05-07 · 2022-05-07 · Multi-mode big data job scheduling system and method based on container cluster

Country Status (1)

Country · Link
CN (1) · CN115242786B (en)


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
US20030028645A1 (en)* · 2001-08-06 · 2003-02-06 · Emmanuel Romagnoli · Management system for a cluster
US20150095917A1 (en)* · 2013-09-27 · 2015-04-02 · International Business Machines Corporation · Distributed uima cluster computing (ducc) facility
US20180300174A1 (en)* · 2017-04-17 · 2018-10-18 · Microsoft Technology Licensing, Llc · Efficient queue management for cluster scheduling
US20190228303A1 (en)* · 2018-01-25 · 2019-07-25 · Beijing Baidu Netcom Science And Technology Co., Ltd. · Method and apparatus for scheduling resource for deep learning framework
US20220004431A1 (en)* · 2018-07-12 · 2022-01-06 · Vmware, Inc. · Techniques for container scheduling in a virtual environment
US20200394120A1 (en)* · 2019-06-13 · 2020-12-17 · Paypal, Inc. · Big data application lifecycle management
CN110737529A (en)* · 2019-09-05 · 2020-01-31 · 北京理工大学 · Cluster scheduling adaptive configuration method for short-time multiple variable-size data jobs
CN112445590A (en)* · 2020-10-15 · 2021-03-05 · 北京仿真中心 · Computing resource access and scheduling system and method
CN113065848A (en)* · 2021-04-02 · 2021-07-02 · 东云睿连(武汉)计算技术有限公司 · Deep learning scheduling system and scheduling method supporting multi-class cluster back end
CN114374609A (en)* · 2021-12-06 · 2022-04-19 · 东云睿连(武汉)计算技术有限公司 · Deep learning operation running method and system based on RDMA equipment
CN114356549A (en)* · 2021-12-08 · 2022-04-15 · 上海浦东发展银行股份有限公司 · Method, device and system for scheduling container resources in multi-container cluster
CN114327770A (en)* · 2021-12-30 · 2022-04-12 · 东云睿连(武汉)计算技术有限公司 · Container cluster management system and method

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number · Priority date · Publication date · Assignee · Title
CN118260036A (en)* · 2022-12-28 · 2024-06-28 · 青岛中科曙光科技服务有限公司 · A method, system and medium for processing Flink jobs

Also Published As

Publication number · Publication date
CN115242786B (en) · 2024-01-12


Legal Events

Code · Title
PB01 · Publication
SE01 · Entry into force of request for substantive examination
GR01 · Patent grant
