US9692676B2 - Scalable off-load of applications from switch to server - Google Patents

Scalable off-load of applications from switch to server

Info

Publication number
US9692676B2
US9692676B2 (also published as US 9692676 B2; application US13/536,635, US201213536635A)
Authority
US
United States
Prior art keywords
switch
server
control server
state information
application
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related, expires
Application number
US13/536,635
Other versions
US20140006594A1 (en)
Inventor
Cynthia Ann Gabriel
Bill Shao
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US13/536,635 (US9692676B2)
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignment of assignors interest (see document for details). Assignors: SHAO, BILL; GABRIEL, CYNTHIA ANN
Priority to US14/074,357 (US10033615B2)
Publication of US20140006594A1
Application granted
Publication of US9692676B2
Legal status: Expired - Fee Related
Adjusted expiration


Abstract

Systems include a network element with a plurality of server blades, including a control server and one or more application servers. The control server communicates with each application server to acquire state information about each application server. A switch is in communication with each server blade. The switch is configured to communicate with the control server in order to obtain therefrom the state information acquired by the control server about each application server.

Description

FIELD OF THE INVENTION
The invention relates generally to data centers and data processing in data centers comprised of switches and servers. More particularly, the invention relates to off-loading computing processes from switches to servers.
BACKGROUND
Data centers are generally centralized facilities that provide Internet and intranet services needed to support businesses and organizations. A typical data center can house various types of electronic equipment, such as computers, servers (e.g., email servers, proxy servers, and DNS servers), switches, routers, data storage devices, and other associated components. Within the data center, switches often monitor the status of other components, such as the servers, in order to make packet forwarding decisions. Limitations on the computing power of a switch, however, can constrain the number of servers that the switch is able to monitor. Not only do the limitations of the switch pose a performance bottleneck, but they can also limit the scalability of the data center.
SUMMARY
In one aspect, the invention features a network element comprising a plurality of server blades including a control server and one or more application servers. The control server communicates with each application server to acquire state information about each application server. The network element further comprises a switch in communication with each server blade. The switch is configured to communicate with the control server in order to obtain therefrom the state information acquired by the control server about each of the application servers.
In another aspect, the invention features a data center comprising a plurality of interconnected network elements. Each network element comprises a switch and a plurality of server blades including one or more application servers. The server blades of a given one of the network elements further include a control server. The control server communicates with each application server of each network element to acquire state information about that application server. The switch of the given network element is in communication with each server of the plurality of server blades of the given network element. The switch of the given network element is configured to communicate with the control server in order to obtain therefrom the state information acquired by the control server about each of the application servers of each network element.
BRIEF DESCRIPTION OF THE DRAWINGS
The above and further advantages of this invention may be better understood by referring to the following description in conjunction with the accompanying drawings, in which like numerals indicate like structural elements and features in various figures. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating the principles of the invention.
FIG. 1 is an embodiment of a networking environment including a data center, client systems, and a management station.
FIG. 2 is a diagram of an embodiment of the data center including a plurality of interconnected network elements.
FIG. 3 is a flow diagram of an embodiment of a process of configuring a switch of a network element such that a computing process, for example, health checking, is offloaded to a control server of the network element.
FIG. 4 is a flow diagram of an embodiment of a process by which the switch performs a health check of the control server.
FIG. 5 is a flow diagram of an embodiment of a process by which the control server acquires state information from the application servers and updates the state information held by the switch.
FIG. 6 is a flow diagram of an embodiment of a process by which the control server can increase, decrease, or leave unchanged the number of enabled application servers in the data center in response to the workload of the application servers.
DETAILED DESCRIPTION
Described herein are systems and methods for offloading computing processes, for example, server health checking, from a switch, which generally has limited CPU and memory capacity, to one or more server blades. The offloading reduces the computing load on the CPU of the switch by enabling a control server to monitor the application servers on behalf of the switch and to provide asynchronous status updates directly to the switch. Accordingly, the switch does not perform any health checks of the application servers because the control server performs this role. The role of the switch is to report its currently held view of the state information of the application servers to the control server on a periodic basis. If the current views of the control server and the switch are out-of-sync, the control server sends operational requests to the switch to update the switch with the latest state information about the application servers.
The offloading thus achieves a division of labor, as the CPU-intensive work of the health check operation is offloaded from the switch to the control server, leaving the switch to apply its CPU capacity to other operations, for example, traffic shaping. As a result, the offloading removes a performance bottleneck at the switch and achieves a scalability that enables more server blades/application servers and server appliances to be added to the data center than could otherwise be handled by the switch alone.
FIG. 1 shows an embodiment of a networking environment 2 including a data center 10 in communication with a plurality of client systems 4 and a management station 6 over a network 8. In general, the client systems 4 are computer systems running applications that require the services of the data center 10. The management station 6 provides a centralized point of administration for managing the data center 10. The client systems 4 and the management station 6 communicate with the data center 10 over the network 8, embodiments of which include, but are not limited to, local-area networks (LAN), metro-area networks (MAN), and wide-area networks (WAN), such as the Internet or World Wide Web. Although shown external to the data center 10, in one embodiment the management station 6 may be part of the data center 10.
The data center 10 is generally a facility that houses various computers, routers, switches, and other associated equipment in support of applications and data that are integral to the operation of a business, organization, or other entity. The data center 10 includes a plurality of network elements 12 interconnected into a network 14. Each of the network elements 12 is an independent (standalone) electronic enclosure or chassis. The data center 10 can have fewer or more than the three network elements 12 shown. In addition, embodiments of the data center 10 may be at a single site or distributed among multiple sites.
In brief overview, the network elements 12 of the data center 10 provide switching and server services for packet traffic from the client systems 4. Through an external management agent 16 running on the management station 6, an administrator of the data center 10 can communicate with one or more of the network elements 12 in order to statically configure the network elements 12 to offload certain computing processes, conventionally performed by a switch, to certain servers, as described in more detail below. The configuration can also operate to adjust the server capacity dynamically for supporting services.
The management station 6 can connect directly (point-to-point) or indirectly to a given network element 12 of the data center 10 over one of a variety of connections, such as standard telephone lines, digital subscriber line (DSL), asynchronous DSL, LAN or WAN links (e.g., T1, T3), broadband connections (Frame Relay, ATM), and wireless connections (e.g., 802.11(a), 802.11(b), 802.11(g), 802.11(n)). Using a network protocol, such as Telnet or SNMP (Simple Network Management Protocol), the management station 6 can access a command-line interface (CLI) of the given network element 12.
FIG. 2 shows an embodiment of the data center 10 including a network element 12-1 comprised of a switch 20 and server blades (or, simply, servers) 22 in communication with the switch 20. In general, each server blade 22 is a computer that provides one or more services to the data center 10, examples of which include, but are not limited to, email servers, proxy servers, DNS (Domain Name System) servers, HTTP (HyperText Transport Protocol) servers, firewall servers, video compression servers, and deduplication servers.
The switch 20 includes ports 24-1, 24-2, 24-3, 24-4 (generally, 24) in communication with external systems (i.e., client systems, other network elements, networking equipment), and server-facing ports (not shown) in communication with the servers 22. In the example data center 10, the port 24-1 is in communication with client systems 4 over the network 8 (FIG. 1), the port 24-2 is in communication with a port of a second network element 12-2, the port 24-3 is in communication with a port of a third network element 12-3, which is daisy-chained to another network element 12-N, and the port 24-4 couples the network element 12-1 to a local network (e.g., an intranet). The switch 20 can have fewer or more than the number of ports 24 shown. In one embodiment, the switch 20 is an Ethernet switch, and the ports 24 of the switch 20 can support 10 Gb line rates. In addition, the data center 10 generally has more network elements 12 than those shown, although the network element 12-1 alone is sufficient to illustrate the principles of the invention.
In addition, the switch 20 includes a frame processor (FP) 26, a management processor (MP) or CPU 28, and a server state table 30. The switch 20 can be implemented with ASIC (Application Specific Integrated Circuit) technology on one or more semiconductor chips. In general, the frame processor 26 includes logic for forwarding packets bidirectionally between ports (e.g., port 24-1 and server-facing ports), using a packet-forwarding table (not shown) to determine the packet-forwarding behavior of the switch 20. The frame processor 26 may be implemented as hardware, software, or a combination of hardware and software. Examples of the packet-forwarding table include, but are not limited to, an L2 forwarding table, an L3 routing table, a link-aggregation (i.e., static or LACP trunk) table, an Equal Cost Multi Path (ECMP) table, and a frame/field processor (i.e., access control list) table.
The management processor 28 dynamically adds, removes, or modifies entries in the packet-forwarding table used by the frame processor 26. In addition, the management processor 28 constantly monitors the health of the server blades 22 by offloading the responsibility for performing the health checks to designated server blades 22, as described further below. Examples of such health-check mechanisms include, but are not limited to, a link-state health check, a PING health check, an ARP (Address Resolution Protocol) health check, a UDP/TCP (User Datagram Protocol/Transmission Control Protocol) health check, a service-based health check (e.g., HTTP, SMTP, SSL, SIP), and a user-scriptable health check.
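The patent does not prescribe how any of these health-check mechanisms are implemented; as a hypothetical illustration, the simplest TCP variant reduces to attempting a connection to the server's service port:

```python
import socket

def tcp_health_check(host, port, timeout=2.0):
    """Minimal TCP health check: a server counts as 'alive' if it
    accepts a TCP connection within the timeout.

    Illustrative sketch only; the parameters and pass/fail criterion
    are assumptions, not taken from the patent.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A service-based check (e.g., HTTP) would additionally validate an application-level response rather than mere connectivity.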
The management processor 28 maintains a server state table 30, which holds state information about each server blade 22 as presently viewed by the switch 20. The state information for each server blade 22 includes a MAC address, an IP address, an operational state (enabled or disabled), and an alive state (alive or failed). The alive state pertains to servers that are operationally enabled. The "operational" state and the "alive" state are managed independently. The state information ultimately affects the packet-forwarding behavior of the switch 20, because the switch 20 can forward packets only to those servers that are operationally enabled and alive. The management processor 28 may be implemented as hardware, software, or a combination of hardware and software.
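One entry of the server state table 30 can be sketched as a small record; the field names here are illustrative (the patent specifies only the fields themselves, not a layout):

```python
from dataclasses import dataclass

@dataclass
class ServerState:
    """One entry in the switch's server state table (field names assumed).

    'operational' and 'alive' are tracked independently, as described
    above; the switch forwards packets only to servers that are both
    operationally enabled and alive.
    """
    mac: str            # server MAC address
    ip: str             # server IP address
    operational: bool   # enabled (True) or disabled (False)
    alive: bool         # alive (True) or failed (False)

    def eligible_for_traffic(self):
        # Packet-forwarding eligibility combines both states.
        return self.operational and self.alive
```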
The switch 20 also includes an application program interface (API) 32 through which the offloading behavior of the switch 20 can be configured. Here, for example, the management processor 28 has the API 32. The API 32 enables external agents to communicate with the switch 20 to command the switch 20 to perform certain functions. For example, the external management agent 16 (FIG. 1) or any of the server blades 22 can use the API 32 to cause the switch 20 to update the state information kept in the server state table 30.
The API 32 provides operational commands that allow an external agent to perform certain functions directly on the switch 20 (e.g., using SNMP). One command causes the switch 20 to set the health-check alive state (enable/disable) for a specified server 22. A second command directs the switch 20 to issue an ARP request to the IP address of a specified server 22 to update the ARP entry (MAC and port) on the switch 20. All servers 22 have a defined RIP (real IP address) in the switch configuration and an L3 network interface with which to connect. Other commands cause the switch 20 to enable or disable a specified server 22. In addition, the API 32 has a health-check option that allows the switch 20 to prepare a packet containing the switch's current view of the state of all servers 22 and to send the packet to a designated server 22 at a configurable interval, as described in more detail below.
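The command set of the API 32 can be sketched as follows. This is an assumption-laden model: a real implementation would carry these commands over SNMP to switch hardware, whereas here they simply mutate an in-memory state table keyed by server IP, and all method names are hypothetical.

```python
class SwitchAPI:
    """Sketch of the operational commands exposed by the API 32.

    state_table maps server IP -> dict with keys 'mac', 'port',
    'operational', 'alive' (shape assumed for illustration).
    """
    def __init__(self, state_table):
        self.state_table = state_table

    def set_alive(self, ip, alive):
        # Command 1: set the health-check alive state for a server.
        self.state_table[ip]["alive"] = alive

    def set_operational(self, ip, enabled):
        # Enable or disable a specified server.
        self.state_table[ip]["operational"] = enabled

    def arp_update(self, ip, mac, port):
        # Command 2: in the patent this triggers an ARP request to the
        # server's IP; here we just record the refreshed MAC and port.
        self.state_table[ip]["mac"] = mac
        self.state_table[ip]["port"] = port
```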
Each of the server blades 22 includes a software module that determines the role of the server blade 22 in the health-check process. One or more of the servers 22 runs control-service (CS) software 38; hereafter, such server blades are referred to as control servers 40. The other servers 22 run application-service (AS) software 42; hereafter, such servers 22 are referred to as application servers 44. Through execution of the CS software 38, the control server 40 performs its role of actively monitoring and directly setting the states of the application servers 44 and reporting updates of these states to the switch 20. Through execution of the AS software 42, an application server 44 performs multiple functions, such as responding to health-check requests to communicate its current application state, in addition to its role of providing one or more services to the data center 10, examples of which include, but are not limited to, email servers, proxy servers, DNS (Domain Name System) servers, HTTP (HyperText Transport Protocol) servers, firewall servers, video compression servers, and deduplication servers.
The other network elements 12-2, 12-3, 12-N include other application servers 44 with which a control server 40 of network element 12-1 communicates in order to acquire state information. The switches 20′ of these network elements 12-2, 12-3, 12-N can lack the "intelligence" of the switch 20 of the network element 12-1, operating simply as packet-forwarding devices. Although one or more server blades of such network elements 12-2, 12-3, 12-N can be configured as a control server 40, preferably the control server 40 is implemented in the same chassis as the intelligent switch 20 to enhance the reliability of transmissions and the synchronization of state information between the switch 20 and the control server 40, as described in more detail below.
FIG. 3 shows an embodiment of a process 100 for configuring the switch 20 to offload certain computing processes (e.g., health checking) from the switch 20 to the control server 40. In general, the role of the switch 20 in the offloaded process becomes that of periodically reporting to the control server 40 its current view of the state information about the application servers 44. An administrator statically configures the switch 20 to perform this role before runtime. When configuring the switch 20, the administrator defines (step 102) a group of application servers 44 (hereafter, the application server group). In general, the switch 20 does not perform a designated offloaded computing process (e.g., a health check) in connection with servers placed in this group. Options available to the administrator include adding and removing servers from a group, naming the group, and selecting the offloaded process (e.g., the type of health check to be performed). An application server 44 cannot be a member of more than one application server group.
The administrator also defines (step 104) a group of control servers 40 (hereafter, the control server group). In general, the switch 20 performs health checks on servers placed in this group (as described in connection with FIG. 4). The administrator associates (step 106) this control server group with the application server group. By this association, the administrator makes the control servers 40 in the control server group responsible for communicating with each of the application servers in the application server group for purposes of performing the offloaded computing process. If multiple control servers 40 are added to the control server group, the control servers share responsibility for managing the associated application server group and resolve amongst themselves any potential contention in the performance of this responsibility.
In addition, the administrator configures (step 108) the switch 20 for controlling communications with control servers 40 in the control server group. This configuration can define parameters affecting, for example, the frequency of status notifications to be sent from the switch 20 to a control server 40 of the control server group, error detection, and error handling. After completion of the configuration, the switch 20 runs (step 110).
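Steps 102 through 108 can be condensed into a single configuration sketch. The data layout below is entirely hypothetical (the patent describes the steps, not a configuration format); it simply captures the AS group, the CS group, their association, and the notification parameters:

```python
def configure_offload(switch_config, as_members, cs_members, notify_interval_s=5):
    """Static pre-runtime configuration sketch (structure assumed):
    step 102 defines the application server group, step 104 the control
    server group, step 106 associates them, and step 108 sets the
    parameters governing switch-to-control-server status notifications.
    """
    switch_config["as_group"] = {"members": set(as_members)}
    switch_config["cs_group"] = {"members": set(cs_members)}
    switch_config["association"] = ("cs_group", "as_group")
    switch_config["notify_interval_s"] = notify_interval_s
    return switch_config
```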
FIG. 4 shows an embodiment of a process 120 by which the switch 20 performs a health check of the control server 40 in the control server group. At step 122, the configured switch 20 begins operating. While operating, the switch 20 periodically sends (step 124) status notifications to the control server 40. These status notifications include the switch 20's current view of the states of the application servers 44 in the AS group. The state information for each application server 44 in a status notification can include a MAC address, an IP address, a physical switch port, the operational state of the application server (disabled or enabled), and the alive state of the application server (failed or running).
In one embodiment, the switch 20 encapsulates its version of the state information of the application servers 44 in the body of an HTTP (HyperText Transport Protocol) PUT request within an Ethernet frame. The HTTP PUT request is directed to a predefined URL associated with the control server 40. The PUT request operates to cause the control server 40 to store the contents of the HTTP entity-body at the requested URL. The HTTP header of the request can be further encapsulated in Ethernet, IP, and TCP headers. In another embodiment, the switch 20 uses an HTTP POST request to carry the state information.
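The switch's side of this exchange can be sketched as below. The URL path and the JSON encoding of the state table are assumptions; the patent specifies only a PUT to a predefined URL with the state information in the entity-body:

```python
import json
import http.client

def send_status_notification(cs_host, cs_port, state_table,
                             url_path="/switch/as-group-state"):
    """Send the switch's current view of application-server state to
    the control server as an HTTP PUT (step 124, sketched).

    cs_host/cs_port/url_path are hypothetical; only a 200 OK response
    counts as a successful acknowledgment (see step 126).
    """
    body = json.dumps(state_table)
    conn = http.client.HTTPConnection(cs_host, cs_port, timeout=5)
    conn.request("PUT", url_path, body=body,
                 headers={"Content-Type": "application/json"})
    resp = conn.getresponse()
    ok = (resp.status == 200)  # any other code is a health-check failure
    conn.close()
    return ok
```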
The control server 40 acknowledges (step 126) to the switch 20 each status notification received from the switch 20. For example, in response to an HTTP PUT request, the control server 40 replies with an HTTP 200 OK server response code to acknowledge successful receipt of the application server state table 30 in the health-check data payload and successful storage at the predefined URL. The switch 20 handles all other server response codes as errors, each of which implies a health-check failure of the control server 40.
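The control-server side of the acknowledgment can be sketched with a minimal PUT handler. Storing the payload keyed by URL path, and returning 400 on a malformed body, are assumptions; the patent specifies only that 200 OK acknowledges success and that the switch treats any other code as a failure:

```python
import json
import http.server

class StatusNotificationHandler(http.server.BaseHTTPRequestHandler):
    """Control-server handler for the switch's status notifications
    (a sketch; class and attribute names are hypothetical)."""

    stored = {}  # URL path -> last state table received from the switch

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            # Store the entity-body at the requested URL (step 126).
            StatusNotificationHandler.stored[self.path] = \
                json.loads(self.rfile.read(length))
        except ValueError:
            self.send_response(400)  # the switch counts this as a failure
        else:
            self.send_response(200)  # acknowledges receipt and storage
        self.end_headers()

    def log_message(self, *args):
        pass  # keep the sketch quiet
```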
Loss of connectivity between the control server 40 and the switch 20, for example, can result in the switch 20 operating with stale application server state information. In this event, one corrective mechanism can be to fail over to a backup chassis (i.e., a backup switch). As part of the failover, all application servers 44 are initialized according to the latest synchronized state, and all control servers 40 are initialized to an "alive" state. Subsequently, the new control server 40 corrects any incorrect application server state information, and the new active switch 20 adjusts the health-check states of the control servers 40 upon resuming health checks of the control server group.
FIG. 5 shows an embodiment of a process 150 by which the control server 40 performs a health check of the application servers 44 in its associated AS group and keeps the application server state information current at the switch 20. The process 150 runs concurrently with, and asynchronously to, transmissions of the status notifications by the switch 20, as described in connection with the process 120 of FIG. 4. The asynchronous execution of the processes 120, 150 advantageously avoids connection timeouts between the switch 20 and the control server 40, thus ensuring a reliable communication control path from the switch 20 to the control server 40. At step 152, the control server 40 executes the CS software 38 to begin the control service. During operation, the control server 40 actively performs (step 154) health checks on the application servers 44 in its associated AS group. The administrator can pre-configure the type of health check, examples of which include, but are not limited to, a link-state health check, a PING health check, an ARP (Address Resolution Protocol) health check, a UDP/TCP (User Datagram Protocol/Transmission Control Protocol) health check, a service-based health check (e.g., HTTP, SMTP, SSL, SIP), and a user-scriptable health check. Based on the health-check responses from the application servers 44 in the AS group, the control server 40 updates (step 156) its own local view of the AS group state.
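One pass of steps 154 and 156 can be sketched as follows; the function names and the shape of the local view are assumptions, and the health check itself is passed in as a callable so any of the pre-configured mechanisms could be plugged in:

```python
def refresh_local_view(local_view, health_check, as_group):
    """One health-check pass by the control server (steps 154-156,
    sketched): probe each application server in the AS group with the
    configured check and update the control server's local view of its
    alive state.

    health_check(ip) -> bool is assumed to wrap whichever mechanism
    (PING, ARP, TCP, service-based, ...) the administrator configured.
    """
    for ip in as_group:
        local_view.setdefault(ip, {})["alive"] = health_check(ip)
    return local_view
```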
The control server 40 compares (step 158) its updated local view of the AS group state with the view of the AS group state held by the switch 20. To obtain the current view held by the switch 20, the control server 40 references and processes the last server state table 30 stored at the predefined URL in response to the last HTTP PUT request (i.e., status notification) received from the switch 20.
If the comparison finds no differences between the views of the AS group held by the switch 20 and the control server 40, the process 150 continues with step 154, wherein the control server 40 continues performing health checks on the application servers 44 of the AS group. If the control server 40 identifies (step 160) a difference in the views of the AS group state, the control server 40 sends (step 162) a request to the switch 20 directing the switch 20 to update its AS group state with the view held by the control server 40.
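Steps 158 through 162 reduce to a comparison-and-push loop, sketched below against a duck-typed switch interface exposing a `set_alive` command (names hypothetical; only the alive state is reconciled here for brevity, whereas the full process also covers MAC and port differences):

```python
def reconcile_views(control_view, switch_view, switch_api):
    """Compare the control server's local view with the view the switch
    last reported (step 158) and, for each difference, send the switch
    an operational update (steps 160-162). A sketch; the real path would
    be an SNMP command through the switch's API 32.
    """
    corrected = []
    for ip, state in control_view.items():
        reported = switch_view.get(ip, {})
        if reported.get("alive") != state["alive"]:
            switch_api.set_alive(ip, state["alive"])
            corrected.append(ip)
    return corrected  # servers whose alive state the switch was told to fix
```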
Out-of-sync state tables between the switch 20 and the control server 40 can potentially occur because of transient state changes or request errors resulting from dropped packets, out-of-order packet transfers, or a failure to execute an operational command. Such loss of synchronization is inherent to designs that offload state. Error handling on out-of-sync state tables is generally self-correcting because of the asynchronous processing of the state table independent of the health-check cycle. Both state table views, on the switch 20 and on the control server 40, are expected to converge at steady state because of the periodic view updates sent by the switch 20 to the control server 40.
For instance, if the control server 40 finds a difference in the "alive" state (enabled/disabled) for an application server 44, the control server 40 can issue an operational command to the switch 20 (through the switch's API 32 via SNMP) to set the "alive" state for the application server 44 to the current state. If the control server 40 finds a difference in the MAC or port for an application server 44, the control server 40 can issue an operational command to the switch 20 (through the switch's API 32 via SNMP) to perform an ARP update request to the pertinent application server 44. If an error occurs while issuing any operational command, or if the application server state difference persists, the control server 40 can retry the operation until the number of retries reaches a predefined maximum (defined on the control server 40). If the state tables between the switch 20 and the control server 40 do not converge within the maximum number of retries, the control server 40 can attempt to operationally disable the discrepant application server 44 by issuing an "oper dis" command directly to the switch 20 to disable the application server 44.
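The retry-then-disable error handling described above can be sketched as a small loop; the retry budget, the return values, and the switch interface are assumptions for illustration:

```python
def correct_with_retries(ip, desired_alive, switch, max_retries=3):
    """Retry a corrective command up to a predefined maximum; if the
    state still does not converge, operationally disable the discrepant
    application server (the "oper dis" fallback). A sketch: IOError
    stands in for a failed SNMP operational command.
    """
    for _ in range(max_retries):
        try:
            switch.set_alive(ip, desired_alive)
            if switch.get_alive(ip) == desired_alive:
                return "converged"
        except IOError:
            continue  # a command error counts as a failed attempt
    switch.set_operational(ip, False)  # last resort: disable the server
    return "disabled"
```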
With the management of the AS group offloaded to the control server 40, the control server 40 acquires the ability to dynamically enable and disable preconfigured application servers 44 in the AS group on the switch 20 during runtime. This capability effectively allows the control server 40 to expand or contract the workload capacity of the AS group during runtime. FIG. 6 shows an embodiment of a process 170 for controlling the capacity of the AS group during runtime. At step 172, the control server 40 monitors the workload on the application servers 44 in the AS group.
If, from the monitoring, the control server 40 determines (step 174) to expand the capacity of the AS group, the control server 40 sends (step 176) a request to the switch 20 to enable and make alive a preconfigured application server 44 in the AS group, and the process returns to step 172, with the control server 40 continuing to monitor the workload on the application servers 44. (Before the switch 20 received the enable command, the application server 44 was already configured as part of the AS group but was operationally disabled.) The request to enable the application server 44 can be included in an operational update command sent by the control server 40 to the switch 20 (e.g., through SNMP). Alternatively, the control server 40 may determine (step 178) to contract the capacity of the AS group (e.g., during off-peak work hours). In that event, the control server 40 sends (step 180) a command to the switch 20 to disable a preconfigured application server 44 in the AS group, and the process returns to step 172, with the control server 40 continuing to monitor the workload on the application servers 44. The disabled application server 44 remains part of the AS group and can later be enabled and made alive should the control server 40 deem it necessary. If the decision based on the monitoring is neither to expand nor to contract the AS group capacity, the process returns to step 172, with the control server 40 continuing to monitor the workload on the application servers 44.
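One monitoring pass of process 170 can be sketched as a threshold policy. The load metric, the thresholds, and the keep-at-least-one-server rule are all illustrative assumptions; the patent leaves the expand/contract decision criteria to the implementation:

```python
def adjust_capacity(load_per_server, switch, high=0.8, low=0.2):
    """One pass of steps 172-180 (sketched): enable a disabled,
    preconfigured AS-group server when average load is high (expand),
    disable an enabled one when load is low (contract), and otherwise
    leave the group unchanged. Thresholds are hypothetical.
    """
    avg = sum(load_per_server.values()) / len(load_per_server)
    if avg > high:
        for ip in sorted(load_per_server):
            if not switch.is_enabled(ip):
                switch.set_operational(ip, True)  # expand: enable, make alive
                return ("expand", ip)
    elif avg < low:
        enabled = [ip for ip in sorted(load_per_server) if switch.is_enabled(ip)]
        if len(enabled) > 1:  # keep at least one server serving traffic
            switch.set_operational(enabled[-1], False)  # contract
            return ("contract", enabled[-1])
    return ("unchanged", None)
```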
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method, and computer program product. Thus, aspects of the present invention may be embodied entirely in hardware, entirely in software (including, but not limited to, firmware, program code, resident software, microcode), or in a combination of hardware and software. All such embodiments may generally be referred to herein as a circuit, a module, or a system. In addition, aspects of the present invention may be in the form of a computer program product embodied in one or more computer readable media having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wired, optical fiber cable, radio frequency (RF), etc. or any suitable combination thereof.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language, such as Java, Smalltalk, C#, C++, or Visual C++, and conventional procedural programming languages, such as the C and Pascal programming languages or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Any flowchart and block diagrams in the Figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiments were chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed.
While the invention has been shown and described with reference to specific preferred embodiments, it should be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the following claims.

Claims (12)

What is claimed is:
1. A network element comprising:
a plurality of server blades coupled together in a manner that allows any of the server blades to be designated a control server and one or more of the other server blades being designated as application servers, wherein one server blade is selected as the control server, and the control server is capable of communicating with each application server to acquire state information about each application server; and
a switch in communication with each server blade, the switch being configured to obtain state information relating to the application servers and direct an amount of data traffic it receives to each application server based upon the state information obtained about each application server,
wherein the switch is further configured, in response to a computational load of the switch exceeding a predetermined amount, to offload to the control server, for the control server to perform, a computational process of performing a health check on the application servers to obtain state information of the one or more application servers, and the switch is further configured to receive the state information of the application servers from the control server and direct data traffic it receives to each application server based upon the received state information of each application server.
2. The network element of claim 1, wherein communication between the switch and the control server occurs asynchronously with respect to communication between the control server and each application server.
3. The network element of claim 1, wherein the switch is configured to maintain, at the switch, a data structure containing the state information indicating the health of each application server, and to send the data structure containing the state information periodically to the control server for comparison with a data structure of state information of each application server stored at the control server.
4. The network element of claim 3, wherein the switch encapsulates the data structure containing the state information within an HTTP (HyperText Transfer Protocol)-based packet and sends the HTTP packet to a URL (Uniform Resource Locator) associated with the control server.
5. The network element of claim 3, wherein the control server is configured to maintain a data structure of the state information of each application server, to compare the state information in the data structure maintained by the control server with the state information in a data structure received from the switch, and to request that the switch update the data structure maintained by the switch if the state information contained in the data structure maintained by the control server differs from the state information contained in the data structure maintained by the switch.
6. The network element of claim 5, wherein the request from the control server to the switch is for the switch to update an alive state for a given application server.
7. The network element of claim 5, wherein the request from the control server to the switch is for the switch to perform an ARP (Address Resolution Protocol) update request to a given application server.
8. The network element of claim 1, wherein the switch includes an application program interface (API) through which the switch is configured to define a group comprised of the one or more application servers and to associate the control server with the group.
9. The network element of claim 8, wherein the control server monitors a workload of each application server, and expands or contracts the group of application servers dynamically in response to the workload by communicating with the switch to set an operational state of one or more application servers.
10. A data center comprising:
a plurality of interconnected network elements, each network element comprising a switch and a plurality of server blades coupled together in a manner that allows any of the server blades to be designated as a control server, and one or more of the other server blades being designated as application servers, wherein one server blade is selected as the control server, and the control server is capable of communicating with each application server of each network element to acquire state information about that application server,
wherein, for a given network element, the switch of the given network element is in communication with each server of the plurality of server blades of the given network element, the switch of the given network element being configured to obtain the state information relating to the application servers, receive data traffic, and direct an amount of data traffic it receives to each application server based upon the state information obtained for each application server of each network element,
wherein the switch is further configured, in response to a computational load of the switch exceeding a predetermined amount, to offload to the control server, for the control server to perform, a computational process of performing a health check on the application servers to obtain state information of the application servers, and the switch is further configured to receive the state information of the application servers from the control server, and direct data traffic it receives to each application server based upon the received state information relating to each application server.
11. The data center of claim 10, wherein the switch of the given network element is configured to send periodically a packet containing the state information of each application server of each network element to the control server, and the control server is configured to command the switch to update the state information maintained by the switch if the state information acquired by the control server from the application servers differs from the state information contained in a last packet received from the switch.
12. The data center of claim 11, wherein the periodic packet transmissions from the switch to the control server are asynchronous with respect to the update commands sent by the control server to the switch.
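As a rough illustration of the off-load mechanism the claims recite, the sketch below models a switch that, once its own computational load crosses a threshold, delegates health checking to a designated control server and then merely applies the update requests the control server sends back. This is a hypothetical sketch, not the patented implementation: the class names, the dict-based state table, and the 0.8 load threshold (standing in for the claims' "predetermined amount") are all illustrative assumptions.

```python
# Hypothetical sketch of the claimed health-check off-load; all names and
# values here are illustrative assumptions, not the patented implementation.

class ApplicationServer:
    def __init__(self, name, alive=True):
        self.name = name
        self.alive = alive

    def health_check(self):
        # A real check would probe the server blade over the network.
        return self.alive


class ControlServer:
    """Server blade designated to run health checks on the switch's behalf."""

    def __init__(self, app_servers):
        self.app_servers = app_servers

    def poll(self):
        # Acquire state information from each application server.
        return {s.name: s.health_check() for s in self.app_servers}

    def updates_for(self, switch_table):
        # Compare the switch's table against freshly polled state and return
        # update requests only for the entries that differ.
        fresh = self.poll()
        return {name: alive for name, alive in fresh.items()
                if switch_table.get(name) != alive}


class Switch:
    LOAD_THRESHOLD = 0.8  # stand-in for the "predetermined amount"

    def __init__(self, control_server, app_servers):
        self.control_server = control_server
        self.app_servers = app_servers
        self.state_table = {}  # per-server alive state kept at the switch
        self.load = 0.0

    def refresh_state(self):
        if self.load > self.LOAD_THRESHOLD:
            # Off-load: hand the switch's table to the control server and
            # apply only the update requests it returns.
            self.state_table.update(
                self.control_server.updates_for(self.state_table))
        else:
            # Below the threshold the switch can afford to check directly.
            self.state_table = {s.name: s.health_check()
                                for s in self.app_servers}

    def direct_traffic(self, requests):
        # Spread incoming traffic across the servers currently marked alive.
        alive = [n for n, ok in self.state_table.items() if ok]
        return {n: requests // len(alive) for n in alive} if alive else {}
```

In a full system the switch's table would travel to the control server inside an HTTP packet addressed to a URL associated with that server, and the returned requests could include alive-state or ARP updates; both transports are elided here for brevity.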
US13/536,635 | 2012-06-28 | 2012-06-28 | Scalable off-load of applications from switch to server | Expired - Fee Related | US9692676B2 (en)

Priority Applications (2)

Application Number | Publication | Priority Date | Filing Date | Title
US13/536,635 | US9692676B2 (en) | 2012-06-28 | 2012-06-28 | Scalable off-load of applications from switch to server
US14/074,357 | US10033615B2 (en) | 2012-06-28 | 2013-11-07 | Scalable off-load of applications from switch to server

Applications Claiming Priority (1)

Application Number | Publication | Priority Date | Filing Date | Title
US13/536,635 | US9692676B2 (en) | 2012-06-28 | 2012-06-28 | Scalable off-load of applications from switch to server

Related Child Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
US14/074,357 | Continuation | US10033615B2 (en) | 2012-06-28 | 2013-11-07 | Scalable off-load of applications from switch to server

Publications (2)

Publication Number | Publication Date
US20140006594A1 (en) | 2014-01-02
US9692676B2 (en) | 2017-06-27

Family

ID=49779379

Family Applications (2)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US13/536,635 | Expired - Fee Related | US9692676B2 (en) | 2012-06-28 | 2012-06-28 | Scalable off-load of applications from switch to server
US14/074,357 | Active | US10033615B2 (en) | 2012-06-28 | 2013-11-07 | Scalable off-load of applications from switch to server

Family Applications After (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US14/074,357 | Active | US10033615B2 (en) | 2012-06-28 | 2013-11-07 | Scalable off-load of applications from switch to server

Country Status (1)

Country | Link
US (2) | US9692676B2 (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN103973728B (en)* | 2013-01-25 | 2019-02-05 | 新华三技术有限公司 | The method and device of load balancing under a kind of multiple data centers environment
US10244080B2 (en)* | 2013-03-15 | 2019-03-26 | VCE IP Holding Company LLC | Accessing multiple converged IT infrastructures
IL242353B (en)* | 2015-10-29 | 2021-01-31 | Verint Systems Ltd | System and method for soft failovers for proxy servers
US10178810B1 (en)* | 2015-12-04 | 2019-01-08 | Nautilus Data Technologies, Inc. | Scaled down, efficient data center
US10868754B2 (en)* | 2016-01-27 | 2020-12-15 | Nebbiolo Technologies Inc. | High availability input/output management nodes
EP3437256A4 (en) | 2016-03-25 | 2019-10-30 | Nebbiolo Technologies, Inc. | FLEXIBLE PLANT ASSISTED BY COMPUTER FOG
US10798063B2 (en) | 2016-10-21 | 2020-10-06 | Nebbiolo Technologies, Inc. | Enterprise grade security for integrating multiple domains with a public cloud
US10979368B2 (en) | 2017-08-02 | 2021-04-13 | Nebbiolo Technologies, Inc. | Architecture for converged industrial control and real time applications
CN108021954B (en) | 2017-11-01 | 2020-06-05 | 阿里巴巴集团控股有限公司 | Method and device for starting business process

Citations (23)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US6295558B1 (en) | 1998-08-21 | 2001-09-25 | Hewlett-Packard Company | Automatic status polling failover of devices in a distributed network management hierarchy
US20020178178A1 (en)* | 2001-04-24 | 2002-11-28 | Luosheng Peng | Apparatus and methods for intelligently caching applications and data on a gateway
US20020188718A1 (en)* | 2001-05-04 | 2002-12-12 | Rlx Technologies, Inc. | Console information storage system and method
US6578066B1 (en)* | 1999-09-17 | 2003-06-10 | Alteon Websystems | Distributed load-balancing internet servers
US20040054780A1 (en)* | 2002-09-16 | 2004-03-18 | Hewlett-Packard Company | Dynamic adaptive server provisioning for blade architectures
US20050090249A1 (en) | 2001-12-04 | 2005-04-28 | Heiko Straulino | Method and network device for running especially personalised communication services in a communication system
US7155722B1 (en)* | 2001-07-10 | 2006-12-26 | Cisco Technology, Inc. | System and method for process load balancing in a multi-processor environment
US20070078999A1 (en) | 2005-09-19 | 2007-04-05 | Corson M S | State synchronization of access routers
US20080080495A1 (en) | 2006-09-29 | 2008-04-03 | Barrett Edward E | System and method for network analysis
US20080095339A1 (en)* | 1996-11-18 | 2008-04-24 | Mci Communications Corporation | System and method for providing requested quality of service in a hybrid network
US20080134211A1 (en)* | 2006-12-04 | 2008-06-05 | SAP AG | Method and apparatus for application state synchronization
US20080162691A1 (en)* | 2006-12-31 | 2008-07-03 | Beijing Lenovo Software Ltd. | Blade server management system
US20080181106A1 (en)* | 2007-01-31 | 2008-07-31 | Avaya Technology Llc | Traffic load balancing
JP2009016987A (en) | 2007-07-02 | 2009-01-22 | Alaxala Networks Corp | Remote traffic monitoring method
US7551627B2 (en) | 2005-11-18 | 2009-06-23 | AT&T Intellectual Property I, L.P. | Offloading routing functions from network routers
US20100077312A1 (en) | 2008-09-23 | 2010-03-25 | Stephen Williams Morss | Virtual wiring
US20100131638A1 (en)* | 2008-11-25 | 2010-05-27 | Ravi Kondamuru | Systems and Methods for GSLB Remote Service Monitoring
US20100158004A1 (en) | 2008-12-23 | 2010-06-24 | Nokia Corporation | Offloading content routing cost from routers
US7797565B1 (en)* | 2006-04-04 | 2010-09-14 | Symantec Operating Corporation | System and method for maintaining communication protocol connections during failover
US20100325485A1 (en)* | 2009-06-22 | 2010-12-23 | Sandeep Kamath | Systems and methods for stateful session failover between multi-core appliances
US20110173441A1 (en)* | 2007-08-28 | 2011-07-14 | Cisco Technology, Inc. | Highly scalable architecture for application network appliances
US20120016932A1 (en)* | 2010-04-18 | 2012-01-19 | Voxeo Corporation | System And Method For Telephony And Communication Services With Message-Based API
US20130336104A1 (en)* | 2012-06-15 | 2013-12-19 | Ramanjaneyulu Y. Talla | Systems and methods for propagating health of a cluster node


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Final Office Action in related U.S. Appl. No. 13/536,635, mailed on Dec. 2, 2015; 13 pages.
Non-Final Office Action in related U.S. Appl. No. 14/074,357, mailed on Apr. 23, 2015; 16 pages.
Non-Final Office Action in related U.S. Appl. No. 14/074,357, mailed on Nov. 29, 2016; 13 pages.
Non-Final Office Action in related U.S. Appl. No. 14/074,357, mailed on Oct. 2, 2014; 23 pages.

Also Published As

Publication number | Publication date
US10033615B2 (en) | 2018-07-24
US20140068070A1 (en) | 2014-03-06
US20140006594A1 (en) | 2014-01-02

Similar Documents

Publication | Title
US9692676B2 (en) | Scalable off-load of applications from switch to server
US9473372B1 (en) | Connectivity protocol delegation
US10756945B2 (en) | Virtualized network management protocols
US10813154B2 (en) | Enabling interface aggregation of mobile broadband network interfaces
US20180316607A1 (en) | Providing non-interrupt failover using a link aggregation mechanism
US9806911B2 (en) | Distributed virtual gateway appliance
US9058213B2 (en) | Cloud-based mainframe integration system and method
EP3188450B1 (en) | Reducing false alarms when using network keep-alive messages
US8938521B2 (en) | Bi-directional synchronization enabling active-active redundancy for load-balancing switches
WO2014118938A1 (en) | Communication path management method
US9363313B2 (en) | Reducing virtual IP-address (VIP) failure detection time
TWI740210B (en) | Method for terminal device management and server
US20130286824A1 (en) | Data communication in OpenFlow networks
US11075792B2 (en) | Scalable and robust network management for cloud-based NAT environments
US11489948B2 (en) | Method and system for reliable application layer data transmission through unreliable transport layer connections in a network
US20150341205A1 (en) | Method for guaranteeing service continuity in a telecommunication network and system thereof
US20210044678A1 (en) | Optimized QUIC fallback on access networks and endpoints
US12231340B2 (en) | Edge device for telemetry flow data collection
US20200136946A1 (en) | System and method for determining branch gateway device availability in computer networks
CN106789717A (en) | Method, device and terminal for dynamically adjusting the MTU of communication protocol data message transmission
CN113055220B (en) | Scalable and robust network management for cloud-based NAT environments
US10567262B1 (en) | Dynamic server device monitoring
CN117397232B (en) | Method and system for proxy-less protocol
Turkovic et al. | Elastic slicing in programmable networks
US10536548B2 (en) | Redundant network routing with proxy servers

Legal Events

Date | Code | Title | Description
AS | Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW YORK

Free format text:ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GABRIEL, CYNTHIA ANN;SHAO, BILL;SIGNING DATES FROM 20120620 TO 20120628;REEL/FRAME:028497/0028

STCF | Information on status: patent grant

Free format text:PATENTED CASE

MAFP | Maintenance fee payment

Free format text:PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment:4

FEPP | Fee payment procedure

Free format text:MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS | Lapse for failure to pay maintenance fees

Free format text:PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH | Information on status: patent discontinuation

Free format text:PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP | Lapsed due to failure to pay maintenance fee

Effective date:20250627

