CLAIM OF PRIORITY
The present application claims priority from Japanese application serial JP 2006-141426 filed on May 22, 2006 and is a continuation application of U.S. application Ser. No. 11/522,374 filed on Sep. 18, 2006, the contents of which are hereby incorporated by reference into this application.
BACKGROUND OF THE INVENTION
The present invention relates to a technique for reducing the power consumption of a computing system.
For information processing equipment such as servers which constitute a computing system, the amount of power consumed by each server increases with enhanced server performance. Further, the use of high-density information processing systems such as blade servers requires a huge amount of power to be distributed to them at a high density. Thus, the cost required for supplying power to such information processing equipment and/or systems and cooling them increases more and more.
In view of this, for a distributed processing system such as grid computing, in which computer processing tasks with assumed amounts of computing are allocated to and executed by systems residing in a plurality of areas, there is a method that levels the power loads used in each area, thereby avoiding a localized high power demand that would make it hard to supply power to the system in that area, as described, for example, in Japanese Patent Laid-Open No. 2005-63066.
There is also a technique for reducing the power consumption of a server by decreasing the operating frequency of its CPU and decreasing the CPU performance, which is provided by a CPU (Central Processing Unit) manufacturer such as Intel Corporation.
SUMMARY OF THE INVENTION
However, the load-leveling technique described above can reduce the power consumption of a computing system in an area, but cannot reduce the power consumption of the whole computing system covering all the areas and, therefore, cannot reduce the cost required for supplying power to the whole computing system.
Moreover, with the technique that reduces power consumption by decreasing the CPU operating frequency, the server performance is decreased, which results in a decrease in the performance of a job being run on the server. As a result, performance requirements for the job, such as those of a Service Level Agreement (SLA) that a user requires of the computing system to run the job, may not be fulfilled.
An object of the present invention is to provide a method, a system and a computer program for reducing the power consumption of a computing system, the method, the system and the computer program capable of reducing the power consumed by the computing system in which a plurality of servers are connected by a network and one or more jobs are run.
To achieve the above object, an aspect of the present invention provides a method for reducing the power consumption of a computing system where a plurality of servers are connected via a network and one or more jobs are run, the method comprising the following steps which are executed by at least one of the plurality of servers: obtaining server-related information including power properties of each of the servers constituting the computing system; obtaining job-related information including performance requirements for each of the jobs which are run in the computing system; searching for one or more jobs to be relocated and one or more destination servers, based on the server-related information and the job-related information, to the extent that the performance requirements for each job are fulfilled; relocating one or more jobs selected to be relocated through the search to one or more destination servers selected through the search; and controlling power supply to turn off the power supply of one or more servers on which no job is running in consequence of the relocation, if such a server exists. Another aspect of the invention provides a system for reducing the power consumption thereof and a computer program for reducing the power consumption.
Terms used herein are described. A server refers to all system components having a minimum configuration with at least a processor (CPU) and a storage device (memory) (i.e., a computer-readable storage medium) and capable of executing a job. Among the servers, a supervisory server refers to a special server on which only a power reduction program is executed and to which the method for reducing power consumption is not applied. A job is a generic term used to refer to programs which perform processing in response to an input and return an output. The program for reducing the power consumption, which will be described by using a concrete example, is not included in the jobs.
In a computing system where a plurality of servers are connected via a network and one or more jobs are run, the method according to the invention searches for one or more jobs to be relocated and one or more destination servers, based on the power properties of the servers and the performance requirements for each job, to the extent that the performance requirements for each job are fulfilled, relocates the one or more jobs, and shuts off the power supply of one or more servers on which no job is running. Thus, it is possible to reduce the power consumed, while complying with the SLA of the computing system.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 shows an overall structural diagram of a computing system according to a first embodiment of the invention.
FIG. 2 shows a structural diagram of each server 111-114 in the first embodiment.
FIG. 3 shows a structural diagram of a supervisory server 101 in the first embodiment.
FIG. 4 shows a structural diagram of a power reduction facility 110 in the first embodiment.
FIG. 5 illustrates a server table 410 in the first embodiment.
FIG. 6 illustrates a job table 411 in the first embodiment.
FIG. 7 shows a flowchart of operation of the first embodiment.
FIG. 8 shows a sub-flowchart of a server properties obtaining function in the first embodiment.
FIG. 9 illustrates a server information input screen in the first embodiment.
FIG. 10 illustrates a server table 410A to which a column for power consumption per unit of performance of a server was added.
FIG. 11 shows a sub-flowchart of a job properties obtaining function 405 in the first embodiment.
FIG. 12 shows a sub-flowchart of a search function 402 in the first embodiment.
FIG. 13 shows a sub-flowchart of a job relocating function 403 in the first embodiment.
FIG. 14 illustrates a job relocation confirmation screen 1401 in the first embodiment.
FIG. 15 shows a schematic diagram explaining an example of a job relocation method in the first embodiment.
FIG. 16 shows a schematic diagram explaining another example of a job relocation method in the first embodiment.
FIG. 17 shows a schematic diagram explaining another example of a job relocation method in the first embodiment.
FIG. 18 shows a schematic diagram explaining another example of a job relocation method in the first embodiment.
FIG. 19 shows a sub-flowchart of a server power supply control function 404 in the first embodiment.
FIG. 20 shows a structural diagram of the power reduction facility in a second embodiment of the invention.
FIG. 21 shows an overall structural diagram of a computing system according to a third embodiment of the invention.
FIG. 22 shows a structural diagram of a server 200A in the third embodiment.
FIG. 23 shows a sub-flowchart of a power property measuring function 2101 in the third embodiment.
FIG. 24 illustrates a server table 410B in the third embodiment.
FIG. 25 illustrates a job table 411A in a fourth embodiment of the invention.
FIG. 26 shows an overall system structural diagram in the fourth embodiment.
FIG. 27 illustrates a job table 411B in a fifth embodiment of the invention.
FIG. 28 shows a structural diagram of the power reduction facility 110B in a sixth embodiment of the invention.
FIG. 29 illustrates a search policy table 412 in the sixth embodiment.
FIG. 30 shows a sub-flowchart of the search function in the sixth embodiment.
FIG. 31 shows a structural diagram of the power reduction facility 110C in a seventh embodiment of the invention.
FIG. 32 shows a flowchart of operation of the seventh embodiment.
FIG. 33 shows a sub-flowchart of a verification function 3201 in the seventh embodiment.
FIG. 34 shows a sub-flowchart of the server power supply control function 404A in an eighth embodiment of the invention.
FIG. 35 shows an overall system structural diagram according to a ninth embodiment of the invention.
DETAILED DESCRIPTION OF THE INVENTION
Embodiment 1
In the following, illustrative embodiments of the present invention will be described with reference to the drawings.
FIG. 1 shows an overall structural diagram of a computing system according to a first embodiment of the present invention. The computing system of the present embodiment comprises a plurality of servers 111-114 and a supervisory server 101. The servers 111-114 and the supervisory server 101 are interconnected through a network 102. On the servers 111-114, jobs 120-124 are running. However, a server with no job running on it, such as the server 114, may exist. In the figure, the jobs 120-124 are labeled with identifiers such as Job a and Job b where their contents differ.
Here, "job" is a generic term for the jobs 120-124, referring to programs which perform processing in response to an input and return an output. A job may be, for example, an application program, an EJB (registered trademark) application, a Java (registered trademark) application, or one of the processes running in an OS (Operating System). Furthermore, all programs such as an OS, drivers, middleware, and applications running together on a server may be regarded as a single job. Alternatively, by means of server virtualization technology such as Xen open source software and VMware (registered trademark) from VMware, Inc., it is possible to run one or more virtual servers on a single physical server. The servers 111-114 are equipped with resources such as processors (CPUs) and storage devices (memories).
Using the server virtualization technology, the resources of the servers 111-114 may be divided and allocated to different virtual servers, thereby allowing one or more virtual servers to run simultaneously on a single physical server. Job a and Job b may be run on such virtual servers.
On the supervisory server 101, a power reduction facility 110 is run. The power reduction facility 110 is a computer program that performs control to reduce the power consumed by the servers 111-114; the details thereof will be described later. The number of the servers 111-114 and the number of the jobs 120-124 used in this embodiment are only exemplary, and these numbers may vary as required. There should be at least one supervisory server 101 and at least one power reduction facility 110.
FIG. 2 shows a detailed structure of each server 111-114 in the first embodiment, wherein each server is represented as a server 200. The server 200 is composed of a memory 201 which stores programs such as jobs 120, 121, at least one CPU 202 which executes the programs from the memory, a chip set 203, an I/O device 204 such as a Host Bus Adapter (HBA) or a Network Interface Card (NIC), a NIC 205 for making connection to the network 102 in FIG. 1, a Baseboard Management Controller (BMC) 206 which is responsible for status monitoring and power supply control, a cooling device 207 such as a fan or a water cooler, a power supply device 208 for supplying the power of the server, and an auxiliary storage device 209 such as a hard disk or a flash memory. The BMC 206 is provided with a power supply control function 260, which enables the power supply of the server 200 to be turned on and off from the outside via the NIC 205. Here, the server 200 does not always have to include all components shown in FIG. 2. It may be configured with at least the CPU 202 and the memory 201. With this configuration, the jobs 120, 121 stored in the memory 201 can be executed by the CPU 202 and functionality as the server can be provided.
FIG. 3 shows a detailed structure of the supervisory server 101 as mentioned in FIG. 1 which illustrates the first embodiment. The supervisory server 101 is composed of a storage device (memory) 301 which stores the power reduction facility 110, at least one processor (CPU) 302 which executes programs from the memory, a NIC 303 for making connection to the network 102 in FIG. 1, and an I/O device 304 which is responsible for input/output of information to/from the server and to which input devices such as a mouse and a keyboard, storage such as USB media, and a display device such as a display are connected. To the I/O device 304, the input devices 305 such as the mouse and keyboard and the display device 306 such as a display are connected. Additionally, to the I/O device 304, an external storage device (not shown) may be connected for reading/writing information. Further, an internal storage device may be built into the supervisory server 101.
FIG. 4 shows details of the power reduction facility 110 as mentioned in FIG. 1 illustrating the first embodiment. The power reduction facility 110 is stored in the memory 301 of the supervisory server 101 and consists of subprograms for performing several functions and tables in which information that is processed by these subprograms is stored. Specifically, the power reduction facility 110 is composed of a server properties obtaining function 401 which obtains the properties of the servers, a search function 402 which searches among the servers 111-114 in the computing system for a source server and a destination server from which/to which a job is relocated and searches among the jobs 120-124 for a job to be relocated, a job relocating function 403 which performs control to relocate a job from the source server to the destination server, a server power supply control function 404 which shuts off the power supply of a server on which no job is running, a job properties obtaining function 405 which obtains the properties of the jobs, a server table 410 containing a list of information for the servers 111-114 constituting the computing system, and a job table 411 containing a list of information for the jobs 120-124 to be run in the computing system. Here, the job table 411 may include information for other jobs along with the jobs 120-124 running in the computing system. The search function 402 searches for jobs and servers according to the following condition for relocation: more jobs shall be allocated to a server with a smallest or smaller value of power consumption per unit of performance to the extent that performance requirements for each job are fulfilled, thus maximizing the number of servers on which no job is running, as will be further detailed later.
FIG. 5 shows details of a concrete example of the server table 410 as mentioned in FIG. 4. The server table 410 contains information for all servers constituting the computing system. The method of obtaining the server table 410 will be described later. A first column 501 has the identifier of each server. A second column 502 has a value of performance of the corresponding server in the first column 501. A third column 503 has power consumption for the performance in the second column 502 of the corresponding server. In the present embodiment, the value of performance given in the second column 502 indicates the value of peak performance of the server identified by the server identifier in the first column 501. The power consumption given in the third column 503 includes, in addition to the power consumed by the CPU of the server identified in the first column 501, the power consumed by all components of the server, namely, memory, chip set, I/O device, NIC, cooling device, power supply device, auxiliary storage device, BMC, etc. The third column 503 may be subdivided to specify power consumption fractions for the components, namely, CPU, memory, chip set, I/O device, NIC, cooling device, power supply device, auxiliary storage device, BMC, etc. of the server identified in the first column 501.
FIG. 6 shows details of the job table 411 as mentioned in FIG. 4. The job table 411 contains information related to all jobs to be run in the computing system. How to obtain the job table 411 will be described later. A first column 601 has the identifier of a job. A second column 602 has the identifier of a server running the job identified in the first column 601, and the second column 602 corresponds to the first column 501 in the server table 410 shown in FIG. 5. Here, if the job identified in the first column 601 is not running, the second column 602 will be empty, as for the job "f" in the first column 601. A third column 603 has a value of performance requirements for the job identified in the first column 601. In the figure, for example, the job with the identifier "a" given in the first column 601 is required to be run at 100 bops as the performance given in the third column 603. Here, the third column 603 may be subdivided to specify performance requirements of not only the CPU performance, but also I/O performance to the network, internal storage device, and external storage device, and memory performance. The third column 603 value may be added and updated, triggered by an event that updates this table 411 or an event that updates only the third column 603 individually.
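As an illustration only, the three columns of the job table 411 could be held in memory as sketched below. The class name, field names, the server identifier "S1", and the 50 bops value for job "f" are assumptions made for this sketch; only the 100 bops requirement of job "a" and the empty server field of job "f" come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class JobRow:
    job_id: str                # first column 601: job identifier
    server_id: Optional[str]   # second column 602: None when the job is not running
    required_bops: float       # third column 603: performance requirement


# Hypothetical contents; "S1" and the 50.0 value are invented for the sketch.
job_table: List[JobRow] = [
    JobRow("a", "S1", 100.0),  # job "a" must be run at 100 bops
    JobRow("f", None, 50.0),   # job "f" is not running, so column 602 is empty
]


def jobs_on(server_id: str, table: List[JobRow]) -> List[str]:
    """Return the identifiers of the jobs currently running on the given server."""
    return [row.job_id for row in table if row.server_id == server_id]
```

Representing the empty second column as `None` lets later steps distinguish a job that has no source server from one that must be moved.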
FIG. 7 shows a general operation flow of the first embodiment. The start (700) of this operation flow (procedure) may be triggered by addition or removal of a server to/from the computing system, addition, removal, or completion of a job, a change in a server configuration, a change in the contents of a job or performance requirements for a job, a change in the power supply environment of a server such as connection to an Uninterruptible Power Supply (UPS), electricity expense, and stable distribution of power, or a start request from a user. This procedure may also be triggered by building a new computing system, as in initial installation or transition from a test environment to an actual operation environment. Further, this procedure may be executed periodically. Step 701 is to obtain the identifiers of the servers 111-114 constituting the computing system, the performance values of the servers, and the power consumptions to attain the performance values. Step 702 is to obtain the identifiers of the jobs that run in the computing system and the identifiers of the servers running the jobs, respectively, and performance requirements for each job to meet the Service Level Agreement (SLA).
Step 703 is to determine among the servers 111-114 constituting the computing system a source server from which and a destination server to which a job is relocated and a job to be relocated, according to the condition for relocation that more jobs shall be allocated to a server with a smallest or smaller value of power consumption per unit of performance to the extent that performance requirements for each job are fulfilled, thus maximizing the number of servers on which no job is running, as will be further detailed later. Here, the job to be relocated resides on the source server. However, no source server may exist, if the job to be relocated is not running. Step 704 is to relocate the job determined to be relocated from the source server to the destination server. Here, if no source server exists, then the job is added to the destination server. Step 705 is to look for a server on which no job will run and, if there is such a server, to turn off the power supply of that server. However, no action will occur if the power supply of that server is already off. In the following, details of each step of the procedure of FIG. 7 will be described.
FIG. 8 illustrates details of step 701 as mentioned in FIG. 7. Step 701 in FIG. 7 corresponds to the server properties obtaining function 401 as mentioned in FIG. 4. Now, step 801 is to obtain the identifiers of the servers 111-114 constituting the computing system and thereby determine the number of the servers constituting the computing system. Step 802 is to obtain the servers' performance values and the servers' power consumptions to attain the performance values. The performance values and power consumptions to be obtained are peak performance values and the power consumptions at that time, and may include other performance values and associated power consumptions. However, the peak performance values and associated power consumptions must be obtained. The power consumption of a server includes, in addition to the power consumed by the server's CPU, the power consumed by the components of the server including the memory, chip set, I/O device, NIC, cooling device, power supply device, auxiliary storage device, BMC, etc. Further, the power consumption fractions for each component of the server, namely, the memory, chip set, I/O device, NIC, cooling device, power supply device, auxiliary storage device, BMC, etc. may be obtained. There are several possible methods of obtaining the servers' properties. The servers' properties may be entered by a user by means of a Graphical User Interface (GUI) provided by the power reduction facility 110 or using command lines (CUI), may be retrieved from a file stored in a storage device (not shown) connected to the supervisory server 101, or may be acquired via a network.
Step 803 is to register the information related to performance and power consumption obtained at step 802 into the server table 410. As a result, the peak performance value and associated power consumption of each server must be stored in the server table 410, 410A. Step 804 is to calculate power consumption per unit of performance of each server, referring to the information registered in the server table 410. Here, a method of calculation is to divide the power consumption at the time of a performance value by the performance value, thus obtaining the amount of power consumption per unit of performance. However, instead of this method, another method for obtaining the power consumption efficiency of a server quantitatively may be used. Step 805 is to add the calculated power consumption per unit of performance of each server to the server table 410 in a new column on the corresponding line. This will be described later.
FIG. 9 shows an example of the GUI by which the user enters the performance and associated power consumption of each server, as described for steps 801 and 802 in FIG. 8. The power reduction facility 110 displays this GUI on the display device 306 as mentioned in FIG. 3, using a browser or a special program, in text form, etc. In fields 911, 912, 913 shown in the figure, a value can be entered. Using the keyboard or the like, the user will enter the identifier of a server in the first field 911, a performance value of the server entered in the first field 911 in the second field 912, and the power consumption associated with the performance entered in the second field 912 in the third field 913. After the entry, by user action such as choosing and clicking on a Set button with the mouse, the information entered in the GUI is sent to the server properties obtaining function 401.
FIG. 10 shows an example of a server table 410A to which the power consumption per unit of performance of each server was added, as described for step 805 in FIG. 8. The power consumption per unit of performance of each server is entered in a fourth column 504. A value in the fourth column 504 indicates the power consumption per unit of performance, when the server identified in the first column 501 operates at the performance value given in the second column 502, consuming the power given in the third column 503. In the present embodiment, a smaller value in the fourth column 504 denotes better power consumption efficiency.
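The calculation of steps 804 and 805 (dividing the power consumption by the performance value and adding the result as a fourth column) can be sketched as follows. This is a minimal sketch, not the embodiment's implementation: the dictionary layout, the server identifiers, the numeric values, and the use of watts as the power unit are all assumptions.

```python
def power_per_unit(peak_bops: float, power_watts: float) -> float:
    """Step 804: power consumption divided by the performance value."""
    if peak_bops <= 0:
        raise ValueError("performance value must be positive")
    return power_watts / peak_bops


# Hypothetical server table 410: columns 501 (identifier), 502 (peak
# performance in bops) and 503 (power consumption, assumed to be in watts).
server_table = {
    "S1": {"peak_bops": 400.0, "power_w": 200.0},
    "S2": {"peak_bops": 200.0, "power_w": 160.0},
}

# Step 805: add the fourth column 504, yielding the server table 410A.
server_table_410a = {
    sid: {**row, "w_per_bops": power_per_unit(row["peak_bops"], row["power_w"])}
    for sid, row in server_table.items()
}

# A smaller value in column 504 denotes better power consumption efficiency.
most_efficient = min(
    server_table_410a, key=lambda sid: server_table_410a[sid]["w_per_bops"]
)
```

With these invented figures, server "S1" consumes 0.5 W per bops and server "S2" consumes 0.8 W per bops, so "S1" would be preferred as a destination.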
FIG. 11 illustrates details of step 702 as mentioned in FIG. 7. Step 702 in FIG. 7 corresponds to the job properties obtaining function 405. Step 1101 is to obtain information for the jobs to be run in the computing system and thereby determine the type(s) and the number of the jobs to be run in the computing system. Step 1102 is to obtain performance requirements for the jobs obtained at step 1101. At these steps 1101 and 1102, the jobs' properties may be obtained in different ways; for example, they may be entered by a user by means of the GUI provided by the power reduction facility 110 or using command lines, may be retrieved from a file stored in a storage device (not shown) connected to the supervisory server 101, or may be acquired via a network. For the performance requirements to be obtained, for example, the number of instructions of a job executed per unit time or the throughput of processing Web requests, transaction requests and I/O requests to a job may be obtained and converted to a value such as Billions of Operations Per Second (bops) representing the CPU performance. Step 1103 is to register the jobs and jobs' properties obtained at steps 1101 and 1102 into the job table 411.
FIG. 12 illustrates details of step 703 as mentioned in FIG. 7. Step 703 in FIG. 7 corresponds to the search function 402. Step 1201 is to search among all jobs for a job to be relocated and search for the source server from which and the destination server to which the job is relocated. In the present invention, the search at this step 1201 is performed according to the condition for relocation that more jobs shall be allocated to a server with a smallest or smaller value of power consumption per unit of performance to the extent that performance requirements for each job are fulfilled, thus maximizing the number of servers on which no job is running. This intensive allocation of jobs to a subset of the servers is realized by job relocation. Here, the power consumption per unit of performance can be obtained by reference to the server table 410A shown in FIG. 10. The performance requirements for each job can be obtained by reference to the job table 411 shown in FIG. 6. In the present embodiment, to determine whether or not more jobs can intensively be allocated to a server to the extent that performance requirements for each job are fulfilled, the peak performance of each server is obtained from the server table 410. If the sum of the performance requirements for all jobs to be run on the server does not exceed the peak performance of the server, the intensive allocation of the jobs to the server is allowed. If the sum exceeds the peak performance, the intensive allocation of the jobs to the server is not allowed. Also in the present embodiment, to maximize the number of servers on which no job is running, all jobs running on a server with a high value of power consumption per unit of performance should preferentially be candidates for relocation.
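The admission test described above, under which jobs may be concentrated on a server only while the sum of their performance requirements does not exceed the server's peak performance, can be written in a few lines. The function name and the numeric values are hypothetical.

```python
from typing import Iterable


def can_host(peak_bops: float, required_bops: Iterable[float]) -> bool:
    """Return True if the summed job requirements do not exceed the peak performance."""
    return sum(required_bops) <= peak_bops


# Hypothetical server with a peak performance of 300 bops.
fits = can_host(300.0, [100.0, 150.0, 50.0])       # requirements sum to 300
overflows = can_host(300.0, [100.0, 150.0, 60.0])  # requirements sum to 310
```

Here `fits` is true because the sum equals the peak performance, and `overflows` is false because the sum exceeds it.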
If a job is not running, there is no source server. A plurality of sets of a job to be relocated, the source server, and the destination server may be determined as the result of the search at step 1201. Step 1202 is to notify the job relocating function 403 of the destination server identifier, the identifier of the job to be relocated, and the identifier of the source server, if such a source server exists, obtained at step 1201. This function 403 will be detailed using FIG. 13. If there are a plurality of sets of the destination server identifier, the identifier of the job to be relocated, and the source server identifier obtained at step 1201, step 1202 is to notify the job relocating function 403 of all the sets. If it is found at step 1201 that the whole computing system performance is not enough to meet the performance requirements for all jobs, the search function may signal this situation to a function that automatically adds resources to the computing system or present a warning on the display device 306 of the supervisory server 101. Additionally, the search at step 1201 may be conditioned in terms of stability of power distribution of the source server and the destination server and electricity expenses per server, and the search may be performed to allocate more jobs to a server to which power is distributed stably and at a lower electricity expense. In the present embodiment, however, the condition that more jobs are allocated to a server to the extent that performance requirements for each job are fulfilled must be satisfied.
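One way the condition for relocation at step 1201 could be approximated is a greedy first-fit heuristic, sketched below. This is an illustrative assumption and not the search procedure defined by the embodiment: visiting servers in ascending order of power consumption per unit of performance follows the description, but placing the largest jobs first, the tuple layouts, and all identifiers and numbers are choices made only for this sketch.

```python
from typing import Dict, List, Optional, Tuple

Move = Tuple[str, Optional[str], str]  # (job id, source server or None, destination)


def search_relocations(
    servers: Dict[str, Tuple[float, float]],        # id -> (peak bops, power per bops)
    jobs: Dict[str, Tuple[Optional[str], float]],   # id -> (current server, required bops)
) -> Tuple[List[Move], List[str]]:
    """Greedy sketch: fill the most power-efficient servers first."""
    order = sorted(servers, key=lambda sid: servers[sid][1])
    free = {sid: servers[sid][0] for sid in servers}  # remaining capacity per server
    moves: List[Move] = []
    unplaced: List[str] = []
    # Assumption: placing jobs with larger requirements first packs servers tighter.
    for job_id, (source, need) in sorted(jobs.items(), key=lambda kv: -kv[1][1]):
        dest = next((sid for sid in order if free[sid] >= need), None)
        if dest is None:
            unplaced.append(job_id)  # system capacity is insufficient for this job
            continue
        free[dest] -= need
        if dest != source:
            moves.append((job_id, source, dest))
    return moves, unplaced


# Hypothetical input data.
servers = {"S1": (300.0, 0.5), "S2": (200.0, 1.0), "S3": (100.0, 2.0)}
jobs = {"a": ("S3", 100.0), "b": ("S2", 150.0), "c": ("S1", 50.0)}
moves, unplaced = search_relocations(servers, jobs)
```

In this example jobs "b" and "a" are moved to server "S1", job "c" stays where it is, and servers "S2" and "S3" are left with no job. A non-empty `unplaced` list corresponds to the case where the search function would issue a warning or request additional resources.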
FIG. 13 illustrates details of step 704 as mentioned in FIG. 7. Step 704 in FIG. 7 corresponds to the job relocating function 403. Step 1301 is to obtain the source server identifier, the identifier of the job to be relocated, and the destination server identifier from the search function 402. Here, this step may not obtain the source server identifier, if no source server identifier exists. Step 1302 is to stop the job to be relocated on the source server, if the source server exists. However, if a technique that enables relocation of a job without stopping the job to be relocated is available, step 1302 need not be performed. As the technique enabling seamless job relocation, for example, the "Live migration" function provided by Xen open source software and VMotion provided by VirtualCenter software supplied from VMware, Inc. may be available. Also, in the case where no source server exists, step 1302 need not be performed. Step 1303 is to relocate the job from the source server to the destination server.
There are several possible methods of relocating a job, including copying the image of the job program from the source server to the destination server, pre-storing the image of the job program on both the source server and the destination server, and sharing the image of the same job program between the source server and the destination server. Here, the image of the job program is a combination of a program and its associated data. The program may be of any type, for example, OS, middleware, application, driver, etc. If no source server exists, the supervisory server 101 may be regarded as the source server, or a method may be used for distributing the image of the job program to the destination server, using job program image distribution software or the like. Step 1304 is to restart the relocated job on the destination server. Step 1305 is to change the server running the relocated job to the destination server in the second column 602 in the job table 411 of FIG. 6. If a plurality of sets of the source server, destination server, and the job to be relocated are received at step 1301, the subsequent steps 1302 to 1305 are repeated as many times as the number of the sets received at step 1301.
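The sequence of steps 1302 to 1305, repeated for each set received at step 1301, can be sketched as below. The three callables stand in for whatever stop, transfer, and start mechanisms are actually used; they, the `skip_stop` flag, and the dictionary form of the job table are hypothetical and introduced only for illustration.

```python
from typing import Callable, Dict, List, Optional, Tuple

Move = Tuple[str, Optional[str], str]  # (job id, source server or None, destination)


def relocate(
    moves: List[Move],
    job_table: Dict[str, Optional[str]],                        # job id -> server id
    stop_job: Callable[[str, str], None],                       # hypothetical hook
    transfer_image: Callable[[Optional[str], str, str], None],  # hypothetical hook
    start_job: Callable[[str, str], None],                      # hypothetical hook
    skip_stop: bool = False,  # True when relocation without stopping is available
) -> None:
    for job_id, source, dest in moves:
        if source is not None and not skip_stop:
            stop_job(source, job_id)          # step 1302
        transfer_image(source, dest, job_id)  # step 1303
        start_job(dest, job_id)               # step 1304
        job_table[job_id] = dest              # step 1305: update column 602


# Usage with recording stubs in place of real mechanisms.
log: list = []
table: Dict[str, Optional[str]] = {"a": "S3", "f": None}
relocate(
    [("a", "S3", "S1"), ("f", None, "S1")],
    table,
    stop_job=lambda s, j: log.append(("stop", s, j)),
    transfer_image=lambda s, d, j: log.append(("transfer", s, d, j)),
    start_job=lambda d, j: log.append(("start", d, j)),
)
```

Job "f" has no source server, so no stop call is recorded for it; it is simply added to the destination server.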
FIG. 14 is an example of a GUI display prompting the user to confirm job relocation before the job relocating function 403 relocates the jobs. The GUI displays a screen 1401 shown here on the display device 306 as mentioned in FIG. 3, using a browser or a special program, in text form, etc., thereby asking the user whether or not to perform job relocation before starting the job relocation. A first column 1411 lists the identifiers of the jobs to be relocated. A second column 1412 lists the identifiers of the destination servers to which the jobs listed in the first column are relocated. A third column 1413 lists the identifiers of the servers that currently run the jobs listed in the first column 1411. If a job given in the first column 1411 is not running, the corresponding field in the third column 1413 will be empty. A value 1414 of power indicates the expected power that can be cut after the execution of the job relocation. The expected value of the power that can be cut may be broken down per server and per job. If the user chooses to perform the relocation by clicking on the "Do Relocation" button 1415 with the mouse, then the job relocating function 403 is executed. To cancel the job relocation, the user clicks on the "Cancel" button 1416 with the mouse. The GUI screen for the present embodiment is shown, by way of example, for an instance where a plurality of jobs are relocated at a time, whereas a single job may be relocated. On this GUI screen, the remaining margins of performance capacity for each server and for the whole computing system may be shown.
FIG. 15 shows an example of a job relocation method instep1303 inFIG. 13. On thesource server1501 and thedestination server1502, image copy functions1510,1520 for copying the contents of the auxiliary storage device are activated. The image copy functions1510,1520 may be implemented by, for example, a ftp command, a rcp command, or an agent program for copy. Theimage1502 of the job program on the source server is copied to the auxiliary storage device of thedestination server1502 via the image copy functions1510,1520. Thereby, the job can be run on the destination server. A data disk or the like required by the job may be shared using an external storage device.
FIG. 16 shows another example of a job relocation method in step 1303 in FIG. 13. In this method, images of the same job program 1612, 1622 are pre-stored in the auxiliary storage devices 1611, 1621 of both the source server 1601 and the destination server 1602.
FIG. 17 shows another example of a job relocation method in step 1303 in FIG. 13. In this method, the source server 1701 and the destination server 1702 can refer to the same image of the job program 1740 stored in an external storage device 1704.
FIG. 18 shows another example of a job relocation method in step 1303 in FIG. 13. This method relocates a job when jobs are run on virtual servers realized by virtual server functions 1810, 1820 such as Xen open-source software, VMware (registered trademark) from VMware, Inc., and the like. The image 1840 of the job is held in an external storage device 1804 which is shared by the source server 1801 and the destination server 1802. The virtual server function 1810 running on the source server 1801 holds the job status 1811. Here, the job status is the information that is temporarily retained in the memory of the server while programs such as the OS, middleware, drivers, and applications run on the server. In the example of FIG. 18, when the job is relocated from the source server 1801 to the destination server 1802, the job status 1811 on the source is copied to the destination server 1802 through the network 1803. Thereby, the destination server 1802 can restart the relocated job promptly. The job image 1840 may be shared by the source server 1801 and the destination server 1802 via the network without using the external storage device 1804.
FIG. 19 shows details of step 705 as mentioned in FIG. 7. Step 705 in FIG. 7 corresponds to the flow of the server power supply control function 404. Step 1901 is to search for a server on which no job is running in the computing system, using the server table 410 and the job table 411. At step 1902, if a server on which no job is running is found as the result of step 1901, the procedure proceeds to step 1903; if no such server exists, the procedure terminates. Step 1903 is to obtain the power supply status of the server. Here, the server's power supply status may be obtained from a server management module such as a BMC built in the server or from an agent running on the server. At step 1904, if the server's power supply is on as the result of step 1903, the procedure proceeds to step 1905; otherwise, the procedure terminates. If a plurality of servers on which no job is running are found at step 1901, the subsequent steps 1902 to 1904 are repeated as many times as the number of the servers found at step 1901. Step 1905 is to turn off the power supply of the server found to be on as the result of step 1903. Here, the server's power supply may be turned off by requesting the BMC built in the server to turn off the power supply or by requesting the agent running on the server to shut down the power. Instead of turning off the power supply, step 1905 may issue an instruction to put the server in standby mode and temporarily transfer the information existing in the server memory to the auxiliary storage device, thus allowing for faster recovery from standby mode when the server's power supply is turned on.
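The flow of FIG. 19 can be illustrated by the following minimal sketch. The `Server` record, the `job_to_server` mapping, and the function name are hypothetical stand-ins for the server table 410, the job table 411, and the server power supply control function 404; setting a flag stands in for a request to the BMC or agent, which is not modeled here.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    powered_on: bool


def power_off_idle_servers(servers, job_to_server):
    """Turn off every powered-on server that runs no job (steps 1901 to 1905)."""
    busy = set(job_to_server.values())
    turned_off = []
    for server in servers:
        if server.name in busy:        # steps 1901/1902: skip servers running a job
            continue
        if not server.powered_on:      # steps 1903/1904: already off, nothing to do
            continue
        server.powered_on = False      # step 1905: stand-in for a BMC or agent request
        turned_off.append(server.name)
    return turned_off
```

In this sketch the loop body is repeated once per idle server, which corresponds to repeating the steps for each server found at step 1901.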
For reducing the power consumption of the computing system, the method of the present embodiment searches among all jobs for a job to be relocated and searches for the source server from which and the destination server to which the job is relocated, according to the condition for relocation that more jobs shall be allocated to a server with a smallest or smaller value of power consumption per unit of performance to the extent that performance requirements for each job are fulfilled, thus maximizing the number of servers on which no job is running. By this relocation of jobs, it is possible to minimize the power to be consumed, while complying with the SLA of the system.
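One simple way to approximate the condition for relocation stated above is a greedy first-fit placement, sketched below. This is an illustrative heuristic under stated assumptions, not the search procedure of the embodiment: the function name, the dictionary layouts, and the use of peak values only are all hypothetical.

```python
def consolidate(servers, jobs):
    """Greedy first-fit sketch.

    servers: {name: (peak_performance, peak_power)}
    jobs:    {name: required_performance}
    Returns {job: server}; raises ValueError if a job cannot be placed.
    """
    # Prefer servers with the smallest power consumption per unit of performance.
    order = sorted(servers, key=lambda s: servers[s][1] / servers[s][0])
    free = {s: servers[s][0] for s in servers}
    placement = {}
    # Place the most demanding jobs first so that they are not starved of capacity.
    for job in sorted(jobs, key=jobs.get, reverse=True):
        for s in order:
            if free[s] >= jobs[job]:   # the performance requirement is still fulfilled
                free[s] -= jobs[job]
                placement[job] = s
                break
        else:
            raise ValueError(f"no server can host job {job}")
    return placement
```

Servers that receive no job in the returned placement are the candidates whose power supply can be turned off.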
Embodiment 2
A second embodiment (Embodiment 2) of the present invention is a method for intensive allocation of jobs to a subset of the servers, based on an event triggering the power reduction procedure by the power reduction facility 110 in Embodiment 1. A combination of the second embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 20 shows details of the power reduction facility 110A in the second embodiment. The difference from Embodiment 1 lies in that the power reduction facility is provided with a trigger event detecting facility 420 which detects an event that triggers the procedure of the invention illustrated in FIG. 7 in Embodiment 1. The trigger event detecting facility 420 detects one of the trigger events specified below and starts the procedure illustrated in FIG. 7 in Embodiment 1. The trigger events are: addition or removal of a server to/from the computing system; addition, removal, or completion of a job; change in a server configuration; change in the contents of a job or in the performance requirements for a job; change in the power supply environment of a server, such as connection to a UPS, electricity expense, and stable distribution of power; a start request from a user; and an event that the power consumed by each of the servers constituting the computing system or the power consumed by the whole computing system has exceeded or fallen below a predefined amount of power consumption. The trigger event detecting facility 420 can notify each function of the power reduction facility 110A of the type of the trigger event detected. If the trigger event is addition of a server to the computing system or addition of a job, information for the added server or job is added to the server table 410 or the job table 411 by the server properties obtaining function 401 or the jobs properties obtaining function 405. On an event of deletion of a server from the computing system, deletion of a job, or change in a server configuration or in the contents of a job, the change is reflected in the server table 410 and the job table 411. This reflection may be performed automatically by the trigger event detecting facility 420 or by some other method.
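As an illustration of the last trigger event in the list above, the following sketch checks measured power against predefined limits. It covers only the "exceeded" case, and the function name, the event labels, and the limit parameters are hypothetical; the embodiment does not prescribe this interface.

```python
def power_trigger_events(power_by_server, per_server_limit, system_limit):
    """Return trigger events for power readings that exceed predefined limits.

    power_by_server: {server_name: consumed_power}
    """
    events = []
    for name, watts in sorted(power_by_server.items()):
        if watts > per_server_limit:
            events.append(("server_power_exceeded", name))
    # The whole-system check uses the sum of the per-server readings.
    if sum(power_by_server.values()) > system_limit:
        events.append(("system_power_exceeded", None))
    return events
```

A non-empty result would correspond to the trigger event detecting facility 420 starting the procedure of FIG. 7.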
Embodiment 3
A third embodiment (Embodiment 3) of the present invention is a method in which the server properties obtaining function in Embodiment 1 obtains server power properties from a power property measuring function installed in each server. A combination of the third embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 21 shows an overall structural diagram of the computing system in Embodiment 3. The difference from Embodiment 1 lies in that each server 111A-114A is provided with a power property measuring function 2101. The power property measuring function 2101 can measure the performance that the server utilizes (utilization performance) and the amount of power consumed by the whole server at that time, as well as the amount of power consumed by each of its components, and can report its measurement values to the power reduction facility 110.
FIG. 22 shows a detailed structure of each server 111A-114A in the third embodiment, wherein each server is represented as a server 200A. The difference from Embodiment 1 lies in that the server is provided with a measurement device 2201 and the power property measuring function 2101. The measurement device 2201 is able to measure the amount of power consumed by the whole server 200A and the amount of power consumed by each component and report the results to the power property measuring function 2101. Here, the amount of power consumed by each component is the amount of power consumed separately by the memory 201, CPU 202, chip set 203, I/O device 204, NIC 205, BMC 206, cooling device 207, power supply device 208, and auxiliary storage device 209, which constitute the server.
FIG. 23 shows an operation flow (procedure) of the power property measuring function 2101. This procedure is performed as a means for obtaining server performance and the power consumption at that performance in step 802 of the procedure performed by the server properties obtaining function 401, shown in FIG. 8 in Embodiment 1. Step 2301 is to determine whether or not to perform the measurement of server power properties. At this decision, if the power properties of the server have never been measured, the procedure proceeds to step 2302 to perform the measurement. If the power properties of the server have been measured, the procedure proceeds to step 2304 without performing the measurement. However, in the case where the server configuration has been changed by, for example, adding a device to the server, the procedure proceeds to step 2302 to execute the measurement of the server power properties again. Step 2302 is to perform the measurement of the server power properties. Here, the power property measuring function itself has a mechanism to apply a varying load to the server. While this mechanism gradually changes the computational amount from low to high to gradually increase the utilization performance of the server up to the peak performance, the amount of power consumption depending on the utilization performance is measured by the measurement device. Based on the measurements, the power property measuring function creates a power properties list indicating the server performance and the power consumption characteristic depending on the performance. Here, if a job is running on the server, that is, if a program producing a load is running, the program may be automatically terminated or shut down and restarted after the measurement, or may be terminated manually by the user. In the latter case, a message may be displayed on the GUI. Step 2303 is to report the power properties list created at step 2302 to the server properties obtaining function 401. Step 2304 is to measure the server utilization performance and power consumption periodically and repeat the reporting to the power reduction facility 110. However, in the present embodiment, step 2304 may be omitted.
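The measurement at step 2302 can be sketched as a sweep over load levels. In the sketch below, `apply_load` and `read_power` are hypothetical callbacks standing in for the load-applying mechanism and the measurement device 2201; the number of steps and the evenly spaced load levels are assumptions made for illustration.

```python
def build_power_properties(peak_performance, apply_load, read_power, steps=4):
    """Raise the load from 0 up to the peak and record (performance, power) pairs."""
    properties = []
    for i in range(steps + 1):
        performance = peak_performance * i / steps
        apply_load(performance)                  # drive the server to this utilization
        properties.append((performance, read_power()))
    return properties
```

The returned list plays the role of the power properties list reported at step 2303.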
FIG. 24 shows an example of a server table 410B in the third embodiment. While the examples of the server tables 410, 410A shown in Embodiment 1 list only the peak performance values and associated power consumptions of the servers, performance values other than the peak and their associated power consumptions are added to the server table 410B in the third embodiment. When a server is on standby with a performance value of 0, the power consumption per unit of performance is left empty, as it cannot be calculated. In the present embodiment, the search function 402 can obtain power consumptions per unit of performance for utilization performance values other than the peak of each server, referring to the server table 410B. Thus, it can obtain the power consumption per unit of performance for a utilization performance value corresponding to the sum of the performance requirements for all jobs running on the server. Using this value, the search function 402 can search for one or more sets of a job to be relocated, its source server, and its destination server, so that the power consumption of the whole computing system can be reduced.
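The lookup described above can be sketched as follows. The row layout and the function name are hypothetical, and the linear interpolation between measured rows is an assumption of this sketch; the embodiment does not state how values between table entries are obtained.

```python
from bisect import bisect_left


def power_per_unit(rows, utilization):
    """Power consumption per unit of performance at a given utilization.

    rows: (performance, power) pairs sorted by ascending performance.
    Returns None for a utilization of 0, where the ratio is undefined.
    """
    if utilization <= 0:
        return None
    perfs = [p for p, _ in rows]
    if utilization < perfs[0] or utilization > perfs[-1]:
        raise ValueError("utilization outside the measured range")
    i = bisect_left(perfs, utilization)
    if perfs[i] == utilization:
        power = rows[i][1]
    else:
        # Assumed: interpolate linearly between the two neighboring rows.
        (p0, w0), (p1, w1) = rows[i - 1], rows[i]
        power = w0 + (w1 - w0) * (utilization - p0) / (p1 - p0)
    return power / utilization
```

Here `utilization` would be the sum of the performance requirements for all jobs running on the server.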
By obtaining the power consumed by each server or the power consumed by the whole computing system utilizing the power property measuring function 2101 in the third embodiment, the trigger event detecting facility 420 as described in Embodiment 2 can determine whether or not the power consumed by each of the servers constituting the computing system or the power consumed by the whole computing system has exceeded or fallen below the predefined amount of power consumption.
Embodiment 4
A fourth embodiment (Embodiment 4) of the present invention is a method in which the job properties obtaining function in Embodiment 1 obtains the load variation characteristic of each job and a search for a job to be relocated is performed based on this load variation characteristic. A combination of the fourth embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 25 shows a job table 411A in the fourth embodiment. The difference from Embodiment 1 lies in that a fourth column 604 is added. The fourth column 604 holds the load variation characteristic of the job given in the first column 601. The load variation characteristic indicates how the utilization performance for the job varies over time; for example, “constant” indicates almost no variation of load and “large variation” indicates that the load varies widely. For example, for a job such as a Web server, in most cases the load is high during hours when a large number of service requests arrive from users, whereas the load is low during other hours. Thus, the load variation characteristic for this job is “large variation”. The load variation characteristic is obtained by the jobs properties obtaining function 405. When searching for a job to be relocated, the search function 402 refers to the job table 411A and preferentially selects jobs whose load variation characteristic is “constant” as candidates to be relocated rather than those of “large variation”. The load variation characteristic may include information such as the degrees of variation in I/O utilization performance and memory utilization.
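The preference for “constant” jobs can be sketched as a stable sort over the rows of the job table. The dictionary keys and the function name below are hypothetical; any characteristic other than “constant” is treated like “large variation” as a simplifying assumption.

```python
def relocation_candidates(job_table):
    """Order jobs so that those with a 'constant' load come first.

    job_table: list of dicts with 'job' and 'load_variation' keys.
    """
    rank = {"constant": 0, "large variation": 1}
    ordered = sorted(job_table, key=lambda row: rank.get(row["load_variation"], 1))
    return [row["job"] for row in ordered]
```

Because the sort is stable, jobs with the same characteristic keep their original table order.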
FIG. 26 shows a computing system structure in the fourth embodiment. The job property measuring function 2601 within each server 111B-114B can obtain the load variation characteristic of each job by measuring the utilization performance of the running jobs. Also, the job property measuring function 2601 can notify the jobs properties obtaining function 405 of the obtained load variation characteristics of the jobs. Here, the job property measuring function 2601 can make use of a program such as, for example, JP1/Performance Management or the Linux top command.
Embodiment 5
A fifth embodiment (Embodiment 5) of the present invention is a method in which the job properties obtaining function in Embodiment 4 obtains the priority of job relocation and a search for a job to be relocated is performed based on this priority. A combination of the fifth embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 27 shows a job table 411B in the fifth embodiment. The difference from Embodiment 4 lies in that a fifth column 605 is added. The fifth column 605 holds the priority of job relocation for the job given in the first column 601. In the example shown here, a priority level of “high”, “medium”, or “low” is given to each job, and jobs are selected for relocation in that order; “impossible” denotes that the job cannot be relocated. The search function 402 refers to this priority value, preferentially selects a job with a “high” priority as the job to be relocated, and excludes any job whose priority value is “impossible”.
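The use of the priority column can be sketched as a filter followed by a sort. The dictionary keys, the constant name, and the function name are hypothetical illustrations of the job table 411B and the search function 402.

```python
PRIORITY_ORDER = {"high": 0, "medium": 1, "low": 2}


def select_by_priority(job_table):
    """Drop jobs marked 'impossible' and order the rest from 'high' to 'low'.

    job_table: list of dicts with 'job' and 'priority' keys.
    """
    movable = [row for row in job_table if row["priority"] != "impossible"]
    ordered = sorted(movable, key=lambda row: PRIORITY_ORDER[row["priority"]])
    return [row["job"] for row in ordered]
```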
Embodiment 6
A sixth embodiment (Embodiment 6) of the present invention is a method allowing the user to specify, as a search policy, a condition for relocation by which the search function 402 in Embodiment 1 searches for a job to be relocated and its source and destination servers. A combination of the sixth embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 28 shows details of the power reduction facility 110B in the sixth embodiment. The difference from Embodiment 1 lies in that a search policy table 412 is added. The search function 402A searches for a job to be relocated and its source and destination servers, using one or more conditions defined in the search policy table 412 in addition to the condition that more jobs are allocated to a subset of the servers, while complying with the performance requirements for all jobs running in the computing system, as described in Embodiment 1.
FIG. 29 shows details of the search policy table 412. A first column 2901 holds the identifier number of a search policy. The user can set a condition for relocation as the condition to be used by the search function 402A by specifying its identifier number and may set a plurality of search policies as the conditions at a time. A second column 2902 holds the description of the search policy whose identifier number is given in the first column 2901. Search policies may be registered into this table in different ways: they may be entered by a user by means of the GUI provided by the power reduction facility 110 or using command lines, may be retrieved from a file stored in a storage device connected to the supervisory server 101, or may be acquired via a network. In FIG. 29, the search policy with policy number 1 states that jobs should be relocated so as to level the power consumed by the servers in the computing system. The search policy with policy number 2 states that jobs should be relocated so that each server operates at the utilization performance with the lowest power consumption per unit of performance, as given in the server table 410. The search policy with policy number 3 states that jobs should be relocated so that only one job runs on the server 2.
FIG. 30 shows an operation flow (procedure) of the search function 402A in the sixth embodiment. The difference from Embodiment 1 is step 3001. Step 3001 is to search for a job to be relocated, its source server, and its destination server under conditions based on the contents of the search policy table 412. The user can set a policy as the condition to be used by specifying its identifier number and may set a plurality of search policies as the conditions at a time. There are several possible methods by which the user sets the conditions to be used. The setting may be entered by a user by means of the GUI provided by the power reduction facility 110 or using command lines, may be retrieved from a file stored in a storage device connected to the supervisory server 101, or may be acquired via a network.
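One way to picture step 3001 is to treat each selected search policy as a check applied to candidate placements. The sketch below models only a policy like policy number 3; the table layout, the server name "server2", the function names, and the reading of "only one job" as "at most one job" are all assumptions made for illustration.

```python
def only_one_job_on(server):
    """Build a check that a placement puts at most one job on the given server."""
    def check(placement):
        return sum(1 for s in placement.values() if s == server) <= 1
    return check


# Hypothetical in-memory form of the search policy table 412: id -> (description, check).
SEARCH_POLICIES = {
    3: ("only one job runs on server 2", only_one_job_on("server2")),
}


def filter_candidates(candidates, active_policy_ids):
    """Keep the candidate placements ({job: server}) that satisfy every active policy."""
    checks = [SEARCH_POLICIES[i][1] for i in active_policy_ids]
    return [c for c in candidates if all(check(c) for check in checks)]
```

Passing several identifier numbers at once corresponds to setting a plurality of search policies as the conditions at a time.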
In the above description, one or more search policies set by the user are assumed to be used in combination with the default condition for relocation by which the search is performed at step 1201 described in FIG. 12 in Embodiment 1. However, a search policy may be provided to state that the performance requirements for the jobs be ignored. According to this policy, the performance requirements for the jobs may not be fulfilled.
Embodiment 7
A seventh embodiment (Embodiment 7) of the present invention is a method in which verification is performed after the execution of job relocation in Embodiment 1 and recovery processing is performed if a problem is detected. A combination of the seventh embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 31 shows details of the power reduction facility 110C in the seventh embodiment. The difference from Embodiment 1 lies in that a verification function 421 is added. The verification function verifies the operating status of the computing system after the execution of job relocation.
FIG. 32 shows a general operation flow of Embodiment 7. The difference from Embodiment 1 is step 3201. Step 3201 is to verify the operating status of the whole computing system after the job relocation performed at step 704.
FIG. 33 shows details of step 3201 as mentioned in FIG. 32. Step 3201 in FIG. 33 corresponds to the operation flow of the verification function 421 as mentioned in FIG. 31. Step 3301 is to obtain the operating status of each of the servers constituting the computing system and of each job. Here, the operating status of each server is obtained by obtaining the utilization performance and power consumption of the server by means of the power property measuring function 2101 or the like, as described in Embodiment 3. The operating status of each job is obtained by obtaining the utilization performance for the job by means of the job property measuring function 2601 or the like, as described in Embodiment 4. Step 3302 is to verify the operating status of each server and each job obtained at step 3301. The verification function verifies that the performance requirements for all jobs running in the computing system are fulfilled and that the power consumption of the whole computing system is lower than the power consumption before the execution of job relocation. It is determined that there is a problem if a server does not meet the performance requirements for all jobs running on the server, or if the measured power consumption of the server is significantly larger than the power consumption associated with that performance in the server table 410. However, because the power consumption of the whole computing system varies depending on the operating status of each job, some margin may be added to take this variation into account when comparing the power consumption after the job relocation with that before the job relocation. Because the power supply of the source server remains on, the verification ignores the power consumption of the source server. At step 3303, a branch occurs, depending on whether or not there is a problem as the result of the verification at step 3302. If there is a problem, the procedure proceeds to step 3304; if there is no problem, the procedure terminates. Step 3304 is to perform a problem solution process for the problem detected at step 3302. The problem solution process may include returning the relocated jobs to the source servers and re-executing the procedure of the search function 402.
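The checks at steps 3302 and 3303 can be sketched as follows. The function name, the data layout, and the treatment of the margin as a simple additive allowance are assumptions of this sketch; the per-server comparison against the server table 410 is omitted for brevity.

```python
def verify_relocation(job_status, power_before, power_after, margin=0.0):
    """Return a list of detected problems; an empty list means no problem.

    job_status: {job_name: (required_performance, achieved_performance)}
    power_before / power_after: whole-system power before and after relocation.
    """
    problems = [
        f"job {name} below requirement"
        for name, (required, achieved) in sorted(job_status.items())
        if achieved < required
    ]
    # Power should have dropped; the margin absorbs load-dependent variation.
    if power_after >= power_before + margin:
        problems.append("system power not reduced")
    return problems
```

A non-empty result corresponds to the branch to step 3304, where, for example, the relocated jobs may be returned to the source servers.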
Embodiment 8
An eighth embodiment (Embodiment 8) of the present invention is a method in which the server power supply control function 404 in Embodiment 1 turns off the power supply of a server and automatically sets the server to be used as a cold standby server. A combination of the eighth embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 34 shows an operation flow (procedure) of the server power supply control function 404A in the eighth embodiment. The difference from Embodiment 1 lies in that step 3401 is added. At step 3401, after the server's power supply is turned off at step 1905, the information for the server is sent to a cold standby function. Having received the information, the cold standby function automatically uses the server as a cold standby server. Here, the cold standby function hands over the job(s) running on an active server to a standby server if a fault in the active server makes it impossible to continue the job(s), or when instructed by the user.
Embodiment 9
A ninth embodiment (Embodiment 9) of the present invention is a system configuration where a special supervisory server is not installed. This is an example of operation in which power reduction facilities run on each server running jobs, unlike Embodiment 1 where the power reduction facility 110 runs in the supervisory server 101, as described for FIG. 1. A combination of the ninth embodiment and any other embodiment can be regarded as one embodiment of the present invention.
FIG. 35 shows a structure of the computing system in Embodiment 9. The difference from Embodiment 1 lies in that power reduction facilities 110D run on the servers 111C-114C. In the present embodiment, the power reduction facilities 110D may be regarded as jobs as defined herein, but these power reduction facilities 110D are not objects to be relocated. To exclude the power reduction facilities 110D from the objects to be relocated, they are not included in the job tables 411, 411A, 411B. That is, the power reduction facilities 110D are not listed when the job table 411 is created.
The power reduction facilities 110D are regarded as programs that start automatically when each server is booted, like the OS. In the system structure of FIG. 35, if, for example, a job e 124 existing on a server 113C is relocated to a server 111C or 112C by the power reduction facility 110D, there will be no job running on the server 113C. Therefore, the power supply of the server 113C is turned off and, at the same time, the power reduction facility 110D on the server 113C is deactivated. Thereafter, control is exercised by the power reduction facility 110D on the server 111C or 112C.
By the method and computer program for reducing the power consumption of a computing system of the present invention, described above in detail, in the computing system where a plurality of servers are connected by a network and one or more jobs are run, it is possible to reduce the power consumed, while complying with the SLA of the computing system.
The computing system to which the present invention is applied may be an intelligent home appliance system. The invention is also effective as a method for reducing the power consumption of the intelligent home appliance system. For example, when a microwave oven connected to an air conditioner by home networking or the like is used while the air conditioner is operating, the present invention can be applied so that the air conditioner is temporarily deactivated and consumes no power, thereby ensuring stable power distribution to the microwave oven. In this case, each server is configured with the controller CPU and a program memory in each of the home appliance products constituting the intelligent home appliance system.