Background art
Cloud computing offers nearly unlimited computing, storage, and communication capacity. Correspondingly, the IT architecture that provides cloud computing services is the data center, which concentrates large-scale infrastructure resources. A data center usually has a large number of servers, and the access of so many servers places a heavy burden on the data center network. As the number and density of servers keep increasing, the data center network becomes overloaded. To support large-scale, Internet-wide applications, a server cluster may reach thousands or tens of thousands of machines, and many problems arise in service deployment, expansion, operation, and support.
Fig. 1 shows a non-blocking network model. The access-layer switch has 40 Gigabit downlink ports and 4 10-Gigabit uplink ports, which connect without blocking to n aggregation switches (or other core network devices); n=2 in Fig. 1. In network planning, equal-cost multipath routing (Equal-Cost Multipath Routing, ECMP) is used to load-balance the uplink bandwidth of the access switch, so that the whole network switches without blocking, any pair of server ports can communicate at Gigabit wire speed, and bandwidth is eliminated as a limiting factor inside the cloud computing cluster. Equal-cost routing means that, for the same routing protocol, multiple routes with the same destination and the same cost may be configured. When no route of higher priority exists among the routes to the same destination, all of these routes are adopted, and packets forwarded to that destination are sent over each path in turn, which load-balances the network. For the same destination, a given routing protocol may also discover several equal-cost routes; if this routing protocol has the highest priority among all active routing protocols, these routes are all regarded as currently valid routes. In this way, load balancing of IP traffic is guaranteed at the routing protocol level.
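The non-blocking condition and the in-turn distribution over equal-cost paths described above can be illustrated with a minimal Python sketch. The port counts are those given for Fig. 1; the function names `is_non_blocking` and `ecmp_round_robin` and the path labels are hypothetical, and real switches commonly hash flows onto paths rather than alternating per packet.

```python
from itertools import cycle


def is_non_blocking(down_ports, down_gbps, up_ports, up_gbps):
    """True when total uplink capacity is at least total downlink capacity."""
    return up_ports * up_gbps >= down_ports * down_gbps


def ecmp_round_robin(packets, equal_cost_paths):
    """Assign packets to the equal-cost paths in turn."""
    path_iter = cycle(equal_cost_paths)
    return [(pkt, next(path_iter)) for pkt in packets]


# 40 Gigabit downlink ports versus 4 10-Gigabit uplink ports (Fig. 1).
non_blocking = is_non_blocking(40, 1, 4, 10)

# Two equal-cost paths, one per aggregation switch (n=2); labels are illustrative.
assignment = ecmp_round_robin(["pkt0", "pkt1", "pkt2", "pkt3"],
                              ["path-A", "path-B"])
```

With these figures the uplink capacity (40 Gbit/s) equals the downlink capacity, and the four packets alternate between the two paths.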
As shown in Fig. 2, in the prior art, an aggregated link connecting the access switch to an aggregation switch may become abnormal. For example, when the link on port P2 fails, the aggregated link P1/P2 is not interrupted, but its reduced bandwidth changes the cost of the link. The access switch judges that the route priority of this aggregated link is lower than that of aggregated link P3/P4, so the route over P1/P2 is deleted from the equal-cost routes. As a result, the P1/P2 aggregated link is left idle even though it still has usable bandwidth. All traffic is forwarded over aggregated link P3/P4, which puts excessive traffic pressure on that link and leads to congestion and packet loss, so that data center services become abnormal and users' normal access is affected.
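The prior-art behaviour can be reproduced in a few lines of Python. This is only a sketch under stated assumptions: the cost is taken to be inversely proportional to bandwidth (the text says only that reduced bandwidth changes the cost), each member link is assumed to be 10 Gbit/s, and `REFERENCE_GBPS`, `link_cost`, and `ecmp_set` are hypothetical names.

```python
REFERENCE_GBPS = 100  # hypothetical reference bandwidth used to derive a cost


def link_cost(member_up):
    """Cost of an aggregated link; each working member is assumed to be 10 Gbit/s."""
    bandwidth = 10 * sum(member_up.values())
    return float("inf") if bandwidth == 0 else REFERENCE_GBPS / bandwidth


def ecmp_set(aggregates):
    """Keep only the aggregated links whose cost equals the lowest cost."""
    costs = {name: link_cost(members) for name, members in aggregates.items()}
    best = min(costs.values())
    return sorted(name for name, cost in costs.items() if cost == best)


aggregates = {
    "P1/P2": {"P1": True, "P2": True},
    "P3/P4": {"P3": True, "P4": True},
}
before = ecmp_set(aggregates)        # both aggregated links share the load
aggregates["P1/P2"]["P2"] = False    # member link P2 fails
after = ecmp_set(aggregates)         # P1/P2 drops out although P1 still works
```

After the failure only P3/P4 remains in the equal-cost set, which is exactly the idle-bandwidth situation the invention addresses.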
Summary of the invention
The invention provides a packet forwarding method and device that avoid the congestion and packet loss caused when all traffic is forwarded over the aggregated links of high route priority.
The invention provides a packet forwarding method applied to a non-blocking access network, in which a packet forwarding device forwards packets over multiple aggregated links that are equal-cost routes of one another and have the same high route priority. The method comprises:
when a member link of an aggregated link serving as an equal-cost route fails, the packet forwarding device forwards received packets over the remaining aggregated links of high route priority;
the packet forwarding device collects statistics on the packet loss of packets forwarded over the remaining aggregated links of high route priority;
when the packet loss satisfies a preset condition, the packet forwarding device marks received packets according to a preset policy;
the packet forwarding device sends packets, according to their marks, over the remaining aggregated links of high route priority and over links other than the remaining aggregated links of high route priority, respectively.
The packet forwarding device marking received packets according to a preset policy comprises: the packet forwarding device counts the bandwidth occupied by received packets and, when the counted bandwidth exceeds the bandwidth of the remaining aggregated links of high route priority, applies a first mark to received packets;
The packet forwarding device sending packets, according to their marks, over the remaining aggregated links of high route priority and over links other than the remaining aggregated links of high route priority comprises: the packet forwarding device sends packets carrying the first mark over the remaining aggregated links of high route priority, and sends packets not carrying the first mark over links other than the remaining aggregated links of high route priority.
When a member link of an aggregated link serving as an equal-cost route fails, the packet forwarding device forwarding received packets over the remaining aggregated links of high route priority comprises: when a member link of an aggregated link serving as an equal-cost route fails, the packet forwarding device sets the state of the faulty aggregated link from active to standby, stops forwarding received packets over the faulty aggregated link, and forwards packets over the aggregated links of high route priority other than the faulty aggregated link;
when the failed member link recovers, the packet forwarding device updates the state of the faulty aggregated link to active and resumes forwarding packets over that aggregated link.
The packet forwarding device counting the bandwidth occupied by received packets and, when the counted bandwidth exceeds the bandwidth of the remaining aggregated links of high route priority, applying a first mark to received packets comprises: the packet forwarding device issues an aggregate CAR rate-limit configuration on the downlink ports and applies the first mark to received packets that fall within the bandwidth range of the remaining aggregated links of high route priority.
When the links other than the remaining aggregated links of high route priority have different route priorities, the method further comprises:
when the counted bandwidth exceeds the bandwidth of the remaining aggregated links of high route priority, the packet forwarding device applies different marks to received packets according to the route priorities of the links other than the remaining aggregated links of high route priority, and forwards each packet over the link corresponding to its mark.
The invention provides a packet forwarding device applied to a non-blocking access network, comprising:
a transmitting unit, configured to forward packets over multiple aggregated links that are equal-cost routes of one another and have the same high route priority; to forward received packets over the remaining aggregated links of high route priority when a member link of an aggregated link serving as an equal-cost route fails; and, when the packet loss of the remaining aggregated links of high route priority satisfies a preset condition, to send packets, according to their marks, over the remaining aggregated links of high route priority and over links other than the remaining aggregated links of high route priority, respectively;
a statistic unit, configured to collect statistics, when a member link of an aggregated link serving as an equal-cost route fails, on the packet loss of packets forwarded over the remaining aggregated links of high route priority;
a marking unit, configured to mark received packets according to a preset policy when the packet loss obtained by the statistic unit satisfies a preset condition.
The marking unit is further configured to: count the bandwidth occupied by received packets; and, when the counted bandwidth exceeds the bandwidth of the remaining aggregated links of high route priority, apply a first mark to received packets;
The transmitting unit is further configured to: send packets carrying the first mark over the remaining aggregated links of high route priority, and send packets not carrying the first mark over links other than the remaining aggregated links of high route priority.
The device further comprises a state setting unit, configured to set the state of the faulty aggregated link from active to standby when a member link of an aggregated link serving as an equal-cost route fails, and to update the state of the faulty aggregated link to active when the failed member link recovers.
The marking unit is further configured to: issue an aggregate CAR rate-limit configuration on the downlink ports, and apply the first mark to received packets that fall within the bandwidth range of the remaining aggregated links of high route priority.
The marking unit is further configured to: when the links other than the remaining aggregated links of high route priority have different route priorities, and the counted bandwidth exceeds the bandwidth of the remaining aggregated links of high route priority, apply different marks to received packets according to the route priorities of the links other than the remaining aggregated links of high route priority;
The transmitting unit is further configured to: forward each packet over the link corresponding to its mark.
Compared with the prior art, the invention has at least the following advantage:
In the invention, when the packet loss of the aggregated links of high route priority satisfies a preset condition, the packet forwarding device forwards packets over links other than the aggregated links of high route priority. This avoids the congestion and packet loss caused by sending all traffic over the aggregated links of high route priority, and keeps data center services running normally.
Embodiment
The core idea of the invention is as follows. The packet forwarding device has multiple links in the uplink direction. Packets that the device receives from the downlink direction are sent over the high-priority uplinks, and the low-priority uplinks are left idle. When excessive downlink traffic causes uplink packet loss, the device divides the downlink traffic by priority, sends high-priority packets over the high-priority uplinks, and sends low-priority packets over the low-priority uplinks.
To describe the packet forwarding method provided by the invention clearly, the method is illustrated below with a concrete application scenario. As shown in Fig. 3, the method is applied to a non-blocking network as an example, where the packet forwarding device is access-layer switch S1. As shown in Fig. 4, the method comprises the following steps:
Step 401: S1 aggregates its uplink ports according to their states, aggregating links P1 and P2 into one aggregated link and links P3 and P4 into another.
Step 402: after calculating routes according to the link states, S1 finds that the P1/P2 link (the aggregated link of P1 and P2) and the P3/P4 link have the same priority and are equal-cost routes, and therefore sends packets to S3 and S4 over the P1/P2 link and the P3/P4 link.
When calculating link priority, S1 mainly considers factors such as link bandwidth and link cost. In the invention, the P1/P2 link and the P3/P4 link have the same bandwidth and the same priority, so they form equal-cost routes, and S1 load-balances across the P1/P2 link and the P3/P4 link.
Step 403: after S1 determines that the P2 link has failed, it changes the state of the P1/P2 aggregated link from Active to Standby, and packets subsequently received from the servers are transmitted only over the P3/P4 link.
The P1/P2 aggregated link is a logical link. After its state is set to Standby, data can still be transmitted over its physical member link P1. However, because the bandwidth of the P1 link is lower than that of the P3/P4 link, its priority is lower, and while the P3/P4 link is normal S1 uses only the P3/P4 link to transmit packets.
Step 404: from the port packet loss statistics, S1 learns that the packet loss on uplink ports P3 and P4 satisfies the preset condition and determines that the uplink bandwidth is insufficient. S1 issues a bandwidth rate-limit configuration on the downlink ports: traffic within the rate-limit bandwidth is marked DSCP 7 (highest priority), and traffic beyond the rate-limit bandwidth is marked DSCP 6 (low priority).
S1 enables port packet loss statistics and monitors packet loss on uplink ports P3 and P4. When S1 determines that P3 and P4 are dropping packets, or detects that the packet loss exceeds a threshold, it determines that the uplink bandwidth is insufficient. S1 then issues an aggregate CAR (Committed Access Rate) rate-limit configuration on the downlink ports, so that traffic entering S1 is given traffic marks of different levels. Specifically, packets received on the downlink ports of S1 enter the packet processing chip, which counts the traffic entering on the specified range of ports according to the aggregate CAR configuration and marks the traffic according to the count. In this embodiment of the invention, the aggregate CAR configuration applies the bandwidth rate limit to the traffic entering on all downlink ports; that is, the packet processing chip counts the traffic received on all downlink ports, marks traffic within the rate-limit bandwidth as high priority, and marks traffic beyond the rate-limit bandwidth as low priority.
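The marking in step 404 can be sketched in Python as a single meter shared by all downlink ports. This is a simplified illustration, not the chip's actual algorithm: it counts bytes per measurement interval instead of running a token bucket, and the class name `AggregateCar` and its parameters are hypothetical. The DSCP values 7 and 6 are the ones used in the embodiment.

```python
HIGH_DSCP = 7  # mark used in the embodiment for traffic within the rate limit
LOW_DSCP = 6   # mark used in the embodiment for traffic beyond the rate limit


class AggregateCar:
    """One shared meter for traffic from all downlink ports (simplified)."""

    def __init__(self, committed_bytes_per_interval):
        self.committed = committed_bytes_per_interval
        self.used = 0

    def start_interval(self):
        """Reset the shared byte count at the start of a measurement interval."""
        self.used = 0

    def mark(self, packet_bytes):
        """Return DSCP 7 while within the committed volume, otherwise DSCP 6."""
        if self.used + packet_bytes <= self.committed:
            self.used += packet_bytes
            return HIGH_DSCP
        return LOW_DSCP


car = AggregateCar(committed_bytes_per_interval=3000)
# Packets arriving on any downlink port draw on the same budget.
marks = [car.mark(size) for size in (1500, 1500, 1500, 500)]
```

In this sketch the first two packets fit the committed volume and receive DSCP 7; the rest of the interval's traffic receives DSCP 6. Setting the committed volume to the bandwidth of the P3/P4 link matches the intent that in-profile traffic is what the remaining aggregated link can carry.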
Step 405: when S1 sends packets over the uplinks, it forwards them according to their traffic marks, transmitting packets with DSCP 7 over the P3/P4 link and packets with DSCP 6 over the P1 link.
S1 is configured with a packet forwarding policy that sends high-priority packets over the high-priority uplink and low-priority packets over the low-priority uplink. In this embodiment of the invention, packets with DSCP 7 are transmitted over the high-priority P3/P4 link, and packets with DSCP 6 are transmitted over the low-priority P1 link.
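The forwarding policy of step 405 amounts to a lookup from traffic mark to uplink, which the following Python sketch illustrates. The mapping of DSCP 7 to P3/P4 and DSCP 6 to P1 comes from the embodiment; the names `FORWARDING_POLICY`, `select_uplink`, and `forward`, and the choice to send unmarked packets over P3/P4 by default, are assumptions made for illustration.

```python
# DSCP value -> uplink, as in the embodiment.
FORWARDING_POLICY = {7: "P3/P4", 6: "P1"}


def select_uplink(dscp, policy=FORWARDING_POLICY, default="P3/P4"):
    """Pick the uplink for a mark; the default for other marks is an assumption."""
    return policy.get(dscp, default)


def forward(packets):
    """Group (packet_id, dscp) pairs by the uplink that would carry them."""
    queues = {}
    for packet_id, dscp in packets:
        queues.setdefault(select_uplink(dscp), []).append(packet_id)
    return queues


queues = forward([("a", 7), ("b", 6), ("c", 7)])
```

Packets "a" and "c" are queued for the P3/P4 link and packet "b" for the P1 link.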
If the P2 link recovers, the method may further comprise:
Step 406: S1 learns that the P2 link has recovered and changes the state of the P1/P2 aggregated link from Standby to Active.
Step 407: S1 deletes the bandwidth rate-limit configuration previously issued on the downlink ports.
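Steps 403 to 407 together form a small state machine, sketched below in Python. The class `UplinkController`, its method names, and the numeric loss threshold are hypothetical; the sketch only tracks the Active/Standby state of the P1/P2 aggregated link and whether the rate-limit configuration is installed.

```python
class UplinkController:
    """Tracks the P1/P2 aggregate state and the downlink rate-limit configuration."""

    def __init__(self, loss_threshold):
        self.loss_threshold = loss_threshold  # hypothetical preset condition
        self.state = "Active"
        self.car_installed = False

    def on_member_fault(self):
        # Step 403: the degraded aggregated link becomes Standby.
        self.state = "Standby"

    def on_loss_report(self, dropped_packets):
        # Step 404: install the rate limit once loss exceeds the threshold.
        if self.state == "Standby" and dropped_packets > self.loss_threshold:
            self.car_installed = True

    def on_member_recovery(self):
        # Steps 406 and 407: back to Active, then delete the rate limit.
        self.state = "Active"
        self.car_installed = False


def run_scenario():
    """Replay fault, light loss, heavy loss, and recovery; record each state."""
    ctl = UplinkController(loss_threshold=100)
    history = []
    ctl.on_member_fault()
    history.append((ctl.state, ctl.car_installed))
    ctl.on_loss_report(dropped_packets=50)
    history.append((ctl.state, ctl.car_installed))
    ctl.on_loss_report(dropped_packets=500)
    history.append((ctl.state, ctl.car_installed))
    ctl.on_member_recovery()
    history.append((ctl.state, ctl.car_installed))
    return history


history = run_scenario()
```

The recorded history shows the rate limit being installed only after loss exceeds the threshold and being removed again on recovery.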
It should be noted that the method provided by the invention is not limited to the scenario shown in Fig. 3; it applies equally when three or more links are aggregated and when there are three or more aggregated links. As shown in Fig. 5, the access layer comprises switches A1 and A2, and the aggregation layer comprises switches C1, C2, C3, and C4. A1 aggregates links P11, P12, P13, and P14 into aggregated link P1, and aggregates links P21-P24, P31-P34, and P41-P44 into aggregated links P2, P3, and P4, respectively. The bandwidth of each aggregated link is 4*n, and the total uplink traffic is m; when m≤16*n, uplink traffic is sent normally. If link P11 between A1 and C1 is abnormal and links P21 and P22 between A1 and C2 are abnormal, the aggregated links between A1 and C1 and between A1 and C2 are deleted from the equal-cost routes, which wastes 8*n of bandwidth resources. In this case, congestion and packet loss occur when m>8*n.
In the invention, when switch A1 detects packet loss, it issues a bandwidth rate-limit configuration on the downlink ports. Traffic within the rate-limit bandwidth is marked high priority and traffic beyond it is marked low priority; high-priority traffic is sent over aggregated links P3 and P4, and low-priority traffic is sent over aggregated links P1 and P2. Optionally, traffic within the rate-limit bandwidth is marked high priority, traffic in the first 3*n of bandwidth beyond the rate limit is marked second priority, and the remaining traffic is marked low priority; high-priority traffic is sent over aggregated links P3 and P4, second-priority traffic over aggregated link P1, and low-priority traffic over aggregated link P2.
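The optional three-level marking for Fig. 5 can be worked through numerically. The Python sketch below assumes the rate-limit bandwidth equals the 8*n capacity of the intact aggregated links P3 and P4, and that P1 keeps 3*n of capacity after losing one member link; the function name `split_traffic` is hypothetical. It does not model what happens if the low-priority share exceeds the capacity left on P2.

```python
def split_traffic(m, n):
    """Split total uplink traffic m across the aggregated links by priority level."""
    high_limit = 8 * n    # P3 and P4 intact: 2 aggregated links x 4 member links x n
    second_limit = 3 * n  # P1 has three working member links left
    high = min(m, high_limit)             # marked high priority
    second = min(m - high, second_limit)  # marked second priority
    low = m - high - second               # marked low priority
    return {"P3+P4": high, "P1": second, "P2": low}


shares_heavy = split_traffic(m=12, n=1)
shares_light = split_traffic(m=6, n=1)
```

For m=12 and n=1, the first 8 units go over P3 and P4, the next 3 over P1, and the last unit over P2; for m=6 all traffic stays on P3 and P4.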
With the method provided by the invention, when the packet loss of the high-priority uplinks satisfies a preset condition, the packet forwarding device sends packets over uplinks other than the high-priority uplinks. This avoids the congestion and packet loss caused by sending all traffic over the high-priority uplinks, and keeps data center services running normally.
Based on the same or a similar technical concept as the method embodiment above, the invention further provides a packet forwarding device comprising multiple uplinks of different priorities. As shown in Fig. 6, the packet forwarding device comprises:
Transmitting unit 11, configured to forward packets over multiple aggregated links that are equal-cost routes of one another and have the same high route priority; to forward received packets over the remaining aggregated links of high route priority when a member link of an aggregated link serving as an equal-cost route fails; and, when the packet loss of the remaining aggregated links of high route priority satisfies a preset condition, to send packets, according to their marks, over the remaining aggregated links of high route priority and over links other than the remaining aggregated links of high route priority, respectively;
Statistic unit 12, configured to collect statistics, when a member link of an aggregated link serving as an equal-cost route fails, on the packet loss of packets forwarded over the remaining aggregated links of high route priority;
Marking unit 13, configured to mark received packets according to a preset policy when the packet loss obtained by statistic unit 12 satisfies a preset condition.
Specifically, marking unit 13 marking received packets according to a preset policy comprises: counting the bandwidth occupied by received packets; and, when the counted bandwidth exceeds the bandwidth of the remaining aggregated links of high route priority, applying a first mark to received packets. Correspondingly, transmitting unit 11 sends packets carrying the first mark over the remaining aggregated links of high route priority, and sends packets not carrying the first mark over links other than the remaining aggregated links of high route priority. Specifically, marking unit 13 is further configured to issue an aggregate CAR rate-limit configuration on the downlink ports and apply the first mark to received packets that fall within the bandwidth range of the remaining aggregated links of high route priority. The device may further comprise configuration deletion unit 15, configured to delete the aggregate CAR configuration issued on the downlink ports after the faulty aggregated link recovers.
The device further comprises state setting unit 14, configured to set the state of the faulty aggregated link from active to standby when a member link of an aggregated link serving as an equal-cost route fails, and to update the state of the faulty aggregated link to active when the failed member link recovers.
The device may further comprise route priority determining unit 16, configured to determine the route priority of a link according to the link state, for example according to the link bandwidth. When a member link of an aggregated link fails, the bandwidth of the aggregated link decreases, and route priority determining unit 16 determines that the route priority of the aggregated link is lowered; after the failed member link recovers, the bandwidth of the aggregated link increases, and route priority determining unit 16 determines that the route priority of the aggregated link is raised. The embodiments of the invention do not limit the specific manner of determining the route priority.
Marking unit 13 is further configured to: when the links other than the remaining aggregated links of high route priority have different route priorities, and the counted bandwidth exceeds the bandwidth of the remaining aggregated links of high route priority, apply different marks to received packets according to the route priorities of the links other than the remaining aggregated links of high route priority. Correspondingly, transmitting unit 11 is further configured to forward each packet over the link corresponding to its mark.
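How the units described above might cooperate can be sketched as a set of small Python classes. This is an illustrative decomposition only: the class and method names are hypothetical, bandwidth is counted in abstract units, and the first mark is represented by a boolean.

```python
class StatisticUnit:
    """Counts dropped packets on the remaining high-priority aggregated link."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical preset condition
        self.dropped = 0

    def record_drop(self, count=1):
        self.dropped += count

    def condition_met(self):
        return self.dropped > self.threshold


class MarkingUnit:
    """Applies the first mark while counted traffic fits the remaining bandwidth."""

    def __init__(self, remaining_bandwidth):
        self.remaining_bandwidth = remaining_bandwidth
        self.counted = 0

    def mark(self, size):
        self.counted += size
        return self.counted <= self.remaining_bandwidth


class TransmittingUnit:
    """Chooses the outgoing link from the mark."""

    def send(self, first_mark):
        return "remaining aggregated link" if first_mark else "other link"


class ForwardingDevice:
    def __init__(self, threshold, remaining_bandwidth):
        self.statistic = StatisticUnit(threshold)
        self.marking = MarkingUnit(remaining_bandwidth)
        self.transmitting = TransmittingUnit()

    def handle(self, size):
        # Marking starts only after the packet loss condition is satisfied.
        if not self.statistic.condition_met():
            return self.transmitting.send(True)
        return self.transmitting.send(self.marking.mark(size))


device = ForwardingDevice(threshold=0, remaining_bandwidth=2)
before_loss = device.handle(1)
device.statistic.record_drop()
after_loss = [device.handle(1) for _ in range(3)]
```

Before any loss is recorded every packet uses the remaining aggregated link; once the condition is met, traffic beyond the remaining bandwidth is diverted to another link.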
With the packet forwarding device provided by the invention, when the packet loss of the aggregated links of high route priority satisfies a preset condition, the packet forwarding device forwards packets over links other than the aggregated links of high route priority. This avoids the congestion and packet loss caused by sending all traffic over the aggregated links of high route priority, and keeps data center services running normally.
From the above description of the embodiments, those skilled in the art can clearly understand that the invention may be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although the former is the better implementation in many cases. Based on this understanding, the technical solution of the invention in essence, or the part of it that contributes to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments of the invention.
Those skilled in the art will appreciate that the accompanying drawings are schematic diagrams of a preferred embodiment, and that the modules or procedures in the drawings are not necessarily required for implementing the invention.
Those skilled in the art will appreciate that the modules of the device in an embodiment may be distributed in the device as described in the embodiment, or may be changed accordingly and located in one or more devices different from that of the embodiment. The modules of the above embodiments may be combined into one module or further split into multiple submodules.
The above discloses only several specific embodiments of the invention. The invention is not limited to them, however, and any variation that a person skilled in the art can conceive shall fall within the protection scope of the invention.