Embodiment
The entry verification method provided by the embodiments of the present application can be applied to data communication equipment such as switches and routers. It verifies the table entries in the equipment to determine whether entry data has become erroneous, so that errors can be corrected as early as possible once they are found and the reliability of the equipment is maintained. Fig. 1 illustrates an application system architecture in data communication equipment. The equipment may include a host processor 11 (HOST CPU) and one or more hardware chips 12. A hardware chip 12 may be, for example, an FPGA (Field-Programmable Gate Array) or an NPU (Network Processing Unit). The host processor 11 and the hardware chips 12 may be connected through a PCI-E (Peripheral Component Interconnect Express) bus.
The host processor 11 can generate data table entries on the control plane and issue them to a hardware chip 12, which performs data-plane packet processing, such as forwarding packets, according to those entries. The issued entries may be stored either in the on-chip memory of the hardware chip, such as SRAM, or in its off-chip memory, such as DDR SDRAM (referred to below simply as SDRAM). Whether they are stored on-chip or off-chip, the hardware chip 12 can use them for packet processing. As shown in Fig. 1, when the host processor 11 of this embodiment issues data table entries to the hardware chips 12, it can issue at least two identical entries. These at least two identical entries may reside on the same hardware chip or on different hardware chips. For example, the at least two identical entries shown in Fig. 1 may reside on two hardware chips 12 respectively. The multiple data table entries issued to a hardware chip can form a data table.
The entry verification method of this embodiment verifies the data table entries (referred to below simply as entries) that the host processor 11 issues to the hardware chips 12 and detects whether an entry has become erroneous. Moreover, the present application implements this verification by software detection, which addresses the hardware resource consumption incurred when techniques such as ECC and EDAC are implemented. Referring to the example of Fig. 2, software detection can mean that the host processor (HOST CPU) runs the entry detection algorithm to verify the entries. Fig. 2 shows a physical structure diagram of a data communication device. The processor 21 in the device may be the HOST CPU, and the processor 21 can execute the entry verification logic instructions stored in the memory 23 to carry out the entry verification method.
The entry verification method implemented by the entry verification logic instructions is described below. The method compares the at least two identical data table entries issued to the hardware chips of the data communication equipment to determine whether an abnormal, erroneous entry exists among them. If data consistency holds between the at least two identical entries, the entry to be verified among them is determined to be correct; otherwise, it is determined that an abnormal, erroneous entry exists among the at least two entries. Here, all or some of the at least two identical entries are entries to be verified, that is, entries whose correctness needs to be judged. The data consistency judgment can take several forms. For example, it can judge whether an application data entry to be verified is identical to a reference data entry, or it can judge whether multiple copies of the entry to be verified that are stored at different locations are identical. These forms are described below through several examples.
The following examples take the verification of a single entry to illustrate several applications of the entry verification method. Every entry to be verified can be verified with the methods below.
For example, in one verification example, as shown in Fig. 1, the HOST CPU 11 can issue a data table entry to one of the hardware chips 12 (this entry serves as the entry to be verified when data consistency is compared later), and the entry can be stored in the on-chip SRAM of that hardware chip 12. The HOST CPU 11 can also send an entry identical to this data table entry to the DDR SDRAM of the HOST CPU (referred to as HOST RAM, the off-chip memory of the host processor); this entry serves as the copy of the entry to be verified when data consistency is compared later. The HOST RAM can be the other hardware chip 12 in Fig. 1. Note that the copy and the entry to be verified are in fact the same entry and are distinguished in name only. In this example, the entry to be verified can be called the application data entry and its copy the reference data entry. The chip that receives the reference data entry can be called the first hardware chip, and the chip that receives the application data entry the second hardware chip.
Fig. 3 illustrates the flow of an entry verification method that compares the application data entry with the reference data entry to judge whether an entry is erroneous. In step 301, the application data entry is verified against the reference data entry. The verification checks whether the reference data entry and the application data entry are identical. The method takes the reference data entry as the benchmark: the reference data entry is assumed to be correct, so comparing the application data entry with it reveals whether the application data entry has become erroneous. Note also that when the HOST CPU issues an entry to a hardware chip, it can record where the entry is stored. When verifying entries in this example, the HOST CPU can therefore use the recorded location information to read the application data entry in the hardware chip 12 and the reference data entry to compare it with. In step 302, the reference data entry and the application data entry are compared to judge whether their data is identical. If it is, the two entries satisfy data consistency and the application data entry is correct; otherwise, the application data entry is determined to be an abnormal, erroneous entry.
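The comparison in steps 301 and 302 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: the dictionaries `host_ram` and `chip_sram`, the location key `0x10`, and the function name `verify_entry` are hypothetical stand-ins for the recorded storage locations and for the reads the HOST CPU would perform.

```python
# Hypothetical stand-ins: reference entries kept in HOST RAM and application
# entries held in a hardware chip's SRAM, both keyed by the recorded location.
host_ram = {0x10: b"\x0a\x00\x00\x01\x00\x02"}
chip_sram = {0x10: b"\x0a\x00\x00\x01\x00\x02"}


def verify_entry(location: int) -> bool:
    """Read both entries by their recorded location and compare them whole."""
    reference = host_ram[location]      # assumed correct (the benchmark)
    application = chip_sram[location]   # the entry to be verified
    return application == reference


ok_before = verify_entry(0x10)                  # both entries match
chip_sram[0x10] = b"\x0a\x00\x00\x01\x00\x03"   # simulate corruption in SRAM
ok_after = verify_entry(0x10)                   # mismatch: entry is abnormal
```

Because the whole entry is compared, the check does not depend on how many bits differ.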
In the example above, the on-chip SRAM of a hardware chip usually has a small capacity and stores relatively few entries, so the HOST CPU can keep a copy in HOST RAM of each entry issued to the SRAM. For the off-chip SDRAM of a hardware chip, if the number of entries stored in the off-chip SDRAM is small, the HOST CPU can likewise store copies in HOST RAM and use the same verification method to compare the entries to be verified in the SDRAM with their copies. This way of comparing an entry to be verified with its copy also applies to other cases. For example, when an entry is issued to the on-chip SRAM of multiple hardware chips, the entry in each on-chip SRAM can be compared with the copy in HOST RAM during verification.
In this example, protection mechanisms can be applied to the reference data entry to further ensure its correctness. For example, HOST RAM can use ECC. A single-bit error in an entry in HOST RAM can then be repaired by ECC, and a double-bit error can be detected by hardware and repaired by the HOST CPU, so all entries in the ECC-protected HOST RAM can be assumed correct. In this way only one ECC module, which occupies hardware space, needs to be provided to keep the entries in the data communication equipment reliable, which saves hardware space. In another example, two or more copies of an entry can be kept in HOST RAM, and the entry is considered reliable, and usable for judging the entry to be verified, only when all copies of it are identical.
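The second protection option, keeping several copies in HOST RAM and trusting the entry only when they all agree, might look like the sketch below. The function name `reliable_reference` and the choice to return `None` for an untrusted entry are assumptions made for illustration.

```python
from typing import Optional, Sequence


def reliable_reference(copies: Sequence[bytes]) -> Optional[bytes]:
    """Return the entry if every HOST RAM copy agrees, otherwise None."""
    if not copies:
        return None
    first = copies[0]
    if all(copy == first for copy in copies):
        return first   # all copies identical: usable as the reference entry
    return None        # copies disagree: do not use as a reference
```

A caller would verify an application data entry only when a non-`None` reference is returned.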
The entry verification method illustrated in Fig. 4 also uses the comparison with a reference data entry shown in Fig. 3; the difference lies in the application scenario. As shown in Fig. 4, a hardware chip 12 may have multiple external off-chip memories attached, such as multiple SDRAMs. Because of constraints such as capacity and board area, it may not be possible to configure an error correction mechanism such as ECC for every SDRAM, but one of the SDRAMs can be implemented with ECC protection. The ECC-protected SDRAM can be called the reference off-chip memory, and the entries stored in it can all be assumed correct.
In Fig. 4, the reference off-chip memory 14 can store a data table tbl_d', which holds the copies of the entries to be verified that serve as reference data entries. The entries to be verified (that is, the application data entries) in the data table tbl_d stored in another off-chip memory can be compared with the reference data entries in tbl_d'. For example, suppose SDRAM 15 stores the data table tbl_d, which contains application data entries. The application data entries in tbl_d can be compared with the reference data entries in tbl_d'; if they are consistent, the application data entries in tbl_d are correct. In another example, SDRAM 16 stores the data table tbl_e, which contains application data entries. These can be compared with the reference data entries in the data table tbl_e' in the reference off-chip memory 14; if they are consistent, the application data entries in tbl_e are correct.
In addition, an off-chip memory attached to a chip, such as an SDRAM, generally has a capacity on the order of tens of gigabytes and stores many entries. Bank replication is commonly used: a copy of an entry is stored in each of several banks and, when the entry is intact, its content is identical in every bank. In this example, the method of Fig. 4 can be applied by also storing one copy of the bank-replicated entries in the reference off-chip memory. During verification, the entry stored in the reference off-chip memory serves as the reference data entry and is compared with the different copies of the same entry in the banks of the other off-chip memories, to judge whether the application data entry in each bank is identical to the reference data entry. For example, the application data entries stored in SDRAM 15 form the data table tbl_d and are replicated across multiple banks in SDRAM 15, and a bank of the reference off-chip memory 14 also stores one data table tbl_e' formed by the data table entries. The entries in tbl_e' can be compared with the entry copies of tbl_d in each bank of SDRAM 15; if they are identical, the application data entries in tbl_d are confirmed to be correct.
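Comparing one reference entry against its copies in several banks can be sketched as below. The function name `mismatched_banks` and the representation of bank copies as a list of byte strings are illustrative assumptions.

```python
from typing import List, Sequence


def mismatched_banks(reference: bytes, bank_copies: Sequence[bytes]) -> List[int]:
    """Return the indices of banks whose copy differs from the reference entry."""
    return [i for i, copy in enumerate(bank_copies) if copy != reference]
```

An empty result means the copy in every bank matches the reference data entry.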
In the examples above that compare the application data entry with the reference data entry, the reference data entry can be protected by a mechanism such as ECC, which ensures its correctness, so comparing it with the application data entry determines whether the application data entry is correct. This is equivalent to protecting one copy of the entry with a technique such as ECC while leaving the entries at other locations unprotected. Although protecting the reference data entry occupies some hardware resources, the consumption is significantly lower than in conventional schemes in which every entry memory is protected. The hardware resources saved can be put to more effective use, for example to place more chips and store more entries, improving hardware utilization. In another example, when verification by comparing the application data entry with the reference data entry finds an entry error, the erroneous entry can also be repaired. Because the reference data entry is correct, it can be used to repair the erroneous entry. For example, if comparing the reference data entry in HOST RAM with the application data entry in SRAM shows that the application data entry is erroneous, the reference data entry in HOST RAM can be used to update the application data entry in SRAM. Likewise, in the example of Fig. 4, when the reference data entry in the reference off-chip memory is compared with the application data entries in the other off-chip memories and an application data entry is found to be abnormal, the reference data entry in the ECC-protected reference off-chip memory can be used to update the erroneous application data entry.
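Repair from the reference data entry can be sketched as a verify-then-overwrite loop. The dictionaries keyed by storage location and the function name `verify_and_repair` are hypothetical; a real device would read and write chip memory rather than Python dictionaries.

```python
from typing import Dict, List


def verify_and_repair(reference: Dict[int, bytes],
                      application: Dict[int, bytes]) -> List[int]:
    """Overwrite every mismatching application entry with its reference entry.

    Returns the locations that were repaired, so they can be reported.
    """
    repaired = []
    for location, ref_entry in reference.items():
        if application.get(location) != ref_entry:
            application[location] = ref_entry   # update from the trusted copy
            repaired.append(location)
    return repaired
```

Running the function a second time on the same tables should report nothing, since every entry then matches its reference.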
The entry verification method of this embodiment also provides another way of verifying entries: comparing the multiple replicated copies of the entry to be verified. The replicated copies are identical data table entries. The method judges whether the data of the replicated copies is identical; if all copies are identical, the entry is determined to be correct, and otherwise it is determined to be abnormal. The replicated copies can take several forms. For example, they can be replicated entries located in the on-chip memories of at least two hardware chips. If an entry is stored in the on-chip SRAM of two FPGA chips, the two entries in these FPGA chips can be called two replicated copies of the entry. The two copies can be compared: if their contents are consistent the entry is considered normal, and if not, the entry is erroneous.
In another example, the replicated copies can be the copies held in multiple banks of off-chip memory, where the banks are distributed in at least one hardware chip that uses the entry to be verified. Bank replication is a way of improving SDRAM access speed in which the same entry is replicated into each of several banks. The replicated copies in this example can be the copies of an entry in multiple banks of one SDRAM attached to one FPGA, or the copies of an entry in multiple banks of the SDRAMs attached to two FPGAs, and so on. The copies in these banks are compared with each other; if their data is identical, the entry they correspond to is considered correct.
Take the verification of an entry replicated across multiple banks in one SDRAM as an example:
In one example, suppose the SDRAM has two banks and the entry to be verified is stored once in each, so the entry has two replicated copies. To verify the entry, the copies stored in the two banks are compared. If they are identical, the entry is determined to be correct; otherwise, the entry has become erroneous. In this example, when an entry error is detected it can be reported to the HOST CPU, so that the HOST CPU can restart the device or perform an active/standby switchover to restore the network to normal as soon as possible.
In another example, suppose the SDRAM has three or more banks. The method above of comparing the data between banks can still be used to determine whether the entry is correct. In addition, this example provides a simple verification method for three or more banks, namely the comparison counter method. Multi-bank comparison is not limited to the comparison counter method; other multi-data comparison algorithms can also be applied to entry verification.
The comparison counter method is illustrated with an example of an entry replicated in four banks. Suppose data0, data1, data2, and data3 are the data of the same entry stored in bank0, bank1, bank2, and bank3 respectively. These four data items can be called the four replicated copies of the entry. A corresponding comparison counter, counter n (n = 0, 1, 2, 3), is set for each copy, as shown in Table 1 below.
Table 1: Replicated entries and their corresponding comparison counters
| bank0 | bank1 | bank2 | bank3 |
| data0 | data1 | data2 | data3 |
| counter0 | counter1 | counter2 | counter3 |
When verifying the entry, the comparison counters of the replicated copies are first set to zero, and then each copy is compared pairwise with the other copies. If two data table entries are found to differ, the comparison counters corresponding to both are each incremented by one. When the pairwise comparison ends, if any comparison counter holds a nonzero value, it is determined that an abnormal, erroneous entry exists among the at least three data table entries.
Take counter0 in Table 1 as an example. data0 is compared with data1, data2, and data3 in turn, and whenever two data items differ, the comparison counter of each is incremented by one. Suppose data0, data1, and data3 are correct and data2 is erroneous. By the rule above, when data0 is compared with data1, data2, and data3 in turn, only data0 and data2 differ, so counter0 is incremented to 1 and counter2, which corresponds to data2, is also incremented to 1.
If, after these three comparisons (data0 against data1, data2, and data3 in turn), all comparison counters still hold the same value, as when all four copies are correct and all four counters are 0, the entry copies in the four banks are determined to be correct and this round of verification ends.
If at least one of the three comparisons is inconsistent, as in the example above where data2 is erroneous, counter0 and counter2 are 1 while counter1 and counter3 are 0, and the comparison continues: data1 is compared with data2 and data3 in turn, and data2 is compared with data3 (that is, all data is compared pairwise, for six comparisons among the four data items). When the pairwise comparison ends, every comparison counter is greater than or equal to 1. Continuing the example in which data2 is erroneous, the counters after the comparison are counter0 = 1, counter1 = 1, counter2 = 3, and counter3 = 1. The counters of the replicated copies are thus unequal, with some greater than one and some equal to one, and in this case at least one of the data table entries in the four banks is considered erroneous.
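The pairwise counting described above can be sketched as follows. The sketch always performs every pairwise comparison and omits the early exit after the first three comparisons; the function name `comparison_counters` and the sample byte strings are illustrative assumptions.

```python
from itertools import combinations
from typing import List, Sequence


def comparison_counters(copies: Sequence[bytes]) -> List[int]:
    """Compare every pair of copies; bump both counters when a pair differs."""
    counters = [0] * len(copies)
    for i, j in combinations(range(len(copies)), 2):
        if copies[i] != copies[j]:
            counters[i] += 1
            counters[j] += 1
    return counters


good, bad = b"\x01\x02", b"\x01\xff"
# data2 (bank2) is the erroneous copy, as in the worked example.
counters = comparison_counters([good, good, bad, good])   # [1, 1, 3, 1]
```

With four copies this makes six comparisons, and the result reproduces the counter values given in the text for an erroneous data2.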
The counter values can then be used to determine which entry is likely erroneous. Because the comparison is pairwise, an abnormal entry is counted every time it is compared with another entry, so the erroneous entry can be taken to be the one with the largest counter value. In the example above, counter0 = 1, counter1 = 1, counter2 = 3, and counter3 = 1: counter2 is 3 while the others are 1, so counter2 is the largest and its replicated copy data2 can be taken to be the abnormal entry. In addition, the minimum counter value indicates how many of the replicated copies across the banks are erroneous: a minimum of 1, as above, indicates that one of the four copies is erroneous, and a minimum greater than 1 indicates that several copies are erroneous.
As the description above shows, the comparison counter method can detect how many of the replicated copies, for example across multiple banks, hold erroneous data. In this example the erroneous data can also be repaired. For example, with three or more banks, if the comparison counter method determines that one copy is erroneous, the copy with the smallest counter value is used to repair the copy with the largest counter value. In the example above, data0, which corresponds to counter0, is used to repair data2, which corresponds to counter2. If at least two copies are determined to be erroneous, repair is not possible, and the location of the detected error (for example, the location of data2 in bank2) and whether it was repaired can be reported to the HOST CPU for handling. Even when the erroneous entry has been repaired, the error location and repair status can still be reported.
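The repair rule can be sketched as below, taking already computed counter values as input. The function name `repair_by_counters`, the status strings, and the extra guard for the case where all counters are equal are assumptions added for illustration; the text itself only distinguishes a minimum of 1 from a minimum greater than 1.

```python
from typing import List, Optional, Tuple


def repair_by_counters(copies: List[bytes],
                       counters: List[int]) -> Tuple[str, Optional[int]]:
    """Repair a single erroneous copy in place, following the min/max rule."""
    if max(counters) == 0:
        return "consistent", None        # every copy agrees, nothing to do
    # A minimum above 1 means several copies are wrong. Equal counters give
    # no way to tell good from bad (assumed guard, not stated in the text).
    if min(counters) > 1 or min(counters) == max(counters):
        return "unrepairable", None      # report to the HOST CPU instead
    bad = counters.index(max(counters))   # largest counter: abnormal copy
    good = counters.index(min(counters))  # smallest counter: trusted copy
    copies[bad] = copies[good]
    return "repaired", bad
```

The returned status and index correspond to the repair status and error location that the text says can be reported to the HOST CPU.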
When the entry verification method of this example determines whether an entry to be verified is erroneous, the result can be made more accurate by requiring the data consistency judgment to fail a preset number of consecutive times before the entry is determined to be erroneous. If a single failed consistency check were enough, misjudgments could occur. For example, at the moment the control plane issues an entry, the entry content in HOST RAM and in SRAM, or the content in multiple banks of an SDRAM, may briefly be inconsistent. Using N consecutive detections prevents misjudgments caused by such transient inconsistency. This mechanism applies equally to the examples that compare a reference data entry with an application data entry, that compare two banks, and so on. That is, after the at least two data table entries are compared, the number of times they are found inconsistent is recorded, and when that number exceeds a preset number, it is determined that an abnormal, erroneous entry exists among the at least two data table entries.
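The consecutive-detection mechanism can be sketched as a small counter that resets whenever a check passes. The class name `ConsecutiveMismatchFilter` is hypothetical, and the sketch follows the reading in which an error is declared once the count of consecutive mismatches exceeds the preset number; a reading that declares the error on reaching the preset number would use `>=` instead.

```python
class ConsecutiveMismatchFilter:
    """Suppress transient inconsistencies seen while an entry is being issued."""

    def __init__(self, preset_times: int) -> None:
        self.preset_times = preset_times
        self.mismatches = 0   # consecutive failed consistency checks so far

    def observe(self, consistent: bool) -> bool:
        """Record one check result; return True once the entry is judged erroneous."""
        if consistent:
            self.mismatches = 0          # a passing check breaks the streak
        else:
            self.mismatches += 1
        return self.mismatches > self.preset_times
```

One filter instance would be kept per entry (or per compared pair), and `observe` would be called with the outcome of each periodic comparison.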
In addition, the entry verification method of the present application can be implemented entirely as software detection executed by the HOST CPU. Alternatively, the SDRAM detection (for example, comparing entries between two or more banks) can be implemented in the hardware chip (for example, an FPGA or NPU). When it is implemented in the hardware chip, the HOST CPU need only be notified of the error location and repair information when an entry anomaly is detected, which reduces the load on the HOST CPU.
The entry verification method of this embodiment performs a data consistency check on the entry as a whole. Traditional techniques such as ECC and EDAC can only detect single-bit and double-bit errors in an entry and cannot accurately detect multi-bit errors. Whole-entry comparison, for example checking whether the contents of the application data entry to be verified and the reference data entry are identical, covers the entire content of the entry. Whether the error is single-bit, double-bit, or multi-bit, an entry error is determined whenever data consistency is not satisfied.
The entry verification method not only finds errors as early as possible when they occur in an entry but can also correct entry errors in certain cases, for example by repairing with the reference data entry. In addition, the entry verification method of the present application is a software detection method: the entries at the relevant locations are simply read and compared according to the method, and the comparison result determines whether an entry is erroneous. This consumes few or no hardware resources and can improve hardware utilization.
In one example, if the data communication equipment has many entries to protect, the verification of all entries can be completed over several passes; if there are few entries, all of them can be verified in a single pass. That is, when comparing the data table entries, a portion of the at least two data table entries is compared periodically until all of the at least two data table entries have been compared.
For example, take the verification of entries stored in SRAM and SDRAM (actual implementations are not limited to these memory types). Suppose the SRAM holds three tables T10, T11, and T12 with E10, E11, and E12 entries respectively, so the total number of entries is (E10+E11+E12). Suppose all entries in the SRAM need to be checked once within N1 seconds and the timer interval is S1 milliseconds. The number of entries checked each time is then C1 = ((E10+E11+E12)*S1)/(1000*N1).
By the same principle, suppose the SDRAM holds three tables T20, T21, and T22 with E20, E21, and E22 entries respectively, so the total number of entries is (E20+E21+E22). Suppose all entries in the SDRAM need to be checked once within N2 seconds and the timer interval is S2 milliseconds. The number of entries checked each time is then C2 = ((E20+E21+E22)*S2)/(1000*N2).
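The batch-size formula can be worked through with sample numbers. The function name `entries_per_tick` and the table sizes are illustrative, and rounding the result up is an assumption: the formula does not say how a fractional result is handled, and rounding up keeps the full scan within the target time.

```python
def entries_per_tick(total_entries: int, interval_ms: int, full_scan_s: int) -> int:
    """C = (total * S) / (1000 * N), rounded up to a whole number of entries."""
    numerator = total_entries * interval_ms
    denominator = 1000 * full_scan_s
    return -(-numerator // denominator)   # integer ceiling division


# Hypothetical sizes: E10 = 1000, E11 = 2000, E12 = 3000 entries,
# timer interval S1 = 100 ms, full scan required every N1 = 60 s.
c1 = entries_per_tick(1000 + 2000 + 3000, 100, 60)   # 600000 / 60000 = 10
```

With these numbers the timer fires 600 times in 60 seconds, and 600 ticks of 10 entries cover all 6000 entries.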
A periodic timer can be started for the SRAM and for the SDRAM respectively, and C1 and C2 entries respectively are checked in each period. The check is a data consistency judgment on these entries, and each entry can be verified by the verification methods described in the embodiments above. If the C1 and C2 entries are all found to satisfy data consistency, it can be determined that no anomaly was found in them and they are correct, and this round of detection ends.
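Splitting a full scan into per-period batches can be sketched with a generator that hands out one slice of entry indices per timer tick. The function name `scan_batches` and the use of consecutive integer indices to identify entries are assumptions; the timer itself is left out.

```python
from typing import Iterator


def scan_batches(total_entries: int, per_tick: int) -> Iterator[range]:
    """Yield the entry indices to verify on each timer tick of one full scan."""
    if per_tick < 1:
        raise ValueError("per_tick must be at least 1")
    start = 0
    while start < total_entries:
        end = min(start + per_tick, total_entries)
        yield range(start, end)   # one period's worth of entries
        start = end
```

Each yielded range would be passed to whichever consistency check applies to the memory being scanned, and a new generator would be created for the next full scan.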
In this embodiment, the entry verification logic instructions in the data communication device shown in Fig. 2 can be referred to as an entry verification apparatus. As shown in Fig. 5, the apparatus can include an entry issuing module 51 and an entry comparison module 52.
The entry issuing module 51 is configured to issue at least two identical data table entries to the hardware chips in the data communication equipment;
The entry comparison module 52 is configured to compare the at least two data table entries and to determine the abnormal, erroneous entry among the at least two data table entries.
In one example, the at least two identical data table entries include a reference data entry and an application data entry.
The entry issuing module 51 is configured to issue the reference data entry to a first hardware chip and the application data entry to a second hardware chip;
The entry comparison module 52 is configured to verify the application data entry against the reference data entry and, if the application data entry differs from the reference data entry, to determine that the application data entry is an abnormal entry.
In one example, referring to Fig. 6, the apparatus can also include a first entry repair module 53, configured to repair the application data entry with the reference data entry when the application data entry is determined to be an abnormal entry.
In one example, the entry issuing module 51 is configured to issue identical data table entries to each of at least three banks in the hardware chip;
The entry comparison module 52 is configured to compare the at least three data table entries pairwise and, if two data table entries are found to differ, to increment by one the comparison counters corresponding to both; when the pairwise comparison ends, if any of the comparison counters holds a nonzero value, it determines that an abnormal entry exists among the at least three data table entries.
In one example, the apparatus also includes a second entry repair module 54, configured to determine that the data table entry with the largest comparison counter value is the abnormal entry and to correct the abnormal entry with the data table entry that has the smallest comparison counter value.
If the functions are implemented as software functional units and sold or used as an independent product, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods of the embodiments of the present application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, or an optical disc.
The above describes only preferred embodiments of the present application and is not intended to limit it. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.