CN102546159B

Movatterモバイル変換

Info

Publication number: CN102546159B
Application number: CN201210020150.XA
Authority: CN
Inventors: 王勇
Original assignee: Guilin University of Electronic Technology
Current assignee: Zhaoqing Yihua Internet Technology Co ltd
Priority date: 2012-01-29
Filing date: 2012-01-29
Publication date: 2014-07-30
Anticipated expiration: 2032-01-29
Also published as: CN102546159A

Abstract

本发明属于信息安全领域，特别是密码学领域，涉及一种构造抗查表攻击的基于随机函数的单向哈希函数的方法。该方法以随机函数作为哈希函数，在计算的时候由明文消息和公开参数确定随机函数的具体形式。对于不同的公开参数，hash值是不一样的，这种方法构造的hash函数给破译带来了多方面的障碍，查表攻击一般针对确定的hash值，要建立一个满足不同公开参数的表是需要耗费大量的时间和空间代价的，且这里的抗查表攻击并没有依赖于计算量的增加和hash值长度的增加。现有的hash函数分析一般针对确定的hash函数，当密码分析者在仅仅只知道hash值的情况下，破译将会更加困难，由于不知道随机函数的具体的函数形式，采用一般的破译方法会无从下手。

The invention belongs to the field of information security, in particular to the field of cryptography, and relates to a method for constructing a random function-based one-way hash function resistant to table lookup attacks. In this method, a random function is used as a hash function, and the specific form of the random function is determined by plaintext messages and public parameters during calculation. For different public parameters, the hash value is different. The hash function constructed by this method brings many obstacles to deciphering. Table lookup attacks generally target a certain hash value. It is necessary to create a table that satisfies different public parameters. It takes a lot of time and space, and the anti-lookup table attack here does not depend on the increase in the amount of calculation and the increase in the length of the hash value. The existing hash function analysis is generally aimed at a certain hash function. When the cryptanalyst only knows the hash value, it will be more difficult to decipher. Since the specific function form of the random function is not known, the general deciphering method will There is no way to start.

Description

Translated fromChinese

抗查表攻击的随机单向哈希函数构造方法A random one-way hash function construction method against table look-up attack

技术领域technical field

本发明属于信息安全领域，涉及一种单向哈希函数的构造方法，特别是基于随机函数的单向哈希函数的构造方法。The invention belongs to the field of information security, and relates to a construction method of a one-way hash function, in particular to a construction method of a one-way hash function based on a random function.

背景技术Background technique

现有的hash（又译为哈希、杂凑、散列）函数具有固定的结构和函数，这为密码分析提供了便利。对于hash函数H=Hash(M)，一般通过M计算H容易，而通过H求对应的一个或者多个M是非常困难的，在有限的计算能力下往往需要很长的时间。不过近年来王小云破解了一系列的hash函数。2004年，2005年，王小云带领的研究小组先后破译了被全球广泛运用于计算机安全系统的MD5和SHA-1两大密码算法，引起国际密码学界的极大关注和高度赞誉。对于一个固定的函数，要保证这种单向性是比较困难的，因为原则上说可以根据hash函数的结构进行逆推（虽然它是单向的，原文和hash值是多对一的映射关系，但是在借助一些数学方法和计算工具的情况下，可能进行成功的逆推，这提供了一个破解的入口），需要将算法设计的非常复杂。假如一个hash函数的算法是随机的、不确定的，则密码分析者很难着手。与传统的确定函数相对，我们这里提出随机函数的概念，即这个函数的表达式、结构和形式是随机的，不确定的，比如随机函数y=F(a，b，c)，F(a，b，c)并没有明确的形式，它的具体形式可能是f₁(a，b，c)、f₂(a，b，c)、f₃(a，b，c)、f₄(a，b，c)之中的一个函数。The existing hash (also translated as hash, hash, hash) function has a fixed structure and function, which provides convenience for cryptanalysis. For the hash function H=Hash(M), it is generally easy to calculate H through M, but it is very difficult to find one or more corresponding M through H, and it often takes a long time under limited computing power. However, in recent years, Wang Xiaoyun has cracked a series of hash functions. In 2004 and 2005, the research team led by Wang Xiaoyun successively deciphered the two major cryptographic algorithms, MD5 and SHA-1, which are widely used in computer security systems around the world, which aroused great attention and high praise from the international cryptography community. For a fixed function, it is more difficult to ensure this one-way, because in principle, it can be reversed according to the structure of the hash function (although it is one-way, the original text and the hash value are a many-to-one mapping relationship , but with the help of some mathematical methods and calculation tools, it is possible to carry out successful reverse deduction, which provides an entry point for cracking), and the algorithm needs to be designed very complicated. If the algorithm of a hash function is random and uncertain, it is difficult for cryptanalysts to start. Compared with the traditional deterministic function, here we propose the concept of random function, that is, the expression, structure and form of this function are random and uncertain, such as random function y=F(a,b,c), F(a , b, c) has no definite form, and its specific form may be f₁ (a, b, c), f₂ (a, b, c), f₃ (a, b, c), f₄ ( a function among a, b, c).

除了技术性的攻击，现有的一些蛮力攻击也是非常有效而且现实的，比如查表攻击可以很快破译一些hash函数，比如现在的一些hash函数的消息和hash值的对应关系，已经被一些黑客制作成了彩虹表，将所有的hash值对应的一个或者多个明文都保存在一个表中，并且进行检索排序，这样黑客只需要查找即可找到碰撞或者原来的明文，一些网站也提供了部分或者全部hash函数的破译。固然，提高hash值长度是一个解决方法，但是，这为用户带来了不便，比如，需要记忆、存储、记录很长的hash值，需要比较很长的值。本发明利用随机函数来构建随机的hash值，hash值是随机的，由一个公开参数P确定，由于hash值随机，这样不仅能够抵抗技术性的攻击，而且，也增加了查表和蛮力攻击的难度。这样的算法并没有直接依靠增加hash值长度和计算量来增加查表攻击和其他破译的难度。In addition to technical attacks, some existing brute force attacks are also very effective and realistic. For example, table lookup attacks can quickly decipher some hash functions. For example, the correspondence between some hash function messages and hash values has been hacked by some hackers A rainbow table is made, and one or more plaintexts corresponding to all hash values are stored in a table, and searched and sorted, so that hackers only need to search to find the collision or the original plaintext. Some websites also provide some Or the deciphering of all hash functions. Of course, increasing the length of the hash value is a solution, but this brings inconvenience to the user, for example, it needs to memorize, store, and record a very long hash value, and it needs to compare very long values. The present invention uses a random function to construct a random hash value. The hash value is random and determined by a public parameter P. Since the hash value is random, it can not only resist technical attacks, but also increase the risk of table lookup and brute force attacks. difficulty. Such an algorithm does not directly rely on increasing the length of the hash value and the amount of calculation to increase the difficulty of table lookup attacks and other deciphering.

发明内容Contents of the invention

本发明采用随机函数来构造hash函数，虽然本发明中hash算法是不确定的，hash值也是随机的，但是，在一般的情况下，为了达到验证的目的，明文一定时候，算法和hash值应该是确定的，为了解决这个问题，我们设定hash函数与一个公开参数和明文输入的相关性，当公开参数和输入明文是确定的时候，hash函数的形式也是确定的，在已知明文和公开参数的条件下，hash函数是确定的，即hash函数由明文以某种方式确定，但是当公开参数不确定的时候，hash值也不确定，这给查表攻击带来了一些障碍。The present invention uses a random function to construct a hash function. Although the hash algorithm in the present invention is uncertain and the hash value is also random, in general, in order to achieve the purpose of verification, when the plaintext is certain, the algorithm and the hash value should be It is certain. In order to solve this problem, we set the correlation between the hash function and a public parameter and plaintext input. When the public parameter and the input plaintext are definite, the form of the hash function is also definite. In the known plaintext and public Under the condition of the parameters, the hash function is definite, that is, the hash function is determined in a certain way by the plaintext, but when the public parameters are uncertain, the hash value is also uncertain, which brings some obstacles to table lookup attacks.

我们假设随机hash函数是D=H(m)，H是随机的，m是函数的变量输入，借鉴传统的确定性的hash函数的特点，在采用随机函数的hash函数中，明文除了要决定hash函数的具体形式h_i(m)外，还作为D= h_i(m)算法的输入m。决定hash函数具体形式的明文部分和作为参数m输入hash函数的部分应该是高度相关的，这样的设计是为了防止密码分析者各个击破，因为对于这样的hash函数，先确定函数后，再寻找合适m的过程中，m的变换也会导致函数的变化，无法同时满足。在本发明中需要将它们的关系设计的非常复杂，使得很难破解。明确地说，就是设计的时候，需要将明文的尽量多的比特数都同时参与决定函数形式和m，那么将明文的全部用于决定两者是最为合适的，考虑到现有的hash函数的形式均需要进行填充，所以本发明中将填充处理后的明文分组的用于决定hash函数形式和m。We assume that the random hash function is D=H(m), H is random, m is the variable input of the function, drawing on the characteristics of the traditional deterministic hash function, in the hash function using the random function, in addition to determining the hash In addition to the specific form of the function h_i (m), it is also used as the input m of the D= h_i (m) algorithm. The plaintext part that determines the specific form of the hash function and the part that is input into the hash function as the parameter m should be highly correlated. This design is to prevent cryptanalysts from breaking through one by one, because for such a hash function, first determine the function, and then find a suitable one. In the process of m, the transformation of m will also lead to the change of the function, which cannot be satisfied at the same time. In the present invention, their relationship needs to be designed very complicated, so that it is difficult to decipher. To be clear, when designing, it is necessary to use as many bits of the plaintext as possible to determine the function form and m at the same time, so it is most appropriate to use all the plaintext to determine the two. Considering the existing hash function All forms need to be filled, so in the present invention, the filled plaintext group is used to determine the hash function form and m.

鉴于MD5、SHA-1已经被王小云破解，而这些hash函数都存在单bit位运算较多，缺乏像S盒之类的多bit的操作，因此，安全的hash应该采用S盒以及类似的多bit混淆的运算。In view of the fact that MD5 and SHA-1 have been cracked by Wang Xiaoyun, and these hash functions have more single-bit operations and lack of multi-bit operations such as S-boxes, therefore, safe hashes should use S-boxes and similar multi-bit Obfuscated operations.

对于hash函数的构造，有多种方法，比如采用压缩函数和借用分组密码方法。本发明采用压缩函数的方法。For the construction of the hash function, there are many methods, such as using a compression function and borrowing a block cipher method. The present invention adopts the method of compressing functions.

具体的hash函数构造方法为：The specific hash function construction method is:

第一步，确定压缩函数每一次处理的消息分组长度L。The first step is to determine the message packet length L processed by the compression function each time.

第二步，设计填充和附加长度信息的方式，对消息进行填充，由于hash需要有确定的结果，所以，填充的方式应该是固定的，对于一个消息，只能有惟一的填充方式，对于填充，应该能够很容易确定填充的区域，由于填充长度本身是不确定的，所以，可以采用以下两种方式：第一种，如果消息长度不是nL-f，则第一位填充0，随后所有bit都填充1，或者相反，采用这种方式，除了消息长度刚好是nL-f形式的情形，很容易确定填充内容，填充的长度也是确定的，将其填充到nL-f，如果长度为nL-f，则不填充，n为符合该条件的最小整数，f是一个固定的值，小于L，f是预留用于附加消息长度的，其长度以能够满足大多数的消息和文件的长度不超过2^f，同时也不要太长为标准；第二种，如果消息长度不是nL-f，则采用上述的填充方式，如果消息长度为nL-f形式，则在后面填充一个L长度的消息，即填充后的长度为（n+1）L-f，依然是第一位填充0，随后所有bit都填充1，或者相反。根据消息的长度和填充的规则都可以确定填充的长度，两者应该是相等的，这样可以进行校验，也给伪造和破译设置了障碍。附加消息长度的方式为：当消息的长度的二进制值超过f时，取末尾的f bit，当长度不及f时，在前面附加填充0，最终经过填充和附加后的消息成为分组长度L的倍数。由于有两种数据存储方式，LITTLE-ENDIAN和BIG-ENDIAN，应该根据机器类型选取最为合适的一种来表示填充前的消息的长度。The second step is to design the way of padding and appending length information to pad the message. Since the hash needs to have a definite result, the padding method should be fixed. For a message, there can only be a unique padding method. For padding , it should be easy to determine the padding area. Since the padding length itself is uncertain, the following two methods can be used: first, if the message length is not nL-f, the first bit is filled with 0, and all subsequent bits are all filled with 1, or on the contrary, in this way, except for the case where the length of the message is just in the form of nL-f, it is easy to determine the content of the filling, and the length of the filling is also determined. Fill it to nL-f, if the length is nL- f, no padding, n is the smallest integer that meets the condition, f is a fixed value, less than L, f is reserved for additional message length, and its length can meet the length of most messages and files More than 2^f , and not too long is the standard; second, if the message length is not nL-f, then use the above filling method, if the message length is in the form of nL-f, then fill a L-length message behind, That is, the length after filling is (n+1)Lf, and the first bit is still filled with 0, and then all bits are filled with 1, or vice versa. The length of the padding can be determined according to the length of the message and the padding rules, and the two should be equal, so that it can be verified, and it also sets up obstacles for forgery and deciphering. The method of adding the message length is: when the binary value of the length of the message exceeds f, take the f bit at the end, and when the length is less than f, add 0 to the front, and finally the message after filling and appending becomes a multiple of the packet length L . Since there are two data storage methods, LITTLE-ENDIAN and BIG-ENDIAN, the most appropriate one should be selected according to the machine type to represent the length of the message before padding.

第三步，设定hash值的长度，考虑各种现有的hash分析技术，除非特别需要便利而不太需要安全的场合，长度值n应该大于160bit。The third step is to set the length of the hash value, considering various existing hash analysis techniques, unless convenience is particularly needed but security is not required, the length value n should be greater than 160bit.

第四步，确定hash的初始值，这个（这些）值是确定的，总长度为n。选择数据的存储方式，一般应该和消息长度的存储方式相同。The fourth step is to determine the initial value of the hash. This (these) values are determined and the total length is n. Select the storage method of the data, which should generally be the same as the storage method of the message length.

第五步，设计随机的压缩函数H，压缩函数是hash函数的最关键部分，它的输入是前一个hash值或者是hash的初始值，输出为中间hash值或者最终hash值。本发明如前面说讨论的，为了达到较好的扩散和混乱效果，应该采用S盒或者类似的多比特代替部件。S盒的设计原则同对称密码学中的S盒设计是一样的，比如速度快，最好可以用运算表示（而不是查表），有比较好的雪崩效应、扩散效果、差分均匀性，但是无需要求可逆性。一些公开标准算法的S盒都经过分析与测试具有很好的效果，可以择优直接使用，并且无需考虑是否可逆（因为hash函数值只是用于验证，无需解密）。随机压缩函数构造可以考虑借鉴现有的对称密码和hash函数的构造方法。独特之处在于，一些部件可以采用随机的函数，比如随机S盒可以有多个S盒、循环移位的位数可以是随机的，比特的运算随机选择异或、与及或等。其他的各种运算都可以采用随机函数。除非需要达到特别的互补、对冲效应，所有的随机函数部件的随机性都应该是统计独立的。即任何两个或者多个随机函数部件之间所做的具体函数形式选择都是不相关的。随机函数的各个具体函数形式一般应该具有相对的等效性（比如运算量相当，输出值范围接近，如果有冗余，也应该接近），并且能够与其他的部件配合达到很好的安全性。可选地，可以在最后一个分组使用压缩函数后，对最终压缩的数据进行进一步的压缩，这样一般是用于当为了提高安全性而将中间hash值的长度设计的很长，而最终的hash值太长又不方便的场合。The fifth step is to design a random compression function H. The compression function is the most critical part of the hash function. Its input is the previous hash value or the initial value of the hash, and the output is the intermediate hash value or the final hash value. In the present invention, as discussed above, in order to achieve better diffusion and confusion effects, S-boxes or similar multi-bit replacement components should be used. The design principle of the S-box is the same as that of the S-box in symmetric cryptography. For example, the speed is fast, and it is best to use calculations (rather than look-up tables). It has better avalanche effect, diffusion effect, and differential uniformity. Reversibility is not required. The S-boxes of some public standard algorithms have been analyzed and tested with good results, and can be used directly without considering whether they are reversible (because the hash function value is only used for verification and does not need to be decrypted). The construction of the random compression function can be considered for reference to the existing construction methods of symmetric ciphers and hash functions. The uniqueness is that some components can use random functions, such as a random S-box can have multiple S-boxes, the number of bits of the cyclic shift can be random, and the operation of bits can randomly choose XOR, AND, and OR, etc. Various other operations can use random functions. Unless special complementary and hedging effects are required, the randomness of all random function components should be statistically independent. That is, the choice of specific function form between any two or more random function components is irrelevant. Each specific function form of the random function should generally have relative equivalence (for example, the amount of calculation is equivalent, the output value range is close, and if there is redundancy, it should be close), and it can cooperate with other components to achieve good security. Optionally, after the last packet uses the compression function, the final compressed data can be further compressed, which is generally used when the length of the intermediate hash value is designed to be very long in order to improve security, and the final hash Occasions where the value is too long and inconvenient.

第六步，确定随机函数的确定编码A的结构和形式，压缩函数本身是随机的，但是在具体的明文（消息）下，函数应该是确定的。本发明通过确定编码A来确定这个函数的具体形式。分别考虑方便性和数据节省，可以采用两种方式：1）独立的随机函数部件的编码采用单独的bit位，如果一些随机函数部件是相关的，可以结合在一起编码。随机函数部件对应的确定编码的bit位长度不小于log₂e，e是该随机函数部件的具体形式的个数。根据随机压缩函数的随机部件数量及其对应的元素数量有关系。取各个随机部件所需的最少位数，即log₂e（如果为整数）或log₂e取整加1，如果考虑读取和存储方便，也可以取最小的8的倍数的位数。这样可以得到确定随机函数的编码的格式，比如前面多少bit或者字节是确定哪个随机部件，接着多少长度的信息是确定什么随机部件，如此类推。这样可以确定A的长度。2）将随机函数作为一个整体考虑其所有可能的具体函数形式数目E，取确定随机函数所需的最少位数，即log₂E（如果为整数）或log₂E取整加1，作为A的长度。建立A与各个具体形式的对应关系。The sixth step is to determine the structure and form of the definite code A of the random function. The compression function itself is random, but under the specific plaintext (message), the function should be deterministic. The present invention determines the specific form of this function by determining the code A. Convenience and data saving are considered separately, and two methods can be adopted: 1) The encoding of independent random function components uses a separate bit, and if some random function components are related, they can be coded together. The bit length of the definite code corresponding to the random function component is not less than log₂ e, where e is the number of specific forms of the random function component. There is a relationship between the number of random components and their corresponding number of elements according to the random compression function. Take the minimum number of digits required by each random component, that is, log₂ e (if it is an integer) or log₂ e plus 1, if you consider the convenience of reading and storage, you can also take the smallest number of digits that is a multiple of 8. In this way, the encoding format for determining the random function can be obtained, such as how many bits or bytes in the front determine which random component is determined, and the information of the following length determines which random component, and so on. This determines the length of A. 2) Consider the random function as a whole with the number E of all possible specific functional forms, and take the minimum number of digits required to determine the random function, that is, log₂ E (if it is an integer) or log₂ E plus 1, as A length. Establish the corresponding relationship between A and each specific form.

第七步，设计计算随机函数的确定编码A的方法。随机函数的确定由当前分组或者是第一个分组来确定，前者减少运算量，后者则更加安全。这可以认为是设计2个函数f和g，使得A=g（f（am），P），这里的am代表相应的消息分组，P为前面提到的公开参数，P的二进制长度不超过A的二进制长度。这样可以让A可以遍历所有的函数具体形式，而且也减少等效公开参数P带来的冗余。注意，计算出来的A最终截取出来的值可能是有冗余的，这可以进行取模处理，比如，某个随机函数部件有5种具体形式，采用3bit来编码，则可以将它们分别编码0，1，2，3，4，将3bit转换为十进制后，取模5。当然一般最好设计为包含2^w种具体函数形式更好。The seventh step is to design a method for determining the code A to calculate the random function. The determination of the random function is determined by the current group or the first group, the former reduces the amount of calculation, and the latter is more secure. This can be considered as designing two functions f and g, so that A=g(f(am), P), where am represents the corresponding message grouping, P is the public parameter mentioned above, and the binary length of P does not exceed A The binary length of . This allows A to traverse all the concrete forms of the function, and also reduces the redundancy brought about by the equivalent public parameter P. Note that the final intercepted value of the calculated A may be redundant, which can be processed by modulus. For example, if a random function component has 5 specific forms and is encoded with 3 bits, they can be encoded as 0 respectively. , 1, 2, 3, 4, after converting 3bit to decimal, take modulo 5. Of course, it is generally better to design as including^2w kinds of specific function forms.

以下是单个分组的hash函数值的计算方法：根据不同的情况有两种方法确定随机函数，第一，是根据第一个分组确定随机函数，以后一直沿用此随机函数；第二，根据当前分组确定随机函数。第一种方法可以减少计算量，第二种方法则可以增强安全性。首先根据两种情况分别选择输入第一个分组或者当前的明文（消息）分组am，通过A=g（f（am），P）计算出A，在实际中可以简化g函数，让A=f（am）+P，这里的+可以为异或或者模加，其模数不小于算法的具体形式的总数，采用这两种方式可以减少运算量而且具有很明确的遍历性，根据A的数据结构，可以确定函数的具体形式。这样hash函数H的具体形式h_i就确定了，hash值d_i =h_i（am，d_i-1）。d_i-1是前一个分组的hash值，i=1时取初始值。The following is the calculation method of the hash function value of a single group: There are two ways to determine the random function according to different situations. The first is to determine the random function according to the first group, and this random function will be used in the future; Determine the random function. The first method can reduce the amount of computation, and the second method can enhance security. First, according to the two situations, choose to input the first group or the current plaintext (message) group am, and calculate A by A=g(f(am), P). In practice, the g function can be simplified, let A=f (am)+P, where + can be XOR or modular addition, and its modulus is not less than the total number of specific forms of the algorithm. Using these two methods can reduce the amount of calculation and has a very clear ergodicity. According to the data of A The structure can determine the specific form of the function. In this way, the specific form h_i of the hash function H is determined, and the hash value d_i =h_i (am, d_i-1 ). d_i-1 is the hash value of the previous group, and the initial value is taken when i=1.

整个消息生成hash值的过程：对消息进行填充，附加长度信息，然后反复使用上段的方法，以初始hash值或前一个hash值作为输入，对每一个分组利用随机压缩函数来逐一产生hash值，逐步压缩每一个分组直至结束。有些hash函数会在最后一个分组进行进一步的压缩。公开hash值的时候应该同时公开那个公开参数P，不同的P对应不同hash值。The process of generating a hash value for the entire message: padding the message, adding length information, and then repeatedly using the method in the previous paragraph, using the initial hash value or the previous hash value as input, using a random compression function for each group to generate hash values one by one, Compress each packet step by step until the end. Some hash functions perform further compression in the last packet. When the hash value is disclosed, the public parameter P should be disclosed at the same time, and different P corresponds to different hash values.

本发明的安全性优势：一、相对于现有的算法仅仅是输入变量的变化，基于随机函数的hash函数其函数也是变化的，这种变化导致hash计算中的中间结果和最终hash值的变化更加激烈和难于分析；二、算法对于加密解密双方是确定的，但是对于分析者却不确定。现有的公开的密码分析方法均针对确定的算法，而本发明函数本身不确定，使得密码分析很难着手；三、本发明利用一个公开参数P对应不同的hash值，这使得查表攻击之类的蛮力攻击会更难，建表、存储表和查表的时间空间复杂度都会增加。 The security advantages of the present invention: one, compared with the existing algorithm, only the change of the input variable, the function of the hash function based on the random function is also changed, and this change leads to the change of the intermediate result and the final hash value in the hash calculation It is more intense and difficult to analyze; second, the algorithm is certain for both encryption and decryption parties, but not for the analyst. Existing open cryptanalysis methods are all aimed at definite algorithm, and the function of the present invention is uncertain, makes cryptanalysis difficult to set about; Three, the present invention utilizes a public parameter P to correspond to different hash values, and this makes table look-up attack The brute force attack of the class will be more difficult, and the time and space complexity of building tables, storing tables, and looking up tables will increase. the

附图说明Description of drawings

图1是本发明的实施例迭代示意图。Fig. 1 is a schematic diagram of an iterative embodiment of the present invention.

图2是本发明的随机函数构造流程图。Fig. 2 is a flow chart of the random function construction of the present invention.

图3是本发明实施例的hash值计算流程图。Fig. 3 is a flow chart of hash value calculation according to an embodiment of the present invention.

具体实施方式Detailed ways

以下为一个随机hash函数构造的实施例，为了方便和简洁地描述，采用比较简短的算法，并且借鉴已有的算法。The following is an example of constructing a random hash function. For convenience and concise description, a relatively short algorithm is used and existing algorithms are used for reference.

以下按照步骤构造一个随机hash函数：Follow the steps to construct a random hash function:

第一步，确定处理的分组长度512bit。The first step is to determine the processed packet length of 512bit.

第二步，设计填充和附加长度信息的方式，对消息进行填充，第一位填充1，其后位全部为0，将其填充到512n-64，如果恰好为512n-64的形式则填充512bit，同样是第一位填充1，其后位全部为0，n为符合该条件的最小整数，在后面附加64bit是预留用于附加消息长度的，因为现实中绝大多数消息的长度的二进制一般是不会超过64，即长度不超过2⁶⁴。当消息的长度的二进制值超过64，即消息的长度长于2⁶⁴时，取末尾的64bit，当长度不及64时，在前面附加填充0至满64位，最终经过填充和附加后的消息成为分组长度512的倍数。消息长度采用BIG-ENDIAN的存储方式。The second step is to design the way of padding and additional length information, and pad the message. The first bit is filled with 1, and the subsequent bits are all 0. Fill it to 512n-64. If it happens to be in the form of 512n-64, fill it with 512bit , the first bit is also filled with 1, and the subsequent bits are all 0, n is the smallest integer that meets this condition, and the additional 64 bits are reserved for additional message length, because in reality, the length of most messages is binary Generally, it will not exceed 64, that is, the length will not exceed 2⁶⁴ . When the binary value of the length of the message exceeds 64, that is, when the length of the message is longer than²⁶⁴ , take the last 64 bits, and when the length is less than 64, add 0 to full 64 bits in front, and finally the message after filling and appending becomes a packet A multiple of length 512. The message length adopts BIG-ENDIAN storage method.

第三步，设定hash值的长度，这里为了方便取长度值n为160bit。The third step is to set the length of the hash value. Here, the length value n is set to 160 bits for convenience.

第四步，确定hash的初始值，或者初始变量，这个（这些）值是确定的，总长度为160bit。将中间hash结果和最终hash值存储在缓冲区，缓冲区分为5个32bit的寄存器，称为A、B、C、D、E，初始值为A= 0x45670123，B= 0xAB89EFCD，C= 0xDCFE98BA，D= 0x10325476，E= 0xC3D2E1F0。寄存器采用BIG-ENDIAN的存储方式。The fourth step is to determine the initial value of the hash, or the initial variable, this (these) values are determined, and the total length is 160 bits. Store the intermediate hash result and the final hash value in the buffer. The buffer is divided into five 32-bit registers, called A, B, C, D, and E. The initial value is A= 0x45670123, B= 0xAB89EFCD, C= 0xDCFE98BA, D = 0x10325476, E = 0xC3D2E1F0. The register adopts BIG-ENDIAN storage method.

第五步，设计随机的压缩函数H。为了增强安全性，这里选用S盒作为一个安全部件。S盒大则安全，但是会造成计算量和存储的增加，S盒小则不安全，对于软件实现，一般以字节为单位将会方便运算，所以选取8bit，即一个字节作为S盒的输入输出大小。有限域GF（2⁸）上的乘法逆映射具有很好的非线性度、差分均匀性和次数，但是由于代数表达式过于简单而容易遭受插值之类的攻击，可以将其与一个仿射运算相结合。本算法将S盒当做一个随机函数部件，随机取两个S盒，它们都采用逆映射与仿射运算组合，它们的差异在于仿射运算的映射关系不一样。具体变换为：1）将8bit变换为GF（2⁸）上的乘法逆元，额外地，二进制00000000映射为00000000，2）对前面逆运算的结果采用仿射变换如下，由于是具有两种具体形式的随机函数，所以有两个随机选择的二进制仿射变换：The fifth step is to design a random compression function H. In order to enhance security, the S box is selected as a security component here. A large S box is safe, but it will increase the amount of calculation and storage. A small S box is not safe. For software implementation, it is generally convenient to calculate in bytes, so choose 8bit, that is, one byte as the S box. Input and output size. The multiplicative inverse map on the finite field GF(2⁸ ) has good nonlinearity, differential uniformity and degree, but because the algebraic expression is too simple, it is vulnerable to attacks such as interpolation. It can be combined with an affine operation Combine. This algorithm regards the S-box as a random function component, and randomly selects two S-boxes, both of which use the combination of inverse mapping and affine operation. The difference between them is that the mapping relationship of affine operation is different. The specific transformation is: 1) transform 8bit into the multiplicative inverse element on GF(2⁸ ), additionally, the binary 00000000 is mapped to 00000000, 2) the affine transformation is used for the result of the previous inverse operation as follows, because there are two specific A random function of the form , so there are two randomly chosen binary affine transformations:

以上S盒用于将512bit的消息分组am进行代换，即将这个消息分组根据随机S盒进行替换后，然后512bit整体左循环移位S盒长度的一半位数，即4bit，得到Y_q，然后对Y_q都进行4轮处理，每一轮又由20步迭代组成。4轮处理过程结构一样，但所用的基本随机函数不同，分别表示为F₁，F₂，F₃，F₄。每轮的输入为当前处理的消息分组的S盒代换和循环移位的结果Y_q和缓冲区的当前值A，B，C，D，E，输出仍放在缓冲区以替代A，B，C，D，E的旧值，每轮的每一个迭代处理过程如附图一所示，F为随机函数，输入为B、C、D，K_t为加法常量，其中0≤t≤79，t表示迭代的步数。当0≤t≤19时，K_t=0x51827919；当20≤t≤39时，K_t=0x EBA164D9；当40≤t≤59时，K_t=0x211BBC3C；当60≤t≤79时，K_t=0x6A62C1D；<<<_n 代表bit向左循环移动n个位置。n因操作而异。图中田代表modulo 2³²之下的加法。字W_t是由Y_q计算得到的，W_t是32bit长，前16个字（从W₀依次到W₁₅）直接取Y_q依次的16个32bit，其余字（从W₁₆依次到W₇₉），W_t是由多重随机的函数计算出来的，W_t=<<<_I [（W_t-16OPERATOR₁ W_t-14） OPERATOR₂ (W_t-8 OPERATOR₁ W_t-3）]，<<<_I表示左循环移位I位，其中I为随机变量，为10和18，OPERATOR为随机运算符， OPERATOR₁的具体形式为异或、异或后求非运算，OPERATOR₂的具体形式为与、或运算，这样在随机函数的具体形式上比较具有互补性。随机函数F的具体定义如下表：The above S box is used to replace the 512bit message group am, that is, after the message group is replaced according to the random S box, then the 512bit overall left circular shift is half of the length of the S box, that is, 4bit, to obtain Y_q , and then 4 rounds of processing are performed on Y_q , and each round consists of 20 iterations. The structure of the four rounds of processing is the same, but the basic random functions used are different, which are respectively expressed as F₁ , F₂ , F₃ , and F₄ . The input of each round is the result Y_q of the S-box substitution and cyclic shift of the currently processed message group and the current value of the buffer A, B, C, D, E, and the output is still placed in the buffer to replace A, B , the old values of C, D, E, each iteration process of each round is shown in Figure 1, F is a random function, the input is B, C, D, K_t is an additive constant, where 0≤t≤79 , t represents the number of iteration steps. When 0≤t≤19, K_t =0x51827919; when 20≤t≤39, K_t =0x EBA164D9; when 40≤t≤59, K_t =0x211BBC3C; when 60≤t≤79, K_t =0x6A62C1D; <<<_n means the bit moves n positions to the left. n varies by operation. The field in the figure represents addition under modulo 2³² . The word W_t is calculated by Y_q , W_t is 32bit long, the first 16 words (from W₀ to W₁₅ ) directly take 16 32bits of Y_q in sequence, and the rest of the words (from W₁₆ to W₇₉ ), W_t is calculated by multiple random functions, W_t =<<<_I [(W_t-16 OPERATOR₁ W_t-14 ) OPERATOR₂ (W_t-8 OPERATOR₁ W_t-3 )], <<<_I means a left circular shift of I bits, where I is a random variable, which is 10 and 18, OPERATOR is a random operator, the specific form of OPERATOR₁ is XOR, XOR followed by NOT operation, and the specific form of OPERATOR₂ It is an AND or OR operation, which is relatively complementary in the specific form of the random function. The specific definition of the random function F is as follows:

表一随机函数F定义表Table 1 Random function F definition table

注意到，这里F₂和F₄虽然可能的具体形式相同，但是它们是各自独立的选取两个具体函数之一，属于不同函数。为了简化，本示例的随机性尚少，在具体设计的时候可以考虑更多的随机性。Note that although F₂ and F₄ may have the same specific form, they are one of two specific functions selected independently and belong to different functions. For simplicity, the randomness in this example is still small, and more randomness can be considered in the specific design.

第六步，确定随机函数的确定编码的结构。我们逐一考察函数的随机性，加上这些随机性是独立的，所以可以用独立的比特对应：1）随机代换函数，即随机S盒有两种具体形式，需要1bit。2）随机函数F₁，F₂，F₃，F₄分别有两种具体形式，各自需要1bit，一共4bit。3）W_t为多重随机函数，OPERATOR₁、OPERATOR₂和I均是随机因子，都有两种形式，需要3bit。因此，A需要8bit编码，其中第1bit确定S盒，0代表第一个S盒，1代表第二个，第2-5bit分别确定随机函数，0和1分别代表表中按照先后顺序的具体函数形式，第6-8bit确定W_t求解函数。The sixth step is to determine the structure of the definite code of the random function. We examine the randomness of functions one by one, and these randomnesses are independent, so independent bits can be used to correspond: 1) The random substitution function, that is, the random S-box has two specific forms and requires 1 bit. 2) The random functions F₁ , F₂ , F₃ , and F₄ have two specific forms, each requiring 1 bit, and a total of 4 bits. 3) W_t is a multiple random function, OPERATOR₁ , OPERATOR₂ and I are all random factors with two forms and need 3 bits. Therefore, A requires 8-bit encoding, where the first bit determines the S-box, 0 represents the first S-box, 1 represents the second, the 2-5 bits respectively determine the random function, and 0 and 1 represent the specific functions in the table in order Form, the 6th-8th bits determine the W_t solution function.

第七步，设计计算随机函数的确定编码A的方法。具体构造的时候，可以采用两种方法：第一、自然的形式，每一个分组都可以根据分组的消息来决定当前分组的随机函数；第二、为了减少计算量、存储量和复杂度，这里的随机函数在第一个分组的时候确定，随后的分组一直沿用这个函数。第二种方式比较实用，宜采用这种方式。计算A的方法为：将消息am进行上文中的第一个S盒代换，然后将得到的512bit消息均分为两个256bit分组，分别进行左循环移位，左边的分组移位4bit，右边的分组移位6bit，将两部分移位后的数字进行异或运算，得到256bit分组。接着继续进行相同的S盒代换，折半分组，并且将两部分进行左循环移位4bit和6bit，然后异或，结果多轮的上述方式的循环得到8bit的结果，加上公开参数P，得到的结果作为A，然后按照A的结构进行提取，取模，得到的结果用于确定函数的具体形式。这样就可以用设计的hash函数来产生数据摘要。图3是本实施例的计算hash值的流程。The seventh step is to design a method for determining the code A to calculate the random function. In the specific construction, two methods can be used: first, the natural form, each group can determine the random function of the current group according to the group message; second, in order to reduce the amount of calculation, storage and complexity, here The random function of is determined at the time of the first grouping, and the subsequent groupings always use this function. The second method is more practical and should be adopted. The method of calculating A is: replace the message am with the first S box above, and then divide the obtained 512bit message into two 256bit groups, and perform left circular shift respectively, the left group is shifted by 4 bits, and the right The group shifted by 6 bits, and the XOR operation is performed on the shifted numbers of the two parts to obtain a 256-bit group. Then continue to perform the same S-box replacement, divide the grouping in half, and perform a left cyclic shift of 4bit and 6bit on the two parts, and then XOR. As a result, multiple rounds of the above-mentioned cycle get an 8bit result, and add the public parameter P to get The result is taken as A, and then extracted and moduloed according to the structure of A, and the obtained result is used to determine the specific form of the function. In this way, the designed hash function can be used to generate a data summary. FIG. 3 is a flow chart of calculating hash values in this embodiment.

密码分析者想要破译将是非常困难的。即使密码分析者知道了原消息，想要伪造一个相同的消息，原则上说他在这种情况下可以知道hash函数的具体形式，但是他伪造的明文不仅仅要通过计算方法得出相同的确定编码A，同时要得到相同的hash值，显然也是很困难的，在不知道明文的情况下，则他既不知道算法，又不知道消息，伪造的消息也需要同时满足双重的条件，大大增加了破译的困难性。由于hash函数值是随机的，查表分析则需要建立很大的表，建表，查表和存储表，均需要大量的时间和空间。 It would be very difficult for a cryptanalyst to decipher. Even if the cryptanalyst knows the original message and wants to forge the same message, in principle, he can know the specific form of the hash function in this case, but the plaintext he forges must not only obtain the same determination through calculation methods It is obviously very difficult to encode A and get the same hash value at the same time. If he does not know the plaintext, he does not know the algorithm or the message. The forged message also needs to meet the double conditions at the same time, which greatly increases Difficulty in deciphering. Since the value of the hash function is random, a large table needs to be established for table lookup analysis, and a lot of time and space are required for table creation, table lookup, and storage. the

Claims

Translated fromChinese

1.一种抗查表攻击的随机单向哈希函数构造方法，其特征在于该方法应用于信息安全领域，采用随机的、不确定的随机函数来构造hash函数，随机函数具有多个不同的具体函数形式，当消息分组不确定时，随机哈希函数是不确定的，输入消息分组和公开参数P确定时，hash函数是确定的，即消息分组和公开参数P确定hash函数的具体函数形式，采用一个根据消息分组和公开参数P计算出来的确定编码A来决定随机函数的具体形式；一个确定的消息的hash值是随机的，由一个公开参数P确定；具体构造步骤包括：1）确定压缩函数每一次处理的消息分组长度；2）设计填充和附加长度信息的方式，在消息中附加消息的长度信息，将消息填充到压缩函数能够处理的长度；3）权衡安全需求、运算效率和实际实用便利性设定hash值的长度；4）确定hash的初始值；5）设计随机函数作为压缩函数，采用多个随机函数部件来构造随机hash函数，其中的随机函数部件随机选取多个具体函数形式；6）设计随机函数的确定编码A的结构，设定公开参数P的取值范围，其中A是确定随机函数具体形式的编码；7）设计计算随机函数的确定编码A的方法。1. A random one-way hash function construction method resistant to table lookup attacks, characterized in that the method is applied to the field of information security, using random, uncertain random functions to construct hash functions, and the random functions have multiple different The specific function form, when the message grouping is uncertain, the random hash function is uncertain, when the input message grouping and the public parameter P are determined, the hash function is certain, that is, the message grouping and the public parameter P determine the specific functional form of the hash function , using a certain code A calculated according to message grouping and public parameter P to determine the specific form of the random function; the hash value of a certain message is random and determined by a public parameter P; the specific construction steps include: 1) Determine The length of the message packet processed by the compression function each time; 2) Design the way of filling and adding length information, add the length information of the message to the message, and fill the message to the length that the compression function can handle; 3) Balance security requirements, computing efficiency and Practical convenience Set the length of the hash value; 4) Determine the initial value of the hash; 5) Design a random function as a compression function, use multiple random function components to construct a random hash function, and randomly select multiple specific random function components Functional form; 6) Design the structure of the definite code A of the random function, and set the value range of the public parameter P, where A is the code that determines the specific form of the random function; 7) Design the method of calculating the definite code A of the random function.

2.如权利要求1所述的抗查表攻击的随机单向哈希函数构造方法，其特征是：步骤5）中，随机函数部件的构造为选择具有安全性的多个函数作为一个随机函数部件的多种具体形式，这些具体形式在计算量、输出值的范围和冗余方面是等效的。2. The method for constructing random one-way hash functions against table look-up attacks as claimed in claim 1, characterized in that: in step 5), the construction of random function components is to select multiple functions with security as a random function Multiple concrete forms of a component that are equivalent in terms of computational effort, range of output values, and redundancy.

3.如权利要求2所述的抗查表攻击的随机单向哈希函数构造方法，其特征是随机函数采用两种方式确定：第一，第一个分组的消息和公开参数P确定随机压缩函数的具体形式，以后一直沿用这个具体函数；第二，每一个分组的消息和公开参数P确定当前分组所用的随机压缩函数的具体形式。3. the random one-way hash function construction method of anti-table look-up attack as claimed in claim 2 is characterized in that random function adopts two kinds of modes to determine: the first, the message of the first grouping and public parameter P determine random compression The specific form of the function will continue to use this specific function in the future; secondly, the message of each group and the public parameter P determine the specific form of the random compression function used by the current group.

4.如权利要求3所述的抗查表攻击的随机单向哈希函数构造方法，其特征为：压缩前先计算随机函数的确定编码A，依据A确定随机函数的具体形式，然后以这一随机函数的具体形式计算hash值或其中间结果，经过逐步压缩得到最终hash值，公开hash值的同时公开对应的公开参数P。4. the random one-way hash function construction method of anti-table look-up attack as claimed in claim 3 is characterized in that: calculate the definite code A of random function earlier before compression, determine the concrete form of random function according to A, then with this A specific form of a random function calculates the hash value or its intermediate results, and gradually compresses to obtain the final hash value. When the hash value is disclosed, the corresponding public parameter P is disclosed.