Technical Field
The invention belongs to the field of computer system software and application software, and specifically relates to a binary software component and a method for producing it.
Background Art
At present, a binary module of computer software typically contains the program's binary code and data together with related symbols, strings, and similar information. Binary modules come in several public standard formats, such as ELF, PE, NE, and a.out, and these standards are mutually incompatible. Object programs from many different vendors cannot make assumptions in advance about each other's syntax or calling conventions, which makes binary modules hard to reuse. In addition, generating an object program involves two consecutive steps, the compiler and the link editor, which belong to the same toolchain. The link editor usually assumes that the compiler has already taken care of the syntactic matching rules, so it ignores the related information and back-fills the symbol reference addresses between software modules simply by matching symbol names (which may refer to functions or global variables). The problems with this approach include:
1. The symbols of the object program carry no syntactic information, only symbol names, which introduces uncertainty into subsequent linking;
2. The module, as an information carrier, is not protected and its origin cannot be identified; dependencies between modules are not stated explicitly.
Summary of the Invention
The present invention overcomes the defects of the existing module formats described above, namely missing information or poorly encapsulated information, and provides a binary software component and a method for producing it, yielding a reusable binary software component.
Another object of the invention is to bind modules more precisely at link time.
A further object of the invention is to let users conveniently verify whether a software component they hold has been tampered with, making component distribution more secure.
Technical content of the invention: a method for producing a binary software component, comprising the following steps:
1. Extract the symbol information, code, data, string table, and related information of the original module from the existing object image;
2. Add a syntactic representation of the interfaces/variables and organize the interface information;
3. Reorganize and package the interface information together with the extracted information to generate the binary software component.
The extracted symbol information is re-encoded according to the signatures of the functions/variables that the original module provides and of the functions it requires.
Based on the symbol table in the original object image, the symbols that the image provides are clearly separated from those it does not provide. Provided symbols become members of the component's provided interfaces; unprovided symbols become members of the component's required interfaces.
The method further includes applying the MD5 message digest algorithm to the component.
A binary software component that acts as a container and, in addition to basic information, includes multiple implementation bodies, characterized in that each implementation body has a corresponding interface organized in a defined format, and the representation of the interface is separated from the implementation body while both are packaged together in the software component.
The binary software component stores the following information in order:
1. Basic component information, including the component name, version, applicable target machine architecture, component size, and so on;
2. The component's interface information table, including the name and identifier of each interface;
3. The component's implementation table, giving the name, location (start offset and length), and attribute information of each implementation;
4. The section contents of each implementation body, including the code segment, data segment, read-only data segment, stack segment, and so on.
The basic component information also includes signature information for the whole component.
The component's interface information table also includes the number of members of each interface and a description of each member.
The component's interface information table further includes attribute information for each member.
Technical effects of the invention: the invention provides a binary software component and a method for producing it. The component packages implementation bodies (implementations) that are logically tightly coupled; each implementation body implements one interface, and the interface information is organized explicitly. This makes it easy for different development organizations to supply different implementations of the same interface, and for the same implementation body to be used in different application environments, so the software component is reusable. Publishing the interface information and the implementation information together in a single file additionally makes the component self-describing, which further improves its reusability.
The invention re-encodes symbols according to the signatures (return type, function name, parameter types, and number of parameters) of the functions/variables that the object program image provides and of the functions it requires, which restores their syntactic information.
By applying a message digest signature to the component as a whole and publishing the result through a suitable channel, malicious or accidental modifications of the component's content, for example by viruses or attackers, can be detected. This improves the security of the assembled system and makes component distribution more secure.
The component interface information also includes member information, which avoids the lack of symbol syntax information in the original object program image. Explicit syntactic information supports syntax-level checking and matching when components are assembled.
The component interface information further includes attribute descriptions of the interface, which help guide the subsequent assembly process. For example, an interface member function can specify whether the method is reentrant, whether it may be interrupted, and whether it is synchronous (that is, the caller may not continue until the function returns).
Brief Description of the Drawings
The invention is described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram of the component format of the invention;
Fig. 2 is a schematic diagram of the format of the component header information of the invention.
Detailed Description
To address the problem that symbol names in an object program image are too simple, the invention first extracts the symbol information from the original binary software module and re-encodes it.
Symbol names can be extracted directly from the original binary module. Re-encoding restores their syntactic information: the symbols are reorganized in a compressed notation according to the signatures of the functions (variables) that the original module provides (or requires). Here, a signature consists of the return type, name, parameter list, and so on of each function (variable). The compression represents each original data type name with an abbreviated ASCII code, typically a single character. For example, the function
int func_name(unsigned char param1, float *res)
can be encoded as:
-i9func_nameucPf
Here the character "i" indicates that the return type is int, the character "9" indicates that the following 9 characters, "func_name", are the function name, the letters "uc" indicate that the first parameter has type "unsigned char", the letter "P" indicates that the second parameter is a pointer, and the letter "f" indicates that this pointer points to float.
This normalized encoding avoids the ambiguity that arises when the original module offers only "func_name" as the interface member name, and it allows modules to be bound more precisely at link time. The syntactic form of an interface can be obtained from related documentation, such as the software's documentation or header files.
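The encoding rule described above can be sketched as follows. This is a minimal illustration rather than the patented encoder: only the codes "i", "uc", "f" and the pointer prefix "P" come from the example in the text, while the remaining entries of the type table, the function names, and the handling of the leading "-" (copied verbatim from the example, whose meaning the text does not explain) are assumptions.

```python
# Hypothetical type-code table. Only "i", "uc", "f" (and the pointer prefix
# "P" used below) appear in the text; "v", "c" and "d" are assumed additions.
TYPE_CODES = {
    "int": "i",
    "unsigned char": "uc",
    "float": "f",
    "void": "v",
    "char": "c",
    "double": "d",
}


def encode_type(c_type):
    """Encode a C type name; every '*' becomes one leading 'P'."""
    pointer_depth = c_type.count("*")
    base = " ".join(c_type.replace("*", " ").split())  # normalize whitespace
    if base not in TYPE_CODES:
        raise ValueError("no code assigned for type %r" % base)
    return "P" * pointer_depth + TYPE_CODES[base]


def encode_signature(return_type, name, param_types):
    """Build '-<return code><name length><name><parameter codes>'."""
    parts = ["-", encode_type(return_type), str(len(name)), name]
    parts.extend(encode_type(p) for p in param_types)
    return "".join(parts)


encoded = encode_signature("int", "func_name", ["unsigned char", "float *"])
print(encoded)  # prints -i9func_nameucPf
```

Two functions that share a name but differ in parameter types produce different encoded symbols under this scheme, which is what lets a linker reject a mismatched binding instead of silently accepting it.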
The method for producing a binary software component proceeds as follows:
1. Extract the useful information from the original object image, including the code, the (read-only) data segments, and the string table. Following the format standard of the original binary module, extract the code segment, data segment, read-only data segment, and string table from the module.
2. Group the interface functions and variables into interfaces, name them, and build the interface table.
Based on the symbol table in the original object image, the symbols that the image provides are clearly separated from those it does not provide. Provided symbols become members of the component's provided interfaces; unprovided symbols become members of the component's required interfaces. The members of both kinds are then organized into meaningful interfaces according to their function and purpose, and each interface is given a meaningful name. The attribute description of an interface can also carry related information, for example whether an interface member function is reentrant, whether it may be interrupted, and whether it is synchronous (that is, the caller may not continue until the function returns), so that third parties can use the component more easily. If several modules are to be packaged into one component, repeat the steps above for each module.
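The split into provided and required members, followed by grouping into named interfaces, might look like the sketch below. The `Symbol` record, the function names, and the interface names `ICompute` and `IControl` are hypothetical; a real tool would read the defined/undefined status from the symbol table of the original module format, and the grouping itself is a decision made by whoever packages the component.

```python
from collections import namedtuple

# Hypothetical, simplified symbol record: 'defined' is True when the object
# image itself supplies the symbol, False when it only references it.
Symbol = namedtuple("Symbol", ["name", "defined"])


def split_symbols(symbols):
    """Separate provided symbols from required (unresolved) ones."""
    provided = [s.name for s in symbols if s.defined]
    required = [s.name for s in symbols if not s.defined]
    return provided, required


def group_into_interfaces(names, grouping):
    """Assign member names to named interfaces.

    'grouping' maps an interface name to the set of member names that the
    packager decided belong together.
    """
    return {
        iface: [n for n in names if n in members]
        for iface, members in grouping.items()
    }


symbol_table = [
    Symbol("func_name", True),
    Symbol("reset_unit", True),
    Symbol("malloc", False),
]

provided, required = split_symbols(symbol_table)
provided_interfaces = group_into_interfaces(
    provided, {"ICompute": {"func_name"}, "IControl": {"reset_unit"}}
)
required_interfaces = group_into_interfaces(required, {"IMemory": {"malloc"}})
```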
3. Generate the target binary component.
Construct the file header information block for the target file, open the output component file, and write the file header structure. Then write each interface's information in turn to build the interface table in the file. Next, write the code segment, data segment, and read-only data segment extracted in step 1 to the output file, and finally write the string table.
Signature information is then generated. To address the trustworthiness of the object program image's origin, the publicly available MD5 message digest algorithm is applied to the byte stream of the converted software component file, and the result is back-filled at a fixed position in the component header so that users can check it. The output file is then closed. Because MD5 mixes all bytes of the byte stream thoroughly, and the resulting digest is short and consists of readable ASCII characters (in its usual hexadecimal form), users of a software component can conveniently verify whether the component they hold has been tampered with, which makes component distribution more secure. With the digest computed over the whole component and the result published through a suitable channel, malicious or accidental modifications of the component's content, for example by viruses or attackers, can be detected.
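One way to back-fill a digest into the same file it covers is to treat the digest field as zero-filled while hashing. The text does not say how the field itself is handled, so that convention, the field offset, and the function names below are assumptions; MD5 is used only because the text names it.

```python
import hashlib

DIGEST_OFFSET = 36  # assumed fixed position of the digest field in the header
DIGEST_SIZE = 32    # a 128-bit MD5 digest written as 32 hexadecimal characters


def compute_digest(data):
    """Hash the component bytes with the digest field treated as zeros."""
    blanked = bytearray(data)
    blanked[DIGEST_OFFSET:DIGEST_OFFSET + DIGEST_SIZE] = b"\0" * DIGEST_SIZE
    return hashlib.md5(bytes(blanked)).hexdigest().encode("ascii")


def sign(data):
    """Return a copy of the component with the digest back-filled."""
    signed = bytearray(data)
    signed[DIGEST_OFFSET:DIGEST_OFFSET + DIGEST_SIZE] = compute_digest(data)
    return bytes(signed)


def verify(data):
    """Recompute the digest and compare it with the stored one."""
    stored = data[DIGEST_OFFSET:DIGEST_OFFSET + DIGEST_SIZE]
    return stored == compute_digest(data)


# Placeholder component: magic number, zeroed header area, a few body bytes.
component = b"\xffCOM" + bytes(64) + b"code"
signed = sign(component)
tampered = signed[:-1] + b"X"
```

Here `verify(signed)` succeeds and `verify(tampered)` fails, because changing any byte outside the digest field changes the recomputed digest.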
Referring to Fig. 1, the new binary component format gives prominence to two kinds of information: interfaces and implementation bodies. The interface information expresses the functions a component provides and the services it needs from other components in order to deliver those functions. The implementation part contains the concrete implementation of each interface. A component generated in this new format acts as a container that wraps the original software modules. A reusable binary software component stores the following information in order:
1. Basic component information: the component name, version, applicable target machine architecture, component size, offset of the interface information description table, offset of the component implementation table, and the MD5 message digest of the whole component.
2. The component's interface information table: the name and identifier of each interface, the number of members of the interface, and a description of each member. Attribute information is given for every member.
3. The component's implementation table: the name, location (start offset and length), and attribute information of each implementation.
4. The section contents of each implementation body: the code segment, data segment, read-only data segment, and stack segment. These are followed by the relocation information of the implementation bodies, and finally by the description of the component's required interfaces.
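The ordering above determines the start offset and length that the implementation table records for each part. A minimal sketch of that bookkeeping follows, assuming a 68-byte header and placeholder contents; the part names and sizes are hypothetical, and only the parts listed above are included.

```python
def layout(header_size, parts):
    """Compute (offset, length) for each part written after the header.

    'parts' is an ordered list of (name, bytes) pairs; the order is the order
    in which the parts are stored in the component file.
    """
    offsets = {}
    position = header_size
    for name, blob in parts:
        offsets[name] = (position, len(blob))
        position += len(blob)
    return offsets, position  # position is now the total file size


# Placeholder contents, in the storage order listed above.
parts = [
    ("interface_table", bytes(40)),
    ("implementation_table", bytes(24)),
    ("code", bytes(128)),
    ("data", bytes(16)),
    ("rodata", bytes(8)),
    ("relocations", bytes(12)),
    ("required_interfaces", bytes(20)),
]

offsets, total_size = layout(68, parts)
```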
The component header information and the interface information table make the component self-describing. The header is laid out as shown in Fig. 2. The magic number is four bytes that identify this file format (for example 0xFF, 'C', 'O', 'M'). The file size field gives the size of the whole component in bytes. The architecture field gives the instruction set code of the target platform. The target word size may be 16, 32, 64 bits, and so on. The interface count and the interface information table offset locate the information for each interface implemented in the file; the implementation count and the implementation information table offset locate the information for each implementation body. The string information table offset gives the position in the file of the master table holding the strings that the various names in the file refer to. The header structure may also record its own size in bytes, so that the offsets of subsequent content can be computed and the format can be extended.
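A header of this kind can be packed and parsed with Python's `struct` module. The sketch below is illustrative only: Fig. 2 is not reproduced here, so the field order, field widths, little-endian byte order, and the placement of a 32-byte digest field at the end are all assumptions; only the magic number bytes and the list of fields come from the text.

```python
import struct
from collections import namedtuple

MAGIC = b"\xffCOM"  # 0xFF, 'C', 'O', 'M', the example magic number in the text

# Assumed layout: magic, header size, file size, architecture, target bits,
# interface count/offset, implementation count/offset, string table offset,
# and a 32-byte digest field. "<" selects little-endian with no padding.
HEADER_FORMAT = "<4sIIHHIIIII32s"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

Header = namedtuple("Header", [
    "magic", "header_size", "file_size", "architecture", "target_bits",
    "interface_count", "interface_table_offset",
    "implementation_count", "implementation_table_offset",
    "string_table_offset", "md5_digest",
])


def pack_header(header):
    """Serialize a Header tuple into its on-disk byte form."""
    return struct.pack(HEADER_FORMAT, *header)


def unpack_header(data):
    """Parse the header at the start of 'data' and check the magic number."""
    header = Header._make(struct.unpack_from(HEADER_FORMAT, data))
    if header.magic != MAGIC:
        raise ValueError("not a component file: bad magic number")
    return header


# Arbitrary example values; the architecture code 3 is a placeholder.
example = Header(MAGIC, HEADER_SIZE, 4096, 3, 32,
                 2, HEADER_SIZE, 2, 200, 3000, b"0" * 32)
round_trip = unpack_header(pack_header(example))
```

Storing the header's own size as a field lets a reader skip over header fields it does not know about, which is the extension mechanism the text mentions.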
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CNB2004100091079A (CN1306400C) | 2004-05-20 | 2004-05-20 | Binary system software member and its manufacturing method |
| Publication Number | Publication Date |
|---|---|
| CN1581084A | 2005-02-16 |
| CN1306400C | 2007-03-21 |
| Code | Title | Description |
|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| C10 | Entry into substantive examination | |
| SE01 | Entry into force of request for substantive examination | |
| C14 | Grant of patent or utility model | |
| GR01 | Patent grant | |
| C17 | Cessation of patent right | |
| CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20070321; termination date: 20130520 |