Summary of the invention
In view of this, the present invention provides a method and a device for converting a file format, the main purpose of which is to solve the problem of deploying the Office component interface during file format conversion.
To solve the above problem, the present invention mainly provides the following technical solutions:
In one aspect, an embodiment of the present invention provides a method for converting a file format, comprising:
Obtaining a text document to be converted;
Converting the text document to be converted into a web page file by calling a document tool set;
Processing the web page file to convert the web page file into a file of a preset format.
Further, after obtaining the text document to be converted, the method further comprises:
Detecting the version of the text document to be converted;
If the version of the text document is lower than a preset version, converting the text document into a text document of the preset version;
Moreover, when the text document to be converted is converted into a web page file, the conversion is performed on the text document of the preset version.
Further, converting the text document to be converted into a web page file by calling the document tool set comprises:
Calling a document conversion instruction from the document tool set, and generating a file in a binary document format based on the text document to be converted;
Reading the table data in the file in the binary document format, and writing the table data that has been read into the web page file.
Further, processing the web page file to convert the web page file into a file of a preset format comprises:
Adjusting the page parameters of the web page document through a constructor function;
Associating the parameter-adjusted web page document with a Writer object, and storing the parameter-adjusted web page document as a file of the preset format.
Further, the document tool set is implemented by an executable program that calls a document component interface.
Further, the file of the preset format is in Portable Document Format or a picture format.
To achieve the above object, according to another aspect of the present invention, a storage medium is provided. The storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the file format conversion method described above.
To achieve the above object, according to another aspect of the present invention, a processor is provided. The processor is configured to run a program, wherein the program, when running, executes the file format conversion method described above.
In another aspect, an embodiment of the present invention further provides a device for converting a file format, comprising:
An acquiring unit, configured to obtain a text document to be converted;
A first converting unit, configured to convert the text document to be converted into a web page file by calling a document tool set;
A processing unit, configured to process the web page file and convert the web page file into a file of a preset format.
Further, the device further comprises:
A detection unit, configured to detect the version of the text document to be converted;
A second converting unit, configured to convert the text document to be converted into a text document of a preset version if the version of the text document is lower than the preset version;
Moreover, when the text document to be converted is converted into a web page file, the conversion is performed on the text document of the preset version.
Further, the first converting unit comprises:
A calling module, configured to call a document conversion instruction from the document tool set and generate a file in a binary document format based on the text document to be converted;
A reading module, configured to read the table data in the file in the binary document format and write the table data that has been read into the web page file.
Further, the processing unit comprises:
An adjusting module, configured to adjust the page parameters of the web page document through a constructor function;
A storage module, configured to associate the parameter-adjusted web page document with a Writer object and store the parameter-adjusted web page document as a file of the preset format.
Further, the document tool set is implemented by an executable program that calls a document component interface.
Further, the file of the preset format is in Portable Document Format or a picture format.
Compared with the prior-art method, in which file format conversion can be performed only if the Office component is installed on a server running the Windows operating system, the embodiment of the present invention converts the text document to be converted into a web page file by calling a document tool set, uses the web page file as an intermediate file, and then converts the web page file into a file of the preset format. Because the web page format is fairly universal, both converting the original document into a web page file and converting the web page file into a file of another preset format are comparatively easy; it is only necessary to call a document tool set that is common to various operating systems (including the Windows operating system, the Linux operating system, and so on). This solves the problem of deploying the Office component for file format conversion. By running the executable program set in a compiler environment, file format conversion is no longer limited to the Windows operating system, which lowers the requirements that file conversion places on the system environment and on software deployment and makes file format conversion more convenient.
The above is only an overview of the technical solution of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the contents of the specification, and so that the above and other objects, features, and advantages of the present invention are more readily apparent, specific embodiments of the present invention are set out below.
Specific embodiment
Exemplary embodiments of the present invention are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present invention, it should be understood that the present invention may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present invention can be understood thoroughly and so that its scope can be fully conveyed to those skilled in the art.
An embodiment of the present invention provides a method for converting a file format. As shown in Figure 1, the method converts the text document to be converted into a web page file by calling a document tool set, and then processes the web page file to convert it into a file of a preset format. No Office component needs to be installed on the server, so file format conversion is not limited to the Windows operating system and other operating system environments can be supported. The embodiment comprises the following specific steps:
101. Obtain the text document to be converted.
The text document to be converted is a file that contains user data and is stored in a specific file format. Each class of document is identified by a specific file extension; for example, the extension of an Office document may be .doc, .xls, or .ppt. The embodiment of the present invention does not limit this.
In Web application development, the problem of file format conversion arises frequently, for example converting a Word 2007 file into a PDF file. At present the conversion is usually done with the interface functions that ship with Microsoft Office. Those interface functions work only on the Windows operating system, which restricts file format conversion. In the embodiment of the present invention, the text document to be converted may be saved on its own or may be contained in another file. For a file that contains multiple text documents, the text document to be converted can first be determined from among them and then converted.
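As an illustration of determining the text documents to be converted from a container that holds several entries, the following minimal Python sketch filters entry names by extension. The extension set and the function name `select_text_documents` are assumptions made for this example; the specification does not fix either.

```python
from pathlib import PurePosixPath

# Assumption: this set of extensions is illustrative only; the
# specification names .doc, .xls and .ppt as examples and sets no limit.
TEXT_DOCUMENT_EXTENSIONS = {".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"}


def select_text_documents(names):
    """Return the entry names whose extension marks a text document."""
    selected = []
    for name in names:
        suffix = PurePosixPath(name).suffix.lower()
        if suffix in TEXT_DOCUMENT_EXTENSIONS:
            selected.append(name)
    return selected
```

Entries such as audio files or program files are simply skipped, so only candidate text documents reach the conversion steps.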
102. Convert the text document to be converted into a web page file by calling the document tool set.
The document tool set can be implemented by an executable program that calls the interface of the Open Xml Power Tools component. The Open Xml Power Tools component contains code and instructions for all kinds of common tasks that can be carried out with the Open XML SDK, such as converting DOCX to HTML/CSS, merging and splitting DOCX documents, and merging and splitting PPTX documents. It should be noted that the executable program may be a program run on a Linux system through the Mono compiler.
It can be understood that the Open Xml Power Tools component combines two technologies, PowerShell and Open XML, so that server-side document processing can be completed conveniently and efficiently. The Open Xml Power Tools component also ships with source code examples and related guidance that show developers how to create PowerShell command lines to operate on all kinds of Open XML documents.
In prior-art solutions, on a Windows server where the Office component has not been deployed, or on a server that does not run the Windows operating system, the text document cannot be converted into a file of the preset format because the Office component is not available. By contrast, the embodiment of the present invention converts the text document to be converted into a web page file by calling the document tool set and uses the web page file as an intermediate file. Because the web page format is fairly universal, converting the original document into a web page file is comparatively easy: it is only necessary to call a document tool set that is common to various operating systems (including the Windows operating system, the Linux operating system, and so on), without relying solely on the Office component.
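The cross-platform launch of the executable program can be sketched as follows. This is a minimal Python sketch under stated assumptions: the executable name `DocToHtml.exe` and its argument order are hypothetical, and the only platform difference modeled is that a .NET executable is started through the `mono` command on non-Windows systems.

```python
import sys


def build_conversion_command(executable, source_path, output_path,
                             platform=sys.platform):
    """Build the argument list that launches the conversion executable.

    Assumption: the executable takes the source path and the output path
    as its two positional arguments (hypothetical interface).
    """
    command = [executable, source_path, output_path]
    # On Windows the .NET executable starts directly; elsewhere it is
    # launched through the Mono runtime.
    if not platform.startswith("win"):
        command = ["mono"] + command
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)`, so the same calling code works on a Windows server and on a Linux server.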
103. Process the web page file to convert the web page file into a file of the preset format.
Because the web page file is only an intermediate file and not the file of the preset format that the user needs, the embodiment of the present invention further processes the web page file to convert it into a file of the preset format. The file of the preset format may be in Portable Document Format, with the extension .pdf, or in a picture format, with an extension such as .jpg or .jpeg. The embodiment of the present invention does not limit the preset format.
In the embodiment of the present invention, this step is implemented specifically by an executable program that calls the interface of the iTextSharp component. The iTextSharp component is a dynamic library of tools for working with Portable Document Format files on the C# platform, and it can convert the web page file into a file of the preset format conveniently and efficiently.
With the file format conversion method provided by the embodiment of the present invention, the text document to be converted is converted into a web page file by calling the document tool set, and the web page file serves as the intermediate file of the format conversion. No Office component needs to be installed on the server; processing the web page file converts it directly into a file of the preset format, which makes conversion between different file formats convenient. Compared with the prior-art method of file format conversion, which requires the Office component to be installed on the server, the embodiment of the present invention converts the text document to be converted into a web page file by calling the document tool set, uses the web page file as an intermediate file, and then converts the web page file into a file of the preset format. Because the web page format is fairly universal, both converting the original document into a web page file and converting the web page file into a file of another preset format are comparatively easy; it is only necessary to call a document tool set that is common to various operating systems (including the Windows operating system, the Linux operating system, and so on). This solves the problem of deploying the Office component for file format conversion. By running the executable program set in a compiler environment, file format conversion is no longer limited to the Windows operating system, which lowers the requirements that file conversion places on the system environment and on software deployment and makes file format conversion more convenient.
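Steps 101 to 103 can be sketched as a two-stage pipeline with the web page as the intermediate result. The Python sketch below is illustrative only: `convert_file`, `to_web_page`, and `renderers` are names invented for this example, and the actual converters (the document tool set and the component that writes the preset format) are passed in as callables rather than implemented here.

```python
def convert_file(source_path, target_format, to_web_page, renderers):
    """Convert a text document to a preset format via a web page.

    to_web_page: callable mapping a source path to web page content.
    renderers:   dict mapping a preset format name (for example "pdf")
                 to a callable that renders web page content.
    """
    if target_format not in renderers:
        raise ValueError("unsupported preset format: " + target_format)
    web_page = to_web_page(source_path)        # step 102: intermediate file
    return renderers[target_format](web_page)  # step 103: preset format
```

Because both stages are injected, the same pipeline can serve a PDF renderer and a picture renderer without change.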
To explain the file format conversion method proposed by the present invention in more detail, in particular the steps of converting the text document to be converted into a web page file by calling the document tool set and of converting the web page file into a file of the preset format, an embodiment of the present invention further provides another method for converting a file format. As shown in Figure 2, the method comprises the following specific steps:
201. Obtain the text document to be converted.
The text document to be converted is a file that contains user data and is stored in a specific file format. Each class of document is identified by a specific file extension; for example, the extension of an Office document may be .doc, .xls, or .ppt. The embodiment of the present invention does not limit this.
It should be noted that the file format conversion of the embodiment of the present invention is not suitable for every kind of file; for example, it does not apply to the format conversion of software program files or audio and video files.
202. Detect the version of the text document to be converted.
Text documents of different versions differ in compatibility. A text document of a higher version has new functions and version styles, and the new version styles are easier to parse on every platform, so a text document of a higher version has better compatibility and is, of course, also compatible with text documents of lower versions.
In the embodiment of the present invention, detecting the version of the text document to be converted reveals the compatibility of the text document, which improves the pass rate of the subsequent text document conversion.
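The specification does not say how the version is detected. One coarse approximation, shown here as a Python sketch, is to inspect the leading bytes of the file: legacy binary Office files use the OLE2 compound file signature, while Open XML files are ZIP containers. This only separates the two container generations; it does not tell Word 2007 from Word 2010, and a ZIP signature alone does not prove that the file is an Office document. The function name `detect_container` is an assumption.

```python
# Signature of OLE2 compound files (legacy .doc/.xls/.ppt).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Local file header signature of a ZIP archive (.docx/.xlsx/.pptx).
ZIP_MAGIC = b"PK\x03\x04"


def detect_container(header):
    """Classify a file by its first bytes."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "ooxml-zip"
    return "unknown"
```

In practice the header would be read with `open(path, "rb").read(8)` before deciding whether an upgrade step is needed.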
203. If the version of the text document is lower than the preset version, convert the text document into a text document of the preset version.
The functions and version styles of a lower-version text document may have certain limitations in practical applications. These limitations affect the accuracy of the text document during file format conversion, so that the converted file may show problems such as garbled text.
For example, the text document to be converted is word2007.doc, and its version is detected as word2007. The preset version is word2010. Because the version word2007 of the text document to be converted is lower than the preset version word2010, the text document is converted from version word2007 to the preset version word2010.
In the embodiment of the present invention, the executable program can specifically call the file conversion command in the doc2X tool to convert the text document to be converted into a text document of the preset version. It should be noted that the executable program may be a program run on a Linux system through the Mono compiler.
When the version of the text document is lower than the preset version, the embodiment of the present invention converts the text document to be converted into the preset version. The preset version here is one whose functions and version styles can be read and parsed conveniently, which keeps the version styles of the text document accurate during file format conversion and avoids garbled text.
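The comparison against the preset version in step 203 can be sketched as follows. This is a minimal Python sketch: the ordered list `VERSION_ORDER` and the names `needs_upgrade` and `prepare_document` are assumptions for illustration, and the actual upgrade (for example, a call to the conversion command mentioned above) is passed in as a callable.

```python
# Assumption: an illustrative ordering of version labels, oldest first.
VERSION_ORDER = ["word2003", "word2007", "word2010", "word2013"]


def needs_upgrade(detected, preset):
    """Return True when the detected version is lower than the preset one."""
    return VERSION_ORDER.index(detected) < VERSION_ORDER.index(preset)


def prepare_document(path, detected, preset, upgrade):
    """Return the path of the document that later steps should convert.

    upgrade: callable (path, preset) -> path of the upgraded document.
    """
    if needs_upgrade(detected, preset):
        return upgrade(path, preset)
    return path
```

With the example from the text, a word2007 document and a word2010 preset lead to an upgrade, while a document already at the preset version is passed through unchanged.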
204. Call a document conversion instruction from the document tool set, and generate a file in a binary document format from the text document to be converted.
The document tool set can be implemented by an executable program that calls the interface of the Open Xml Power Tools component. The Open Xml Power Tools component contains code and instructions for all kinds of common tasks that can be carried out with the Open XML SDK, such as converting DOCX to HTML/CSS, merging and splitting DOCX documents, and merging and splitting PPTX documents.
In the embodiment of the present invention, when the text document to be converted is converted into a web page file, the conversion is performed on the text document of the preset version produced in step 203. After a document conversion instruction is called from the document tool set, the text document to be converted is turned into a file in a binary document format. This file amounts to a file that captures the internal structure of the text document to be converted; specifically, it may include the document elements in the text document, the definition of each element, and so on. The internal structure of the text document is thereby known, and the data in the text document can then be read according to that structure.
205. Read the table data in the file in the binary document format, and write the table data that has been read into the web page file.
In the embodiment of the present invention, the data table in the file in the binary document format can be read by way of a DOM tree. Each node object in the DOM tree corresponds to a different document element. Because the data table stores the internal structure of the text document to be converted, the table data in the file in the binary document format is read by traversing each node object in the DOM tree, and the table data that has been read is then written into the web page file.
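The DOM traversal in step 205 can be illustrated with a small Python sketch. It assumes the document structure is available as WordprocessingML XML (the markup inside a DOCX package), which is one possible form of the internal structure described above and is not confirmed by the specification. The function name `tables_to_html` is hypothetical, and nested tables are not handled.

```python
from html import escape
from xml.dom import minidom

# Namespace of WordprocessingML elements such as w:tbl, w:tr, w:tc, w:t.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def cell_text(cell):
    """Concatenate the text of every w:t node inside a table cell."""
    pieces = []
    for node in cell.getElementsByTagNameNS(W_NS, "t"):
        for child in node.childNodes:
            if child.nodeType == child.TEXT_NODE:
                pieces.append(child.data)
    return "".join(pieces)


def tables_to_html(document_xml):
    """Walk the DOM tree and emit each table as an HTML table."""
    dom = minidom.parseString(document_xml)
    parts = []
    for table in dom.getElementsByTagNameNS(W_NS, "tbl"):
        parts.append("<table>")
        for row in table.getElementsByTagNameNS(W_NS, "tr"):
            cells = row.getElementsByTagNameNS(W_NS, "tc")
            tds = "".join("<td>" + escape(cell_text(c)) + "</td>" for c in cells)
            parts.append("<tr>" + tds + "</tr>")
        parts.append("</table>")
    return "".join(parts)
```

Each node object visited corresponds to one document element, and the cell text is escaped before it is written into the web page so that characters such as `<` survive the round trip.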
206. Adjust the page parameters of the web page document through a constructor function.
The constructor function is used to adjust the page parameters of the web page document. Different constructor functions can adjust different page parameters, for example a constructor that adjusts the document page size, a constructor that adjusts the document page margins, and a constructor that adjusts the page background color attribute. The embodiment of the present invention does not limit this.
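A constructor that carries page parameters can be sketched in Python as a small data class whose values are applied to the web page as CSS. The class `PageSettings`, its default values (A4 size, 20 mm margins, white background), and the helper `apply_page_settings` are assumptions made for this example; the specification does not prescribe them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageSettings:
    """Page parameters set through the constructor (illustrative defaults)."""
    width_mm: float = 210.0
    height_mm: float = 297.0
    margin_mm: float = 20.0
    background: str = "#ffffff"

    def to_css(self):
        """Render the page parameters as CSS rules."""
        return (
            f"@page {{ size: {self.width_mm:g}mm {self.height_mm:g}mm; "
            f"margin: {self.margin_mm:g}mm; }} "
            f"body {{ background-color: {self.background}; }}"
        )


def apply_page_settings(html, settings):
    """Insert the page settings into the web page as a style element."""
    style = "<style>" + settings.to_css() + "</style>"
    if "<head>" in html:
        return html.replace("<head>", "<head>" + style, 1)
    return style + html
```

For example, `PageSettings(margin_mm=10)` changes only the margins and leaves the page size and background color at their defaults.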
207. Associate the parameter-adjusted web page document with a Writer object, and store the parameter-adjusted web page document as a file of the preset format.
After the page parameters of the web page document have been adjusted, the embodiment of the present invention needs to associate the adjusted web page document with a Writer object. Specifically, one or more Writer objects can be created and associated with the web page document, and through the Writer object the parameter-adjusted web page document can be stored as a file of the preset format, such as Portable Document Format or a picture format.
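The association between a document and one or more writer objects can be sketched as follows. This Python sketch only illustrates the pattern: `Document` and `PlainTextWriter` are hypothetical classes, and the writer is a stand-in that serializes pages to a byte stream rather than a real PDF or picture encoder.

```python
import io


class PlainTextWriter:
    """Stand-in writer: stores each page as UTF-8 text ended by a form feed."""

    def __init__(self, stream):
        self.stream = stream

    def write_page(self, content):
        self.stream.write(content.encode("utf-8") + b"\f")


class Document:
    """Document that forwards every added page to its associated writers."""

    def __init__(self):
        self._writers = []
        self._is_open = False

    def attach(self, writer):
        self._writers.append(writer)

    def open(self):
        self._is_open = True

    def add_page(self, content):
        if not self._is_open:
            raise RuntimeError("document must be opened before pages are added")
        for writer in self._writers:
            writer.write_page(content)

    def close(self):
        self._is_open = False
```

Attaching two writers with different output streams to the same document yields two stored copies from a single pass over the content, which mirrors the "one or more Writer objects" wording above.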
With this other file format conversion method provided by the embodiment of the present invention, the text document to be converted is converted into a web page file by calling the document tool set, and the web page file serves as the intermediate file of the format conversion. No Office component needs to be installed on the server; processing the web page file converts it directly into a file of the preset format, which makes conversion between different file formats convenient. Compared with the prior-art method of file format conversion, which requires the Office component to be installed on the server, the embodiment of the present invention converts the text document to be converted into a web page file by calling the document tool set, uses the web page file as an intermediate file, and then converts the web page file into a file of the preset format. Because the web page format is fairly universal, both converting the original document into a web page file and converting the web page file into a file of another preset format are comparatively easy; it is only necessary to call a document tool set that is common to various operating systems (including the Windows operating system, the Linux operating system, and so on). This solves the problem of deploying the Office component for file format conversion. By running the executable program set in a compiler environment, file format conversion is no longer limited to the Windows operating system, which lowers the requirements that file conversion places on the system environment and on software deployment and makes file format conversion more convenient.
To achieve the above object, according to another aspect of the present invention, an embodiment of the present invention further provides a storage medium. The storage medium includes a stored program, wherein, when the program runs, the device on which the storage medium is located is controlled to execute the file format conversion method described above.
To achieve the above object, according to another aspect of the present invention, an embodiment of the present invention further provides a processor. The processor is configured to run a program, wherein the program, when running, executes the file format conversion method described above.
Further, as an implementation of the methods shown in Figures 1 and 2 above, another embodiment of the present invention provides a device for converting a file format. The device embodiment corresponds to the foregoing method embodiments. For ease of reading, the device embodiment does not repeat the details of the method embodiments one by one, but it should be clear that the device in this embodiment can correspondingly implement everything in the method embodiments. The device is used to solve the problem of deploying the Office component interface during file format conversion. Specifically, as shown in Figure 3, the device comprises:
An acquiring unit 31, which can be used to obtain the text document to be converted;
A first converting unit 32, which can be used to convert the text document to be converted into a web page file by calling a document tool set;
A processing unit 33, which can be used to process the web page file and convert the web page file into a file of a preset format.
With the file format conversion device provided by the embodiment of the present invention, the text document to be converted is converted into a web page file by calling the document tool set, and the web page file serves as the intermediate file of the format conversion. No Office component needs to be installed on the server; processing the web page file converts it directly into a file of the preset format, which makes conversion between different file formats convenient. Compared with the prior-art method of file format conversion, which requires the Office component to be installed on the server, the embodiment of the present invention converts the text document to be converted into a web page file by calling the document tool set and then processes the web page file directly to convert it into a file of the preset format. This solves the problem of deploying the Office component for file format conversion. By running the executable program set in a compiler environment, file format conversion is no longer limited to the Windows operating system, and other operating system environments can be supported.
Further, as shown in Figure 4, the device further comprises:
A detection unit 34, which can be used to detect the version of the text document to be converted;
A second converting unit 35, which can be used to convert the text document to be converted into a text document of a preset version if the version of the text document is lower than the preset version;
Moreover, when the text document to be converted is converted into a web page file, the conversion is performed on the text document of the preset version.
Further, the first converting unit 32 comprises:
A calling module 321, which can be used to call a document conversion instruction from the document tool set and generate a file in a binary document format based on the text document to be converted;
A reading module 322, which can be used to read the table data in the file in the binary document format and write the table data that has been read into the web page file.
Further, the processing unit 33 comprises:
An adjusting module 331, which can be used to adjust the page parameters of the web page document through a constructor function;
A storage module 332, which can be used to associate the parameter-adjusted web page document with a Writer object and store the parameter-adjusted web page document as a file of the preset format.
Further, the document tool set is implemented by an executable program that calls a document component interface.
Further, the file of the preset format is in Portable Document Format or a picture format.
With this other file format conversion device provided by the embodiment of the present invention, the text document to be converted is converted into a web page file by calling the document tool set, and the web page file serves as the intermediate file of the format conversion. No Office component needs to be installed on the server; processing the web page file converts it directly into a file of the preset format, which makes conversion between different file formats convenient. Compared with the prior-art method of file format conversion, which requires the Office component to be installed on the server, the embodiment of the present invention converts the text document to be converted into a web page file by calling the document tool set, uses the web page file as an intermediate file, and then converts the web page file into a file of the preset format. Because the web page format is fairly universal, both converting the original document into a web page file and converting the web page file into a file of another preset format are comparatively easy; it is only necessary to call a document tool set that is common to various operating systems (including the Windows operating system, the Linux operating system, and so on). This solves the problem of deploying the Office component for file format conversion. By running the executable program set in a compiler environment, file format conversion is no longer limited to the Windows operating system, which lowers the requirements that file conversion places on the system environment and on software deployment and makes file format conversion more convenient.
The file format conversion device includes a processor and a memory. The acquiring unit 31, the first converting unit 32, the processing unit 33, and so on are stored in the memory as program units, and the processor executes these program units stored in the memory to implement the corresponding functions.
The processor contains a kernel, and the kernel retrieves the corresponding program unit from the memory. One or more kernels may be provided, and adjusting the kernel parameters reduces the complexity of program development for developers.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). The memory includes at least one memory chip.
An embodiment of the present invention provides a storage medium on which a program is stored, and the program, when executed by a processor, implements the file format conversion method.
An embodiment of the present invention provides a processor configured to run a program, wherein the program, when running, executes the file format conversion method.
An embodiment of the present invention provides a device that includes a processor, a memory, and a program stored in the memory and executable on the processor. When executing the program, the processor implements the following steps:
A method for converting a file format, comprising: obtaining a text document to be converted;
Converting the text document to be converted into a web page file by calling a document tool set;
Processing the web page file to convert the web page file into a file of a preset format.
Further, after obtaining the text document to be converted, the method further comprises:
Detecting the version of the text document to be converted;
If the version of the text document is lower than a preset version, converting the text document into a text document of the preset version;
Moreover, when the text document to be converted is converted into a web page file, the conversion is performed on the text document of the preset version.
Further, converting the text document to be converted into a web page file by calling the document tool set comprises:
Calling a document conversion instruction from the document tool set, and generating a file in a binary document format based on the text document to be converted;
Reading the table data in the file in the binary document format, and writing the table data that has been read into the web page file.
Further, processing the web page file to convert the web page file into a file of a preset format comprises:
Adjusting the page parameters of the web page document through a constructor function;
Associating the parameter-adjusted web page document with a Writer object, and storing the parameter-adjusted web page document as a file of the preset format.
Further, the document tool set is implemented by an executable program that calls a document component interface.
Further, the file of the preset format is in Portable Document Format or a picture format.
The device here may be a server, a PC, a PAD, a mobile phone, or the like.
The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute program code initialized with the following method steps: obtaining a text document to be converted; converting the text document to be converted into a web page file by calling a document tool set; and processing the web page file to convert the web page file into a file of a preset format.
It should be understood by those skilled in the art that, embodiments herein can provide as method, system or computer programProduct.Therefore, complete hardware embodiment, complete software embodiment or reality combining software and hardware aspects can be used in the applicationApply the form of example.Moreover, it wherein includes the computer of computer usable program code that the application, which can be used in one or more,The computer program implemented in usable storage medium (including but not limited to magnetic disk storage, CD-ROM, optical memory etc.) producesThe form of product.
The application is referring to method, the process of equipment (system) and computer program product according to the embodiment of the present applicationFigure and/or block diagram describe.It should be understood that every one stream in flowchart and/or the block diagram can be realized by computer program instructionsThe combination of process and/or box in journey and/or box and flowchart and/or the block diagram.It can provide these computer programsInstruct the processor of general purpose computer, special purpose computer, Embedded Processor or other programmable data processing devices to produceA raw machine, so that being generated by the instruction that computer or the processor of other programmable data processing devices execute for realThe device for the function of being specified in present one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions, which may also be stored in, is able to guide computer or other programmable data processing devices with spyDetermine in the computer-readable memory that mode works, so that it includes referring to that instruction stored in the computer readable memory, which generates,Enable the manufacture of device, the command device realize in one box of one or more flows of the flowchart and/or block diagram orThe function of being specified in multiple boxes.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), an input/output interface, a network interface, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, which may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, such that a process, method, commodity, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements inherent to such process, method, commodity, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, commodity, or device that includes the element.
The above are merely embodiments of the present application and are not intended to limit the present application. Various modifications and variations of the present application are possible for those skilled in the art. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.