Background Art
With the rapid development of science and technology, more and more people have access to computers, which has driven the widespread adoption of computers and the rapid growth of network technology. The computer has gradually become an indispensable everyday tool that people use to carry out their routine work. Information and data have become the most important and most valuable resource of the age and are now integrated into every aspect of life, and people continually benefit from the fast and convenient services that the development of computing brings. At the same time, because personal and business data carry enormous commercial value, computer crime of every kind has become rampant and poses a serious threat to the property of the state and its citizens. The number of computer crime cases at home and abroad has increased markedly in recent years; as such cases have become widely known, the term "computer forensics" has also become familiar to more people.
The discovery of electronic evidence means collecting the most original evidence data through investigation and examination of the crime scene. Electronic evidence discovery is in fact a branch of investigative technique: combining general investigative techniques with computer technology yields computer forensics technology. Classified by the source of the electronic evidence, computer forensics technology can be divided into standalone-machine forensics, network forensics, and related-device forensics. Classified by the stage of the forensic process, it can be divided into evidence discovery, evidence preservation, evidence extraction, evidence analysis, and evidence presentation.
How to guarantee that the evidence collected by forensic personnel has not been modified is one of the difficulties of computer forensics and belongs to the field of electronic evidence preservation. The traditional method is to guarantee the authenticity and integrity of the electronic evidence obtained during crime-scene examination and investigation by means of digital signatures and witness signatures. At present, the United States has drawn up detailed standards for the software and hardware used to preserve evidence, has set very high requirements for forensic tools, and has laid down strict rules for the professional conduct of forensic personnel.
China currently has nearly 500 million Internet users, the most in the world. Because computers are so widespread in China, the forms of computer crime and the means used to commit it are becoming increasingly complex. Traditional forensic software and techniques can no longer meet the needs of modern computer crime investigation, and obtaining accurate and valid electronic evidence poses a huge challenge to forensic personnel. Domestic products backed by independently developed patents are rare, and most products are still imported, which also poses a certain threat to national security. In addition, existing forensic software is too narrow in function and poorly targeted, which restricts computer forensics work. Research into and exploration of computer forensics technology is therefore particularly important.
Summary of the invention
The object of the present invention is to provide a computer information forensics system with functions for extracting and preserving storage-medium images and for classifying and extracting evidence.
To achieve the above object, the present invention adopts the following technical solution:
A computer information forensics system, characterized in that the system comprises: 1) a physical disk image management subsystem; 2) a hard disk data archiving subsystem; 3) a document information extraction subsystem; 4) an on-demand sensitive information extraction subsystem; 5) a comprehensive query and association analysis subsystem.
The physical disk image management subsystem is based on a client/server (C/S) architecture; the server is deployed in a Linux environment and the client in a Windows environment.
The main functions of the server comprise receiving, storing, transmitting, and mounting image files. The main functions of the client comprise extracting, uploading, downloading, and writing images, managing image information, and classified storage of evidence files.
The image files are stored read-only in the disk storage array of the image server. To browse the files in a disk image, the image manager mounts the specified disk image on the server as a virtual disk and maps the virtual disk, as a network share, to the host at a specified IP address.
The hard disk data archiving subsystem receives the disk parameters passed from the image server, reads the files under the remotely mapped path, and stores them by category on a local file server according to those parameters and the classification rules.
The document information extraction subsystem comprises an indexing module and a search module. Disk file indexing tasks are obtained from tasklist, and manually imported file indexing tasks are obtained directly from filesys.
The on-demand sensitive information extraction subsystem comprises three modules: a mail parsing module, an instant messaging (IM) parsing module, and a URL parsing module. It extracts information by scanning the property field in filesys and checking the type field.
The comprehensive query and association analysis subsystem comprises an interface layout control module, a permission management module, a search module, and an Ajax interaction module.
In the interface layout control module, the user interface uses jquery-ui plug-ins; the administrator interface adopts the easy-ui layout and some of its plug-ins; tables are displayed with the jquery plug-in jqGrid; file upload uses a Flash plug-in; and the dynamic behavior of the whole interface is implemented with jquery.
The beneficial effects of the invention are:
(1) Evidence preservation is convenient: the evidence extraction client and the image server are connected over a network, so preservation is not constrained by location, and uploaded image files are not affected by the type of the original storage medium or by the time or place of extraction;
(2) All original image files are stored read-only on the image server, which guarantees the safety of image file storage; browsing and use by users are controlled through a permission management mechanism, which guarantees the security of the data images and the evidence files;
(3) Image files are stored in DD format, which corresponds exactly to the original storage medium and can be converted into image files of many other formats, making it easy to analyze them with other forensic software;
(4) The system automatically extracts files of forensic value from image files according to extraction rules and classification rules and filters out files of no forensic value, which reduces the workload of electronic evidence analysis and effectively improves its efficiency;
(5) The system stores files extracted at different times and from different image files together, by category, on the file server, providing a data foundation for users to manage and analyze electronic evidence from multiple data sources; every electronic data file extracted by the system can be traced to its source.
Embodiment
Embodiment 1:
1. Physical disk image management:
1) Selection of backup mode:
When a hard disk is imaged, there are two backup modes: bit-by-bit and file-by-file.
Bit-by-bit dumping (in practice, sector-by-sector) is independent of the file system. The target receives exactly what the source contains, and exactly as much of it. Even unpartitioned space, unrecognized partitions, and partitions with a faulty logical structure are backed up to the target device in full, logical errors included.
File-by-file dumping interprets the file system and then extracts only the files to the target device. The dumping program must be able to interpret the file system concerned and cannot extract non-file data. The data volume depends on the total size of the files on the source device: the more files there are, the more data must be transferred. In addition, with file-by-file dumping the physical locations on source and target do not correspond one to one. It is equivalent to copying all files from the source disk to the target device, and the file organization, physical order, and space occupied on the target need not match those on the source disk. Free space on the disk is not transferred either.
Bit-by-bit dumping suits the most complete backups, such as backing up data before recovery, backing up non-Windows systems, or backing up media with special structures, but it requires more space. File-by-file dumping suits bulk file backup from a known file system.
Analysis of the backups generated from a USB flash drive shows that a DD-format image is a complete backup of the formatted disk partition, skipping the boot sectors and unpartitioned space. Opening the disk in WinHex shows that each existing partition is formed from a run of contiguous tracks on the disk. In other words, a DD-format image is a complete data backup of an existing partition from its start sector to its end sector.
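The bit-by-bit mode described above can be illustrated with a minimal Python sketch. It is not the system's implementation: the function name `image_by_sector`, the 512-byte sector size, and the SHA-256 checksum are assumptions made for illustration, and an in-memory buffer stands in for a real device. The point is that every byte is copied regardless of whether it belongs to a file.

```python
import hashlib
import io

SECTOR_SIZE = 512  # assumed sector size; real devices may use 4096


def image_by_sector(source, target, sectors_per_read=128):
    """Copy every byte from source to target; return (bytes_copied, sha256_hex)."""
    digest = hashlib.sha256()
    copied = 0
    chunk_size = SECTOR_SIZE * sectors_per_read
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        target.write(chunk)      # free space and non-file data are copied too
        digest.update(chunk)     # checksum lets the image be verified later
        copied += len(chunk)
    return copied, digest.hexdigest()


# Hypothetical 4-sector "disk": only the first sector holds data.
fake_disk = io.BytesIO(
    b"file data".ljust(SECTOR_SIZE, b"\x00") + b"\x00" * (SECTOR_SIZE * 3)
)
image = io.BytesIO()
copied, checksum = image_by_sector(fake_disk, image)
```

The image is the same size as the source even though three of the four sectors are empty, which is why this mode needs more space than file-by-file dumping.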
2) Disk reading
Detailed disk information is obtained through the Microsoft API; the core call is DeviceIoControl.
3) Communication mode
Communication within the system relies mainly on two mechanisms: socket communication and database-mediated communication. Socket communication carries data between programs using a custom communication protocol. The second mechanism exchanges information by querying the database; its real-time performance is poor, but it offers high availability and strong fault tolerance. Task management therefore uses this mechanism for task progress control. Even if a task is interrupted during execution under exceptional circumstances, the task progress data can be retrieved after the process restarts and the task restored to its latest state.
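The database-mediated progress tracking described above might look like the following Python sketch. The table name `task_progress`, its columns, and the helper functions are hypothetical, and SQLite stands in for whatever database the system actually uses. The sketch simulates an interruption and then resumes from the last recorded progress.

```python
import sqlite3


def open_task_db(path=":memory:"):
    """Open the database and make sure the (hypothetical) progress table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS task_progress ("
        "task_id TEXT PRIMARY KEY, bytes_done INTEGER NOT NULL)"
    )
    return conn


def save_progress(conn, task_id, bytes_done):
    conn.execute(
        "INSERT OR REPLACE INTO task_progress (task_id, bytes_done) VALUES (?, ?)",
        (task_id, bytes_done),
    )
    conn.commit()


def load_progress(conn, task_id):
    row = conn.execute(
        "SELECT bytes_done FROM task_progress WHERE task_id = ?", (task_id,)
    ).fetchone()
    return row[0] if row else 0


def run_task(conn, task_id, total, step, fail_at=None):
    """Work from the last recorded offset; optionally simulate an interruption."""
    done = load_progress(conn, task_id)
    while done < total:
        if fail_at is not None and done >= fail_at:
            raise RuntimeError("simulated interruption")
        done = min(done + step, total)
        save_progress(conn, task_id, done)
    return done


conn = open_task_db()
try:
    run_task(conn, "image-001", total=100, step=10, fail_at=30)
except RuntimeError:
    pass  # the process "crashes" here
resumed_from = load_progress(conn, "image-001")            # state survives the crash
final = run_task(conn, "image-001", total=100, step=10)    # resumes, does not restart
```

Because progress lives in the database rather than in process memory, any restarted worker can pick the task up, at the cost of polling latency.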
2. Hard disk data archiving:
1) List of common media types
IANA maintains registries of media types and character encodings. These registries are publicly available on the Internet.
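As an illustration of classifying files by media type, the sketch below uses Python's standard `mimetypes` module, which maps file extensions to registered media types. The patent does not state the classification rules, so grouping by the top-level type (`image`, `text`, `application`, and so on) and the function names are assumptions.

```python
import mimetypes
from collections import defaultdict


def classify(filename):
    """Return the top-level media type for a file name, or 'unknown'."""
    media_type, _encoding = mimetypes.guess_type(filename)
    if media_type is None:
        return "unknown"
    return media_type.split("/", 1)[0]


def group_by_category(filenames):
    """Group file names into categories, as an archiving step might do."""
    groups = defaultdict(list)
    for name in filenames:
        groups[classify(name)].append(name)
    return dict(groups)


groups = group_by_category(
    ["photo.jpg", "notes.txt", "report.pdf", "blob.zzzunknown"]
)
```

Extension-based detection is only a first pass; a forensic tool would normally also inspect file signatures, since extensions are easy to change.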
2) Database connection pool
The system accesses the database from the Java language through JDBC. JDBC is an open solution that provides database application developers and database front-end tool developers with a standard application programming interface, so that developers can write complete database applications in pure Java.
A Java application accesses a database as follows:
1. load the database driver; 2. establish a database connection through JDBC; 3. access the database and execute SQL statements; 4. close the database connection.
Establishing a database connection pool together with a connection usage management policy allows connections to be reused efficiently and safely, which avoids the overhead of frequently opening and closing database connections.
A resource pool solves the problems caused by frequently allocating and releasing resources. Applied to database connection management, the pattern means establishing a database connection pool and providing an efficient policy for allocating and using connections, with the ultimate goal of efficient and safe connection reuse.
At initialization, the connection pool creates a number of database connections and places them in the pool; this number is set by the minimum connection count. Whether or not these connections are in use, the pool guarantees that at least this many connections always exist. The maximum connection count limits the number of connections the pool can hold; when an application requests more connections than the maximum, the excess requests are added to a waiting queue. The following factors should be considered when setting the minimum and maximum connection counts:
1. Minimum connection count
This is the number of connections the pool always keeps open, so if the application makes little use of database connections, a large amount of connection resources is wasted;
2. Maximum connection count
This is the largest number of connections the pool can request. If connection requests exceed this number, later requests are added to the waiting queue, which affects subsequent database operations;
3. A large gap between the minimum and maximum connection counts
The earliest connection requests benefit, while each later request beyond the minimum connection count amounts to establishing a new database connection. However, connections beyond the minimum are not released immediately after use; they are returned to the pool to await reuse, or released once their idle timeout expires.
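The minimum/maximum behavior described above can be sketched in Python as follows. This is an illustrative stand-in for a JDBC pool, not the system's code: the class `ConnectionPool`, its parameters, and the use of SQLite are assumptions, and the idle-timeout release of surplus connections is deliberately left out to keep the sketch short.

```python
import queue
import sqlite3
import threading


class ConnectionPool:
    """Pool that pre-creates min_size connections and never exceeds max_size."""

    def __init__(self, connect, min_size, max_size):
        if not 0 <= min_size <= max_size:
            raise ValueError("require 0 <= min_size <= max_size")
        self._connect = connect
        self._max_size = max_size
        self._idle = queue.Queue()
        self._lock = threading.Lock()
        self.created = 0
        for _ in range(min_size):
            self._idle.put(self._open())

    def _open(self):
        self.created += 1
        return self._connect()

    def acquire(self, timeout=None):
        """Reuse an idle connection, grow the pool, or wait for a release."""
        try:
            return self._idle.get_nowait()
        except queue.Empty:
            pass
        with self._lock:
            if self.created < self._max_size:
                return self._open()
        # Pool is at its maximum: the request waits, as in the waiting queue.
        return self._idle.get(timeout=timeout)

    def release(self, conn):
        """Return a connection to the pool for reuse instead of closing it."""
        self._idle.put(conn)


pool = ConnectionPool(lambda: sqlite3.connect(":memory:"), min_size=1, max_size=2)
first = pool.acquire()    # served by the pre-created connection
second = pool.acquire()   # pool grows to its maximum
try:
    pool.acquire(timeout=0.05)
    timed_out = False
except queue.Empty:
    timed_out = True      # a third request has to wait
pool.release(first)
reused = pool.acquire()   # the released connection is handed out again
```

A production system would more likely rely on an established pooling library than on a hand-written pool.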
3. Document information extraction module:
1) The open-source tool Lucene is used to implement file indexing and querying, and third-party open-source tools such as pdfbox and poi are used to extract file content.
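The idea behind a full-text index can be shown with a toy inverted index in Python. This is not Lucene and does not use its API; the class `InvertedIndex` and the sample documents are hypothetical, and the sketch assumes the text has already been extracted from each file.

```python
import re
from collections import defaultdict


class InvertedIndex:
    """Map each token to the set of documents that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    @staticmethod
    def _tokens(text):
        return re.findall(r"\w+", text.lower())

    def add(self, doc_id, text):
        for token in self._tokens(text):
            self._postings[token].add(doc_id)

    def search(self, query):
        """Return the documents containing every token of the query."""
        tokens = self._tokens(query)
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result


index = InvertedIndex()
index.add("a.txt", "Quarterly budget report")
index.add("b.txt", "Budget meeting minutes")
index.add("c.txt", "Holiday photos")
hits = index.search("budget")
```

A real index such as Lucene adds analysis, ranking, and on-disk storage on top of this basic token-to-document mapping.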
2) Database connection pool
Same as in the hard disk data archiving module.
4. On-demand sensitive information extraction:
1) Database connection interface
Database information is stored in advance in an ini file. The database account, password, and connection string stored in the ini file are read with the GetPrivateProfileSectionNames() and GetPrivateProfileString() functions.
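Reading connection settings from an ini file can be sketched with Python's standard `configparser`, used here as a portable stand-in for the Windows functions named above. The section name `database`, the key names, and the sample values are all hypothetical; the patent does not give the file layout.

```python
import configparser

# Hypothetical file contents; the real section and key names are not given.
SAMPLE_INI = """
[database]
host = 127.0.0.1
port = 3306
user = forensics
password = change-me
name = evidence
"""


def load_db_settings(ini_text, section="database"):
    """Parse ini text and return the connection settings as a dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    if not parser.has_section(section):
        raise KeyError(f"missing [{section}] section")
    values = parser[section]
    return {
        "host": values.get("host", fallback="localhost"),
        "port": values.getint("port", fallback=3306),
        "user": values.get("user", fallback=""),
        "password": values.get("password", fallback=""),
        "name": values.get("name", fallback=""),
    }


settings = load_db_settings(SAMPLE_INI)
```

Keeping credentials in a plain ini file is convenient but exposes them to anyone who can read the file, so access to it should be restricted.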
2) Querying the database table filesys
As shown in Figure 2, a value of 1 in the property field of the filesys table indicates that the record is required for information extraction; the value of each field is then read from the record in preparation for the subsequent parsing of sensitive information files.
3) Querying the parsetable table
The parsetable data table is used by the data display component to view back-end progress and prepares data for the control end.
4) Dynamic module loading
The Dll.ini configuration file lists all supported DLL function plug-ins. The type field of each record in the filesys table that meets the sensitive information condition is matched against this list, and the corresponding DLL plug-in is loaded.
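The selection of records by the property field and the dispatch by the type field can be sketched as follows. Only the table name filesys and the fields property and type come from the text; the `id` and `path` columns, the type values, and the parser functions are hypothetical, and a Python dictionary stands in for the DLL plug-ins listed in Dll.ini.

```python
import sqlite3


def parse_mail(path):
    return ("mail", path)


def parse_im(path):
    return ("im", path)


def parse_url(path):
    return ("url", path)


# Stand-in for the plug-in list in Dll.ini: type value -> parser.
PARSERS = {"mail": parse_mail, "im": parse_im, "url": parse_url}


def extract_sensitive(conn):
    """Run the matching parser for every filesys record with property = 1."""
    rows = conn.execute(
        "SELECT path, type FROM filesys WHERE property = 1 ORDER BY id"
    ).fetchall()
    results, skipped = [], []
    for path, file_type in rows:
        parser = PARSERS.get(file_type)
        if parser is None:
            skipped.append(path)  # no plug-in registered for this type
            continue
        results.append(parser(path))
    return results, skipped


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE filesys (id INTEGER PRIMARY KEY, path TEXT, type TEXT, property INTEGER)"
)
conn.executemany(
    "INSERT INTO filesys (id, path, type, property) VALUES (?, ?, ?, ?)",
    [
        (1, "/evidence/inbox.eml", "mail", 1),
        (2, "/evidence/chat.db", "im", 1),
        (3, "/evidence/readme.txt", "doc", 0),
        (4, "/evidence/strange.bin", "unknown", 1),
    ],
)
results, skipped = extract_sensitive(conn)
```

Keeping the mapping in configuration rather than in code means a new file type can be supported by adding a plug-in without changing the dispatcher.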
5) Log interface
The open-source logging module log4cplus is used in the project to record the program's workflow and exception logs.
6) Information extraction
Extraction techniques are applied to information from instant messaging software, mail clients, browsers, and eml files.
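For eml files, the kind of field extraction involved can be illustrated with Python's standard `email` package. The sample message and the function `extract_mail_fields` are hypothetical, and the sketch covers only a plain-text message; it does not reflect how the system's mail parsing module is actually built.

```python
from email import message_from_string, policy

# Hypothetical message in eml (RFC 5322) form.
SAMPLE_EML = (
    "From: alice@example.com\r\n"
    "To: bob@example.com\r\n"
    "Subject: Meeting notes\r\n"
    "Content-Type: text/plain; charset=utf-8\r\n"
    "\r\n"
    "See you at ten.\r\n"
)


def extract_mail_fields(raw):
    """Return sender, recipient, subject, and plain-text body of an eml message."""
    msg = message_from_string(raw, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))
    return {
        "from": str(msg.get("From", "")),
        "to": str(msg.get("To", "")),
        "subject": str(msg.get("Subject", "")),
        "body": body.get_content().strip() if body is not None else "",
    }


fields = extract_mail_fields(SAMPLE_EML)
```

Attachments, HTML bodies, and encoded headers would need additional handling in a real parser.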
5. Comprehensive query and association analysis:
1) Interface layout control module:
The ordinary user interface uses common jquery-ui plug-ins with a custom-built layout; the administrator interface adopts the easy-ui layout and some of its plug-ins; tables are uniformly displayed with the third-party jquery plug-in jqGrid; file upload uses a Flash plug-in; and the dynamic behavior of the whole interface is implemented with jquery.
2) Permission management module:
Only users with the appropriate permission can extract and upload image files. Different users can browse and use different images and data files and have no access to other images or their files.
3) Front-end search module:
Search data is submitted via Ajax, and jquery renders the returned results; paging uses a third-party jquery plug-in.
4) Ajax interaction module:
Data tables are built with jquery-ajax and the jqgrid plug-in.
The back end uses Java web development technology and the hibernate database technology.