Summary of the Invention
An embodiment of the present invention provides a method for extracting scene information from film and television scripts, which can solve the problems existing in the prior art.
The present invention provides a method for extracting scene information from film and television scripts, the method comprising the following steps:
Step 1: establish basic-information dictionaries for the interior/exterior, time, and weather terms used in scene descriptions;
Step 2: read the complete script, delete blank lines, and remove the leading and trailing space characters from each line; scene content is then identified and extracted from the processed script content;
Step 3: read the first 500 characters of the script and determine whether its scene descriptions use the single-line description mode or the three-line description mode: if, in three consecutive lines of the script content, the first Chinese character of each line is one of the three characters 景 (scene), 时 (time), and 人 (characters), and each of the three characters occurs exactly once, the scenes use the three-line description mode; otherwise they use the single-line description mode;
Step 4: initialize the current line number N of the script to N = 1, and initialize an empty list L for storing the identified scene information in order of occurrence;
Step 5: determine whether N exceeds the total number of lines in the script; if so, perform step 9; otherwise read the content of the N-th line of the script as character string S;
Step 6: determine whether the first character of S is a Chinese or English numeric character; if so, the line is probably a scene description and step 7 is performed; otherwise step 8 is performed;
Step 7: according to the scene description mode obtained in step 3, apply the corresponding scene information identification and extraction method;
Step 8: according to the results of steps 6 and 7, set the position of the next line of the script to be read;
Step 9: list L now holds all scene information identified and extracted from the script; store list L in a file or database, which completes the identification and extraction of scene information.
The method for extracting scene information from film and television scripts in the embodiment of the present invention reads the script line by line, automatically identifies the lines that contain scene descriptions, and then uses dictionary-based matching to accurately extract basic scene information such as the scene number, location, interior/exterior description, time, weather, and main characters, thereby achieving the following effects:
A. Scene descriptions in the script are identified automatically.
B. The basic information in each scene description is extracted accurately.
C. The time spent on script scene analysis is reduced, and the efficiency and accuracy of scene analysis are improved.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
An embodiment of the present invention provides a method for extracting scene information from film and television scripts, the method comprising the following steps:
Step 1: establish basic-information dictionaries for the interior/exterior, time, and weather terms used in scene descriptions.
Step 2: read the complete script, delete blank lines, and remove the leading and trailing space characters from each line; scene content is then identified and extracted from the processed script content.
Step 3: read the first 500 characters of the script and determine whether its scene descriptions use the single-line description mode or the three-line description mode. If, in three consecutive lines of the script content, the first Chinese character of each line is one of the three characters 景 (scene), 时 (time), and 人 (characters), and each of the three characters occurs exactly once, the scenes use the three-line description mode; otherwise they use the single-line description mode.
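Steps 2 and 3 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the mode labels, the use of the basic CJK Unified Ideographs range to decide what counts as a Chinese character, and the choice to take the 500 characters from the raw text before cleaning are all assumptions made for illustration.

```python
import re
from typing import List, Optional

# Assumption: "Chinese character" means the basic CJK Unified Ideographs range.
CJK = re.compile(r"[\u4e00-\u9fff]")
MARKERS = {"景", "时", "人"}


def preprocess(text: str) -> List[str]:
    """Step 2: strip every line and drop the lines that are blank."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def first_chinese_char(line: str) -> Optional[str]:
    """Return the first Chinese character in the line, or None."""
    match = CJK.search(line)
    return match.group(0) if match else None


def detect_mode(text: str, head_chars: int = 500) -> str:
    """Step 3: return 'three_line' or 'single_line' (hypothetical labels)."""
    firsts = [first_chinese_char(line) for line in preprocess(text[:head_chars])]
    for i in range(len(firsts) - 2):
        # Three lines whose first Chinese characters are exactly 景, 时, 人,
        # each occurring once and in any order.
        if set(firsts[i:i + 3]) == MARKERS:
            return "three_line"
    return "single_line"
```

For example, a script whose opening contains the consecutive lines `景：咖啡馆`, `时：日 内`, `人：张三 李四` (an invented sample) is classified as `three_line`; a script without such a run of three lines falls back to `single_line`.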
Step 4: initialize the current line number N of the script to N = 1, and initialize an empty list L for storing the identified scene information in order of occurrence.
Step 5: determine whether N exceeds the total number of lines in the script; if so, perform step 9; otherwise read the content of the N-th line of the script as character string S.
Step 6: determine whether the first character of S is a Chinese or English numeric character; if so, the line is probably a scene description and step 7 is performed; otherwise step 8 is performed.
Step 7: according to the scene description mode obtained in step 3, apply the corresponding scene information identification and extraction method:
When the script uses the single-line scene description mode, the following extraction method is used:
A. Replace each comma, enumeration comma (、), full stop, colon, and TAB character in S with a space character, forming character string S0.
B. Split S0 on the space character into string array A.
C. Traverse array A, reading its elements one by one while recording the number T of basic scene information items identified, and perform the following operations:
C1. When an element of A has the form X场 ("scene X", where X is a Chinese or English numeral), or its first character is a Chinese or English numeral, the element describes the scene number and is handled as follows:
Case 1: the element has the form X场; X is taken as the scene number.
Case 2: the element is a numeral; the numeral is taken as the scene number.
Case 3: the element has the form numeral 1-numeral 2 (numeral 1 and numeral 2 joined by a Chinese or English minus sign); numeral 2 is taken as the scene number.
After the scene number has been identified, increase T by 1 and remove the element from array A.
C2. When an element of A appears in the interior/exterior dictionary of step 1, take the element as the interior/exterior information S1 of the scene; after it has been identified, increase T by 1 and remove the element from array A.
C3. When an element of A appears in the time dictionary of step 1, take the element as the time description S2 of the scene; after it has been identified, increase T by 1 and remove the element from array A.
C4. When an element of A appears in the weather dictionary of step 1, take the element as the weather description S3 of the scene; after it has been identified, increase T by 1 and remove the element from array A.
D. Join the elements remaining in array A with spaces to form a character string, take that string as the location information S4 of the scene, and increase T by 1.
E. When T is greater than or equal to 2, S is a scene description line: the contents of S1, S2, S3, and S4 are placed in an object having interior/exterior, time, weather, and location attributes, respectively, and the object is saved to the list L created in step 4. When T is less than 2, S is not a scene description line.
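Sub-steps A to E of the single-line method can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the dictionary contents are placeholders (the text does not list them), the separator set covers both ASCII and full-width punctuation, the numeral set and the optional 第 prefix are guesses, and the `scene_no` field is added for convenience even though the object described in sub-step E lists only four attributes.

```python
import re

# Placeholder dictionaries; the actual step-1 dictionaries are not given.
INT_EXT = {"内", "外", "内景", "外景"}
TIMES = {"日", "夜", "晨", "黄昏"}
WEATHER = {"晴", "雨", "雪", "阴"}

NUM = r"[0-9零〇一二三四五六七八九十百两]+"      # assumed set of numerals
RANGE_NO = re.compile(rf"^{NUM}[-－]({NUM})$")  # case 3: "n1-n2", keep n2
SCENE_NO = re.compile(rf"^第?({NUM})场$")       # case 1: "X场" (第 optional)
PLAIN_NO = re.compile(rf"^({NUM})$")            # case 2: a bare numeral


def scene_number(element):
    """Sub-step C1: return the scene number held by an element, or None."""
    for pattern in (RANGE_NO, SCENE_NO, PLAIN_NO):
        match = pattern.match(element)
        if match:
            return match.group(1)
    return None


def extract_single_line(s):
    """Return a dict of scene fields, or None if S is not a scene line."""
    s0 = re.sub(r"[,，、.。:：\t]", " ", s)   # A: separators become spaces
    elements = s0.split()                     # B: split on whitespace
    info = {"scene_no": None, "int_ext": None, "time": None,
            "weather": None, "location": None}
    t = 0                                     # number of items identified
    remaining = []
    for element in elements:                  # C: classify each element
        number = scene_number(element)
        if number is not None:
            info["scene_no"] = number         # C1
        elif element in INT_EXT:
            info["int_ext"] = element         # C2 (S1)
        elif element in TIMES:
            info["time"] = element            # C3 (S2)
        elif element in WEATHER:
            info["weather"] = element         # C4 (S3)
        else:
            remaining.append(element)         # left over for the location
            continue
        t += 1
    info["location"] = " ".join(remaining)    # D (S4); T is increased by 1
    t += 1
    return info if t >= 2 else None           # E: threshold on T
```

With these placeholder dictionaries, the invented line `3场，咖啡馆，日，内，晴` yields scene number `3`, location `咖啡馆`, time `日`, interior/exterior `内`, and weather `晴`, while a dialogue line such as `张三：你好` only reaches T = 1 and is rejected.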
When the script uses the three-line scene description mode, the following extraction method is used:
A. Determine whether the first Chinese character appearing in S is one of the three characters 景 (scene), 时 (time), and 人 (characters); if not, the current line is not a scene description and processing continues with step 8; otherwise continue with the subsequent processing.
B. Determine whether at least two more lines of script content follow the line containing S; if not, the current line is not a scene description and processing continues with step 8; otherwise continue with the subsequent processing.
C. Denote the line following S as S5 and the second line after S as S6. Determine whether the first Chinese character appearing in each of S, S5, and S6 is one of 景, 时, and 人, with each of the three characters occurring exactly once; if not, the current line is not a scene description and processing continues with step 8; otherwise continue with the subsequent processing.
D. Apply the following extraction method to each of S, S5, and S6:
Case 1: when the first Chinese character appearing in the string is 景, take the substring from just after that character to the end of the string; the resulting string is the location information P1 of the scene.
Case 2: when the first Chinese character appearing in the string is 人, take the substring from just after that character to the end of the string; the resulting string is the character information P2 of the scene.
Case 3: when the first Chinese character appearing in the string is 时, take the substring from just after that character to the end of the string, and process the resulting string as follows:
(1) Replace every character other than Chinese characters with a space and delete the leading and trailing spaces, forming character string S7.
(2) Split S7 on the space character into string array B.
(3) Traverse array B, reading its elements one by one, and perform the following identification:
When an element appears in the interior/exterior dictionary of step 1, take the element as the interior/exterior information P3 of the scene.
When an element appears in the time dictionary of step 1, take the element as the time description P4 of the scene.
E. Place the contents of P1, P2, P3, and P4 in an object having location, character, interior/exterior, and time attributes, respectively, and save the object to the list L created in step 4.
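The three-line method can be sketched in Python in the same spirit. The sketch rests on several assumptions: the dictionaries are placeholders, a Chinese character is taken to be any code point in the basic CJK Unified Ideographs range, lines are held in a 0-based list, and a leading colon or space after the marker character is trimmed from P1 and P2 (the text itself takes the substring as is).

```python
import re

# Placeholder dictionaries; the actual step-1 dictionaries are not given.
INT_EXT = {"内", "外", "内景", "外景"}
TIMES = {"日", "夜", "晨", "黄昏"}

CJK = re.compile(r"[\u4e00-\u9fff]")       # assumed notion of "Chinese character"
NON_CJK = re.compile(r"[^\u4e00-\u9fff]")


def split_marker(line):
    """Return (first Chinese character, text after it), or (None, '')."""
    match = CJK.search(line)
    if not match:
        return None, ""
    return match.group(0), line[match.end():]


def extract_three_line(lines, i):
    """Sub-steps A to E for the line at 0-based index i; dict or None."""
    if i + 2 >= len(lines):                          # B: two more lines needed
        return None
    parts = [split_marker(line) for line in lines[i:i + 3]]
    if {marker for marker, _ in parts} != {"景", "时", "人"}:
        return None                                  # A and C: each marker once
    info = {"location": None, "characters": None,
            "int_ext": None, "time": None}
    for marker, tail in parts:                       # D
        if marker == "景":
            # Assumption: trim the separator that follows the marker.
            info["location"] = tail.lstrip(":： \t")      # P1
        elif marker == "人":
            info["characters"] = tail.lstrip(":： \t")    # P2
        else:                                        # marker is 时
            s7 = NON_CJK.sub(" ", tail).strip()      # (1) keep Chinese only
            for element in s7.split():               # (2) and (3)
                if element in INT_EXT:
                    info["int_ext"] = element        # P3
                elif element in TIMES:
                    info["time"] = element           # P4
    return info                                      # E: caller appends to L
```

Because the check in sub-step C compares a set, the three marker lines may appear in any order, which matches the wording "each occurs exactly once".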
Step 8: according to the results of steps 6 and 7, set the position of the next line of the script to be read:
When step 6 determines that S is not a scene description, set N = N + 1 and jump to step 5.
When step 7 determines that S is a scene description in single-line mode, set N = N + 1 and jump to step 5.
When step 7 determines that S is a scene description in three-line mode, set N = N + 3 and jump to step 5.
When step 7 determines that S is not a scene description, set N = N + 1 and jump to step 5.
Step 9: list L now holds all scene information identified and extracted from the script; store list L in a file or database, which completes the identification and extraction of scene information.
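The control flow of steps 4 to 9 can be sketched as a single loop. In this hedged Python sketch the two extraction routines are passed in as callables (`single_line_fn` and `three_line_fn` are hypothetical names), the step-6 test is a replaceable predicate whose default reads "Chinese or English numeric character" as an ASCII digit or a common Chinese numeral, line indices are 0-based rather than starting from N = 1, and JSON is chosen arbitrarily as the file format for step 9.

```python
import json

CHINESE_NUMERALS = "零〇一二三四五六七八九十百两"   # assumed numeral set


def starts_with_numeral(s):
    """One possible reading of the step-6 test."""
    return bool(s) and (s[0].isdigit() or s[0] in CHINESE_NUMERALS)


def extract_scenes(lines, mode, single_line_fn, three_line_fn,
                   is_candidate=starts_with_numeral):
    """Steps 4 to 8: walk the cleaned lines and collect scene objects."""
    scenes = []                           # step 4: the list L
    n = 0                                 # step 4: 0-based counterpart of N
    while n < len(lines):                 # step 5: stop past the last line
        s = lines[n]
        advance = 1                       # step 8 default: N = N + 1
        if is_candidate(s):               # step 6
            if mode == "three_line":      # step 7, three-line mode
                info = three_line_fn(lines, n)
                if info is not None:
                    scenes.append(info)
                    advance = 3           # step 8: N = N + 3
            else:                         # step 7, single-line mode
                info = single_line_fn(s)
                if info is not None:
                    scenes.append(info)
        n += advance
    return scenes


def save_scenes(scenes, path):
    """Step 9: persist the list; a JSON file is only one possible store."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(scenes, handle, ensure_ascii=False, indent=2)
```

Passing the extractors and the step-6 predicate as parameters keeps the loop independent of how each description mode is parsed, so either routine can be swapped or tested in isolation.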
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and any combination of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce an apparatus for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus, the instruction apparatus implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.