Detailed Description
The present application is described in further detail below with reference to the drawings and embodiments. It is to be understood that the specific embodiments described herein merely illustrate the relevant invention and do not limit it. It should also be noted that, for convenience of description, only the portions related to the invention are shown in the drawings.
It should be noted that the embodiments of the present application, and the features of those embodiments, may be combined with each other where no conflict arises. The present application is described in detail below with reference to the attached drawings and in conjunction with the embodiments.
Fig. 1 illustrates an exemplary system architecture 100 to which embodiments of the method, apparatus, electronic device, and computer-readable storage medium for detecting an original text of the present application may be applied.
As shown in fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired communication links, wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104, for example to receive or send messages. Various applications for communicating information between the terminal devices and the server may be installed on the terminal devices 101, 102, 103 and the server 105, such as an original text detection application, a data transmission application, and an instant messaging application.
The terminal devices 101, 102, 103 and the server 105 may each be hardware or software. When the terminal devices 101, 102, 103 are hardware, they may be various electronic devices with display screens, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers; when the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules or as a single piece of software or software module, which is not specifically limited herein. When the server 105 is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server; when the server 105 is software, it may be implemented as multiple pieces of software or software modules or as a single piece of software or software module, which is not specifically limited herein.
The server 105 may provide various services through various built-in applications. Taking as an example an original text detection application that provides a service for detecting whether a text is an original text, the server 105 may achieve the following effects when running that application: first, it receives, via the network 104, a text to be detected sent by an author through the terminal devices 101, 102, 103; then, it extracts a topic and subject-predicate-object (SPO) triples from the text to be detected; finally, it calculates the degree of similarity between the topic and SPO triples of the text to be detected and those of a public text, and determines, based on the degree of similarity, whether the text to be detected is an original text. Further, the server 105 may return corresponding response information to the author according to the detection result.
It should be noted that the text to be detected may be acquired from the terminal devices 101, 102, 103 through the network 104, or may be stored locally on the server 105 in advance in various ways. Thus, when the server 105 detects that such data is already stored locally (e.g., an original text detection task left pending before processing starts), it may retrieve the data directly from local storage, in which case the exemplary system architecture 100 may omit the terminal devices 101, 102, 103 and the network 104.
The method for detecting an original text provided in the following embodiments of the present application is generally executed by the server 105, which has stronger computing power and more computing resources, so that the result is obtained as quickly as possible. Accordingly, the apparatus for detecting an original text is also generally provided in the server 105. However, when the terminal devices 101, 102, 103 also have computing power and computing resources that meet the requirements, they may complete the above operations otherwise performed by the server 105 through the original text detection application installed on them, and output the same result as the server 105. In particular, when several terminal devices with different computing capabilities are present at the same time and the original text detection application determines that the terminal device on which it runs has strong computing capability and ample spare computing resources, that terminal device may execute the above computation so as to relieve the computing pressure on the server 105; accordingly, the apparatus for detecting an original text may be provided in the terminal devices 101, 102, 103. In such a case, the exemplary system architecture 100 may omit the server 105 and the network 104.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring to fig. 2, fig. 2 is a flowchart of a method for detecting an original text according to an embodiment of the present application, where the process 200 includes the following steps:
Step 201: extracting a topic from the acquired text to be detected;
This step is intended to have the execution body of the method for detecting an original text (for example, the server 105 shown in fig. 1) extract a topic from the acquired text to be detected.
The text to be detected is a text that needs to undergo original text detection. It may be a text that an author submits to the execution body for detection, a text that another person submits on the author's behalf, or a designated text included in a detection instruction; this is not specifically limited here.
The text to be detected may be obtained in a variety of ways. For example, the execution body may obtain it from a local text storage path (texts to be detected may be distinguished from already-detected texts by a to-be-detected tag attached only to the former), may receive it over a network from an author using an authoring terminal (for example, a terminal device shown in fig. 1), or may obtain it from an output port of a text generation service, where the text generation service may be a service that automatically generates a complete text from given keywords.
The topic of the text to be detected is the core meaning expressed by its content, so that the content can be understood more accurately at the semantic level; the topic may include the intention of the text, the emotional tendency it expresses, the object it describes, and the like. Further, a text to be detected usually includes several chapters, each chapter several paragraphs, and each paragraph several sentences, which makes it difficult to obtain the complete topic of the text directly. The topics of each chapter, paragraph, and sentence may therefore be obtained first and then summarized according to actual requirements to obtain the complete topic of the text to be detected.
Step 202: extracting subject-predicate-object triples from the text to be detected;
On the basis of step 201, this step is intended to have the execution body extract subject-predicate-object triples from the text to be detected. A subject-predicate-object triple is also called an SPO triple and is usually extracted from a sentence, the smallest unit of an article. S abbreviates subject and refers to the subject of a sentence, that is, the initiator of the action; it is usually a noun or pronoun and is generally placed at the beginning of the sentence. P abbreviates predicate and refers to the predicate of a sentence, a verb in any of its tenses, usually immediately following the subject. O abbreviates object and refers to the object of a sentence, that is, the target of the action; it is usually a noun or pronoun and generally appears after the predicate verb.
The reason this step extracts subject-predicate-object triples rather than other sentence components (such as complements, attributives, and adverbials) is that the subject, predicate, and object can express the actual meaning accurately with the fewest components. In some special sentences, one or two of the three components may be missing; in that case, the missing components can be supplemented from the surrounding context so that the meaning of the sentence is determined accurately.
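As a minimal sketch of the idea above, an SPO triple can be modelled as a record whose components may be absent, together with a deliberately naive context rule that reuses the most recent known subject. The names `SPOTriple` and `fill_missing_subjects`, and the rule itself, are illustrative assumptions and are not taken from the present application.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass(frozen=True)
class SPOTriple:
    """One subject-predicate-object triple; a missing component is None."""
    subject: Optional[str]
    predicate: Optional[str]
    obj: Optional[str]


def fill_missing_subjects(triples: List[SPOTriple]) -> List[SPOTriple]:
    """Naive context rule (an assumption): reuse the most recent known subject."""
    filled = []
    last_subject = None
    for triple in triples:
        if triple.subject is None and last_subject is not None:
            triple = replace(triple, subject=last_subject)
        if triple.subject is not None:
            last_subject = triple.subject
        filled.append(triple)
    return filled


sentences = [
    SPOTriple("the server", "receives", "the text"),
    SPOTriple(None, "extracts", "the topic"),  # subject omitted in the sentence
]
completed = fill_missing_subjects(sentences)
```

A real system would resolve omitted components with proper coreference or dependency analysis; the carry-forward rule only shows where such a step would sit.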
Step 203: calculating the degree of similarity between the topic and subject-predicate-object triples of the text to be detected and those of a public text, and determining, based on the degree of similarity, whether the text to be detected is an original text.
On the basis of steps 201 and 202, this step is intended to have the execution body calculate the degree of similarity between the topic and SPO triples extracted from the text to be detected and the topic and SPO triples extracted from a public text, and then determine from the calculated degree of similarity whether the text to be detected is an original text.
It can be seen that, unlike prior-art approaches that calculate similarity only at the literal level, the comparison here is based on the topic extracted in step 201, which represents the core meaning, and the SPO triples extracted in step 202, which express the actual content of the text. It therefore compares the meaning and actual content of the two texts, which makes it easier to find "pseudo-original" texts that change the wording but not the substance.
Because the degree of similarity in this step is calculated from two parameters, the topic and the SPO triples, it may be calculated in various ways. For example, the two parameters may be treated as parallel influence factors and combined by weighted calculation, or one parameter may serve as the main influence factor while the other serves as an auxiliary factor that is consulted only when the main factor alone cannot guarantee an accurate result. This is not specifically limited herein.
In order to identify non-original texts that have undergone more complicated rewriting, the method for detecting an original text provided by this embodiment compares the topics and subject-predicate-object triples that express the content of the text to be detected and of the public text. This identifies more accurately whether the two texts are substantially the same or similar in content, so that the original text detection result is more accurate.
Referring to fig. 3, fig. 3 is a flowchart of another method for detecting an original text according to an embodiment of the present application, where the process 300 includes the following steps:
Step 301: acquiring a text to be detected;
Step 302: splitting the text to be detected into at least one paragraph, and splitting each paragraph into at least one sentence;
Step 303: extracting a core phrase from each paragraph and each sentence, and taking the core phrase as the topic of the corresponding paragraph or sentence;
For a text to be detected that is not divided into chapters, steps 302-303 first split the text into paragraphs and each paragraph into sentences, then extract from each paragraph and each sentence a core phrase expressing its core meaning, and finally take the extracted core phrase as the topic of the corresponding paragraph or sentence. Paragraph and sentence splitting may be implemented with a Chinese natural language processing tool that takes the structural characteristics of Chinese text into account. The core phrase may be extracted in various ways: for example, keywords obtained by a keyword extraction technique may be used as the core phrase, or the importance of each word in a sentence may be determined with TF-IDF (Term Frequency-Inverse Document Frequency) so that the core phrase is extracted more reliably.
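The splitting and TF-IDF scoring described for steps 302-303 can be sketched with the Python standard library alone. This is only an illustration under stated assumptions: paragraphs are assumed to be separated by blank lines, each sentence is treated as one "document" for the IDF term, a smoothed IDF variant is used, and the `\w+` tokenizer is a stand-in (real Chinese text would need a word segmenter). All function names are hypothetical.

```python
import math
import re
from collections import Counter
from typing import Dict, List


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines (assumed paragraph separator)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def split_sentences(paragraph: str) -> List[str]:
    """Split after Western or Chinese sentence-ending punctuation."""
    parts = re.split(r"(?<=[.!?。！？])\s*", paragraph)
    return [s.strip() for s in parts if s.strip()]


def tokenize(sentence: str) -> List[str]:
    """Placeholder tokenizer; not suitable for unsegmented Chinese."""
    return re.findall(r"\w+", sentence.lower())


def tfidf_keywords(sentences: List[str], top_k: int = 2) -> List[List[str]]:
    """Return the top_k highest-scoring TF-IDF words for each sentence."""
    docs = [tokenize(s) for s in sentences]
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    result = []
    for doc in docs:
        term_freq = Counter(doc)
        scores: Dict[str, float] = {
            word: (count / len(doc))
            * math.log((1 + n_docs) / (1 + doc_freq[word]))
            for word, count in term_freq.items()
        }
        # Sort by descending score, breaking ties alphabetically.
        ranked = sorted(scores, key=lambda w: (-scores[w], w))
        result.append(ranked[:top_k])
    return result


text = (
    "The server receives the text. The server extracts the topic.\n\n"
    "The topic is compared."
)
paragraphs = split_paragraphs(text)
sentences = split_sentences(paragraphs[0])
keywords = tfidf_keywords(sentences)
```

Words shared by every sentence (here "the" and "server") receive an IDF of zero under this smoothing, so the words that distinguish each sentence rise to the top.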
Further, when the complete topic of the text to be detected is needed, the topics extracted from the paragraphs and sentences may be statistically analyzed and summarized, and the topic of the whole text distilled from them. This process may also take into account the title of the text to be detected (including the main title and any subtitle). When the text to be detected clearly belongs to a larger text collection (for example, a chapter of a novel or of a book), its complete topic may also be determined by additionally considering the topics of the other parts of that collection and their contextual relationships, so as to improve accuracy.
Step 304: identifying entity texts in the text to be detected by using the entity recognition technology of a knowledge graph;
Step 305: extracting, by using the relation extraction technology of the knowledge graph, associated texts that have a subject-predicate-object relationship with each entity text;
Step 306: generating subject-predicate-object triples from the entity texts and the corresponding associated texts;
For step 202 in the process 200, steps 304-306 of this embodiment provide a specific implementation: first, knowledge-graph-based entity recognition and relation extraction are used to obtain each entity text contained in the text to be detected and the associated texts that have a subject-predicate-object relationship with it; then, SPO triples are generated from the entity texts and the corresponding associated texts. The knowledge graph is used because a knowledge graph that records the components of sentences, and which words generally serve as which components, is very helpful for this purpose, and because the correspondences between entities that the knowledge graph records as a network of nodes also help in extracting SPO triples.
Of course, when the above purpose cannot be achieved through the knowledge graph, other technologies capable of achieving similar effects may be adopted instead, and the selection may be flexible according to the actual application scenario.
Step 307: calculating the vector similarity, in the same vector description space, between the vectorized subject-predicate-object triples of the text to be detected and those of the public text;
A vectorized subject-predicate-object triple is the result of converting a subject-predicate-object triple from its conventional text form into vector form.
In this step, the execution body uses the vectorized subject-predicate-object triples to calculate, in the same vector description space, the similarity between the SPO triples of the text to be detected and those of the public text.
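One possible reading of step 307 is sketched below. The present application does not fix a particular similarity measure here, so cosine similarity is swapped in as an assumption; the triples are assumed to be already vectorized into a shared space, and the per-triple cutoff `min_similarity` and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def spo_similarity_rate(
    candidate: List[Vector],
    public: List[Vector],
    min_similarity: float = 0.6,
) -> float:
    """Fraction of candidate triple vectors whose best match among the
    public triple vectors exceeds min_similarity."""
    if not candidate or not public:
        return 0.0
    matched = sum(
        1
        for c in candidate
        if max(cosine_similarity(c, p) for p in public) > min_similarity
    )
    return matched / len(candidate)


candidate_vectors = [[1.0, 0.0], [0.0, 1.0]]
public_vectors = [[1.0, 0.1]]
rate = spo_similarity_rate(candidate_vectors, public_vectors)
```

Only the first candidate vector points in nearly the same direction as the public vector, so half of the candidate triples count as matched.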
Step 308: determining the number of similar topics between the text to be detected and the public text, and the number of those similar topics that have the same distribution;
In this step, the execution body uses the number of similar topics and the same-distribution number of similar topics to measure the topic similarity between the text to be detected and the public text. The same-distribution number is the number of similar topics located at the same position in both texts; it is intended to reflect, as far as possible, cases in which the two texts express the same viewpoint.
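The two counts used in step 308 can be illustrated as follows. The sketch assumes that each topic is stored with a position index (for example, a paragraph number) and that two topics are "similar" when their lower-cased phrases are identical; both are simplifying assumptions, and a real system would likely use a softer notion of topic similarity. The name `count_similar_topics` is hypothetical.

```python
from typing import List, Tuple

Topic = Tuple[int, str]  # (position index in the text, topic phrase)


def count_similar_topics(
    candidate: List[Topic], public: List[Topic]
) -> Tuple[int, int]:
    """Return (number of similar topics, number of those at the same position)."""
    public_phrases = {phrase.lower() for _, phrase in public}
    public_placed = {(pos, phrase.lower()) for pos, phrase in public}
    similar = 0
    same_position = 0
    for pos, phrase in candidate:
        key = phrase.lower()
        if key in public_phrases:
            similar += 1
            if (pos, key) in public_placed:
                same_position += 1
    return similar, same_position


candidate_topics = [(0, "original text"), (1, "knowledge graph"), (2, "weighting")]
public_topics = [(0, "Original Text"), (3, "knowledge graph")]
similar_count, same_position_count = count_similar_topics(
    candidate_topics, public_topics
)
```

Here two topics are shared, but only "original text" sits at the same position in both texts.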
Step 309: determining the degree of similarity between the topic and subject-predicate-object triples of the text to be detected and those of the public text, based on the number of similar topics, the same-distribution number of similar topics, and the vector similarity;
On the basis of steps 307 and 308, in this step the execution body determines the degree of similarity between the topic and SPO triples of the text to be detected and those of the public text from the number of similar topics, the same-distribution number of similar topics, and the vector similarity.
For the first half of step 203 in the process 200, this embodiment provides a specific implementation through steps 307 to 309: the SPO triples are converted into vector form and the similarity between the vectors is calculated; the topic similarity between the text to be detected and the public text is determined from the number of similar topics and the same-distribution number of similar topics; and finally the degree of similarity between the two texts is determined by combining the topic similarity and the SPO triple similarity, which helps make the result more accurate and comprehensive.
Step 310: determining, based on the degree of similarity, whether the text to be detected is an original text.
This step is the same as the second half of step 203 in the process 200 shown in fig. 2; for the shared content, please refer to the corresponding part of the previous embodiment, which is not repeated here.
On the basis of the previous embodiment, this embodiment provides, through steps 302 to 303, a specific implementation for extracting a topic from the text to be detected; through steps 304 to 306, a specific implementation for extracting SPO triples from the text to be detected; and through steps 307 to 309, a specific implementation for calculating the degree of similarity between the text to be detected and the public text from the topics and SPO triples.
On the basis of any of the above embodiments, in order to make the final determination of whether a text is an original text as accurate as possible, literal-level parameters such as text similarity, text repetition, and repetition rate may be combined with the topic similarity and SPO similarity provided by the above embodiments to assist the judgment. A specific implementation is described below taking text similarity as an example; other parameters may participate in the same manner:
Referring to fig. 4, fig. 4 is a flowchart of another method for detecting an original text according to an embodiment of the present application, where the process 400 includes the following steps:
Step 401: acquiring the text similarity between the text to be detected and the public text;
The text similarity refers to the proportion of identical wording between the two texts at the sentence level.
Step 402: acquiring a first weight and a second weight assigned in advance to the degree of similarity and to the text similarity, respectively;
The first weight is greater than the second weight, that is, the text similarity serves as an auxiliary judgment factor; for example, the first weight is 72% and the second weight is 28%.
Step 403: calculating a comprehensive similarity from the degree of similarity weighted by the first weight and the text similarity weighted by the second weight;
Step 404: determining that the text to be detected is a non-original text in response to the comprehensive similarity exceeding a preset threshold;
Step 405: determining that the text to be detected is an original text in response to the comprehensive similarity not exceeding the preset threshold.
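Steps 401 to 405 amount to a weighted sum followed by a threshold test, which can be sketched as below. The weights 0.72 and 0.28 are the example values given above; the threshold of 0.5 and the function names are assumptions made only for illustration.

```python
def comprehensive_similarity(
    similarity_degree: float,
    text_similarity: float,
    first_weight: float = 0.72,
    second_weight: float = 0.28,
) -> float:
    """Weighted sum of the semantic similarity degree and the literal text similarity."""
    if first_weight <= second_weight:
        raise ValueError("the first weight must be greater than the second weight")
    return first_weight * similarity_degree + second_weight * text_similarity


def is_original(
    similarity_degree: float,
    text_similarity: float,
    threshold: float = 0.5,  # hypothetical preset threshold
) -> bool:
    """Original only if the comprehensive similarity does not exceed the threshold."""
    combined = comprehensive_similarity(similarity_degree, text_similarity)
    return combined <= threshold


heavily_rewritten_copy = is_original(similarity_degree=0.9, text_similarity=0.2)
independent_text = is_original(similarity_degree=0.2, text_similarity=0.9)
```

Because the first weight dominates, a text that is semantically close to a public text is flagged even when its wording overlaps little, which is the case the method is meant to catch.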
To deepen understanding, the present application also provides a specific implementation in combination with a specific application scenario:
A high-quality original content community provides high-quality original content to its registered users, thereby creating a healthy ecosystem built on originality. The core of the community is an original text detection service implemented by a combination of manual and automatic strategies; the service is hosted on a server A of the community, which provides a communication port to the creators of the community.
1) The server A receives an article to be detected uploaded by a creator X;
2) the server A calls a built-in original text detection strategy to respectively extract topic information M from each paragraph and each sentence of the article to be detected, and extracts SPO information N from each sentence by means of a knowledge graph;
The topic information M comprises 100 topics and the positions at which they are distributed in the article, and the SPO information N comprises at least 70 different SPO triples.
3) By comparing the topic information M with the topic information Mo of public articles, the server A determines that the public article closest in topic to the article to be detected shares 14 similar topics with it, of which only 6 are distributed at the same positions, so that the topic similarity rate is only 14/70 = 1/5 = 20% (because 6/14 < 50%, the same-distribution number is not used as an influence factor here; if the proportion of same-distribution similar topics among the similar topics exceeded 50%, it would affect the calculation of the topic similarity rate, for example by adding a fixed 10% to the existing result);
4) The server A converts the SPO information N into vectors N1 in a preset vector description space and then calculates their similarity to the vectors N0 of the public text, using the vector distance as the similarity between two vectors. It determines that 52 of the 70 vectors in N1 have a similarity of more than 60% to an SPO triple vector of the public article, which gives an SPO triple similarity rate of 52/70 ≈ 74.3%;
5) The server A performs a weighted calculation on the SPO triple similarity rate, the topic similarity rate, and the conventional text repetition rate according to preset weights, specifically:
74.3% × 0.8 + 20% × 0.15 + 40% (text repetition rate) × 0.05 = 64.44%;
6) Because 64.44% > 45% (the preset similarity threshold for non-original text), the server A returns to the creator X, through the communication port, notification information stating that the article cannot be published as an original text.
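The arithmetic of items 3) to 6) can be checked with a short script. The counts, weights, and the 45% threshold are the ones stated in the scenario, and the denominator of 70 follows the 14/70 calculation in item 3). The helper `topic_similarity_rate`, including its 10% bonus rule, is a hypothetical encoding of the example rule mentioned in item 3). Evaluating the stated weighted formula with these inputs gives roughly 0.644.

```python
def topic_similarity_rate(
    similar: int, same_position: int, total: int, bonus: float = 0.10
) -> float:
    """Share of similar topics, plus a fixed bonus when more than half of
    the similar topics are also located at the same position."""
    rate = similar / total
    if similar and same_position / similar > 0.5:
        rate += bonus
    return rate


topic_rate = topic_similarity_rate(similar=14, same_position=6, total=70)
spo_rate = 52 / 70          # 52 of 70 triple vectors exceed the 60% cutoff
text_repetition = 0.40      # conventional text repetition rate

combined = 0.8 * spo_rate + 0.15 * topic_rate + 0.05 * text_repetition
non_original = combined > 0.45  # preset similarity threshold

print(f"topic rate: {topic_rate:.2%}")
print(f"SPO rate: {spo_rate:.1%}")
print(f"combined: {combined:.2%}")
print(f"non-original: {non_original}")
```

Since 6/14 is below one half, the bonus is not applied and the topic rate stays at 20%, matching the reasoning in item 3).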
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present application provides an embodiment of an apparatus for detecting an original text. This apparatus embodiment corresponds to the method embodiment shown in fig. 2, and the apparatus may be applied to various electronic devices.
As shown in fig. 5, the apparatus 500 for detecting an original text of this embodiment may include a topic extraction unit 501, a subject-predicate-object triple extraction unit 502, and an original text determination unit 503. The topic extraction unit 501 is configured to extract a topic from the acquired text to be detected; the subject-predicate-object triple extraction unit 502 is configured to extract subject-predicate-object triples from the text to be detected; and the original text determination unit 503 is configured to calculate the degree of similarity between the topic and subject-predicate-object triples of the text to be detected and those of a public text, and to determine, based on the degree of similarity, whether the text to be detected is an original text.
In this embodiment, for the specific processing performed by the topic extraction unit 501, the subject-predicate-object triple extraction unit 502, and the original text determination unit 503 of the apparatus 500, and for the technical effects thereof, reference may be made to the descriptions of steps 201-203 in the embodiment corresponding to fig. 2, which are not repeated here.
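The division of the apparatus into three units can be pictured as a small composition of callables. This is only a structural sketch: the class name `OriginalTextDetector`, the field names, and the stub functions passed in at the end are hypothetical placeholders, not the units' actual logic.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]


@dataclass
class OriginalTextDetector:
    # Each field plays the role of one unit of the apparatus.
    extract_topics: Callable[[str], List[str]]          # role of unit 501
    extract_triples: Callable[[str], List[Triple]]      # role of unit 502
    decide: Callable[[List[str], List[Triple]], bool]   # role of unit 503

    def detect(self, text: str) -> bool:
        """Run the three units in order; True means 'original'."""
        topics = self.extract_topics(text)
        triples = self.extract_triples(text)
        return self.decide(topics, triples)


# Stub units, for illustration only.
detector = OriginalTextDetector(
    extract_topics=lambda text: ["placeholder topic"],
    extract_triples=lambda text: [("server", "receives", "text")],
    decide=lambda topics, triples: len(triples) == 0,
)
result = detector.detect("some text to be detected")
```

Keeping the units as separate callables mirrors the description above, in which each unit can be refined independently in the optional implementations.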
In some optional implementations of this embodiment, the topic extraction unit 501 may be further configured to:
splitting the text to be detected into at least one paragraph, and splitting each paragraph into at least one sentence;
and extracting a core phrase from each paragraph and each sentence, and taking the core phrase as the topic of the corresponding paragraph or sentence.
In some optional implementations of this embodiment, the subject-predicate-object triple extraction unit 502 may be further configured to:
identifying entity texts in the text to be detected by using the entity recognition technology of a knowledge graph;
extracting, by using the relation extraction technology of the knowledge graph, associated texts that have a subject-predicate-object relationship with each entity text;
and generating subject-predicate-object triples from the entity texts and the corresponding associated texts.
In some optional implementations of this embodiment, the original text determination unit 503 may include a similarity degree calculating subunit configured to calculate the degree of similarity between the topic and subject-predicate-object triples of the text to be detected and those of the public text, and the similarity degree calculating subunit may be further configured to:
calculating the vector similarity, in the same vector description space, between the vectorized subject-predicate-object triples of the text to be detected and those of the public text, where a vectorized subject-predicate-object triple is the result of converting a text-form subject-predicate-object triple into vector form;
determining the number of similar topics between the text to be detected and the public text, and the same-distribution number of similar topics;
and determining the degree of similarity between the topic and subject-predicate-object triples of the text to be detected and those of the public text based on the number of similar topics, the same-distribution number of similar topics, and the vector similarity.
In some optional implementations of this embodiment, the original text determination unit 503 may include an original text determining subunit configured to determine, based on the degree of similarity, whether the text to be detected is an original text, and the original text determining subunit may include:
the text similarity obtaining module is configured to obtain the text similarity between the text to be detected and the public text;
and the comprehensive determining module is configured to determine whether the text to be detected is the original text or not based on the similarity degree and the text similarity.
In some optional implementations of this embodiment, the comprehensive determination module may be further configured to:
respectively acquiring a first weight and a second weight which are distributed for the similarity degree and the text similarity in advance; wherein, the first weight is larger than the second weight;
calculating to obtain comprehensive similarity according to the similarity weighted by the first weight and the text similarity weighted by the second weight;
determining that the text to be detected is a non-original text in response to the comprehensive similarity exceeding a preset threshold;
and determining the text to be detected as the original text in response to the comprehensive similarity not exceeding the preset threshold.
In order to identify non-original texts that have undergone more complicated rewriting, the apparatus for detecting an original text provided by this embodiment compares the topics and subject-predicate-object triples that express the content of the text to be detected and of the public text. This identifies more accurately whether the two texts are substantially the same or similar in content, so that the original text detection result is more accurate.
According to an embodiment of the present application, an electronic device and a computer-readable storage medium are also provided.
Fig. 6 shows a block diagram of an electronic device suitable for implementing the method for detecting an original text of the embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 6, the electronic device includes one or more processors 601, a memory 602, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used together with multiple memories, as desired. Likewise, multiple electronic devices may be connected, with each device providing part of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 6, one processor 601 is taken as an example.
Thememory 602 is a non-transitory computer readable storage medium as provided herein. Wherein the memory stores instructions executable by the at least one processor to cause the at least one processor to perform the method for detecting original text provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the method for detecting original text provided by the present application.
Thememory 602, as a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the method for detecting an original text in the embodiment of the present application (for example, thesubject extraction unit 501, the predicatetriple extraction unit 502, and the originaltext determination unit 503 shown in fig. 5). Theprocessor 601 executes various functional applications of the server and data processing by executing non-transitory software programs, instructions, and modules stored in thememory 602, that is, implements the method for detecting an original text in the above method embodiment.
The memory 602 may include a storage program area and a storage data area, wherein the storage program area may store an operating system and an application program required for at least one function, and the storage data area may store various types of data created by the electronic device in performing the method for detecting an original text, and the like. Further, the memory 602 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 602 optionally includes memory remotely located from the processor 601, and these remote memories may be connected over a network to the electronic device adapted to perform the method for detecting an original text. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device adapted to perform the method for detecting an original text may further include: an input device 603 and an output device 604. The processor 601, the memory 602, the input device 603, and the output device 604 may be connected by a bus or other means, and fig. 6 illustrates the connection by a bus as an example.
The input device 603 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device adapted to perform the method for detecting an original text. Examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, and the like. The output device 604 may include a display device, auxiliary lighting devices (e.g., LEDs), tactile feedback devices (e.g., vibrating motors), and the like. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, and which receives data and instructions from, and transmits data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computing system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and which addresses the drawbacks of difficult management and weak service scalability found in conventional physical host and Virtual Private Server (VPS) services.
In order to identify non-original text that has undergone more complicated rewriting operations, the embodiment of the present application compares the subjects and the subject-predicate triples that express the content of the text to be detected and of the open text. In this way, whether the two texts are substantially equivalent or similar in content can be identified more accurately, so that the detection result for the original text is more accurate.
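The comparison described above may be illustrated by the following minimal sketch, which is given by way of example only and is not the implementation of the present application. It assumes that the subject and the triples of each text have already been extracted. The names TextSummary, jaccard, and is_original, the exact-match comparison of subjects, the Jaccard overlap measure, and the threshold of 0.5 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple

# Assumed layout of one triple: (subject, predicate, object).
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class TextSummary:
    """Hypothetical container for what was extracted from one text."""
    subject: str
    triples: FrozenSet[Triple]


def jaccard(a: FrozenSet[Triple], b: FrozenSet[Triple]) -> float:
    """Fraction of triples shared by both sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def is_original(candidate: TextSummary, open_text: TextSummary,
                threshold: float = 0.5) -> bool:
    """Return False when the candidate appears equivalent to the open text.

    Assumption: two texts are treated as equivalent when their subjects
    match exactly and their triple overlap reaches the illustrative threshold.
    """
    if candidate.subject != open_text.subject:
        return True
    return jaccard(candidate.triples, open_text.triples) < threshold


# Illustrative data only; none of it comes from the present application.
open_text = TextSummary(
    "electric cars",
    frozenset({("company", "released", "sedan"), ("sedan", "uses", "battery")}),
)
rewritten = TextSummary(
    "electric cars",
    frozenset({("company", "released", "sedan"), ("sedan", "uses", "battery"),
               ("battery", "lasts", "long")}),
)
unrelated = TextSummary("cooking", frozenset({("chef", "bakes", "bread")}))

print(is_original(rewritten, open_text))
print(is_original(unrelated, open_text))
```

In this sketch the rewritten text shares two of three distinct triples with the open text and has the same subject, so it is not treated as original, whereas the unrelated text has a different subject and is treated as original. In practice, subjects and triples that were reworded during rewriting would likely need a semantic rather than an exact comparison.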
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.