Disclosure of Invention
In order to overcome the defects of the prior art, a first objective of the invention is to provide an overseas trademark retrieval method that solves the problem that the traditional database query mode cannot meet the huge retrieval demand in the overseas trademark retrieval process.
A second objective of the invention is to provide a computer-readable storage medium that solves the problem that the traditional database query mode cannot meet the huge retrieval demand in the overseas trademark retrieval process.
A third objective of the invention is to provide a computer program product that solves the problem that the traditional database query mode cannot meet the huge retrieval demand in the overseas trademark retrieval process.
The first objective of the invention is achieved by the following technical scheme:
an overseas trademark retrieval method comprises the following steps:
receiving information to be retrieved: receiving trademark English keywords input by a user for overseas trademark retrieval and retrieval rules specified by the user;
retrieving overseas trademarks: screening out corresponding target trademark names from an operating memory database of a local server according to the trademark English keywords and the retrieval rules, and screening out corresponding target trademark associated information from a distributed database on a hard disk of the local server according to the target trademark names;
and outputting a retrieval result: outputting the target trademark names and the corresponding target trademark associated information to a user side.
Further, the retrieving of overseas trademarks specifically comprises the following substeps:
performing word segmentation processing: judging whether the trademark English keyword contains at least two key words; if so, performing word segmentation processing on the trademark English keyword to obtain a plurality of trademark English key words to be retrieved; if not, taking the trademark English keyword as the trademark English key word to be retrieved without word segmentation processing;
screening a target trademark name: screening, according to the retrieval rule, a sample trademark name corresponding to the trademark English key word to be retrieved from a plurality of sample trademark names stored in the operating memory database as the target trademark name, and obtaining a target trademark number corresponding to the target trademark name;
and screening target trademark associated information: screening the corresponding target trademark associated information from the distributed database on the hard disk of the local server according to the target trademark number.
Further, precise screening is performed before the target trademark associated information is screened. A first vector score corresponding to the target trademark name is calculated according to a first preset weight and a first occurrence frequency ratio corresponding to the target trademark name, the first occurrence frequency ratio being the ratio of the number of occurrences, in all sample trademark names in the operating memory database, of each English letter in the target trademark name to the sum of all English letters in the target trademark name. A second vector score corresponding to the trademark English keyword is calculated according to a second preset weight and a second occurrence frequency ratio corresponding to the trademark English keyword, the second occurrence frequency ratio being the ratio of the number of occurrences, in all sample trademark names in the operating memory database, of each letter in the trademark English keyword to the sum of all English letters in the target trademark name. A vector score difference between the first vector score and the second vector score is then calculated, and the target trademark names whose vector score difference is smaller than a preset vector score difference threshold are retained.
Further, if the number of target trademark names whose vector score difference is smaller than the preset vector score difference threshold is larger than ten, those target trademark names are sorted by vector score difference from small to large, and the first ten target trademark names in the sorting are retained.
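The threshold filter and the retention of the first ten names can be sketched as follows. This is a minimal illustration only: the function name `precise_screen`, its parameters, and the use of the absolute difference are assumptions, since the text speaks only of a "vector score difference".

```python
def precise_screen(candidates, keyword_score, threshold, limit=10):
    """Keep names whose score difference is below the threshold, at most `limit`.

    candidates: list of (target_trademark_name, first_vector_score) pairs.
    keyword_score: second vector score of the trademark English keyword.
    The absolute difference is an assumption made for this sketch.
    """
    kept = []
    for name, name_score in candidates:
        difference = abs(name_score - keyword_score)
        if difference < threshold:
            kept.append((difference, name))
    kept.sort()  # smallest difference first; ties are broken by name
    return [name for _, name in kept[:limit]]


if __name__ == "__main__":
    sample = [("NIKE", 0.90), ("NGAO", 0.40), ("NIKI", 0.89)]
    print(precise_screen(sample, keyword_score=0.89, threshold=0.05))
```

Sorting unconditionally and then slicing keeps the same set of names as the rule in the text when ten or fewer names pass the threshold; it only fixes their order.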
Further, before the information to be retrieved is received, the method comprises creating a sample database: performing data cleaning processing, user-defined word segmentation processing and segmentation combination processing on sample trademark data containing sample trademark names and sample trademark associated information to obtain a plurality of sample trademark names and corresponding sample trademark associated information; setting a sample trademark number for each sample trademark name and its corresponding sample trademark associated information; storing all the sample trademark names and corresponding sample trademark numbers in the operating memory database in the local server; and storing all the sample trademark associated information and corresponding sample trademark numbers in the distributed database on the hard disk of the local server.
Further, in the step of receiving the information to be retrieved, batch information to be retrieved containing a plurality of trademark English keywords is received each time, and the step of retrieving overseas trademarks is performed on the batch to obtain a corresponding batch query result containing a plurality of pieces of target trademark associated information.
Further, the retrieval rule is a homophone retrieval rule or an add-subtract word retrieval rule or an inclusion/included retrieval rule or an order-changing retrieval rule.
Further, the method also comprises approximate retrieval, in which the trademark English keyword is input into a preset examination mechanism model for approximate examination to obtain an approximate examination result.
The second objective of the invention is achieved by the following technical scheme:
a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the overseas trademark retrieval method described in the present application.
The third objective of the invention is achieved by the following technical scheme:
a computer program product comprising a computer program which, when executed by a processor, implements the overseas trademark retrieval method described in the present application.
Compared with the prior art, the invention has the following beneficial effects. The overseas trademark retrieval method receives the trademark English keywords input by a user for overseas trademark retrieval and the retrieval rules specified by the user, screens out the corresponding target trademark names from the operating memory database of the local server according to the trademark English keywords and the retrieval rules, screens out the corresponding target trademark associated information from the distributed database on the hard disk of the local server according to the target trademark names, and finally outputs the target trademark names and the corresponding target trademark associated information to the user side. Throughout the calculation, the target trademark names are screened and matched in the memory of the local server while the target trademark associated information is screened and matched on its hard disk, which greatly reduces the memory pressure on the local server. The target trademark associated information, being a non-index field, is written directly to the distributed database, which supports second-level queries on non-index fields over data on the scale of hundreds of millions of records. The target trademark name, being the index field, can be hit quickly in memory after word segmentation, and the whole query is completed by directly taking the primary key of the hit data.
The foregoing description is only an overview of the technical solutions of the present invention. In order that the technical solutions may be understood more clearly and implemented in accordance with the contents of the description, a detailed description is given below with reference to the preferred embodiments of the present invention and the accompanying drawings.
Detailed Description
The present invention will be further described with reference to the accompanying drawings and the detailed description, and it should be noted that any combination of the embodiments or technical features described below can be used to form a new embodiment without conflict.
As shown in FIG. 1, the overseas trademark retrieval method of the present application specifically includes the following steps:
Creating a sample database: data cleaning processing, user-defined word segmentation processing and segmentation combination processing are performed on sample trademark data containing sample trademark names and sample trademark associated information to obtain a plurality of sample trademark names and corresponding sample trademark associated information; a sample trademark number is set for each sample trademark name and its corresponding sample trademark associated information; all the sample trademark names and corresponding sample trademark numbers are stored in an operating memory database in a local server, and all the sample trademark associated information and corresponding sample trademark numbers are stored in a distributed database on a hard disk of the local server. The data cleaning processing specifically includes: normalizing the sample trademark data and extracting features from it, and building a data feature model in advance, so that multi-dimensional queries can be performed rapidly and similar vector features can be searched directly among the features during a multi-dimensional query. The user-defined word segmentation processing specifically includes: segmenting the sample trademark data according to business characteristics by means of a preset custom word segmenter; for example, a similarity query for homophones requires extracting the vowel parts of the trademark name. The resulting phrases are combined and passed to the next stage.
The segmentation combination processing specifically includes: in this stage, the segmented phrases and the trademark data from the previous steps are received; the sample trademark associated information, which forms the non-index query fields of the sample trademark data, is stored in the distributed database, with the unique number of the trademark as the storage index; meanwhile, the word segmentation result and the unique number reside in memory; and the sample trademark name, which serves as the index query field, is stored directly in the operating memory database. The sample trademark associated information in the present embodiment is information related to the corresponding trademark, for example bibliographic information such as the applicant of the trademark and the applicant's address.
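The split between an in-memory name index and an on-disk store keyed by trademark number can be sketched as below. This is an illustration under stated assumptions, not the implementation of the embodiment: Python's `sqlite3` module stands in for the distributed database, a plain dictionary stands in for the operating memory database, and `clean_name` and `build_sample_database` are hypothetical helper names with a deliberately simple cleaning rule.

```python
import sqlite3


def clean_name(raw_name):
    """Assumed cleaning rule: trim, collapse whitespace, upper-case."""
    return " ".join(raw_name.split()).upper()


def build_sample_database(records, disk_path=":memory:"):
    """Build the two stores from (trademark_name, associated_info) pairs.

    Returns (name_index, disk):
      name_index: dict mapping sample trademark number -> (name, word list),
                  kept in memory as the index query field.
      disk:       sqlite3 connection holding the associated information,
                  keyed by the same sample trademark number.
    Pass a file path as disk_path to persist the store on the hard disk.
    """
    name_index = {}
    disk = sqlite3.connect(disk_path)
    disk.execute(
        "CREATE TABLE IF NOT EXISTS associated_info "
        "(number INTEGER PRIMARY KEY, info TEXT)"
    )
    for number, (raw_name, info) in enumerate(records, start=1):
        name = clean_name(raw_name)
        name_index[number] = (name, name.split())
        disk.execute(
            "INSERT INTO associated_info (number, info) VALUES (?, ?)",
            (number, info),
        )
    disk.commit()
    return name_index, disk


if __name__ == "__main__":
    index, store = build_sample_database(
        [("nike", "Applicant: Example Co."), ("this is niki", "Applicant: Other Ltd.")]
    )
    print(index)
```

Only the names and their word lists occupy memory; the bulkier bibliographic text is reached through the shared number when a name is hit.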
Receiving information to be retrieved: trademark English keywords input by a user for overseas trademark retrieval and retrieval rules specified by the user are received. The retrieval rule is a homophone retrieval rule, an add-subtract word retrieval rule, an inclusion/included retrieval rule or an order-changing retrieval rule.
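Three of the four retrieval rules can be illustrated with simple word-level predicates. The definitions below are assumptions chosen for the sketch, since the text names the rules without defining them precisely: the add-subtract rule is taken to mean that the two names differ by exactly one word, and the order-changing rule that they contain the same words in any order. The homophone rule is omitted because the text does not specify the phonetic comparison beyond extracting vowel parts.

```python
def words(text):
    """Upper-case a name and split it into words."""
    return text.upper().split()


def inclusion_rule(query, name):
    """Inclusion/included: one normalized name contains the other."""
    q, n = " ".join(words(query)), " ".join(words(name))
    return q in n or n in q


def order_changing_rule(query, name):
    """Order-changing: the same words in any order (assumed definition)."""
    return sorted(words(query)) == sorted(words(name))


def add_subtract_rule(query, name):
    """Add-subtract word: the names differ by exactly one word (assumed)."""
    q, n = set(words(query)), set(words(name))
    return (q < n or n < q) and len(q ^ n) == 1


# Hypothetical rule identifiers mapped to their predicates.
RULES = {
    "inclusion": inclusion_rule,
    "order_changing": order_changing_rule,
    "add_subtract": add_subtract_rule,
}


def screen_names(query, rule, sample_names):
    """Return the sample trademark names that match the query under a rule."""
    matches = RULES[rule]
    return [name for name in sample_names if matches(query, name)]


if __name__ == "__main__":
    print(screen_names("NIKI", "inclusion", ["THIS IS NIKI", "NIKE"]))
```

Keeping the rules in a dispatch table lets the user-specified rule be selected by name without changing the screening loop.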
Retrieving overseas trademarks: corresponding target trademark names are screened out from the operating memory database of the local server according to the trademark English keywords and the retrieval rules, and corresponding target trademark associated information is screened out from the distributed database on the hard disk of the local server according to the target trademark names. This specifically includes the following steps:
performing word segmentation processing: judging whether the trademark English keyword contains at least two key words; if so, performing word segmentation processing on the trademark English keyword to obtain a plurality of trademark English key words to be retrieved; if not, taking the trademark English keyword as the trademark English key word to be retrieved without word segmentation processing;
screening a target trademark name: screening, according to the retrieval rule, a sample trademark name corresponding to the trademark English key word to be retrieved from a plurality of sample trademark names stored in the operating memory database as the target trademark name, and obtaining a target trademark number corresponding to the target trademark name;
and screening target trademark associated information: screening the corresponding target trademark associated information from the distributed database on the hard disk of the local server according to the target trademark number.
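The three substeps above can be sketched end to end as follows. The sketch assumes whitespace-based segmentation, takes the retrieval rule as an arbitrary predicate, and uses a plain dictionary lookup in place of the distributed database; `segment`, `retrieve` and `fetch_info` are hypothetical names.

```python
def segment(keyword):
    """Split the keyword only if it contains at least two words."""
    parts = keyword.split()
    return parts if len(parts) >= 2 else [keyword.strip()]


def retrieve(keyword, rule, name_index, fetch_info):
    """Two-tier lookup: match names in memory, then fetch details by number.

    keyword:    trademark English keyword entered by the user.
    rule:       predicate rule(word, sample_name) -> bool.
    name_index: dict mapping sample trademark number -> sample trademark name.
    fetch_info: callable mapping a trademark number to its associated
                information (a stand-in for the on-disk database query).
    """
    hits = {}
    for word in segment(keyword):
        for number, name in name_index.items():
            if rule(word, name):
                hits[number] = name  # the number is the primary key of the hit
    return [(name, fetch_info(number)) for number, name in sorted(hits.items())]


if __name__ == "__main__":
    names = {1: "NIKE", 2: "THIS IS NIKI", 3: "NGAO"}
    details = {1: "Applicant A", 2: "Applicant B", 3: "Applicant C"}

    def contains_word(word, name):
        return word.upper() in name.upper().split()

    print(retrieve("this is niki", contains_word, names, details.get))
```

Only the numbers of the hit names are passed to the second tier, so the on-disk store is queried by primary key rather than scanned.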
Preferably, the method further comprises a precise screening step before the screening of the target trademark associated information. A first vector score corresponding to the target trademark name is calculated according to a first preset weight and a first occurrence frequency ratio corresponding to the target trademark name, the first occurrence frequency ratio being based on the number of occurrences of each English letter in the target trademark name in all sample trademark names in the operating memory database. A second vector score corresponding to the trademark English keyword is calculated according to a second preset weight and a second occurrence frequency ratio corresponding to the trademark English keyword, the second occurrence frequency ratio being based on the number of occurrences of each letter in the trademark English keyword in all sample trademark names in the operating memory database. The vector score difference between the first vector score and the second vector score is calculated, and the target trademark names whose vector score difference is smaller than a preset vector score difference threshold are retained. If the number of target trademark names whose vector score difference is smaller than the preset vector score difference threshold is larger than ten, those target trademark names are sorted by vector score difference from small to large, and the first ten target trademark names in the sorting are retained. The calculation of the vector score is exemplified as follows:
Assume that the trademark English keyword input by the user for overseas trademark retrieval is "THIS IS NIKI" and that the specified retrieval rule is the homophone retrieval rule. "THIS IS NIKI" is first segmented into the three key words THIS, IS and NIKI, which serve as the trademark English key words to be retrieved, and a first target trademark name corresponding to THIS, a second target trademark name corresponding to IS and a third target trademark name corresponding to NIKI are respectively screened from the operating memory database according to the retrieval rule. A second vector score is then calculated for each of the three words THIS, IS and NIKI, each vector score being 1/(1 + math.exp(-2 × second preset weight - second occurrence frequency ratio)). For example, when calculating the vector score of NIKI, assume that the sample trademark names screened from the operating memory database for NIKI are NIKE, NGAO and NIKI. The numbers of occurrences of the letters are N(3), G(1), K(2), O(1), I(3), E(1) and A(1), twelve letters in total, which gives per-letter ratios of 3/12 = 0.25, 2/12 ≈ 0.166 and 1/12 ≈ 0.083. With a second preset weight of 0.8 and a ratio term of 0.25 + 0.166 + 0.083 = 0.499, the example expression 1/1 + math.exp(-2 × 0.8 - 0.499) evaluates to 1.1225789459298634; with the denominator parenthesized as in the general formula, 1/(1 + math.exp(-2 × 0.8 - 0.499)) is approximately 0.8908. Assuming the user input is NIKI, the sample trademark name NIKI is hit and output.
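The arithmetic of this example can be reproduced with a short script. It is a sketch under assumptions: `letter_ratios` and `vector_score` are hypothetical helper names, the weight 0.8 is taken from the example, and the ratio term 0.499 (0.25 + 0.166 + 0.083) is passed in directly because the text does not state how the per-letter ratios are combined for a given word.

```python
import math
from collections import Counter


def letter_ratios(sample_names):
    """Share of each letter among all letters of the sample trademark names."""
    counts = Counter(
        ch for name in sample_names for ch in name.upper() if ch.isalpha()
    )
    total = sum(counts.values())
    return {letter: count / total for letter, count in counts.items()}


def vector_score(weight, ratio_term):
    """General formula from the text: 1 / (1 + exp(-2 * weight - ratio_term))."""
    return 1 / (1 + math.exp(-2 * weight - ratio_term))


if __name__ == "__main__":
    ratios = letter_ratios(["NIKE", "NGAO", "NIKI"])
    print(ratios["N"], ratios["K"], ratios["G"])  # 3/12, 2/12 and 1/12

    # Parenthesized form of the general formula: about 0.8908.
    print(vector_score(0.8, 0.499))

    # Without parentheses around the denominator, Python evaluates
    # 1 / 1 first and then adds the exponential, which gives the
    # figure of about 1.12258 printed in the example.
    print(1 / 1 + math.exp(-2 * 0.8 - 0.499))
```

Because the parenthesized form is a logistic function, its value always lies between 0 and 1, which makes a fixed difference threshold easier to choose than with the unparenthesized expression.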
Outputting a retrieval result: the target trademark name and the corresponding target trademark associated information are output to the user side.
In this embodiment, in the step of receiving information to be retrieved, batch information to be retrieved containing a plurality of trademark English keywords is received each time, and the step of retrieving overseas trademarks is performed on the batch to obtain a corresponding batch query result containing a plurality of pieces of target trademark associated information. Specifically, the user manually imports, in a single operation, an xls or txt file containing about 10,000 or more trademark names to be applied for (namely the trademark English keywords), one per line. An approximate query is performed in turn between each imported trademark English keyword and all overseas trademarks in the database, and the results are sorted by the number of query hits. Let k be the similarity value for completely identical trademark names in the trademark database and m the similarity value for approximate trademark names; the total similarity value is z = k + m. Trademarks with z exceeding 100 are discarded, the remaining trademark names to be queried are sorted by z from small to large, and the sorted list is exported to the user for viewing. The trademark names at the front of the list are those the user can apply for with a higher success rate, and each is labeled with its similarity value z. Generally, trademarks with z below 10 can be manually checked again.
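The batch ranking by total similarity value can be sketched as below. The function name `rank_batch` and the input layout, a mapping from each candidate name to its pair (k, m), are assumptions made for illustration; the cut-offs 100 and 10 are the ones given in this embodiment.

```python
def rank_batch(similarity_values, discard_above=100, recheck_below=10):
    """Rank candidate trademark names by z = k + m, smallest first.

    similarity_values: dict mapping a candidate name to (k, m), where k is
    the value for identical names and m the value for approximate names.
    Returns a list of (z, name, needs_manual_check) tuples.
    """
    ranked = []
    for name, (k, m) in similarity_values.items():
        z = k + m
        if z > discard_above:
            continue  # too many similar marks; the candidate is discarded
        ranked.append((z, name, z < recheck_below))
    ranked.sort()
    return ranked


if __name__ == "__main__":
    batch = {"ALPHA": (0, 3), "BETA": (1, 120), "GAMMA": (0, 40)}
    for z, name, recheck in rank_batch(batch):
        print(name, z, "manual check" if recheck else "")
```

A candidate with z exactly equal to 100 is kept, since the text discards only values exceeding 100.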
In addition, the method further comprises an approximate query step, in which the trademark English keyword is input into a preset examination mechanism model for approximate examination to obtain an approximate examination result. Specifically, in addition to conventional word segmentation and inclusion relations, this embodiment can perform a more accurate search based on the manual search formulas of overseas trademark examiners, such as United States trademark examiners, and on all history files in the trademark library, and can give the user search suggestions that are closer to the examiner's judgment. Taking the United States as an example: every query record code of the United States examiners is public and stored in the United States trademark office database, and all examination records are used to train a machine learning algorithm so that the examination rules are learned accurately. The United States trademark examination rules specifically include the following:
1. the applied trademark is contained in an existing live trademark;
2. the applied trademark is an existing live trademark;
3. the applied trademark is completely the same as an existing live trademark after words are added or removed;
4. the pronunciation of the applied trademark is the same as that of an existing live trademark;
5. the applied trademark is completely the same as an existing live trademark after the word order is changed;
6. the applied trademark contains sensitive words; sensitive words such as the names of celebrities, place names, and political or religious terms must be excluded (a blacklist library of terms that cannot be applied for needs to be established).
The rules are acquired from XSearch Search Summary data. Rules 1 to 6 are queried automatically by applying the approximate query rules, and for rule 7 a blacklist related to United States trademarks is compiled, so that if the trademark applied for by the user is in the blacklist, the user is automatically prompted that the trademark belongs to the xx category blacklist. In addition, rejection of a newly applied trademark can be avoided by referring to the trademark rejection records in the overseas database. For example, the user wants to apply for the trademark DREAMO, which appears registrable under a general trademark query; however, the database shows that a prior application for DREAMO was rejected for being approximate to a certain trademark, so the new application is considered likely to be rejected for the same reason. The examination records of dead trademarks and of all prior approximate trademarks are also used as a reference basis for trademark queries to improve their accuracy.
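The blacklist prompt can be illustrated with a simple word-level check. The layout of the blacklist, a mapping from category to a set of upper-case sensitive words, and the function name `blacklist_prompts` are assumptions for this sketch; the categories and words shown are invented placeholders rather than entries from any real blacklist.

```python
def blacklist_prompts(trademark, blacklist):
    """Return one prompt per blacklist category that the trademark touches.

    blacklist: dict mapping a category name to a set of upper-case words
    (hypothetical layout).
    """
    trademark_words = set(trademark.upper().split())
    return [
        f"'{trademark}' belongs to the {category} blacklist"
        for category, terms in sorted(blacklist.items())
        if trademark_words & terms
    ]


if __name__ == "__main__":
    example_blacklist = {
        "place name": {"PARIS"},
        "celebrity": {"ELVIS"},
    }
    print(blacklist_prompts("Paris Dream", example_blacklist))
    print(blacklist_prompts("DREAMO", example_blacklist))
```

Matching on whole words avoids flagging a trademark merely because a sensitive word appears as a substring of a longer word; whether that is the desired behavior is a design choice the text leaves open.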
The present invention also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the overseas trademark retrieval method described in the present application.
The invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the overseas trademark retrieval method described in the present application.
The overseas trademark retrieval method receives the trademark English keywords input by a user for overseas trademark retrieval and the retrieval rules specified by the user, screens out the corresponding target trademark names from the operating memory database of the local server according to the trademark English keywords and the retrieval rules, screens out the corresponding target trademark associated information from the distributed database on the hard disk of the local server according to the target trademark names, and finally outputs the target trademark names and the corresponding target trademark associated information to the user side. Throughout the calculation, the target trademark names are screened and matched in the memory of the local server while the target trademark associated information is screened and matched on its hard disk, which greatly reduces the memory pressure on the local server. The target trademark associated information, being a non-index field, is written directly to the distributed database, which supports second-level queries on non-index fields over data on the scale of hundreds of millions of records. The target trademark name, being the index field, can be hit quickly in memory after word segmentation, and the whole query is completed by directly taking the primary key of the hit data.
The foregoing is merely a preferred embodiment of the invention and is not intended to limit the invention in any manner; those skilled in the art can readily practice the invention as shown and described in the drawings and detailed description herein; however, those skilled in the art should appreciate that they can readily use the disclosed conception and specific embodiments as a basis for designing or modifying other structures for carrying out the same purposes of the present invention without departing from the scope of the invention as defined by the appended claims; meanwhile, any changes, modifications, and evolutions of the equivalent changes of the above embodiments according to the actual techniques of the present invention are still within the protection scope of the technical solution of the present invention.