CN111274811A - Address text similarity determining method and address searching method - Google Patents

Address text similarity determining method and address searching method

Info

Publication number
CN111274811A
Authority
CN
China
Prior art keywords
address
text
address text
similarity
similarity calculation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811375413.2A
Other languages
Chinese (zh)
Other versions
CN111274811B (en)
Inventor
刘楚
谢朋峻
郑华飞
李林琳
司罗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201811375413.2A (patent CN111274811B)
Priority to TW108129457A (patent TW202020688A)
Priority to PCT/CN2019/119149 (patent WO2020103783A1)
Publication of CN111274811A
Application granted
Publication of CN111274811B
Legal status: Active
Anticipated expiration

Abstract

The invention discloses an address text similarity determining method and an address searching method, where an address text comprises a plurality of address elements arranged from high to low in level. The method comprises the following steps: acquiring an address text pair whose similarity is to be determined; and inputting the address text pair into a preset address text similarity calculation model so as to output the similarity of the two address texts included in the pair. The invention improves the accuracy of address text similarity calculation.

Description

Address text similarity determining method and address searching method
Technical Field
The invention relates to the field of artificial intelligence, in particular to an address text similarity determining method, an address searching method and computing equipment.
Background
In some address-sensitive industries or departments, such as police, express delivery, logistics, and electronic maps, a standard address library is usually maintained inside the system. In actual use of address data, descriptions that do not match the standard address library often occur; for example, the spoken address given during a 110 emergency call may be far from the standard address recorded in the public security system. An effective and fast method is then needed to map the non-standard address text to the corresponding or similar address in the standard address library, and judging the degree of similarity between two pieces of address text is crucial to this.
Common ways of calculating address text similarity include the following:
1. Edit distance: the similarity of two texts is computed from their edit distance. This ignores the semantics of the text. For example, the edit distance between "Alibaba" and "Alima" is the same as that between "Alibaba" and "Alimama", yet "Alibaba" should be semantically closer to "Alimama" than to "Alima".
2. Semantic similarity: the similarity between two pieces of text is computed with a general semantic model such as word2vec. Such methods are designed for all text domains rather than for address text specifically, so their accuracy on address text is not high enough.
3. Element weighting: the address text is decomposed into a number of address elements, weights are manually assigned to the address elements of each level, and a weighted sum is computed. The drawback is that the weights of the address levels cannot be generated automatically for a given data set, so the approach cannot be well automated.
Disclosure of Invention
In view of the above problems, the present invention has been made to provide an address text similarity determination method and an address search method that overcome or at least partially solve the above problems.
According to an aspect of the present invention, there is provided an address text similarity determination method, the address text including a plurality of address elements arranged from high to low in rank, the method including:
acquiring an address text pair with similarity to be determined;
inputting the address text pair into a preset address text similarity calculation model to output the similarity of two address texts included in the address text pair;
the address text similarity calculation model is obtained by training based on a training data set comprising a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, wherein the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair.
Optionally, in the address text similarity determining method according to the present invention, the address text similarity calculation model includes a word embedding layer, a text encoding layer, and a similarity calculation layer, and the step of training the address text similarity calculation model includes: inputting the first, second and third address texts of each piece of training data into a word embedding layer to obtain corresponding first, second and third word vector sets; inputting the first, second and third word vector sets into a text coding layer to obtain corresponding first, second and third text vectors; calculating a first similarity of the first text vector and the second text vector and a second similarity of the first text vector and the third text vector by using a similarity calculation layer; and adjusting the network parameters of the address text similarity calculation model according to the first similarity and the second similarity.
Optionally, in the address text similarity determining method according to the present invention, the network parameter includes: parameters of a word embedding layer and/or parameters of a text encoding layer.
Optionally, in the address text similarity determining method according to the present invention, each word vector set in the first, second, and third word vector sets includes a plurality of word vectors, and each word vector corresponds to one address element in the address text.
Optionally, in the address text similarity determining method according to the present invention, the Word embedding layer employs a Glove model or a Word2Vec model.
Optionally, in the address text similarity determining method according to the present invention, the first similarity and the second similarity include at least one of a euclidean distance, a cosine similarity, or a Jaccard coefficient.
Optionally, in the method for determining similarity of address texts according to the present invention, the adjusting network parameters of the address text similarity calculation model according to the first and second similarities includes: calculating a loss function value according to the first similarity and the second similarity; and adjusting the network parameters of the address text similarity calculation model by using a back propagation algorithm until the loss function value is lower than a preset value or the training times reach a preset number.
Optionally, in the address text similarity determining method according to the present invention, the loss function value is: Loss = Margin - (first similarity - second similarity), where Loss is the loss function value and Margin is a hyperparameter.
Optionally, in the address text similarity determining method according to the present invention, the text encoding layer includes at least one of an RNN model, a CNN model, or a DBN model.
According to another aspect of the present invention, there is provided an address search method including:
acquiring one or more candidate address texts corresponding to the address text to be inquired;
inputting an address text to be inquired and a candidate address text into a preset address text similarity calculation model to obtain the similarity of the address text and the candidate address text, wherein the address text similarity calculation model is obtained by training based on a training data set comprising a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair;
and determining the candidate address text with the maximum similarity as the target address text corresponding to the address text to be inquired.
According to another aspect of the present invention, there is provided an address search apparatus including:
the query module is suitable for acquiring one or more candidate address texts corresponding to the address texts to be queried;
the first similarity calculation module is suitable for inputting the address text to be inquired and the candidate address text into a preset address text similarity calculation model to obtain the similarity of the address text and the candidate address text, wherein the address text similarity calculation model is obtained by training a training data set comprising a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, wherein the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair;
and the output module is suitable for determining the candidate address text with the maximum similarity as the target address text corresponding to the address text to be inquired.
According to another aspect of the present invention, there is provided an apparatus for training an address text similarity calculation model, the address text including a plurality of address elements arranged in a high-to-low order, the address text similarity calculation model including a word embedding layer, a text encoding layer, and a similarity calculation layer, the apparatus comprising:
the training data set comprises a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, wherein the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair;
the word vector acquisition module is suitable for inputting the first, second and third address texts of each piece of training data into the word embedding layer to obtain corresponding first, second and third word vector sets;
the text vector acquisition module is suitable for inputting the first, second and third word vector sets into a text coding layer to obtain corresponding first, second and third text vectors;
the second similarity calculation module is suitable for calculating the first similarities of the first text vector and the second similarities of the first text vector and the third text vector by utilizing the similarity calculation layer;
and the parameter adjusting module is suitable for adjusting the network parameters of the address text similarity calculation model according to the first similarity and the second similarity.
According to another aspect of the present invention, there is provided a computing device comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing a method according to any of the methods described above.
Since address text naturally contains hierarchical relationships, address elements of different levels play different roles in address similarity calculation. The embodiment of the invention automatically learns the weights of address elements at different levels by exploiting the hierarchical relationship within the address text, avoids the subjectivity of manually assigned weights, and adapts to the target data source, thereby accurately calculating the similarity of two address texts.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 shows a schematic diagram of an address search system 100 according to one embodiment of the invention;
FIG. 2 shows a schematic diagram of a computing device 200, according to one embodiment of the invention;
FIG. 3 illustrates a flow diagram of a method 300 for training an address text similarity calculation model according to one embodiment of the invention;
FIG. 4 illustrates a schematic diagram of an address text similarity calculation model 400 according to one embodiment of the invention;
FIG. 5 illustrates a flow diagram of an address search method 500 according to one embodiment of the invention;
FIG. 6 is a diagram illustrating an apparatus 600 for training an address text similarity calculation model according to an embodiment of the present invention;
fig. 7 shows a schematic diagram of an address search apparatus 700 according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
First, some terms or terms appearing in the description of the embodiments of the present invention are applicable to the following explanations:
address text: such as "a shangzhou wen xi lu 969 a rilibaba", "a greater river college of greater chang 1 of the greater river of the maxi town of penshan district, meenshan, sichuan, etc. The address text includes a plurality of address elements arranged in a high-to-low order.
Address element: an element constituting one granularity of the address text. For example, in "Alibaba, No. 969 Wenyi West Road, Hangzhou", "Hangzhou" represents a city, "Wenyi West Road" represents a road, "969" represents a road number, and "Alibaba" represents a Point of Interest (POI).
Address level: the regions corresponding to the address elements in an address have a size-containment relationship, i.e. each address element has a corresponding address level, for example: province > city > district > street/community > road > building.
Address similarity: the similarity between two pieces of address text, taking a value from 0 to 1. The larger the value, the more likely the two addresses refer to the same location; a value of 1 indicates the same address, and a value of 0 indicates that the two addresses are unrelated.
Partial order relationship: the regions in an address have a hierarchical containment relationship, for example: province > city > district > street/community > road > building.
Since the address text naturally contains a hierarchical relationship, i.e., the partial order relationship described above, address elements of different levels play different roles in address similarity calculation. The embodiment of the invention automatically generates the weights of the address elements with different levels by utilizing the hierarchical relationship in the address text, and the weights are implicitly embodied in the network parameters of the address text similarity calculation model, thereby accurately calculating the similarity degree of the two address texts.
FIG. 1 shows a schematic diagram of an address search system 100 according to one embodiment of the invention. As shown in fig. 1, the address search system 100 includes a user terminal 110 and a computing device 200.
The user terminal 110 is a terminal device used by a user, which may specifically be a personal computer such as a desktop or notebook computer, or a mobile phone, a tablet computer, a multimedia device, an intelligent wearable device, and the like, but is not limited thereto. The computing device 200 is used to provide services to the user terminal 110 and may be implemented as a server, such as an application server or a Web server; it may also be implemented as a desktop computer, a notebook computer, a processor chip, a mobile phone, a tablet computer, etc., but is not limited thereto.
In an embodiment of the present invention, the computing device 200 may be used to provide address search services to the user; for example, the computing device 200 may be a server of an electronic map application. It will be understood by those skilled in the art, however, that the computing device 200 may be any device capable of providing address search services to the user and is not limited to a server of an electronic map application.
In one embodiment, the address search system 100 also includes a data storage device 120. The data storage device 120 may be a relational database such as MySQL or ACCESS, or a non-relational database such as NoSQL; it may be a local database residing in the computing device 200, or it may be disposed at a plurality of geographic locations as a distributed database, such as HBase. In short, the data storage device 120 is used for storing data, and the present invention does not limit its specific deployment and configuration. The computing device 200 may connect with the data storage device 120 and retrieve the data stored therein. For example, the computing device 200 may directly read the data in the data storage device 120 (when the data storage device 120 is a local database of the computing device 200), or may access the Internet in a wired or wireless manner and obtain the data in the data storage device 120 through a data interface.
In the embodiment of the present invention, the data storage device 120 stores a standard address library, in which the address texts are standard (complete and accurate) address texts. In the address search service, a user inputs a query address text (query) through the user terminal 110; generally, the user inputs an incomplete and inaccurate address text. The user terminal 110 sends the query to the computing device 200, and the address search apparatus in the computing device 200 recalls a batch of candidate address texts, usually several to several thousand, by searching the standard address library. The address search apparatus then calculates the degree of correlation between the candidate address texts and the query, for which the address similarity is important reference information. After the address similarity between the query and each candidate address text is calculated, the candidate address text with the maximum similarity is determined as the target address text corresponding to the address text to be queried and is returned to the user.
Specifically, the address search apparatus may calculate the similarity between the address text to be queried and a candidate address text using the address text similarity calculation model. Correspondingly, the computing device 200 may further include a training apparatus for the address text similarity calculation model, and the data storage device 120 further stores a training address library, which may be the same as or different from the standard address library. The training address library includes a plurality of address texts, and the training apparatus trains the address text similarity calculation model using them.
FIG. 2 shows a block diagram of a computing device 200, according to one embodiment of the invention. As shown in FIG. 2, in a basic configuration 202, a computing device 200 typically includes a system memory 206 and one or more processors 204. A memory bus 208 may be used for communication between the processor 204 and the system memory 206.
Depending on the desired configuration, the processor 204 may be any type of processor, including but not limited to: a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 204 may include one or more levels of cache, such as a level one cache 210 and a level two cache 212, a processor core 214, and registers 216. An example processor core 214 may include an arithmetic logic unit (ALU), a floating point unit (FPU), a digital signal processing core (DSP core), or any combination thereof. An example memory controller 218 may be used with the processor 204, or in some implementations the memory controller 218 may be an internal part of the processor 204.
Depending on the desired configuration, the system memory 206 may be any type of memory, including but not limited to: volatile memory (such as RAM), non-volatile memory (such as ROM, flash memory, etc.), or any combination thereof. The system memory 206 may include an operating system 220, one or more applications 222, and program data 224. The application 222 is in effect a plurality of program instructions that direct the processor 204 to perform corresponding operations. In some embodiments, the application 222 may be arranged to cause the processor 204 to operate with the program data 224 on the operating system.
The computing device 200 may also include an interface bus 240 that facilitates communication from various interface devices (e.g., output devices 242, peripheral interfaces 244, and communication devices 246) to the basic configuration 202 via the bus/interface controller 230. Example output devices 242 include a graphics processing unit 248 and an audio processing unit 250, which may be configured to facilitate communication with various external devices, such as a display or speakers, via one or more A/V ports 252. Example peripheral interfaces 244 may include a serial interface controller 254 and a parallel interface controller 256, which may be configured to facilitate communication with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner, etc.) via one or more I/O ports 258. An example communication device 246 may include a network controller 260, which may be arranged to facilitate communication with one or more other computing devices 262 over a network communication link via one or more communication ports 264.
A network communication link may be one example of a communication medium. Communication media may typically be embodied by computer readable instructions, data structures, or program modules in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal in which one or more of its characteristics are set or changed in such a manner as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or direct-wired connection, and various wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR), or other wireless media. The term computer readable media as used herein may include both storage media and communication media.
In the computing device 200 according to the invention, the application 222 comprises a training apparatus 600 for an address text similarity calculation model and an address search apparatus 700. The apparatus 600 includes a plurality of program instructions that may direct the processor 204 to perform the method 300 of training an address text similarity calculation model. The apparatus 700 includes a plurality of program instructions that may direct the processor 204 to perform the address search method 500.
FIG. 3 shows a flow diagram of a method 300 for training an address text similarity calculation model according to one embodiment of the invention. The method 300 is suitable for execution in a computing device, such as the computing device 200 described above. As shown in fig. 3, the method 300 begins at step S310. In step S310, a training data set is obtained, where the training data set includes a plurality of pieces of training data, and each piece of training data includes 3 address texts: a first address text, a second address text, and a third address text. Each address text comprises a plurality of address elements arranged from high to low in level. The address elements of the first n levels of the first address text and the second address text are the same; the address elements of the first (n-1) levels of the first address text and the third address text are the same, and the address elements of the nth level are different. Here, the value range of n is (1, N), where N is the number of address levels included in the address text; for example, if the address text includes 5 address levels (province, city, district, road, and road number), the value of N is 5. Of course, n may also take other value ranges according to the specific application scenario.
In the embodiment of the present invention, each piece of training data is a triplet {target_addr, pos_addr, neg_addr} formed by 3 address texts, where target_addr corresponds to the first address text, pos_addr corresponds to the second address text, and neg_addr corresponds to the third address text. {target_addr, pos_addr} constitutes a positive sample pair, and {target_addr, neg_addr} constitutes a negative sample pair.
In one embodiment, the training data set is obtained as follows:
Firstly, an original address text is obtained from the training address library (or the standard address library), the address text is analyzed, and its character string is segmented and formatted into address elements. For example, the address text "Room 910, Floor 7, Building 1, Alibaba Xixi Campus, No. 969 Wenyi West Road, Hangzhou City, Zhejiang" may be divided into "prov (province) = Zhejiang | city (city) = Hangzhou City | road (road) = Wenyi West Road | roadno (road number) = 969 | poi = Alibaba Xixi Campus | houseno (building number) = Building 1 | floorno (floor number) = Floor 7 | roomno (room number) = 910". Specifically, this analysis may be completed by combining a word segmentation model and a named entity model; the embodiment of the present invention does not limit the specific word segmentation model and named entity model, and those skilled in the art may select them as needed.
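As a minimal illustration of the target data structure of this formatting step, the following Python sketch hard-codes one parsed address. The level names (prov, city, road, roadno, poi, houseno, floorno, roomno) follow the example above; in the real pipeline the fields would come from a word segmentation model and a named entity model, so everything here is an illustrative assumption rather than the patent's implementation.

```python
# Address levels ordered from high to low, as in the formatted example above.
ADDRESS_LEVELS = ["prov", "city", "district", "road", "roadno",
                  "poi", "houseno", "floorno", "roomno"]

def format_address(parsed: dict) -> list:
    """Arrange parsed address elements from the highest level to the lowest."""
    return [(level, parsed[level]) for level in ADDRESS_LEVELS if level in parsed]

# Hard-coded parse of the example address; a real system would produce this
# with a word segmentation model plus a named entity model.
example = {
    "prov": "Zhejiang",
    "city": "Hangzhou City",
    "road": "Wenyi West Road",
    "roadno": "969",
    "poi": "Alibaba Xixi Campus",
    "houseno": "Building 1",
    "floorno": "Floor 7",
    "roomno": "910",
}

print(format_address(example))
```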
Then, the address texts formatted as address elements are aggregated (deduplicated and sorted) according to the address elements of different levels, forming a table of addresses grouped by address level. [The original table figures are not reproduced in this text.]
And finally, the aggregated data in the table is combined into positive and negative sample pairs of training data according to the different address levels, with the output format: {target_addr, pos_addr, neg_addr}. As described above, {target_addr, pos_addr} constitutes a positive sample pair, and {target_addr, neg_addr} constitutes a negative sample pair. It should be noted that a positive sample pair may correspond to multiple negative sample pairs; that is, one target_addr corresponds to one pos_addr but may correspond to multiple neg_addr.
The specific operation is as follows:
(1) Select an address text, for example: prov = Zhejiang Province, city = Hangzhou City, district = Yuhang District, road = Wenyi West Road, roadno = 969, poi = Alibaba Xixi Campus;
(2) Traverse all address levels, e.g. province -> city -> district -> road; at each address level, find address elements that are respectively the same as and different from the current address element, and form positive and negative sample pairs with the current address text (a code sketch follows these examples), e.g.:
At the province level, a positive example for "Alibaba Xixi Campus, No. 969 Wenyi West Road, Yuhang District, Hangzhou, Zhejiang" is: "No. 245 Yuanye Yijia Garden, Yinzhou District, Ningbo, Zhejiang"; a negative example is: "Shanghai Hongqiao International Airport, No. 2550 Hongqiao Road, Changning District, Shanghai".
At the city level, a positive example is: "Zhejiang Institute of Socialism, No. 1008 Wenyi West Road, Yuhang District, Hangzhou, Zhejiang"; a negative example is: "No. 525 Garden Road, Yinzhou District, Ningbo, Zhejiang".
At the district level, a positive example is: "Saiyin International Square, No. 248 Gaojiao Road, Yuhang District, Hangzhou, Zhejiang"; a negative example is: "Nanshan Campus of the China Academy of Art, No. 218 Nanshan Road, Hangzhou, Zhejiang".
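The pair-construction rule above can be sketched as follows, assuming each aggregated address is represented as a Python dict mapping level names to address elements. Sampling one positive and one negative per level is a simplification; as noted earlier, one target_addr may pair with multiple neg_addr.

```python
import random

LEVELS = ("prov", "city", "district", "road")  # illustrative level order

def build_triplets(addresses, levels=LEVELS):
    """Build {target_addr, pos_addr, neg_addr} triplets: the positive shares
    the first n levels with the target; the negative shares only the first
    n-1 levels and differs at level n."""
    triplets = []
    for target in addresses:
        for n in range(1, len(levels) + 1):
            positives = [a for a in addresses if a is not target
                         and all(a[l] == target[l] for l in levels[:n])]
            negatives = [a for a in addresses
                         if all(a[l] == target[l] for l in levels[:n - 1])
                         and a[levels[n - 1]] != target[levels[n - 1]]]
            if positives and negatives:
                triplets.append((target,
                                 random.choice(positives),
                                 random.choice(negatives)))
    return triplets
```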
After the training data set is acquired, the method 300 proceeds to step S320. Before describing the processing procedure of step S320, the structure of the address text similarity calculation model according to the embodiment of the present invention will be described.
Referring to fig. 4, an address text similarity calculation model 400 according to an embodiment of the present invention includes: a word embedding layer 410, a text encoding layer 420, and a similarity calculation layer 430. The word embedding layer 410 is adapted to convert each address element in the address text into a word vector and combine the word vectors into a word vector set corresponding to the address text; the text encoding layer 420 is adapted to encode the word vector set corresponding to the address text as a text vector; the similarity calculation layer 430 is adapted to calculate the similarity between two text vectors, which characterizes the similarity between the corresponding address texts.
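A minimal PyTorch sketch of this three-layer structure is shown below. It is an illustration under stated assumptions, not the patent's reference implementation: the encoder is chosen as a GRU (one of the RNN-family options named in the text), the similarity layer as cosine similarity, and the vocabulary size and dimensions are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AddressSimilarityModel(nn.Module):
    """Word embedding layer 410 + text encoding layer 420 + similarity layer 430."""

    def __init__(self, vocab_size=50000, embed_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)             # layer 410
        self.encoder = nn.GRU(embed_dim, hidden_dim, batch_first=True)   # layer 420

    def encode(self, token_ids: torch.Tensor) -> torch.Tensor:
        """Encode a batch of address-element id sequences into fixed-length text vectors."""
        vecs = self.embedding(token_ids)   # (batch, seq_len, embed_dim)
        _, h_n = self.encoder(vecs)        # final hidden state of the GRU
        return h_n[-1]                     # (batch, hidden_dim)

    def forward(self, a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
        """Similarity calculation layer 430: cosine similarity of two text vectors."""
        return F.cosine_similarity(self.encode(a), self.encode(b), dim=-1)
```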
In step S320, the first address text, the second address text, and the third address text in each piece of training data are respectively input to the word embedding layer for processing, so as to obtain a first word vector set corresponding to the first address text, a second word vector set corresponding to the second address text, and a third word vector set corresponding to the third address text.
The word embedding layer (embedding layer) converts each word in a sentence into a numeric vector (word vector). The weights of the embedding layer can be pre-computed from the text co-occurrence statistics of a massive corpus, for example with the GloVe algorithm, or with the CBOW and skip-gram algorithms of Word2Vec. These algorithms rely on the fact that different textual expressions of the same latent semantics repeatedly appear in similar contexts; by predicting a word from its context, or the context from a word, using the relations between words, the latent semantics of each word are obtained. In the embodiment of the invention, the parameters of the word embedding layer can be trained separately on a corpus, or the word embedding layer and the text encoding layer can be trained together, so that the parameters of both are obtained simultaneously. The following description takes the case of training the word embedding layer and the text encoding layer together as an example.
Specifically, the address text comprises a plurality of formatted address elements. After the address text is input into the word embedding layer, the word embedding layer treats each address element in the address text as a word and converts it into a word vector, thereby obtaining a plurality of word vectors, which are then combined into a word vector set.
In one implementation, the word vector set is represented as a list, i.e., a word vector list, each list item in the word vector list corresponds to a word vector, and the number of items in the list is the number of address elements in the address text. In another implementation, the word vector set is represented as a matrix, that is, a word vector matrix, each column of the matrix corresponds to a word vector, and the number of columns of the matrix is the number of address elements in the address text.
After obtaining the word vector sets, the method 300 proceeds to step S330. In step S330, the first word vector set, the second word vector set, and the third word vector set are respectively input to the text encoding layer for processing, so that the first word vector set is encoded as a first text vector, the second word vector set as a second text vector, and the third word vector set as a third text vector.
The text encoding layer is implemented with a deep neural network (DNN) model, for example a recurrent neural network (RNN) model, a convolutional neural network (CNN) model, or a deep belief network (DBN) model. The embedding output of the variable-length address text is encoded by the DNN into a fixed-length sentence vector; at this point, target_addr, pos_addr and neg_addr are converted into vector_A, vector_B and vector_C respectively. vector_A is the first text vector, vector_B is the second text vector, and vector_C is the third text vector.
Taking an RNN as an example, the word vector sequence corresponding to the address text may be regarded as a time sequence; the word vectors in the sequence are input into the RNN one by one, and the final output vector is the text vector (sentence vector) corresponding to the address text.
Taking a CNN as an example, the word vector matrix corresponding to the address text is input into the CNN and processed through several convolution and pooling layers; the two-dimensional feature map is finally converted by a fully-connected layer into a one-dimensional feature vector, which is the text vector corresponding to the address text.
After the text vectors are obtained, the method 300 proceeds to step S340. In step S340, a first similarity between the first text vector and the second text vector and a second similarity between the first text vector and the third text vector are calculated by the similarity calculation layer. In this way, the first similarity represents the similarity between the first address text and the second address text, and the second similarity represents the similarity between the first address text and the third address text.
Various similarity (distance) calculation methods can be selected, for example: Euclidean distance, cosine similarity, the Jaccard coefficient, etc. In this embodiment, the similarity between vector_A and vector_B is denoted SIM_AB, and the similarity between vector_A and vector_C is denoted SIM_AC.
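Two of the listed measures can be sketched as follows (the Jaccard coefficient is set-based and omitted here). The Euclidean distance is negated so that, as with cosine similarity, a larger value means more similar; this sign convention is an assumption, not specified by the text.

```python
import torch
import torch.nn.functional as F

def similarity(vec_a: torch.Tensor, vec_b: torch.Tensor, metric: str = "cosine"):
    """Compute SIM_AB / SIM_AC between batches of text vectors."""
    if metric == "cosine":
        return F.cosine_similarity(vec_a, vec_b, dim=-1)
    if metric == "euclidean":
        # Negate the distance so that a larger value means more similar.
        return -torch.norm(vec_a - vec_b, dim=-1)
    raise ValueError(f"unknown metric: {metric}")
```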
Finally, in step S350, network parameters of the word embedding layer and the text encoding layer are adjusted according to the first similarity and the second similarity. The method specifically comprises the following steps: calculating a loss function value according to the first similarity and the second similarity; and adjusting network parameters of the word embedding layer and the text coding layer by using a back propagation algorithm until the loss function value is lower than a preset value or the training times reach a preset number.
The loss function is a triplet loss; using it pulls the members of a positive sample pair closer together and pushes the members of a negative sample pair further apart. The loss function may be expressed as: Loss = Margin - (SIM_AB - SIM_AC). The network objective min(Loss) is optimized with a back propagation algorithm, so that the network learns parameters that bring target_addr closer to pos_addr in the semantic space and further from neg_addr.
Margin is a hyperparameter indicating that training must keep a certain gap between SIM_AB and SIM_AC so as to increase the discriminative power of the model; its value can be adjusted repeatedly according to the data and the actual task until the effect is optimal.
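One training step of this objective might look as follows, continuing the PyTorch sketch above. Clamping the loss at zero is a standard triplet-loss detail assumed here; the text itself gives only Loss = Margin - (SIM_AB - SIM_AC).

```python
import torch

def train_step(model, optimizer, target_ids, pos_ids, neg_ids, margin=0.2):
    """One back-propagation step on a batch of {target, pos, neg} triplets."""
    sim_ab = model(target_ids, pos_ids)   # similarity of the positive pair
    sim_ac = model(target_ids, neg_ids)   # similarity of the negative pair
    # Margin loss; clamping at zero ignores triplets already separated enough.
    loss = torch.clamp(margin - (sim_ab - sim_ac), min=0).mean()
    optimizer.zero_grad()
    loss.backward()        # adjusts embedding-layer and encoding-layer parameters
    optimizer.step()
    return loss.item()
```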
After the training process is completed, a similarity calculation model for calculating the similarity between two pieces of address text is finally obtained. Based on this similarity calculation model, the embodiment of the invention also provides an address text similarity determination method, comprising the following steps:
1) acquiring an address text pair whose similarity is to be determined;
2) inputting the address text pair into the trained address text similarity calculation model to output the similarity of the two address texts included in the pair.
In addition, the similarity calculation model can be applied in various scenarios where the similarity of address text needs to be calculated, such as address standardization in the fields of public security, express delivery, logistics, and electronic maps. In these scenarios, an address search service can be provided for the user using the address text similarity calculation model of the embodiment of the present invention.
FIG. 5 shows a flow diagram of an address search method 500 according to one embodiment of the invention. Referring to FIG. 5, the method 500 includes steps S510 to S530.
In step S510, one or more candidate address texts corresponding to the address text to be queried are obtained. In the address search service, a user inputs a query address text (query) through a user terminal; generally, the user's input is an incomplete and inaccurate address text. The user terminal sends the query to the computing device, and the address search apparatus in the computing device recalls a batch of candidate address texts, usually several to several thousand, by searching the standard address library.
In step S520, the address text to be queried and the candidate address texts are input to a preset address text similarity calculation model to obtain the similarity between them, where the address text similarity calculation model is obtained by training according to the method 300. In this step, the similarity between the address text to be queried and each candidate address text is calculated separately.
After the similarities between the address text to be queried and all candidate address texts are obtained, the method 500 proceeds to step S530. In step S530, the candidate address text with the maximum similarity is determined as the target address text corresponding to the address text to be queried, and the target address text is returned to the user.
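Steps S510 to S530 can be sketched as a simple ranking loop over the recalled candidates, reusing the model sketch above. Candidate recall and tokenisation into id tensors are assumed to happen elsewhere, and each tensor is assumed to hold a batch of one.

```python
import torch

def search(model, query_ids, candidate_ids_list):
    """Return the index and similarity of the candidate most similar to the query."""
    model.eval()
    with torch.no_grad():  # inference only: no parameter updates
        sims = [model(query_ids, cand_ids).item() for cand_ids in candidate_ids_list]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims[best]
```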
Fig. 6 is a schematic diagram of a training apparatus 600 for an address text similarity calculation model according to an embodiment of the present invention. The address text similarity calculation model includes a word embedding layer, a text encoding layer, and a similarity calculation layer, and the training apparatus 600 includes:
The obtaining module 610, adapted to obtain a training data set, where the training data set includes a plurality of pieces of training data, each piece of training data includes first, second, and third address texts, where the address elements of the first n levels of the first and second address texts are the same, the address elements of the first (n-1) levels of the first and third address texts are the same, and the address elements of the nth level are different. The obtaining module 610 is specifically configured to execute the method of step S310; for its processing logic and functions, reference may be made to the related description of step S310, which is not repeated here.
The word vector obtaining module 620, adapted to input the first, second, and third address texts of each piece of training data into the word embedding layer to obtain the corresponding first, second, and third word vector sets. The word vector obtaining module 620 is specifically configured to execute the method of step S320; for its processing logic and functions, reference may be made to the related description of step S320, which is not repeated here.
The text vector obtaining module 630, adapted to input the first, second, and third word vector sets into the text encoding layer to obtain the corresponding first, second, and third text vectors. The text vector obtaining module 630 is specifically configured to execute the method of step S330; for its processing logic and functions, reference may be made to the related description of step S330, which is not repeated here.
The second similarity calculation module 640, adapted to calculate the first similarity of the first and second text vectors and the second similarity of the first and third text vectors using the similarity calculation layer. The second similarity calculation module 640 is specifically configured to execute the method of step S340; for its processing logic and functions, reference may be made to the related description of step S340, which is not repeated here.
The parameter adjusting module 650, adapted to adjust the network parameters of the word embedding layer and the text encoding layer according to the first similarity and the second similarity. The parameter adjusting module 650 is specifically configured to execute the method of step S350; for its processing logic and functions, reference may be made to the related description of step S350, which is not repeated here.
Fig. 7 shows a schematic diagram of an address search apparatus 700 according to an embodiment of the present invention. Referring to fig. 7, the address search apparatus 700 includes:
the query module 710, adapted to obtain one or more candidate address texts corresponding to the address text to be queried;
the first similarity calculation module 720, adapted to input the address text to be queried and a candidate address text into a preset address text similarity calculation model to obtain the similarity between them, where the address text similarity calculation model is obtained by training with the training apparatus 600;
and the output module 730, adapted to determine the candidate address text with the maximum similarity as the target address text corresponding to the address text to be queried.
The various techniques described herein may be implemented in connection with hardware or software or, alternatively, with a combination of both. Thus, the methods and apparatus of the present invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as removable hard disks, USB flash drives, floppy disks, CD-ROMs, or any other machine-readable storage medium, wherein, when the program is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the invention.
In the case of program code execution on programmable computers, the computing device will generally include a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store program code; the processor is configured to execute the methods of the present invention according to instructions in the program code stored in the memory.
By way of example, and not limitation, readable media may comprise readable storage media and communication media. Readable storage media store information such as computer readable instructions, data structures, program modules or other data. Communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. Combinations of any of the above are also included within the scope of readable media.
In the description provided herein, algorithms and displays are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with examples of this invention. The required structure for constructing such a system will be apparent from the description above. Moreover, the present invention is not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.

Claims (13)

1. An address text similarity determination method, the address text including a plurality of address elements arranged from high to low in rank, the method comprising:
acquiring an address text pair with similarity to be determined;
inputting the address text pair into a preset address text similarity calculation model to output the similarity of two address texts included in the address text pair;
the address text similarity calculation model is obtained by training based on a training data set comprising a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, wherein the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair.
2. The method of claim 1, wherein the address text similarity calculation model includes a word embedding layer, a text encoding layer, and a similarity calculation layer, and the step of training the address text similarity calculation model includes:
inputting the first, second and third address texts of each piece of training data into a word embedding layer to obtain corresponding first, second and third word vector sets;
inputting the first, second and third word vector sets into a text coding layer to obtain corresponding first, second and third text vectors;
calculating a first similarity of the first text vector and the second text vector and a second similarity of the first text vector and the third text vector by using a similarity calculation layer;
and adjusting the network parameters of the address text similarity calculation model according to the first similarity and the second similarity.
3. The method of claim 2, wherein the network parameters comprise: parameters of a word embedding layer and/or parameters of a text encoding layer.
4. The method of claim 2, wherein each word vector set in the first, second, and third word vector sets comprises a plurality of word vectors, each word vector corresponding to an address element in the address text.
5. The method of claim 2, wherein the Word embedding layer employs a Glove model or a Word2Vec model.
6. The method of claim 2, wherein the first and second similarities comprise at least one of euclidean distance, cosine similarity, or Jaccard coefficients.
7. The method of claim 2, wherein said adjusting network parameters of said address text similarity calculation model according to said first and second similarity comprises:
calculating a loss function value according to the first similarity and the second similarity;
and adjusting the network parameters of the address text similarity calculation model by using a back propagation algorithm until the loss function value is lower than a preset value or the training times reach a preset number.
8. The method of claim 7, wherein the loss function value is:
Loss = Margin - (first similarity - second similarity)
where Loss is the loss function value, and Margin is a hyperparameter.
9. The method of claim 2, wherein the text encoding layer comprises at least one of an RNN model, a CNN model, or a DBN model.
10. An address search method, comprising:
acquiring one or more candidate address texts corresponding to the address text to be inquired;
inputting an address text to be inquired and a candidate address text into a preset address text similarity calculation model to obtain the similarity of the address text and the candidate address text, wherein the address text similarity calculation model is obtained by training based on a training data set comprising a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair;
and determining the candidate address text with the maximum similarity as the target address text corresponding to the address text to be inquired.
11. An address search apparatus, comprising:
the query module is suitable for acquiring one or more candidate address texts corresponding to the address texts to be queried;
the first similarity calculation module is suitable for inputting the address text to be inquired and the candidate address text into a preset address text similarity calculation model to obtain the similarity of the address text and the candidate address text, wherein the address text similarity calculation model is obtained by training a training data set comprising a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, wherein the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair;
and the output module is suitable for determining the candidate address text with the maximum similarity as the target address text corresponding to the address text to be inquired.
12. An apparatus for training an address text similarity calculation model, the address text including a plurality of address elements arranged from high to low in order, the address text similarity calculation model including a word embedding layer, a text encoding layer, and a similarity calculation layer, the apparatus comprising:
the training data set comprises a plurality of pieces of training data, each piece of training data at least comprises a first address text, a second address text and a third address text, wherein the first n levels of address elements of the first address text and the second address text are the same to form a positive sample pair, and the first (n-1) levels of address elements of the first address text and the third address text are the same and the nth level of address elements are different to form a negative sample pair;
the word vector acquisition module is suitable for inputting the first, second and third address texts of each piece of training data into the word embedding layer to obtain corresponding first, second and third word vector sets;
the text vector acquisition module is suitable for inputting the first, second and third word vector sets into a text coding layer to obtain corresponding first, second and third text vectors;
the second similarity calculation module is suitable for calculating the first similarities of the first text vector and the second similarities of the first text vector and the third text vector by utilizing the similarity calculation layer;
and the parameter adjusting module is suitable for adjusting the network parameters of the address text similarity calculation model according to the first similarity and the second similarity.
13. A computing device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-10.
CN201811375413.2A (filed 2018-11-19, priority 2018-11-19) | Address text similarity determining method and address searching method | Active | granted as CN111274811B (en)

Priority Applications (3)

Application Number | Priority Date | Filing Date | Title
CN201811375413.2A | 2018-11-19 | 2018-11-19 | Address text similarity determining method and address searching method (granted as CN111274811B)
TW108129457A | 2018-11-19 | 2019-08-19 | (published as TW202020688A)
PCT/CN2019/119149 | 2018-11-19 | 2019-11-18 | (published as WO2020103783A1)

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN201811375413.2A | 2018-11-19 | 2018-11-19 | Address text similarity determining method and address searching method (granted as CN111274811B)

Publications (2)

Publication Number | Publication Date
CN111274811A | 2020-06-12
CN111274811B (en) | 2023-04-18

Family

ID=70773096

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201811375413.2A (Active; CN111274811B, en) | Address text similarity determining method and address searching method | 2018-11-19 | 2018-11-19

Country Status (3)

Country | Link
CN (1) | CN111274811B (en)
TW (1) | TW202020688A (en)
WO (1) | WO2020103783A1 (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN111783419B (en)* | 2020-06-12 | 2024-02-27 | 上海东普信息科技有限公司 | Address similarity calculation method, device, equipment and storage medium
CN111753516B (en)* | 2020-06-29 | 2024-04-16 | 平安国际智慧城市科技股份有限公司 | Text duplicate-checking processing method and device, computer equipment and computer storage medium
CN111881677A (en)* | 2020-07-28 | 2020-11-03 | 武汉大学 | Address matching algorithm based on deep learning model
CN114254645B (en)* | 2020-09-22 | 2024-12-10 | 北京百灵互联科技有限公司 | Artificial intelligence-assisted writing system
CN112632406B (en)* | 2020-10-10 | 2024-04-09 | 咪咕文化科技有限公司 | Query method, device, electronic device and storage medium
CN113779370B (en)* | 2020-11-03 | 2023-09-26 | 北京京东振世信息技术有限公司 | Address retrieval method and device
KR20220098314A (en)* | 2020-12-31 | 2022-07-12 | 센스타임 인터내셔널 피티이. 리미티드. | Training method and apparatus for neural network and related object detection method and apparatus
CN113468881B (en)* | 2021-07-23 | 2024-02-27 | 浙江大华技术股份有限公司 | Address standardization method and device
CN113626730B (en)* | 2021-08-02 | 2024-12-03 | 同盾科技有限公司 | Similar address screening method, device, computing device and storage medium
CN115705360A (en)* | 2021-08-16 | 2023-02-17 | 鼎富智能科技有限公司 | Method and device for normalized matching of address text
CN114490923B (en)* | 2021-11-29 | 2025-02-14 | 腾讯科技(深圳)有限公司 | Training method, device, equipment and storage medium for similar text matching model
CN114154511A (en)* | 2021-12-09 | 2022-03-08 | 阳光保险集团股份有限公司 | Semantic similarity calculation and model training method, device, equipment and storage medium
CN114372094B (en)* | 2021-12-23 | 2025-02-11 | 中国电信股份有限公司 | Method and device for querying duplicate user addresses
CN114970525B (en)* | 2022-06-14 | 2023-06-27 | 城云科技(中国)有限公司 | Method, device and readable storage medium for recognizing text and events
CN115292619A (en)* | 2022-08-05 | 2022-11-04 | 江苏满运物流信息有限公司 | Address query method, system, device and storage medium
CN116150625B (en)* | 2023-03-08 | 2024-03-29 | 华院计算技术(上海)股份有限公司 | Training method and device for text search model and computing equipment
CN117725909B (en)* | 2024-02-18 | 2024-05-14 | 四川日报网络传媒发展有限公司 | Multi-dimensional comment auditing method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN106557574B (en)* | 2016-11-23 | 2020-02-04 | 广东电网有限责任公司佛山供电局 | Target address matching method and system based on tree structure
CN108804398A (en)* | 2017-05-03 | 2018-11-13 | 阿里巴巴集团控股有限公司 | Address text similarity calculation method and device

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20120323968A1 (en)* | 2011-06-14 | 2012-12-20 | Microsoft Corporation | Learning Discriminative Projections for Text Similarity Measures
WO2016127904A1 (en)* | 2015-02-13 | 2016-08-18 | 阿里巴巴集团控股有限公司 | Text address processing method and apparatus
CN105930413A (en)* | 2016-04-18 | 2016-09-07 | 北京百度网讯科技有限公司 | Training method for similarity model parameters, search processing method and corresponding apparatuses
CN107239442A (en)* | 2017-05-09 | 2017-10-10 | 北京京东金融科技控股有限公司 | Method and apparatus for calculating address similarity
CN107609461A (en)* | 2017-07-19 | 2018-01-19 | 阿里巴巴集团控股有限公司 | Model training method, and data similarity determination method, apparatus and device
CN108536657A (en)* | 2018-04-10 | 2018-09-14 | 百融金融信息服务股份有限公司 | Method and system for processing similarity of address text filled in by humans
CN108805583A (en)* | 2018-05-18 | 2018-11-13 | 连连银通电子支付有限公司 | E-commerce fraud detection method, device, equipment and medium based on address mapping

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
SEGLA KPODJEDO et al.: "Using local similarity measures to efficiently address approximate graph matching"*
宋子辉: "Chinese address matching algorithm based on natural language understanding"*
罗明; 黄海量: "A Chinese address standardization method based on finite state machines"*
郑爱武: "Research on a self-correcting model for electricity-consumption addresses based on address semantics and tree analysis"*

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112070429A (en)* | 2020-07-31 | 2020-12-11 | 深圳市跨越新科技有限公司 | Address merging method and system
CN112070429B (en)* | 2020-07-31 | 2024-03-15 | 深圳市跨越新科技有限公司 | Address merging method and system
CN112559658A (en)* | 2020-12-08 | 2021-03-26 | 中国科学技术大学 | Address matching method and device
CN112559658B (en)* | 2020-12-08 | 2022-12-30 | 中国科学技术大学 | Address matching method and device
CN112579919A (en)* | 2020-12-09 | 2021-03-30 | 小红书科技有限公司 | Data processing method and device and electronic equipment
CN112579919B (en)* | 2020-12-09 | 2023-04-21 | 小红书科技有限公司 | Data processing method and device and electronic equipment
CN114764482A (en)* | 2021-01-12 | 2022-07-19 | 阿里巴巴集团控股有限公司 | Position recommendation information obtaining method and device, electronic equipment and storage medium
CN113204612A (en)* | 2021-04-24 | 2021-08-03 | 上海赛可出行科技服务有限公司 | Ride-hailing similar address identification method based on prior knowledge
CN113204612B (en)* | 2021-04-24 | 2024-05-03 | 上海赛可出行科技服务有限公司 | Ride-hailing similar address identification method based on prior knowledge
CN114048797A (en)* | 2021-10-20 | 2022-02-15 | 盐城金堤科技有限公司 | Method, device, medium and electronic equipment for determining address similarity
CN114254139A (en)* | 2021-12-17 | 2022-03-29 | 北京百度网讯科技有限公司 | Data processing method, sample acquisition method, model training method and device
CN116306627A (en)* | 2023-02-09 | 2023-06-23 | 北京海致星图科技有限公司 | Multipath fusion address similarity calculation method, device, storage medium and equipment
CN115952779A (en)* | 2023-03-13 | 2023-04-11 | 中规院(北京)规划设计有限公司 | Position name calibration method and device, computer equipment and storage medium
CN115952779B (en)* | 2023-03-13 | 2023-09-29 | 中规院(北京)规划设计有限公司 | Position name calibration method and device, computer equipment and storage medium

Also Published As

Publication number | Publication date
WO2020103783A1 (en) | 2020-05-28
CN111274811B (en) | 2023-04-18
TW202020688A (en) | 2020-06-01

Similar Documents

Publication | Title
CN111274811B (en) | Address text similarity determining method and address searching method
CN106570148B (en) | Attribute extraction method based on convolutional neural networks
WO2023138188A1 (en) | Feature fusion model training method and apparatus, sample retrieval method and apparatus, and computer device
Luo et al. | Online learning of interpretable word embeddings
CN114357105B (en) | Pre-training method and model fine-tuning method for a geographic pre-training model
WO2020062770A1 (en) | Method and apparatus for constructing domain dictionary, and device and storage medium
CN104572631B (en) | Language model training method and system
CN113821588B (en) | Text processing method, device, electronic equipment and storage medium
JP2022169743A (en) | Information extraction method and device, electronic equipment, and storage medium
CN113449084A (en) | Relationship extraction method based on graph convolution
CN116991877B (en) | Method, device and application for generating structured query statements
CN116797195A (en) | Work order processing method, apparatus, computer device, and computer-readable storage medium
CN113868351A (en) | Address clustering method and device, electronic device and storage medium
CN116561319A (en) | Text clustering method, text clustering device and text clustering system
CN116245139B (en) | Training method and device for graph neural network models, and event detection method and device
CN117743549A (en) | Information query method, device and computer equipment
CN108229572B (en) | Parameter optimization method and computing device
CN117312325 (en) | Knowledge-distillation-based quantization index construction method, device and equipment
CN115129871B (en) | Text category determination method, apparatus, computer device and storage medium
CN120030146A (en) | Transformer fault root cause analysis method based on large-model iterative reasoning
CN114297235A (en) | Risk address identification method, system and electronic device
CN114842920A (en) | Molecular property prediction method, device, storage medium and electronic device
CN116956867A (en) | Text similarity calculation method, device, equipment and storage medium
CN116431788A (en) | Semantic retrieval method for cross-modal data
CN112417131A (en) | Information recommendation method and device

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
