CN118013023B - Scientific and technological literature recommendation method and device, electronic equipment and storage medium - Google Patents

Scientific and technological literature recommendation method and device, electronic equipment and storage medium
Info

Publication number
CN118013023B
Authority
CN
China
Prior art keywords
literature
document
node
vector
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410188813.1A
Other languages
Chinese (zh)
Other versions
CN118013023A (en)
Inventor
张运良
王莉军
李琳娜
郭晓琪
王力
高雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute Of Scientific And Technical Information Of China
Original Assignee
Institute Of Scientific And Technical Information Of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute Of Scientific And Technical Information Of China
Priority to CN202410188813.1A
Publication of CN118013023A
Application granted
Publication of CN118013023B
Active
Anticipated expiration

Abstract

The invention provides a scientific and technological literature recommendation method and device, an electronic device, and a storage medium, and belongs to the technical field of artificial intelligence. The method comprises: acquiring the text content of a plurality of scientific and technological documents, and performing information extraction on the text content of each document to obtain multi-type data; constructing a heterogeneous graph based on the multi-type data; inputting the heterogeneous graph into a target heterogeneous graph learning model to obtain a plurality of author embedding vectors and a plurality of first document embedding vectors output by the model; and performing literature recommendation based on the author embedding vectors and the first document embedding vectors to obtain a recommendation result for the authors of each document. By combining research direction data with traditional metadata into a heterogeneous graph built on fine-grained data, the method uses the finer-grained descriptive data as a hub for mining deep knowledge associations between documents and authors, thereby improving the accuracy of scientific and technological literature recommendation.

Description

Scientific and technological literature recommendation method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a scientific literature recommendation method, a device, an electronic device, and a storage medium.
Background
Knowledge in science and technology is continuously updated, and scientific and technological literature is one of its most direct information sources: through literature recommendation, researchers can learn of the latest research results, new findings, and technical developments. Literature recommendation therefore provides scholars with a broad and reliable source of information and helps them carry out research more systematically and effectively.
Existing scientific and technological literature recommendation methods use content filtering algorithms and recommendation algorithms based on network structure graphs. However, a content filtering algorithm only detects similarity between items that share attributes or features. It cannot recommend effectively for new documents, for documents that lack content features such as keywords, or for junior scholars with little research experience; that is, it faces the cold-start problem of recommendation. A recommendation algorithm based on a network structure graph can mine deep relations among entities, but largely fails to take content relevance and user relevance into account. As a result, existing methods yield low recommendation accuracy.
Disclosure of Invention
The invention provides a scientific and technological literature recommending method, a device, electronic equipment and a storage medium, which are used for solving the problem that in the prior art, the accuracy of scientific and technological literature recommendation is low.
In a first aspect, the present invention provides a scientific literature recommendation method, including:
Acquiring text contents of a plurality of technical documents, and extracting information based on the text contents of the technical documents to obtain multi-type data; the multi-type data includes literature metadata and literature study direction data; the document metadata includes at least author data and document data;
constructing a heterogeneous graph based on the multi-type data;
inputting the heterogeneous graph to a target heterogeneous graph learning model to obtain a plurality of author embedded vectors and a plurality of first document embedded vectors which are output by the target heterogeneous graph learning model; the target heterogeneous graph learning model is obtained by training based on multi-type sample data extracted from text contents of a plurality of sample documents;
and performing scientific literature recommendation based on each author embedding vector and each first document embedding vector to obtain the scientific literature recommendation result corresponding to the authors of each scientific document.
In one embodiment, the performing the scientific literature recommendation based on each author embedded vector and each first literature embedded vector to obtain a scientific literature recommendation result corresponding to each author of each scientific literature includes:
extracting text information characteristics of text contents of each scientific literature to obtain a plurality of second literature embedded vectors;
fusing the first document embedding vectors and the second document embedding vectors to obtain a plurality of target document embedding vectors;
and performing similarity calculation based on each target document embedding vector and each author embedding vector to obtain the scientific literature recommendation result corresponding to the authors of each scientific document.
In one embodiment, when a plurality of target document embedding vectors are obtained by fusing the first document embedding vectors and the second document embedding vectors, the following steps are performed for the first document embedding vector and the second document embedding vector corresponding to each scientific document:
Determining a first document weight value of a current first document embedding vector and a second document weight value of a current second document embedding vector;
performing product calculation on the current first document embedding vector and the first document weight value to obtain a first weighted document vector;
performing product calculation on the current second document embedding vector and the second document weight value to obtain a second weighted document vector;
and carrying out summation calculation on the first weighted literature vector and the second weighted literature vector to obtain a target literature embedded vector corresponding to the current technical literature.
In one embodiment, when similarity calculation is performed based on each target document embedding vector and each author embedding vector to obtain a scientific document recommendation result corresponding to an author of each scientific document, the following steps are performed for each author embedding vector corresponding to each scientific document:
respectively carrying out cosine similarity calculation on the current author embedded vector and each target document embedded vector to obtain a plurality of similarity values;
sorting the similarity values in a descending order to obtain a sorting result;
sequentially extracting a first preset number of similarity values, starting from the first similarity value in the sorting result;
and generating a scientific literature recommendation table for the author of the current scientific document according to the first preset number of similarity values.
In one embodiment, the target heterograph learning model is obtained by:
Acquiring text contents of a plurality of sample documents, and extracting information based on the text contents of the sample documents to obtain multi-type sample data;
constructing a sample heterogeneous map based on the multi-type sample data;
obtaining a domain category label of each document node in the sample heterogeneous graph;
performing feature conversion on each node data in the sample heterogeneous graph to obtain initial features of each node data;
determining a first document vector sample corresponding to each document node in the sample heterogeneous graph and a first author vector sample corresponding to each author node based on the meta-path view and each initial characteristic;
Determining a second document vector sample corresponding to each document node in the sample heterogeneous diagram and a second author vector sample corresponding to each author node based on the network mode view and each initial characteristic;
performing a weighted summation of a preset contrast loss function and a preset classification loss function to obtain a first objective function for each document node;
determining the preset contrast loss function as a second objective function for each author node;
And training the initial heterogeneous graph learning model based on the first literature vector sample, the second literature vector sample, the first objective function and the domain category label corresponding to each literature node, and the first author vector sample, the second author vector sample and the second objective function corresponding to each author node to obtain a target heterogeneous graph learning model.
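As a compact restatement of the two objective functions described above, the following uses symbols that do not appear in the original text: $\mathcal{L}_{\text{con}}$ for the preset contrast loss, $\mathcal{L}_{\text{cls}}$ for the preset classification loss, and $\alpha$, $\beta$ for the summation weights, whose values the source does not specify.

```latex
\mathcal{L}_{\text{doc}} = \alpha \, \mathcal{L}_{\text{con}} + \beta \, \mathcal{L}_{\text{cls}},
\qquad
\mathcal{L}_{\text{author}} = \mathcal{L}_{\text{con}}
```

In the steps above, domain category labels are obtained only for document nodes, which is consistent with the classification term appearing only in the first objective function.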
In one embodiment, when determining a first document vector sample corresponding to each document node in the sample heterogeneous graph based on the meta-path view and each initial feature, performing the following steps for each document node in the sample heterogeneous graph:
constructing a second preset number of meta-paths for the current document node;
aggregating, for each meta-path, the initial features of all nodes on that meta-path to obtain a path aggregation feature of the meta-path;
determining a path weight of each meta-path;
and performing embedding calculation based on the path aggregation feature and the path weight of each meta-path to obtain the first document vector sample of the current document node.
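The meta-path steps above can be sketched as follows. This is a minimal illustration and not the patented implementation: mean pooling as the aggregation, a softmax over per-path scores as the path weight, and the names `mean_pool`, `softmax`, and `meta_path_embedding` are all assumptions, since the text does not fix these choices.

```python
import math
from typing import List, Sequence


def mean_pool(features: List[Sequence[float]]) -> List[float]:
    """Aggregate the initial features of all nodes on one meta-path (assumed: mean)."""
    dim = len(features[0])
    return [sum(f[i] for f in features) / len(features) for i in range(dim)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def meta_path_embedding(
    path_features: List[List[Sequence[float]]],
    path_scores: Sequence[float],
) -> List[float]:
    """Weighted sum of per-path aggregation features for one document node."""
    aggregated = [mean_pool(features) for features in path_features]
    weights = softmax(path_scores)  # assumed form of the path weight
    dim = len(aggregated[0])
    return [
        sum(w * agg[i] for w, agg in zip(weights, aggregated))
        for i in range(dim)
    ]


# Two hypothetical meta-paths for one document node, with 2-dimensional features.
doc_author_doc = [[1.0, 0.0], [3.0, 2.0]]
doc_keyword_doc = [[0.0, 4.0]]
embedding = meta_path_embedding(
    [doc_author_doc, doc_keyword_doc], path_scores=[0.0, 0.0]
)
```

With equal path scores, both meta-paths contribute equally; in a trained model the scores would be learned rather than fixed.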
In one embodiment, when determining a second document vector sample corresponding to each document node in the sample heterogeneous graph based on the network mode view and each initial feature, performing the following steps for each document node in the sample heterogeneous graph:
Determining a plurality of neighbor nodes of the current document node;
dividing each neighbor node into a plurality of types of neighbor node combinations;
Respectively determining the attention value of each neighbor node in each neighbor node combination to the current document node;
Based on the attention value of each neighbor node in each neighbor node combination to the current document node, respectively aggregating initial features corresponding to all neighbor nodes in each neighbor node combination to obtain neighbor aggregation features of each neighbor node combination;
Determining the type weight of each neighbor node combination to the current document node;
And performing embedded calculation based on the neighbor aggregation characteristics and the type weights of the neighbor node combinations to obtain a second document vector sample of the current document node.
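The neighbor-based steps above can be sketched in the same style. This is only an illustration under stated assumptions: dot-product scores passed through a softmax stand in for the attention value, a softmax over per-type scores stands in for the type weight, and `network_view_embedding` is a hypothetical name.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    return sum(a * b for a, b in zip(u, v))


def network_view_embedding(
    target: Sequence[float],
    neighbors_by_type: Dict[str, List[Sequence[float]]],
    type_scores: Dict[str, float],
) -> List[float]:
    """Attention within each neighbor type, then a weighted sum across types."""
    dim = len(target)
    types = list(neighbors_by_type)
    type_weights = softmax([type_scores[t] for t in types])
    embedding = [0.0] * dim
    for node_type, type_weight in zip(types, type_weights):
        features = neighbors_by_type[node_type]
        # Assumed attention: softmax of dot products with the current document node.
        attention = softmax([dot(target, f) for f in features])
        aggregated = [
            sum(a * f[i] for a, f in zip(attention, features))
            for i in range(dim)
        ]
        for i in range(dim):
            embedding[i] += type_weight * aggregated[i]
    return embedding


# Hypothetical document node with two author neighbors and one keyword neighbor.
doc_feature = [1.0, 0.0]
neighbors = {"author": [[0.0, 2.0], [0.0, 4.0]], "keyword": [[2.0, 2.0]]}
second_vector = network_view_embedding(
    doc_feature, neighbors, {"author": 0.0, "keyword": 0.0}
)
```

Grouping neighbors by type before aggregating keeps a type with many neighbors, such as keywords, from drowning out a type with few, such as venues.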
In a second aspect, the present invention further provides a scientific literature recommending apparatus, including:
the information extraction module is used for acquiring text contents of a plurality of technical documents and extracting information based on the text contents of the technical documents to obtain multi-type data; the multi-type data includes literature metadata and literature study direction data; the document metadata includes at least author data and document data;
The construction module is used for constructing a heterogeneous graph based on the multi-type data;
The heterogeneous diagram learning module is used for inputting the heterogeneous diagram into a target heterogeneous diagram learning model to obtain a plurality of author embedded vectors and a plurality of first document embedded vectors which are output by the target heterogeneous diagram learning model; the target heterogeneous graph learning model is obtained by training based on multi-type sample data extracted from text contents of a plurality of sample documents;
and the recommending module is used for performing scientific literature recommendation based on each author embedding vector and each first document embedding vector to obtain the scientific literature recommendation result corresponding to the authors of each scientific document.
In a third aspect, the present invention provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of any of the scientific literature recommendation methods described above when the program is executed.
In a fourth aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the scientific literature recommendation methods described above.
According to the scientific and technological literature recommendation method and device, electronic device, and storage medium provided by the invention, finer-grained research direction data of the literature is combined with traditional metadata to construct a heterogeneous graph with fine-grained data. The heterogeneous graph is input into the target heterogeneous graph learning model to obtain the plurality of author embedding vectors and the plurality of first document embedding vectors output by the model. The finer-grained descriptive data thus serves as a hub for mining deep knowledge associations between documents and authors, which enables literature recommendation that is more personalized to each author and improves the accuracy of scientific and technological literature recommendation.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a scientific literature recommendation method provided by the invention;
FIG. 2 is a schematic structural diagram of a scientific literature recommending device provided by the invention;
Fig. 3 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first," "second," and the like in this specification are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that embodiments of the invention may be practiced otherwise than as specifically illustrated or described herein.
The following describes a scientific literature recommendation method, a device, an electronic device and a storage medium provided by the invention with reference to fig. 1-3.
Fig. 1 is a schematic flow chart of a scientific literature recommendation method provided by the invention.
As shown in fig. 1, the scientific literature recommendation method provided by the invention includes, but is not limited to, the following steps:
Step 100: acquiring text contents of a plurality of technical documents, and extracting information based on the text contents of the technical documents to obtain multi-type data;
step 200: constructing a heterogeneous graph based on the multi-type data;
Step 300: inputting the heterogeneous graph to a target heterogeneous graph learning model to obtain a plurality of author embedded vectors and a plurality of first document embedded vectors which are output by the target heterogeneous graph learning model;
Step 400: performing scientific literature recommendation based on each author embedding vector and each first document embedding vector to obtain the scientific literature recommendation result corresponding to the authors of each scientific document.
It should be noted that the scientific literature recommendation method provided by the embodiment of the invention is implemented by a scientific literature recommending device. Using the heterogeneous graph as a modeling tool, the method combines finer-grained descriptive data of the literature with traditional metadata to construct a heterogeneous graph structure with fine-grained data. Drawing on the idea of content filtering, it then fuses text information with graph structure information to obtain feature representations of document entities, and on that basis recommends scientific literature to the authors of the documents. The method is therefore described below with the scientific literature recommending device as the executing subject.
Specifically, the scientific literature recommending device acquires text contents of a plurality of scientific literature.
In the field of scientific and technological literature, scientific literature refers to materials such as books, journal articles, conference papers, and reports that record the processes, results, and theoretical analysis of scientific research and technical development. Typical scientific literature resources are DBLP (DataBase Systems and Logic Programming), an integrated database system of English-language computer science literature, and the Association for Computing Machinery (ACM). To better support recommendation, each scientific document also needs a corresponding domain category. For example, the scientific literature of the DBLP dataset is divided into four domain categories: databases, data mining, machine learning, and information retrieval; the scientific literature of the ACM dataset is divided into three domain categories: databases, wireless communication, and data mining.
Further, the scientific literature recommending device performs information extraction on the acquired text content of each scientific document to obtain multi-type data, which comprises literature metadata and literature research direction data. The literature metadata includes at least traditional entities such as document data, conference data, journal data, organization data, author data, title data, abstract data, and keyword data. The literature research direction data includes at least fine-grained labels such as research questions, method models, instruments and equipment, measurement indexes, data materials, and theoretical principles. That is, the literature metadata mainly describes the attributes and characteristics of a scientific document, while the literature research direction data mainly describes the content of its research direction and related data.
Since the document metadata and the document study direction data differ in nature and distribution characteristics, the extraction method of the document metadata and the extraction method of the document study direction data differ. Literature metadata is usually directly matched from text content by means of pattern matching and rule-based methods, while research direction data is more dependent on text mining and deep learning techniques and is extracted from title data and abstract data.
Further, the scientific and technological literature recommending device normalizes the multi-type data to obtain normalized data and constructs a heterogeneous graph based on the normalized data. Each node in the heterogeneous graph represents one item of the multi-type data, and the data of each node is the normalized content of that item. It should be noted that "multi-type data" means that there are several types of data and that each type contains multiple data items. The multi-type data in the embodiment of the invention therefore includes both data of the same type and data of different types; correspondingly, the nodes of the heterogeneous graph include both nodes of the same type and nodes of different types.
It should be noted that, the heterogeneous graph is a complex graph structure, which includes multiple types of nodes and edges to simulate the complex relationship between each node. The heterogeneous graph is constructed by integrating data from different sources, different naming rules and standards may be adopted by different data sources, and normalization processing can unify the data from different sources under a common standard, so that data integration is smoother, data collision and information loss caused by data inconsistency are avoided, and therefore, nodes and node data in the heterogeneous graph can be unique and understandable by performing normalization processing on the multi-type data.
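To make the role of normalization concrete, the sketch below builds a small heterogeneous graph as an adjacency map keyed by (node type, normalized value). It is an illustration only: the normalization rule (lower-casing and collapsing whitespace), the record fields, the sample records, and the names `normalize` and `build_hetero_graph` are assumptions, not details given in the text.

```python
from typing import Dict, List, Set, Tuple

Node = Tuple[str, str]  # (node type, normalized value)


def normalize(value: str) -> str:
    """Assumed rule: lower-case and collapse whitespace so variants share one node."""
    return " ".join(value.lower().split())


def build_hetero_graph(records: List[dict]) -> Dict[Node, Set[Node]]:
    """Link each document node to its author, keyword and research-question nodes."""
    graph: Dict[Node, Set[Node]] = {}
    for record in records:
        doc: Node = ("document", normalize(record["title"]))
        graph.setdefault(doc, set())
        for node_type in ("author", "keyword", "research_question"):
            for raw_value in record.get(node_type, []):
                other: Node = (node_type, normalize(raw_value))
                graph[doc].add(other)
                graph.setdefault(other, set()).add(doc)
    return graph


# Hypothetical records; the two spellings of the author name should merge.
records = [
    {
        "title": "Graph Learning for Recommendation",
        "author": ["Li  Hua"],
        "keyword": ["Heterogeneous Graph"],
    },
    {
        "title": "Contrastive Graph Embedding",
        "author": ["li hua"],
        "keyword": ["heterogeneous graph"],
        "research_question": ["cold start"],
    },
]
graph = build_hetero_graph(records)
author_nodes = [node for node in graph if node[0] == "author"]
```

Because both spellings normalize to the same key, the two documents end up connected through a single author node, which is the kind of uniqueness the normalization step is meant to provide.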
Further, the technical literature recommending device inputs the heterogeneous graph to a target heterogeneous graph learning model to obtain a plurality of author embedded vectors and a plurality of first literature embedded vectors which are output by the target heterogeneous graph learning model, wherein the target heterogeneous graph learning model is trained based on multi-type sample data extracted from text contents of a plurality of sample literatures.
Further, the scientific literature recommending device extracts text information features of text contents of each scientific literature to obtain a plurality of second literature embedding vectors.
Further, the scientific and technological literature recommending device recommends the scientific and technological literature based on the author embedded vectors, the first literature embedded vectors and the second literature embedded vectors, and obtains scientific and technological literature recommending results corresponding to authors of the scientific and technological literature.
According to the scientific and technological literature recommendation method provided by the invention, finer-grained research direction data of the literature is combined with traditional metadata to construct a heterogeneous graph with fine-grained data. The heterogeneous graph is input into the target heterogeneous graph learning model to obtain the plurality of author embedding vectors and the plurality of first document embedding vectors output by the model. The finer-grained descriptive data thus serves as a hub for mining deep knowledge associations between documents and authors, which enables literature recommendation that is more personalized to each author and improves the accuracy of scientific and technological literature recommendation.
Further, in step 400, performing scientific literature recommendation based on each author embedding vector and each first document embedding vector to obtain the scientific literature recommendation result corresponding to the authors of each scientific document includes:
extracting text information characteristics of text contents of each scientific literature to obtain a plurality of second literature embedded vectors;
fusing the first document embedding vectors and the second document embedding vectors to obtain a plurality of target document embedding vectors;
and performing similarity calculation based on each target document embedding vector and each author embedding vector to obtain the scientific literature recommendation result corresponding to the authors of each scientific document.
Specifically, the scientific literature recommending device extracts text information features of text contents of each scientific literature to obtain a plurality of second literature embedding vectors.
It should be noted that features are extracted from the text content of the scientific literature in order to supply semantic information that is missing from the heterogeneous graph structure; this information is added when the document embedding vectors are subsequently fused.
It should be further noted that, for text information feature extraction, the pre-trained language model SciBERT can be fine-tuned by contrastive learning on document titles and keywords, and the fine-tuned model is then used to produce vector representations of text information such as document titles and abstracts. The training and application of this model follow common practice for text information feature extraction and are not described in detail here.
Further, the scientific literature recommending device fuses the first literature embedding vectors and the second literature embedding vectors to obtain a plurality of target literature embedding vectors.
Further, the scientific and technological literature recommending device performs similarity calculation based on each target document embedding vector and each author embedding vector to obtain the scientific literature recommendation result corresponding to the authors of each scientific document.
According to the embodiment of the invention, the first document embedding vector output by the target heterogeneous graph learning model is fused with the second document embedding vector obtained by text information feature extraction to obtain the target document embedding vector. Similarity calculation is then performed between the target document embedding vectors and the author embedding vectors to obtain the scientific literature recommendation result for the authors of each scientific document. The finer-grained descriptive data thus serves as a hub for mining deep knowledge associations between documents and authors, which enables literature recommendation that is more personalized to each author and improves the accuracy of scientific and technological literature recommendation.
Further, when a plurality of target document embedding vectors are obtained by fusing the first document embedding vectors and the second document embedding vectors, the following steps are performed for the first document embedding vector and the second document embedding vector corresponding to each scientific document:
Determining a first document weight value of a current first document embedding vector and a second document weight value of a current second document embedding vector;
performing product calculation on the current first document embedding vector and the first document weight value to obtain a first weighted document vector;
performing product calculation on the current second document embedding vector and the second document weight value to obtain a second weighted document vector;
and carrying out summation calculation on the first weighted literature vector and the second weighted literature vector to obtain a target literature embedded vector corresponding to the current technical literature.
Specifically, the scientific literature recommending device determines a first literature weight value of a current first literature embedded vector and a second literature weight value of a current second literature embedded vector, wherein the first literature weight value and the second literature weight value are set according to actual conditions, and the first literature weight value and the second literature weight value can be the same.
Further, the scientific literature recommending device performs product calculation on the current first literature embedded vector and the first literature weight value to obtain a first weighted literature vector.
Further, the scientific literature recommending device performs product calculation on the current second literature embedded vector and the second literature weight value to obtain a second weighted literature vector.
Further, the technical literature recommending device performs summation calculation on the first weighted literature vector and the second weighted literature vector to obtain a target literature embedded vector corresponding to the current technical literature.
It should be noted that the same first document weight value is used for every first document embedding vector, and the same second document weight value is used for every second document embedding vector.
According to the embodiment of the invention, the first document embedding vector is weighted by the first document weight value, the second document embedding vector is weighted by the second document weight value, and the weighted vectors are summed to obtain the target document embedding vector. Fusing the two vectors in this way supplies semantic information that is missing from the heterogeneous graph structure, makes the semantic representation richer and more accurate, and helps reveal the deep semantic relations and knowledge structure among documents, which improves the accuracy and reliability of subsequent scientific and technological literature recommendation. Combining text-based content filtering with graph-based representation learning also helps address problems such as homogeneous recommendation results and cold start.
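The weighted fusion described above amounts to a per-dimension weighted sum, sketched below. The function name `fuse_embeddings`, the default weights of 0.5, and the sample vectors are assumptions for illustration; the text only says that the weights are set according to actual conditions and may be equal.

```python
from typing import List, Sequence


def fuse_embeddings(
    first_vec: Sequence[float],
    second_vec: Sequence[float],
    first_weight: float = 0.5,   # assumed default; set according to actual conditions
    second_weight: float = 0.5,  # assumed default; set according to actual conditions
) -> List[float]:
    """Return first_weight * first_vec + second_weight * second_vec."""
    if len(first_vec) != len(second_vec):
        raise ValueError("embedding vectors must have the same dimension")
    return [
        first_weight * a + second_weight * b
        for a, b in zip(first_vec, second_vec)
    ]


# Hypothetical graph-based (first) and text-based (second) embeddings of one document.
graph_vec = [1.0, 0.0, 2.0]
text_vec = [0.0, 4.0, 2.0]
target_vec = fuse_embeddings(graph_vec, text_vec)
```

The sketch assumes both embeddings already share a dimension; if the graph model and the text model produce vectors of different sizes, a projection step would be needed first, which the text does not describe.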
Further, when similarity calculation is performed based on each target document embedding vector and each author embedding vector to obtain a scientific document recommendation result corresponding to each author of each scientific document, the following steps are performed for each author embedding vector corresponding to each scientific document:
respectively carrying out cosine similarity calculation on the current author embedded vector and each target document embedded vector to obtain a plurality of similarity values;
sorting the similarity values in a descending order to obtain a sorting result;
extracting a first preset number of similarity values from the sorting result, starting from the first similarity value;
And generating a scientific and technological literature recommendation table corresponding to the author of the current scientific and technological literature according to the first preset number of similarity values.
Specifically, the scientific literature recommending device respectively performs cosine similarity calculation on the current author embedded vector and each target literature embedded vector to obtain a plurality of similarity values.
Further, the scientific literature recommending device performs descending order sorting on the similarity values to obtain sorting results.
Further, the scientific and technological literature recommending device extracts a first preset number of similarity values from the sorting result, starting from the first similarity value; that is, it extracts the top N similarity values, where the first preset number N is set according to actual conditions.
Further, the scientific and technological literature recommending device generates a scientific and technological literature recommendation table corresponding to the author of the current scientific and technological literature according to the first preset number of similarity values.
In this embodiment of the invention, similarity is calculated between each author embedding vector and each target document embedding vector to obtain a plurality of similarity values; the similarity values are sorted in descending order to obtain a sorting result; a first preset number of similarity values are extracted from the sorting result; and a scientific and technological literature recommendation table corresponding to each author is generated from them. The finer-grained descriptive data thus serves as a hub for mining deep knowledge associations between documents and authors, which enables document recommendation that is more personalized to each author and improves the accuracy of scientific and technological literature recommendation.
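The ranking steps above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the document identifiers, the two-dimensional vectors, and the choice to score a zero vector as 0.0 are all hypothetical and are not taken from the original.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar
    return dot / (norm_u * norm_v)


def recommend(author_vec, doc_vecs, top_n):
    """Return the top_n (doc_id, similarity) pairs in descending order."""
    scored = [(doc_id, cosine_similarity(author_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical target document embeddings and one author embedding.
docs = {"doc_a": [1.0, 0.0], "doc_b": [0.0, 1.0], "doc_c": [1.0, 1.0]}
table = recommend([1.0, 0.2], docs, top_n=2)
```

Here `top_n` plays the role of the first preset number, and `table` corresponds to the recommendation table generated for one author.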
Further, the target heterogeneous graph learning model is obtained through the following steps:
Acquiring text contents of a plurality of sample documents, and extracting information based on the text contents of the sample documents to obtain multi-type sample data;
constructing a sample heterogeneous map based on the multi-type sample data;
obtaining a domain category label of each document node in the sample heterogeneous graph;
performing feature conversion on each node data in the sample heterogeneous graph to obtain initial features of each node data;
determining a first document vector sample corresponding to each document node in the sample heterogeneous graph and a first author vector sample corresponding to each author node based on the meta-path view and each initial characteristic;
Determining a second document vector sample corresponding to each document node in the sample heterogeneous diagram and a second author vector sample corresponding to each author node based on the network mode view and each initial characteristic;
carrying out weighted summation calculation based on a preset contrast loss function and a preset classification loss function to obtain a first objective function of each document node;
Determining a preset contrast loss function as a second objective function of each author node;
And training the initial heterogeneous graph learning model based on the first literature vector sample, the second literature vector sample, the first objective function and the domain category label corresponding to each literature node, and the first author vector sample, the second author vector sample and the second objective function corresponding to each author node to obtain a target heterogeneous graph learning model.
Specifically, the scientific and technological literature recommending device acquires text contents of a plurality of sample literatures, and performs information extraction based on the text contents of each sample literature to obtain multi-type sample data, wherein the multi-type sample data comprises a literature metadata sample and a literature research direction data sample, and the literature metadata sample at least comprises author sample data and literature sample data.
Further, the scientific literature recommending device constructs a sample heterogeneous map based on the multi-type sample data.
Further, the scientific and technological literature recommending device acquires the domain category labels of all literature nodes in the sample heterogeneous graph.
Further, the scientific and technological literature recommending device performs feature conversion on each node data in the sample heterogeneous graph to obtain initial features of each node data.
In one embodiment, the initial feature of each node is computed by a formula in which $i$ denotes node $i$; $\phi_i$ denotes the type to which node $i$ belongs; $\sigma(\cdot)$ denotes the activation function; $W_{\phi_i}$ denotes the mapping matrix of the type to which node $i$ belongs, that is, one mapping matrix is preset for each node type and can be determined by optimizing a preset criterion such as minimizing the projection error or maximizing the inter-class difference; $b_{\phi_i}$ denotes the bias term of the type to which node $i$ belongs; and $h_i \in \mathbb{R}^{d \times 1}$ denotes the initial feature (which can be regarded as the projected feature) of node $i$.
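The text lists the symbols of this formula but not the expression itself. One form consistent with the listed symbols is a type-specific linear projection followed by an activation. It is offered as an assumption, and the raw node feature $x_i$ is a symbol introduced here for illustration:

```latex
h_i = \sigma\left( W_{\phi_i}\, x_i + b_{\phi_i} \right)
```

Under this reading, nodes of different types, whose raw features may have different dimensions, are all projected into the same $d$-dimensional space.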
Further, the scientific and technological literature recommending device determines a first literature vector sample corresponding to each literature node in the sample heterogeneous graph and a first author vector sample corresponding to each author node based on the meta path view and each initial feature.
Further, the scientific and technological literature recommending device determines second literature vector samples corresponding to each literature node in the sample heterogeneous graph and second author vector samples corresponding to each author node based on the network mode view and each initial feature.
It should be noted that, during model training, a preset contrast loss function may be applied to the first document vector sample determined from the meta-path view and the second document vector sample determined from the network mode view, so that contrastive learning is performed on the document vectors generated by the two views. Likewise, a preset contrast loss function may be applied to the first author vector sample determined from the meta-path view and the second author vector sample determined from the network mode view, so that contrastive learning is performed on the author vectors generated by the two views.
Further, the scientific literature recommending device performs a weighted summation of a preset contrast loss function and a preset classification loss function to obtain the first objective function of each document node, wherein the two loss functions and their respective weight values are set according to actual conditions. It should be noted that the objective function used to optimize the prediction performance of the model for each document node is this first objective function, which fuses the preset contrast loss function and the preset classification loss function; superimposing the two loss terms makes the vector representation better suited to the node classification task.
Further, the scientific literature recommending device determines a preset contrast loss function as a second objective function of each author node. It should be noted that, the objective function of each author node for optimizing the model prediction performance is a preset contrast loss function, that is, the second objective function, and since the author nodes are not classified, there is no need to superimpose the classification loss part.
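Written out, the two objective functions might take the following form. The symbols $\mathcal{L}_{\text{con}}$ and $\mathcal{L}_{\text{cls}}$ for the preset contrast and classification losses, and the weight values $\lambda_1$ and $\lambda_2$, are notation assumed here for illustration and do not appear in the original:

```latex
\mathcal{L}_{\text{doc}} = \lambda_1\, \mathcal{L}_{\text{con}} + \lambda_2\, \mathcal{L}_{\text{cls}},
\qquad
\mathcal{L}_{\text{author}} = \mathcal{L}_{\text{con}}
```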
Further, the scientific literature recommending device trains an initial heterogeneous graph learning model based on the first literature vector sample, the second literature vector sample, the first objective function and the domain category label corresponding to each literature node, and the first author vector sample, the second author vector sample and the second objective function corresponding to each author node to obtain a target heterogeneous graph learning model.
It should be noted that, since the first objective function of the document nodes differs from the second objective function of the author nodes, the vector samples of the document nodes and the vector samples of the author nodes are not trained in the same way. The initial heterogeneous graph learning model can therefore be regarded as containing two sub-models, one trained on the vector samples of the document nodes and the other on the vector samples of the author nodes. Because both sets of vector samples are trained within the same overall initial heterogeneous graph learning model, the author embedding vectors and document embedding vectors output by the trained target heterogeneous graph learning model in actual application have the same dimension, which facilitates the subsequent similarity calculation.
It should be further noted that contrastive learning over the vectors generated by the two views aims to make the vector representation based on the network mode view and the vector representation based on the meta-path view tend to agree for the same document node or the same author node, while also making document nodes of the same type, or author nodes of the same type, more similar in their vector representations. In actual application, therefore, the vectors generated by the two views perform similarly, and because the network mode view is more general, the author embedding vectors and document embedding vectors output by the target heterogeneous graph learning model are determined based on the network mode view.
In this embodiment of the invention, a sample heterogeneous graph is constructed based on the text contents of a plurality of sample documents, and feature conversion is performed on each node in the sample heterogeneous graph to obtain its initial feature. A first document vector sample and a first author vector sample are then generated through the meta-path view, and a second document vector sample and a second author vector sample are generated through the network mode view. The first objective function of the document nodes and the second objective function of the author nodes are determined, and the initial heterogeneous graph learning model is trained on the vector samples with their corresponding objective functions to obtain the target heterogeneous graph learning model. After training, the document vectors or author vectors produced by the two views tend to be consistent for the same node, and nodes of the same type are more similar to one another, so that embedded representation learning of the document nodes and author nodes in the heterogeneous graph is realized and the generalization performance of the model is improved.
Further, when determining a first document vector sample corresponding to each document node in the sample heterogeneous graph based on the meta-path view and each initial feature, performing the following steps for each document node in the sample heterogeneous graph:
Constructing a second preset number of meta paths for the current document node;
Aggregating, for each meta-path, the initial features corresponding to all nodes on that meta-path to obtain the path aggregation feature of each meta-path;
determining the path weight of each meta-path;
and performing embedding calculation based on the path aggregation feature and the path weight of each meta-path to obtain a first document vector sample of the current document node.
A meta-path is a specific path in the network mode $T_G = (A, R)$ and may be denoted as $A_1 \xrightarrow{R_1} A_2 \xrightarrow{R_2} \cdots \xrightarrow{R_l} A_{l+1}$. The meta-path defines a composite relation $R = R_1 \circ R_2 \circ \cdots \circ R_l$ from entity type $A_1$ to entity type $A_{l+1}$, where $\circ$ denotes the composition of relations. Meta-paths capture higher-order relationships between nodes in the heterogeneous graph, and different meta-paths describe semantic relationships from different angles. For example, in an academic heterogeneous graph, the meta-path "author-paper-author" indicates that authors collaborated on a paper, i.e., a co-author relationship, and "author-paper-conference-paper-author" indicates that authors published papers at the same conference, i.e., a co-conference relationship. Both can be used to represent relationships between authors, but they characterize different semantic features: the former emphasizes authors who co-write many papers with other authors, and the latter emphasizes authors who publish many papers at well-known conferences.
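The co-author example can be made concrete with a short sketch that follows the "author-paper-author" meta-path over a toy graph. The author names, paper identifiers, edge list, and the helper `apa_neighbors` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical "writes" edges between author nodes and paper nodes.
writes = [("alice", "p1"), ("bob", "p1"), ("bob", "p2"), ("carol", "p2")]

papers_of = defaultdict(set)   # author -> papers written
authors_of = defaultdict(set)  # paper  -> authors who wrote it
for author, paper in writes:
    papers_of[author].add(paper)
    authors_of[paper].add(author)


def apa_neighbors(author):
    """Authors reachable from `author` along author-paper-author."""
    found = set()
    for paper in papers_of[author]:
        found |= authors_of[paper]
    found.discard(author)  # an author is not their own co-author
    return found
```

In this toy graph `bob` reaches both `alice` and `carol`, while `alice` reaches only `bob`. A longer meta-path such as "author-paper-conference-paper-author" would be followed in the same way with one more pair of lookups.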
It should be further noted that, when the analysis is performed in the meta-path view, the embedding representation process for the document node and the embedding representation process for the author node are consistent, so only the embedding representation process for the document node will be described below, and the embedding representation process for the author node will not be described in detail.
Specifically, the scientific literature recommending device constructs a second preset number of meta paths for the current literature node, wherein the second preset number is set according to actual conditions.
Further, the scientific and technological literature recommending device aggregates, for each meta-path, the initial features corresponding to all nodes on that meta-path to obtain the path aggregation feature of each meta-path.
In one embodiment, the path aggregation feature of each meta-path is computed by a formula in which $i$ denotes document node $i$; $\Phi_n$ denotes a meta-path; $N_i^{\Phi_n}$ denotes all nodes on the meta-path $\Phi_n$; $j$ denotes a node $j$ on the meta-path $\Phi_n$; $h_i$ denotes the initial feature of document node $i$; $h_j$ denotes the initial feature of node $j$; $d_i$ denotes the degree of document node $i$; $d_j$ denotes the degree of node $j$; and $h_i^{\Phi_n}$ denotes the path aggregation feature of the meta-path $\Phi_n$ corresponding to document node $i$.
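The symbols listed for this formula (the two initial features and the two node degrees) are consistent with a degree-normalized aggregation of the kind used in graph convolutional networks. The following form is an assumption offered for illustration, since the expression itself is not given in the text; $N_i^{\Phi_n}$ is used here for the node set on the meta-path and $h_i^{\Phi_n}$ for the resulting path aggregation feature:

```latex
h_i^{\Phi_n} = \frac{1}{d_i + 1}\, h_i
  + \sum_{j \in N_i^{\Phi_n}} \frac{1}{\sqrt{(d_i + 1)(d_j + 1)}}\, h_j
```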
Further, the scientific and technological literature recommending device determines the path weight of each meta-path, and then performs embedding calculation based on the path aggregation feature and the path weight of each meta-path to obtain a first document vector sample of the current document node.
In one embodiment, the path weight of each meta-path is computed by a formula in which $n$ denotes the number of meta-paths; $\Phi_n$ denotes a meta-path; $w_{\Phi_n}$ denotes the weight matrix of the meta-path; and $\beta_{\Phi_n}$ denotes the path weight of the meta-path.
Further, the weight matrix of the meta-path is computed by a formula in which $V$ denotes the set of document nodes; $h_i^{\Phi_n}$ denotes the path aggregation feature of the meta-path; $W_{mp} \in \mathbb{R}^{d \times d}$ and $b_{mp} \in \mathbb{R}^{d \times 1}$ denote learnable parameters; $a_{mp}$ denotes the semantic-level attention vector, which can be obtained dynamically by computing over and weighting the input information; and $w_{\Phi_n}$ denotes the weight matrix of the meta-path.
Further, the first document vector sample of the document node is computed by a formula in which $i$ denotes document node $i$; $n$ denotes the number of meta-paths; $h_i^{\Phi_n}$ denotes the path aggregation feature of the meta-path; and $\beta_{\Phi_n}$ denotes the path weight of the meta-path.
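The three expressions are not given in the text. One set of forms consistent with the listed symbols is semantic-level attention followed by softmax normalization and a weighted sum. The $\tanh$ nonlinearity, the softmax, the running index $k$, and the output symbol $z_i^{mp}$ are assumptions made here for illustration:

```latex
w_{\Phi_n} = \frac{1}{|V|} \sum_{i \in V} a_{mp}^{\top}
  \tanh\left( W_{mp}\, h_i^{\Phi_n} + b_{mp} \right),
\qquad
\beta_{\Phi_n} = \frac{\exp\left( w_{\Phi_n} \right)}{\sum_{k=1}^{n} \exp\left( w_{\Phi_k} \right)},
\qquad
z_i^{mp} = \sum_{k=1}^{n} \beta_{\Phi_k}\, h_i^{\Phi_k}
```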
In this embodiment of the invention, the analysis is based on the meta-path view: the initial features corresponding to all nodes on each meta-path are aggregated to obtain the path aggregation feature of each meta-path, and the embedding calculation then combines these features with the path weight of each meta-path to obtain the first document vector sample of the document node. Performing feature aggregation and path weight calculation over multiple meta-paths produces richer and more accurate node embedding vectors, which improves the expressive capacity and effect of heterogeneous graph learning.
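The meta-path view can be sketched end to end on toy data. The sketch below makes several assumptions that are not confirmed by the original: aggregation is degree-normalized in the style of a graph convolution, path weights are a softmax over per-path scores, and those scores are fixed numbers here, whereas in a trained model they would come from the learned semantic-level attention. All names and values are hypothetical.

```python
import math


def aggregate(h, degree, i, neighbors):
    """Degree-normalized sum of node i's feature and its meta-path neighbors."""
    d_i = degree[i]
    out = [x / (d_i + 1) for x in h[i]]
    for j in neighbors:
        scale = 1.0 / math.sqrt((d_i + 1) * (degree[j] + 1))
        out = [o + scale * x for o, x in zip(out, h[j])]
    return out


def softmax(scores):
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def combine(path_features, path_scores):
    """Weighted sum of per-path features using softmax-normalized scores."""
    betas = softmax(path_scores)
    dim = len(path_features[0])
    return [sum(b * f[k] for b, f in zip(betas, path_features))
            for k in range(dim)]


# Hypothetical initial features and degrees for three document nodes.
h = {"d1": [1.0, 0.0], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
degree = {"d1": 1, "d2": 1, "d3": 1}
# Neighbors of d1 under two hypothetical meta-paths.
neighbors_by_path = [["d2"], ["d3"]]

features = [aggregate(h, degree, "d1", nb) for nb in neighbors_by_path]
z_d1 = combine(features, [0.0, 0.0])  # equal scores give equal path weights
```

With equal scores the two path features `[0.5, 0.5]` and `[1.0, 0.5]` are simply averaged, giving `[0.75, 0.5]` for node `d1`.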
Further, when determining a second document vector sample corresponding to each document node in the sample heterogeneous graph based on the network mode view and each initial feature, performing the following steps for each document node in the sample heterogeneous graph:
Determining a plurality of neighbor nodes of the current document node;
dividing each neighbor node into a plurality of types of neighbor node combinations;
Respectively determining the attention value of each neighbor node in each neighbor node combination to the current document node;
Based on the attention value of each neighbor node in each neighbor node combination to the current document node, respectively aggregating initial features corresponding to all neighbor nodes in each neighbor node combination to obtain neighbor aggregation features of each neighbor node combination;
Determining the type weight of each neighbor node combination to the current document node;
And performing embedded calculation based on the neighbor aggregation characteristics and the type weights of the neighbor node combinations to obtain a second document vector sample of the current document node.
The network mode view is from the graph structure itself, and considers the direct connection relationship between nodes and the global graph topology feature. Such views focus mainly on direct connections and path information between nodes, aimed at capturing neighborhood relationships and global positions of nodes on the graph structure, often used to represent structured features between nodes.
It should be further noted that, when analysis is performed in the network mode view, the embedding representation process for the document node and the embedding representation process for the author node are identical, and therefore, only the embedding representation process for the document node will be described below.
Specifically, the scientific literature recommending device determines a plurality of neighbor nodes of the current literature node.
It should be noted that, in the network mode view, a document node is connected to neighbor nodes of multiple types and may be connected to several neighbor nodes of the same type. Neighbor nodes of different types contribute differently to the embedding of the document node, and so do different neighbor nodes of the same type; attention mechanisms at both the node level and the type level can therefore be used to aggregate the neighbor nodes of each type.
Further, the scientific literature recommending device divides each neighbor node into a plurality of types of neighbor node combinations.
Further, the scientific literature recommending device respectively determines the attention value of each neighbor node in each neighbor node combination to the current literature node.
In one embodiment, the attention value of each neighbor node in a $\Phi_m$-type neighbor node combination to the current document node is computed by a formula in which $\exp(\cdot)$ denotes the exponential function; $\mathrm{LeakyReLU}(\cdot)$ denotes the activation function; $i$ denotes document node $i$; $N_i^{\Phi_m}$ denotes the $\Phi_m$-type neighbor node combination; $j$ denotes a neighbor node $j$ in the $\Phi_m$-type neighbor node combination $N_i^{\Phi_m}$; $a_{\Phi_m}$ denotes the node-level attention vector of type $\Phi_m$, which can be obtained dynamically by computing over and weighting the input information; $h_i$ denotes the initial feature of document node $i$; $h_j$ denotes the initial feature of neighbor node $j$; $\|$ denotes the concatenation operation; and $\alpha_{i,j}^{\Phi_m}$ denotes the attention value of neighbor node $j$ to document node $i$ in the $\Phi_m$-type neighbor node combination.
Further, the scientific and technological literature recommending device respectively aggregates initial features corresponding to all neighbor nodes in each neighbor node combination based on the attention value of each neighbor node in each neighbor node combination to the current literature node, and obtains neighbor aggregation features of each neighbor node combination.
In one embodiment, the neighbor aggregation feature of each neighbor node combination is computed by a formula in which $\sigma(\cdot)$ denotes the activation function; $i$ denotes document node $i$; $N_i^{\Phi_m}$ denotes the $\Phi_m$-type neighbor node combination; $j$ denotes a neighbor node $j$ in the $\Phi_m$-type neighbor node combination $N_i^{\Phi_m}$; $h_j$ denotes the initial feature of neighbor node $j$; $\alpha_{i,j}^{\Phi_m}$ denotes the attention value of neighbor node $j$ to document node $i$ in the $\Phi_m$-type neighbor node combination; and $h_i^{\Phi_m}$ denotes the neighbor aggregation feature of the $\Phi_m$-type neighbor node combination.
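The listed symbols (an exponential, LeakyReLU, an attention vector applied to concatenated features, and an activation over an attention-weighted sum) match the usual form of node-level graph attention. The expressions below are a reconstruction under that assumption rather than a reproduction of the original formulas; the running index $l$ and the symbols $\alpha_{i,j}^{\Phi_m}$, $N_i^{\Phi_m}$ and $h_i^{\Phi_m}$ are notation assumed here:

```latex
\alpha_{i,j}^{\Phi_m} =
  \frac{\exp\left( \mathrm{LeakyReLU}\left( a_{\Phi_m}^{\top} \left[ h_i \,\|\, h_j \right] \right) \right)}
       {\sum_{l \in N_i^{\Phi_m}} \exp\left( \mathrm{LeakyReLU}\left( a_{\Phi_m}^{\top} \left[ h_i \,\|\, h_l \right] \right) \right)},
\qquad
h_i^{\Phi_m} = \sigma\left( \sum_{j \in N_i^{\Phi_m}} \alpha_{i,j}^{\Phi_m}\, h_j \right)
```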
Further, the scientific and technological literature recommending device determines the type weight of each neighbor node combination to the current literature node, and further, the scientific and technological literature recommending device performs embedded calculation based on the neighbor aggregation characteristics and the type weight of each neighbor node combination to obtain a second literature vector sample of the current literature node.
In one embodiment, the type weight of each neighbor node combination to the current document node is computed by a formula in which $s$ denotes the number of types; $\Phi_m$ denotes a neighbor node combination; $w_{\Phi_m}$ denotes the weight matrix of the neighbor node combination; and $\beta_{\Phi_m}$ denotes the type weight of the neighbor node combination to the current document node.
Further, the weight matrix of the neighbor node combination is computed by a formula in which $V$ denotes the set of document nodes; $h_i^{\Phi_m}$ denotes the neighbor aggregation feature of the neighbor node combination; $W_{sc} \in \mathbb{R}^{d \times d}$ and $b_{sc} \in \mathbb{R}^{d \times 1}$ denote learnable parameters; $a_{sc}$ denotes the type-level attention vector, which can be obtained dynamically by computing over and weighting the input information; and $w_{\Phi_m}$ denotes the weight matrix of the neighbor node combination.
Further, the second document vector sample of the document node is computed by a formula in which $i$ denotes document node $i$; $s$ denotes the number of types; $h_i^{\Phi_m}$ denotes the neighbor aggregation feature of the neighbor node combination; and $\beta_{\Phi_m}$ denotes the type weight of the neighbor node combination to the current document node.
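As with the meta-path view, the expressions themselves are not given in the text. One set of forms consistent with the listed symbols is type-level attention, softmax normalization over the $s$ types, and a weighted sum. The $\tanh$ nonlinearity, the softmax, the running index $k$, and the output symbol $z_i^{sc}$ are assumptions made here for illustration:

```latex
w_{\Phi_m} = \frac{1}{|V|} \sum_{i \in V} a_{sc}^{\top}
  \tanh\left( W_{sc}\, h_i^{\Phi_m} + b_{sc} \right),
\qquad
\beta_{\Phi_m} = \frac{\exp\left( w_{\Phi_m} \right)}{\sum_{k=1}^{s} \exp\left( w_{\Phi_k} \right)},
\qquad
z_i^{sc} = \sum_{k=1}^{s} \beta_{\Phi_k}\, h_i^{\Phi_k}
```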
In this embodiment of the invention, the analysis is based on the network mode view: the initial features corresponding to all neighbor nodes in each neighbor node combination are aggregated according to the attention value of each neighbor node to the current document node, giving the neighbor aggregation feature of each neighbor node combination; the embedding calculation then combines these features with the type weight of each neighbor node combination to the current document node to obtain the second document vector sample of the document node. This effectively captures the complex relations in a heterogeneous information network and produces high-quality embedding representations efficiently, which improves the expressive capacity and effect of heterogeneous graph learning.
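The node-level attention step for one neighbor type can be sketched as follows. This is an illustration under stated assumptions: the attention vector `a` is a fixed hypothetical value rather than a learned parameter, the LeakyReLU slope of 0.2 is a common default rather than a value from the original, and `tanh` stands in for the unspecified activation function.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def node_attention(h_i, neighbor_feats, a):
    """Softmax over LeakyReLU(a . [h_i || h_j]) for each neighbor j."""
    scores = [
        leaky_relu(sum(w * x for w, x in zip(a, h_i + h_j)))  # h_i + h_j concatenates lists
        for h_j in neighbor_feats
    ]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_type(h_i, neighbor_feats, a):
    """Attention-weighted sum of same-type neighbor features, then activation."""
    alphas = node_attention(h_i, neighbor_feats, a)
    dim = len(neighbor_feats[0])
    summed = [sum(al * h_j[k] for al, h_j in zip(alphas, neighbor_feats))
              for k in range(dim)]
    return [math.tanh(v) for v in summed]  # assumption: tanh as the activation


h_doc = [1.0, 0.0]                      # hypothetical document node feature
author_feats = [[1.0, 0.0], [0.0, 1.0]]  # hypothetical author-type neighbors
a_author = [0.0, 0.0, 1.0, 0.0]          # hypothetical attention vector (length 2d)

alphas = node_attention(h_doc, author_feats, a_author)
h_doc_author = aggregate_type(h_doc, author_feats, a_author)
```

Repeating `aggregate_type` for each neighbor type and combining the results with type weights would give the second document vector sample for the node.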
Further, the invention also provides a scientific and technological literature recommending device.
Referring to fig. 2, fig. 2 is a schematic structural diagram of a scientific literature recommending device provided by the invention.
The scientific literature recommending device comprises:
The information extraction module 210 is configured to obtain text contents of a plurality of technical documents, and extract information based on the text contents of each technical document to obtain multi-type data; the multi-type data includes literature metadata and literature study direction data; the document metadata includes at least author data and document data;
A construction module 220, configured to construct a heterogeneous graph based on the multi-type data;
The heterogeneous graph learning module 230 is configured to input the heterogeneous graph to a target heterogeneous graph learning model, and obtain a plurality of author embedded vectors and a plurality of first document embedded vectors output by the target heterogeneous graph learning model; the target heterogeneous graph learning model is obtained by training based on multi-type sample data extracted from text contents of a plurality of sample documents;
And the recommending module 240 is configured to recommend technical documents based on the author embedded vectors and the first document embedded vectors, and obtain technical document recommending results corresponding to authors of the technical documents.
In the scientific and technological literature recommending device provided by the invention, the finer-grained research direction data of the documents is combined with traditional metadata to construct a heterogeneous graph with fine-grained data. The heterogeneous graph is input into the target heterogeneous graph learning model to obtain the plurality of author embedding vectors and the plurality of first document embedding vectors output by the model. The finer-grained descriptive data thus serves as a hub for mining deep knowledge associations between documents and authors, which enables document recommendation that is more personalized to each author and thereby improves the accuracy of scientific and technological literature recommendation.
Further, the recommendation module 240 is further configured for:
extracting text information characteristics of text contents of each scientific literature to obtain a plurality of second literature embedded vectors;
fusing the first document embedding vectors and the second document embedding vectors to obtain a plurality of target document embedding vectors;
And carrying out similarity calculation based on the embedded vectors of the target documents and the embedded vectors of the authors to obtain technical document recommendation results corresponding to the authors of the technical documents.
Further, the recommendation module 240 is further configured for:
Determining a first document weight value of a current first document embedding vector and a second document weight value of a current second document embedding vector;
performing product calculation on the current first document embedding vector and the first document weight value to obtain a first weighted document vector;
performing product calculation on the current second document embedding vector and the second document weight value to obtain a second weighted document vector;
and carrying out summation calculation on the first weighted literature vector and the second weighted literature vector to obtain a target literature embedded vector corresponding to the current technical literature.
Further, the recommendation module 240 is further configured for:
respectively carrying out cosine similarity calculation on the current author embedded vector and each target document embedded vector to obtain a plurality of similarity values;
sorting the similarity values in a descending order to obtain a sorting result;
extracting a first preset number of similarity values from the sorting result, starting from the first similarity value;
And generating a scientific and technological literature recommendation table corresponding to the author of the current scientific and technological literature according to the first preset number of similarity values.
Further, the scientific literature recommending device is further configured for:
Acquiring text contents of a plurality of sample documents, and extracting information based on the text contents of the sample documents to obtain multi-type sample data;
constructing a sample heterogeneous map based on the multi-type sample data;
obtaining a domain category label of each document node in the sample heterogeneous graph;
performing feature conversion on each node data in the sample heterogeneous graph to obtain initial features of each node data;
determining a first document vector sample corresponding to each document node in the sample heterogeneous graph and a first author vector sample corresponding to each author node based on the meta-path view and each initial characteristic;
Determining a second document vector sample corresponding to each document node in the sample heterogeneous diagram and a second author vector sample corresponding to each author node based on the network mode view and each initial characteristic;
carrying out weighted summation calculation based on a preset contrast loss function and a preset classification loss function to obtain a first objective function of each document node;
Determining a preset contrast loss function as a second objective function of each author node;
And training the initial heterogeneous graph learning model based on the first literature vector sample, the second literature vector sample, the first objective function and the domain category label corresponding to each literature node, and the first author vector sample, the second author vector sample and the second objective function corresponding to each author node to obtain a target heterogeneous graph learning model.
Further, the scientific literature recommending device is further configured for:
Constructing a second preset number of meta paths for the current document node;
Aggregating, for each meta-path, the initial features corresponding to all nodes on that meta-path to obtain the path aggregation feature of each meta-path;
determining the path weight of each meta-path;
and performing embedding calculation based on the path aggregation feature and the path weight of each meta-path to obtain a first document vector sample of the current document node.
Further, the scientific literature recommending device is further configured for:
Determining a plurality of neighbor nodes of the current document node;
dividing each neighbor node into a plurality of types of neighbor node combinations;
Respectively determining the attention value of each neighbor node in each neighbor node combination to the current document node;
Based on the attention value of each neighbor node in each neighbor node combination to the current document node, respectively aggregating initial features corresponding to all neighbor nodes in each neighbor node combination to obtain neighbor aggregation features of each neighbor node combination;
Determining the type weight of each neighbor node combination to the current document node;
And performing embedded calculation based on the neighbor aggregation characteristics and the type weights of the neighbor node combinations to obtain a second document vector sample of the current document node.
It should be noted that, in operation, the scientific literature recommending device provided by the present invention can execute the scientific literature recommending method described in any of the foregoing embodiments, which is not described again here.
Fig. 3 is a schematic structural diagram of an electronic device provided by the present invention. As shown in Fig. 3, the electronic device may include a processor 310, a communication interface 320, a memory 330 and a communication bus 340, wherein the processor 310, the communication interface 320 and the memory 330 communicate with one another via the communication bus 340. The processor 310 may invoke logic instructions in the memory 330 to perform a scientific literature recommendation method, the method comprising: acquiring text contents of a plurality of technical documents, and performing information extraction based on the text contents of the technical documents to obtain multi-type data, wherein the multi-type data includes document metadata and document research direction data, and the document metadata includes at least author data and document data; constructing a heterogeneous graph based on the multi-type data; inputting the heterogeneous graph into a target heterogeneous graph learning model to obtain a plurality of author embedded vectors and a plurality of first document embedded vectors output by the target heterogeneous graph learning model, wherein the target heterogeneous graph learning model is trained based on multi-type sample data extracted from the text contents of a plurality of sample documents; and recommending technical documents based on the author embedded vectors and the first document embedded vectors to obtain a technical document recommendation result corresponding to each author of the technical documents.
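The final recommendation step can be sketched as a similarity ranking between the two sets of embedded vectors. The sketch below is written in Python and assumes cosine similarity with a top-k cut-off; the text only says that recommendation is based on the author embedded vectors and the first document embedded vectors, so the scoring rule, the `top_k` parameter and the function names are illustrative assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def recommend(author_vectors, document_vectors, top_k=2):
    """Rank documents for each author by embedding similarity.

    author_vectors: dict mapping author id -> author embedded vector.
    document_vectors: dict mapping document id -> first document embedded vector.
    Returns a dict mapping author id -> list of the top_k document ids.
    """
    results = {}
    for author, author_vec in author_vectors.items():
        ranked = sorted(document_vectors,
                        key=lambda doc: cosine(author_vec, document_vectors[doc]),
                        reverse=True)
        results[author] = ranked[:top_k]
    return results


authors = {"a1": [1.0, 0.0]}
documents = {"d1": [1.0, 0.1], "d2": [0.0, 1.0], "d3": [1.0, 1.0]}
recommendations = recommend(authors, documents, top_k=2)
```

A production system would typically also filter out documents the author has already written or cited; that filtering is outside what the text describes.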
Further, the logic instructions in the memory 330 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the scientific literature recommendation method provided by the above embodiments, the method comprising: acquiring text contents of a plurality of technical documents, and performing information extraction based on the text contents of the technical documents to obtain multi-type data, wherein the multi-type data includes document metadata and document research direction data, and the document metadata includes at least author data and document data; constructing a heterogeneous graph based on the multi-type data; inputting the heterogeneous graph into a target heterogeneous graph learning model to obtain a plurality of author embedded vectors and a plurality of first document embedded vectors output by the target heterogeneous graph learning model, wherein the target heterogeneous graph learning model is trained based on multi-type sample data extracted from the text contents of a plurality of sample documents; and recommending technical documents based on the author embedded vectors and the first document embedded vectors to obtain a technical document recommendation result corresponding to each author of the technical documents.
In still another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the scientific literature recommendation method provided by the above embodiments, the method comprising: acquiring text contents of a plurality of technical documents, and performing information extraction based on the text contents of the technical documents to obtain multi-type data, wherein the multi-type data includes document metadata and document research direction data, and the document metadata includes at least author data and document data; constructing a heterogeneous graph based on the multi-type data; inputting the heterogeneous graph into a target heterogeneous graph learning model to obtain a plurality of author embedded vectors and a plurality of first document embedded vectors output by the target heterogeneous graph learning model, wherein the target heterogeneous graph learning model is trained based on multi-type sample data extracted from the text contents of a plurality of sample documents; and recommending technical documents based on the author embedded vectors and the first document embedded vectors to obtain a technical document recommendation result corresponding to each author of the technical documents.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a necessary general-purpose hardware platform, or of course by means of hardware. Based on this understanding, the foregoing technical solution, in essence, or the part of it that contributes over the prior art, may be embodied in the form of a software product. The software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disk, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the respective embodiments or in some parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications and substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (9)

CN202410188813.1A | 2024-02-20 | 2024-02-20 | Scientific and technological literature recommendation method and device, electronic equipment and storage medium | Active | CN118013023B (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202410188813.1A (CN118013023B (en)) | 2024-02-20 | 2024-02-20 | Scientific and technological literature recommendation method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number | Priority Date | Filing Date | Title
CN202410188813.1A (CN118013023B (en)) | 2024-02-20 | 2024-02-20 | Scientific and technological literature recommendation method and device, electronic equipment and storage medium

Publications (2)

Publication Number | Publication Date
CN118013023A (en) | 2024-05-10
CN118013023B (en) | 2024-10-01

Family

ID=90955705

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN202410188813.1A (Active; CN118013023B (en)) | Scientific and technological literature recommendation method and device, electronic equipment and storage medium | 2024-02-20 | 2024-02-20

Country Status (1)

Country | Link
CN (1) | CN118013023B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN118939786B (en) * | 2024-07-22 | 2025-01-24 | 中国科学院青藏高原研究所 (Institute of Tibetan Plateau Research, Chinese Academy of Sciences) | A scientific literature recommendation method integrating multi-dimensional features of metadata

Citations (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN113282612A (en) * | 2021-07-21 | 2021-08-20 | 中国人民解放军国防科技大学 (National University of Defense Technology) | Author conference recommendation method based on scientific cooperation heterogeneous network analysis

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
KR101713831B1 (en) * | 2016-07-26 | 2017-03-09 | 한국과학기술정보연구원 (Korea Institute of Science and Technology Information) | Apparatus for recommending document and method for recommending document
CN112148776B (en) * | 2020-09-29 | 2024-05-03 | 清华大学 (Tsinghua University) | Academic relationship prediction method and device based on neural network with semantic information

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN113282612A (en)*2021-07-212021-08-20中国人民解放军国防科技大学Author conference recommendation method based on scientific cooperation heterogeneous network analysis

Also Published As

Publication number | Publication date
CN118013023A (en) | 2024-05-10

Similar Documents

Publication | Title
AU2011269676B2 (en) | Systems of computerized agents and user-directed semantic networking
CN101364239B (en) | A classification catalog automatic construction method and related system
Liu et al. | A new method for knowledge and information management domain ontology graph model
Jotheeswaran et al. | OPINION MINING USING DECISION TREE BASED FEATURE SELECTION THROUGH MANHATTAN HIERARCHICAL CLUSTER MEASURE.
Ristoski | Exploiting semantic web knowledge graphs in data mining
CN103778206A (en) | Method for providing network service resources
CN118013023B (en) | Scientific and technological literature recommendation method and device, electronic equipment and storage medium
Park et al. | Automatic extraction of user’s search intention from web search logs
Yu et al. | Embedding text-rich graph neural networks with sequence and topical semantic structures
De Bonis et al. | Graph-based methods for Author Name Disambiguation: a survey
Kwapong et al. | A knowledge graph approach to mashup tag recommendation
Menéndez et al. | A genetic graph-based clustering approach to biomedical summarization
Wang et al. | EEUPL: Towards effective and efficient user profile linkage across multiple social platforms
Xia et al. | Content-irrelevant tag cleansing via bi-layer clustering and peer cooperation
CN115630141B (en) | A Retrieval Method for Scientific and Technological Experts Based on Community Query and High-Dimensional Vector Retrieval
Wang et al. | User profile linkage across multiple social platforms
Hai et al. | Improving the Efficiency of Semantic Image Retrieval Using a Combined Graph and SOM Model
Li et al. | Research on hot news discovery model based on user interest and topic discovery
Xiao et al. | Group Feature Aggregation for Web Service Recommendations
Che et al. | A feature and deep learning model recommendation system for mobile application
Liang et al. | DeepDiveAI: Identifying AI-Related Documents in Large Scale Literature Dataset
CN120086414B (en) | File retrieval method and system based on cloud computing
Ergashev et al. | Resource2Box: Learning To Rank Resources in Distributed Search Using Box Embedding
Park et al. | Extracting search intentions from web search logs
Sun | Research on Digital Book Resource Recommendation Algorithm Based on Knowledge Graph

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant
