
An industrial-grade implementation of DSSM


Chiang97912/dssm


An industrial-grade implementation of the paper: Learning Deep Structured Semantic Models for Web Search using Clickthrough Data

Latent semantic models, such as LSA, aim to map a query to its relevant documents at the semantic level, where keyword-based matching often fails. DSSM projects queries and documents into a common low-dimensional space, where the relevance of a document to a query is readily computed as the distance between them.
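To make the idea of relevance in a shared space concrete, here is a minimal sketch that scores a document against a query by cosine similarity, the measure used in the Test example further down. The function name `cosine_relevance` and the 3-dimensional vectors are made up for illustration; they are not part of this package.

```python
import math


def cosine_relevance(query_vec, doc_vec):
    """Cosine similarity between two embedding vectors of equal length."""
    dot = sum(q * d for q, d in zip(query_vec, doc_vec))
    norm_q = math.sqrt(sum(q * q for q in query_vec))
    norm_d = math.sqrt(sum(d * d for d in doc_vec))
    if norm_q == 0.0 or norm_d == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_q * norm_d)


# Toy embeddings (hypothetical values, not produced by a trained model)
query = [1.0, 0.0, 1.0]
doc_close = [2.0, 0.0, 2.0]  # points in the same direction as the query
doc_far = [0.0, 1.0, 0.0]    # orthogonal to the query

score_close = cosine_relevance(query, doc_close)
score_far = cosine_relevance(query, doc_far)
```

A document whose vector points in the same direction as the query gets the highest score, regardless of vector length.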

This model can be used as a search engine that helps people find the document they want even when the query:

  1. abbreviates words in the document;
  2. reorders words in the document;
  3. shortens words in the document;
  4. contains typos;
  5. has spacing issues.
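One reason a DSSM-style model can tolerate typos and shortened words is the word hashing described in the original paper, which breaks each word into letter trigrams after adding `#` boundary marks. The sketch below illustrates that idea only; whether this package tokenizes the same way is an assumption, and the helper names `letter_trigrams` and `trigram_overlap` are hypothetical.

```python
def letter_trigrams(word):
    """Split a word into letter trigrams after adding '#' boundary marks."""
    marked = f"#{word}#"
    return [marked[i:i + 3] for i in range(len(marked) - 2)]


def trigram_overlap(a, b):
    """Jaccard overlap between the trigram sets of two words."""
    set_a, set_b = set(letter_trigrams(a)), set(letter_trigrams(b))
    union = set_a | set_b
    if not union:
        return 0.0  # both words too short to produce any trigram
    return len(set_a & set_b) / len(union)


grams = letter_trigrams("good")
overlap = trigram_overlap("search", "serch")  # correct spelling vs. typo
```

The misspelled word still shares several trigrams with the correct one, so the two land on overlapping input features instead of being treated as unrelated tokens.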

Install

DSSM depends on PyTorch. There are two ways to install DSSM:

Install DSSM from PyPI:

pip install dssm

Install DSSM from the GitHub source:

git clone https://github.com/Chiang97912/dssm.git
cd dssm
python setup.py install

Usage

Train

from dssm.model import DSSM

# Query list: words must be segmented in advance, and tokens joined with spaces.
queries = ['...']
# Document list: words must be segmented in advance, and tokens joined with spaces.
documents = ['...']

model = DSSM('dssm-model', device='cuda:0', lang='en')
model.fit(queries, documents)
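The training inputs must already be segmented, with tokens joined by spaces. A minimal sketch of that preprocessing for English text follows. The helper `prepare_text` is hypothetical and not part of this package, and lowercasing is an assumption rather than a documented requirement.

```python
import re


def prepare_text(text):
    """Lowercase the text, split it into word tokens, and join them with single spaces."""
    tokens = re.findall(r"\w+", text.lower())
    return " ".join(tokens)


raw_queries = ["Deep-Learning  for Web Search!", "click-through data"]
queries = [prepare_text(q) for q in raw_queries]
```

For languages written without spaces between words, a dedicated word segmenter would take the place of the regular expression.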

Test

from dssm.model import DSSM
from sklearn.metrics.pairwise import cosine_similarity

text_left = '...'
text_right = '...'

model = DSSM('dssm-model', device='cpu')
vectors = model.encode([text_left, text_right])
score = cosine_similarity([vectors[0]], [vectors[1]])
print(score)
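The pairwise comparison above extends naturally to ranking several candidate documents for one query. The sketch below assumes the encoded vectors are array-like rows of equal length and non-zero norm. The helper `rank_documents` is hypothetical, and the stand-in vectors replace real `model.encode(...)` output so the example runs without a trained model.

```python
import numpy as np


def rank_documents(query_vec, doc_vecs):
    """Return (indices, scores) of documents sorted by descending cosine similarity."""
    q = np.asarray(query_vec, dtype=float)
    docs = np.asarray(doc_vecs, dtype=float)
    # Normalize to unit length so the dot product equals cosine similarity.
    # Assumes no vector has zero norm.
    q = q / np.linalg.norm(q)
    docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
    scores = docs @ q
    order = np.argsort(-scores)
    return order.tolist(), scores[order].tolist()


# Stand-in vectors (hypothetical); in practice these would come from the encoder.
query_vec = [1.0, 0.0]
doc_vecs = [[0.0, 1.0], [1.0, 1.0], [3.0, 0.0]]
order, scores = rank_documents(query_vec, doc_vecs)
```

Here `order` lists document indices from most to least relevant, so `order[0]` is the best match for the query.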

Dependencies

  • Python version 3.6
  • NumPy version 1.19.5
  • PyTorch version 1.9.0
