An industrial-grade implementation of DSSM
Chiang97912/dssm
An industrial-grade implementation of the paper: Learning Deep Structured Semantic Models for Web Search using Clickthrough Data
Latent semantic models, such as LSA, aim to map a query to its relevant documents at the semantic level, where keyword-based matching often fails. DSSM projects queries and documents into a common low-dimensional space, where the relevance of a document to a query is readily computed as the distance between them.
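To make the scoring step concrete, here is a minimal NumPy sketch; it is not this package's code. It assumes the query and documents have already been projected to vectors (the toy values below are made up), uses cosine similarity as the relevance measure, as the paper does, and turns the scores into a posterior with a softmax. The function names and the smoothing factor `gamma=10.0` are illustrative assumptions.

```python
import numpy as np


def cosine_relevance(query_vec, doc_vecs):
    """Cosine similarity between one query vector and each row of doc_vecs."""
    query_vec = np.asarray(query_vec, dtype=float)
    doc_vecs = np.asarray(doc_vecs, dtype=float)
    dots = doc_vecs @ query_vec
    norms = np.linalg.norm(doc_vecs, axis=1) * np.linalg.norm(query_vec)
    return dots / norms


def posterior(scores, gamma=10.0):
    """Softmax over smoothed relevance scores (gamma is an assumed value)."""
    scaled = gamma * np.asarray(scores, dtype=float)
    scaled -= scaled.max()  # subtract the max for numerical stability
    weights = np.exp(scaled)
    return weights / weights.sum()


# Toy, hand-written vectors standing in for model output.
query = [1.0, 0.0, 1.0]
docs = [[1.0, 0.0, 1.0],   # same direction as the query
        [0.0, 1.0, 0.0],   # orthogonal to the query
        [1.0, 1.0, 1.0]]   # partially aligned

scores = cosine_relevance(query, docs)
probs = posterior(scores)
best = int(np.argmax(probs))
print(scores, probs, best)
```

A larger `gamma` sharpens the posterior toward the best-matching document; a smaller one flattens it.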
This model can be used as a search engine that helps people find the document they want, even when the query:
- abbreviates words in the document;
- changes the order of words in the document;
- shortens words in the document;
- has typos;
- has spacing issues.
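One reason the model in the original paper tolerates typos and shortened words is its word hashing step, which breaks each word into letter trigrams so that similar spellings share most of their features. The sketch below illustrates that idea only. Whether this package uses the same input representation is not stated here, and the helper names `letter_trigrams` and `trigram_overlap` are hypothetical.

```python
def letter_trigrams(word):
    """Return the letter trigrams of a word, with '#' marking its boundaries."""
    marked = "#" + word.lower() + "#"
    return {marked[i:i + 3] for i in range(len(marked) - 2)}


def trigram_overlap(a, b):
    """Jaccard overlap between the trigram sets of two non-empty words."""
    ta, tb = letter_trigrams(a), letter_trigrams(b)
    return len(ta & tb) / len(ta | tb)


print(sorted(letter_trigrams("good")))
print(trigram_overlap("document", "documnet"))  # a typo still shares trigrams
print(trigram_overlap("document", "query"))     # unrelated words share none
```

A misspelled word keeps the trigrams from its unaffected part, so its representation stays close to the correct spelling instead of becoming an unknown token.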
DSSM depends on PyTorch. There are two ways to install DSSM:
Install DSSM from PyPI:
pip install dssm
Install DSSM from the GitHub source:
git clone https://github.com/Chiang97912/dssm.git
cd dssm
python setup.py install
Train a model:
from dssm.model import DSSM

# Query list: words must be segmented in advance, with tokens joined by spaces.
queries = ['...']
# Document list: words must be segmented in advance, with tokens joined by spaces.
documents = ['...']

model = DSSM('dssm-model', device='cuda:0', lang='en')
model.fit(queries, documents)
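The training input must be pre-segmented, with tokens separated by single spaces. A minimal sketch of one way to prepare English text in that shape is shown here. The helper `to_space_joined_tokens` is hypothetical and not part of this package, and both the lowercasing and the `\w+` tokenization rule are assumptions rather than documented requirements.

```python
import re


def to_space_joined_tokens(text):
    """Lowercase text, split it into word tokens, and join them with single spaces."""
    tokens = re.findall(r"\w+", text.lower())
    return " ".join(tokens)


# Made-up raw inputs with irregular spacing and punctuation.
raw_queries = ["Deep  learning,for web-search!", "  Clickthrough data "]
queries = [to_space_joined_tokens(q) for q in raw_queries]
print(queries)
```

For languages written without spaces between words, this regular expression would return whole runs of characters as one token, so a dedicated word segmenter would be needed instead.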
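Encode two texts with a trained model and score their similarity:
from dssm.model import DSSM
from sklearn.metrics.pairwise import cosine_similarity

text_left = '...'
text_right = '...'

model = DSSM('dssm-model', device='cpu')
# Encode both texts, then compare the two resulting vectors.
vectors = model.encode([text_left, text_right])
score = cosine_similarity([vectors[0]], [vectors[1]])
print(score)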
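The same encode-then-compare pattern extends from one pair of texts to ranking a list of documents against a query. The sketch below is an illustration under assumptions: `rank_documents` is a hypothetical helper, and `encode` stands for any callable that maps a list of strings to one vector per string, which is how `model.encode` appears to behave in the example above. A toy letter-count encoder is used so the block runs without a trained model.

```python
import numpy as np


def rank_documents(encode, query, documents, top_k=3):
    """Rank documents by cosine similarity to the query.

    `encode` is any callable mapping a list of strings to one vector per
    string (assumed to match the shape of `model.encode`; not verified).
    """
    vectors = np.asarray(encode([query] + list(documents)), dtype=float)
    # Normalise rows so that a dot product equals cosine similarity.
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    unit = vectors / np.clip(norms, 1e-12, None)
    scores = unit[1:] @ unit[0]
    order = np.argsort(-scores)[:top_k]
    return [(documents[i], float(scores[i])) for i in order]


def toy_encode(texts):
    """Stand-in encoder for this sketch: counts of the letters a-z."""
    return [[t.lower().count(chr(c)) for c in range(97, 123)] for t in texts]


docs = ["deep structured semantic model", "weather forecast", "semantic search"]
results = rank_documents(toy_encode, "semantic models", docs, top_k=2)
print(results)
```

With a trained model, passing `model.encode` in place of `toy_encode` should work if it returns array-like vectors on the CPU; if it returns tensors on a GPU, they would need to be moved to the CPU first.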
Dependencies:
- Python version 3.6
- NumPy version 1.19.5
- PyTorch version 1.9.0