cvxgrp/pymde


Minimum-distortion embedding with PyTorch

License

Apache-2.0 license

PyMDE

The official documentation for PyMDE is available at www.pymde.org.

This repository accompanies the monograph Minimum-Distortion Embedding.

PyMDE is a Python library for computing vector embeddings for finite sets of items, such as images, biological cells, nodes in a network, or any other abstract object.

What sets PyMDE apart from other embedding libraries is that it provides a simple but general framework for embedding, called Minimum-Distortion Embedding (MDE). With MDE, it is easy to recreate well-known embeddings and to create new ones, tailored to your particular application.

PyMDE is competitive in runtime with more specialized embedding methods. With a GPU, it can be even faster.

Overview

PyMDE can be enjoyed by beginners and experts alike. It can be used to:

visualize datasets, small or large;
generate feature vectors for supervised learning;
compress high-dimensional vector data;
draw graphs (up to orders of magnitude faster than packages like NetworkX);
create custom embeddings, with custom objective functions and constraints (such as having uncorrelated feature columns);
and more.

PyMDE is very young software, under active development. If you run into issues, or have any feedback, please reach out by filing a GitHub issue.

This README gives a very brief overview of PyMDE. Make sure to read the official documentation at www.pymde.org, which has in-depth tutorials and API documentation.

Installation

PyMDE is available on the Python Package Index, and on Conda Forge.

To install with pip, use

pip install pymde

Alternatively, to install with conda, use

conda install -c pytorch -c conda-forge pymde

PyMDE has the following requirements:

Python >= 3.7
numpy >= 1.17.5
scipy
torch >= 1.7.1
torchvision >= 0.8.2
pynndescent
requests
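
To see which of these requirements are already present in an environment, something like the following may help. It is a minimal sketch that uses only the standard library's importlib.metadata (available from Python 3.8, so slightly newer than the minimum listed above). The helper name installed_versions is our own and not part of PyMDE, and the sketch only reports installed versions; it does not compare them against the minimums.

```python
from importlib import metadata

# Distribution names taken from the requirements list above.
REQUIREMENTS = ["numpy", "scipy", "torch", "torchvision", "pynndescent", "requests"]


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIREMENTS).items():
        print(f"{name}: {version or 'not installed'}")
```

Both pip and conda normally resolve these dependencies on their own, so a check like this is mostly useful when debugging an existing environment.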

Getting started

Getting started with PyMDE is easy. For embeddings that work out of the box, we provide two main functions:

pymde.preserve_neighbors

which preserves the local structure of original data, and

pymde.preserve_distances

which preserves pairwise distances or dissimilarity scores in the original data.

Arguments. The input to these functions is the original data, represented either as a data matrix in which each row is a feature vector, or as a (possibly sparse) graph encoding pairwise distances. The embedding dimension is specified by the embedding_dim keyword argument, which is 2 by default.

Return value. The return value is an MDE object. Calling the embed() method on this object returns an embedding, which is a matrix (torch.Tensor) in which each row is an embedding vector. For example, if the original input is a data matrix of shape (n_items, n_features), then the embedding matrix has shape (n_items, embedding_dim).
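
The shape relationship described above can be checked on synthetic data. The following is a small sketch, not an official example: the random data, the sizes, and the helper name embedding_shape are our own assumptions, and the guarded import is there only so the snippet still loads where PyMDE is not installed (in which case the helper returns None).

```python
try:
    import torch
    import pymde
except ImportError:  # lets the sketch load even where PyMDE is missing
    pymde = None


def embedding_shape(n_items=200, n_features=10, embedding_dim=3):
    """Embed random data and return the embedding's shape (None without PyMDE)."""
    if pymde is None:
        return None
    torch.manual_seed(0)
    data = torch.randn(n_items, n_features)  # synthetic stand-in for real data
    mde = pymde.preserve_neighbors(data, embedding_dim=embedding_dim)
    embedding = mde.embed()  # torch.Tensor with one row per item
    return tuple(embedding.shape)


# (n_items, embedding_dim) when PyMDE is installed, otherwise None.
print(embedding_shape())
```

Passing embedding_dim=3 instead of the default 2 only changes the number of columns in the result; the number of rows always matches the number of items.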

We give examples of using these functions below.

Preserving neighbors

The following code produces an embedding of the MNIST dataset (images of handwritten digits), in a fashion similar to LargeVis, t-SNE, UMAP, and other neighborhood-based embeddings. The original data is a matrix of shape (70000, 784), with each row representing an image.

import pymde

mnist = pymde.datasets.MNIST()
embedding = pymde.preserve_neighbors(mnist.data, verbose=True).embed()
pymde.plot(embedding, color_by=mnist.attributes['digits'])

Unlike most other embedding methods, PyMDE can compute embeddings that satisfy constraints. For example:

embedding = pymde.preserve_neighbors(mnist.data, constraint=pymde.Standardized(), verbose=True).embed()
pymde.plot(embedding, color_by=mnist.attributes['digits'])

The standardization constraint requires the embedding vectors to be centered and to have uncorrelated features.
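
To make "centered" and "uncorrelated" concrete, here is a plain-Python sketch of a check on a small embedding. As we understand the constraint from the accompanying monograph, a standardized embedding X with n rows satisfies two conditions: its columns sum to zero, and (1/n) XᵀX is the identity. The unit-scale part of the second condition is our reading and is not stated in this README. The function is_standardized is a hypothetical helper, not part of PyMDE.

```python
def is_standardized(rows, tol=1e-9):
    """Check that columns have zero mean and that (1/n) X^T X is the identity."""
    n = len(rows)
    dim = len(rows[0])
    # Centered: every column sums to zero.
    for j in range(dim):
        if abs(sum(row[j] for row in rows)) > tol * n:
            return False
    # Uncorrelated with unit scale: (1/n) X^T X equals the identity.
    for j in range(dim):
        for k in range(dim):
            entry = sum(row[j] * row[k] for row in rows) / n
            target = 1.0 if j == k else 0.0
            if abs(entry - target) > tol:
                return False
    return True


corners = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
shifted = [(x + 0.5, y) for x, y in corners]  # no longer centered
diagonal = [(1.0, 1.0), (-1.0, -1.0)]         # centered but perfectly correlated

print(is_standardized(corners))   # True
print(is_standardized(shifted))   # False
print(is_standardized(diagonal))  # False
```

The four corners of a square pass the check, while shifting them breaks centering and collapsing them onto a diagonal makes the two features perfectly correlated.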

Preserving distances

The function pymde.preserve_distances is useful when you're more interested in preserving the gross global structure than the local structure.

Here's an example that produces an embedding of an academic coauthorship network, from Google Scholar. The original data is a sparse graph on roughly 40,000 authors, with an edge between authors who have collaborated on at least one paper.

import pymde

google_scholar = pymde.datasets.google_scholar()
embedding = pymde.preserve_distances(google_scholar.data, verbose=True).embed()
pymde.plot(embedding, color_by=google_scholar.attributes['coauthors'], color_map='viridis', background_color='black')

More collaborative authors are colored brighter, and are near the center of the embedding.
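
This README does not spell out how a coauthorship graph turns into pairwise distances. One natural reading, which we treat as an assumption here, is the number of edges on a shortest path between two authors. The sketch below computes those hop counts with a breadth-first search on a tiny made-up graph; the author names and the helper hop_distances are purely illustrative and not part of PyMDE.

```python
from collections import deque


def hop_distances(edges, source):
    """Breadth-first search: number of edges on a shortest path from source."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for other in neighbors.get(node, ()):
            if other not in distances:
                distances[other] = distances[node] + 1
                queue.append(other)
    return distances


# Hypothetical authors; an edge means "has coauthored at least one paper".
coauthorships = [("ana", "bo"), ("bo", "chen"), ("chen", "dee"), ("bo", "dee")]
print(sorted(hop_distances(coauthorships, "ana").items()))
# [('ana', 0), ('bo', 1), ('chen', 2), ('dee', 2)]
```

Under this reading, direct collaborators are at distance 1 and collaborators of collaborators at distance 2. An embedding that preserves such distances would tend to place well-connected authors near the middle of the picture.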

Example notebooks

We have several example notebooks that show how to use PyMDE on real (and synthetic) datasets.

Citing

To cite our work, please use the following BibTeX entry.

@article{agrawal2021minimum,
  author  = {Agrawal, Akshay and Ali, Alnur and Boyd, Stephen},
  title   = {Minimum-Distortion Embedding},
  journal = {arXiv},
  year    = {2021},
}

PyMDE was designed and developed by Akshay Agrawal.


Releases: 6 (latest: v0.2.0, Jul 1, 2025)
