pyclustering is a Python, C++ data mining library.
annoviko/pyclustering
Please be aware that the `pyclustering` library is no longer supported as of 2021 due to personal reasons. There will be no further maintenance, issue addressing, or feature development for this repository.
For continued usage, I recommend seeking alternative solutions.
Thank you for your understanding.
pyclustering is a Python, C++ data mining library (clustering algorithms, oscillatory networks, neural networks). The library provides a Python implementation of every algorithm and model, and a C++ implementation (the C++ pyclustering library) of most of them; the tables below list which ones. The C++ pyclustering library is a part of pyclustering and is supported on Linux, Windows and MacOS operating systems.
Version: 0.11.dev
License: The 3-Clause BSD License
E-Mail: pyclustering@yandex.ru
Documentation: https://pyclustering.github.io/docs/0.10.1/html/
Homepage: https://pyclustering.github.io/
PyClustering Wiki: https://github.com/annoviko/pyclustering/wiki
Required packages: scipy, matplotlib, numpy, Pillow
Python version: >=3.6 (32-bit, 64-bit)
C++ version: >= 14 (32-bit, 64-bit)
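The requirements above can be checked with a short standard-library script before installing. This is only a sketch and not part of pyclustering: the helper name `missing_packages` is hypothetical, and it assumes that Pillow is imported under the module name `PIL`.

```python
import importlib.util
import sys

# Import names for the required packages listed above.
# Assumption: Pillow is imported as "PIL".
REQUIRED_MODULES = ["scipy", "matplotlib", "numpy", "PIL"]


def missing_packages(modules):
    """Return the names in `modules` that the import system cannot locate."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # The README states that Python >= 3.6 is required.
    if sys.version_info < (3, 6):
        print("Python >= 3.6 is required, found", sys.version.split()[0])
    missing = missing_packages(REQUIRED_MODULES)
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

The check only tests whether a module can be located; it does not verify package versions.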
Most algorithms are implemented in both Python and C/C++. The C/C++ implementation is used when your platform supports it; otherwise the Python implementation is used. The implementation can be chosen with the `ccore` flag (it is `True` by default, which means that C/C++ is used), for example:
```python
from pyclustering.cluster.xmeans import xmeans

# data_points and start_centers are assumed to be defined beforehand.

# As by default - C/C++ part of the library is used
xmeans_instance_1 = xmeans(data_points, start_centers, 20, ccore=True)

# The same - C/C++ part of the library is used by default
xmeans_instance_2 = xmeans(data_points, start_centers, 20)

# Switch off core - Python is used
xmeans_instance_3 = xmeans(data_points, start_centers, 20, ccore=False)
```
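The fallback behaviour described above (use the compiled core when requested and available, otherwise run pure Python) can be sketched as a general pattern. Everything in this block is hypothetical and illustrative: `native_library_workable` and `toy_algorithm` are made-up names, and this is not how pyclustering itself is necessarily written.

```python
def native_library_workable():
    """Hypothetical probe for a compiled core.

    Always returns False here to simulate an unsupported platform.
    """
    return False


class toy_algorithm:
    """Toy 'algorithm' that computes the mean of a list of numbers."""

    def __init__(self, data, ccore=True):
        self.data = data
        # Honour the flag only if the compiled core can actually be used.
        self.uses_ccore = ccore and native_library_workable()

    def process(self):
        if self.uses_ccore:
            # A real library would call into the compiled code here.
            raise NotImplementedError("native path is not part of this sketch")
        # Pure-Python path.
        return sum(self.data) / len(self.data)


instance = toy_algorithm([1.0, 2.0, 3.0])   # ccore=True requested by default
print(instance.uses_ccore)                  # the probe failed, so Python is used
print(instance.process())
```

With this design the caller never has to handle a missing native library; the flag expresses a preference rather than a hard requirement.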
Installation using pip3 tool:
$ pip3 install pyclustering
Manual installation from official repository using Makefile:
```bash
# get sources of the pyclustering library, for example, from repository
$ mkdir pyclustering
$ cd pyclustering/
$ git clone https://github.com/annoviko/pyclustering.git .

# compile CCORE library (core of the pyclustering library).
$ cd ccore/
$ make ccore_64bit      # build for 64-bit OS
# $ make ccore_32bit    # build for 32-bit OS

# return to parent folder of the pyclustering library
$ cd ../

# install pyclustering library
$ python3 setup.py install

# optionally - test the library
$ python3 setup.py test
```
Manual installation using CMake:
```bash
# get sources of the pyclustering library, for example, from repository
$ mkdir pyclustering
$ cd pyclustering/
$ git clone https://github.com/annoviko/pyclustering.git .

# generate build files.
$ mkdir build
$ cd build/
$ cmake ..

# build pyclustering-shared target depending on what was generated (Makefile or MSVC solution)
# if Makefile has been generated then
$ make pyclustering-shared

# return to parent folder of the pyclustering library
$ cd ../

# install pyclustering library
$ python3 setup.py install

# optionally - test the library
$ python3 setup.py test
```
Manual installation using Microsoft Visual Studio solution:
- Clone repository from: https://github.com/annoviko/pyclustering.git
- Open folder pyclustering/ccore
- Open Visual Studio project ccore.sln
- Select solution platform: x86 or x64
- Build pyclustering-shared project.
- Add pyclustering folder to python path or install it using setup.py
```bash
# install pyclustering library
$ python3 setup.py install

# optionally - test the library
$ python3 setup.py test
```
If you have questions, proposals or bug reports related to pyclustering, please contact pyclustering@yandex.ru or create an issue in the GitHub repository.
If you use the pyclustering library in a scientific paper, please cite it:
Novikov, A., 2019. PyClustering: Data Mining Library. Journal of Open Source Software, 4(36), p.1230. Available at: http://dx.doi.org/10.21105/joss.01230.
BibTeX entry:
```bibtex
@article{Novikov2019,
    doi       = {10.21105/joss.01230},
    url       = {https://doi.org/10.21105/joss.01230},
    year      = 2019,
    month     = {apr},
    publisher = {The Open Journal},
    volume    = {4},
    number    = {36},
    pages     = {1230},
    author    = {Andrei Novikov},
    title     = {{PyClustering}: Data Mining Library},
    journal   = {Journal of Open Source Software}
}
```
Clustering algorithms and methods (module pyclustering.cluster):
Algorithm | Python | C++ |
---|---|---|
Agglomerative | ✓ | ✓ |
BANG | ✓ | |
BIRCH | ✓ | |
BSAS | ✓ | ✓ |
CLARANS | ✓ | |
CLIQUE | ✓ | ✓ |
CURE | ✓ | ✓ |
DBSCAN | ✓ | ✓ |
Elbow | ✓ | ✓ |
EMA | ✓ | |
Fuzzy C-Means | ✓ | ✓ |
GA (Genetic Algorithm) | ✓ | ✓ |
G-Means | ✓ | ✓ |
HSyncNet | ✓ | ✓ |
K-Means | ✓ | ✓ |
K-Means++ | ✓ | ✓ |
K-Medians | ✓ | ✓ |
K-Medoids | ✓ | ✓ |
MBSAS | ✓ | ✓ |
OPTICS | ✓ | ✓ |
ROCK | ✓ | ✓ |
Silhouette | ✓ | ✓ |
SOM-SC | ✓ | ✓ |
SyncNet | ✓ | ✓ |
Sync-SOM | ✓ | |
TTSAS | ✓ | ✓ |
X-Means | ✓ | ✓ |
Oscillatory networks and neural networks (module pyclustering.nnet):
Model | Python | C++ |
---|---|---|
CNN (Chaotic Neural Network) | ✓ | |
fSync (Oscillatory network based on Landau-Stuart equation and Kuramoto model) | ✓ | |
HHN (Oscillatory network based on Hodgkin-Huxley model) | ✓ | ✓ |
Hysteresis Oscillatory Network | ✓ | |
LEGION (Local Excitatory Global Inhibitory Oscillatory Network) | ✓ | ✓ |
PCNN (Pulse-Coupled Neural Network) | ✓ | ✓ |
SOM (Self-Organized Map) | ✓ | ✓ |
Sync (Oscillatory network based on Kuramoto model) | ✓ | ✓ |
SyncPR (Oscillatory network for pattern recognition) | ✓ | ✓ |
SyncSegm (Oscillatory network for image segmentation) | ✓ | ✓ |
Graph Coloring Algorithms (module pyclustering.gcolor):
Algorithm | Python | C++ |
---|---|---|
DSatur | ✓ | |
Hysteresis | ✓ | |
GColorSync | ✓ | |
Containers (module pyclustering.container):
Container | Python | C++ |
---|---|---|
KD Tree | ✓ | ✓ |
CF Tree | ✓ | |
The library contains examples for each algorithm and oscillatory network model:
Clustering examples: pyclustering/cluster/examples
Graph coloring examples: pyclustering/gcolor/examples
Oscillatory network examples: pyclustering/nnet/examples
Data clustering by CURE algorithm
```python
from pyclustering.cluster import cluster_visualizer
from pyclustering.cluster.cure import cure
from pyclustering.utils import read_sample
from pyclustering.samples.definitions import FCPS_SAMPLES

# Input data in following format [ [0.1, 0.5], [0.3, 0.1], ... ].
input_data = read_sample(FCPS_SAMPLES.SAMPLE_LSUN)

# Allocate three clusters.
cure_instance = cure(input_data, 3)
cure_instance.process()
clusters = cure_instance.get_clusters()

# Visualize allocated clusters.
visualizer = cluster_visualizer()
visualizer.append_clusters(clusters, input_data)
visualizer.show()
```
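`get_clusters()` returns each cluster as a list of point indices rather than one label per point. When per-point labels are needed, for example for scoring or custom plotting, a small helper can convert between the two layouts. The helper name `clusters_to_labels` is hypothetical, and the input below is hand-written for illustration, not real CURE output.

```python
def clusters_to_labels(clusters, n_points, noise_label=-1):
    """Convert clusters given as lists of point indices into one label per point.

    Points that appear in no cluster keep `noise_label`.
    """
    labels = [noise_label] * n_points
    for cluster_id, members in enumerate(clusters):
        for point_index in members:
            labels[point_index] = cluster_id
    return labels


# Hand-written example: five points, two clusters, point 3 unassigned.
example_clusters = [[0, 2], [1, 4]]
labels = clusters_to_labels(example_clusters, n_points=5)
print(labels)  # [0, 1, 0, -1, 1]
```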
Data clustering by K-Means algorithm
```python
from pyclustering.cluster.kmeans import kmeans, kmeans_visualizer
from pyclustering.cluster.center_initializer import kmeans_plusplus_initializer
from pyclustering.samples.definitions import FCPS_SAMPLES
from pyclustering.utils import read_sample

# Load list of points for cluster analysis.
sample = read_sample(FCPS_SAMPLES.SAMPLE_TWO_DIAMONDS)

# Prepare initial centers using K-Means++ method.
initial_centers = kmeans_plusplus_initializer(sample, 2).initialize()

# Create instance of K-Means algorithm with prepared centers.
kmeans_instance = kmeans(sample, initial_centers)

# Run cluster analysis and obtain results.
kmeans_instance.process()
clusters = kmeans_instance.get_clusters()
final_centers = kmeans_instance.get_centers()

# Visualize obtained results
kmeans_visualizer.show_clusters(sample, clusters, final_centers)
```
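For readers who want to see what the two steps above do conceptually, the following pure-Python sketch shows textbook K-Means++ seeding followed by Lloyd iterations. It is not pyclustering's implementation; all function names are hypothetical, the data is made up, and the sketch assumes the points are not all identical (otherwise the seeding weights would all be zero).

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_plusplus_centers(points, k, rng):
    """Pick the first center uniformly, the rest with probability proportional
    to the squared distance to the nearest already chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def assign(points, centers):
    """Return clusters as lists of point indices, one list per center."""
    clusters = [[] for _ in centers]
    for index, point in enumerate(points):
        nearest = min(range(len(centers)),
                      key=lambda c: squared_distance(point, centers[c]))
        clusters[nearest].append(index)
    return clusters


def lloyd_kmeans(points, centers, max_iterations=100):
    """Alternate center updates and reassignment until assignments stop changing."""
    clusters = assign(points, centers)
    for _ in range(max_iterations):
        new_centers = []
        for members, old_center in zip(clusters, centers):
            if members:
                dimension = len(old_center)
                new_centers.append([sum(points[i][d] for i in members) / len(members)
                                    for d in range(dimension)])
            else:
                # Keep the previous center of an empty cluster.
                new_centers.append(list(old_center))
        new_clusters = assign(points, new_centers)
        centers = new_centers
        if new_clusters == clusters:
            break
        clusters = new_clusters
    return clusters, centers


# Made-up data: two well-separated groups of three points each.
points = [[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],
          [5.0, 5.0], [5.1, 5.2], [5.2, 5.1]]
rng = random.Random(0)
initial_centers = kmeans_plusplus_centers(points, 2, rng)
clusters, final_centers = lloyd_kmeans(points, initial_centers)
print(sorted(sorted(c) for c in clusters))
```

As in the library example, clusters are reported as lists of point indices; the order of the clusters depends on which centers were seeded first.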
Data clustering by OPTICS algorithm
```python
from pyclustering.cluster import cluster_visualizer
from pyclustering.cluster.optics import optics, ordering_analyser, ordering_visualizer
from pyclustering.samples.definitions import FCPS_SAMPLES
from pyclustering.utils import read_sample

# Read sample for clustering from some file
sample = read_sample(FCPS_SAMPLES.SAMPLE_LSUN)

# Run cluster analysis where connectivity radius is bigger than real
radius = 2.0
neighbors = 3
amount_of_clusters = 3
optics_instance = optics(sample, radius, neighbors, amount_of_clusters)

# Performs cluster analysis
optics_instance.process()

# Obtain results of clustering
clusters = optics_instance.get_clusters()
noise = optics_instance.get_noise()
ordering = optics_instance.get_ordering()

# Visualize ordering diagram
analyser = ordering_analyser(ordering)
ordering_visualizer.show_ordering_diagram(analyser, amount_of_clusters)

# Visualize clustering results
visualizer = cluster_visualizer()
visualizer.append_clusters(clusters, sample)
visualizer.show()
```
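The ordering diagram is a sequence of reachability distances in which clusters show up as valleys separated by peaks. A simplified way to read it is to count the maximal runs of consecutive values that stay at or below a chosen radius. The sketch below illustrates that idea only; it is not the logic of `ordering_analyser`, the function name `count_valleys` is hypothetical, and the ordering values are made up.

```python
def count_valleys(ordering, radius):
    """Count maximal runs of consecutive reachability values <= radius.

    Each run is treated as one cluster; values above the radius act as
    separators between clusters.
    """
    amount = 0
    inside_valley = False
    for reachability in ordering:
        if reachability <= radius:
            if not inside_valley:
                amount += 1
                inside_valley = True
        else:
            inside_valley = False
    return amount


# Made-up ordering with three valleys separated by peaks.
example_ordering = [0.9, 0.2, 0.3, 0.2, 1.5, 0.3, 0.25, 0.4, 1.8, 0.2, 0.3]
print(count_valleys(example_ordering, radius=0.5))  # 3
print(count_valleys(example_ordering, radius=2.0))  # 1
```

Lowering the radius splits the data into more, tighter clusters until the valleys themselves disappear; raising it merges everything into one.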
Simulation of oscillatory network PCNN
```python
from pyclustering.nnet.pcnn import pcnn_network, pcnn_visualizer

# Create Pulse-Coupled neural network with 10 oscillators.
net = pcnn_network(10)

# Perform simulation during 50 steps using binary external stimulus.
dynamic = net.simulate(50, [1, 1, 1, 0, 0, 0, 0, 1, 1, 1])

# Allocate synchronous ensembles from the output dynamic.
ensembles = dynamic.allocate_sync_ensembles()

# Show output dynamic.
pcnn_visualizer.show_output_dynamic(dynamic, ensembles)
```
Simulation of chaotic neural network CNN
```python
from pyclustering.cluster import cluster_visualizer
from pyclustering.samples.definitions import SIMPLE_SAMPLES
from pyclustering.utils import read_sample
from pyclustering.nnet.cnn import cnn_network, cnn_visualizer

# Load stimulus from file.
stimulus = read_sample(SIMPLE_SAMPLES.SAMPLE_SIMPLE3)

# Create chaotic neural network, amount of neurons should be equal to amount of stimulus.
network_instance = cnn_network(len(stimulus))

# Perform simulation during 100 steps.
steps = 100
output_dynamic = network_instance.simulate(steps, stimulus)

# Display output dynamic of the network.
cnn_visualizer.show_output_dynamic(output_dynamic)

# Display dynamic matrix and observation matrix to show clustering phenomenon.
cnn_visualizer.show_dynamic_matrix(output_dynamic)
cnn_visualizer.show_observation_matrix(output_dynamic)

# Visualize clustering results.
clusters = output_dynamic.allocate_sync_ensembles(10)
visualizer = cluster_visualizer()
visualizer.append_clusters(clusters, stimulus)
visualizer.show()
```
Cluster allocation on FCPS dataset collection by DBSCAN:
Cluster allocation by OPTICS using cluster-ordering diagram:
Partial synchronization (clustering) in Sync oscillatory network:
Cluster visualization by SOM (Self-Organized Feature Map)