🌶️ An ecosystem in Python for working with the Biological Expression Language (BEL)
pybel/pybel
PyBEL is a pure Python package for parsing and handling biological networks encoded in the Biological Expression Language (BEL).
It facilitates data interchange between data formats like NetworkX, Node-Link JSON, JGIF, CSV, SIF, Cytoscape, CX, INDRA, and GraphDati; database systems like SQL and Neo4J; and web services like NDEx, BioDati Studio, and BEL Commons. It also provides exports for analytical tools like HiPathia, Drug2ways, and SPIA; machine learning tools like PyKEEN and OpenBioLink; and others.
Its companion package, PyBEL Tools, contains a suite of functions and pipelines for analyzing the resulting biological networks.
We realize that we have a name conflict with the Python wrapper for the cheminformatics package, OpenBabel. If you're looking for their Python wrapper, see here.
If you find PyBEL useful for your work, please consider citing:
[1] Hoyt, C. T., et al. (2017). PyBEL: a Computational Framework for Biological Expression Language. Bioinformatics, 34(December), 1–2.
PyBEL can be installed easily from PyPI with the following code in your favorite shell:
$ pip install pybel
or from the latest code on GitHub with:
$ pip install git+https://github.com/pybel/pybel.git
See the installation documentation for more advanced instructions. Also, check the change log at CHANGELOG.rst.
More examples can be found in the documentation and in the PyBEL Notebooks repository.
This example illustrates how a BEL document from the Human Brain Pharmacome project can be loaded and compiled directly from GitHub.
>>> import pybel
>>> url = 'https://raw.githubusercontent.com/pharmacome/conib/master/hbp_knowledge/proteostasis/kim2013.bel'
>>> graph = pybel.from_bel_script_url(url)
Other functions for loading BEL content from many formats can be found in the I/O documentation. Note that PyBEL can handle BEL 1.0 and BEL 2.0+ simultaneously.
After you have a BEL graph, there are numerous ways to save it. The pybel.dump function knows how to output it in many formats based on the file extension you give. For all of the possibilities, check the I/O documentation.
>>> import pybel
>>> graph = ...
>>> # write as BEL
>>> pybel.dump(graph, 'my_graph.bel')
>>> # write as Node-Link JSON for network viewers like D3
>>> pybel.dump(graph, 'my_graph.bel.nodelink.json')
>>> # write as GraphDati JSON for BioDati
>>> pybel.dump(graph, 'my_graph.bel.graphdati.json')
>>> # write as CX JSON for NDEx
>>> pybel.dump(graph, 'my_graph.bel.cx.json')
>>> # write as INDRA JSON for INDRA
>>> pybel.dump(graph, 'my_graph.indra.json')
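To make the idea of choosing a format from the file extension concrete, here is a minimal, standalone sketch. It is not PyBEL's implementation: the `SUFFIX_TO_FORMAT` table and the `guess_format` helper are hypothetical, and only the suffixes themselves are taken from the example above. Treating a trailing `.gz` as a compression wrapper is likewise an assumption made for illustration.

```python
from pathlib import Path

# Hypothetical lookup table: the suffixes come from the dump example above,
# the labels are only descriptive strings for this sketch.
SUFFIX_TO_FORMAT = {
    ".bel.nodelink.json": "Node-Link JSON",
    ".bel.graphdati.json": "GraphDati JSON",
    ".bel.cx.json": "CX JSON",
    ".indra.json": "INDRA JSON",
    ".bel": "BEL script",
}


def guess_format(path: str) -> str:
    """Return the format label whose suffix matches the end of ``path``."""
    name = Path(path).name
    # Assumption for this sketch: a trailing '.gz' only marks compression
    if name.endswith(".gz"):
        name = name[: -len(".gz")]
    # Try longer suffixes first so the most specific match wins
    for suffix in sorted(SUFFIX_TO_FORMAT, key=len, reverse=True):
        if name.endswith(suffix):
            return SUFFIX_TO_FORMAT[suffix]
    raise ValueError(f"unrecognized extension: {name}")


print(guess_format("my_graph.bel.cx.json"))
```

The point of the design is that the caller never names a format explicitly; the file name carries that information.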
The BELGraph object has several "dispatches", which are properties that organize its various functionalities. One is the BELGraph.summarize dispatch, which allows for printing summaries to the console.
These examples will use the RAS Model from EMMAA, so you'll have to be sure to pip install indra first. The graph can be acquired and summarized with BELGraph.summarize.statistics() as in:
>>> import pybel
>>> graph = pybel.from_emmaa('rasmodel', date='2020-05-29-17-31-58')  # Needs
>>> graph.summarize.statistics()
---------------------  -------------------
Name                   rasmodel
Version                2020-05-29-17-31-58
Number of Nodes        126
Number of Namespaces   5
Number of Edges        206
Number of Annotations  4
Number of Citations    1
Number of Authors      0
Network Density        1.31E-02
Number of Components   1
Number of Warnings     0
---------------------  -------------------
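The network density reported above can be reproduced by hand from the node and edge counts. The sketch below assumes the usual directed-graph definition, edges divided by n(n - 1), which is what NetworkX uses for directed graphs. That PyBEL computes the value this way is an assumption, though the result agrees with the printed figure. The `directed_density` helper is hypothetical.

```python
def directed_density(n_nodes: int, n_edges: int) -> float:
    """Density of a directed graph: edges / (nodes * (nodes - 1))."""
    if n_nodes < 2:
        # No ordered pairs of distinct nodes exist, so define density as 0
        return 0.0
    return n_edges / (n_nodes * (n_nodes - 1))


# Counts taken from the summary above: 126 nodes and 206 edges
density = directed_density(126, 206)
print(f"{density:.2E}")  # prints 1.31E-02
```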
The number of nodes of each type can be summarized with BELGraph.summarize.nodes() as in:
>>> graph.summarize.nodes(examples=False)
Type (3)      Count
----------  -------
Protein          97
Complex          27
Abundance         2
The number of nodes with each namespace can be summarized with BELGraph.summarize.namespaces() as in:
>>> graph.summarize.namespaces(examples=False)
Namespace (4)      Count
---------------  -------
HGNC                  94
FPLX                   3
CHEBI                  1
TEXT                   1
The edges can be summarized with BELGraph.summarize.edges() as in:
>>> graph.summarize.edges(examples=False)
Edge Type (12)                       Count
---------------------------------  -------
Protein increases Protein               64
Protein hasVariant Protein              48
Protein partOf Complex                  47
Complex increases Protein               20
Protein decreases Protein                9
Complex directlyIncreases Protein        8
Protein increases Complex                3
Abundance partOf Complex                 3
Protein increases Abundance              1
Complex partOf Complex                   1
Protein decreases Abundance              1
Abundance decreases Protein              1
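An edge-type summary like the one above is essentially a tally of (source type, relation, target type) triples. The following is a rough sketch of that counting step using only the standard library. It does not use PyBEL, the `count_edge_types` helper is hypothetical, and the four triples are made-up illustrative input rather than data from the RAS model.

```python
from collections import Counter

# Made-up illustrative triples: (source type, relation, target type)
edges = [
    ("Protein", "increases", "Protein"),
    ("Protein", "increases", "Protein"),
    ("Protein", "partOf", "Complex"),
    ("Complex", "directlyIncreases", "Protein"),
]


def count_edge_types(triples):
    """Count how often each 'Source relation Target' label occurs."""
    return Counter(" ".join(triple) for triple in triples)


# Print the tally, most frequent edge type first
for edge_type, count in count_edge_types(edges).most_common():
    print(f"{edge_type:<35}{count:>5}")
```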
Not all BEL graphs contain both the name and identifier for each entity. Some even use non-standard prefixes (also called namespaces in BEL). Usually, BEL graphs are validated against controlled vocabularies, so the following demo shows how to add the corresponding identifiers to all nodes.
from urllib.request import urlretrieve

# Use the /raw/ path so the gzipped file itself is downloaded rather than the GitHub HTML page
url = 'https://github.com/cthoyt/selventa-knowledge/raw/master/selventa_knowledge/large_corpus.bel.nodelink.json.gz'
urlretrieve(url, 'large_corpus.bel.nodelink.json.gz')

import pybel
graph = pybel.load('large_corpus.bel.nodelink.json.gz')

import pybel.grounding
grounded_graph = pybel.grounding.ground(graph)
Note: you have to install pyobo for this to work and be running Python 3.7+.
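The download step above saves a gzipped file to disk. If the server returns something else (for example, an HTML page), loading it will fail later with a less obvious error. One way to check early is to look at the first two bytes, which are `0x1f 0x8b` for gzip data. This is a minimal sketch; `looks_like_gzip` is a hypothetical helper and not part of PyBEL.

```python
# The two-byte magic number that starts every gzip file
GZIP_MAGIC = b"\x1f\x8b"


def looks_like_gzip(path) -> bool:
    """Return True if the file at ``path`` starts with the gzip magic number."""
    with open(path, "rb") as file:
        return file.read(2) == GZIP_MAGIC
```

Calling `looks_like_gzip('large_corpus.bel.nodelink.json.gz')` right after the download would flag a bad file before it reaches the loader.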
After installing jinja2 and ipython, BEL graphs can be displayed in Jupyter notebooks.
>>> from pybel.examples import sialic_acid_graph
>>> from pybel.io.jupyter import to_jupyter
>>> to_jupyter(sialic_acid_graph)
If you don't want to use the pybel.BELGraph data structure and just want to turn BEL statements into JSON for your own purposes, you can directly use the pybel.parse() function.
>>> import pybel
>>> pybel.parse('p(hgnc:4617 ! GSK3B) regulates p(hgnc:6893 ! MAPT)')
{'source': {'function': 'Protein', 'concept': {'namespace': 'hgnc', 'identifier': '4617', 'name': 'GSK3B'}},
 'relation': 'regulates',
 'target': {'function': 'Protein', 'concept': {'namespace': 'hgnc', 'identifier': '6893', 'name': 'MAPT'}}}
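Because the parse result is a plain nested dictionary, it can be consumed without any PyBEL classes. As a small illustration, the sketch below copies the dictionary shown above and reduces it to a (subject, relation, object) triple of `namespace:identifier` strings. The helper names `concept_curie` and `statement_to_triple` are hypothetical and exist only for this example.

```python
# The dictionary shown above, copied verbatim so this runs without PyBEL
parsed = {
    "source": {
        "function": "Protein",
        "concept": {"namespace": "hgnc", "identifier": "4617", "name": "GSK3B"},
    },
    "relation": "regulates",
    "target": {
        "function": "Protein",
        "concept": {"namespace": "hgnc", "identifier": "6893", "name": "MAPT"},
    },
}


def concept_curie(node: dict) -> str:
    """Format a node's concept as 'namespace:identifier'."""
    concept = node["concept"]
    return f"{concept['namespace']}:{concept['identifier']}"


def statement_to_triple(statement: dict) -> tuple:
    """Reduce a parsed statement to (source, relation, target)."""
    return (
        concept_curie(statement["source"]),
        statement["relation"],
        concept_curie(statement["target"]),
    )


print(statement_to_triple(parsed))  # ('hgnc:4617', 'regulates', 'hgnc:6893')
```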
This functionality can also be exposed through a Flask-based web application with python -m pybel.apps.parser after installing flask with pip install flask. Note that the first run takes about 2 seconds to generate the parser, after which each parse is very fast.
PyBEL also installs a command line interface with the command pybel for simple utilities such as data conversion. In this example, a BEL document is compiled then exported to GraphML for viewing in Cytoscape.
$ pybel compile ~/Desktop/example.bel
$ pybel serialize ~/Desktop/example.bel --graphml ~/Desktop/example.graphml
In Cytoscape, open with Import > Network > From File.
Contributions, whether filing an issue, making a pull request, or forking, are appreciated. See CONTRIBUTING.rst for more information on getting involved.
The development of PyBEL has been supported by several projects/organizations (in alphabetical order):
- The Cytoscape Consortium
- Enveda Biosciences
- Fraunhofer Center for Machine Learning
- Fraunhofer Institute for Algorithms and Scientific Computing (SCAI)
- Harvard Program in Therapeutic Science - Laboratory of Systems Pharmacology
- University of Bonn
- DARPA Young Faculty Award W911NF2010255 (PI: Benjamin M. Gyori).
- The European Union, European Federation of Pharmaceutical Industries and Associations (EFPIA), and Innovative Medicines Initiative Joint Undertaking under AETIONOMY [grant number 115568], resources of which are composed of financial contribution from the European Union's Seventh Framework Programme (FP7/2007-2013) and EFPIA companies in kind contribution.
The PyBEL logo was designed by Scott Colby.