sfproductlabs/jelass
With a database like this, all your friends will be jealous.
Elassandra stores Elastic data on Cassandra. So there's no double up on this system. Cassandra is the boss. Elastic runs on top of it and allows it to be useful (searchable, queryable, etc.). Janus comes to town and adds all the graph functionality LinkedIn could ever need. All under the one roof.
How is this different from straight Janus? Janus' elastic data isn't stored in Cassandra. This has all 3 together. Bullet-proof.
https://hub.docker.com/repository/docker/sfproductlabs/jelass
See example: https://github.com/sfproductlabs/tracker/blob/master/docker-compose.yml
Ensure you have enough memory.
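A quick way to check available memory on a Linux host before starting the container is sketched below. The helper names `mem_available_mb` and `has_enough_memory` are hypothetical, and the threshold in the usage comment is only a placeholder, not a figure from this project; pick one that suits your deployment.

```shell
# Hypothetical helpers (not part of this repository).
# Assumes a Linux host that exposes MemAvailable in /proc/meminfo.

# Print the available memory in MB, read from a meminfo-style file.
mem_available_mb() (
  meminfo="${1:-/proc/meminfo}"
  awk '$1 == "MemAvailable:" { printf "%d\n", $2 / 1024 }' "$meminfo"
)

# Succeed if at least REQUIRED_MB of memory is available.
has_enough_memory() (
  required_mb="$1"
  meminfo="${2:-/proc/meminfo}"
  available_mb="$(mem_available_mb "$meminfo")"
  [ -n "$available_mb" ] && [ "$available_mb" -ge "$required_mb" ]
)

# Usage (4096 is a placeholder threshold, not a recommendation):
#   has_enough_memory 4096 || echo "Warning: low memory" >&2
```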
cqlsh --ssl
- Remotely:
./bin/gremlin.sh
then :remote connect tinkerpop.server conf/remote.yaml
- Or locally:
./bin/gremlin.sh
then graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
- etc.
Then try the basic demo:
On the console hosting docker run:
docker ps
#then replace [container_number] with your docker container hash
docker exec -it [container_number] bash
Then inside the docker container:
cd /app/ela/janusgraph-full-0.5.2
./bin/gremlin.sh
Then inside the gremlin> console (also works remotely; you may need to change the IP):
cluster = Cluster.open('conf/remote-objects.yaml')
graph = EmptyGraph.instance()
g = graph.traversal().withRemote(DriverRemoteConnection.using(cluster, "g"))
// graph = EmptyGraph.instance()
// g = graph.traversal().withRemote('conf/remote-graph.properties')
// TinkerPop Predicates
g.V().has('age',within(5000))
g.V().has('age',without(5000))
g.V().has('age',within(5000,45))
g.V().has('age',inside(45,5000)).valueMap(true)
g.V().and(has('age',between(45,5000)),has('name',within('pluto'))).valueMap(true)
g.V().or(has('age',between(45,5000)),has('name',within('pluto','neptune'))).valueMap(true)
// Janus Graph Geo Predicates
g.E().has('place', geoIntersect(Geoshape.circle(37.97, 23.72, 50)))
g.E().has('place', geoWithin(Geoshape.circle(37.97, 23.72, 50)))
g.E().has('place', geoDisjoint(Geoshape.circle(37.97, 23.72, 50)))
// master branch only
g.addV().property('place', Geoshape.circle(37.97, 23.72, 50))
g.V().has('place', geoContains(Geoshape.point(37.97, 23.72)))
// Janus Graph Text Predicates
g.V().has('name',textContains('neptune')).valueMap(true)
g.V().has('name',textContainsPrefix('nep')).valueMap(true)
g.V().has('name',textContainsRegex('nep.*')).valueMap(true)
g.V().has('name',textPrefix('n')).valueMap(true)
g.V().has('name',textRegex('.*n.*')).valueMap(true)
// master branch only
g.V().has('name',textContainsFuzzy('neptun')).valueMap(true)
g.V().has('name',textFuzzy('nepitne')).valueMap(true)
You can also run the examples locally:
graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
GraphOfTheGodsFactory.load(graph)
g = graph.traversal()
saturn = g.V().has('name', 'saturn').next()
g.V(saturn).valueMap()
g.V(saturn).in('father').in('father').values('name')
//Add a fulltext index on a new property alias
mgmt = graph.openManagement()
summary = mgmt.makePropertyKey('alias').dataType(String.class).make()
mgmt.buildIndex('alias', Vertex.class).addKey(summary, Mapping.TEXTSTRING.asParameter()).buildMixedIndex("search")
mgmt.commit()
g.addV('person').property('alias','bob')
g.V().has('alias', textContains('bob')).hasNext()
graph.tx().commit()
./bin/gremlin.sh
graph = JanusGraphFactory.open('conf/gremlin-server/janusgraph-cql-es-server.properties')
g = graph.traversal()
g.V().drop().iterate()
or
JanusGraphFactory.drop(graph);
mgmt = graph.openManagement()
mgmt.printSchema()
https://elassandra.readthedocs.io/
In a production environment, we recommend modifying some system settings, such as disabling swap. This guide shows you how to do it. On Linux, you should install jemalloc.
Setup batch loading for the service:
echo "storage.batch-loading=true" >> ./conf/gremlin-server/janusgraph-cql-es-server.properties
echo "schema.default=none" >> ./conf/gremlin-server/janusgraph-cql-es-server.properties
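Because `echo ... >>` appends a new line every time it runs, repeating the setup leaves duplicate entries in the properties file. A minimal sketch of an idempotent alternative follows; `set_property` is a hypothetical helper, not something shipped in this image, and it assumes plain `key=value` lines with no spaces around the `=`.

```shell
# Hypothetical helper (not part of this repository).
# set_property FILE KEY VALUE
# Sets KEY=VALUE in a properties file: replaces the first existing line
# for KEY, drops any later duplicates, or appends the line if KEY is absent.
set_property() (
  file="$1"; key="$2"; value="$3"
  [ -f "$file" ] || : > "$file"
  tmp="$(mktemp)" || exit 1
  awk -F= -v k="$key" -v v="$value" '
    $1 == k { if (!done) print k "=" v; done = 1; next }
    { print }
    END { if (!done) print k "=" v }
  ' "$file" > "$tmp" && cat "$tmp" > "$file"
  status=$?
  rm -f "$tmp"
  exit "$status"
)

# Usage, with the same file and settings as above:
#   props=./conf/gremlin-server/janusgraph-cql-es-server.properties
#   set_property "$props" storage.batch-loading true
#   set_property "$props" schema.default none
```

Running the helper twice leaves a single line per key, so it can sit in a provisioning script that may be re-run.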
docker run -p 8889:8888 -d --name graph-explorer sfproductlabs/graph-explorer:latest
Open the URL: http://localhost:8889
Then connect to: ws://localhost:8182/gremlin
The creator created a great little CRUD intro.
After you have created your first few nodes and edges try this in the query editor:
nodes=g.V().toList();edges=g.E().toList();[nodes,edges]
https://cassandra.apache.org/third-party/
Back up a single instance (the example uses the keyspace scrp; replace it with your keyspace name):
cqlsh --ssl -e "desc scrp" > /tmp/scrp.cql
nodetool snapshot scrp
cd /var/lib/cassandra
tar -czvf /tmp/scrp.tgz $(find . -type f | grep 1603309754293)
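The number in the `grep` above is the snapshot tag that `nodetool snapshot` printed for that particular run, so it changes with every backup. A rough sketch of looking up the newest numeric tag instead is shown below. `latest_snapshot_tag` is a hypothetical helper, and it assumes the usual Cassandra layout of `<data_dir>/<keyspace>/<table>/snapshots/<tag>`, with `/var/lib/cassandra/data` as the data directory.

```shell
# Hypothetical helper (not part of this repository).
# latest_snapshot_tag DATA_DIR KEYSPACE
# Prints the largest all-numeric snapshot tag found under
# DATA_DIR/KEYSPACE/<table>/snapshots/, or nothing if none exists.
latest_snapshot_tag() (
  data_dir="$1"; keyspace="$2"
  for dir in "$data_dir/$keyspace"/*/snapshots/*/; do
    [ -d "$dir" ] || continue
    basename "$dir"
  done | grep -E '^[0-9]+$' | sort -n | tail -n 1
)

# Usage (paths are assumptions based on the example above):
#   tag="$(latest_snapshot_tag /var/lib/cassandra/data scrp)"
#   cd /var/lib/cassandra
#   tar -czvf /tmp/scrp.tgz $(find . -type f | grep "$tag")
```

Named snapshots (created with `nodetool snapshot -t <name>`) are ignored by this sketch, since only numeric tags can be ordered by time.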
Restore the instance by copying into a directory:
cd /tmp/
cqlsh --ssl -f /tmp/scrp.cql
tar -xzvf /tmp/scrp.tgz
cd /tmp/data/
find . -type f -execdir mv {} ../.. \;
cd scrp
for x in *; do sstableloader -v --conf-path /etc/cassandra/cassandra.yaml -d 172.19.0.3 /tmp/data/scrp/$x; done
I personally use a grandfather-father-son model for backups, using a tool called Borg:
https://www.borgbackup.org/demo.html
curl -XGET http://$CASSANDRA_HOST:9200/_cluster/state?pretty
nodetool repair -full
nodetool cleanup
nodetool flush
#nodetool rebuild_index sfpla events_recent events_recent_idx
nodetool gossipinfo
nodetool tpstats
nodetool describecluster
nodetool statusthrift
nodetool statusgossip
nodetool ring
nodetool status
nodetool status elastic_admin
nodetool cfstats | grep read | grep latency
#less /var/log/cassandra/system.log
# ...
#cqlsh --ssl
#cqlsh> select * from elastic_admin.Metadata_log;
https://docs.janusgraph.org/connecting/python/
- TODO: Connecting to spark/superset
- TODO: Visualization in Elassandra. Superset. Spark.
- Docker with SSL by default
- Nginx SSL for elastic search (Available on port 443 & port 9343, using nginx reverse proxy)
- Cassandra client and server keystores by default
- TODO: add nginx streaming SSL for tinkerpop on 8182
Ex. alter keyspace elastic_admin WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1' : 2};