Custom Pregel Algorithms in ArangoDB

ArangoDB Database

ArangoDB's Pregel framework simplifies distributed graph processing with support for predefined and customizable algorithms like PageRank and shortest path. The experimental feature allows users to add or modify algorithms dynamically without needing C++ code or restarting the database. Future developments will include gathering user feedback and enhancing the user-friendly front-end language in upcoming ArangoDB versions.

Feature Preview: Custom Pregel
Complex Graph Algorithms made Easy
@arangodb @joerg_schad @hkernbach

tl;dr
● “Many practical computing problems concern large graphs.”
● ArangoDB is a “Beyond Graph Database” supporting multiple data models around a scalable graph foundation
● Pregel is a framework for distributed graph processing
  ○ ArangoDB supports predefined Pregel algorithms, e.g. PageRank, Single-Source Shortest Path and Connected Components.
● Programmable Pregel Algorithms (PPA) allow adding/modifying algorithms on the fly

Disclaimer: This is an experimental feature, and especially the language specification (front-end) is still under development!

Jörg Schad, PhD
Head of Engineering and ML @ArangoDB
● Suki.ai
● Mesosphere
● Architect @SAP Hana
● PhD Distributed DB Systems
● Twitter: @joerg_schad

Heiko Kernbach
Core Engineer (Graphs Team) @ArangoDB
● Graph
● Custom Pregel
● Geo / UI
● Twitter: @hkernbach
● Slack: hkernbach.ArangoDB

● Open Source
● Beyond Graph Database
  ○ Stores, K/V, Documents connected by scalable Graph Processing
● Scalable
  ○ Distributed Graphs
● AQL - SQL-like multi-model query language
● ACID Transactions including Multi Collection Transactions

Pregel Max Value
While not converged (each iteration is one superstep):
● Communicate: send own value to neighbours
● Compute: own value = max value from all messages (+ own value)
Source: https://blog.acolyer.org/2015/05/26/pregel-a-system-for-large-scale-graph-processing/

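The max-value loop above can be sketched as a small single-process simulation. This is only a toy model under stated assumptions: the graph, the vertex names and the function `pregelMaxValue` are made up for illustration, and the rule that only vertices whose value changed send again in the next superstep is a simplification, not something the slide specifies.

```javascript
// Toy single-process simulation of the "max value" Pregel example.
// Graph, vertex names and function name are hypothetical.
function pregelMaxValue(values, edges, maxSupersteps = 100) {
  const current = { ...values };
  // First superstep: every vertex is active and sends its value.
  let active = new Set(Object.keys(current));
  let supersteps = 0;

  while (active.size > 0 && supersteps < maxSupersteps) {
    // Communicate: each active vertex sends its value along its outgoing edges.
    const inbox = {};
    for (const v of active) {
      for (const target of edges[v] || []) {
        (inbox[target] = inbox[target] || []).push(current[v]);
      }
    }
    // Compute: own value = max of all received messages and the own value.
    const changed = new Set();
    for (const [v, messages] of Object.entries(inbox)) {
      const best = Math.max(current[v], ...messages);
      if (best > current[v]) {
        current[v] = best;
        changed.add(v);
      }
    }
    // Assumption: only changed vertices stay active; none left means converged.
    active = changed;
    supersteps += 1;
  }
  return { values: current, supersteps };
}

const result = pregelMaxValue(
  { a: 3, b: 6, c: 2, d: 1 },
  { a: ["b"], b: ["a", "d"], c: ["b", "d"], d: ["c"] }
);
console.log(result.values, result.supersteps);
```

In a real deployment the vertices are spread over DB-Servers and the messages cross the network, which is why the number of supersteps matters for runtime.
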
ArangoDB and Pregel: Status Quo
● https://www.arangodb.com/docs/stable/graphs-pregel.html
● https://www.arangodb.com/pregel-community-detection/

Available Algorithms
● PageRank
● Seeded PageRank
● Single-Source Shortest Path
● Connected Components
  ○ Component
  ○ WeaklyConnected
  ○ StronglyConnected
● Hyperlink-Induced Topic Search (HITS)
● Vertex Centrality
● Effective Closeness
● LineRank
● Label Propagation
● Speaker-Listener Label Propagation

    var pregel = require("@arangodb/pregel");
    pregel.start("pagerank", "graphname", {maxGSS: 100, threshold: 0.00000001, resultField: "rank"});

● Pregel support since 2014
● Predefined algorithms
  ○ Could be extended via C++
● Same platform used for PPA

Challenges: Add and modify algorithms

Programmable Pregel Algorithms (PPA)

    const pregel = require("@arangodb/pregel");
    let pregelID = pregel.start("air", graphName, "<custom-algorithm>");
    var status = pregel.status(pregelID);

● Add/Modify algorithms on-the-fly
  ○ Without C++ code
  ○ Without restarting the Database
● Efficiency (as Pregel) depends on Sharding
  ○ Smart Graphs
  ○ Required: Collocation of vertices and edges

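Because `pregel.start` returns an ID and `pregel.status` is queried separately, a caller typically has to poll until the run finishes. The helper below is a hedged sketch of that pattern, not part of the ArangoDB API: `waitForPregel` is a hypothetical name, the `state` field and the values "running", "storing" and "done" are assumptions about the status object, and both the status lookup and the sleep function are injected so the sketch also runs outside arangosh.

```javascript
// Hypothetical helper: poll a status function until the run leaves a busy state.
// `getStatus` stands in for () => pregel.status(pregelID); `sleep` for a blocking wait.
function waitForPregel(getStatus, sleep, maxPolls = 50) {
  // Assumption: the status object has a `state` string and these values mean "still working".
  const busyStates = new Set(["running", "storing"]);
  for (let poll = 1; poll <= maxPolls; poll++) {
    const status = getStatus();
    if (!busyStates.has(status.state)) {
      return { status, polls: poll };
    }
    sleep(1); // seconds between polls; the interval is arbitrary
  }
  throw new Error(`still busy after ${maxPolls} polls`);
}

// Stubbed usage: a fake status sequence replaces the real Pregel run.
const fakeStates = ["running", "running", "done"];
let calls = 0;
const outcome = waitForPregel(
  () => ({ state: fakeStates[Math.min(calls++, fakeStates.length - 1)] }),
  () => {} // no-op sleep for the stub
);
console.log(outcome.status.state, outcome.polls);
```

Inside arangosh, the first argument would be a closure over `pregel.status(pregelID)` and the second whatever blocking wait the shell offers.
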
Custom Algorithm

    {
      "resultField": "<string>",
      "maxGSS": "<number>",
      "dataAccess": {
        "writeVertex": "<program>",
        "readVertex": "<array>",
        "readEdge": "<array>"
      },
      "vertexAccumulators": "<object>",
      "globalAccumulators": "<object>",
      "customAccumulators": "<object>",
      "phases": "<array>"
    }

Accumulators
Accumulators consume and process the messages that are sent to them during the computational phase (initProgram, updateProgram, onPreStep, onPostStep) of a superstep. After a superstep is done, all messages will have been processed.
● max: stores the maximum of all messages received.
● min: stores the minimum of all messages received.
● sum: sums up all messages received.
● and: computes the logical and of all messages received.
● or: computes the logical or of all messages received.
● store: holds the last received value (non-deterministic).
● list: stores all received values in a list (order is non-deterministic).
● custom

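The accumulator semantics listed above can be illustrated with plain reducers. This is a minimal sketch, not ArangoDB's implementation: the `accumulators` table and the `accumulate` function are hypothetical, and messages are folded in array order here, whereas in a real run the arrival order is non-deterministic (which is why `store` and `list` are flagged above).

```javascript
// Illustrative reducers for the built-in accumulator types.
// Function shapes are this sketch's own; only the semantics follow the list above.
const accumulators = {
  max: (acc, msg) => (acc === undefined ? msg : Math.max(acc, msg)),
  min: (acc, msg) => (acc === undefined ? msg : Math.min(acc, msg)),
  sum: (acc, msg) => (acc === undefined ? 0 : acc) + msg,
  and: (acc, msg) => (acc === undefined ? true : acc) && msg,
  or: (acc, msg) => (acc === undefined ? false : acc) || msg,
  store: (_acc, msg) => msg, // keeps only the last message seen
  list: (acc, msg) => [...(acc || []), msg],
};

// Fold all messages of one superstep into a single accumulator value.
function accumulate(type, messages) {
  return messages.reduce((acc, msg) => accumulators[type](acc, msg), undefined);
}

const numbers = [3, 9, 4];
console.log(accumulate("max", numbers), accumulate("min", numbers), accumulate("sum", numbers));
console.log(accumulate("and", [true, false, true]), accumulate("or", [false, true, false]));
console.log(accumulate("store", numbers), accumulate("list", numbers));
```
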
Custom Algorithm
● resultField (string, optional): Name of the document attribute to store the result in. The vertex computation results are written to all vertices under the given attribute.
● maxGSS (number, required): The maximum number of global supersteps. Once this number of supersteps is reached, the Pregel execution stops.
● dataAccess (object, optional): Allows defining writeVertex, readVertex and readEdge.
  ○ writeVertex: A program that is used to write the results into vertices. If writeVertex is used, the resultField will be ignored.
  ○ readVertex: An array that consists of strings and/or additional arrays (each of which represents a path).
    ■ string: Represents a single attribute at the top level.
    ■ array of strings: Represents a nested path.
  ○ readEdge: An array that consists of strings and/or additional arrays (each of which represents a path).
    ■ string: Represents a single attribute at the top level, which is not nested.
    ■ array of strings: Represents a nested path.
● vertexAccumulators (object, optional): Definition of all used vertex accumulators.
● globalAccumulators (object, optional): Definition of all used global accumulators. Global accumulators are able to access variables at the shared global level.
● customAccumulators (object, optional): Definition of all used custom accumulators.
● phases (array): Array of a single or multiple phase definitions.
● debug (optional): See Debugging.

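The readVertex and readEdge format described above (a string for a top-level attribute, an array of strings for a nested path) can be made concrete with a small projection function. This is a sketch of the idea only: `project` and the sample vertex are hypothetical, and it does not claim to mirror how ArangoDB applies the specification internally.

```javascript
// Sketch of how a readVertex/readEdge specification could select attributes.
// `project` and the sample document are hypothetical, not ArangoDB internals.
function project(doc, spec) {
  const out = {};
  for (const entry of spec) {
    // A string names one top-level attribute; an array names a nested path.
    const path = Array.isArray(entry) ? entry : [entry];
    let source = doc;
    for (const key of path) {
      source = source == null ? undefined : source[key];
    }
    if (source === undefined) continue; // attribute missing: skip it
    // Rebuild the same nesting in the output document.
    let target = out;
    for (const key of path.slice(0, -1)) {
      if (target[key] === undefined) target[key] = {};
      target = target[key];
    }
    target[path[path.length - 1]] = source;
  }
  return out;
}

const vertex = { name: "FRA", geo: { lat: 50.03, lon: 8.57 }, notes: "unused" };
const projected = project(vertex, ["name", ["geo", "lat"]]);
console.log(JSON.stringify(projected));
```

Restricting the attributes that are read keeps the per-vertex data loaded for a run small, which is the apparent motivation for the dataAccess option.
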
Phases - Execution order
Step 1: Initialization
1. onPreStep (Conductor, executed on Coordinator instances)
2. initProgram (Worker, executed on DB-Server instances)
3. onPostStep (Conductor)
Step {2, ...n}: Computation
1. onPreStep (Conductor)
2. updateProgram (Worker)
3. onPostStep (Conductor)

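The execution order above can be written down as a tiny driver that records which hook runs when. The hook names come from the slide; the driver `runPhase`, its arguments and the trace format are hypothetical and only meant to make the ordering explicit.

```javascript
// Records the order in which the hooks of one phase would be called.
// Hook names follow the slide; the driver function itself is hypothetical.
function runPhase(hooks, computationSteps) {
  const trace = [];
  const call = (name, role, step) => {
    trace.push(`step ${step}: ${name} (${role})`);
    if (hooks[name]) hooks[name](step);
  };
  // Step 1: initialization.
  call("onPreStep", "Conductor", 1);
  call("initProgram", "Worker", 1);
  call("onPostStep", "Conductor", 1);
  // Steps 2..n: computation.
  for (let step = 2; step <= 1 + computationSteps; step++) {
    call("onPreStep", "Conductor", step);
    call("updateProgram", "Worker", step);
    call("onPostStep", "Conductor", step);
  }
  return trace;
}

const phaseTrace = runPhase({}, 2);
console.log(phaseTrace.join("\n"));
```
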
Program - Arango Intermediate Representation (AIR)
Lisp-like intermediate representation, represented in JSON and supporting its data types

Specification
● Language Primitives
  ○ Basic Algebraic Operators
  ○ Logical operators
  ○ Comparison operators
  ○ Lists
  ○ Sort
  ○ Dicts
  ○ Lambdas
  ○ Reduce
  ○ Utilities
  ○ Functional
  ○ Variables
  ○ Debug operators
● Math Library
● Special Forms
  ○ let statement
  ○ seq statement
  ○ if statement
  ○ match statement
  ○ for-each statement
  ○ quote and quote-splice statements
  ○ quasi-quote, unquote and unquote-splice statements
  ○ cons statement
  ○ and and or statements

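To give a feel for what "Lisp-like, represented in JSON" means, here is a toy evaluator for programs written as nested JSON arrays. It is explicitly not AIR: the operator names (`+`, `*`, `max`, `>`, `var`), the three-argument `if` and the binding syntax of `let` are illustrative assumptions, and the real language specification should be consulted for the actual forms.

```javascript
// Toy evaluator for a Lisp-like language written as JSON arrays.
// Operator names and semantics are illustrative assumptions, not the AIR spec.
function evaluate(expr, env = {}) {
  if (!Array.isArray(expr)) return expr; // numbers, strings, booleans evaluate to themselves
  const [op, ...args] = expr;
  switch (op) {
    case "if": {
      // Special form: only the chosen branch is evaluated.
      const [cond, thenBranch, elseBranch] = args;
      return evaluate(cond, env) ? evaluate(thenBranch, env) : evaluate(elseBranch, env);
    }
    case "let": {
      // ["let", [["name", valueExpr], ...], body]
      const [bindings, body] = args;
      const scope = { ...env };
      for (const [name, valueExpr] of bindings) scope[name] = evaluate(valueExpr, scope);
      return evaluate(body, scope);
    }
    case "var":
      return env[args[0]]; // look up a bound variable by name
    case "seq":
      // Evaluate all expressions in order and return the last result.
      return args.reduce((_, e) => evaluate(e, env), undefined);
  }
  // Ordinary operators: evaluate all arguments first, then apply.
  const values = args.map((a) => evaluate(a, env));
  switch (op) {
    case "+": return values.reduce((a, b) => a + b, 0);
    case "*": return values.reduce((a, b) => a * b, 1);
    case "max": return Math.max(...values);
    case ">": return values[0] > values[1];
    default: throw new Error(`unknown operator: ${op}`);
  }
}

// The program is plain JSON, so it can be stored or sent like any other document.
const program = JSON.parse(
  '["let", [["x", ["+", 1, 2]]], ["if", [">", ["var", "x"], 2], ["*", ["var", "x"], 10], 0]]'
);
const programResult = evaluate(program);
console.log(programResult);
```

Representing programs as JSON is what lets an algorithm be submitted at runtime as data, without compiling C++ or restarting the database.
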
Pregelator
Simple Foxx service based IDE
https://github.com/arangodb-foxx/pregelator

PPA: What is next?
- Gather Feedback
  - In particular use-cases
  - Missing functions & functionality
- User-friendly Front-End language
- Improve Scale/Performance of underlying Pregel platform
- Algorithm library
- Blog Post (including Jupyter example)

ArangoDB 3.8 (end of year)
- Experimental Feature
- Initial Library
ArangoDB 3.9 (Q1 21)
- Draft for Front-End
- Extended Library
- Platform Improvements
ArangoDB 4.0 (Mid 21)
- GA

Pregel vs AQL
When to (not) use Pregel…
- Can the algorithm be efficiently expressed in Pregel?
  - Counter example: Topological Sort
- Is the graph size worth the loading?

AQL                                                 | Pregel
All Models (Graph, Document, Key-Value, Search, …)  | Iterative Graph Processing
Online Queries                                      | Large Graphs, multiple iterations

How can I start?
● Docker Image: arangodb/enterprise-preview:3.8.0-milestone.3
● Check existing algorithms
● Preview documentation
● Give Feedback
  ○ https://slack.arangodb.com/ -> custom-pregel

Thanks for listening!
Reach out with Feedback/Questions!
• @arangodb
• https://www.arangodb.com/
• docker pull arangodb
Test-drive Oasis: 14 days for free