Custom Pregel Algorithms in ArangoDB

ArangoDB Database

ArangoDB's Pregel framework simplifies distributed graph processing with support for predefined and customizable algorithms like PageRank and shortest path. The experimental feature allows users to add or modify algorithms dynamically without needing C++ code or restarting the database. Future developments will include gathering user feedback and enhancing the user-friendly front-end language in upcoming ArangoDB versions.

Feature Preview: Custom Pregel
Complex Graph Algorithms made Easy
@arangodb @joerg_schad @hkernbach
tl;dr
● “Many practical computing problems concern large graphs.”
● ArangoDB is a “Beyond Graph Database” supporting multiple data models around a scalable graph foundation
● Pregel is a framework for distributed graph processing
  ○ ArangoDB supports predefined Pregel algorithms, e.g. PageRank, Single-Source Shortest Path and Connected Components.
● Programmable Pregel Algorithms (PPA) allow adding/modifying algorithms on the fly
Disclaimer
This is an experimental feature and especially the language specification (front-end) is still under development!
Jörg Schad, PhD
Head of Engineering and ML @ArangoDB
● Suki.ai
● Mesosphere
● Architect @SAP Hana
● PhD Distributed DB Systems
● Twitter: @joerg_schad
Heiko Kernbach
Core Engineer (Graphs Team) @
● Graph
● Custom Pregel
● Geo / UI
● Twitter: @hkernbach
● Slack: hkernbach.ArangoDB
● Open Source
● Beyond Graph Database
  ○ Stores, K/V, Documents connected by scalable Graph Processing
● Scalable
  ○ Distributed Graphs
● AQL - SQL-like multi-model query language
● ACID Transactions including Multi Collection Transactions
https://blog.acolyer.org/2015/05/26/pregel-a-system-for-large-scale-graph-processing/
Pregel Max Value
https://blog.acolyer.org/2015/05/26/pregel-a-system-for-large-scale-graph-processing/
While not converged (each iteration is one superstep):
  Communicate: send own value to neighbours
  Compute: own value = max value from all messages (+ own value)
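The max-value loop can be tried out without a cluster. The following is a minimal single-process sketch, assuming a plain object of vertex values and a list of directed edges; the function name and input shapes are made up for illustration and are not part of ArangoDB's Pregel API.

```javascript
// Toy, single-process simulation of the "max value" Pregel example.
// `values` maps vertex id -> number, `edges` is a list of [from, to] pairs.
// Both shapes are assumptions for this sketch.
function pregelMaxValue(values, edges) {
  const current = { ...values };
  let supersteps = 0;
  let changed = true;
  while (changed) {
    supersteps += 1;
    // Communicate: every vertex sends its own value to its neighbours.
    const inbox = {};
    for (const [from, to] of edges) {
      (inbox[to] = inbox[to] || []).push(current[from]);
    }
    // Compute: own value = max of all received messages and the own value.
    changed = false;
    for (const [vertex, messages] of Object.entries(inbox)) {
      const best = Math.max(current[vertex], ...messages);
      if (best !== current[vertex]) {
        current[vertex] = best;
        changed = true;
      }
    }
  }
  return { values: current, supersteps };
}

// Hypothetical 4-vertex ring: a -> b -> c -> d -> a
const result = pregelMaxValue(
  { a: 3, b: 6, c: 2, d: 1 },
  [["a", "b"], ["b", "c"], ["c", "d"], ["d", "a"]]
);
console.log(result.values, result.supersteps);
```

The loop stops after the first superstep in which no vertex changes its value, which is one simple way to express "converged".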
ArangoDB and Pregel: Status Quo
● https://www.arangodb.com/docs/stable/graphs-pregel.html
● https://www.arangodb.com/pregel-community-detection/
Available Algorithms
● PageRank
● Seeded PageRank
● Single-Source Shortest Path
● Connected Components
  ○ Component
  ○ WeaklyConnected
  ○ StronglyConnected
● Hyperlink-Induced Topic Search (HITS)
● Vertex Centrality
● Effective Closeness
● LineRank
● Label Propagation
● Speaker-Listener Label Propagation

    var pregel = require("@arangodb/pregel");
    pregel.start("pagerank", "graphname", {maxGSS: 100, threshold: 0.00000001, resultField: "rank"});

● Pregel support since 2014
● Predefined algorithms
  ○ Could be extended via C++
● Same platform used for PPA
Challenges
Add and modify algorithms
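To make the `maxGSS` and `threshold` options in the call above more concrete, here is a rough plain-JavaScript PageRank sketch. It assumes `threshold` means "stop once no rank changes by more than this between supersteps" and that every vertex has at least one outgoing edge; ArangoDB's built-in implementation may differ in both respects, and the damping factor of 0.85 is a conventional choice rather than something stated in the slides.

```javascript
// Toy PageRank showing how maxGSS and threshold bound the iteration.
// `outEdges` maps vertex id -> array of target vertex ids (assumed input shape).
// Assumes every vertex has at least one outgoing edge.
function pageRank(outEdges, { maxGSS = 100, threshold = 1e-8, damping = 0.85 } = {}) {
  const vertices = Object.keys(outEdges);
  const n = vertices.length;
  let rank = Object.fromEntries(vertices.map((v) => [v, 1 / n]));
  let supersteps = 0;
  while (supersteps < maxGSS) {
    supersteps += 1;
    const next = Object.fromEntries(vertices.map((v) => [v, (1 - damping) / n]));
    // Each vertex splits its current rank evenly among its outgoing edges.
    for (const v of vertices) {
      const share = rank[v] / outEdges[v].length;
      for (const target of outEdges[v]) next[target] += damping * share;
    }
    const maxDiff = Math.max(...vertices.map((v) => Math.abs(next[v] - rank[v])));
    rank = next;
    if (maxDiff < threshold) break; // converged before reaching maxGSS
  }
  return { rank, supersteps };
}

// Hypothetical 3-vertex graph.
const graph = { a: ["b", "c"], b: ["c"], c: ["a"] };
const converged = pageRank(graph);
const capped = pageRank(graph, { maxGSS: 3 });
console.log(converged.supersteps, capped.supersteps, converged.rank);
```

With a small `maxGSS` the run stops at the cap even though the ranks are still moving, which is why both limits are usually set together.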
Programmable Pregel Algorithms (PPA)

    const pregel = require("@arangodb/pregel");
    let pregelID = pregel.start("air", graphName, "<custom-algorithm>");
    var status = pregel.status(pregelID);

● Add/Modify algorithms on-the-fly
  ○ Without C++ code
  ○ Without restarting the Database
● Efficiency (as Pregel) depends on Sharding
  ○ Smart Graphs
  ○ Required: Collocation of vertices and edges
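Since `pregel.start` returns immediately with an ID, a caller typically polls `pregel.status` until the run is over. The helper below is a sketch of that loop, not part of the `@arangodb/pregel` module: the status getter and the wait function are injected so the code also runs outside arangosh, and the list of terminal state names is an assumption that should be checked against the Pregel documentation for the version in use.

```javascript
// State names assumed to mean "the run is over"; verify against the docs.
const TERMINAL_STATES = ["done", "canceled", "in error", "fatal error"];

// Poll a Pregel job until it reaches a terminal state.
// `getStatus` returns an object with a `state` string; `wait` blocks for N seconds.
// Both are hypothetical parameters introduced for this sketch.
function waitForPregel(getStatus, wait, { intervalSec = 1, maxPolls = 60 } = {}) {
  for (let i = 0; i < maxPolls; i++) {
    const status = getStatus();
    if (TERMINAL_STATES.includes(status.state)) return status;
    wait(intervalSec);
  }
  throw new Error("Pregel job did not finish in time");
}

// Stand-in for pregel.status(pregelID) so the sketch runs outside arangosh.
const fakeStates = ["running", "running", "done"];
let waits = 0;
const finalStatus = waitForPregel(
  () => ({ state: fakeStates.shift() }),
  () => { waits += 1; }
);
console.log(finalStatus.state, waits);
```

Inside arangosh the same helper could be called as `waitForPregel(() => pregel.status(pregelID), require("internal").wait)`.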
Custom Algorithm

    {
      "resultField": "<string>",
      "maxGSS": "<number>",
      "dataAccess": {
        "writeVertex": "<program>",
        "readVertex": "<array>",
        "readEdge": "<array>"
      },
      "vertexAccumulators": "<object>",
      "globalAccumulators": "<object>",
      "customAccumulators": "<object>",
      "phases": "<array>"
    }

Accumulators
Accumulators are used to consume and process messages which are being sent to them during the computational phase (initProgram, updateProgram, onPreStep, onPostStep) of a superstep. After a superstep is done, all messages will be processed.
● max: stores the maximum of all messages received.
● min: stores the minimum of all messages received.
● sum: sums up all messages received.
● and: computes and on all messages received.
● or: computes or on all messages received.
● store: holds the last received value (non-deterministic).
● list: stores all received values in a list (order is non-deterministic).
● custom
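The built-in accumulator types listed above each boil down to a small fold over the incoming messages. The following is a toy model of that idea in plain JavaScript; the function and object names are invented for this sketch and say nothing about how ArangoDB implements accumulators internally.

```javascript
// Toy accumulators: each takes the current value and one message
// and returns the new value. Names mirror the list above.
const accumulators = {
  max: (acc, msg) => Math.max(acc, msg),
  min: (acc, msg) => Math.min(acc, msg),
  sum: (acc, msg) => acc + msg,
  and: (acc, msg) => acc && msg,
  or: (acc, msg) => acc || msg,
  store: (_acc, msg) => msg,          // keeps whichever message arrives last
  list: (acc, msg) => [...acc, msg],  // keeps messages in arrival order
};

// Fold all messages of one superstep into an accumulator value.
function accumulate(type, initial, messages) {
  return messages.reduce((acc, msg) => accumulators[type](acc, msg), initial);
}

const messages = [3, 7, 5];
console.log(accumulate("max", -Infinity, messages));
console.log(accumulate("sum", 0, messages));
console.log(accumulate("and", true, [true, false, true]));
console.log(accumulate("list", [], messages));
```

In a distributed run the arrival order of messages is not fixed, which is why `store` and the ordering inside `list` are described as non-deterministic.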
Custom Algorithm
● resultField (string, optional): Name of the document attribute to store the result in. Each vertex stores its computation result under the given attribute.
● maxGSS (number, required): The maximum number of global supersteps. Once this number of supersteps is reached, the Pregel execution stops.
● dataAccess (object, optional): Defines writeVertex, readVertex and readEdge.
  ○ writeVertex: A program that is used to write the results into vertices. If writeVertex is used, the resultField will be ignored.
  ○ readVertex: An array that consists of strings and/or additional arrays (that represent a path).
    ■ string: Represents a single attribute at the top level.
    ■ array of strings: Represents a nested path.
  ○ readEdge: An array that consists of strings and/or additional arrays (that represent a path).
    ■ string: Represents a single, non-nested path at the top level.
    ■ array of strings: Represents a nested path.
● vertexAccumulators (object, optional): Definition of all used vertex accumulators.
● globalAccumulators (object, optional): Definition of all used global accumulators. Global accumulators are able to access variables at the shared global level.
● customAccumulators (object, optional): Definition of all used custom accumulators.
● phases (array): Array of a single or multiple phase definitions.
● debug (optional): See Debugging.
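The field descriptions above translate fairly directly into a shape check that can be run before submitting a definition. The validator below is a hypothetical helper, not something ArangoDB ships; it only checks the types named on the slide, and it assumes that `phases` must contain at least one entry. The attribute names and the `name` key in the example phase are placeholders.

```javascript
const isObject = (x) => x !== null && typeof x === "object" && !Array.isArray(x);

// A path list holds strings (top-level attributes) or arrays of strings (nested paths).
const isPathList = (x) =>
  Array.isArray(x) &&
  x.every((p) => typeof p === "string" ||
    (Array.isArray(p) && p.every((s) => typeof s === "string")));

// Returns a list of problems found in a custom algorithm definition.
function validateAlgorithm(def) {
  const errors = [];
  if (typeof def.maxGSS !== "number") errors.push("maxGSS (number) is required");
  if (!Array.isArray(def.phases) || def.phases.length === 0) {
    errors.push("phases must be an array with at least one phase");
  }
  if ("resultField" in def && typeof def.resultField !== "string") {
    errors.push("resultField must be a string");
  }
  for (const key of ["dataAccess", "vertexAccumulators", "globalAccumulators", "customAccumulators"]) {
    if (key in def && !isObject(def[key])) errors.push(`${key} must be an object`);
  }
  const access = isObject(def.dataAccess) ? def.dataAccess : {};
  for (const key of ["readVertex", "readEdge"]) {
    if (key in access && !isPathList(access[key])) {
      errors.push(`dataAccess.${key} must be an array of strings and/or string arrays`);
    }
  }
  return errors;
}

// Hypothetical definition; "<program>" stands in for real AIR programs.
const example = {
  resultField: "maxValue",
  maxGSS: 50,
  dataAccess: { readVertex: ["value", ["meta", "score"]] },
  vertexAccumulators: {},
  phases: [{ name: "main", initProgram: "<program>", updateProgram: "<program>" }],
};
console.log(validateAlgorithm(example));
console.log(validateAlgorithm({ phases: [] }));
```

A definition that passes this check can still be rejected by the server, since the accumulator and phase contents are not inspected here.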
Phases - Execution order
Step 1: Initialization
1. onPreStep (Conductor, executed on Coordinator instances)
2. initProgram (Worker, executed on DB-Server instances)
3. onPostStep (Conductor)
Step {2, ...n}: Computation
1. onPreStep (Conductor)
2. updateProgram (Worker)
3. onPostStep (Conductor)
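The execution order above can be written down as a short generator of hook calls. This sketch only lists which hook runs where for a given number of steps within one phase; the function name and the record shape are made up for illustration.

```javascript
// List the hooks of one phase in the order they run.
// Step 1 uses initProgram, every later step uses updateProgram.
function executionOrder(steps) {
  const order = [];
  for (let step = 1; step <= steps; step++) {
    const workerProgram = step === 1 ? "initProgram" : "updateProgram";
    order.push(
      { step, hook: "onPreStep", runsOn: "Conductor" },
      { step, hook: workerProgram, runsOn: "Worker" },
      { step, hook: "onPostStep", runsOn: "Conductor" }
    );
  }
  return order;
}

for (const { step, hook, runsOn } of executionOrder(3)) {
  console.log(`step ${step}: ${hook} (${runsOn})`);
}
```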
Program - Arango Intermediate Representation (AIR)
Lisp-like intermediate representation, represented in JSON and supporting JSON's data types
Specification
● Language Primitives
  ○ Basic Algebraic Operators
  ○ Logical operators
  ○ Comparison operators
  ○ Lists
  ○ Sort
  ○ Dicts
  ○ Lambdas
  ○ Reduce
  ○ Utilities
  ○ Functional
  ○ Variables
  ○ Debug operators
● Math Library
● Special Form
  ○ let statement
  ○ seq statement
  ○ if statement
  ○ match statement
  ○ for-each statement
  ○ quote and quote-splice statements
  ○ quasi-quote, unquote and unquote-splice statements
  ○ cons statement
  ○ and and or statements
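To give a feel for what "Lisp-like, represented in JSON" means, here is a toy evaluator for programs written as nested JSON arrays, where the first element names the operation. This is not AIR: the handful of operators and their argument shapes are assumptions chosen for this sketch, and AIR's real primitives and special forms may look different.

```javascript
// Toy evaluator for Lisp-like programs written as JSON arrays.
// The operator set is an illustrative subset invented for this sketch,
// NOT the AIR specification.
function evaluate(expr, env = {}) {
  if (!Array.isArray(expr)) {
    // A string that is bound in the environment is a variable; anything else is a literal.
    const bound = typeof expr === "string" && Object.prototype.hasOwnProperty.call(env, expr);
    return bound ? env[expr] : expr;
  }
  const [op, ...args] = expr;
  switch (op) {
    case "+":
      return args.reduce((sum, a) => sum + evaluate(a, env), 0);
    case "*":
      return args.reduce((product, a) => product * evaluate(a, env), 1);
    case "<":
      return evaluate(args[0], env) < evaluate(args[1], env);
    case "if":
      return evaluate(args[0], env) ? evaluate(args[1], env) : evaluate(args[2], env);
    case "let": {
      // ["let", [[name, value], ...], body]
      const [bindings, body] = args;
      const scope = { ...env };
      for (const [name, value] of bindings) scope[name] = evaluate(value, scope);
      return evaluate(body, scope);
    }
    default:
      throw new Error(`unknown operator: ${op}`);
  }
}

const program = ["let", [["x", 4], ["y", 6]],
  ["if", ["<", "x", "y"], ["+", "x", "y"], ["*", "x", "y"]]];
console.log(evaluate(program));
```

Because the program is ordinary JSON, it can be stored, sent over HTTP and modified like any other document, which fits the goal of changing algorithms without recompiling C++.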
Pregelator
Simple Foxx service based IDE
https://github.com/arangodb-foxx/pregelator
PPA: What is next?
- Gather Feedback
  - In particular use-cases
  - Missing functions & functionality
- User-friendly Front-End language
- Improve Scale/Performance of underlying Pregel platform
- Algorithm library
- Blog Post (including Jupyter example)
ArangoDB 3.8 (end of year)
- Experimental Feature
- Initial Library
ArangoDB 3.9 (Q1 21)
- Draft for Front-End
- Extended Library
- Platform Improvements
ArangoDB 4.0 (Mid 21)
- GA
Pregel vs AQL
When to (not) use Pregel…
- Can the algorithm be efficiently expressed in Pregel?
  - Counter example: Topological Sort
- Is the graph size worth the loading?
AQL: All Models (Graph, Document, Key-Value, Search, …); Online Queries
Pregel: Iterative Graph Processing; Large Graphs, multiple iterations
How can I start?
● Docker Image: arangodb/enterprise-preview:3.8.0-milestone.3
● Check existing algorithms
● Preview documentation
● Give Feedback
  ○ https://slack.arangodb.com/ -> custom-pregel
Thanks for listening!
Reach out with Feedback/Questions!
• @arangodb
• https://www.arangodb.com/
• docker pull arangodb
Test-drive Oasis 14 days for free