Uploaded by bcantrill

Bringing the Unix Philosophy to Big Data

The document summarizes the Unix philosophy of building systems out of small, single-purpose programs and how this approach can be applied to big data problems. It describes how Joyent's Manta object storage system brings this philosophy to big data by combining ZFS for scalable storage with OS-level virtualization using zones to allow Unix tools and approaches to be used on large datasets. Manta allows computations to be run directly on stored data rather than requiring data movement, enabling terse Unix-style one-liners to solve problems like word counting on big data.

In this document

Introduction to Unix, its minimalist philosophy, and how it revolutionized systems thinking.

Comparison of Unix solutions and traditional programming approaches to word frequency challenges.

Connection of Big Data problems to earlier Unix challenges, highlighting the lack of Unix philosophy.

Challenges of scaling for Big Data, including multi-tenancy and the need to leverage Unix philosophies.

Overview of scalable storage protocols: block, file, and object, emphasizing their pros and cons.

Discussion on OS-level virtualization vs hardware virtualization, introducing lightweight containers.

Combining ZFS and Zones for an efficient object store that leverages Unix for Big Data applications.

Introduction of Manta, a scalable system utilizing Unix philosophy for processing large data efficiently.

Manta's design principles including consistency preferences, hierarchical storage, and SDK support.

Prospects of compute/data convergence and Manta's role as a pioneering system in future Big Data solutions.

Information on Manta, including product details, documentation, and community engagement opportunities.

Bringing the Unix Philosophy to Big Data
Bryan Cantrill, SVP, Engineering
bryan@joyent.com
@bcantrill

Unix
• When Unix appeared in the early 1970s, it was not just a new system, but a new way of thinking about systems
• Instead of a sealed monolith, the operating system was a collection of small, easily understood programs
• First Edition Unix (1971) contained many programs that we still use today (ls, rm, cat, mv)
• Its very name conveyed this minimalist aesthetic: Unix is a homophone of “eunuchs”, a castrated Multics
    We were a bit oppressed by the big system mentality. Ken wanted to do something simple. (Dennis Ritchie)

Unix: Let there be light
• In 1969, Doug McIlroy had the idea of connecting different components:
    At the same time that Thompson and Ritchie were sketching out a file system, I was sketching out how to do data processing on the blackboard by connecting together cascades of processes
• This was the primordial pipe, but it took three years to persuade Thompson to adopt it:
    And one day I came up with a syntax for the shell that went along with the piping, and Ken said, “I’m going to do it!”

Unix: ...and there was light
    And the next morning we had this orgy of one-liners. (Doug McIlroy)

The Unix philosophy
• The pipe, coupled with the small-system aesthetic, gave rise to the Unix philosophy, as articulated by Doug McIlroy:
    • Write programs that do one thing and do it well
    • Write programs to work together
    • Write programs that handle text streams, because that is a universal interface
• Four decades later, this philosophy remains the single most important revolution in software systems thinking!

Doug McIlroy v. Don Knuth: FIGHT!
• In 1986, Jon Bentley posed the challenge that became the Epic Rap Battle of computer science history:
    Read a file of text, determine the n most frequently used words, and print out a sorted list of those words along with their frequencies.
• Don Knuth’s solution: an elaborate program in WEB, a Pascal-like literate programming system of his own invention, using a purpose-built algorithm
• Doug McIlroy’s solution shows the power of the Unix philosophy:
    tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c | sort -rn | sed ${1}q

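A minimal sketch of McIlroy's pipeline wrapped in a shell function, with one comment per stage. The function name top_words and the sample sentence are assumptions made for illustration; only the pipeline itself comes from the slide.

```shell
# top_words N: read text on stdin, print the N most frequent words with counts.
# (top_words is a hypothetical name; the pipeline is McIlroy's.)
top_words() {
    tr -cs A-Za-z '\n' |   # turn every run of non-letters into a single newline
    tr A-Z a-z |           # fold upper case to lower case
    sort |                 # bring identical words next to each other
    uniq -c |              # collapse each run into "count word"
    sort -rn |             # order by count, highest first
    sed "${1}q"            # quit after printing N lines
}

# Usage on a made-up sentence: "the" occurs 3 times and "and" twice.
printf 'the cat and the hat and the bat\n' | top_words 2
```

The usage line prints "the" with a count of 3 and then "and" with a count of 2; the width of the padding in front of each count depends on the local uniq implementation.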
Big Data: History repeats itself?
• The original Google MapReduce paper (Dean et al., OSDI ’04) poses a problem disturbingly similar to Bentley’s challenge nearly two decades prior:
    Count of URL Access Frequency: The map function processes logs of web page requests and outputs ⟨URL, 1⟩. The reduce function adds together all values for the same URL and emits a ⟨URL, total count⟩ pair
• But the solutions do not adhere to the Unix philosophy...
• ...and nor do they make use of the substantial Unix foundation for data processing
• e.g., Appendix A of the OSDI ’04 paper has a 71 line word count in C++, with nary a wc in sight

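For comparison, the URL access frequency problem quoted above can be sketched with standard Unix tools. This is an illustration under stated assumptions, not code from the talk or the paper: the function names map_urls and reduce_counts are hypothetical, the log lines are made up, and the URL is assumed to be the 7th whitespace-separated field, as in Common Log Format.

```shell
# map: emit "URL 1" for every request line.
# Assumption: the URL is field 7, as in Common Log Format.
map_urls() {
    awk '{ print $7, 1 }'
}

# reduce: add together all values for the same URL, emit "URL total".
# The trailing sort only makes the output order deterministic.
reduce_counts() {
    awk '{ total[$1] += $2 } END { for (url in total) print url, total[url] }' | sort
}

# Usage on three made-up log lines.
printf '%s\n' \
    '10.0.0.1 - - [03/Jul/2013:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512' \
    '10.0.0.2 - - [03/Jul/2013:10:00:01 +0000] "GET /about.html HTTP/1.1" 200 128' \
    '10.0.0.1 - - [03/Jul/2013:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 512' |
    map_urls | reduce_counts
```

Here the pipe plays the role of the shuffle between the two phases; on a single machine nothing else is needed.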
Big Data: Challenges
• Must be able to scale storage to allow for “big data”: quantities of data that dwarf a single machine
• Must allow for massively parallel execution
• Must allow for multi-tenancy
• To make use of both the Unix philosophy and its toolset, must be able to virtualize the operating system

Scaling storage
• There are essentially three protocols for scalable storage: block, file and object
• Block (i.e., a SAN) is far too low an abstraction, and notoriously expensive to scale
• File (i.e., NAS) is too permissive an abstraction: it implies a coherent store for arbitrary (partial) writes, trying (and failing) to be both C and A in CAP
• Object (e.g., S3) is similar “enough” to a file-based abstraction, but by not allowing partial writes, allows for proper CAP tradeoffs

Object storage
• Object storage systems do not allow for partial updates
• For both durability and availability, objects are generally erasure encoded across spindles on different nodes
• A different approach is to have a highly reliable local file system that erasure encodes across local spindles, with entire objects duplicated across nodes for availability
• ZFS pioneered both reliability and efficiency of this model with RAID-Z, and has refined it over the past decade of production use
• ZFS is one of the four foundational technologies in Joyent’s open source SmartOS

Virtualizing the operating system?
• Historically, since the 1960s, systems have been virtualized at the level of hardware
• Hardware virtualization has its advantages, but it’s heavyweight: operating systems are not designed to share resources like DRAM, CPU, I/O devices, etc.
• One can instead virtualize at the level of the operating system: a single OS kernel that creates lightweight containers, on the metal but securely partitioned
• Pioneered by BSD’s jails; taken to a logical extreme by zones found in Joyent’s SmartOS

Idea: ZFS + Zones?
• Can we combine the efficiency and reliability of ZFS with the abstraction provided by zones to develop an object store that has compute as a first-class citizen?
• ZFS rollback allows for zones to be trashed: simply roll back the zone after compute completes on an object
• Add a job scheduling system that allows for both map and reduce phases of distributed work
• Would allow for the Unix toolset to be used on arbitrarily large amounts of data, unlocking big data one-liners
• If it perhaps seems obvious now, it wasn’t at the time...

Manta: ZFS + Zones!
• Building a sophisticated distributed system on top of ZFS and zones, we have built Manta, an internet-facing object storage system offering in situ compute
• That is, the description of compute can be brought to where objects reside instead of having to backhaul objects to transient compute
• The abstractions made available for computation are anything that can run on the OS...
• ...and as a reminder, the OS (Unix) was built around the notion of ad hoc unstructured data processing, and allows for remarkably terse expressions of computation

Manta: Unix for Big Data
• Manta allows for an arbitrarily scalable variant of McIlroy’s solution to Bentley’s challenge:
    mfind -t o /bcantrill/public/v7/usr/man |
        mjob create -o \
            -m "tr -cs A-Za-z '\n' | tr A-Z a-z | sort | uniq -c" \
            -r "awk '{ x[\$2] += \$1 } END { for (w in x) { print x[w] \" \" w } }' | sort -rn | sed ${1}q"
• This description is not only terse, it is high performing: data is left at rest, with the “map” phase doing heavy reduction of the data stream
• As such, Manta, like Unix, is not merely syntactic sugar; it converges compute and data in a new way

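The division of labor in that job can be tried out on a single machine. The sketch below is a local simulation only: it does not call mfind or mjob, and the function names map_phase, reduce_phase and word_count, along with the two sample files, are assumptions made for illustration. The map and reduce command strings are the ones from the slide.

```shell
# Map phase: run once per input file, emitting "count word" lines.
# In Manta this would run where each object resides; here it runs locally.
map_phase() {
    tr -cs A-Za-z '\n' < "$1" | tr A-Z a-z | sort | uniq -c
}

# Reduce phase: run once over the concatenated map output, summing the
# per-file counts for each word and keeping the top N.
reduce_phase() {
    awk '{ x[$2] += $1 } END { for (w in x) { print x[w] " " w } }' |
        sort -rn | sed "${1}q"
}

# word_count N FILE...: top N words across all files (hypothetical helper).
word_count() {
    n=$1
    shift
    for f in "$@"; do
        map_phase "$f"
    done | reduce_phase "$n"
}

# Usage on two made-up files: "pipe" occurs 3 times in total, "sort" twice.
dir=$(mktemp -d)
printf 'pipe pipe sort\n' > "$dir/a.txt"
printf 'pipe sort uniq\n' > "$dir/b.txt"
word_count 2 "$dir/a.txt" "$dir/b.txt"
rm -r "$dir"
```

Because each map invocation already collapses its file to one line per distinct word, the reduce phase sees far less data than the raw text, which is the "heavy reduction of the data stream" the slide refers to.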
Manta: CAP tradeoffs
• Eventual consistency represents the wrong CAP tradeoffs for most; we prefer consistency over availability for writes (but still availability for reads)
• Many more details: http://dtrace.org/blogs/dap/2013/07/03/fault-tolerance-in-manta/
• Celebrity endorsement: [image]

Manta: Other design principles
• Hierarchical storage is an excellent idea (ht: Multics); Manta implements proper directories, delimited with a forward slash
• Manta implements a snapshot/link hybrid dubbed a snaplink; can be used to effect versioning
• Manta has full support for CORS headers
• Manta uses SSH-based HTTP auth for client-side tooling (IETF draft-cavage-http-signatures-00)
• Manta SDKs exist for node.js, Java, Ruby, Python
• “npm install manta” for command line interface

Manta and the future of big data
• We believe compute/data convergence to be the future of big data: stores of record must support computation as a first-class, in situ operation
• We believe that Unix is a natural way of expressing this computation, and that the OS is the right level at which to virtualize to support this securely
• We believe that ZFS is the only sane storage substrate underpinning for such a system
• Manta will surely not be the only system to represent the confluence of these, but it is the first
• We are actively retooling our software stack in terms of Manta; Manta is changing the way we develop software!

Manta: More information
• Product page: http://joyent.com/products/manta
• node.js module: https://github.com/joyent/node-manta
• Manta documentation: http://apidocs.joyent.com/manta/
• IRC, e-mail, Twitter, etc.: #manta on freenode, manta@joyent.com, @mcavage, @dapsays, @yunongx, @joyent
• Here’s to the orgy of big data one-liners!
