| Protocol Buffers | |
|---|---|
| Developer | Google |
| Initial release | Early 2001 (internal);[1] July 7, 2008 (public) |
| Stable release | |
| Repository | |
| Written in | C++, C#, Java, Python, JavaScript, Ruby, Go, PHP, Dart |
| Operating system | Any |
| Platform | Cross-platform |
| Type | Serialization format and library, IDL compiler |
| License | BSD |
| Website | protobuf |
| Protocol Buffers | |
|---|---|
| Filename extension | .proto |
| Internet media type | application/protobuf, application/vnd.google.protobuf |
| Developed by | |
| Latest release | 3 |
| Type of format | Interface description language |
| Open format? | Yes |
| Free format? | Yes |
| Website | protobuf |
Protocol Buffers (Protobuf) is a free and open-source cross-platform data format used to serialize structured data. It is useful in developing programs that communicate with each other over a network or for storing data. The method involves an interface description language that describes the structure of some data and a program that generates source code from that description for generating or parsing a stream of bytes that represents the structured data.
Google developed Protocol Buffers for internal use and provided a code generator for multiple languages under an open-source license.
The design goals for Protocol Buffers emphasized simplicity and performance. In particular, it was designed to be smaller and faster than XML.[3]
Protocol Buffers is widely used at Google for storing and interchanging all kinds of structured information. The method serves as a basis for a custom remote procedure call (RPC) system that is used for nearly all inter-machine communication at Google.[4]
Protocol Buffers is similar to the Apache Thrift, Ion, and Microsoft Bond protocols, and it offers a concrete RPC protocol stack, called gRPC, for use with defined services.[5]
Data structure schemas (called messages) and services are described in a proto definition file (.proto) and compiled with protoc. This compilation generates code that can be invoked by a sender or recipient of these data structures. For example, example.pb.cc and example.pb.h are generated from example.proto. They define C++ classes for each message and service in example.proto.
Canonically, messages are serialized into a binary wire format which is compact, forward- and backward-compatible, but not self-describing (that is, there is no way to tell the names, meaning, or full datatypes of fields without an external specification). There is no defined way to include or refer to such an external specification (schema) within a Protocol Buffers file. The officially supported implementation includes an ASCII serialization format,[6] but this format, though self-describing, loses the forward- and backward-compatibility behavior, and is thus not a good choice for applications other than human editing and debugging.[7]
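What "not self-describing" means in practice can be illustrated with a small reader that walks a serialized buffer without any schema. The following is a minimal sketch, not the official library's parser: the names `RawField`, `readVarint`, and `scanFields` are hypothetical, and only two wire types (0, varint, and 2, length-delimited) are handled. It assumes the standard key layout, in which each field starts with a varint holding `(field_number << 3) | wire_type`.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// One field as it appears on the wire: a number and a wire type, but no name.
// (Hypothetical helper type for this sketch.)
struct RawField {
  std::uint32_t number;     // field number taken from the key
  std::uint32_t wire_type;  // 0 = varint, 2 = length-delimited
  std::uint64_t varint;     // value when wire_type == 0
  std::string bytes;        // payload when wire_type == 2
};

// Read a base-128 varint starting at pos and advance pos past it.
std::uint64_t readVarint(const std::vector<std::uint8_t>& buf, std::size_t& pos) {
  std::uint64_t value = 0;
  for (int shift = 0; shift < 64; shift += 7) {
    if (pos >= buf.size()) throw std::runtime_error("truncated varint");
    const std::uint8_t byte = buf[pos++];
    value |= static_cast<std::uint64_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) return value;  // high bit clear: last byte
  }
  throw std::runtime_error("varint too long");
}

// Split a buffer into raw fields. The wire type alone tells the reader how
// many bytes each field occupies, so fields can be delimited (or skipped)
// without knowing what they mean.
std::vector<RawField> scanFields(const std::vector<std::uint8_t>& buf) {
  std::vector<RawField> fields;
  std::size_t pos = 0;
  while (pos < buf.size()) {
    const std::uint64_t key = readVarint(buf, pos);
    RawField f{static_cast<std::uint32_t>(key >> 3),
               static_cast<std::uint32_t>(key & 0x7), 0, ""};
    if (f.wire_type == 0) {
      f.varint = readVarint(buf, pos);
    } else if (f.wire_type == 2) {
      const std::uint64_t len = readVarint(buf, pos);
      if (len > buf.size() - pos) throw std::runtime_error("truncated field");
      f.bytes.assign(reinterpret_cast<const char*>(buf.data()) + pos,
                     static_cast<std::size_t>(len));
      pos += static_cast<std::size_t>(len);
    } else {
      throw std::runtime_error("wire type not handled in this sketch");
    }
    fields.push_back(f);
  }
  return fields;
}
```

Given the bytes `08 0A 10 14 1A 01 41`, `scanFields` recovers field 1 (varint 10), field 2 (varint 20), and field 3 (one byte, `A`). Nothing in the buffer says what those fields are called or whether field 3 is a string, raw bytes, or a nested message; that information lives only in the .proto file. The same property lets a reader step over fields whose numbers it does not recognize, which is the basis of the compatibility behavior described above.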
Though the primary purpose of Protocol Buffers is to facilitate network communication, its simplicity and speed make Protocol Buffers an alternative to data-centric C++ classes and structs, especially where interoperability with other languages or systems might be needed in the future.
Protobufs have no single specification.[8] The format is best suited to small chunks of data, no larger than a few megabytes, that can be loaded into memory or sent all at once; it is therefore not a streamable format.[9] The library does not provide compression out of the box. The format is also poorly supported in non-object-oriented languages (e.g. Fortran).[10]
A schema for a particular use of protocol buffers associates data types with field names, using integers to identify each field. (The protocol buffer data contains only the numbers, not the field names, providing some bandwidth/storage savings compared with systems that include the field names in the data.)
```proto
// polyline.proto
syntax = "proto2";

message Point {
  required int32 x = 1;
  required int32 y = 2;
  optional string label = 3;
}

message Line {
  required Point start = 1;
  required Point end = 2;
  optional string label = 3;
}

message Polyline {
  repeated Point point = 1;
  optional string label = 2;
}
```
The "Point" message defines two mandatory data items, x and y. The data item label is optional. Each data item has a tag. The tag is defined after the equal sign. For example, x has the tag 1.
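To make the role of the tags concrete, the sketch below hand-encodes a Point into the binary wire format using only the C++ standard library. It is an illustration under stated assumptions, not the code that protoc generates: the helper names `appendVarint`, `appendKey`, and `encodePoint` are hypothetical, and an empty label is treated as unset, which simplifies proto2's explicit presence tracking.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Append value as a base-128 varint: 7 payload bits per byte, low group
// first, with the high bit set on every byte except the last.
void appendVarint(Bytes& out, std::uint64_t value) {
  while (value >= 0x80) {
    out.push_back(static_cast<std::uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<std::uint8_t>(value));
}

// Append a field key: the tag from the schema combined with a wire type.
// Only this number, never the field name, ends up in the output.
void appendKey(Bytes& out, std::uint32_t field_number, std::uint32_t wire_type) {
  assert(field_number >= 1 && wire_type <= 7);
  appendVarint(out, (static_cast<std::uint64_t>(field_number) << 3) | wire_type);
}

// Hand-encode a Point as declared in polyline.proto (hypothetical helper).
Bytes encodePoint(std::int32_t x, std::int32_t y, const std::string& label) {
  Bytes out;
  appendKey(out, 1, 0);  // x has tag 1, wire type 0 (varint)
  // int32 values are sign-extended to 64 bits before varint encoding.
  appendVarint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(x)));
  appendKey(out, 2, 0);  // y has tag 2, wire type 0 (varint)
  appendVarint(out, static_cast<std::uint64_t>(static_cast<std::int64_t>(y)));
  if (!label.empty()) {    // simplification: empty label is treated as unset
    appendKey(out, 3, 2);  // label has tag 3, wire type 2 (length-delimited)
    appendVarint(out, label.size());
    out.insert(out.end(), label.begin(), label.end());
  }
  return out;
}
```

With this sketch, `encodePoint(10, 20, "")` yields the four bytes `08 0A 10 14`: one key byte and one value byte per field. The names x and y never appear in the output. Because a negative int32 is sign-extended, it occupies ten bytes as a varint in this encoding.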
The "Line" and "Polyline" messages, which both use Point, demonstrate how composition works in Protocol Buffers. Polyline has a repeated field, and thus Polyline behaves like an ordered list of points (of unspecified number).
This schema can subsequently be compiled for use by one or more programming languages. Google provides a compiler called protoc which can produce output for C++, Java or Python. Other schema compilers are available from other sources to create language-dependent output for over 20 other languages.[11]
For example, after a C++ version of the protocol buffer schema above is produced, a C++ source code file, polyline.cpp, can use the message objects as follows:
```cpp
// polyline.cpp
#include <string>

#include "polyline.pb.h"  // generated by "protoc --cpp_out=. polyline.proto"

// The caller takes ownership of the returned message.
Line* createNewLine(const std::string& name) {
  // create a line from (10, 20) to (30, 40)
  Line* line = new Line;
  line->mutable_start()->set_x(10);
  line->mutable_start()->set_y(20);
  line->mutable_end()->set_x(30);
  line->mutable_end()->set_y(40);
  line->set_label(name);
  return line;
}

// The caller takes ownership of the returned message.
Polyline* createNewPolyline() {
  // create a polyline with points at (10,10) and (20,20)
  Polyline* polyline = new Polyline;
  Point* point1 = polyline->add_point();
  point1->set_x(10);
  point1->set_y(10);
  Point* point2 = polyline->add_point();
  point2->set_x(20);
  point2->set_y(20);
  return polyline;
}
```
Protobuf 2.0 provides a code generator for C++, Java, C#,[12] and Python.[13]
Protobuf 3.0 provides a code generator for C++, Java (including JavaNano, a dialect intended for low-resource environments), Kotlin, Python, Go, Ruby, Objective-C, C#, PHP, and Dart.[14] It has also supported JavaScript since 3.0.0-beta-2.[15]
Third-party implementations are also available for Ada,[16] Ballerina,[17] C,[18][19] C++,[20] Dart, Elixir,[21][22] Erlang,[23] Haskell,[24] JavaScript,[25] Julia,[26] Nim,[27] Perl, PHP, Prolog,[28][29] R,[30] Rust,[31][32][33] Scala,[34] and Swift.[35]