Tuesday, 22 July 2008
Protocol Buffers - Google's Data Interchange Format
Protocol buffers are Google's language-neutral, platform-neutral, extensible mechanism for serializing structured data – think XML, but smaller, faster, and simpler. You define how you want your data to be structured once, then you can use special generated source code to easily write and read your structured data to and from a variety of data streams and using a variety of languages – Java, C++, C#, Python and Perl.
It is used everywhere inside Google.
The initial version, Proto1, was developed at Google starting in early 2001.
Proto2 is a complete rewrite, though it keeps most of the design and uses many of the implementation ideas from Proto1. Some features have been added, some removed. Most importantly, though, the code is cleaned up and does not have any dependencies on Google libraries that have not yet been open-sourced...
Do we write hand-coded parsing and serialization routines for each data structure? Well, we used to. Needless to say, that didn't last long. When you have tens of thousands of different structures in your code base that need their own serialization formats, you simply cannot write them all by hand.
Instead, we developed Protocol Buffers. Protocol Buffers allow you to define simple data structures in a special definition language, then compile them to produce classes to represent those structures in the language of your choice. These classes come complete with heavily-optimized code to parse and serialize your message in an extremely compact format. Best of all, the classes are easy to use: each field has simple "get" and "set" methods, and once you're ready, serializing the whole thing to – or parsing it from – a byte array or an I/O stream just takes a single method call...
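To make the "get", "set", then serialize-in-one-call workflow a little more concrete, here is a rough hand-written Python stand-in for what a generated class might look like. Everything named here is an assumption for illustration: the `Person` message, its two fields and their numbers, and the method names `serialize` and `parse` are hypothetical and are not the API that the real compiler emits. The encoding follows the tag-plus-varint scheme of the Protocol Buffers wire format, but only for these two field types.

```python
# Hypothetical sketch of a "generated" message class. Real classes are
# produced by the protocol buffer compiler; this one is written by hand
# and handles only the two field kinds used below.

def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data, pos):
    """Decode a varint starting at pos; return (value, next_position)."""
    result = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


class Person:
    """Assumed layout: field 1 is a string name, field 2 is an integer id."""

    def __init__(self):
        self._name = ""
        self._id = 0

    def get_name(self):
        return self._name

    def set_name(self, value):
        self._name = value

    def get_id(self):
        return self._id

    def set_id(self, value):
        self._id = value

    def serialize(self):
        """Return the whole message as a compact byte string."""
        payload = self._name.encode("utf-8")
        out = bytearray()
        out += encode_varint((1 << 3) | 2)  # field 1, length-delimited
        out += encode_varint(len(payload))
        out += payload
        out += encode_varint((2 << 3) | 0)  # field 2, varint
        out += encode_varint(self._id)
        return bytes(out)

    @classmethod
    def parse(cls, data):
        """Rebuild a message from bytes produced by serialize()."""
        msg = cls()
        pos = 0
        while pos < len(data):
            tag, pos = decode_varint(data, pos)
            field, wire_type = tag >> 3, tag & 0x7
            if wire_type == 2:
                length, pos = decode_varint(data, pos)
                chunk = data[pos:pos + length]
                pos += length
                if field == 1:
                    msg._name = chunk.decode("utf-8")
            elif wire_type == 0:
                value, pos = decode_varint(data, pos)
                if field == 2:
                    msg._id = value
            else:
                raise ValueError("wire type not handled in this sketch")
        return msg


person = Person()
person.set_name("Ada")
person.set_id(150)
data = person.serialize()    # 8 bytes for this message
copy = Person.parse(data)    # round-trips name and id
```

The point of the sketch is the shape of the interface rather than the details: the caller only touches accessors and one call in each direction, while the field numbers, not the field names, are what end up in the bytes.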
And, yes, it is very fast – at least an order of magnitude faster than XML.