protobufjs


Protocol Buffers are a language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more, originally designed at Google.

protobuf.js is a pure JavaScript implementation with TypeScript support for node.js and the browser. It's easy to use, blazingly fast and works out of the box with .proto files!

Apollo GraphQL fork

We have forked the source repo because we need to make changes to the package for use in Apollo Server.

Version 1.0.0 was forked from master, which contained version 6.8.8 plus a few unreleased commits.

Installation

node.js

Note that this library's versioning scheme is not semver-compatible for historical reasons. For guaranteed backward compatibility, always depend on ~6.A.B instead of ^6.A.B (hence the --save-prefix option when installing).

Browsers

Development:

Production:

Remember to replace the version tag with the exact release your project depends upon.

The library supports CommonJS and AMD loaders and also exports globally as protobuf.

Distributions

Where bundle size is a factor, there are additional stripped-down versions of the full library (~19kb gzipped) available that exclude certain functionality:

  • When working with JSON descriptors (i.e. generated by pbjs) and/or reflection only, see the light library (~16kb gzipped) that excludes the parser. CommonJS entry point is:

  • When working with statically generated code only, see the minimal library (~6.5kb gzipped) that also excludes reflection. CommonJS entry point is:

Usage

Because JavaScript is a dynamically typed language, protobuf.js introduces the concept of a valid message in order to provide the best possible performance (and, as a side product, proper typings):

Valid message

A valid message is an object (1) not missing any required fields and (2) exclusively composed of JS types understood by the wire format writer.

There are two possible types of valid messages and the encoder is able to work with both of these for convenience:

  • Message instances (explicit instances of message classes with default values on their prototype), which satisfy the requirements of a valid message by design, and

  • Plain JavaScript objects that happen to be composed in a way that satisfies the requirements of a valid message.

In a nutshell, the wire format writer understands the following types:

| Field type | Expected JS type (create, encode) | Conversion (fromObject) |
|---|---|---|
| s-/u-/int32, s-/fixed32 | number (32 bit integer) | value \| 0 if signed, value >>> 0 if unsigned |
| s-/u-/int64, s-/fixed64 | Long-like (optimal), number (53 bit integer) | Long.fromValue(value) with long.js, parseInt(value, 10) otherwise |
| float, double | number | Number(value) |
| bool | boolean | Boolean(value) |
| string | string | String(value) |
| bytes | Uint8Array (optimal), Buffer (optimal under node), Array.<number> (8 bit integers) | base64.decode(value) if a string; an Object with non-zero .length is assumed to be buffer-like |
| enum | number (32 bit integer) | Looks up the numeric id if a string |
| message | Valid message | Message.fromObject(value) |

  • Explicit undefined and null are considered as not set if the field is optional.

  • Repeated fields are Array.<T>.

  • Map fields are Object.<string,T> with the key being the string representation of the respective value or an 8 characters long binary hash string for Long-likes.

  • Types marked as optimal provide the best performance because no conversion step (e.g. number to low and high bits or base64 string to buffer) is required.
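The scalar conversion rules listed for fromObject can be illustrated with a minimal sketch. The convertScalar helper below is hypothetical and not part of protobuf.js; it only mirrors the scalar rows of the table and assumes that long.js is not installed, so 64 bit types fall back to parseInt.

```javascript
// Hypothetical helper mirroring the scalar conversion rules from the table.
// It is NOT the library's implementation, only an illustration of each rule.
function convertScalar(fieldType, value) {
  switch (fieldType) {
    case "int32":
    case "sint32":
    case "sfixed32":
      return value | 0; // truncate to a signed 32 bit integer
    case "uint32":
    case "fixed32":
      return value >>> 0; // reinterpret as an unsigned 32 bit integer
    case "int64":
    case "sint64":
    case "uint64":
    case "fixed64":
    case "sfixed64":
      return parseInt(value, 10); // assumption: long.js is not available
    case "float":
    case "double":
      return Number(value);
    case "bool":
      return Boolean(value);
    case "string":
      return String(value);
    default:
      throw new TypeError("unsupported field type: " + fieldType);
  }
}

console.log(convertScalar("int32", 3.7)); // 3
console.log(convertScalar("uint32", -1)); // 4294967295
console.log(convertScalar("bool", "false")); // true, any non-empty string is truthy
```

The last line shows why conversion is not validation: Boolean("false") is true, so input from untrusted sources may still deserve an explicit check before conversion.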

Toolset

With that in mind and again for performance reasons, each message class provides a distinct set of methods with each method doing just one thing. This avoids unnecessary assertions / redundant operations where performance is a concern but also forces a user to perform verification (of plain JavaScript objects that might just so happen to be a valid message) explicitly where necessary - for example when dealing with user input.

Note that Message below refers to any message class.

  • Message.verify(message: Object): null|string verifies that a plain JavaScript object satisfies the requirements of a valid message and thus can be encoded without issues. Instead of throwing, it returns the error message as a string, if any.

  • Message.encode(message: Message|Object [, writer: Writer]): Writer encodes a message instance or valid plain JavaScript object. This method does not implicitly verify the message and it's up to the user to make sure that the payload is a valid message.

  • Message.encodeDelimited(message: Message|Object [, writer: Writer]): Writer works like Message.encode but additionally prepends the length of the message as a varint.

  • Message.decode(reader: Reader|Uint8Array): Message decodes a buffer to a message instance. If required fields are missing, it throws a util.ProtocolError with an instance property set to the so far decoded message. If the wire format is invalid, it throws an Error.

  • Message.decodeDelimited(reader: Reader|Uint8Array): Message works like Message.decode but additionally reads the length of the message prepended as a varint.

  • Message.create(properties: Object): Message creates a new message instance from a set of properties that satisfy the requirements of a valid message. Where applicable, it is recommended to prefer Message.create over Message.fromObject because it doesn't perform possibly redundant conversion.

  • Message.fromObject(object: Object): Message converts any non-valid plain JavaScript object to a message instance using the conversion steps outlined within the table above.

  • Message.toObject(message: Message [, options: ConversionOptions]): Object converts a message instance to an arbitrary plain JavaScript object for interoperability with other libraries or storage. The resulting plain JavaScript object might still satisfy the requirements of a valid message depending on the actual conversion options specified, but most of the time it does not.

For reference, the following diagram aims to display relationships between the different methods and the concept of a valid message:

(Figure: Toolset Diagram)

In other words: verify indicates that calling create on the plain object will result in a valid message and that calling encode on it will succeed. fromObject, on the other hand, converts a broader range of plain objects into valid messages.
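To make the delimited variants concrete, here is a minimal sketch of the length-prefix framing described for encodeDelimited and decodeDelimited. The helpers encodeVarint, frame and unframe are hypothetical names written for this illustration; the library's own Writer and Reader take care of this internally. For simplicity the sketch assumes payloads shorter than 2^31 bytes.

```javascript
// Encode a non-negative integer below 2^31 as a base-128 varint.
function encodeVarint(n) {
  const bytes = [];
  while (n > 0x7f) {
    bytes.push((n & 0x7f) | 0x80); // low 7 bits with the continuation bit set
    n >>>= 7;
  }
  bytes.push(n); // final byte, continuation bit clear
  return bytes;
}

// Prepend the payload length as a varint (what a delimited encode adds).
function frame(payload) {
  const prefix = encodeVarint(payload.length);
  const out = new Uint8Array(prefix.length + payload.length);
  out.set(prefix, 0);
  out.set(payload, prefix.length);
  return out;
}

// Read the varint length, then return exactly that many payload bytes.
function unframe(buffer) {
  let length = 0;
  let shift = 0;
  let pos = 0;
  let byte;
  do {
    byte = buffer[pos++];
    length |= (byte & 0x7f) << shift;
    shift += 7;
  } while (byte & 0x80);
  return buffer.subarray(pos, pos + length);
}

const payload = Uint8Array.from([1, 2, 3]);
console.log(Array.from(frame(payload))); // length byte 3 followed by 1, 2, 3
```

Because each frame carries its own length, several messages can be written back to back into one stream and read out again one at a time.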

Examples

Using .proto files

It is possible to load existing .proto files using the full library, which parses and compiles the definitions to ready-to-use (reflection-based) message classes:

Additionally, promise syntax can be used by omitting the callback, if preferred:

Using JSON descriptors

The library utilizes JSON descriptors that are equivalent to a .proto definition. For example, the following is identical to the .proto definition seen above:

JSON descriptors closely resemble the internal reflection structure:

| Type (T) | Extends | Type-specific properties |
|---|---|---|
| ReflectionObject | | options |
| Namespace | ReflectionObject | nested |
| Root | Namespace | nested |
| Type | Namespace | fields |
| Enum | ReflectionObject | values |
| Field | ReflectionObject | rule, type, id |
| MapField | Field | keyType |
| OneOf | ReflectionObject | oneof (array of field names) |
| Service | Namespace | methods |
| Method | ReflectionObject | type, requestType, responseType, requestStream, responseStream |

  • Bold properties are required. Italic types are abstract.

  • T.fromJSON(name, json) creates the respective reflection object from a JSON descriptor

  • T#toJSON() creates a JSON descriptor from the respective reflection object (its name is used as the key within the parent)
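As an illustration of the structure in the table above, the sketch below walks the nested objects of a descriptor and collects the full names of message types, recognised here by their fields property. The descriptor contents (awesomepackage, AwesomeMessage, AwesomeEnum, awesomeField) and the listTypes helper are assumptions made for this example; listTypes is not library API.

```javascript
// A small, hand-written JSON descriptor (hypothetical contents).
const descriptor = {
  nested: {
    awesomepackage: {
      nested: {
        AwesomeMessage: {
          fields: {
            awesomeField: { type: "string", id: 1 }
          }
        },
        AwesomeEnum: {
          values: { ONE: 1, TWO: 2 }
        }
      }
    }
  }
};

// Recursively collect the dotted names of all entries that carry "fields".
function listTypes(node, prefix = "") {
  const names = [];
  for (const [name, child] of Object.entries(node.nested || {})) {
    const fullName = prefix ? prefix + "." + name : name;
    if (child.fields) {
      names.push(fullName); // a Type is identified by its "fields" property
    }
    // A Type extends Namespace, so it may itself contain nested definitions.
    names.push(...listTypes(child, fullName));
  }
  return names;
}

console.log(listTypes(descriptor)); // logs the single name awesomepackage.AwesomeMessage
```

The enum is skipped because it carries values rather than fields, which matches the type-specific properties listed in the table.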

Exclusively using JSON descriptors instead of .proto files enables the use of just the light library (the parser isn't required in this case).

A JSON descriptor can either be loaded the usual way:

Or it can be loaded inline:

Using reflection only

Both the full and the light library include full reflection support. One could, for example, define the .proto definitions seen in the examples above using just reflection:

Detailed information on the reflection structure is available within the API documentation.

Using custom classes

Message classes can also be extended with custom functionality and it is also possible to register a custom constructor with a reflected message type:

(*) Besides referencing its reflected type through AwesomeMessage.$type and AwesomeMessage#$type, the respective custom class is automatically populated with:

  • AwesomeMessage.create

  • AwesomeMessage.encode and AwesomeMessage.encodeDelimited

  • AwesomeMessage.decode and AwesomeMessage.decodeDelimited

  • AwesomeMessage.verify

  • AwesomeMessage.fromObject, AwesomeMessage.toObject and AwesomeMessage#toJSON

Afterwards, decoded messages of this type are instanceof AwesomeMessage.

Alternatively, it is also possible to reuse and extend the internal constructor if custom initialization code is not required:

Using services

The library also supports consuming services but it doesn't make any assumptions about the actual transport channel. Instead, a user must provide a suitable RPC implementation, which is an asynchronous function that takes the reflected service method, the binary request and a node-style callback as its parameters:
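The shape of such an RPC implementation can be sketched without any real transport. In the example below, makeLoopbackRpc and the handlers map are hypothetical; the only thing taken from the description above is the parameter list (reflected method, binary request, node-style callback). The sketch also assumes that the reflected method exposes a name property, and the usage lines pass a stand-in object instead of a real reflected method.

```javascript
// Build an rpcImpl that dispatches to in-process handlers instead of a network.
// handlers maps a method name to a function from request bytes to response bytes.
function makeLoopbackRpc(handlers) {
  return function rpcImpl(method, requestData, callback) {
    // Defer the work so the callback always fires asynchronously.
    setImmediate(() => {
      const handler = handlers[method.name];
      if (!handler) {
        callback(new Error("unknown method: " + method.name));
        return;
      }
      let responseData;
      try {
        responseData = handler(requestData);
      } catch (err) {
        callback(err);
        return;
      }
      callback(null, responseData);
    });
  };
}

// Usage with a stand-in method object and an echo handler.
const rpcImpl = makeLoopbackRpc({
  Echo: (bytes) => bytes
});

rpcImpl({ name: "Echo" }, Uint8Array.from([1, 2, 3]), (err, responseData) => {
  console.log(err, responseData);
});
```

A real implementation would replace the handler lookup with a request over the chosen transport and pass the response bytes, or the transport error, to the same callback.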

Below is a working example with a TypeScript implementation using the grpc npm package.

Example:

Services also support promises:

There is also an example for streaming RPC.

Note that the service API is meant for clients. Implementing a server-side endpoint pretty much always requires code specific to the transport channel (e.g. HTTP or WebSocket), with the only common denominator being that it decodes and encodes messages.

Usage with TypeScript

The library ships with its own type definitions and modern editors like Visual Studio Code will automatically detect and use them for code completion.

The npm package depends on @types/node because of Buffer and @types/long because of Long. If you are not building for node and/or not using long.js, it should be safe to exclude them manually.

Using the JS API

The API shown above works pretty much the same with TypeScript. However, because everything is typed, accessing fields on instances of dynamically generated message classes requires either bracket notation (e.g. message["awesomeField"]) or explicit casts. Alternatively, it is possible to use a typings file generated for its static counterpart.

Using generated static code

If you generated static code to bundle.js using the CLI and its type definitions to bundle.d.ts, then you can just do:

Using decorators

The library also includes an early implementation of decorators.

Note that decorators are an experimental feature in TypeScript and that declaration order is important depending on the JS target. For example, @Field.d(2, AwesomeArrayMessage) requires that AwesomeArrayMessage has been defined earlier when targeting ES5.

Supported decorators are:

  • Type.d(typeName?: string) (optional) annotates a class as a protobuf message type. If typeName is not specified, the constructor's runtime function name is used for the reflected type.

  • Field.d<T>(fieldId: number, fieldType: string | Constructor<T>, fieldRule?: "optional" | "required" | "repeated", defaultValue?: T) annotates a property as a protobuf field with the specified id and protobuf type.

  • MapField.d<T extends { [key: string]: any }>(fieldId: number, fieldKeyType: string, fieldValueType: string | Constructor<{}>) annotates a property as a protobuf map field with the specified id, protobuf key and value type.

  • OneOf.d<T extends string>(...fieldNames: string[]) annotates a property as a protobuf oneof covering the specified fields.

Other notes:

  • Decorated types reside in protobuf.roots["decorated"] using a flat structure, so no duplicate names.

  • Enums are copied to a reflected enum with a generic name on decorator evaluation because referenced enum objects have no runtime name the decorator could use.

  • Default values must be specified as arguments to the decorator instead of using a property initializer for proper prototype behavior.

  • Property names on decorated classes must not be renamed at compile time (e.g. by a minifier) because decorators just receive the original field name as a string.

ProTip! Not as pretty, but you can use decorators in plain JavaScript as well.

Command line

Note that moving the CLI to its own package is a work in progress. At the moment, it's still part of the main package.

The command line interface (CLI) can be used to translate between file formats and to generate static code as well as TypeScript definitions.

pbjs for JavaScript

For production environments it is recommended to bundle all your .proto files to a single .json file, which minimizes the number of network requests and avoids any parser overhead (hint: works with just the light library):

Now, either include this file in your final bundle:

or load it the usual way:

Generated static code, on the other hand, works with just the minimal library. For example

will generate static code for definitions within file1.proto and file2.proto to a CommonJS module compiled.js.

ProTip! Documenting your .proto files with /** ... */-blocks or (trailing) /// ... lines translates to generated static code.

pbts for TypeScript

Picking up on the example above, the following not only generates static code to a CommonJS module compiled.js but also its respective TypeScript definitions to compiled.d.ts:

Additionally, TypeScript definitions of static modules are compatible with their reflection-based counterparts (i.e. as exported by JSON modules), as long as the following conditions are met:

  1. Instead of using new SomeMessage(...), always use SomeMessage.create(...) because reflection objects do not provide a constructor.

  2. Types, services and enums must start with an uppercase letter to become available as properties of the reflected types as well (i.e. to be able to use MyMessage.MyEnum instead of root.lookup("MyMessage.MyEnum")).

For example, the following generates a JSON module bundle.js and a bundle.d.ts, but no static code:

Reflection vs. static code

Using .proto files directly requires the full library, while pure reflection or JSON descriptors require only the light library. In both cases, pretty much all code except the relatively short descriptors is shared.

Static code, on the other hand, requires just the minimal library, but generates additional source code without any reflection features. This also implies that there is a break-even point where statically generated code becomes larger than descriptor-based code, namely once the amount of code generated exceeds the size of the full or light library, respectively.

There is no significant difference performance-wise as the code generated statically is pretty much the same as generated at runtime and both are largely interchangeable as seen in the previous section.

| Source | Library | Advantages | Tradeoffs |
|---|---|---|---|
| .proto | full | Easily editable; Interoperability with other libraries; No compile step | Some parsing and possibly network overhead |
| JSON | light | Easily editable; No parsing overhead; Single bundle (no network overhead) | protobuf.js specific; Has a compile step |
| static | minimal | Works where eval access is restricted; Fully documented; Small footprint for small protos | Can be hard to edit; No reflection; Has a compile step |

Command line API

Both utilities can be used programmatically by providing command line arguments and a callback to their respective main functions:

Performance

The package includes a benchmark that compares protobuf.js performance to native JSON (as far as this is possible) and Google's JS implementation. On an i7-2600K running node 6.9.1 it yields:

These results are achieved by

  • generating type-specific encoders, decoders, verifiers and converters at runtime

  • configuring the reader/writer interface according to the environment

  • using node-specific functionality where beneficial and, of course

  • avoiding unnecessary operations through splitting up the toolset.

You can also run the benchmark ...

and the profiler yourself (the latter requires a recent version of node):

Note that as of this writing, the benchmark suite performs significantly slower on node 7.2.0 compared to 6.9.1 because moths.

Compatibility

  • Works in all modern and not-so-modern browsers except IE8.

  • Because the internals of this package do not rely on google/protobuf/descriptor.proto, options are parsed and presented literally.

  • If typed arrays are not supported by the environment, plain arrays will be used instead.

  • Support for pre-ES5 environments (except IE8) can be achieved by using a polyfill.

  • Support for Content Security Policy-restricted environments (like Chrome extensions without unsafe-eval) can be achieved by generating and using static code instead.

  • If a proper way to work with 64 bit values (uint64, int64 etc.) is required, install long.js alongside this library. All 64 bit numbers will then be returned as a Long instance instead of a possibly unsafe JavaScript number.

  • For descriptor.proto interoperability, see ext/descriptor
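Why a plain JavaScript number is "possibly unsafe" for 64 bit values can be shown in a few lines. The sketch below uses only built-in JavaScript (Number.isSafeInteger and BigInt). The toLowHigh and fromLowHigh helpers are hypothetical illustrations of the low and high bits split that a Long-like object stores; they are not the long.js API.

```javascript
// 2^53 + 1 cannot be represented exactly as a JavaScript number.
const big = 2 ** 53 + 1;
console.log(Number.isSafeInteger(big)); // false
console.log(big === 2 ** 53); // true, the value was silently rounded

// Split a non-negative safe integer into two unsigned 32 bit halves.
function toLowHigh(value) {
  if (!Number.isSafeInteger(value) || value < 0) {
    throw new RangeError("expected a non-negative safe integer");
  }
  return {
    low: value >>> 0, // lower 32 bits
    high: Math.floor(value / 4294967296) >>> 0 // upper bits
  };
}

// Recombine the halves without loss by using BigInt arithmetic.
function fromLowHigh({ low, high }) {
  return (BigInt(high) << 32n) | BigInt(low);
}

console.log(toLowHigh(4294967301)); // low is 5, high is 1
console.log(fromLowHigh({ low: 5, high: 1 })); // 4294967301n
```

Values above 2^53 only survive intact when they are kept in such a two-part (or BigInt) form from the start, which is the role long.js plays for this library.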

Building

To build the library or its components yourself, clone it from GitHub and install the development dependencies:

Building the respective development and production versions with their respective source maps to dist/:

Building the documentation to docs/:

Building the TypeScript definition to index.d.ts:

Browserify integration

By default, protobuf.js integrates into any browserify build-process without requiring any optional modules. Hence:

  • If int64 support is required, explicitly require the long module somewhere in your project as it will be excluded otherwise. This assumes that a global require function is present that protobuf.js can call to obtain the long module.

    If there is no global require function present after bundling, it's also possible to assign the long module programmatically:

  • If you have any special requirements, there is the bundler for reference.

License: BSD 3-Clause License
