What are Protocol Buffers and why are they widely used?
Hello everyone. In this article, we are going to look at an interesting topic: Protocol Buffers, or Protobuf for short. Protocol Buffers are a way of serializing structured data, like XML and JSON, but in a compact binary form that is efficient to encode and fast to send over the network. Let us look at Protobuf in detail and go through some Java code to understand it better.

What is Protobuf?
To make remote procedure calls, we need a way to transform objects (the payload) in memory into bytes so that they can be transmitted to other systems. There are many names for this: serialization, marshaling, encoding, etc.

A lot of programming languages support these mechanisms by default and they often use the JSON format. It is a human-readable format and is widely used. But JSON tends to be slower and larger when we use it with big data or when a number of microservices communicate with each other. Because JSON has no built-in schema, keeping senders and receivers forward and backward compatible is also left to convention.
To address these issues, Google released Protobuf as open source in 2008. Protobuf provides an Interface Definition Language that is language-neutral and platform-neutral. It is a way of serializing structured data and transmitting it over the network. With Protobuf, the data is transmitted in binary, which makes payloads smaller and faster to transmit than JSON's text format. It is one of the important pillars of the gRPC protocol along with HTTP/2.
With Protocol Buffers, we define the structure of the models once and then use generated source code to easily write and read structured data to/from a variety of data streams, using a variety of languages such as Java, Python, Go, Ruby, C++, C#, and so on.
Data Serialization with Protocol Buffers
Let us look at the steps performed to achieve serialization using Protobuf and in the next section, I will walk you through the Java code.
With Protobuf, we define the message format in a .proto file. Then we use the Protobuf compiler (protoc) to generate the client-side and service-side code that encodes and parses the data.

Steps to be performed to serialize/deserialize data using Protobuf

1. Create a .proto text file
In a .proto text file, we define a schema for the object. The file can also embed documentation as comments.
2. Compile the .proto file to a language-specific source file
Using the protoc compiler, the .proto file is automatically compiled into source code in any of the supported languages, like Java, Python, Go, Ruby, C++, C#, Dart, Objective-C, and more.
3. Generate executable package
The executable package is built and deployed together with the source files generated from the Protobuf definition. At runtime, messages are serialized into a compact binary format.
4. Deserialize the data by the recipient
The recipient uses the same message definition to deserialize the transmitted data back to its original form.
Protocol Buffers Definition Syntax
When defining fields in a .proto file, we can label a field as optional or repeated (in both proto2 and proto3). In proto3, a field declared without any label is called singular, and this is the default.
Protocol buffers support the usual primitive data types, such as integers, booleans, and floats. For the complete list, please see https://developers.google.com/protocol-buffers/docs/proto#scalar
A field can also be of:
- A message type, which nests parts of the definition. This is like a class.
- An enum type, so you can specify a set of values to choose from.
- A oneof type, which you can use when a message has many optional fields and at most one field will be set at the same time.
- A map type, to add key-value pairs to your definition.
Field numbers cannot be repurposed or reused. If we delete a field, we should reserve its field number to prevent someone from accidentally reusing the number.
The following code samples show you an example of this flow in Java.
e.g. file name: student.proto
syntax = "proto3";
package demo;
message Student {
  optional string id = 1;
  optional string name = 2;
}
In the above code, we have set the version of the proto syntax to proto3. We also assigned a package under which the source files are generated. Then we defined a message type named Student, which has 2 fields.
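{ id: "S101", name: "test" }
If we provide the values above, then the data transmitted will conceptually look like below
Serialized data
124S101224test
- In the above message, 1 is the field number, 2 is the wire type (length-delimited, which is what strings use), 4 is the length of the data, and next is the field value (the 4 characters S101)
- Again for field 2, the wire type is 2 (string) and the length is 4 characters (test)
- On the wire, the field number and wire type are packed into a single tag byte and the output is binary, so the line above is a readable sketch rather than the literal bytes
- We can clearly see that JSON is highly readable but Protobuf takes less space than the same data in JSON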
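The layout described above can be reproduced by hand to see the actual bytes. The following is only a minimal sketch using the Java standard library: the class StudentWireSketch and its methods are hypothetical helpers, not part of the Protobuf library, and it assumes small field numbers and short strings so that the tag and the length each fit in one byte.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the Protobuf library: hand-encodes the
// Student message from student.proto to show what ends up on the wire.
final class StudentWireSketch {

    // Wire type 2 means "length-delimited" (strings, bytes, nested messages).
    static final int WIRE_TYPE_LEN = 2;

    // Writes one string field: tag byte, length byte, then the UTF-8 payload.
    // Assumes field numbers 1..15 and at most 127 payload bytes, so the tag
    // and the length each fit in a single byte.
    static void writeString(ByteArrayOutputStream out, int fieldNumber, String value) {
        byte[] utf8 = value.getBytes(StandardCharsets.UTF_8);
        if (fieldNumber < 1 || fieldNumber > 15 || utf8.length > 127) {
            throw new IllegalArgumentException("outside the range this sketch supports");
        }
        out.write((fieldNumber << 3) | WIRE_TYPE_LEN); // tag = field number + wire type
        out.write(utf8.length);                        // length prefix
        out.write(utf8, 0, utf8.length);               // field value
    }

    static byte[] encodeStudent(String id, String name) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeString(out, 1, id);   // id = 1 in student.proto
        writeString(out, 2, name); // name = 2 in student.proto
        return out.toByteArray();
    }

    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x ", b));
        }
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        // prints: 0a 04 53 31 30 31 12 04 74 65 73 74
        System.out.println(toHex(encodeStudent("S101", "test")));
    }
}
```

Here 0a is the tag for field 1 with wire type 2, 04 is the length, and 53 31 30 31 is "S101" in UTF-8; 12 is the tag for field 2. In a real project the generated Student class does this work, so hand-encoding like this is useful only for understanding the format.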
Compile the .proto file to a Java class
To use Protocol Buffers in Java, and to convert Protobuf messages to and from other formats, we need to add the following Maven dependencies
<dependency>
  <groupId>com.google.protobuf</groupId>
  <artifactId>protobuf-java</artifactId>
  <version>${protobuf-java.version}</version>
</dependency>
<dependency>
  <groupId>com.googlecode.protobuf-java-format</groupId>
  <artifactId>protobuf-java-format</artifactId>
  <version>${protobuf-java-format.version}</version>
</dependency>
The protocol buffer compiler converts the .proto file to Java. It generates a .java file with a class for each message type, as well as a special Builder class for creating message class instances.
Compiling this .proto file creates a Builder class that you can use to create new instances, as in the following Java code:
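Student john = Student.newBuilder()
    .setId("1234")
    .setName("test")
    .build();
Maven plugin to compile .proto files to .java files
To make the Java files generated from the .proto file part of the build during mvn install, we add the following Maven plugin. Note that this plugin only registers the generated-sources directory as a source root; running protoc itself needs a Protobuf compiler plugin (for example, protobuf-maven-plugin) configured alongside it.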
<plugin>
  <groupId>org.codehaus.mojo</groupId>
  <artifactId>build-helper-maven-plugin</artifactId>
  <executions>
    <execution>
      <id>add-protobuf-generate-sources</id>
      <phase>generate-sources</phase>
      <goals>
        <goal>add-source</goal>
      </goals>
      <configuration>
        <sources>
          <source>${project.build.directory}/generated-sources</source>
        </sources>
      </configuration>
    </execution>
  </executions>
</plugin>
- With the build set up, mvn install produces the source file for the messages defined in the .proto file. The generated file has getters and builders (with setters) for our defined messages. It also has utility methods for serializing messages to binary and parsing them from binary back into Java objects.
We will have a normal REST API endpoint that calls the service class to generate a Student object using the Builder pattern
@RequestMapping("/students")
Student listAllStudents() {
    return studentSvc.getAllStudent();
}
The following bean will be used to convert responses returned by @RequestMapping annotated methods to protocol buffer messages instead of the JSON format.
@Bean
ProtobufHttpMessageConverter protobufHttpMessageConverter() {
    return new ProtobufHttpMessageConverter();
}
With this, the client receives the serialized data, which conceptually looks as follows
Serialized data
124S101224test
The other microservices/applications that call this service will have the generated Student classes on their classpath. So when this API response is received, the Protobuf data is deserialized back into the original Student object
Student{ id: "S101", name: "test" }
Advantages of Protobuf
- Optimization: When using Protocol Buffers for sending messages over the network, the payloads are serialized in binary, so they are usually much smaller than the same data in XML or JSON. This saves bandwidth and improves network performance, especially in a microservice architecture where there are a lot of network calls.
- Efficient parsing: Parsing Protocol Buffers is less CPU-intensive because the data is in a compact binary format with tagged, length-prefixed fields, so no text has to be parsed. This means that message exchange happens faster, even on devices with a slower CPU such as IoT or mobile devices.
- Schema and Validation: By forcing programmers to use a schema, we can ensure that the meaning of a message doesn't get lost between applications and that the structure of the data stays intact on another service as well. The generated code enforces the type of each field during encoding and decoding, which helps data integrity during transmission.
- Language Support: Availability in many programming languages
- Backward Compatibility: By identifying fields by their unique field numbers, Protocol Buffers offer excellent backward compatibility. Users can add a field, or remove one and reserve its number, on the sender or recipient side without breaking the other side.
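The backward compatibility point can be illustrated with a reader that only knows fields 1 and 2 of Student but receives a message from a newer sender that also sets a field 3. This is a minimal sketch, assuming a hypothetical extra integer field with number 3; TolerantStudentReader is an illustrative class, not the generated code or a Protobuf library API, although the generated parsers follow the same idea of skipping field numbers they do not recognize.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical reader that knows only Student fields 1 (id) and 2 (name).
final class TolerantStudentReader {

    // Reads a base-128 varint starting at pos[0] and advances pos[0].
    static long readVarint(byte[] buf, int[] pos) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buf[pos[0]++];
            result |= (long) (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Decodes the known string fields and skips every unknown field.
    static Map<String, String> decode(byte[] buf) {
        Map<String, String> student = new LinkedHashMap<>();
        int[] pos = {0};
        while (pos[0] < buf.length) {
            long tag = readVarint(buf, pos);
            int fieldNumber = (int) (tag >>> 3);
            int wireType = (int) (tag & 7);
            switch (wireType) {
                case 0: // varint: no known field uses it here, so skip the value
                    readVarint(buf, pos);
                    break;
                case 1: // fixed 64-bit value
                    pos[0] += 8;
                    break;
                case 5: // fixed 32-bit value
                    pos[0] += 4;
                    break;
                case 2: { // length-delimited: strings, bytes, nested messages
                    int len = (int) readVarint(buf, pos);
                    if (fieldNumber == 1 || fieldNumber == 2) {
                        String text = new String(buf, pos[0], len, StandardCharsets.UTF_8);
                        student.put(fieldNumber == 1 ? "id" : "name", text);
                    }
                    pos[0] += len; // unknown field numbers are skipped as well
                    break;
                }
                default:
                    throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
        }
        return student;
    }

    // Student{id="S101", name="test"} plus a hypothetical field 3 (varint 7)
    // that this older reader has never heard of.
    static byte[] sampleFromNewerSender() {
        return new byte[] {
            0x0a, 4, 'S', '1', '0', '1',
            0x12, 4, 't', 'e', 's', 't',
            0x18, 7
        };
    }

    public static void main(String[] args) {
        // prints: {id=S101, name=test}
        System.out.println(decode(sampleFromNewerSender()));
    }
}
```

Because every field carries its number and wire type, the reader can tell how many bytes to skip without knowing what field 3 means. This is also why a field number must never be reused for a different purpose.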
How Does Protobuf differ from JSON?
JSON and Protobuf are different message formats that are both widely used. They have been developed with different goals. JSON messages are sent in text format; they are language-independent and supported by most programming languages.
Protobuf is a set of rules and tools to define and exchange messages. Google has made it open source and provides tools to generate code for the most used programming languages. Besides that, Protobuf has more data types than JSON, such as enumerations, can also describe service methods, and is heavily used for Remote Procedure Calls (RPC).
Although JSON is widely used, it has the following drawbacks:
- It’s a schema-less data format
- It’s a text-based encoding data format, which results in a bigger data size
- It doesn’t impose any validation on the schema level
Protobuf addresses the above drawbacks of JSON for communication between systems by focusing solely on the ability to serialize and deserialize data as fast as possible.
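To make the size difference concrete, here is a small sketch that compares the Student example in both forms. The byte array is the hand-written wire form of Student{id="S101", name="test"}, not output captured from a real service, and PayloadSizeSketch is a hypothetical class name. The exact ratio depends heavily on the data, so treat the numbers as an illustration only.

```java
import java.nio.charset.StandardCharsets;

// Illustrative comparison only, using the Student example from this article.
final class PayloadSizeSketch {

    // Size of the JSON text form in UTF-8 bytes.
    static int jsonBytes() {
        String json = "{\"id\":\"S101\",\"name\":\"test\"}";
        return json.getBytes(StandardCharsets.UTF_8).length;
    }

    // Size of the Protobuf wire form: tag + length + value for each field.
    static int protobufBytes() {
        byte[] wire = {0x0a, 4, 'S', '1', '0', '1', 0x12, 4, 't', 'e', 's', 't'};
        return wire.length;
    }

    public static void main(String[] args) {
        // prints: JSON: 27 bytes, Protobuf: 12 bytes
        System.out.println("JSON: " + jsonBytes() + " bytes, Protobuf: " + protobufBytes() + " bytes");
    }
}
```

The saving comes from replacing the field names and punctuation with one tag byte and one length byte per field.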
In JSON and XML, the data and its context (the field names) travel together, whereas in Protobuf they are separate and we need the schema to understand the data. This is one of the drawbacks of using Protobuf.
In this article, we saw what Protobuf is and the steps to serialize data using the Protobuf format. We also saw a Java code example along with the dependencies needed to work with .proto files. We looked at the benefits of Protobuf and ended by comparing it with the JSON format.
Hope you found this article useful and thanks for reading!!!
In the next article, we will go through the gRPC protocol, which uses HTTP/2 and Protobuf as its pillars. Please look at my articles on HTTP/2 and gRPC if you want to learn more about them.