Is Protobuf 5x Faster Than JSON? (Part 2)
Many articles out there claim that Protobuf is a better choice than JSON for performance reasons. But is that really true? The investigation continues here.
In Part 1, we started looking at some benchmarks regarding whether Protobuf is actually the better choice as opposed to JSON for performance reasons. Let's continue the discussion here.
Decode Object
Part 1 showed that JSON is slow for numeric input. What about object binding itself? Is the benchmark result bad because binding is slow in JSON? Given that we are using 10 fields in the benchmark, let's find out.
To keep the comparison fair, we use a short, pure-ASCII string field this time. String copying performance should be very similar across libraries, so any performance difference should come from the binding process.
message PbTestObject {
  string field1 = 1;
}
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 2.52 | 68666.658 |
Thrift | 2.74 | 63139.324 |
Jsoniter | 5.78 | 29887.361 |
DSL-Json | 5.32 | 32458.030 |
Jackson | 1 | 172747.146 |
For 1 string field, Protobuf is actually slower than Jsoniter by 2.3x.
We can repeat the same test for 5 fields, 10 fields, and 15 fields to see a pattern.
message PbTestObject {
  string field1 = 1;
  string field2 = 2;
  string field3 = 3;
  string field4 = 4;
  string field5 = 5;
}
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.3 | 276972.857 |
Thrift | 1.44 | 250016.572 |
Jsoniter | 2.5 | 143807.401 |
DSL-Json | 2.41 | 149261.728 |
Jackson | 1 | 359868.351 |
For 5 string fields, Protobuf is only 1.3x faster than Jackson. If you think JSON object binding is slow and that it will dominate the performance — you are wrong.
message PbTestObject {
  string field1 = 1;
  string field2 = 2;
  string field3 = 3;
  string field4 = 4;
  string field5 = 5;
  string field6 = 6;
  string field7 = 7;
  string field8 = 8;
  string field9 = 9;
  string field10 = 10;
}
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.22 | 462167.920 |
Thrift | 1.12 | 503725.605 |
Jsoniter | 2.04 | 277531.128 |
DSL-Json | 1.84 | 307569.103 |
Jackson | 1 | 564942.726 |
For 10 string fields, Protobuf is only 1.22x faster than Jackson. I think you get the idea. The binding process in Protobuf is done by switch case:
boolean done = false;
while (!done) {
    int tag = input.readTag();
    switch (tag) {
        case 0:
            done = true;
            break;
        default: {
            if (!input.skipField(tag)) {
                done = true;
            }
            break;
        }
        case 10: {
            java.lang.String s = input.readStringRequireUtf8();
            field1_ = s;
            break;
        }
        case 18: {
            java.lang.String s = input.readStringRequireUtf8();
            field2_ = s;
            break;
        }
        case 26: {
            java.lang.String s = input.readStringRequireUtf8();
            field3_ = s;
            break;
        }
        case 34: {
            java.lang.String s = input.readStringRequireUtf8();
            field4_ = s;
            break;
        }
        case 42: {
            java.lang.String s = input.readStringRequireUtf8();
            field5_ = s;
            break;
        }
    }
}
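The case labels in the switch above (10, 18, 26, 34, 42) are Protobuf field tags: the field number shifted left by three bits, combined with the wire type (2 for length-delimited values such as strings). The sketch below reproduces those numbers. The class name `TagSketch` and its methods are hypothetical; this illustrates the wire format and is not code from the Protobuf library.

```java
final class TagSketch {
    // Wire type 2 marks a length-delimited value (strings, bytes, nested messages).
    static final int WIRE_TYPE_LENGTH_DELIMITED = 2;

    // A tag keeps the wire type in the low 3 bits and the field number above them.
    static int makeTag(int fieldNumber, int wireType) {
        return (fieldNumber << 3) | wireType;
    }

    static int fieldNumber(int tag) {
        return tag >>> 3;
    }

    static int wireType(int tag) {
        return tag & 0x7;
    }

    public static void main(String[] args) {
        for (int field = 1; field <= 5; field++) {
            System.out.println("field" + field + " -> case "
                    + makeTag(field, WIRE_TYPE_LENGTH_DELIMITED));
        }
    }
}
```

For field numbers 1 through 15 the tag fits in a single byte, so the generated switch dispatches on one small integer per field.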
This implementation is faster than a HashMap lookup, but only marginally. DSL-JSON implements the binding with a similar hash-based dispatch:
switch (nameHash) {
    case 1212206434:
        _field1_ = com.dslplatform.json.StringConverter.deserialize(reader);
        nextToken = reader.getNextToken();
        break;
    case 1178651196:
        _field3_ = com.dslplatform.json.StringConverter.deserialize(reader);
        nextToken = reader.getNextToken();
        break;
    case 1195428815:
        _field2_ = com.dslplatform.json.StringConverter.deserialize(reader);
        nextToken = reader.getNextToken();
        break;
    case 1145095958:
        _field5_ = com.dslplatform.json.StringConverter.deserialize(reader);
        nextToken = reader.getNextToken();
        break;
    case 1161873577:
        _field4_ = com.dslplatform.json.StringConverter.deserialize(reader);
        nextToken = reader.getNextToken();
        break;
    default:
        nextToken = reader.skip();
        break;
}
The hashing function is FNV:
long hash = 0x811c9dc5;
while (ci < buffer.length) {
    final byte b = buffer[ci++];
    if (b == '"') break;
    hash ^= b;
    hash *= 0x1000193;
}
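For experimentation, the loop above can be wrapped into a self-contained method. This is a minimal sketch of 32-bit FNV-1a over a field name, using `int` arithmetic rather than the `long` accumulator shown above. The class and method names are hypothetical, field names are assumed to be ASCII, and this is not DSL-JSON's actual code.

```java
import java.nio.charset.StandardCharsets;

final class Fnv1aSketch {
    // Computes 32-bit FNV-1a over the ASCII bytes of a field name.
    static int hash(String fieldName) {
        byte[] bytes = fieldName.getBytes(StandardCharsets.US_ASCII); // assumption: ASCII names
        int hash = 0x811c9dc5;      // FNV offset basis (32-bit)
        for (byte b : bytes) {
            hash ^= b;              // mix in the next byte
            hash *= 0x01000193;     // multiply by the FNV prime; int overflow wraps mod 2^32
        }
        return hash;
    }

    public static void main(String[] args) {
        for (int i = 1; i <= 5; i++) {
            String name = "field" + i;
            System.out.println(name + " -> " + hash(name));
        }
    }
}
```

A 32-bit hash maps an unbounded set of names onto about four billion values, so two different names can produce the same number.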
Hashes can collide, so use this with caution. If the input is likely to contain unknown fields, use the slower version that checks the field name again once the hash has matched. Jsoniter has a decoding mode, DYNAMIC_MODE_AND_MATCH_FIELD_STRICTLY, which generates exact matching code:
switch (field.len()) {
    case 6:
        if (field.at(0) == 102 &&
                field.at(1) == 105 &&
                field.at(2) == 101 &&
                field.at(3) == 108 &&
                field.at(4) == 100) {
            if (field.at(5) == 49) {
                obj.field1 = (java.lang.String) iter.readString();
                continue;
            }
            if (field.at(5) == 50) {
                obj.field2 = (java.lang.String) iter.readString();
                continue;
            }
            if (field.at(5) == 51) {
                obj.field3 = (java.lang.String) iter.readString();
                continue;
            }
            if (field.at(5) == 52) {
                obj.field4 = (java.lang.String) iter.readString();
                continue;
            }
            if (field.at(5) == 53) {
                obj.field5 = (java.lang.String) iter.readString();
                continue;
            }
        }
        break;
}
iter.skip();
DSL-JSON can also be changed to do an exact string match after the hash has matched. The numbers are:
library | compared with Jackson | ns/op |
---|---|---|
Jsoniter (hash mode) | 2.13 | 274949.346 |
Jsoniter (strict mode) | 1.95 | 300524.989 |
DSL-JSON (hash mode) | 1.91 | 305812.208 |
DSL-JSON (strict mode) | 1.71 | 343203.344 |
Jackson | 1 | 585421.314 |
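The difference between the two modes can be sketched as follows: hash mode trusts the hash alone, while strict mode compares the field name bytes again after the hash has matched. The names below (`StrictMatchSketch`, `matchField`) are hypothetical, and the code illustrates the idea rather than either library's generated output.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class StrictMatchSketch {
    static final byte[][] KNOWN = { ascii("field1"), ascii("field2"), ascii("field3") };
    static final int[] KNOWN_HASH = new int[KNOWN.length];

    static {
        for (int i = 0; i < KNOWN.length; i++) {
            KNOWN_HASH[i] = fnv1a(KNOWN[i]);
        }
    }

    static byte[] ascii(String s) {
        return s.getBytes(StandardCharsets.US_ASCII);
    }

    // 32-bit FNV-1a, as in the hashing loop shown earlier.
    static int fnv1a(byte[] bytes) {
        int hash = 0x811c9dc5;
        for (byte b : bytes) {
            hash ^= b;
            hash *= 0x01000193;
        }
        return hash;
    }

    // Returns the index of the matched field, or -1 for an unknown field.
    static int matchField(byte[] name, boolean strict) {
        int h = fnv1a(name);
        for (int i = 0; i < KNOWN.length; i++) {
            if (KNOWN_HASH[i] == h) {
                // Hash mode accepts here; strict mode pays for one extra comparison.
                if (!strict || Arrays.equals(KNOWN[i], name)) {
                    return i;
                }
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(matchField(ascii("field2"), true));
        System.out.println(matchField(ascii("unknown"), true));
    }
}
```

The extra comparison is what separates the hash-mode and strict-mode rows in the table above.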
The conclusion for object binding is, given field names are short, the number tag dispatching approach is not significantly faster — even compared with Jackson.
Encode Object
For 1 field:
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.22 | 57502.775 |
Thrift | 0.86 | 137094.627 |
Jsoniter | 2.06 | 57081.756 |
DSL-Json | 2.46 | 47890.664 |
Jackson | 1 | 117604.479 |
For 5 fields:
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.68 | 127933.179 |
Thrift | 0.46 | 467818.566 |
Jsoniter | 2.54 | 84702.001 |
DSL-Json | 2.68 | 80211.517 |
Jackson | 1 | 214802.686 |
For 10 fields:
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.72 | 194371.476 |
Thrift | 0.38 | 888230.783 |
Jsoniter | 2.59 | 129305.086 |
DSL-Json | 2.56 | 130379.967 |
Jackson | 1 | 334297.953 |
For object encoding, Protobuf is about 1.7x faster than Jackson, but it is slower than DSL-JSON.
The optimization of object encoding is to write out as many control bytes as possible in one write.
public void encode(Object obj, com.jsoniter.output.JsonStream stream) throws java.io.IOException {
    if (obj == null) { stream.writeNull(); return; }
    stream.write((byte)'{');
    encode_((com.jsoniter.benchmark.with_1_string_field.TestObject)obj, stream);
    stream.write((byte)'}');
}
public static void encode_(com.jsoniter.benchmark.with_1_string_field.TestObject obj, com.jsoniter.output.JsonStream stream) throws java.io.IOException {
    boolean notFirst = false;
    if (obj.field1 != null) {
        if (notFirst) { stream.write(','); } else { notFirst = true; }
        stream.writeRaw("\"field1\":", 9);
        stream.writeVal((java.lang.String)obj.field1);
    }
}
If we know the field is not nullable, even the opening quote of the string can be merged into the same write.
public void encode(Object obj, com.jsoniter.output.JsonStream stream) throws java.io.IOException {
    if (obj == null) { stream.writeNull(); return; }
    stream.writeRaw("{\"field1\":\"", 11);
    encode_((com.jsoniter.benchmark.with_1_string_field.TestObject)obj, stream);
    stream.write((byte)'\"', (byte)'}');
}
public static void encode_(com.jsoniter.benchmark.with_1_string_field.TestObject obj, com.jsoniter.output.JsonStream stream) throws java.io.IOException {
    com.jsoniter.output.CodegenAccess.writeStringWithoutQuote((java.lang.String)obj.field1, stream);
}
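To make the idea concrete, here is a minimal sketch of a buffer that writes a precomputed prefix with one `System.arraycopy` instead of several single-byte writes. `TinyStream` and its methods are hypothetical and far simpler than Jsoniter's `JsonStream`; the value is assumed to be ASCII that needs no escaping.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class TinyStream {
    // Control bytes for the object start, the field name, and the opening quote, merged once.
    static final byte[] PREFIX = "{\"field1\":\"".getBytes(StandardCharsets.US_ASCII);

    private byte[] buf = new byte[64];
    private int count;

    // Copies a whole chunk in one call.
    void writeRaw(byte[] bytes) {
        ensure(bytes.length);
        System.arraycopy(bytes, 0, buf, count, bytes.length);
        count += bytes.length;
    }

    // Writes two control bytes with a single capacity check.
    void write(byte b1, byte b2) {
        ensure(2);
        buf[count++] = b1;
        buf[count++] = b2;
    }

    private void ensure(int extra) {
        if (count + extra > buf.length) {
            buf = Arrays.copyOf(buf, Math.max(buf.length * 2, count + extra));
        }
    }

    @Override
    public String toString() {
        return new String(buf, 0, count, StandardCharsets.US_ASCII);
    }

    // Assumption: field1 is non-null ASCII with no characters that require escaping.
    static String encode(String field1) {
        TinyStream stream = new TinyStream();
        stream.writeRaw(PREFIX);
        stream.writeRaw(field1.getBytes(StandardCharsets.US_ASCII));
        stream.write((byte) '"', (byte) '}');
        return stream.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("abc"));
    }
}
```

The prefix here is 11 bytes long, the same length passed to `writeRaw` in the generated code above.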
The conclusion of object encoding/decoding: Protobuf might be slightly faster than Jackson, but it is actually slower than DSL-JSON.
Decode Integer List
Protobuf has special support for packed integer lists.
22 // tag (field number 4, wire type 2)
06 // payload size (6 bytes)
03 // first element (varint 3)
8E 02 // second element (varint 270)
9E A7 05 // third element (varint 86942)
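The bytes above can be decoded by hand with a few lines of code. The sketch below reads base-128 varints (7 payload bits per byte, high bit set when more bytes follow) and unpacks the list. The class `PackedVarintSketch` is hypothetical, handles only non-negative 32-bit values, and is not the Protobuf library's decoder.

```java
import java.util.ArrayList;
import java.util.List;

final class PackedVarintSketch {
    private final byte[] buf;
    private int pos;

    PackedVarintSketch(byte[] buf) {
        this.buf = buf;
    }

    // Reads one varint of up to 5 bytes, least significant group first.
    int readVarint32() {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            byte b = buf[pos++];
            result |= (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IllegalStateException("malformed varint");
    }

    // Decodes one packed field: tag, payload size in bytes, then the elements.
    static List<Integer> decodePacked(byte[] bytes) {
        PackedVarintSketch in = new PackedVarintSketch(bytes);
        int tag = in.readVarint32();
        if ((tag & 0x7) != 2) {
            throw new IllegalArgumentException("expected wire type 2");
        }
        int length = in.readVarint32();
        int end = in.pos + length;
        List<Integer> values = new ArrayList<>();
        while (in.pos < end) {
            values.add(in.readVarint32());
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] sample = {0x22, 0x06, 0x03, (byte) 0x8E, 0x02, (byte) 0x9E, (byte) 0xA7, 0x05};
        System.out.println(decodePacked(sample)); // prints [3, 270, 86942]
    }
}
```

Note that there is no per-element tag or separator: the single length prefix covers the whole list.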
message PbTestObject {
  repeated int32 field1 = 1 [packed=true];
}
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 2.92 | 249888.105 |
Thrift | 3.63 | 201439.691 |
Jsoniter | 2.97 | 245837.298 |
DSL-Json | 1.97 | 370897.998 |
Jackson | 1 | 730450.607 |
Protobuf is about 3x faster than Jackson for integer list decoding. However, Jsoniter is as fast as Protobuf in this case.
The decoding loop in Jsoniter is unrolled:
public static java.lang.Object decode_(com.jsoniter.JsonIterator iter) throws java.io.IOException {
    java.util.ArrayList col = (java.util.ArrayList)com.jsoniter.CodegenAccess.resetExistingObject(iter);
    if (iter.readNull()) { com.jsoniter.CodegenAccess.resetExistingObject(iter); return null; }
    if (!com.jsoniter.CodegenAccess.readArrayStart(iter)) {
        return col == null ? new java.util.ArrayList(0): (java.util.ArrayList)com.jsoniter.CodegenAccess.reuseCollection(col);
    }
    Object a1 = java.lang.Integer.valueOf(iter.readInt());
    if (com.jsoniter.CodegenAccess.nextToken(iter) != ',') {
        java.util.ArrayList obj = col == null ? new java.util.ArrayList(1): (java.util.ArrayList)com.jsoniter.CodegenAccess.reuseCollection(col);
        obj.add(a1);
        return obj;
    }
    Object a2 = java.lang.Integer.valueOf(iter.readInt());
    if (com.jsoniter.CodegenAccess.nextToken(iter) != ',') {
        java.util.ArrayList obj = col == null ? new java.util.ArrayList(2): (java.util.ArrayList)com.jsoniter.CodegenAccess.reuseCollection(col);
        obj.add(a1);
        obj.add(a2);
        return obj;
    }
    Object a3 = java.lang.Integer.valueOf(iter.readInt());
    if (com.jsoniter.CodegenAccess.nextToken(iter) != ',') {
        java.util.ArrayList obj = col == null ? new java.util.ArrayList(3): (java.util.ArrayList)com.jsoniter.CodegenAccess.reuseCollection(col);
        obj.add(a1);
        obj.add(a2);
        obj.add(a3);
        return obj;
    }
    Object a4 = java.lang.Integer.valueOf(iter.readInt());
    java.util.ArrayList obj = col == null ? new java.util.ArrayList(8): (java.util.ArrayList)com.jsoniter.CodegenAccess.reuseCollection(col);
    obj.add(a1);
    obj.add(a2);
    obj.add(a3);
    obj.add(a4);
    while (com.jsoniter.CodegenAccess.nextToken(iter) == ',') {
        obj.add(java.lang.Integer.valueOf(iter.readInt()));
    }
    return obj;
}
Encode Integer List
Integer list encoding should be fast in Protobuf, since it does not need to write out all those commas.
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.35 | 159337.360 |
Thrift | 0.45 | 472555.572 |
Jsoniter | 1.9 | 112770.811 |
DSL-Json | 2.19 | 97998.250 |
Jackson | 1 | 214409.223 |
Protobuf is only 1.35x faster than Jackson. Although integer object fields are faster in Protobuf, integer lists are not. There is no special optimization here. DSL-JSON is faster than Jackson because individual numbers are written out faster.
Decode Object List
Lists are more often used as containers for objects.
message PbTestObject {
  message ElementObject {
    string field1 = 1;
  }
  repeated ElementObject field1 = 1;
}
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.26 | 1118704.310 |
Thrift | 1.3 | 1078278.555 |
Jsoniter | 2.91 | 483304.365 |
DSL-Json | 2.22 | 635179.183 |
Jackson | 1 | 1407116.476 |
Protobuf is about 1.3x faster than Jackson for object list decoding, but DSL-JSON is faster than Protobuf.
Encode Object List
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 2.22 | 328219.768 |
Thrift | 0.38 | 1885052.964 |
Jsoniter | 3.63 | 200420.923 |
DSL-Json | 3.87 | 187964.594 |
Jackson | 1 | 727582.950 |
Protobuf is more than 2x faster than Jackson for object list encoding, but DSL-JSON is faster than Protobuf. It seems like Protobuf is not good at list encoding/decoding.
Decode Double Array
Java arrays are special. double[] is more efficient than List<Double>. It is very common to see an array of doubles used to represent the values or coordinates of a time interval. However, the Protobuf Java library does not speak double[]. It will always use List<Double>. We can expect a win for JSON here.
message PbTestObject {
  repeated double field1 = 1 [packed=true];
}
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 5.18 | 207503.316 |
Thrift | 6.12 | 175678.703 |
Jsoniter | 3.52 | 305553.080 |
DSL-Json | 2.8 | 383549.289 |
Jackson | 1 | 1075423.265 |
Protobuf is more than 5x faster than Jackson for double array decoding, but compared with Jsoniter, Protobuf is only 1.47x faster. So, if you have a lot of doubles but they are in an array rather than in individual fields, the performance difference is smaller.
The loop is unrolled in Jsoniter and optimized for a small array.
public static java.lang.Object decode_(com.jsoniter.JsonIterator iter) throws java.io.IOException {
    ... // abbreviated
    nextToken = com.jsoniter.CodegenAccess.nextToken(iter);
    if (nextToken == ']') {
        return new double[0];
    }
    com.jsoniter.CodegenAccess.unreadByte(iter);
    double a1 = iter.readDouble();
    if (!com.jsoniter.CodegenAccess.nextTokenIsComma(iter)) {
        return new double[]{ a1 };
    }
    double a2 = iter.readDouble();
    if (!com.jsoniter.CodegenAccess.nextTokenIsComma(iter)) {
        return new double[]{ a1, a2 };
    }
    double a3 = iter.readDouble();
    if (!com.jsoniter.CodegenAccess.nextTokenIsComma(iter)) {
        return new double[]{ a1, a2, a3 };
    }
    double a4 = (double) iter.readDouble();
    if (!com.jsoniter.CodegenAccess.nextTokenIsComma(iter)) {
        return new double[]{ a1, a2, a3, a4 };
    }
    double a5 = (double) iter.readDouble();
    double[] arr = new double[10];
    arr[0] = a1;
    arr[1] = a2;
    arr[2] = a3;
    arr[3] = a4;
    arr[4] = a5;
    int i = 5;
    while (com.jsoniter.CodegenAccess.nextTokenIsComma(iter)) {
        if (i == arr.length) {
            double[] newArr = new double[arr.length * 2];
            System.arraycopy(arr, 0, newArr, 0, arr.length);
            arr = newArr;
        }
        arr[i++] = iter.readDouble();
    }
    double[] result = new double[i];
    System.arraycopy(arr, 0, result, 0, i);
    return result;
}
Encode Double Array
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 15.63 | 107760.788 |
Thrift | 0.54 | 3125678.472 |
Jsoniter (only 6 digit precision) | 6.74 | 249945.866 |
DSL-Json | 1.14 | 1478332.248 |
Jackson | 1 | 1684935.837 |
Protobuf is more than 15x faster than Jackson for double array encoding, and 2.3x faster than Jsoniter even with Jsoniter limited to 6-digit precision. Again, double encoding is very slow in JSON.
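The 6-digit precision caveat can be illustrated with a small sketch: scale the value to a long, then print the integer and fractional parts as plain integers instead of emitting every significant digit. This is an assumption about how such a shortcut can work, not Jsoniter's actual implementation. The class name is hypothetical, and the sketch ignores NaN, infinities, and values too large to scale into a long.

```java
final class FixedPrecisionSketch {
    private static final long SCALE = 1_000_000L; // six fractional digits

    // Formats a finite double with exactly six digits after the decimal point.
    static String format6(double value) {
        long scaled = Math.round(Math.abs(value) * SCALE);
        long intPart = scaled / SCALE;
        long fraction = scaled % SCALE;

        StringBuilder out = new StringBuilder();
        if (value < 0 && scaled != 0) {
            out.append('-');
        }
        out.append(intPart).append('.');
        String digits = Long.toString(fraction);
        for (int i = digits.length(); i < 6; i++) {
            out.append('0'); // left-pad the fraction with zeros
        }
        out.append(digits);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(format6(1.5));
        System.out.println(format6(-3.0));
    }
}
```

The trade-off is visible in the signature: anything beyond the sixth fractional digit is rounded away, which is why the benchmark row carries the precision caveat.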
Decode String
JSON strings may contain escaped characters, while Protobuf can decode a string with a simple array copy. The test string is 160 ASCII characters long.
syntax = "proto3";
option optimize_for = SPEED;
message PbTestObject {
  string field1 = 1;
}
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 1.85 | 173680.548 |
Thrift | 2.29 | 140635.170 |
Jsoniter | 2.4 | 134067.924 |
DSL-Json | 2.27 | 141419.108 |
Jackson | 1 | 321406.155 |
Protobuf is 1.85x faster than Jackson for long string decoding. However, DSL-JSON is actually faster than Protobuf. There are a couple of interesting things happening here.
Fast Path
DSL-JSON implemented a fast path for ASCII.
for (int i = 0; i < chars.length; i++) {
    bb = buffer[ci++];
    if (bb == '"') {
        currentIndex = ci;
        return i;
    }
    // If we encounter a backslash, which is a beginning of an escape sequence
    // or a high bit was set - indicating an UTF-8 encoded multibyte character,
    // there is no chance that we can decode the string without instantiating
    // a temporary buffer, so quit this loop
    if ((bb ^ '\\') < 1) break;
    chars[i] = (char) bb;
}
This fast path avoids the cost of processing escaped characters and UTF-8.
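The same idea can be tried in isolation. The following sketch scans for the closing quote and bails out as soon as it sees a backslash or a byte with the high bit set, so that a slower, fully general decoder (not shown) can take over. `AsciiFastPathSketch` and `tryReadAscii` are hypothetical names; this is modeled on the loop above, not copied from DSL-JSON.

```java
import java.nio.charset.StandardCharsets;

final class AsciiFastPathSketch {
    // `start` points just after the opening quote.
    // Returns the decoded string, or null when the slow path is required.
    static String tryReadAscii(byte[] buffer, int start) {
        char[] chars = new char[buffer.length - start];
        for (int i = 0; start + i < buffer.length; i++) {
            byte bb = buffer[start + i];
            if (bb == '"') {
                return new String(chars, 0, i);
            }
            // The XOR is 0 for a backslash and negative when the high bit is set.
            if ((bb ^ '\\') < 1) {
                return null;
            }
            chars[i] = (char) bb;
        }
        return null; // no closing quote found
    }

    public static void main(String[] args) {
        byte[] plain = "hello\" tail".getBytes(StandardCharsets.US_ASCII);
        byte[] escaped = "a\\nb\"".getBytes(StandardCharsets.US_ASCII);
        System.out.println(tryReadAscii(plain, 0));
        System.out.println(tryReadAscii(escaped, 0));
    }
}
```

A single comparison covers both exit conditions, which keeps the hot loop to two branches per byte.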
JVM Hotspot Optimization
Before JDK 9, java.lang.String was char[]-based. The input is byte[] and UTF-8 encoded, so we cannot copy directly from byte[] to char[]. In JDK 9, java.lang.String was changed to be byte[]-based. If we take a look at the JDK 9 source code:
@Deprecated(since="1.1")
public String(byte ascii[], int hibyte, int offset, int count) {
    checkBoundsOffCount(offset, count, ascii.length);
    if (count == 0) {
        this.value = "".value;
        this.coder = "".coder;
        return;
    }
    if (COMPACT_STRINGS && (byte)hibyte == 0) {
        this.value = Arrays.copyOfRange(ascii, offset, offset + count);
        this.coder = LATIN1;
    } else {
        hibyte <<= 8;
        byte[] val = StringUTF16.newBytesFor(count);
        for (int i = 0; i < count; i++) {
            StringUTF16.putChar(val, i, hibyte | (ascii[offset++] & 0xff));
        }
        this.value = val;
        this.coder = UTF16;
    }
}
Using this deprecated but still available constructor, we can now use Arrays.copyOfRange to construct a java.lang.String. However, after testing, it turns out this is not faster than the DSL-JSON implementation.
It seems like the HotSpot JVM is doing some loop pattern matching here. If the loop is written in this way, the string is constructed by copying directly from byte[]. Even on JDK 9 with -XX:+CompactStrings, where in theory the byte[] -> char[] -> byte[] conversion should be slow, the DSL-JSON implementation is still the fastest.
If the input is mostly strings, this optimization is crucial. The art of parsing in Java is largely the art of copying bytes into java.lang.String, which the JVM makes needlessly hard. In a modern language like Go, the string is UTF-8 byte[]-based, which is wise.
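For readers who want to try the deprecated constructor discussed above, here is a minimal sketch that builds a string from ASCII bytes two ways: through `String(byte[], int hibyte, int offset, int count)` with `hibyte` set to 0, and through the charset-based constructor with ISO-8859-1. The class and method names are hypothetical. The sketch only shows that both produce the same string for ASCII input and makes no claim about which is faster.

```java
import java.nio.charset.StandardCharsets;

final class AsciiStringSketch {
    // Deprecated since Java 1.1, but still present; hibyte = 0 keeps each byte as-is.
    @SuppressWarnings("deprecation")
    static String viaDeprecatedConstructor(byte[] ascii, int offset, int count) {
        return new String(ascii, 0, offset, count);
    }

    // The non-deprecated alternative: decode the bytes as ISO-8859-1 (Latin-1).
    static String viaCharset(byte[] ascii, int offset, int count) {
        return new String(ascii, offset, count, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        byte[] input = "xxfieldyy".getBytes(StandardCharsets.US_ASCII);
        System.out.println(viaDeprecatedConstructor(input, 2, 5));
        System.out.println(viaCharset(input, 2, 5));
    }
}
```

Both paths assume the bytes are already known to be plain ASCII; real JSON input still has to be scanned for escapes and multibyte sequences first.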
Encode String
Similar problem. We cannot copy char[] into byte[].
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 0.96 | 262077.921 |
Thrift | 0.99 | 252140.935 |
Jsoniter | 1.5 | 166381.978 |
DSL-Json | 1.38 | 181008.120 |
Jackson | 1 | 250431.354 |
Protobuf is slightly slower than Jackson when encoding long strings. Again, the char[]-based string is the problem here.
Skip Structure
JSON is a format without a header. Without a header, JSON will need to scan every byte to locate the field needed — even if other fields are not intended to be parsed.
message PbTestWriteObject {
  repeated string field1 = 1;
  message Field2 {
    repeated string field1 = 1;
    repeated string field2 = 2;
    repeated string field3 = 3;
  }
  Field2 field2 = 2;
  string field3 = 3;
}
message PbTestReadObject {
  string field3 = 3;
}
The message will be written using PbTestWriteObject and read by PbTestReadObject. field1 and field2 should be skipped.
library | compared with Jackson | ns/op |
---|---|---|
Protobuf | 5.05 | 152194.483 |
Thrift | 5.43 | 141467.209 |
Jsoniter | 3.75 | 204704.100 |
DSL-Json | 2.51 | 305784.845 |
Jackson | 1 | 768840.597 |
Protobuf can skip the structure 5x faster than Jackson. When skipping a long string, the cost for JSON is linear in the string size, while the cost for Protobuf is constant.
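The asymmetry comes from the length prefix. The sketch below contrasts the two ways of skipping a string: the Protobuf-style version reads a varint length and jumps, while the JSON-style version has to inspect every byte until the closing quote. The class and method names are hypothetical, and both methods assume well-formed input.

```java
final class SkipSketch {
    // Protobuf-style: `pos` points at the varint length prefix.
    // Returns the position just past the payload without touching it.
    static int skipLengthDelimited(byte[] buf, int pos) {
        int length = 0;
        int shift = 0;
        while (true) {
            byte b = buf[pos++];
            length |= (b & 0x7f) << shift;
            if ((b & 0x80) == 0) {
                break;
            }
            shift += 7;
        }
        return pos + length;
    }

    // JSON-style: `pos` points just after the opening quote.
    // Returns the position just past the closing quote.
    static int skipJsonString(byte[] buf, int pos) {
        while (buf[pos] != '"') {
            if (buf[pos] == '\\') {
                pos++; // skip the escaped character as well
            }
            pos++;
        }
        return pos + 1;
    }

    public static void main(String[] args) {
        byte[] pb = {5, 'h', 'e', 'l', 'l', 'o', 9};
        byte[] json = "he\\\"llo\"x".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        System.out.println(skipLengthDelimited(pb, 0));
        System.out.println(skipJsonString(json, 0));
    }
}
```

For a nested message such as Field2, the same single jump skips the entire subtree, whereas a JSON parser must still walk every nested token.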
Summary
It is time to count the scores!
scenario | Protobuf vs. Jackson | Protobuf vs. Jsoniter | Jsoniter vs. Jackson |
---|---|---|---|
Decode Integer | 8.51 | 2.64 | 3.22 |
Encode Integer | 2.9 | 1.44 | 2.02 |
Decode Double | 13.75 | 4.4 | 3.13 |
Encode Double | 12.71 | 1.96 (only 6 digits precision) | 6.5 |
Decode Object | 1.22 | 0.6 | 2.04 |
Encode Object | 1.72 | 0.67 | 2.59 |
Decode Integer List | 2.92 | 0.98 | 2.97 |
Encode Integer List | 1.35 | 0.71 | 1.9 |
Decode Object List | 1.26 | 0.43 | 2.91 |
Encode Object List | 2.22 | 0.61 | 3.63 |
Decode Double Array | 5.18 | 1.47 | 3.52 |
Encode Double Array | 15.63 | 2.32 (only 6 digits precision) | 6.74 |
Decode String | 1.85 | 0.77 | 2.4 |
Encode String | 0.96 | 0.63 | 1.5 |
Skip Structure | 5.05 | 1.35 | 3.75 |
Worst-case scenarios for JSON:
- Skip very long string — proportional to the string length.
- Decode double field — Protobuf can be 4.4x faster.
- Encode double field with precision — it can be 12.71x slower to write all the digits out.
- Decode integer — Protobuf can be 2.64x faster.
If your real workload is unlike the worst case scenario mentioned above, but mostly composed of strings, then the speedup should be < 2x (compared against Jsoniter); Protobuf can even be slower if you are really unlucky.