Introduction
ZMQ is a high-level abstraction library over UNIX sockets that lets you send data in binary format (typically serialized with Google Protocol Buffers). Previous articles on this blog already explained both ZMQ and Google Protocol Buffers V2.
Google Protocol Buffers V3
Google Protocol Buffers V3 (also called protobuf) is a step ahead of its predecessor (reading the Google Protocol Buffers V2 article before going further is recommended). As already said in the previous article, the Protocol Buffers lifecycle goes through 3 steps:
- Describe the layout of the data to transmit in a .proto file.
- Compile the .proto file using protoc (the protobuf compiler) to generate the corresponding classes.
- Use the generated classes in your code.
Protobuf3 layout description
Data must be described using protobuf syntax, which looks like regular C structures. An example would be:
syntax = "proto3";
package com_company_atom;

message AtomStructure {
    string atom_name = 1;
    int32 atom_nb_protons = 2;
    int32 atom_nb_neutrons = 3;
    int32 atom_nb_electrons = 4;
    bool atom_is_radioactive = 5;
}
Protobuf syntax version declaration
This must be the first non-empty, non-comment line of the proto file. Version 3 was the latest version of protobuf at the time of writing.
Remark: Every proto file should start with a syntax version declaration, otherwise protoc assumes version 2 by default (syntax = "proto2").
Package name (optional)
Although not mandatory, a package name is translated into a namespace in the generated code (for instance com_company_atom:: in C++) to avoid naming conflicts. One should always include a package name.
Message description
Data content is described using the message keyword, and fields use conventional data types (string, bool, int32, int64, float, double, etc.).
Some conventions
Naming conventions
- Message names: start every word with a capital letter (for example AtomStructure).
- Field names: use lower case, with words separated by "_" (for example atom_nb_protons).
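As a rough illustration of these two conventions, the following sketch checks names with regular expressions. The helper names and the exact patterns are assumptions made for this example; protoc itself does not enforce these conventions.

```python
import re

# Hypothetical patterns: CamelCase for messages, lower_snake_case for fields.
MESSAGE_NAME = re.compile(r"^[A-Z][A-Za-z0-9]*$")
FIELD_NAME = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def follows_message_convention(name: str) -> bool:
    """Return True if name starts with a capital letter and has no underscores."""
    return MESSAGE_NAME.match(name) is not None


def follows_field_convention(name: str) -> bool:
    """Return True if name is lower case with words separated by underscores."""
    return FIELD_NAME.match(name) is not None


if __name__ == "__main__":
    print(follows_message_convention("AtomStructure"))   # True
    print(follows_field_convention("atom_nb_protons"))   # True
    print(follows_field_convention("atomName"))          # False
```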
Field identifiers
Every field must be given a unique ID starting from 1 (in our example, atom_name has an ID of 1).
Remark: When the field ID is less than 16, only one byte is required to serialize the field type and the field ID.
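The one-byte remark follows from how protobuf packs the field ID and the wire type into a single varint: the ID is shifted left by 3 bits and combined with a 3-bit wire type. The sketch below reproduces that calculation in plain Python; the function names are hypothetical and it is not the library's own code.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def tag_bytes(field_id: int, wire_type: int = 0) -> bytes:
    """Return the serialized tag: (field_id << 3) | wire_type, as a varint."""
    return encode_varint((field_id << 3) | wire_type)


if __name__ == "__main__":
    for field_id in (1, 15, 16):
        print(field_id, len(tag_bytes(field_id)))
    # 1 1
    # 15 1
    # 16 2
```

Since 15 << 3 still fits in 7 bits while 16 << 3 equals 128, frequently used fields are best given IDs from 1 to 15.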
Protoc V3
Protoc V3 was not available in the distribution repository at the time of writing (only protoc V2 could be found).
Installing protoc V3
One can easily download and compile the protoc sources as follows:
- Getting the required dependencies:
$ sudo apt-get install autoconf automake libtool curl make g++ unzip
- Download protobuf-all-[VERSION].tar.gz (this compiler can generate protobuf classes for various languages like Python and C++) from https://github.com/protocolbuffers/protobuf/releases.
- Compile the sources as shown:
$ cd protobuf
$ ./configure
$ make
$ sudo make install
$ sudo ldconfig # refresh shared library cache
Compiling proto buffer files
As already mentioned, protoc can compile proto files for multiple programming languages. The general compilation syntax is:
$ protoc --[LANGUAGE]_out=[OUTPUT_GENERATED_CLASS_DIRECTORY] [PATH_PROTO_FILE]
- C++ : protoc generates two files : fileName.pb.h (to include in your code) and fileName.pb.cc (to include for compilation).
- Python : protoc generates one file fileName_pb2.py (to be imported in your code).
Classes generated by protoc contain at least setters and getters for every field name.
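The general syntax and the generated file names can be summarized in a small Python sketch. The two helper functions are hypothetical and only build strings; they do not invoke protoc, and they cover just the two languages discussed here.

```python
from pathlib import Path


def build_protoc_command(language: str, out_dir: str, proto_file: str) -> list:
    """Build the argument list for: protoc --[LANGUAGE]_out=[DIR] [PROTO_FILE]."""
    return ["protoc", f"--{language}_out={out_dir}", proto_file]


def expected_outputs(language: str, proto_file: str) -> list:
    """Return the file names protoc is expected to generate for a proto file."""
    stem = Path(proto_file).stem
    if language == "cpp":
        return [f"{stem}.pb.h", f"{stem}.pb.cc"]
    if language == "python":
        return [f"{stem}_pb2.py"]
    raise ValueError("this sketch only covers 'cpp' and 'python'")


if __name__ == "__main__":
    print(build_protoc_command("python", ".", "petIdentity.proto"))
    # ['protoc', '--python_out=.', 'petIdentity.proto']
    print(expected_outputs("cpp", "petIdentity.proto"))
    # ['petIdentity.pb.h', 'petIdentity.pb.cc']
```

The returned list can be passed to subprocess.run(..., check=True) on a machine where protoc is installed.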
Working with protobuf
Let's work through a practical example in both Python and C++ and see how we can serialize our data using Google Protocol Buffers.
- Writing a proto file:
syntax = "proto3";
package com_company_pet;

message PetIdentity {
    string pet_name = 1;
    int32 pet_age = 2;
    bool pet_gender = 3;
}
// petIdentity.proto
- Generating protobuf classes :
$ protoc --cpp_out=. petIdentity.proto
$ protoc --python_out=. petIdentity.proto
- Using protobuf classes in our code:
- Python :
# main.py
# import the module generated by protoc
import petIdentity_pb2

print("-------- Serializing data -------")
# Create an instance of PetIdentity
petIdentity = petIdentity_pb2.PetIdentity()
# Fill the PetIdentity instance
petIdentity.pet_name = "Oscar"
petIdentity.pet_age = 2
petIdentity.pet_gender = True
# Serialize the PetIdentity instance using protobuf (returns bytes)
petIdentitySerialized = petIdentity.SerializeToString()
# Display the serialized data (str() is needed because bytes cannot be concatenated to str)
print("Serialized Data : " + str(petIdentitySerialized))
print("")  # add an empty line

print("-------- Deserializing data -------")
# Create an instance of PetIdentity for deserialization
petIdentityDeserialized = petIdentity_pb2.PetIdentity()
# Deserialize the serialized data
petIdentityDeserialized.ParseFromString(petIdentitySerialized)
# Display the deserialized data
print("Cat-Name : " + petIdentityDeserialized.pet_name
      + " <===> Cat-age : " + str(petIdentityDeserialized.pet_age)
      + " <===> Cat-gender : " + ("male" if petIdentityDeserialized.pet_gender else "female"))
Running this script prints the serialized bytes, followed by the deserialized pet name, age and gender.
Remark : In practice, the serialized data (in this example, it's petIdentitySerialized) is what we need to send through the network.
- C++ :
/* --------------- main.cpp ----------
   ----- Google Protocol Buffer ------
   --------- Serializer and ----------
   ------- Deserializer Example ------ */
#include <iostream>
#include <string>
#include "petIdentity.pb.h"

using namespace std;

int main() {
    // Recommended by Google to make sure that the correct protobuf library is loaded
    GOOGLE_PROTOBUF_VERIFY_VERSION;

    /* ---- Protobuf serialization process ---- */
    com_company_pet::PetIdentity petIdentity; // Create an instance of PetIdentity
    petIdentity.set_pet_name("Oscar");        // Set pet name to Oscar
    petIdentity.set_pet_age(2);               // Set pet age to 2 years
    petIdentity.set_pet_gender(false);        // Set gender to female

    string petIdentitySerialized;
    petIdentity.SerializeToString(&petIdentitySerialized);
    cout << "Serialized protobuf data : " << petIdentitySerialized << endl;

    /* ---- Protobuf deserialization process ---- */
    com_company_pet::PetIdentity petIdentityDeserialized;
    petIdentityDeserialized.ParseFromString(petIdentitySerialized);

    cout << "\nDeserializing the data" << endl;
    cout << "Cat-Name : " << petIdentityDeserialized.pet_name()
         << " <===> Cat-age : " << petIdentityDeserialized.pet_age()
         << " <===> Cat-gender : " << (petIdentityDeserialized.pet_gender() ? "male" : "female")
         << endl;

    google::protobuf::ShutdownProtobufLibrary(); // free all resources
    return 0;
}
Executing this program prints the serialized data, followed by the deserialized pet name, age and gender.
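To give an idea of what the serialized data looks like on the wire, the sketch below hand-encodes the PetIdentity message in plain Python, without the protobuf library. It is only an illustration under stated assumptions: the helper names are hypothetical, negative ages are not handled, and real code should keep using the classes generated by protoc.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_tag(field_id: int, wire_type: int) -> bytes:
    """Tag = (field_id << 3) | wire_type; wire type 0 is varint, 2 is length-delimited."""
    return encode_varint((field_id << 3) | wire_type)


def encode_pet_identity(pet_name: str, pet_age: int, pet_gender: bool) -> bytes:
    """Hand-rolled equivalent of serializing PetIdentity (proto3 skips default values)."""
    out = bytearray()
    if pet_name:  # field 1, string: tag, length, UTF-8 bytes
        raw = pet_name.encode("utf-8")
        out += encode_tag(1, 2) + encode_varint(len(raw)) + raw
    if pet_age:  # field 2, int32: tag, varint value
        out += encode_tag(2, 0) + encode_varint(pet_age)
    if pet_gender:  # field 3, bool: tag, varint 1
        out += encode_tag(3, 0) + encode_varint(1)
    return bytes(out)


if __name__ == "__main__":
    print(encode_pet_identity("Oscar", 2, True))
    # b'\n\x05Oscar\x10\x02\x18\x01'
    print(encode_pet_identity("Oscar", 2, False))
    # b'\n\x05Oscar\x10\x02'
```

The second call shows that a field left at its default value (here pet_gender set to false, as in the C++ example) adds no bytes to the message.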
Sending protobuf data with ZMQ
As one may expect, protobuf data is meant to be sent over the network. We may use traditional UNIX sockets; however, they can quickly become a bottleneck.
ZMQ is easier, reliable and less cumbersome to use. A previous post already discussed ZMQ. Google Protocol Buffers is cross-platform and can be used between multiple languages.
- Creating a proto file:
syntax = "proto3";
package com_company_caesar;

message CaesarCipher {
    string caesar_cipher_text = 1; // Carries caesar cipher
    int32 shift_key = 2;           // Shift key (it is equal to 3)
}
- Heterogeneous Publisher and Subscriber
- Python Publisher :
import caesarCipher_pb2
import zmq
import time

def encryptCaesarCipher(plainText, shiftKey):
    cipherText = ""
    for character in plainText:
        # shift every character in the message by shiftKey
        cipherText += chr(ord(character) + shiftKey)
    return cipherText

def serializeToProtobuf(msg, caesarCipherProto, shiftKey):
    # fill caesarCipherProto
    caesarCipherProto.caesar_cipher_text = encryptCaesarCipher(msg, shiftKey)
    caesarCipherProto.shift_key = shiftKey
    # return the serialized protobuf message
    return caesarCipherProto.SerializeToString()

# messages to send
messagesPlainText = ["hello world!", "programming is awesome", "computer science"]
caesarCipherProto = caesarCipher_pb2.CaesarCipher()
portPublisher = "5580"
# create a zmq context
context = zmq.Context()
# create a publisher socket
socket = context.socket(zmq.PUB)
# bind the socket to a predefined port
socket.bind("tcp://*:%s" % portPublisher)

while True:
    for msg in messagesPlainText:
        # serialize caesarCipherProto into protobuf format
        dataSerialized = serializeToProtobuf(msg, caesarCipherProto, 3)
        print("Plain Text : " + msg + " <===> Caesar cipher : " + caesarCipherProto.caesar_cipher_text)
        # display the serialized caesarCipherProto data
        print("Protobuf message to send : " + str(dataSerialized))
        time.sleep(1)
        socket.send(dataSerialized)  # send the binary serialized data
        print("---------------------------------")
        print("---------------------------------")
        print("---------------------------------")
- C++ Subscriber :
#include <cstring>
#include <iostream>
#include <string>
#include <zmq.hpp>
#include "caesarCipher.pb.h"

using namespace std;

void DecryptCipherDisplay(std::string cipherText, int cipherKey);

int main() {
    GOOGLE_PROTOBUF_VERIFY_VERSION;

    /* -------------------------- */
    /* Create a subscriber socket */
    /* -------------------------- */
    zmq::context_t context(1);
    zmq::socket_t subSocket(context, ZMQ_SUB);
    // Connect to the port the Python publisher is bound to
    subSocket.connect("tcp://localhost:5580");
    cout << "------ Subscriber running ------\n" << endl;
    // Listen for all topics (empty filter)
    subSocket.setsockopt(ZMQ_SUBSCRIBE, "", strlen(""));

    // Instantiate a CaesarCipher to be filled with received data
    com_company_caesar::CaesarCipher caesarCipher;

    while (true) {
        zmq::message_t zmqMessageReceived;   // used to hold zmq received data
        subSocket.recv(&zmqMessageReceived); // blocks until data reception
        // Map the zmq data holder to a string
        std::string messageReceived(static_cast<char*>(zmqMessageReceived.data()),
                                    zmqMessageReceived.size());
        // Deserialize protobuf data and store it into caesarCipher
        caesarCipher.ParseFromString(messageReceived);
        // Decrypt the caesar cipher and display the received string
        DecryptCipherDisplay(caesarCipher.caesar_cipher_text(), caesarCipher.shift_key());
        cout << "-------------------------------------" << endl;
    }

    google::protobuf::ShutdownProtobufLibrary();
    return 0;
}

void DecryptCipherDisplay(std::string cipherText, int cipherKey) {
    string PlainTextRecovered;
    for (string::iterator it = cipherText.begin(); it != cipherText.end(); ++it)
        PlainTextRecovered += static_cast<char>(*it - cipherKey); // reverse caesar cipher
    cout << "Reversing caesar cipher : " << PlainTextRecovered << endl;
}
- Testing the communication :
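Independently of the network, the cipher logic shared by the publisher and the subscriber can be checked locally. The sketch below restates both directions in plain Python (the function names are hypothetical, and the decryption mirrors what the C++ DecryptCipherDisplay function does) and verifies that a message survives the round trip with the shift key of 3 used above.

```python
def encrypt_caesar(plain_text: str, shift_key: int) -> str:
    """Shift every character forward by shift_key, as the publisher does."""
    return "".join(chr(ord(character) + shift_key) for character in plain_text)


def decrypt_caesar(cipher_text: str, shift_key: int) -> str:
    """Shift every character back by shift_key, as the subscriber does."""
    return "".join(chr(ord(character) - shift_key) for character in cipher_text)


if __name__ == "__main__":
    cipher_text = encrypt_caesar("hello world!", 3)
    print(cipher_text)                     # khoor#zruog$
    print(decrypt_caesar(cipher_text, 3))  # hello world!
```

Note that this scheme shifts every character, including spaces and punctuation, which is why the space becomes "#" in the cipher text.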