Tuesday, 25 December 2018

ZMQ and Google Protocol Buffers V3

Introduction

ZMQ is a high-level library that provides an abstraction over UNIX sockets and lets you send data in binary format (typically serialized with Google Protocol Buffers). Previous articles on this blog already explained both ZMQ and Google Protocol Buffers V2.

Google Protocol Buffers V3

Google Protocol Buffers V3 (also called protobuf) is just one step beyond its predecessor (it is recommended to read the Google Protocol Buffers V2 article before going further). As explained in the previous article, the Protocol Buffers workflow goes through 3 steps :
  1. Describe the layout of data to transmit in .proto file.
  2. Compile .proto file using protoc (protobuf compiler) to generate corresponding classes.
  3. Use generated classes in your code.
We are going to review the first two steps, as they have changed a little since the previous version of protobuf.

Protobuf3 layout description

Data must be described using protobuf syntax which looks like regular C structures. An example would be :
syntax = "proto3";

package com_company_atom;

message AtomStructure{
     string atom_name = 1;
     int32  atom_nb_protons = 2;
     int32  atom_nb_neutrons = 3;
     int32  atom_nb_electrons = 4;
     bool   atom_is_radioactive = 5;
}
Let's discuss the structure of a .proto file :

Protobuf syntax version declaration

This must be the first non-empty, non-comment line of the proto file. At the time of writing, version 3 is the latest version of protobuf.

Remark : Every proto file must start with a version syntax declaration, otherwise protoc will assume version 2 by default (syntax="proto2").

Package name (optional)

Although not mandatory, a package name is translated into a namespace in the generated code (to avoid naming conflicts in your code). One should always include a package name.

Message description

Data content is described using the message keyword, and fields use conventional data types (string, bool, int32, int64, float, double, etc.).

Some conventions

Naming conventions
  • Message names : should be written in CamelCase, with a capital letter for every new word (e.g. AtomStructure).
  • Field names : should be in lower case, with words separated by "_" (e.g. atom_name).
[Figure: schematic summarizing the message and field naming conventions]

Field identifiers

Every field must be given a unique ID starting from 1 (in our example, atom_name has an ID of 1).

One may ask: why do we need an ID if field names are already unique? Keep in mind that protobuf does not serialize field names (being strings, they would require more bytes). For each field, Protocol Buffers serializes only the field (wire) type + field ID, followed by the field value.

Remark : When field ID is less than 16, only one byte is required to serialize field type + field ID
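
To make the remark above concrete, here is a minimal pure-Python sketch of how a field's key is built on the wire: the field ID is shifted left by 3 bits, combined with the wire type, and the result is written as a base-128 varint. The helper names (encode_varint, encode_tag) are hypothetical and are not part of the protobuf library; this is an illustration of the encoding, not the library's implementation.

```python
WIRE_VARINT = 0            # used by int32, int64, bool, ...
WIRE_LENGTH_DELIMITED = 2  # used by string, bytes, nested messages


def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_tag(field_id, wire_type):
    """Build the key that precedes every serialized field value."""
    return encode_varint((field_id << 3) | wire_type)


if __name__ == "__main__":
    for field_id in (1, 15, 16):
        tag = encode_tag(field_id, WIRE_VARINT)
        print(field_id, len(tag), tag.hex())
    # prints:
    # 1 1 08
    # 15 1 78
    # 16 2 8001
```

Since 3 bits are taken by the wire type, only 4 bits of the first byte remain for the field ID, which is why IDs 1 to 15 fit in a single byte and ID 16 needs two.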

Protoc V3

Protoc-V3 is not available in the package repository at the time of writing (only Protoc-V2 can be found).

Installing protoc-V3

One can easily download and compile the protoc sources as follows:
  • Getting required dependencies :
    $ sudo apt-get install autoconf automake libtool curl make g++ unzip
    
  • Download protobuf-all-[VERSION].tar.gz (this compiler can generate protobuf classes for various languages like Python and C++) from https://github.com/protocolbuffers/protobuf/releases.
  • Compile sources as shown :
    $ tar -xzf protobuf-all-[VERSION].tar.gz
    $ cd protobuf-[VERSION] # the extracted directory name may differ
    $ ./configure
    $ make
    $ sudo make install
    $ sudo ldconfig # refresh shared library cache
    

Compiling proto buffer files

As already mentioned, protoc can compile proto files for multiple programming languages. The general compilation syntax is :
$ protoc --[LANGUAGE]_out=[OUTPUT_GENERATED_CLASS_DIRECTORY] [PATH_PROTO_FILE]
[Figure: example protoc invocation]
Some remarks :
  • C++ : protoc generates two files : fileName.pb.h (to include in your code) and fileName.pb.cc (to include for compilation).
  • Python : protoc generates one file fileName_pb2.py (to be imported in your code).

Classes generated by protoc contain at least setters and getters for every field name.
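
As a small illustration of the compilation syntax and the file naming described above, the following Python sketch builds a protoc command line and lists the files one would expect it to generate. The helper names (protoc_command, expected_outputs) are hypothetical, and the sketch only covers the two languages discussed in this article; it does not run protoc itself.

```python
from pathlib import Path


def protoc_command(language, out_dir, proto_path):
    """Build: protoc --[LANGUAGE]_out=[OUTPUT_DIRECTORY] [PATH_PROTO_FILE]"""
    return ["protoc", "--{}_out={}".format(language, out_dir), proto_path]


def expected_outputs(language, proto_path):
    """Names of the files protoc is expected to generate (C++ and Python only)."""
    stem = Path(proto_path).stem  # "petIdentity.proto" -> "petIdentity"
    if language == "cpp":
        return [stem + ".pb.h", stem + ".pb.cc"]
    if language == "python":
        return [stem + "_pb2.py"]
    raise ValueError("language not covered by this sketch: " + language)


if __name__ == "__main__":
    print(" ".join(protoc_command("cpp", ".", "petIdentity.proto")))
    print(expected_outputs("cpp", "petIdentity.proto"))
    print(expected_outputs("python", "petIdentity.proto"))
```

If protoc is installed, the returned list can be passed to subprocess.run(cmd, check=True) to actually perform the compilation.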

Working with protobuf

Let's work through a practical example in both Python and C++ and see how we can serialize our data using Google Protocol Buffers.
  1. Writing a proto file :
    syntax = "proto3";
    
    package com_company_pet;
    
    message PetIdentity{
        string pet_name = 1;
        int32  pet_age = 2;
        bool   pet_gender = 3;
    }
    
    
    // petIdentity.proto
    
  2. Generating protobuf classes :
    $ protoc --cpp_out=. petIdentity.proto
    $ protoc --python_out=. petIdentity.proto
    
  3. Using protobuf classes in our code:
    • Python :
      # main.py
      # import petIdentity_pb2 module
      import petIdentity_pb2
      
      
      print("-------- Serializing data -------")
      # Create an instance of PetIdentity
      petIdentity = petIdentity_pb2.PetIdentity()
      
      
      # Fill PetIdentity instance
      petIdentity.pet_name = "Oscar"
      petIdentity.pet_age = 2
      petIdentity.pet_gender = True
      
      # Serialize PetIdentity instance using protobuf
      petIdentitySerialized = petIdentity.SerializeToString()
      
      # Display serialized data (it is a bytes object, so convert it before concatenating)
      print("Serialized Data : " + str(petIdentitySerialized))
      
      print("") # add empty line
      
      print("-------- Deserializing data -------")
      # Create an instance of PetIdentity for deserialization
      petIdentityDeserialized = petIdentity_pb2.PetIdentity()
      
      # Deserialize Serialized data
      petIdentityDeserialized.ParseFromString(petIdentitySerialized)
      
      # Display deserialized data
      print("Cat-Name : " + petIdentityDeserialized.pet_name + " <===> Cat-age : " + str(petIdentityDeserialized.pet_age) + " <===> Cat-gender : " + ("male" if petIdentityDeserialized.pet_gender  else "female"))
      
      [Figure: output of the Python example]

      Remark : In practice, the serialized data (in this example, it's petIdentitySerialized) is what we need to send through the network.

    • C++ :
      /* 
         --------------- main.cpp ----------
         ----- Google Protocol Buffer ------
         --------- Serializer and ----------
         ------- Deserializer Example ------
      */
      
      #include <iostream>
      #include <string>
      #include "petIdentity.pb.h"
      using namespace std;
      
      
      int main(){
          GOOGLE_PROTOBUF_VERIFY_VERSION; // recommended by Google to make sure that the correct protobuf library is loaded
          
          /* -------------------------------
             ---- Protobuf serialization --- 
             ------------ process ----------
             -------------------------------
          */
          com_company_pet::PetIdentity petIdentity; // Create an instance of PetIdentity
      
          petIdentity.set_pet_name("Oscar"); // Set pet name to Oscar
          petIdentity.set_pet_age(2); // Set pet age to 2 years
          petIdentity.set_pet_gender(false); // Set gender to female
      
          string petIdentitySerialized;
      
          petIdentity.SerializeToString(&petIdentitySerialized);    
      
          cout << "Serialized protobuf data : " << petIdentitySerialized << endl;
       
      
      /* 
         ---------------------------------------------
         ------ Protobuf deserialization process -----
         ---------------------------------------------
      */
      
          com_company_pet::PetIdentity petIdentityDeserialized;
          
          petIdentityDeserialized.ParseFromString(petIdentitySerialized);
          cout << "\nDeserializing the data" << endl;
          cout << "Cat-Name : " << petIdentityDeserialized.pet_name() << " <===> Cat-age : " << petIdentityDeserialized.pet_age() << " <===> Cat-gender : " << (petIdentityDeserialized.pet_gender()?"male":"female") << endl; 
      
      
      
      
          google::protobuf::ShutdownProtobufLibrary(); // free all resources
          return 0;    
      }
      
      [Figure: output of the C++ example]

      Remark : In practice, the serialized data (in this example, it's petIdentitySerialized) is what we need to send through the network.
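
To see what the serialized bytes in the two examples above contain, here is a pure-Python sketch that encodes a PetIdentity message by hand, following the proto3 wire format. The function names are hypothetical and the sketch only handles non-negative ages; in real code the generated classes should be used. It assumes fields are written in field-ID order, which is what the official implementations typically do, and it applies the proto3 rule that scalar fields holding their default value (empty string, 0, false) are not written at all.

```python
def encode_varint(value):
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        low_bits = value & 0x7F
        value >>= 7
        if value:
            out.append(low_bits | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low_bits)
            return bytes(out)


def encode_pet_identity(pet_name, pet_age, pet_gender):
    """Hand-encode PetIdentity {pet_name = 1; pet_age = 2; pet_gender = 3}."""
    out = bytearray()
    if pet_name:  # default "" is omitted in proto3
        raw = pet_name.encode("utf-8")
        out += encode_varint((1 << 3) | 2)  # field 1, length-delimited
        out += encode_varint(len(raw))
        out += raw
    if pet_age:  # default 0 is omitted in proto3
        out += encode_varint((2 << 3) | 0)  # field 2, varint
        out += encode_varint(pet_age)
    if pet_gender:  # default false is omitted in proto3
        out += encode_varint((3 << 3) | 0)  # field 3, varint
        out += encode_varint(1)
    return bytes(out)


if __name__ == "__main__":
    print(encode_pet_identity("Oscar", 2, True))   # b'\n\x05Oscar\x10\x02\x18\x01'
    print(encode_pet_identity("Oscar", 2, False))  # b'\n\x05Oscar\x10\x02'
```

The second call mirrors the C++ example: because pet_gender is false, field 3 does not appear in the serialized data, and the deserializer simply restores the default value.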

Sending protobuf data with ZMQ

As one may expect, protobuf data is meant to be sent over the network. We could use traditional UNIX sockets; however, they can quickly become a bottleneck.

ZMQ is easier, more reliable and less cumbersome to use. A previous post already discussed ZMQ. Google Protocol Buffers is cross-platform and can be used between multiple languages.
  1. Creating a proto file :
    syntax = "proto3";
    
    package com_company_caesar;
    
    message CaesarCipher {
        string caesar_cipher_text = 1; // Carries caesar cipher
        int32 shift_key = 2; // Shift key (it is equal to 3)
    }
    
  2. Heterogeneous Publisher and Subscriber
    • Python Publisher :
      import caesarCipher_pb2
      import zmq
      import time
      
      def encryptCaesarCipher(plainText, shiftKey):
          cipherText = ""    
          for character in plainText:
              # shift every character in the message by shiftKey
              cipherText += chr(ord(character) + shiftKey) 
          return cipherText
      
      def serializeToProtobuf(msg, caesarCipherProto, shiftKey):
          # fill caesarCipherProto
          caesarCipherProto.caesar_cipher_text = encryptCaesarCipher(msg, shiftKey)
          caesarCipherProto.shift_key = shiftKey
          # return serialized protobuf caesarCipherProto
          return caesarCipherProto.SerializeToString()
      
      # messages to send
      messagesPlainText = ["hello world!", "programming is awesome", "computer science"]
      caesarCipherProto = caesarCipher_pb2.CaesarCipher()
      
      
      portPublisher = "5580"
      # create a zmq context
      context = zmq.Context()
      # create a publisher socket
      socket = context.socket(zmq.PUB)
      # Bind the socket at a predefined port  
      socket.bind("tcp://*:%s" % portPublisher)
      
      
      while True:
          for msg in messagesPlainText:
              # serialize caesarCipherProto into protobuf format
              dataSerialized = serializeToProtobuf(msg, caesarCipherProto, 3)
              print("Plain Text : " + msg + " <===> Caesar cipher : " + caesarCipherProto.caesar_cipher_text)
              print("Protobuf message to send : " + str(dataSerialized)) # display caesarCipherProto data
              time.sleep(1)
              socket.send(dataSerialized) # send binary serialized data
              print("---------------------------------")
              print("---------------------------------")
              print("---------------------------------")
      
    • C++ Subscriber :
      #include <iostream>
      #include <zmq.hpp>
      #include <string>
      #include "caesarCipher.pb.h"
      
      using namespace std;
      
      void DecryptCipherDisplay(std::string cipherText, int cipherKey);
      
      int main(){
          GOOGLE_PROTOBUF_VERIFY_VERSION;
          /* -------------------------- */
          /* Create a subscriber socket */
          /* -------------------------- */
          zmq::context_t context(1);
          zmq::socket_t subSocket(context, ZMQ_SUB);
          // Connect to the port bound by the Python publisher
          subSocket.connect("tcp://localhost:5580"); 
        
          cout << "------ Subscriber running ------\n" << endl;
          // Subscribe to all topics (empty prefix of length 0)
          subSocket.setsockopt(ZMQ_SUBSCRIBE, "", 0);
          // Instantiate a CaesarCipher to be filled with received data
          com_company_caesar::CaesarCipher caesarCipher; 
          while(true) {
              
              zmq::message_t zmqMessageReceived; // used to hold zmq received data
              subSocket.recv(&zmqMessageReceived); // Blocks until data reception
              // Map zmq data holder to string
              std::string messageReceived(static_cast<char*>(zmqMessageReceived.data()), zmqMessageReceived.size());
              // Deserialize protobuf data and store them into caesarCipher
              caesarCipher.ParseFromString(messageReceived);
               
              // Decrypt caesar cipher and display received string
              DecryptCipherDisplay(caesarCipher.caesar_cipher_text(), caesarCipher.shift_key());
                 
              cout << "-------------------------------------" << endl;
          }        
      
          google::protobuf::ShutdownProtobufLibrary();
      
          return 0;
      }
      
      
      void DecryptCipherDisplay(std::string cipherText, int cipherKey){
          string::iterator it;
          string PlainTextRecovered;
          for (it = cipherText.begin(); it < cipherText.end(); it++) 
              PlainTextRecovered += static_cast<char>(*it - cipherKey); // reverse caesar cipher
          cout <<  "Reversing caesar cipher : "<< PlainTextRecovered << endl;
      }
      
  3. Testing the communication :
     [Figure: screenshot of the test run]
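
Before running both programs, the cipher logic can be checked without any network. The sketch below mirrors the publisher's encryptCaesarCipher and the subscriber's DecryptCipherDisplay in pure Python (the snake_case function names are hypothetical) and verifies that decryption recovers each plain text message. It only covers the ASCII messages used in this article: the C++ subscriber works byte by byte, so characters shifted beyond the ASCII range would not round-trip the same way.

```python
def encrypt_caesar(plain_text, shift_key):
    """Same logic as the publisher: shift every character up by shift_key."""
    return "".join(chr(ord(character) + shift_key) for character in plain_text)


def decrypt_caesar(cipher_text, shift_key):
    """Same logic as the subscriber: shift every character back down."""
    return "".join(chr(ord(character) - shift_key) for character in cipher_text)


if __name__ == "__main__":
    messages = ["hello world!", "programming is awesome", "computer science"]
    for message in messages:
        cipher_text = encrypt_caesar(message, 3)
        recovered = decrypt_caesar(cipher_text, 3)
        print(message + " -> " + cipher_text + " -> " + recovered)
        assert recovered == message
```

With a shift key of 3, "hello world!" becomes "khoor#zruog$", which is the text the subscriber receives in caesar_cipher_text before reversing it.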
