Analyze Google Protocol Buffers encoded files and network traffic

I just uploaded a new build of Unsniff 1.8 Beta that supports Google’s new protocol buffers scheme. Basically, you can stick your proto files in a particular folder and decode files and network streams on the fly.
Click here for step-by-step instructions on how to use this feature.

This is Beta software. Please report problems and suggestions – either as comments to this post or to the forum.

In the rest of the post, I will explain why we worked on this feature and how it works.
-----

If you have not yet heard, Protocol Buffers (protobuf) is a serialization mechanism for structured data.

From Google’s Open Source Blog:

Protocol Buffers allow you to define simple data structures in a special definition language, then compile them to produce classes to represent those structures in the language of your choice. These classes come complete with heavily-optimized code to parse and serialize your message in an extremely compact format.

Blog post by Kenton Varda, Software Engineering Team at Google

You can visit the project page for more detail.

When the project was first announced in July 2008, I was immediately attracted to it. It sounded like a perfect test case for Unsniff 2.0’s dynamic plugin framework.

A little background first: Unsniff Network Analyzer is a multi-layer, scriptable, content-aware network analyzer. One of the cool things about Unsniff is its API. You can write a variety of plugins using the Unsniff API, but protocol plugins are the most common.

The types of protocol plugins you can write are:

  • A native plugin. A protocol plugin written as a C++ ATL COM Object using the framework provided. It is packaged as a DLL.
  • A dynamic plugin. Written in XML, which describes the protocol in detail.
  • A mix. The XML handles the field dissection and the ATL handles other things like reassembly, custom descriptions, etc.

In Unsniff 2.0, we are introducing a new concept called “Custom Dynamic Plugin”. Instead of XML, the user can create plugins in any “IDL-like” language they can parse. The API provides hooks so that these plugins can be integrated into the Unsniff framework. This approach has a big advantage, because a user frequently has hundreds of in-house protocol messages in a custom format. They cannot be expected to write “XML documents” and certainly not “C functions”.
So, we decided to try supporting Protocol Buffers in the Beta (Unsniff 1.8) as a way to test out the concept. Here is how it works:

1. You stick all your proto files in a special folder

2. You write a small XML stub describing each protocol and how it integrates into the Unsniff framework (e.g., which ports it operates on, the name of the protocol, the ID, etc.)

That’s it!

When required, Unsniff will compile each proto on the fly and create a dynamic custom decoder. This supports decoding network packets as well as files containing protobuf encoded data.
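
To give a feel for what any protobuf decoder has to do at the byte level, here is a minimal Python sketch that walks the top-level fields of an encoded message without a schema. It is not Unsniff's code; the function names `read_varint` and `walk_fields` are made up for illustration, and groups (wire types 3 and 4) are deliberately left out. A real decoder would then use the compiled proto to map each field number to a name and type.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not (byte & 0x80):              # high bit clear: last byte
            return result, pos
        shift += 7
        if shift >= 70:                    # more than 10 bytes is invalid
            raise ValueError("varint too long")


def walk_fields(buf):
    """Return [(field_number, wire_type, raw_value)] for top-level fields."""
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:                 # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 1:               # fixed 64-bit
            value, pos = buf[pos:pos + 8], pos + 8
        elif wire_type == 2:               # length-delimited (string, bytes, sub-message)
            length, pos = read_varint(buf, pos)
            value, pos = buf[pos:pos + length], pos + length
        elif wire_type == 5:               # fixed 32-bit
            value, pos = buf[pos:pos + 4], pos + 4
        else:                              # groups are not handled in this sketch
            raise ValueError("unsupported wire type %d" % wire_type)
        if pos > len(buf):
            raise ValueError("truncated field")
        fields.append((field_number, wire_type, value))
    return fields
```

For example, `walk_fields(bytes.fromhex("089601"))` sees field 1 as a varint with value 150. Without the proto definition that is all you can tell, which is why the proto files need to be available to the analyzer.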

You get all of Unsniff’s larger network features for free. This includes handling many link layer protocols, TCP segmentation, IP defragmentation, TLS decryption for debugging, etc. Each message is shown as a separate PDU in the PDU sheet. These messages could span multiple packets or several could be contained in a single link layer packet.
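
Protobuf messages do not mark their own end, so something has to frame them on a stream. The sketch below assumes one common convention, a varint length prefix in front of each message; this is an assumption for illustration and not necessarily how Unsniff or your protocol delimits messages. The class name `DelimitedStream` is hypothetical. It shows how a message can span several segments and how one segment can carry several messages.

```python
class DelimitedStream:
    """Reassemble varint-length-prefixed messages from stream segments.

    Assumption: every message is preceded by its length as a varint.
    """

    def __init__(self):
        self._buf = bytearray()

    def feed(self, segment):
        """Add one segment payload; return the messages completed so far."""
        self._buf += segment
        messages = []
        while True:
            header = self._try_read_length()
            if header is None:             # length prefix not complete yet
                break
            length, start = header
            end = start + length
            if end > len(self._buf):       # body continues in a later segment
                break
            messages.append(bytes(self._buf[start:end]))
            del self._buf[:end]            # drop the consumed message
        return messages

    def _try_read_length(self):
        """Return (length, header_size), or None if more bytes are needed."""
        value = 0
        for i, byte in enumerate(self._buf[:10]):
            value |= (byte & 0x7F) << (7 * i)
            if not (byte & 0x80):
                return value, i + 1
        if len(self._buf) >= 10:
            raise ValueError("length prefix too long")
        return None
```

Feeding the first two bytes of a stream returns an empty list; feeding the rest can return two or more complete messages at once, each of which would become its own PDU.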

You can download the latest builds from the Beta Page

Enjoy!

-----

Postscript

I wrote a custom parser and lexer for proto files. It handles pretty much everything including groups, extensions, import files, package names, etc. I could have just used the library’s methods for compiling it, but I was already too far down the road of YACC. I also wanted to extract the comments in the proto file, which the grammar does.
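
To illustrate the idea of keeping comments while scanning a proto file, here is a rough Python sketch. It is a regex-based stand-in, not the YACC and LEX grammar mentioned above, and it only covers a small subset of the proto syntax. The names `tokenize`, `comments_by_field`, and the `SAMPLE` definition are invented for this example.

```python
import re

# One alternative per token kind; comments are kept as real tokens.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>//[^\n]*|/\*.*?\*/)
  | (?P<STRING>"(?:\\.|[^"\\])*"|'(?:\\.|[^'\\])*')
  | (?P<NUMBER>-?\d+(?:\.\d+)?)
  | (?P<IDENT>[A-Za-z_][A-Za-z0-9_.]*)
  | (?P<SYMBOL>[{}\[\]()=;,<>])
  | (?P<WS>\s+)
""", re.VERBOSE | re.DOTALL)


def tokenize(text):
    """Split proto source into (kind, text) pairs, dropping only whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError("unexpected character %r at offset %d" % (text[pos], pos))
        if m.lastgroup != "WS":
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens


def _clean(comment):
    """Strip the // or /* */ delimiters from a comment token."""
    if comment.startswith("//"):
        return comment[2:].strip()
    return comment[2:-2].strip()


def comments_by_field(tokens):
    """Map each field name to the comment lines directly above it."""
    docs = {}
    pending = []
    for i, (kind, text) in enumerate(tokens):
        if kind == "COMMENT":
            pending.append(_clean(text))
        elif kind == "SYMBOL" and text == "=" and i > 0 and tokens[i - 1][0] == "IDENT":
            if pending:                    # identifier before '=' is the field name
                docs[tokens[i - 1][1]] = " ".join(pending)
            pending = []
        elif kind == "SYMBOL" and text in ("{", "}", ";"):
            pending = []                   # comment belonged to something else
    return docs


SAMPLE = """
// A person in the address book.
message Person {
  // Full display name.
  required string name = 1;
  /* Unique id. */
  required int32 id = 2;
  optional string email = 3;
}
"""
```

Calling `comments_by_field(tokenize(SAMPLE))` pairs `name` and `id` with their comments, which is the kind of information an analyzer can show next to a decoded field.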

If anyone is interested I can post the YACC and LEX files as public domain. I will post this offer in the discussion group.

Author: Vivek Rajagopalan

Vivek Rajagopalan is a lead developer for Trisul Network Analytics. Prior products were Unsniff Network Analyzer and Unbrowse SNMP. Loves working with packets, very high speed networks, and helping track down the bad guys on the internet.