rgloden has asked for the wisdom of the Perl Monks concerning the following question:

I'm trying to think out of the box but I need some help...

Problem:

The embedded system that I work on can generate megabytes of data per second. I am looking to automatically analyze the captured bus data for sequence errors.

For instance, one of the protocols is for file transfer, which includes transactions between:

- clients
- file server
- storage device

Making the problem tougher is that:

- protocols can be complex and recursive, and need to remember state data, e.g. when reading a file one block at a time
- transactions from multiple clients occur simultaneously
- the bus carries multiple protocols, e.g. file server, processor control, health status, etc., plus lots of data that is don't-care

My first thought was to write explicit code to walk through state machines. The problem is that each state machine would be a small, complex project in itself and probably could not be offloaded to other users on my team.

My next thought is to write a script that will parse a set of protocol rules and generate Perl for Parse::RecDescent. (Maybe a little slow, but good for prototyping.)

A very simple non-recursive file transfer example might look like:

Msg : File_Request -- client to file server
    Type Read
    File_ID 44
    Offset 0
    Size 400
    Program 50
Msg : File_Response -- file server to client
    Type Read
    File_ID 44
    Program 50
    Status Request_ACK
Msg : Get_File -- server to storage device
    Type Read
    File_ID 44
    Offset 0
    Size 400
    Address 1AF3000
Msg : File_Data -- server to client
    Data 0000
    ...
    Data FEF2
Msg : Get_File_Response -- storage device to server
    File_ID 44
    ACK_NACK Complete
Msg : Get_File_Response -- server to client
    File_ID 44
    ACK_NACK Complete
Msg : File_Response -- file server to client
    Type Read
    File_ID 44
    Program 50
    Status Complete
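
For comparison with the hand-written state machine idea, here is a minimal table-driven sketch in Python (the same structure would port to a Perl hash of hashes). It assumes messages have already been decoded into dictionaries with a `name` and, where present, a `File_ID`. The state names and the `TRANSITIONS` table are invented for illustration and are not taken from any real protocol specification. Messages without a `File_ID` (such as `File_Data` in the example) are skipped, since there is nothing shown to correlate them on.

```python
# Hypothetical rule table for the read transaction in the example trace.
# Key: (current_state, message_name). Value: next state.
# State names are invented for illustration only.
TRANSITIONS = {
    ("idle", "File_Request"): "requested",
    ("requested", "File_Response"): "acked",
    ("acked", "Get_File"): "fetching",
    ("fetching", "Get_File_Response"): "fetched",
    ("fetched", "Get_File_Response"): "relayed",
    ("relayed", "File_Response"): "idle",  # transaction complete
}


def check(messages):
    """Return a list of (index, file_id, state, message_name) sequence errors."""
    state = {}   # File_ID -> current state, one entry per open transaction
    errors = []
    for i, msg in enumerate(messages):
        file_id = msg.get("File_ID")
        if file_id is None:
            continue  # nothing to correlate on: treat as don't-care
        current = state.get(file_id, "idle")
        nxt = TRANSITIONS.get((current, msg["name"]))
        if nxt is None:
            errors.append((i, file_id, current, msg["name"]))
            continue  # stay in the current state after an error
        state[file_id] = nxt
    return errors
```

Because state is tracked per `File_ID`, interleaved transactions from several clients are checked independently, and the rules live in a data table that teammates could edit without touching the walking code.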

Are there better techniques for analyzing protocols and state machines?

Thanks,

Ronny

Replies are listed 'Best First'.
Re: parsing protocol data : Is Parse::RecDescent the right tool?
by VSarkiss (Monsignor) on Jan 14, 2002 at 21:32 UTC

    Hmm... A very interesting question. It's been years since I've played with protocol analyzers, but my initial inclination is to say a recursive descent parser would not be a good idea.

    Writing error handling is a difficult part of a parser. If I'm reading this correctly, you want to know when something's out of sequence, so you probably want more detail than the parser will give you by default when something's out of order. Or you'll end up writing the same "I wanted this, I got that" error check for every action.

    I'd actually suggest using something like Expect.pm. There are two distributions; the more recent one allows you to use an opened filehandle instead of spawning a separate process. Using that, you could write the protocol blocks as expect patterns, and easily detect when something's out of sequence. You'd still need to do some hacking to support multiple parallel transactions on the bus, if you have such things.

    HTH
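
The expect-style idea above can be sketched without any particular library. The following Python function is only an illustration of the technique: it is not Expect.pm's API, and the function name and patterns are assumptions made up for this example. It walks an ordered list of regular expressions against successive header lines and reports the first "wanted this, got that" mismatch in one place, rather than repeating the check per rule.

```python
import re


def expect_sequence(lines, patterns):
    """Match each pattern, in order, against successive lines.

    Returns None if every pattern matched, otherwise a
    'wanted X, got Y' description of the first mismatch.
    """
    it = iter(lines)
    for pat in patterns:
        line = next(it, None)
        if line is None:
            return f"wanted /{pat}/, got end of capture"
        if not re.search(pat, line):
            return f"wanted /{pat}/, got {line!r}"
    return None


# Hypothetical patterns for the start of the read transaction.
# \b keeps 'Get_File' from matching 'Get_File_Response'.
READ_START = [
    r"^Msg : File_Request\b",
    r"^Msg : File_Response\b",
    r"^Msg : Get_File\b",
]
```

As noted in the reply, this handles a single transaction at a time; parallel transactions would first need to be split into per-transaction streams.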

Re: parsing protocol data : Is Parse::RecDescent the right tool?
by IlyaM (Parson) on Jan 14, 2002 at 21:39 UTC
    The embedded system that I work on can generate MegaBytes of data per second.

    AFAIK Parse::RecDescent doesn't scale to such amounts of data. So the answer is: No, it is not the right tool.

    --
    Ilya Martynov (http://martynov.org/)

Re: parsing protocol data : Is Parse::RecDescent the right tool?
by mattr (Curate) on Jan 16, 2002 at 17:29 UTC
    Some thoughts...

    Unfortunately, according to Google this is the kind of thing that people like to do for graduate theses. A number of commercial firewalls and routers seem to use a Protocol Definition Language; PacketBoy, for example, lets you define your own protocol. It's possible that bustools.com sells something you want too.

    Much work has been done on this in packet sniffing, especially Snort. It may not be applicable to you, but it does have rule writing, variables, decomposition into separate streams, and stateful analysis. It just might work if you could play the data across the network as TCP packets (or not).

    That said, Expect sounds really good. One other possibility I could imagine would be to capture into a database and write SQL snippets. Or convert a processed stream to tokens and use regexes :) my favorite flavor. Good luck!
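
As a rough sketch of the capture-into-a-database idea, the Python example below loads decoded messages into an in-memory SQLite table and uses one SQL snippet to find every `File_Request` that never receives a later `File_Response` with status `Complete` for the same file. The table layout, column names, and function name are assumptions chosen for illustration; a real capture would need its own schema.

```python
import sqlite3


def unfinished_requests(rows):
    """rows: iterable of (seq, name, file_id, status) tuples from a capture.

    Returns (seq, file_id) for each File_Request with no later
    File_Response carrying status 'Complete' for the same file_id.
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE msg (seq INTEGER, name TEXT, file_id INTEGER, status TEXT)"
    )
    db.executemany("INSERT INTO msg VALUES (?, ?, ?, ?)", rows)
    query = """
        SELECT r.seq, r.file_id
        FROM msg AS r
        WHERE r.name = 'File_Request'
          AND NOT EXISTS (
              SELECT 1 FROM msg AS c
              WHERE c.name = 'File_Response'
                AND c.status = 'Complete'
                AND c.file_id = r.file_id
                AND c.seq > r.seq)
        ORDER BY r.seq
    """
    result = db.execute(query).fetchall()
    db.close()
    return result
```

Each protocol rule becomes one query, so the checks can be written and reviewed separately, at the cost of loading the capture into the database first.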