TrinityInfinity has asked for the wisdom of the Perl Monks concerning the following question:

I've never dealt with complicated matching before, so I definitely need a little assistance on this one!
I'm taking a frame in and need to break it down into a packet. This isn't a problem, except I've no idea how to split my frame into the necessary pieces to work with. The frame begins & ends the same way, has a 16-character byte count section, and then the mystery 'payload'. It would look something like this
    flag      byte count        mystery       flag
    01111110  0000000000000100  010111001001  01111110
The only things I'm *sure* of every time I get a frame are the flag sections on either end and that the byte count is always 16 characters long.
I want to be able to split this frame when I get it into 4 parts: flag1, bytecount, mystery, flag2. What kind of pattern would I be matching for? Splitting on whitespace isn't an option, since the frame only comes in as one long piece. Having a variable-length part like the mystery section in a string is something I've never had to work with before, hence my need for guidance from those more experienced than I.

I do have the Programming Perl book, if someone can direct me to an example in there that might assist me.

Re: RegEx Confusion!
by japhy (Canon) on Nov 02, 2001 at 20:48 UTC
    A regex here is overkill (since it would end up requiring backtracking at the "mystery" part):
    ($flag1, $bytecount, $mystery, $flag2) = $frame =~ m{
      ^
      ( [01]{8} )
      ( [01]{16} )
      ( [01]*? )    # that could be [01]* for all I care...
      ( [01]{8} )
      $
    }x;
    I'd much rather use substr() and unpack():
    ($flag1, $bytecount, $mystery) = unpack "A8 A16 A*", $frame;
    $flag2 = substr($mystery, -8, 8, '');
    No backtracking. Just simplicity.
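
For anyone who wants to run the unpack-and-substr approach directly, here is a minimal sketch wrapped in a sub. The name split_frame() and the 32-character sanity check are assumptions added for illustration; they are not part of the reply above.

```perl
use strict;
use warnings;

# split_frame() is a hypothetical helper name, not something from the thread.
# It returns (flag1, bytecount, mystery, flag2), or an empty list when the
# input cannot be a frame (non-binary characters or fewer than 32 of them).
sub split_frame {
    my ($frame) = @_;
    return unless defined $frame && $frame =~ /\A[01]{32,}\z/;

    # A8 and A16 take the fixed-width fields; A* takes everything that is left.
    my ($flag1, $bytecount, $mystery) = unpack 'A8 A16 A*', $frame;

    # Four-argument substr removes the last 8 characters from $mystery and
    # returns them, so only the payload stays behind.
    my $flag2 = substr($mystery, -8, 8, '');

    return ($flag1, $bytecount, $mystery, $flag2);
}

# Sample frame from the question, with the spaces removed.
my @parts = split_frame('01111110' . '0000000000000100' . '010111001001' . '01111110');
print join(' ', @parts), "\n";
```

The length check matters because a frame shorter than 32 characters would otherwise produce fields that silently overlap or come back empty.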

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker.
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: RegEx Confusion!
by Masem (Monsignor) on Nov 02, 2001 at 20:43 UTC
    If "flag" is the same length (8 bits), then this should work
    my ( $flag, $bytes, $message ) = $frame =~ /^([01]{8})([01]{16})([01]*)\1$/;
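
A short sketch of what the \1 backreference in that pattern does, using the sample frame from the question. The second frame, with a different closing flag, is made up here purely to show the failure case.

```perl
use strict;
use warnings;

my $pattern = qr/^([01]{8})([01]{16})([01]*)\1$/;

# Sample frame from the question, with the spaces removed.
my $frame = '01111110' . '0000000000000100' . '010111001001' . '01111110';

# \1 must match the exact text captured by the first group, so the frame
# only matches when the closing flag equals the opening flag.
if (my ($flag, $bytes, $message) = $frame =~ $pattern) {
    print "flag=$flag bytes=$bytes message=$message\n";
}

# Invented frame whose closing flag differs: the match fails outright.
my $bad = '01111110' . '0000000000000100' . '010111001001' . '11111111';
print "bad frame rejected\n" unless $bad =~ $pattern;
```

Because the pattern is anchored at both ends, a payload that happens to contain the flag bits does not confuse it; only the final 8 characters are compared against the first 8.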

    -----------------------------------------------------
    Dr. Michael K. Neylon - mneylon-pm@masemware.com || "You've left the lens cap of your mind on again, Pinky" - The Brain
    "I can see my house from here!"
    It's not what you know, but knowing how to find it if you don't know that's important

      Your example works perfectly. Thanks so much!

      I've already been breaking it down to see how it works, I was thinking in the right direction, but lacked a good example. Thanks again!
Re: RegEx Confusion!
by maverick (Curate) on Nov 02, 2001 at 20:51 UTC
    actually this sounds more like a task for substr than a regexp. Here's a tested fragment
    my $bit_string = "01111110000000000000010001011100100101111110";

    # starting at the front, take 8 characters
    my $flag = substr($bit_string,0,8);

    # start 8 chars in, get 16 out.
    my $byte_count = substr($bit_string,8,16);

    # start at the END and get 8 characters;
    my $flag2 = substr($bit_string,-8,8);

    # the mystery part. 24 chars in, length of parts we haven't taken out yet.
    my $the_rest = substr($bit_string,24,(length($bit_string) - 24 - 8));
    HTH

    Update: The things you find on CPAN. NetPacket may do exactly what you need.

    /\/\averick
    perl -l -e "eval pack('h*','072796e6470272f2c5f2c5166756279636b672');"

Re: RegEx Confusion!
by Albannach (Monsignor) on Nov 02, 2001 at 20:55 UTC
    just AWTDI:
    ($flag1, $bytecount, $mystery, $flag2) = unpack "a8 a16 a@{[length($frame)-32]} a8", $frame;
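
The @{[ ... ]} construct in that one-liner interpolates the computed payload length into the template string. A minimal sketch of the same idea with the template built by sprintf instead; the "frame too short" check is an addition made here, not part of the reply.

```perl
use strict;
use warnings;

# Sample frame from the question, with the spaces removed.
my $frame = '01111110' . '0000000000000100' . '010111001001' . '01111110';

# 32 = 8 (flag) + 16 (byte count) + 8 (flag); whatever remains is the payload.
my $payload_len = length($frame) - 32;
die "frame too short\n" if $payload_len < 0;

# Same template as "a8 a16 a@{[...]} a8", built without string interpolation.
my $template = sprintf 'a8 a16 a%d a8', $payload_len;
my ($flag1, $bytecount, $mystery, $flag2) = unpack $template, $frame;

print "$template\n";
print "$flag1 $bytecount $mystery $flag2\n";
```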

    --
    I'd like to be able to assign to an luser

Re: RegEx Confusion!
by dragonchild (Archbishop) on Nov 02, 2001 at 20:45 UTC
    The following assumes that the packet is in a stream of 1's and 0's.
    # offsets are zero-based: flag at 0-7, byte count at 8-23, payload from 24
    my $flag1      = substr $packet, 0, 8;
    my $byte_count = substr $packet, 8, 16;
    my $num_bytes  = convert_byte_count($byte_count);
    my $payload    = substr $packet, 24, $num_bytes * 8;
    my $flag2      = substr $packet, 24 + $num_bytes * 8, 8;
    The definition of convert_byte_count() is left as an exercise for the reader.
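
One possible take on that exercise, as a sketch only. It assumes the 16-character field is an unsigned binary number with the most significant bit first; the thread never says how the field is encoded, so treat that as a guess.

```perl
use strict;
use warnings;

# A possible convert_byte_count(), assuming an unsigned, most-significant-
# bit-first binary number (an assumption, not something the thread states).
sub convert_byte_count {
    my ($bits) = @_;
    die "expected 16 binary digits\n"
        unless defined $bits && $bits =~ /\A[01]{16}\z/;

    # oct() interprets a string with a leading "0b" as binary.
    return oct("0b$bits");
}

print convert_byte_count('0000000000000100'), "\n";    # prints 4
```

Note that the sample frame in the question has a count of 4 but a 12-character mystery section, so it is worth confirming what the count actually measures before using it to compute offsets.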

    ------
    We are the carpenters and bricklayers of the Information Age.

    Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

Re: RegEx Confusion!
by Rich36 (Chaplain) on Nov 03, 2001 at 00:50 UTC
    Or, if the delimiters between the fields are consistent (like one or more whitespace characters), you could even use split without worrying about the content or size of the fields.
    my ($flag1, $bytecount, $mystery, $flag2) = split(/\s+/, $frame);

    Rich36
    There's more than one way to screw it up...