Reading hex data

borgis has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Reading hex data by Zaxo (Archbishop) on Aug 09, 2005 at 00:47 UTC
We need to know how those are encoded in your data. Are they octets representing ASCII characters or are they 32-bit binary words? Your expressions for the start and stop strings contain ten and eight characters, respectively, so assumptions are hard to make. Take a look at the index function. It could work in either case, so long as the search substring used the same encoding as the data. After Compline, Zaxo
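To make the `index` suggestion concrete, here is a minimal sketch, not a definitive solution. The helper name `between_markers` is hypothetical, and the marker values and sample bytes are assumptions borrowed from elsewhere in this thread. It shows the point about encodings: the same search works on hex text and on raw bytes, provided the markers are encoded the same way as the data.

```perl
use strict;
use warnings;

# Hypothetical helper: return whatever lies between the first $start marker
# and the next $stop marker, or undef if either marker is missing.
sub between_markers {
    my ($data, $start, $stop) = @_;

    my $from = index $data, $start;
    return if $from < 0;
    $from += length $start;              # skip past the start marker

    my $to = index $data, $stop, $from;  # search only after the start marker
    return if $to < 0;

    return substr $data, $from, $to - $from;
}

# Assumed sample: a shortened version of the data quoted in this thread.
my $hex_text = '045a010000020000004c45534f4e00020400';

# Case 1: the data is ASCII text made of hex digits, so the markers are too.
my $found_hex = between_markers($hex_text, '0002000000', '00020400');
print "$found_hex\n";    # prints 4c45534f4e

# Case 2: the same data as raw bytes, so the markers must be raw bytes as well.
my $raw       = pack 'H*', $hex_text;
my $found_raw = between_markers($raw, pack('H*', '0002000000'), pack('H*', '00020400'));
print "$found_raw\n";    # prints LESON
```

Searching for the stop marker only from `$from` onward avoids picking up a stop marker that happens to occur before the start marker.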
Re: Reading hex data by saintmike (Vicar) on Aug 09, 2005 at 01:19 UTC
Find the relevant text using a regular expression, then go through it two characters at a time and use `hex()` and `chr()` to decode (I added a closing 00020400 to your data):

```
my $raw = join '', <DATA>;
$raw =~ s/\n//g;

if(my($encoded) = ($raw =~ /0002000000(.*?)00020400/)) {
    while($encoded =~ /(..)/g) {
        print chr (hex($1)), "\n";
    }
}

__DATA__
045a010000020000004c45534f4e
4452412043414c4c454420424143
4b2c204920414456495345442054
48415420574500020400
```

cracks your 'code': `L E S O N D R A C A L L E D B A C K ,`
Re^2: Reading hex data by fishbot_v2 (Chaplain) on Aug 09, 2005 at 02:21 UTC
The same, but with an `unpack`:

```
my $raw = do { local $/; <DATA> };    # inline slurp
$raw =~ s/\s//g;

print map { chr hex } unpack '(A2)*', $1
    if $raw =~ m/0002000000(.*?)00020400/;
```

Perhaps a bit too idiomatic, but I prefer using `unpack` to split characters into fixed widths. It is a tiny bit faster, though not blazingly so:

```
            Rate  regex unpack
regex   953336/s     --   -38%
unpack 1549170/s    62%     --
```
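The benchmark code behind those numbers is not shown in the post, so the following is only a guess at how such a comparison could be set up with the core `Benchmark` module. The sub names `decode_regex` and `decode_unpack` and the sample string are assumptions made for illustration; rates will differ by machine and Perl version.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Assumed sample: part of the hex payload quoted in this thread.
my $encoded = '4c45534f4e4452412043414c4c4544204241434b';

# Walk the string two characters at a time with a global regex match.
sub decode_regex {
    my ($hex) = @_;
    my $out = '';
    $out .= chr hex $1 while $hex =~ /(..)/g;
    return $out;
}

# Split into two-character fields with unpack, then decode each field.
sub decode_unpack {
    my ($hex) = @_;
    return join '', map { chr hex } unpack '(A2)*', $hex;
}

# Run each candidate for at least one CPU second and print a comparison table.
cmpthese(-1, {
    regex  => sub { decode_regex($encoded) },
    unpack => sub { decode_unpack($encoded) },
});
```

Both subs return the same decoded string, so the comparison measures only the splitting strategy.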
Re: Reading hex data by GrandFather (Saint) on Aug 09, 2005 at 01:13 UTC
Taking the simplest interpretation of your problem description and modifying the sample data somewhat, the following may be what you want:

```
use warnings;
use strict;

my $start = '0020';
my $end   = '0204';

my $data = join "", <DATA>;
pos ($data) = 0;

while (pos ($data) < length ($data)) {
    last if $data !~ /\G.*?($start)/gis;
    my $begin = pos ($data);
    last if $data !~ /\G.*?($end)/gis;
    my $finish = pos ($data) - length ($end);    # offset where the end token begins
    print ((substr $data, $begin, $finish - $begin) . "\n");
}

__DATA__
045a010000020000004c4553
4f4e4452412043414c4c4544
204241434b2c204920414456
00020400
495345442054484154205745
```

prints:

```
000004c4553
4f4e4452412043414c4c4544
204241434b2c204920414456
00
```

Update: Using `index` per Zaxo's suggestion would probably be faster and cleaner than this. Perl is Huffman encoded by design.
Re: Reading hex data by GrandFather (Saint) on Aug 09, 2005 at 00:43 UTC
There is no `x00020400` in the input data supplied. Should there be more data in your sample? Perl is Huffman encoded by design.
Re: Reading hex data by anonymized user 468275 (Curate) on Aug 09, 2005 at 09:56 UTC
There is a start token in the sample data, but there are 8 hex digits in front of it. Therefore, IMO, the OP has made a typo and means to suggest that:
- the start token is x00020000
- his data is 32-bit (a.k.a. 8 decoded nybbles)

The second trap to avoid is matching across 8 byte boundaries. As the code in one of the replies shows, this can be done using unpack, although there it was expressed as a preference rather than a necessity. One world, one people
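A minimal sketch of that idea, assuming the data is hex text made of 32-bit words (8 hex digits each) and that the tokens are x00020000 and x00020400. The sample string is made up for illustration, with a payload that happens to be word-aligned. Comparing whole words means a token can never match across a word boundary.

```perl
use strict;
use warnings;

# Hypothetical sample: header word, start token, two payload words,
# stop token, and one trailing word.
my $hex = join '', qw(045a0100 00020000 4c45534f 4e445241 00020400 deadbeef);

# Split the hex text into 8-digit (32-bit) words.
my @words = unpack '(A8)*', $hex;

my ($start, $stop) = ('00020000', '00020400');    # assumed token values

# Collect the words between the start and stop tokens.
# If no stop token is present, everything after the start token is kept.
my @payload;
my $inside = 0;
for my $word (@words) {
    if (!$inside) {
        $inside = 1 if $word eq $start;
        next;
    }
    last if $word eq $stop;
    push @payload, $word;
}

# Decode the payload two hex digits at a time.
my $text = join '', map { chr hex } unpack '(A2)*', join '', @payload;
print "$text\n";    # prints LESONDRA
```

A plain substring search over the joined text could also report a token that straddles two words, which is the trap described above.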