PerlMonks  

Help with raw data.

by samgold (Scribe)
on Oct 23, 2003 at 19:21 UTC

samgold has asked for the wisdom of the Perl Monks concerning the following question:

I have a file that contains SQL statements. In the SQL statements there is raw data.
When I look at the data using vi it looks like this: '\302U^K'
When I look at it using more it looks like this: 'ÂU '
How can I convert that from raw to hex or raw to decimal?
How does it look to a regular expression?
Any help would be greatly appreciated.

Thanks,
Sam Gold

Replies are listed 'Best First'.
Re: Help with raw data.
by davido (Cardinal) on Oct 23, 2003 at 19:33 UTC
    \302 may be the octal code for A^ (A with a cap over it). The number is too high for it to be a decimal representation of an ASCII character. But that's just one possibility; it may also be Unicode. Perl (current versions) has Unicode support. U is seen literally. And ^K is "control K", which is a character that doesn't actually print visibly using more.

    How can I convert that from raw to hex or raw to decimal?

    You can use unpack, or use ord while iterating over each position in the string, to name a couple starting points.
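
    A minimal sketch of both starting points, assuming the value really is the three bytes that the vi display suggests (octal 302, a literal "U", and control-K). The variable names are made up for illustration.

```perl
use strict;
use warnings;

# The sample from the question, written with explicit escapes:
# \302 is octal for byte 0xC2, \cK is control-K (byte 0x0B).
my $raw = "\302U\cK";

# unpack with the H* template returns the whole string as hex digits.
my $hex = unpack 'H*', $raw;

# ord on each character returns its decimal value.
my @decimal = map { ord($_) } split //, $raw;

# The C* template unpacks the same decimal values in a single call.
my @also_decimal = unpack 'C*', $raw;

print "hex:     $hex\n";
print "decimal: @decimal\n";
```

    With that input the script prints c2550b for the hex form and 194 85 11 for the decimal form.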

    How does it look to a regular expression?

    It looks like a string of three characters: "A^" (I don't know the keystrokes for A with a cap over it), "U", and "^K" (control K). How you write your regexp will determine how the regexp deals with those characters.
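
    As a rough illustration, the same three characters can be spelled in a pattern with octal, hex, or control escapes. The sample line and table name below are hypothetical, and the sketch assumes the data is read as plain bytes.

```perl
use strict;
use warnings;

# Hypothetical sample line; the quoted value holds the three bytes
# shown in the question (octal 302, "U", control-K).
my $line = "INSERT INTO t VALUES ('\302U\cK');";

# Two equivalent ways to spell the same three-byte sequence.
my $octal_form = qr/\302U\cK/;
my $hex_form   = qr/\xC2U\x0B/;

for my $re ($octal_form, $hex_form) {
    print $line =~ $re ? "match\n" : "no match\n";
}
```

    Both patterns should match the sample line, since \302 and \xC2 name the same byte, as do \cK and \x0B.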

    Hope this helps...


    Dave


    "If I had my life to do over again, I'd be a plumber." -- Albert Einstein
Re: Help with raw data.
by hardburn (Abbot) on Oct 23, 2003 at 19:43 UTC

    vim 6.0 added support for unicode, so you might try that for viewing it.

    For converting to hex, I would try a hex dumping program (search for one on Google--I think there is also a vim script for one). Once you have that, you can encode it into a regex using a unicode escape.

    ----
    I wanted to explore how Perl's closures can be manipulated, and ended up creating an object system by accident.
    -- Schemer

    :(){ :|:&};:

    Note: All code is untested, unless otherwise stated

Re: Help with raw data.
by samgold (Scribe) on Oct 23, 2003 at 19:59 UTC
    Thanks for the help Dave and hardburn. I think I will take a look at unpack and ord and see what that can do for me.

    Thanks,
    Sam Gold
Re: Help with raw data.
by samgold (Scribe) on Oct 23, 2003 at 21:00 UTC
    Unfortunately unpack and ord are not the answer. :( So I am trying to use regular expressions to match the ÂU or something like it. The problem is that I don't know how perl sees it, and looking for this \302U^K doesn't seem to work either. Maybe my regular expressions suck. This is what I have tried:
    ( $line =~ m/'*(?!a-zA-Z0-9)'/)
    ( $line =~ m/'\\[0-9]*'/)
    Neither one worked. Maybe I am doing something wrong. Any help is greatly appreciated.
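
    One possible reason neither pattern matches: (?!a-zA-Z0-9) is a negative lookahead for the literal text "a-zA-Z0-9" rather than a character class, and \\[0-9]* looks for an actual backslash followed by digits, whereas the file most likely holds a single byte that vi merely displays as \302. A sketch that instead looks for quoted values containing any byte outside printable ASCII follows. The sample line and the sub name are invented, and SQL-escaped quotes inside a value are not handled.

```perl
use strict;
use warnings;

# Return every single-quoted value that contains at least one byte
# outside the printable ASCII range 0x20-0x7E.
sub quoted_raw_values {
    my ($line) = @_;
    return $line =~ /'([^']*[^\x20-\x7e'][^']*)'/g;
}

# Made-up statement with one plain value and one raw value.
my $line  = "UPDATE t SET a = 'plain', b = '\302U\cK' WHERE id = 1;";
my @found = quoted_raw_values($line);

printf "found %d raw value(s)\n", scalar @found;
```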

    Thanks,
    Sam Gold
      Unfortunately unpack and ord are not the answer.

      Um... what was the question? Based on the original post, I thought it was:

      How can I convert that from raw to hex or raw to decimal?

      So now the next question is: if you think unpack and ord are not the answer, why not? What did you try, what did you hope to get, and what did you actually get instead? Show us some code.

      Based on the original question, I would think something like this would do:

      # assume $sql contains an sql statement read from a file;
      # we want to print the statement, showing visible ascii
      # characters as-is, and converting other things to hex:
      @bytes = split //, $sql;
      for ( @bytes ) {
          if ( /[\x21-\x7e]/ ) { print " $_"; }
          else { printf( " %0.2x", ord ); }
      }
      YMMV, depending on what OS and Perl version you're using... e.g. Redhat 8 (9?) combined with Perl 5.8.0 might need some special pragmas to do this right (use bytes; no utf8; maybe binmode ":raw" on the file handle(s) involved).

      Note that the above will show things like " 20 " for spaces, " 0a " for line-feeds, etc; just add stuff into the character class for "printable ascii" characters if you want whitespace printed out as-is.
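
      A sketch of that variation, under a few assumptions: spaces, tabs, and newlines are added to the character class, only the non-printable bytes are padded with spaces, and the input comes through a ":raw" file handle. The sub name show_raw and the sample statement are made up, and the temporary file exists only to keep the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Show printable ASCII, spaces, tabs and newlines as-is; render every
# other byte as two hex digits surrounded by spaces.
sub show_raw {
    my ($sql) = @_;
    my $out = '';
    for my $char (split //, $sql) {
        $out .= $char =~ /[\x20-\x7e\t\n]/
            ? $char
            : sprintf(' %02x ', ord $char);
    }
    return $out;
}

# Write a made-up sample statement to a temporary file, then read it
# back through a ":raw" handle so no encoding layer touches the bytes.
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "SELECT '\302U\cK';\n";
close $tmp or die "close failed: $!";

open my $in, '<:raw', $path or die "cannot open $path: $!";
my $sql = do { local $/; <$in> };
close $in;

print show_raw($sql);
```

      For the sample statement this prints SELECT ' c2 U 0b '; followed by a newline.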
