kevsurf has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone,
I'm trying to parse some data, but I don't really understand how it's defined. Supposedly these are definitions of the type of data that will be sent to me in each "field". I've asked the suppliers of the data to help me understand the layout, etc., so that I can parse it accordingly.

Here's a snippet:
2|ENTRY_TYPE|%s=%u
2|SERVICE_OPTION|%s=0x%04x
2|NEGOTIATED_SO|%s=0x%04x
2|LAST_MM_SETUP_EVENT|%s=%d
2|CIC_SPAN|%s=%d
2|CIC_SLOT|%s=%d

I believe the first column is the number of characters for the field whatever the datatype. The second column is the name of the field. The third is the format of the field.

I believe the third column is in Unicode and that %s stands for string.

Note: This data is from the cellular industry, so if anyone has experience with any of the major carriers, I'd appreciate any help you could offer.

Can anyone throw me a line?

If you'd like to see the entire data file please email me off list and I'll supply it.

Thanks, Kevin

Replies are listed 'Best First'.
Re: Help with ?unicode? data
by tedrek (Pilgrim) on May 27, 2003 at 03:12 UTC
    At a guess, without seeing the actual data, I read it as something like:
    column1: unknown
    column2: field name
    column3: a sprintf format string
    So SERVICE_OPTION would be something like 'SERVICE_OPTION=0x0fa3'.
    Take a look at sprintf for the codes.

    Of course, I could be way off here. If you post a sample of the data, somebody here could probably figure it out :)
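
A minimal sketch of that reading, assuming each definition line is `column1|name|format` and that the format takes the field name and a value as its two arguments. The helper name `render_field` and the sample values are made up for illustration; nothing here is confirmed by the data supplier.

```perl
use strict;
use warnings;

# Hypothetical helper: split one definition line on '|' and apply the
# third column as a sprintf format to (field name, value).
sub render_field {
    my ($definition, $value) = @_;
    # $count is the first column, whose meaning is still unknown.
    my ($count, $name, $format) = split /\|/, $definition, 3;
    return sprintf $format, $name, $value;
}

print render_field('2|SERVICE_OPTION|%s=0x%04x', 0x0fa3), "\n";  # SERVICE_OPTION=0x0fa3
print render_field('2|CIC_SPAN|%s=%d', 17), "\n";                # CIC_SPAN=17
```

If the real records look like this output, the guess about the third column is probably right; if they do not, a sample of the actual data would settle it.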
Re: Help with ?unicode? data
by graff (Chancellor) on May 27, 2003 at 03:08 UTC
    In what sense is this a Perl(-related) question?

    I doubt that there are very many who would want to see "the entire data file" -- perhaps you could present a snippet of the data (or, if you're worried about sensitive contents, make up pseudo-data that has similar properties, e.g. "random" digits, letters and punctuation positioned to be equivalent to the actual data).

    It seems unlikely that "%s" etc. have anything to do with Unicode. These designations are probably intended for programmers who know the "printf" function, which should include you. You may want to refer to docs or other references for this standard C library function. It is available in Perl, of course (see "perldoc -f printf"), but the Perl manuals take it for granted that you already know about the C function that it emulates.

    In any case, your guess about "%s" referring to a string is probably right; "%d" probably refers to a signed decimal integer value, "%u" to an unsigned decimal integer, and "%04x" to an integer in hexadecimal format (padded with leading zeros to a minimum width of four digits).
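
Going the other way, here is a rough sketch of reading such a value back in Perl. It assumes the data arrives as text already rendered with these formats (for example "SERVICE_OPTION=0x0fa3"), which has not been confirmed. The helper name `parse_field` and the sample strings are hypothetical, and only the three conversions from the snippet are handled.

```perl
use strict;
use warnings;

# Hypothetical helper: parse one "NAME=value" string according to the
# conversion at the end of its format and return (name, numeric value).
# Returns an empty list if the text does not fit the format.
sub parse_field {
    my ($format, $text) = @_;
    my ($name, $raw) = split /=/, $text, 2;
    return unless defined $raw;

    if ($format =~ /%0?\d*x$/) {          # e.g. %04x: hexadecimal digits
        return unless $raw =~ /^0x([0-9a-fA-F]+)$/;
        return ($name, hex $1);
    }
    if ($format =~ /%u$/) {               # unsigned decimal integer
        return unless $raw =~ /^\d+$/;
        return ($name, $raw + 0);
    }
    if ($format =~ /%d$/) {               # signed decimal integer
        return unless $raw =~ /^-?\d+$/;
        return ($name, $raw + 0);
    }
    return;                               # conversion not handled in this sketch
}

my ($name, $value) = parse_field('%s=0x%04x', 'SERVICE_OPTION=0x0fa3');
print "$name => $value\n";   # SERVICE_OPTION => 4003
```

The built-in hex function does the hexadecimal conversion; the regular expressions only check that the text has the shape the format implies before converting it.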

    If you try something with Perl, and it doesn't do what you want or expect, let us know...