Help with ?unicode? data

kevsurf has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone,
I'm trying to parse some data, but I don't really understand how it's defined. Supposedly these are definitions of the type of data that will be sent to me in each "field". I've asked the suppliers of the data to help me understand the layout so that I can parse it accordingly.

Here's a snippet:
2|ENTRY_TYPE|%s=%u
2|SERVICE_OPTION|%s=0x%04x
2|NEGOTIATED_SO|%s=0x%04x
2|LAST_MM_SETUP_EVENT|%s=%d
2|CIC_SPAN|%s=%d
2|CIC_SLOT|%s=%d

I believe the first column is the number of characters for the field whatever the datatype. The second column is the name of the field. The third is the format of the field.

I believe the third column is in unicode and that %s stands for string.

Note: This data is from the cellular industry so if anyone has experience with any of the major carriers I'd appreciate any help you could offer.

Can anyone throw me a line?

If you'd like to see the entire data file please email me off list and I'll supply it.

Thanks, Kevin

Re: Help with ?unicode? data by tedrek (Pilgrim) on May 27, 2003 at 03:12 UTC
At a guess, without seeing the actual data, I read it as something like:
column1: unknown
column2: field name
column3: a sprintf format string
So SERVICE_OPTION would be something like 'SERVICE_OPTION=0x0fa3'. Take a look at sprintf for the codes.
Of course I could be way off here. If you post a sample of the data, somebody here could probably figure it out :)
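To make that guess concrete, here is a minimal sketch that feeds the third column straight to sprintf. It assumes the "%s" is filled with the field name (that is only the reading above, not something the supplier has confirmed), and the values in %sample are made up, since the real data was never posted.

```perl
use strict;
use warnings;

# Definition lines copied from the original post.
my @defs = (
    '2|ENTRY_TYPE|%s=%u',
    '2|SERVICE_OPTION|%s=0x%04x',
    '2|CIC_SPAN|%s=%d',
);

# Made-up sample values (assumption: not taken from any real feed).
my %sample = (
    ENTRY_TYPE     => 7,
    SERVICE_OPTION => 0x0fa3,
    CIC_SPAN       => -1,
);

my @rendered;
for my $def (@defs) {
    # Column 1 is still of unknown meaning; column 3 is used as the format.
    my ($col1, $name, $format) = split /\|/, $def, 3;
    push @rendered, sprintf($format, $name, $sample{$name});
}

print "$_\n" for @rendered;
# ENTRY_TYPE=7
# SERVICE_OPTION=0x0fa3
# CIC_SPAN=-1
```

If the real records look like those three lines, the format-string reading is probably right; if not, a sample of the actual data would settle it.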
Re: Help with ?unicode? data by graff (Chancellor) on May 27, 2003 at 03:08 UTC
In what sense is this a Perl(-related) question? I doubt that there are very many who would want to see "the entire data file". Perhaps you could present a snippet of the data (or, if you're worried about sensitive contents, make up pseudo-data that has similar properties, e.g. "random" digits, letters and punctuation positioned to be equivalent to the actual data).

It seems unlikely that "%s" etc. have anything to do with unicode. These designations are probably intended for programmers who know the "printf" function, which should include you. You may want to refer to docs or other references regarding this standard C library function. It is available in Perl, of course (see "perldoc -f printf"), but the Perl manuals take it for granted that you already know about the C function that it emulates.

In any case, your guess about "%s" referring to a string is probably right; "%d" probably refers to a signed decimal integer value, "%u" to an unsigned decimal integer, and "%04x" to an integer in hexadecimal format (padded with leading zeros to a minimum width of four digits).

If you try something with Perl, and it doesn't do what you want or expect, let us know...
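Going the other way, from a received value back to a number, could look something like the sketch below. The decode_value helper is hypothetical, and it assumes each value arrives as the text after the "=" in a "NAME=value" record, which nobody in this thread has confirmed.

```perl
use strict;
use warnings;

# Hypothetical helper: convert the text after "=" back into a number,
# based on the conversion used for the value in the definition line.
sub decode_value {
    my ($format, $text) = @_;

    # Isolate the value part of the format, e.g. "0x%04x" from "%s=0x%04x".
    my ($value_fmt) = $format =~ /=(.*)\z/s;
    return unless defined $value_fmt;

    # %x / %04x: hexadecimal; hex() accepts an optional leading "0x".
    return hex $text if $value_fmt =~ /%\d*x\z/;

    # %d / %u: signed or unsigned decimal.
    return $text + 0 if $value_fmt =~ /%\d*[du]\z/;

    # Anything else is passed through unchanged.
    return $text;
}

print decode_value('%s=0x%04x', '0x0fa3'), "\n";   # 4003
print decode_value('%s=%d',     '-12'),    "\n";   # -12
print decode_value('%s=%u',     '7'),      "\n";   # 7
```

Only the three conversions that appear in the posted snippet are handled; other printf codes would need their own branches.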