PerlMonks  

Converting Mainframe EBCDIC data

by Anonymous Monk
on Sep 27, 2013 at 10:39 UTC ( [id://1055960]=perlquestion )

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi. I am working on a copybook parser that converts mainframe EBCDIC data to ASCII, using the Convert::IBM390 module. The COBOL copybook looks like this:
    15  ABC-NO        PIC S9(09)  COMP.
    15  ABC-ACCT-NO   PIC S9(11)  COMP-3.
    15  XYZ-NO        PIC S9(04)  COMP.
    15  XYZ-NAME      PIC X(10).
Perl code using unpackeb function from Convert::IBM390:
@fields = unpackeb('i, p6.0, s e10', $record);
When I print the unpacked data, the COMP and COMP-3 fields have the wrong values. For the first field, the value should be '000679243', but it is converted to '1565220416'. PS: The string fields are converted properly. Please help. Thanks.

Replies are listed 'Best First'.
Re: Converting Mainframe EBCDIC data
by Corion (Patriarch) on Sep 27, 2013 at 10:46 UTC

    What are the byte-values in the COMP strings? Most likely they got corrupted during transfer. It seems that Convert::IBM390 provides a hexdump function to see the byte values.

    The hex output for the BCD numbers should be (or quite close to):

    00 06 79 24 3F
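
    If the module's hexdump is not at hand, core Perl can show the same information. The following is a minimal sketch, not taken from the thread: it assumes the PIC S9(09) COMP field is stored as a 4-byte big-endian binary integer (the usual mainframe layout), and it builds a made-up sample field rather than reading real data. For a binary field holding 679243, the bytes would be 00 0A 5D 4B rather than BCD digits.

```perl
use strict;
use warnings;

# Hypothetical sample: the four bytes a big-endian 32-bit integer
# holding 679243 would occupy (assumed layout for PIC S9(09) COMP).
my $field = pack 'N', 679243;

# Space-separated hex dump of the raw bytes.
my $hex = join ' ', map { sprintf '%02X', $_ } unpack 'C*', $field;
print "$hex\n";            # 00 0A 5D 4B

# Decode as a signed big-endian 32-bit integer
# (the '>' modifier needs Perl 5.10 or later).
my $value = unpack 'l>', $field;
printf "%09d\n", $value;   # 000679243
```

    If the dump of the real record shows different bytes at that offset, the data was probably altered before it reached the script, for example by a text-mode transfer.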

    Personally, I prefer the manual approach of using substr to get at the bytes and then Encode::decode to convert strings from CP1047 if needed. I also unpack the BCD numbers manually.
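
    The manual approach could look roughly like the sketch below. It is an illustration under assumptions, not Corion's actual code: the offsets and lengths are derived from the copybook in the question (4-byte COMP, 6-byte COMP-3, 2-byte COMP, 10-byte text), the binary fields are assumed to be big-endian, and the test record and its values are invented.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical test record built to match the copybook in the question
# (assumed sizes: 4-byte COMP, 6-byte COMP-3, 2-byte COMP, 10-byte text).
my $record = pack('l>', 679243)
           . pack('H*', '12345678901c')
           . pack('s>', -42)
           . encode('cp1047', 'JOHN SMITH');

my $abc_no   = unpack 'l>', substr($record, 0, 4);    # binary fullword
my $packed   = unpack 'H*', substr($record, 4, 6);    # packed-decimal nibbles
my $xyz_no   = unpack 's>', substr($record, 10, 2);   # binary halfword
my $xyz_name = decode('cp1047', substr($record, 12, 10));

# The last nibble of a packed field is the sign:
# d means negative, c or f means positive/unsigned.
my $sign    = chop $packed;
my $acct_no = ($sign eq 'd' ? -1 : 1) * $packed;

print join('|', $abc_no, $acct_no, $xyz_no, $xyz_name), "\n";
```

    Only the text field goes through Encode; the binary and packed fields must be read from the untranslated bytes, which is why the file has to arrive as a binary transfer.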

    If the module works for you in other cases, most likely the data went bad.

      Thanks for your reply. This is the first time I am using the Convert::IBM390 module. I tried Convert::EBCDIC, but it also had issues converting COMP fields. Any pointers on how to manually unpack the COMP fields using Encode::decode or Code Page 1047? Thanks again. Really appreciate it.

        See substr for getting at the bytes, and then Encode for documentation about the decode function. For decoding COMP fields, I use the following:

        sub decode_COMP3 {
            my $digits = unpack "H*", $_[0];
            #warn "COMP-3: $digits\n" if $digits =~ m!^0?404040!;
            if( $digits =~ m!^(?:40)+(?:00|f0)*$!i ) {
                warn "Invalid (space) content for field detected: $digits.";
                $digits=~ s![4f]0!00!g;
            } elsif( $digits =~ /[abcdef]\d/ ) {
                my $old= $digits;
                $digits=~ s![a-f]!0!g;
                warn "Invalid number/misaligned content for field detected [$old] -> [$digits].";
            };
            my $sign = chop $digits;
            if    ($sign eq 'D' or $sign eq 'd') { $sign = '-' }
            elsif ($sign eq 'C' or $sign eq 'c') { $sign = '+' }
            elsif ($sign eq 'F' or $sign eq 'f') { $sign = ' ' }
            elsif ($sign =~ /^\d+$/)             { $digits .= $sign; $sign = '' }
            else                                 { $digits .= $sign; $sign = '?' };
            "$sign$digits"
        };

        If you have numbers with an implicit decimal point, you need to add that in afterwards.
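
        One way to add the point afterwards is plain string manipulation, which avoids floating-point rounding. This is a minimal sketch: add_implied_decimal is a hypothetical helper name, and the scale (2 for something like PIC S9(7)V99) has to come from the copybook, since the data itself does not carry it.

```perl
use strict;
use warnings;

# Hypothetical helper: place an implied decimal point $scale digits from
# the right of an already decoded digit string (optional leading sign).
sub add_implied_decimal {
    my ($number, $scale) = @_;
    my ($sign, $digits) = $number =~ /^\s*([+-]?)(\d+)$/
        or die "Not a decoded number: '$number'";
    return "$sign$digits" if $scale <= 0;

    # Left-pad with zeros so at least one digit precedes the point.
    if (length($digits) <= $scale) {
        $digits = ('0' x ($scale + 1 - length $digits)) . $digits;
    }
    substr($digits, -$scale, 0) = '.';
    return "$sign$digits";
}

print add_implied_decimal('-0012345', 2), "\n";   # -00123.45
print add_implied_decimal('+5', 2), "\n";         # +0.05
```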

Re: Converting Mainframe EBCDIC data
by roboticus (Chancellor) on Sep 27, 2013 at 11:49 UTC

    Anonymonk:

    I have some Perl code to do this conversion, but I can't seem to find it at the moment. If you don't mind using C, though, I *did* trip across one of my older C programs that did this sort of conversion. It's not pretty, but it might be useful. Towards the end, you just convert your PIC statements into a few function calls to do the conversion work. All the stuff before main is helper routines that keep the conversion section simple.

    Much of it is boilerplate (in readmore tags so uninterested people can skip it).

    The only bit you need to work with is this section, where you use the functions to split the mainframe file up and convert it into an ASCII flat file:

            // **************************************************
            // * YOU NEED TO CUSTOMIZE THE CODE BETWEEN HERE... *
            // **************************************************

            // Now we use Uunpack, Sunpack, and xlat to translate the fields.
            //
            // Let's pretend our input record is:
            //
            //      MERCHANT-NUMBER   PIC 9(9) COMP-3.
            //      STORE-NUMBER      PIC 9(9) COMP-3.
            //      CREATED-DATE      PIC 9(8) COMP-3.  (FMT=YYYYMMDD)
            //      MERCHANT-NAME     PIC X(32).
            //      OWNER-NAME        PIC X(32).
            //      CURRENT-BALANCE   PIC S9(8)v99.
            //
            // The first two fields are packed unsigned numeric, using an odd number
            // of digits, so we use Uunpack:
            dst = Uunpack(dst, inbuf+0, 5);
            dst = Uunpack(dst, inbuf+5, 5);

            // The date is also a packed unsigned numeric, but has only 8 digits, so
            // we tell Uunpack to skip the first digit (otherwise our date would look
            // like '0YYYYMMDD'.
            dst = Uunpack(dst, inbuf+10, 5, 1);

            // The next two fields are simple text fields. This is the easy
            // translation bit. You could call xlat twice, but since they're
            // adjacent, I'll translate both text fields at the same time:
            xlat(dst, inbuf+15, 64);

            // Signed numbers are similar to the unsigned, but they have trailing
            // signs. A fancy program would move the sign to the front. This is
            // decidedly *not* a fancy program.
            dst = Sunpack(dst, inbuf+74, 6);

            // Adding field delimiters, end of record markers, etc., is pretty
            // trivial. Here we'll add a CR+LF at the end of each line:
            *dst++ = '\r';
            *dst++ = '\n';
            *dst++ = 0;

            // *****************************
            // * END OF CUSTOMIZED SECTION *
            // *****************************

            // Write our translated record
            fputs(outbuf, fout);
        }
        printf("%lu reads\n", recs);
        fclose(fin);
        fclose(fout);
    }

    Update: Put the boilerplate code in readmore tags to shorten the node a bit.

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: Converting Mainframe EBCDIC data
by swampyankee (Parson) on Sep 27, 2013 at 14:01 UTC

    EBCDIC <=> ASCII conversion has always been fraught, and cannot be done with 100% success, as the character sets represented are not identical (and there is the problem of which EBCDIC...). I've done a lot of porting to & from IBM platforms and it is not as straightforward as it should be. EBCDIC isn't the only issue.


    Information about American English usage here and here. Floating point issues? Please read this before posting. — emc

Re: Converting Mainframe EBCDIC data
by JohnYEC (Initiate) on Mar 23, 2014 at 22:34 UTC

    About 10 years ago I was receiving mainframe EBCDIC files to our Windows servers and converting them to ASCII with the simple script below. It was handy since it could be customized to any non-standard character set desired:

    # e2a - convert an EBCDIC file to ASCII.
    @E2A = qw(
        00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F
        10 11 12 13 14 15 16 17 18 19 1A 1B 1C 1D 1E 1F
        20 21 22 23 24 25 26 27 28 29 2A 2B 2C 2D F8 2F
        F8 F8 32 33 34 35 36 37 38 39 3A 3B 3C 3D F8 3F
        20 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 3C 28 2B 7C
        26 F8 F8 F8 F8 F8 F8 F8 F8 F8 21 24 2A 29 3B 5E
        2D 2F F8 F8 F8 F8 F8 F8 F8 F8 7C 2C 25 5F 3E 3F
        F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 3A 23 40 27 3D 22
        F8 61 62 63 64 65 66 67 68 69 F8 F8 F8 F8 F8 F8
        F8 6A 6B 6C 6D 6E 6F 70 71 72 F8 F8 F8 F8 F8 F8
        F8 7E 73 74 75 76 77 78 79 7A F8 F8 F8 5B F8 F8
        F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 F8 5D F8 F8
        7B 41 42 43 44 45 46 47 48 49 F8 F8 F8 F8 F8 F8
        7D 4A 4B 4C 4D 4E 4F 50 51 52 F8 F8 F8 F8 F8 F8
        5C F8 53 54 55 56 57 58 59 5A F8 F8 F8 F8 F8 F8
        30 31 32 33 34 35 36 37 38 39 F8 F8 F8 F8 F8 F8
    ) ;

    open (FILE, $ARGV[0]) or die "Can't read $ARGV[0]: $! ";
    while (<FILE>) {
        s/(.)/chr(hex($E2A[ord($1)]))/eg;
        print;
    }
