jayw has asked for the wisdom of the Perl Monks concerning the following question:

Great and wise perl monks. I seek wisdom.

I have a UNIX shell script on Sun/Solaris 10. In my script I ftp a file to the MVS mainframe (binary mode of course). There are several fields in each row of data in the file that need to be compressed into the MVS packed decimal (comp-3) format prior to ftp'ing the file to the mainframe dataset. I want to use a perl inline command (see below) in the UNIX shell script (if possible) in order to make those field conversions.

/usr/bin/perl -wpe 's/([0-9A-Za-z]{9})/pack("C",hex($1))/ge' < input_file > output_file

What I am wondering is: A. Is this even possible? B. If it is possible, is there a specific template to use in the pack function call (i.e. I don't think "C" is the correct one) in order to convert specific fields in each row of data to packed decimal (comp-3) format? (The fields on MVS are PIC 9(7) COMP-3, PIC 9(9) COMP-3, etc.)

Many kind thanks for your specific response regarding this specific solution. (I did read through the history logs but did not find a mention of this specific solution). jayw


Replies are listed 'Best First'.
Re: ASCII text on UNIX (Sun/Solaris 10) to MVS packed decimal (comp-3) using perl pack inline command
by roboticus (Chancellor) on Jul 30, 2009 at 01:19 UTC
    jayw:

    If your file is typical of the ones I work with, you're not going to do it with a simple one-liner. (Unless, of course, you don't mind long lines.)

    The COMP-3 format is basically a packed BCD scheme with a trailing sign indicator. Each data byte contains two digits of your number, except the last byte, which holds one digit and the sign. So if you take a number, say 54106, and you want to put it into a PIC 9(7) COMP-3 field, you:

    1. Convert the number into a string, and pad it with leading zeroes to 7 (the number of digits in your field) characters: "0054106".
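    2. Add the sign character to the end (I typically use A for unsigned, B or D for negative, and C or E or F for positive; the codes the mainframe itself normally writes are F for unsigned, D for negative, and C for positive).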
    3. Now you have a string representation of a hexadecimal number, four bytes long, that contains your value: "0054106C".
    4. Then convert the string to binary for the field, and you've finished that field.

    (Converting that to code, adding error checking, etc., is left as an exercise for the reader.)

    The main problem is that you need to pack in multiple COMP-3 fields, perhaps some floating point values (trickier to convert!) and perhaps some EBCDIC conversion as well. What fun!

    I decode these sorts of files all the time. Tell me "I want your crappy C program to decode a mainframe file", and give me your EMail address, and I'll ship it to you. (I'm flying out of town and will be back Tuesday evening, so don't expect a reply before Wednesday morning.)

    ...roboticus

      Ah, someone actually asked for my crappy C program. I'm including it here, so I don't have to look back in my archives for it again.

      To use it, you just have to look over your record definitions, and use Uunpack, Sunpack and xlat to translate each chunk. Look for the two comments in blocks of asterisks to see how to do it. I used to have a perl script that would read a COBOL record definition and spit out the set of Uunpack, Sunpack and xlat calls needed, but I can't locate that script. If I trip across it (and remember), I'll try to append it to this node.

      /* ebcdic2ascii.cpp
       *
       * convert an EBCDIC file to ASCII.
       *
       * gcc -funsigned-char ebcdic2ascii.cpp
       *
       * Unsigned char else the array lookup will fail....
       */
      #include <stdio.h>

      char xlat_ebcdic_to_ascii[256] = {
      /*       0   1   2   3   4   5   6   7   8   9   A   B   C   D   E   F
       *      --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- */
      /* 0 */  1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,
      /* 1 */  1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,
      /* 2 */  1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,
      /* 3 */  1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,
      /* 4 */ ' ', 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,'?','.','<','(','+','|',
      /* 5 */ '&', 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,'!','$','*',')',';','?',
      /* 6 */ '-','/', 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,'|',',','%','_','>','?',
      /* 7 */  1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,'`',':','#','@','\'','=','~',
      /* 8 */  1 ,'a','b','c','d','e','f','g','h','i', 1 , 1 , 1 , 1 , 1 , 1 ,
      /* 9 */  1 ,'j','k','l','m','n','o','p','q','r', 1 , 1 , 1 , 1 , 1 , 1 ,
      /* A */  1 ,'~','s','t','u','v','w','x','y','z', 1 , 1 , 1 , 1 , 1 , 1 ,
      /* B */  1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 , 1 ,
      /* C */ '{','A','B','C','D','E','F','G','H','I', 1 , 1 , 1 , 1 , 1 , 1 ,
      /* D */ '}','J','K','L','M','N','O','P','Q','R', 1 , 1 , 1 , 1 , 1 , 1 ,
      /* E */ '\\',0 ,'S','T','U','V','W','X','Y','Z', 1 , 1 , 1 , 1 , 1 , 1 ,
      /* F */ '0','1','2','3','4','5','6','7','8','9', 1 , 1 , 1 , 1 , 1 , 1
      };

      char tohex(char v) {
          // NOTE: if signed character, be sure it's not negative! So ignore the
          // warning, as it alerts you to a compile error (i.e. missing -funsigned-char)
          if ( (v<0) || (v>15) ) {
              return '#';
          }
          return "0123456789ABCDEF"[v];
      }

      // convert <src> to ascii, stick in <*dst> and return next position
      char *ebcdic_to_ascii(char *dst, char src) {
          char x = xlat_ebcdic_to_ascii[src];
          *dst++ = x;
          return dst;
      }

      // convert <len> chars at <*src> to ascii, storing them at <*dst> and
      // returning the next position
      char *xlat(char *dst, char *src, int len) {
          while (len--) {
              dst = ebcdic_to_ascii(dst, *src++);
          }
          return dst;
      }

      // Nibble-to-character table: digits 0-9, then the sign nibbles A-F
      char packed_to_ascii[] = "0123456789 -+-++";

      // SIGNED unpack: show trailing sign
      char *Sunpack(char *dst, char *src, int len, int skip=0) {
          // Skip digits as requested...
          while (skip > 1) {
              // Skip entire bytes
              ++src;
              skip -= 2;
              --len;
          }
          if (skip) {
              // Skip next digit (i.e. MSB)
              *dst++ = packed_to_ascii[ (*src++) & 0x0f ];
              --len;
          }
          // Emit remaining digits
          while (len--) {
              *dst++ = packed_to_ascii[ (*src>>4) & 0x0f ];
              *dst++ = packed_to_ascii[ (*src++) & 0x0f ];
          }
          return dst;
      }

      // UNSIGNED unpack: skip sign
      char *Uunpack(char *dst, char *src, int len, int skip=0) {
          // Skip digits as requested...
          while (skip > 1) {
              // Skip entire bytes
              ++src;
              skip -= 2;
              --len;
          }
          if (skip) {
              // Skip next digit (i.e. MSB)
              *dst++ = packed_to_ascii[ (*src++) & 0x0f ];
              --len;
          }
          // Emit remaining digits, except for last byte
          while (len-- > 1) {
              *dst++ = packed_to_ascii[ (*src>>4) & 0x0f ];
              *dst++ = packed_to_ascii[ (*src++) & 0x0f ];
          }
          // For last byte, show MSB only (last digit before sign)
          *dst++ = packed_to_ascii[ (*src>>4) & 0x0f ];
          return dst;
      }

      #define IRECSIZE 500    // input record size
      #define ORECSIZE 2048   // Room for translated/unpacked record

      int main(int argc, char **argv) {
          if (argc != 3) {
              puts("Missing INFile or OUTFile!");
              return 1;
          }

          // Open the input and output files
          FILE *fin = fopen(argv[1], "rb");
          FILE *fout = fopen(argv[2], "w");
          if (!fin || !fout) {
              puts("Can't open a file!");
              return 2;
          }

          char inbuf[IRECSIZE+20];
          char outbuf[ORECSIZE];
          long recs = 0;
          while (!feof(fin)) {
              // Read the record
              size_t bytes_read = fread(inbuf, 1, IRECSIZE, fin);
              ++recs;
              if (IRECSIZE != bytes_read) {
                  if (!bytes_read)
                      printf("%ld: EOF detected?\n", recs);
                  else
                      printf("%ld: Short record (saw %lu, expected %d) found.\n",
                             recs, (unsigned long)bytes_read, IRECSIZE);
                  continue;
              }

              // Translate input fields into output buffer.
              char *dst = outbuf;

              // ***************************************
              // * YOU NEED TO CUSTOMIZE STARTING HERE *
              // ***************************************
              // Now we use Uunpack, Sunpack, and xlat to translate the fields.
              //
              // Let's pretend our input record is:
              //
              //      MERCHANT-NUMBER  PIC 9(9) COMP-3.
              //      STORE-NUMBER     PIC 9(9) COMP-3.
              //      CREATED-DATE     PIC 9(8) COMP-3.      (FMT=YYYYMMDD)
              //      MERCHANT-NAME    PIC X(32).
              //      OWNER-NAME       PIC X(32).
              //      CURRENT-BALANCE  PIC S9(8)V99 COMP-3.
              //
              // The first two fields are packed unsigned numeric, using an odd number
              // of digits, so we use Uunpack:
              dst = Uunpack(dst, inbuf+0, 5);
              dst = Uunpack(dst, inbuf+5, 5);

              // The date is also a packed unsigned numeric, but has only 8 digits, so
              // we tell Uunpack to skip the first digit (otherwise our date would look
              // like '0YYYYMMDD').
              dst = Uunpack(dst, inbuf+10, 5, 1);

              // The next two fields are simple text fields. This is the easy
              // translation bit. You could call xlat twice, but since they're
              // adjacent, I'll translate both text fields at the same time.
              // (Keep the returned pointer, or the next field overwrites the text.)
              dst = xlat(dst, inbuf+15, 64);

              // Signed numbers are similar to the unsigned, but they have trailing
              // signs. A fancy program would move the sign to the front. This is
              // decidedly *not* a fancy program. The balance starts at offset
              // 5+5+5+32+32 = 79.
              dst = Sunpack(dst, inbuf+79, 6);

              // Adding field delimiters, end of record markers, etc., is pretty
              // trivial. Here we'll add a CR+LF at the end of each line:
              *dst++ = '\r';
              *dst++ = '\n';
              *dst++ = 0;

              // *****************************
              // * END OF CUSTOMIZED SECTION *
              // *****************************

              // Write our translated record
              fputs(outbuf, fout);
          }
          printf("%ld reads\n", recs);
          fclose(fin);
          fclose(fout);
      }

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

Re: ASCII text on UNIX (Sun/Solaris 10) to MVS packed decimal (comp-3) using perl pack inline command
by Anonymous Monk on Jul 29, 2009 at 23:05 UTC

    In your character class [0-9A-Za-z] the characters [G-Zg-z] are not hexadecimal digits so hex($1) will not be able to convert them to anything.

    Posted by jwkrahn.

Re: ASCII text on UNIX (Sun/Solaris 10) to MVS packed decimal (comp-3) using perl pack inline command
by graff (Chancellor) on Jul 30, 2009 at 04:39 UTC
    I want to use a perl inline command ... in the UNIX shell script (if possible) in order to make those field conversions.

    In terms of using Perl in the context of shell scripting, it's often easier (IMHO) to just put the intended Perl operation(s) into an executable perl file, and invoke that file as a shell command, rather than trying to shoe-horn a perl "one-liner" into the shell script.

    Numerous advantages of that approach:

    • You'll probably figure out numerous ways for the perl script to take over a larger portion of what the shell script needs to do (and the perl script will be more compact, more flexible/adaptable, easier to read and maintain, etc).

    • You can easily include POD in the perl script that will be cleaner and easier to read than the typical comments in a shell script. The effectiveness of any source code can be judged by the quality and accessibility of its documentation.

    • You can do more error checking, and produce better warnings and error messages, more easily in Perl than in a shell script.