I use the following module to parse COBOL copy structures and to create the appropriate decoders:

package PIC; use strict; use Encode qw(decode encode); use POSIX qw(ceil); use Carp qw(croak); use vars qw(%parse_PIC); =head2 C<decode_COMP3> Decodes a BCD number with trailing sign. =cut sub decode_COMP3 { # 0C -> + # 0D -> - # 0F -> (unsigned) my $digits = unpack "H*", $_[0]; my $sign = chop $digits; #print "$digits\n"; if ($sign eq 'D' or $sign eq 'd') { $sign = '-' } elsif ($sign eq 'C' or $sign eq 'c') { $sign = '+' } elsif ($sign eq 'F' or $sign eq 'f') { $sign = ' ' } else { $digits .= $sign; $sign = '?' }; "$sign$digits" }; =head2 C<make_COMP3_decoder PRE, POST, SEP> Returns a subroutine that parses the digits into PRE places before and POST places after the decimal delimiter and returns a string with SEP as the decimal delimiter =cut sub make_COMP3_decoder { my ($pre,$post,$sep) = @_; $sep ||= ","; return sub { my $res = decode_COMP3($_[0]); substr($res,$pre+2,0) = $sep; $res }; }; =head2 C<decode_EBCDIC> Decodes an EBCDIC string to a Perl internal string Shorthand for decode( 'cp1047', $_[0] ) =cut sub decode_EBCDIC { decode( 'cp1047', $_[0] )}; # The patterns for things I recognize in the COBOL copy structures I e +ncounter %parse_PIC = ( qr/^PIC\s+S?9\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC +}, qr/^PIC\s+X\s*\((\d+)\)$/ => sub { return 0+$1, \&decode_EBCDIC }, qr/^PIC\s+S?9\s*\((\d+)\)\s*COMP-3$/ => sub { return ceil(($1+1) / +2), \&decode_COMP3 }, qr/^PIC\s+S?9\s*\((\d+)\)V9\((\d+)\)\s*(?:USAGE\s+)?COMP-3$/ => su +b { return ceil(($1+$2+1)/2), make_COMP3_decoder($1,$2,$_[0]) }, qr/^PIC\s+S?9\s*\((\d+)\)V\s*COMP-3$/ => sub { return ceil(($1+1) +/2), \&decode_COMP3 }, qr/^PIC\s+S?9\s*\((\d+)\)\s+OCCURS\s+(\d+)$/ => sub { return $1*$2 +, \&decode_EBCDIC }, ); =head2 C<decode_PIC> Returns two values: =over 4 =item * Length in bytes of the PIC expression =item * Code reference that decodes the passed value according to the +copy structure =back =cut sub decode_PIC { my ($pic,$sep) = @_; for my $re (keys %parse_PIC) { if ($pic =~ /$re/) { return ($parse_PIC{$re}->($sep)); }; }; croak "Couldn't decode '$pic'"; }; =head2 C<PIC_length> Returns the length of the expression in bytes =cut sub PIC_length { my ($pic) = @_; my ($len,$decoder) = decode_PIC($pic); $len }; 1;

In reply to Re: EBCDIC and COBOL records by Corion
in thread EBCDIC and COBOL records by plegall

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.