Hi All

Particular thanks to tachyon, ikegami and Dave for assisting in getting the code together (see below).

My code is matching and processing data successfully except the following record:
1010950 01.07.200400.00.00008 M3 DN Y 2001.11.200400052.85 00044.95 5001.07.2004Aboriginal or Torres Strait Islander health 5001.07.2004service provided to a person by an eligible 5001.07.2004Aboriginal health worker
The data definition is (with data from the problem record):
Row 1 record10 = integer (length 2) (10). primary key = integer (length 5).(10950) sub item number = integer (length 3).( ) start date = dd.mm.yyyy.(01.07.2004) end date = dd.mm.yyyy.(00.00.0000) category = alphanumeric (length 3).(8 ) group = alphanumeric (length 3).(M3 ) subgroup = alphanumeric (length 3).( ) item type = text (options: S or N).(D) fee type = text (options: N or D).(N) provider type = alphanumeric (length 3).( ) new item = text (options: Y or N).( ) item change = text (options: Y or N).( ) procedure change = text (options: Y or N).( ) description change = text (options: Y or N).( ) fee change = text (options: Y or N).(Y) Row 2 record20 = integer (length 2).(20) start date = dd.mm.yyyy.(01.11.2004) fee = decimal (nnnnn.nn).(00052.85) benefit1 = decimal (nnnnn.nn).( ) benefit2 = decimal (nnnnn.nn).(00044.95) Row 5 record50 = integer (length 2).(50) start date = dd.mm.yyyy.(01.07.2004) description = alphanumeric (length 80).(Aboriginal or Torres Strait Is +lander health service)

I suspect the problem could be matching the group data.

Thanks in advance for any comments or suggestions.

The Code:
#!/usr/bin/perl -w use strict; local $/ = ''; my $date = qr/\d\d\.\d\d.\d\d\d\d/; my $fee = qr/\d\d\d\d\d\.\d\d/; my $digit3 = qr/\d\d\d|\d\s\s|\d\d\s|\s\s\d|\W{3}/; my $alpha3 = qr/\w\w\w|\w\w\s|\w\s\s|\s\w\s|\s\s\w|\s\w\w|\W{3}/; my $one = qr/ ^ \d\d (\d{5}) ($digit3) ($date) ($date) ($alpha3) ($alpha3) ($alpha3) ([SN]|\W{1}) ([ND]|\W{1}) ($alpha3) ([YN]|\W{1}) ([YN]|\W{1}) ([YN]|\W{1}) ([YN]|\W{1}) ([YN]|\W{1}) $ /x; my $two = qr/ ^ \d\d #record20 = integer (length 2).(20) ($date) #start date = dd.mm.yyyy.(01.11.1996) ($fee) #fee = decimal (nnnnn.nn).(00098.05) ($fee) #benefit1 = decimal (nnnnn.nn).(00073.55) ($fee) #benefit2 = decimal (nnnnn.nn).(00083.35) $ /x; my $gen = qr/ ^ \d\d ($date) (.*) $ /x; while(my $rec = <DATA>) { my $pkey, my @out = (); for my $line( split /[\n\r]/, $rec ) { my $type = substr $line,0,2; if ( $type == 10 ) { my @data = $line =~ m/$one/; die error( $line, $rec, $one ) unless @data; $pkey = $data[0]; $out[$type] = [$pkey, $type, @data]; } elsif ( $type == 20 ) { my @data = $line =~ m/$two/; die error( $line, $rec, $two ) unless @data; $out[$type] = [$pkey, $type, @data]; } elsif ( $line =~ m/$gen/ ) { my @data = ( $1, $2 ); if ( defined $out[$type] ) { $out[$type][-1] .= ' ' . $2; # continuing string } else { $out[$type] = [$pkey, $type, @data]; } } else { die error( $line, $rec, $gen ); } } format_output( [@out[10,20,30,40,50]], ",", 'quote' ); } sub format_output { my ($aryref, $sep, $quote ) = @_; for my $ref(@$aryref) { @$ref = map{ s/"/\\"/; qq!"$_"! } @$ref if $quote; print join $sep, @$ref; print "\n"; } } sub error { my ( $line, $rec, $re ) = @_; return "Invalid Line:\n$line\n\nRE:\n$re\n\nRecord:\n$rec\n"; } __DATA__ 1010950 01.07.200400.00.00008 M3 DN Y 2001.11.200400052.85 00044.95 5001.07.2004Aboriginal or Torres Strait Islander health 5001.07.2004service

Janitored by Arunbear - added readmore tags, as per Monastery guidelines


In reply to Can't match record by Anonymous Monk

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.