jarn has asked for the wisdom of the Perl Monks concerning the following question:

I have been assigned to extract certain data out of a data file that is in a report format. Not even sure where to begin with this one. I do not need any of the header, just the data in each column. I have attached (in 'code' form) a sample of the file. This file could be 10 pages long, but should always follow this same format. Note: This is a wide report. Any suggestions would be wonderful! Thanks!
*****.0.0[=3h 12/09/2009 ADVISOR PERFORMANCE REPORT 3611
11:31:02 DEPT: ALL, MAKE: ALL, BILL TYPE: CWI, FROM: 12/03/2009 TO: 12/03/2009 PAGE 9
ADVISORS: ALL
SERVICE ADVISOR MK DEPT RO.CNT
------------------- -- ---- ---------
CUSTOMER WARRANTY INTERNAL CUSTOMER WARRANTY INTERNAL LABOR
LABOR SALE LABOR SALE LABOR SALE PARTS SALE PARTS SALE PARTS SALE SUPPLIES DISCOUNTS DISCNT TOTALS
------------ ------------ ------------ ------------ ------------ ------------ --------- --------- ------ ------------
STEVEN JONES FO SERV 1 (0) C(0) W(0) I(0)
SALES 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00%
GROSS PROFIT $ 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
GROSS PROFIT % 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
AVG. SALES/RO 0.00 0.00 0.00 0.00 0.00 0.00 0.00
TOTAL HOURS 0.00 0.00 0.00 0.00
AVG. HOURS/RO 0.00 0.00 0.00 0.00
EFF LBR RATE 0.00 0.00 0.00 0.00
PRESS RETURN FOR
.2.0 12/09/2009 ADVISOR PERFORMANCE REPORT 3611
11:31:05 DEPT: ALL, MAKE: ALL, BILL TYPE: CWI, FROM: 12/03/2009 TO: 12/03/2009 PAGE 10
ADVISORS: ALL
SERVICE ADVISOR MK DEPT RO.CNT
------------------- -- ---- ---------
CUSTOMER WARRANTY INTERNAL CUSTOMER WARRANTY INTERNAL LABOR
LABOR SALE LABOR SALE LABOR SALE PARTS SALE PARTS SALE PARTS SALE SUPPLIES DISCOUNTS DISCNT TOTALS
------------ ------------ ------------ ------------ ------------ ------------ --------- --------- ------ ------------
**** T O T A L S **** FOR STEVEN JONES 1 (0) C(0) W(0) I(0)
SALES 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00%
GROSS PROFIT $ 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
GROSS PROFIT % 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00% 0.00%
AVG. SALES/RO 0.00 0.00 0.00 0.00 0.00 0.00 0.00
TOTAL HOURS 0.00 0.00 0.00 0.00
AVG. HOURS/RO 0.00 0.00 0.00 0.00
EFF LBR RATE 0.00 0.00 0.00 0.00
------------------------------------------------------------------------------------------------------------------------------------
PRESS RETURN FOR

Re: Report File Extract
by roboticus (Chancellor) on Dec 09, 2009 at 18:15 UTC

    jarn:

    I'd suggest using a regex to pick out the lines you want and ignoring all the others. You'd do something like:

    open my $INF, '<', 'your filename goes here'
        or die "Can't open file. $!";
    while (<$INF>) {
        next unless /your regex goes here/;
        # at this point, you have only the lines you want, so you
        # can store them and do something with the data later, or
        # parse out the fields you want and do something, or just
        # reformat and print the reformatted report.
    }
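To make the "your regex goes here" part a bit more concrete, here is a minimal sketch that keeps only the data rows and pulls the numbers out of them. The row labels are copied from the sample report in the question; the `parse_row` helper, the number pattern, and the assumption that each label sits at the start of its line are mine, not something the report format guarantees, so adjust them against the real file.

```perl
use strict;
use warnings;

# Row labels taken from the sample report in the question.
my @labels = (
    'SALES', 'GROSS PROFIT $', 'GROSS PROFIT %', 'AVG. SALES/RO',
    'TOTAL HOURS', 'AVG. HOURS/RO', 'EFF LBR RATE',
);

# quotemeta escapes the '$', '.', '/' and spaces inside the labels.
my $alt    = join '|', map { quotemeta } @labels;
my $row_re = qr/^\s*($alt)\s+(.*)/;

# Hypothetical helper: returns (label, values...) for a data row,
# or an empty list for headers, rules, prompts and everything else.
sub parse_row {
    my ($line) = @_;
    return unless $line =~ $row_re;
    my ($label, $rest) = ($1, $2);
    # Assumption: values look like 0.00 or 0.00%, optionally negative.
    my @values = $rest =~ /(-?[\d,]*\.\d+%?)/g;
    return ($label, @values);
}

# A few lines in the shape of the sample; spacing is made up.
my @sample = (
    'SERVICE ADVISOR      MK DEPT RO.CNT',
    'SALES              0.00   0.00   0.00   0.00%',
    'GROSS PROFIT %     0.00%  0.00%  0.00%',
    'PRESS RETURN FOR',
);

for my $line (@sample) {
    my ($label, @values) = parse_row($line) or next;
    print "$label: @values\n";
}
```

Building the alternation from a list keeps the labels in one place, so adding a row type means adding one string rather than editing the regex by hand.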

    ...roboticus

Re: Report File Extract
by bv (Friar) on Dec 09, 2009 at 19:02 UTC

    First, decide how you want to store your data. Then decide how to populate your data structure. I would do something along the lines of a dispatch table populating a hash, like so:

    my %dispatch = (
        # Keys are padded to 14 characters to match the substr below.
        'SALES         ' => \&do_sales_stuff,
        'GROSS PROFIT $' => \&do_gpdollar_stuff,
        'GROSS PROFIT %' => \&do_gppercent_stuff,
        'AVG. SALES/RO ' => \&do_avgsales_stuff,
        'TOTAL HOURS   ' => \&do_hours_stuff,
        'AVG. HOURS/RO ' => \&do_avghours_stuff,
        'ETC, ETC, ETC ' => \&do_whatever,
    );

    my $salesperson;
    my %data;

    while (<>) {
        my $rowid = substr $_, 0, 14;
        # Returns true for header lines and for lines naming a new
        # salesperson; in the latter case it also updates $salesperson.
        next if discard_header_or_get_new_salesperson($_, \$salesperson);
        ($dispatch{$rowid} || \&unhandled_stuff)->(\$data{$salesperson}, $_);
    }
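For what it's worth, here is one way the dispatch-table idea could be filled in so that it actually runs. Everything beyond the general shape is an assumption on my part: `store_as`, `numbers`, and `parse_report` are hypothetical names, the advisor-line pattern guesses that the name is followed by at least two spaces, a two-letter make, a department, and a count, and the 14-character label width comes from the `substr` in the snippet above rather than from the real file.

```perl
use strict;
use warnings;

# Pull the numeric columns out of the part of a row after the label.
sub numbers { return $_[0] =~ /(-?[\d,]*\.\d+)%?/g }

# Hypothetical handler factory: stores a row's numbers under $key.
sub store_as {
    my ($key) = @_;
    return sub {
        my ($record, $line) = @_;    # $record is \$data{$advisor}
        $$record->{$key} = [ numbers(substr $line, 14) ];
    };
}

# Pad each label to 14 characters so it matches substr($line, 0, 14).
my %labels = (
    'SALES'          => 'sales',
    'GROSS PROFIT $' => 'gross_profit',
    'TOTAL HOURS'    => 'total_hours',
);
my %dispatch;
$dispatch{ sprintf '%-14s', $_ } = store_as($labels{$_}) for keys %labels;

sub parse_report {
    my (@lines) = @_;
    my ($advisor, %data);
    for my $line (@lines) {
        # Skip the per-advisor totals block so it does not overwrite data.
        if ($line =~ /^\*{4} T O T A L S/) { $advisor = undef; next }
        # Assumption: an advisor line looks like "STEVEN JONES   FO SERV   1".
        if ($line =~ /^([A-Z][A-Z .'-]+?)\s{2,}[A-Z]{2}\s+[A-Z]+\s+\d+/) {
            $advisor = $1;
            next;
        }
        next unless defined $advisor;
        my $handler = $dispatch{ substr $line, 0, 14 } or next;  # headers etc.
        $handler->(\$data{$advisor}, $line);
    }
    return \%data;
}

# Lines in the shape of the sample; spacing and values are made up.
my $data = parse_report(
    'SERVICE ADVISOR      MK DEPT RO.CNT',
    'STEVEN JONES         FO SERV      1',
    'SALES              10.00   0.00   5.50',
    'TOTAL HOURS         1.20   0.00   0.40',
);
for my $name (sort keys %$data) {
    for my $key (sort keys %{ $data->{$name} }) {
        print "$name $key: @{ $data->{$name}{$key} }\n";
    }
}
```

Unknown row labels are silently skipped here instead of going to an "unhandled" routine; swapping the `or next` for a fallback handler would restore that behaviour.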

    @_=qw; Just another Perl hacker,; ;$_=q=print "@_"= and eval;