Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi All

I have a script that extracts data from a flatfile (using a configuration file for field names) into an excel file. I would like to be able to group each line by a certain field ("bargain reference") and only print a 'totals' line for each one (instead of all lines extracted). To summarise, if 5 lines share the same bargain reference, I would like to print one line to the file which represents a total of all data in the 5 lines.

Code snippet below :

#!/sbcimp/run/pd/perl/5.6.1/bin/perl BEGIN { push (@INC, "/sbcimp/run/pd/perl/5.004_04/lib/site_perl"); push (@INC, "/sbcimp/run/pd/perl/5.004_04/lib/site_perl/sun4-solar +is/"); } use strict; use English; use Getopt::Long; use efSetupEnv; use efLogging; use flex_sql; use transferFile; use G1DateUtils; LogStep "Started: $0 @ARGV"; ###################################################################### +######### # # Optional run-time switches # # my %opt = ( db => "UKWARB", ); GetOptions( \%opt, "db=s", # database "cc=s", # Cost Centre "in=s", # Input file "out=s", # Output file name "ccm=s", # cost centre major instead ) or LogWarning( "WARNING: GetOptions() error" ); ###################################################################### +######### die "-in=InputFile not specified" unless $opt{in}; my $centre = $opt{cc} || $opt{ccm}; @G::result = (); my $web_file_name; my $whereClause; if ( $opt{cc} ) { $whereClause = qq( && F.cost_centre_code eq '$opt{cc}'); $opt{out} ||= "$G::DATADIR/$centre.xls"; $web_file_name = IsoDate. ShortTime ."_$opt{DB}"."CCFeeRebate_$cen +tre.xls"; } elsif ( $opt{ccm} ) { $whereClause = qq( && F.cost_centre_major eq '$opt{ccm}'); $opt{out} ||= "$G::DATADIR/$centre.xls"; $web_file_name = IsoDate. ShortTime ."_$opt{DB}"."CCFeeRebate_$cen +tre.xls"; } else { $opt{out} ||= "$G::DATADIR/backdated_" . IsoDate() . ".xls"; $web_file_name = IsoDate. ShortTime ."_$opt{DB}"."CCFeeRebate_back +dated.xls"; } LogGeneral "Extracting details from $opt{in} for $centre into file $op +t{out}"; #this uses a config file to which maps all fields in the file i am ext +racting from flex_sql qq(select F.accrued_fee_rebate_value, F.actual_cost_of_funds, F.borrow_lend_indicator, F.callable_indicator, F.cash_settlement_mode, F.client_identifier, F.client_major, F.collateral_type_indicator, F.cost_centre_code, F.cost_centre_cross_reference, F.cost_centre_major, F.counterparty_cross_reference, F.current_rate_sign, F.days_open, F.dvp_indicator, F.effective_date, F.fee_rebate_payable_type, F.fund_report_category, F.initial_loan_quantity, F.initial_loan_value, F.initial_settlement_date, F.intercompany_reference, F.interest_type, F.link_reference, F.loan_value, F.loan_value_currency, F.management_charge, F.minimum_fee, F.net_cost_of_funds, F.net_fee_rebate_interest_rate, F.posting_type, F.quantity, F.record_type, F.security_bond_indicator, F.security_class, F.security_class_cross_reference, F.security_code, F.security_code_type, F.security_currency, F.security_cusip_code, F.security_description, F.security_isin_code, F.security_issuer_name, F.security_quick_code, F.security_sedol, F.security_settlement_mode, F.security_ticker, F.security_xref_in_house, F.term_date, F.trade_date, F.trade_reference, F.trade_type, into resultset \@G::result from $opt{in} where F.record_type eq '1' $whereClause ); LogWarning( "No records extracted!") unless ( scalar @G::result ); LogGeneral scalar( @G::result ) . " records extracted"; exit 0; ##send it## sendFile( system => 'GSBLWEB' , file => 'DAILY' , source => "$opt{out}", target => "$web_file_name" ); exit 0; ##subs## sub process { #print to file my $fh = FileHandle->new(">$opt{out}") or die "Unable to create $opt{out}, $OS_ERROR"; # Title stuff for the web - shortly to be converted to sylk format if ($opt{cc}){ $fh->printf( "%s\t%s\n", "Title:", "$opt{db} Fee/rebate detail +s: Month to date for Cost Centre $opt{cc}" ); }else{ $fh->printf( "%s\t%s\n", "Title:", "$opt{db} Fee/rebate detail +s: Month to date for Cost Centre Major $opt{ccm}" ); } $fh->printf( "%s\t'%s\n", "Run Date:", PrettyDate() ); $fh->printf( "%s\t%s\n", "Program:", $0 ); my $rec = $G::result[0]; $fh->print( join( "\t", sort keys %$rec), "\n" ); foreach my $rec ( @G::result ) { LogGeneral("Processing $rec->{\"trade_reference\"}"); #B/L = B & f/r p type = RR if (($rec->{"borrow_lend_indicator"} eq "B") && ($rec- +>{"fee_rebate_payable_type"} eq "RR")) { $rec->{"accrued_fee_rebate_value"} = "+" . $rec->{ +"accrued_fee_rebate_value"}; } #f/r type = FR if ($rec->{"fee_rebate_payable_type"} eq "FP" || "RP") + { $rec->{"accrued_fee_rebate_value"} = "-" . $rec->{ +"accrued_fee_rebate_value"}; } #f/r type = FP if ($rec->{"fee_rebate_payable_type"} eq "FR") { $rec->{"accrued_fee_rebate_value"} = "+" . $rec->{ +"accrued_fee_rebate_value"}; } #f/r type = RR if ($rec->{"fee_rebate_payable_type"} eq "RR") { $rec->{"accrued_fee_rebate_value"} = "+" . $rec->{ +"accrued_fee_rebate_value"}; } #B/L = B if ($rec->{"borrow_lend_indicator"} eq "B") { $rec->{"actual_cost_of_funds"} = "-" . $rec->{"act +ual_cost_of_funds"}; } #B/L = L if ($rec->{"borrow_lend_indicator"} eq "L") { $rec->{"actual_cost_of_funds"} = "+" . $rec->{"act +ual_cost_of_funds"}; } foreach my $field ( sort keys %$rec ) { $fh->print($rec->{$field},"\t"); } $fh->print("\n"); } $fh->close(); } #end process

Can anyone give me any pointers?

Your help is greatly appreciated

Thanks

Replies are listed 'Best First'.
Re: Grouping fields (mimic Excel 'AutoFilter' functionality)
by rdfield (Priest) on Nov 21, 2002 at 11:38 UTC
    The easiest way would appear to be by using a 'group by' clause in your SQL and wrapping a 'sum()' around each non-grouped column. Also, is there any particular reason that you're extracting the resultset to a flat file and re-parsing it rather than using the SQL output directly?

    rdfield

Re: Grouping fields (mimic Excel 'AutoFilter' functionality)
by robartes (Priest) on Nov 21, 2002 at 13:09 UTC
    rdfield has the right idea: databases are built for this kind of thing, so use them - it's all about using the best tool for the job.

    However, in general, if you want to tally categories in Perl, the best way is to use a hash, as in:

    use strict; use Data::Dumper; my %hash; (chomp, $hash{$_}++) for <DATA>; print Dumper(\%hash); __DATA__ camel flea camel dog frog jabberwock frog camel __END__ $VAR1 = { 'flea' => 1, 'camel' => 3, 'dog' => 1, 'jabberwock' => 1, 'frog' => 2 };
    In your case, you would search the input line for a bargain reference, and then increase the value of the entry in the hash corresponding to that bargain reference. Thanks to the magic of autovivifications, the entry for any new bargain references are automatically created for you.

    CU
    Robartes-

      Great - thanks a lot guys.

      One thing though, its not actually extracting from a Sybase database - the flex::SQL runs against a file which is mapped via F.trade_reference etc...

      So I don't have access to group_by or anything, as I am extracting directly from a .dat file...