Re: Simplifying my script...

A few observations:

Never use variables to name other variables (see perlreftut).
Use descriptive variable names. Instead of $ano and $bno (as well as $anum and $bnum!), how about $from and $to? Or $src and $dest?
Never use $a or $b by themselves (because use strict won't catch them — they're special-cased for use in sort BLOCK constructs).
Take repetitive work out of the loop (as others have already pointed out, you are reading in that bcodedb file for each incoming line — unless you have a good reason to avoid this, such as memory size).
Use regexes in preference to manipulating strings character-by-character.
Use equality when you mean equality; use regexes only when you really want regex matching.
Either use $_ directly, or assign it to a (well-named!) variable at the time you read it in.
As others have pointed out, your use of cat is painful and most likely unnecessary (unless you have something very strange going on with permissions or file types or named pipes, etc; but the odds are that you just want to open the file for read directly, not open another process to read that file then read from that process).
Consider constructing the template more cleverly (especially if you will ever need any other bits of information out of it).

Regarding reading that file within the loop each time, do this analysis. Is the main source data (that you are reading in via STDIN small enough to fit in memory? Is the charge database (that you're reading in from the .hdb file small enough to fit into memory? Depending on the answers to those questions, there are four different cases (2x2), and three different implementation strategies:

Both too large to fit in memory, especially concurrently

You have two options. The first is to load this information into a real database, then use the database to do the comparisons. You can use the free databases without too much pain; this particular application is an extremely straightforward join.

The other option is to work on the two files in pieces. Using the dedicated command-line tool sort(1), you could order each file by source number (which is what I think you are using $ano and $anum for). Then you can load just the calls from a particular source as well as just the charge for that source. This should cut back the amount of data that needs to be kept in core.

Charge code list is small (the input list can be large or small

This is the easy case. Just read the charge code list into memory before going through the input list.

Charge code list is large, input list is small

Another fairly easy case, although it requires you to turn the previous solution on its head: read in the input list once at the beginning of processing, then chew through the long input list.

Cleanup

This is assuming that the charge database is small:

#!/usr/bin/perl -w

require 5.006; # i'm using "open my $fh ..." construct

use strict;

# =================================================
# constants

# put magic constants at top, so people can easily change them

my $CHARGE_CODES_FILE = "mnta2/bmp/table/smsc/chg_indicator.hdb";

my $INPUT_TEMPLATE = "A12 A4 A20 A20
                      A2  A2 A2  A2
                      A12 A4 A24 A24";

my $SOURCE_IX = 2;
my $DEST_IX   = 3;

# =================================================

# first, load up the charge database.
my %charge_codes;
{
    open my $fh, $CHARGE_CODE_FILE
      or die "$0: opening '$CHARGE_CODE_FILE': $!";
    while ( <$fh> )
    {
        next if /^#/;      # skip comments
        s/\s+$/;           # remove end of line chars

        # get fields from line, clean them up.
        my ( $src, $dest, $code ) = split /,/;
        $src  =~ s/^0+//;
        $dest =~ s/^0+//;

        $charge_codes{$src}{$dest} = $code;
    }
}

# -------------------------------------------------
# process standard input, accumulate charges.

my %charges_against;

while ( my $input = <> )
{
    # extract source and destination values
    my @fields = unpack $INPUT_TEMPLATE, $input;
    my $src  = $fields[ $SOURCE_IX ];
    my $dest = $fields[ $DEST_IX ];

    # remove leading zeros
    $src  =~ s/^0+//;
    $dest =~ s/^0+//;

    # see if there are any charge codes associated with this pair.
    my $code = $charge_codes{$src}{$dest};
    if ( defined $code )
    {
        ++$charges_against{$code};
    }
}

# -------------------------------------------------
# output the results

foreach my $code ( sort keys %charges_against )
{
    my $count = $charges_against{$code};
    print "Total calls for Billing Code [$code] : $count \n";
}

# note that skip_zero is no longer necessary.
[download]

I had a lot of extra commented crap in the code to deal with situations that are likely rare but impossible to rule out without knowing the exact data we'd see. I've moved it below this cut to let me keep the code compact enough to display without a cut...

In the loop that reads in the charge codes:

        # the original code indicates there can never
        # be more than one charge code associated with
        # any given $src and $dest pair; if there are,
        # we would have to use an array ref here and
        # a foreach loop in the main processing.

        # only one charge code:
        $charge_codes{$src}{$dest} = $code;

        # # for multiple charge codes:
        # push @{ $charge_codes{$src}{$dest} }, $code;
[download]

Then, in the main processing:

    # # if you know that there is no charge code 0 or
    # # empty string, you could compress that like so:
    # if ( my $code = $charge_codes{$src}{$dest} )
    # {
    #     ++$charges_against{$code};
    # }

    # # if you need multiple codes per src/dest pair:
    # my $charge_code_aref = $charge_codes{$src}{$dest}
    #   or next;
    # # increment the number of charges against each code
    # foreach my $code ( @$charge_code_aref )
    # {
    #     ++$charges_against{$code};
    # }
[download]

And regarding my comment about building that template in a more useful fashion... I had stuff like this in my cleaned-up version of the script, but it got in the way of most of the other points. Here's what I had:

# construct the template in a more self-documenting way.
# this is a list of name => width values.  you can add
# more info here if you like; just adjust the loops
# that pull information out of it.  having all this
# information in one place means that you only have to
# change one place...

my @INPUT_COLUMN_INFO =
  ( unknown1 => { width => 12 },
    unknown2 => { width =>  4 },
    source   => { width => 20 },
    dest     => { width => 20 },
    ... );

# you can also use this structure to do clever things
# such as building a hash out of each input line to
# extract values by name instead of by numeric index.

# =================================================
# derived constants

my @INPUT_NAMES;    # names in original order
my $INPUT_TEMPLATE; # build up an 'unpack' template
my %INPUT_POS;      # map from name to index

for ( my $i = 0; $i < @INPUT_COLUMN_INFO; $i += 2 )
{
    my ( $name, $info ) = @INPUT_COLUMN_INFO[$i, $i+1];
    push @INPUT_NAMES, $name;
    $INPUT_TEMPLATE .= "A" . $info->{width};
    $INPUT_POS{$name} = @INPUT_NAMES-1;
}

# =================================================
# use during processing

while (<>)
{
    # construct hash using slice
    my %input;
    @input{@INPUT_NAMES} = unpack $INPUT_TEMPLATE, $_;

    # now access info from the hash
    my $src  = $input{source};
    my $dest = $input{dest};

    # ...
}
[download]

The exact method you use to extract information from the central structure doesn't matter as much as noticing the advantages you get from having all the format-related stuff in one place (especially when you consider that it might be in a database somewhere...)

Comment on Re: Simplifying my script... Select or Download Code