in reply to Re: Perl Script performance issue
in thread Perl Script performance issue

The data files are large. Initially I considered storing them in memory, but discarded the option because of the size of the files. The account file is 182777579 bytes; this varies daily but stays roughly the same size. It currently holds 62394 records.

Re^3: Perl Script performance issue
by poj (Abbot) on Dec 16, 2015 at 08:46 UTC
    ACCT NUMBER read main file, second column in main file is primary key for looking up positionfile.delim.

    If the value is held in the 6th column, which column in positionfile.delim is the primary key? Is it column 1?

    In which column of the main file are these keys held? They can't all be in the second column, or am I missing something?

    ACCT NUMBER|positionfile.delim|2|6
    PO TYPE|positionfile.delim|2|3
    LOC CODE|positionfile.delim|2|47
    poj

      The primary key can be the same for many fields. For example, here the second column would be acct number 1234, so to fetch details for that acct only, I grep for that acct number (which returns a single record) and then select the columns as specified. Here:
      LOC CODE|positionfile.delim|2|47
      LOC NAME|locationfile.delim|47|4
      For loc code, use second column from main file as primary key for looking up positionfile.delim, get 47th field.
      For loc name, 47th column from main file is primary key for looking up locationfile.delim, get the 4th field.
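The spec lines above can be read as FIELD|LOOKUP_FILE|KEY_COLUMN|VALUE_COLUMN. A minimal sketch of parsing them, assuming that layout and 1-based column numbers; the `parse_spec` helper and the hash key names are made up for illustration and are not from the original script:

```perl
use strict;
use warnings;

# Parse one spec line of the form FIELD|LOOKUP_FILE|KEY_COLUMN|VALUE_COLUMN.
# Column numbers are 1-based, as in the thread. parse_spec is a hypothetical
# helper, not part of the original script.
sub parse_spec {
    my ($line) = @_;
    chomp $line;
    my ($field, $file, $key_col, $val_col) = split /\|/, $line;
    return {
        field   => $field,
        file    => $file,
        key_col => $key_col,   # column of the main file that holds the key
        val_col => $val_col,   # column of the lookup file to return
    };
}

my @specs = map { parse_spec($_) } (
    'LOC CODE|positionfile.delim|2|47',
    'LOC NAME|locationfile.delim|47|4',
);

for my $s (@specs) {
    printf "%-8s: key = main column %d, value = column %d of %s\n",
        @{$s}{qw(field key_col val_col file)};
}
```

Parsing the specs once up front means the main loop only has to index into arrays, rather than re-reading the spec for every record.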

        For loc code, use second column from main file
        So in positionfile.delim one record provides 3 fields with the same key from column 2 of main file
        Column  Field
        6       ACCT NUMBER
        3       PO TYPE
        47      LOC CODE
        Is the key always column 1 in the 3 lookup files
        positionfile.delim, accountfile.delim and securityfile.delim?
        How many records are in the main file?
        Update: Try this. Opening the files just once and caching the search results might improve performance.
        poj
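The idea of opening a file once and caching search results could look roughly like the sketch below. This is not poj's code, only an illustration. It assumes pipe-delimited records with the key in column 1 (the thread leaves both open), and it uses an in-memory string as a stand-in for positionfile.delim so that it runs on its own. The `find_record` name and the sample data are hypothetical.

```perl
use strict;
use warnings;

# Stand-in for positionfile.delim; real code would open the file on disk.
# Assumptions: pipe-delimited records, key in column 1.
my $data = "1234|x|STOCK\n5678|y|BOND\n";
open my $fh, '<', \$data or die "Cannot open: $!";

my %cache;       # key => array ref of fields, filled as records are read
my $scans = 0;   # counts records actually read, to show the cache at work

sub find_record {
    my ($key) = @_;
    return $cache{$key} if exists $cache{$key};
    # Continue reading where the last search stopped, caching every record seen.
    while (my $line = <$fh>) {
        chomp $line;
        $scans++;
        my @f = split /\|/, $line, -1;
        $cache{ $f[0] } = \@f;
        return \@f if $f[0] eq $key;
    }
    return;      # key is not in the file
}

my $rec = find_record('5678');
print "PO TYPE for 5678: $rec->[2]\n";   # column 3 of the matching record
find_record('1234');                     # served from the cache, no re-read
print "records read: $scans\n";
```

Each record is read from the file at most once, so repeated lookups for the same account cost a hash access instead of a new grep over the whole file.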
Re^3: Perl Script performance issue
by Laurent_R (Canon) on Dec 16, 2015 at 18:30 UTC
    The data files are large. Initially I considered storing them in memory, but discarded the option because of the size of the files. The account file is 182777579 bytes; this varies daily but stays roughly the same size. It currently holds 62394 records.
    This is indeed relatively large, but most probably small enough to fit into memory on a decently modern computer. That's what I would try anyway, especially if you can store in memory only the parts of these files that are useful for your process.
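For scale, 182777579 bytes is about 174 MiB, or roughly 2.9 KB per record across 62394 records, so keeping only a few columns per record shrinks the footprint considerably. A minimal sketch of storing just the useful part, assuming pipe-delimited records; the `load_columns` helper, the column choices, and the sample data are hypothetical, with an in-memory string standing in for accountfile.delim:

```perl
use strict;
use warnings;

# Build key => [wanted fields] from a delimited file handle.
# $key_col and the entries of @$want_cols are 1-based column numbers.
# load_columns is a hypothetical helper; the pipe delimiter is an assumption.
sub load_columns {
    my ($fh, $key_col, $want_cols) = @_;
    my %index;
    while (my $line = <$fh>) {
        chomp $line;
        my @f = split /\|/, $line, -1;
        $index{ $f[$key_col - 1] } = [ @f[ map { $_ - 1 } @$want_cols ] ];
    }
    return \%index;
}

# Stand-in data; real code would open accountfile.delim instead.
my $sample = "1234|Alice|NY|open\n5678|Bob|LA|closed\n";
open my $fh, '<', \$sample or die "Cannot open: $!";
my $acct = load_columns($fh, 1, [2, 4]);   # keep only columns 2 and 4
close $fh;

print join(',', @{ $acct->{1234} }), "\n";   # prints "Alice,open"
```

After the single pass, every lookup by account number is a hash access, and the unused columns are never held in memory.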