in reply to term frequency and mutual info

The answer depends on whether "later" means later in the same script or at some later time, in a separate run.

If later in the script, then simply reading through the two files and storing the results in an array of hashes might be easiest:

$fileA[$.]{$keyword}++
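
To put that one-liner in context, here is a minimal sketch of how it might sit inside a read loop. The sub name count_keywords_by_line, the whole-word matching and the file name in the usage line are my assumptions, not something from the original question; adjust the regex to however you actually recognise a keyword.

```perl
use strict;
use warnings;

# Hypothetical helper: count keyword occurrences per line of one file.
# Returns an array ref indexed by line number ($.); each element is a
# hash ref of keyword => count. Index 0 is unused because $. starts at 1.
sub count_keywords_by_line {
    my ($path, @keywords) = @_;
    my @counts;
    open my $fh, '<', $path or die "Cannot open $path: $!";
    while (my $line = <$fh>) {
        for my $keyword (@keywords) {
            # Assumption: a keyword counts only as a whole word.
            my $n = () = $line =~ /\b\Q$keyword\E\b/g;
            $counts[$.]{$keyword} += $n if $n;
        }
    }
    close $fh;
    return \@counts;
}
```

Usage would be something like my $fileA = count_keywords_by_line('fileA.txt', qw(term frequency)); after which $fileA->[3]{term} is the count of "term" on line 3. Lines with no matches are simply left undefined.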

If you want this for subsequent processing, then a database solution is possibly the way to go.
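
I have not said which database, and that is deliberate: it depends on what you have available. As one possible sketch, assuming simple per-keyword totals are enough, a tied DBM file using SDBM_File (bundled with Perl) keeps the counts on disk between runs. The sub names save_counts and load_count are made up for illustration; a DBI-based SQL database would be the other obvious route if you need queries.

```perl
use strict;
use warnings;
use Fcntl;       # O_RDWR, O_CREAT, O_RDONLY
use SDBM_File;   # simple DBM implementation bundled with Perl

# Hypothetical helper: add in-memory counts (keyword => count)
# to the totals already stored in the DBM file.
sub save_counts {
    my ($dbfile, $counts) = @_;
    tie my %db, 'SDBM_File', $dbfile, O_RDWR | O_CREAT, 0666
        or die "Cannot tie $dbfile: $!";
    for my $keyword (keys %$counts) {
        $db{$keyword} = ($db{$keyword} // 0) + $counts->{$keyword};
    }
    untie %db;
}

# Hypothetical helper: read one stored total back (0 if never seen).
sub load_count {
    my ($dbfile, $keyword) = @_;
    tie my %db, 'SDBM_File', $dbfile, O_RDONLY, 0666
        or die "Cannot tie $dbfile: $!";
    my $n = $db{$keyword} // 0;
    untie %db;
    return $n;
}
```

SDBM_File stores the data in two files, $dbfile.pag and $dbfile.dir, and it only holds flat key/value strings, so per-line counts would need a composite key (for example "line:keyword") or a different backend.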

-- Ken

Re^2: term frequency and mutual info
by Anonymous Monk on Oct 22, 2010 at 08:23 UTC
    Thanks for the hint! I guess the database solution is feasible, since then I don't need to rebuild the counts every time I run my code!