Put another way, are you unsure how to split the line into fields? Once you have the line split up into fields, you just take the second element of the array, i.e. $aFields[1], and see if it is a key in the hash.
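Here is a minimal sketch of that lookup step, assuming the lines have already been split. The hash %hWanted, its keys, and the sample rows are all made up for illustration; substitute your own data.

```perl
use strict;
use warnings;

# Hypothetical lookup table: the keys are invented sample values.
my %hWanted = map { $_ => 1 } qw(apple cherry);

# Hypothetical rows, each one already split into fields.
my @aRows = (
    [ 'r1', 'apple',  'x' ],
    [ 'r2', 'banana', 'y' ],
    [ 'r3', 'cherry', 'z' ],
    [ 'r4' ],                  # short line with no second column
);

my @aMatches;
for my $raFields (@aRows) {
    my @aFields = @$raFields;

    # Skip short lines so we never use an undefined value as a hash key.
    next unless defined $aFields[1];

    # Column 2 lives in element 1 because Perl arrays are zero-based.
    push @aMatches, $aFields[0] if exists $hWanted{ $aFields[1] };
}

print "@aMatches\n";
```

Using exists rather than testing the value means a key whose value happens to be 0 or an empty string still counts as present.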
How you split up the lines depends heavily on the syntax/grammar of your file. Do you know what that is? You seem a bit uncertain. Are these "fields" from a row of a database or a generated report? Does the file have a documented format? Or are these actually just words in a line?
If these are just words, you could safely split the line on whitespace, like this:
$line =~ s/^\s+//;                  # strip leading whitespace from the line
$line =~ s/\s+\z//;                 # strip trailing whitespace from the line
my @aFields = split(/\s+/, $line);  # extract words
However, the above won't work if you have whitespace inside a field because it will split the field in two. If on some line of the file, the first column contains three words, then column 2 would end up in the fourth array element and you'd never know.
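To make that failure concrete, here is a small sketch with an invented sample line whose first column is meant to be "New York City". Nothing warns you that the columns have shifted.

```perl
use strict;
use warnings;

# Invented sample: column 1 is meant to be "New York City", column 2 is "42".
my $line = "New York City  42  ok";

my @aFields = split(/\s+/, $line);

# The three-word first column has been split into three elements,
# so the real column 2 is now at index 3 instead of index 1.
print "number of fields: ", scalar(@aFields), "\n";
print "element 1: $aFields[1]\n";
print "element 3: $aFields[3]\n";
```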
What are the rules that actually govern the organization of this file into rows and fields? To know whether or not you need to use regexes, you first need to know the file's rules. Regexes aren't always the best solution. thezip has pointed out that unpack would be a better way to split the line into fields if you are dealing with fixed-width columns (each column/field has a number of characters that is known in advance).
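For the fixed-width case, a sketch along these lines might do. The widths (10 and 5 characters, then the rest of the line) are pure assumptions for the example; you would take the real ones from your file's format.

```perl
use strict;
use warnings;

# Build an invented fixed-width line: 10 chars, 5 chars, then the remainder.
my $line = sprintf("%-10s%-5s%s", "New York", "42", "ok");

# "A10" reads 10 characters and strips trailing spaces, "A5" reads the
# next 5, and "A*" takes whatever is left on the line.
my @aFields = unpack("A10 A5 A*", $line);

print join("|", @aFields), "\n";
```

Because unpack counts characters rather than looking for separators, the space inside "New York" causes no trouble here.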
On the other hand, if fields are separated from one another by a separator string or character, regexes are often good for splitting such lines into fields. However, you won't know what regex to use without knowing the format. Rules for separator-delimited fields can be very simple or very complex.
A simple rule might be "a tab always means column separator". If that was your rule, you could just use split("\t",$line) to break up the line into fields.
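One thing to watch with the tab rule: by default split throws away trailing empty fields, so a line that ends in empty columns comes back shorter than you expect. A sketch with an invented line, using a negative limit to keep every column:

```perl
use strict;
use warnings;

# Invented line with five tab-separated columns, three of them empty.
my $line = "a\t\tc\t\t";

# Default behaviour: trailing empty fields are dropped.
my @aDefault = split(/\t/, $line);

# A negative limit tells split to keep trailing empty fields.
my @aAll = split(/\t/, $line, -1);

print "default: ", scalar(@aDefault), " fields\n";
print "limit -1: ", scalar(@aAll), " fields\n";
```

If you read the lines from a file, chomp them first, otherwise the newline ends up inside the last field.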
Or it could be more complicated: columns are separated by whitespace except where the whitespace is quoted or escaped. Or it could be even more complicated: the first character of each line determines the field separator for the rest of the line, plus there is an escaping/quoting mechanism. The possibilities are endless. It would be hard to advise you without knowing the intended rules of the file.
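Just to illustrate two of those possibilities, here is a rough sketch. Both sample lines and both sets of rules are guesses, not your file's actual format. The first uses quotewords from the core Text::ParseWords module to split on whitespace while respecting double quotes; the second treats the first character of the line as the separator and ignores any quoting or escaping.

```perl
use strict;
use warnings;
use Text::ParseWords qw(quotewords);

# Guess 1: whitespace separates columns unless it is inside quotes.
my $quoted = 'r1 "New York City" 42';

# Second argument 0 means the quote characters are removed from the results.
my @aQuoted = quotewords('\s+', 0, $quoted);
print join("|", @aQuoted), "\n";

# Guess 2: the first character of the line names the separator.
my $prefixed = '|a|b c|d';
my $sep  = substr($prefixed, 0, 1);
my $rest = substr($prefixed, 1);

# \Q...\E quotes the separator so characters like | or . are taken literally.
my @aPrefixed = split(/\Q$sep\E/, $rest, -1);
print join(",", @aPrefixed), "\n";
```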
Update: various clarifications and rewordings.
In reply to Re^3: comparing columns using regular expression
by ELISHEVA
in thread comparing columns using regular expression
by rocky13