Your plan is a bit confusing to me because of the term "hash keys", which I associate with something very specific that might not be what you meant.
In general, a data search problem of this type is solved by keeping your "list of names" in memory in an efficiently searchable form. A hash table is often used for this, because the question "Is name1 in my list of names?" can then be answered very quickly.
Things become more complex if some amount of "sort of matches" is allowed. The question "Does name1 look like something in my list of names?" can be complex or computationally expensive to answer. I'd have to see some example data to make a concrete recommendation.
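To illustrate why "sort of matches" cost more, here is a minimal sketch in Python. It assumes "looks like" means a character-level similarity score above some threshold, which may well not be the right rule for your data. The function name `looks_like`, the 0.75 cutoff and the sample names are all made up for illustration. The point is that every candidate name has to be compared against every entry in the list, instead of being answered by a single lookup.

```python
import difflib

def looks_like(name, names, cutoff=0.75):
    """Return the closest entry in names as a one-element list, or [] if none is close.

    Assumption: similarity is difflib's ratio; the cutoff is an arbitrary choice.
    """
    # get_close_matches scores `name` against every entry in `names`,
    # so the work grows with the size of the list.
    return difflib.get_close_matches(name, names, n=1, cutoff=cutoff)

# Hypothetical list of names
known = ["alice", "bob"]
close = looks_like("alise", known)   # one letter off from "alice"
far = looks_like("zed", known)       # nothing similar in the list
```

With millions of data lines and a long list of names, this kind of comparison adds up quickly, which is why the exact matching rule matters so much.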
So, if an exact match to one of the words in your list is required, then a simple hash table of your names would suffice. Read a line of data, decide whether the name matches, and if so, do "something" with it; otherwise skip that line (do nothing). Read the next line, rinse, repeat.
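The exact-match loop described above might look roughly like this. It is a sketch in Python rather than Perl (a Perl hash would play the same role as the set here), and it assumes whitespace-separated columns with the name in the first column. The function names and the sample data are hypothetical, since I have not seen your real files.

```python
import io

def load_names(lines):
    """Build a hash-based set of names, one name per input line."""
    return {line.strip() for line in lines if line.strip()}

def filter_lines(data_lines, names, column=0, sep=None):
    """Yield only the lines whose chosen column exactly matches a known name."""
    for line in data_lines:
        fields = line.split(sep)  # sep=None splits on any run of whitespace
        if len(fields) > column and fields[column] in names:
            yield line            # match: do "something" with the line
        # no match: skip the line and read the next one

# Hypothetical sample data standing in for the real files
names = load_names(io.StringIO("alice\nbob\n"))
data = io.StringIO("alice 10\ncarol 20\nbob 30\n")
matches = list(filter_lines(data, names))
```

Each membership test is a single hash lookup, so the total run time grows with the number of data lines and not with the number of data lines multiplied by the number of names.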
Please give some more detail. Then we can discuss "what to do" (the processing algorithm) more fully. Along the way, you will need to do quite a bit of learning on your own about "how to do it". A good, perhaps stepwise, plan should be of interest to you, along with some books and other material to read in order to get started.
Update: I guess one starting point would be to try to translate your awk code into Perl. The enormous execution time suggests to me that you have a very inefficient algorithm for determining whether a name is relevant or not. How you are currently making that decision is one of my main points above.
In reply to Re: Extacting lines where one column matches a name from a list of names
by Marshall
in thread Extacting lines where one column matches a name from a list of names
by mr_clean