in reply to Re: Parsing a list of files to see if any contain any one of a list of comma delimited strings
in thread Parsing a list of files to see if any contain any one of a list of comma delimited strings

graff:

I guess that, depending on your areas of expertise and skill, it could be any kind of problem. It seems like the type of problem that is particularly well suited to Perl, and I think it would be a nice exercise. Eventually I would like to get to the point where I automatically write a Perl script in response to a problem like this (and then, eventually... have it work on the first go!). Since I found this site I have definitely been thinking in a more Perl-like mode.

However, I WILL admit that I did the preliminary filtering of the pattern file with sed, one step at a time, both because I wanted to check my progress after each step and because I am more familiar with one-off command-line usage of sed (though I would like to learn how to use Perl like that).

Finally, wouldn't one want to use File::Find for the finding part?

Thanks for the responses!

Edit: I just saw your comment that you "prefer using Unix find"--my apologies.

Re^3: Parsing a list of files to see if any contain any one of a list of comma delimited strings
by graff (Chancellor) on Apr 22, 2006 at 06:19 UTC
    Moving from sed to perl for one-liner operations on the command line (esp. in pipes) will be a lot easier once you get acquainted with the relevant option flags for perl -- browse through perlrun for a wealth of opportunities.

    Anything you would do with sed -- and a lot more that is hard to conceive of with sed -- is possible using "-e script" along with "-p" or "-n"; awk-ish stuff is done using "-a"; and "-l" can be very handy, as is -M.

    For lots of simple things, sed is still likely to involve fewer characters to type on the command line (and of course it's likely to run a bit faster). But a lot of things are really not feasible in sed or awk (using executable code as part of a regex replacement, handling non-ASCII character data, etc.), yet they end up being pretty short work in perl.

    (BTW, I prefer unix "find" because, on any file tree of appreciable size -- thousands of files -- File::Find took about 5 times longer the last time I checked.)