in reply to Re^2: Search through multiple files and output results to new file
in thread Search through multiple files and output results to new file

Each instance would be trying to process the same files.
Oh. I never got the impression that the OP wanted this. I fail to see why that's a benefit.
Which would mean lots of "file in use" errors and/or duplicated results.
Windows doesn't allow two processes to open the same file? That sounds like an easy DoS.
The multiple instances would be trying to redirect their output to the same file. Even if you use append (>>), Windows won't let you do that. Not sure if *nix will?
Unix allows multiple processes to write to the same file. And if each process opens the file in append mode, every write() goes to the end of the file.
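A small sketch of that append behaviour, for anyone who wants to try it. The function name append_demo and its arguments are made up for illustration. It assumes a local POSIX file system (network file systems such as NFS may not honour append semantics) and that each printf results in a single short write().

```shell
# append_demo OUTFILE WRITERS LINES   (hypothetical helper, illustration only)
# Starts WRITERS background subshells that each append LINES lines to OUTFILE.
append_demo() {
    out=$1
    writers=$2
    lines=$3
    : > "$out"                      # start from an empty file
    w=1
    while [ "$w" -le "$writers" ]; do
        (
            n=1
            while [ "$n" -le "$lines" ]; do
                # ">>" opens the file in append mode, so the write lands at
                # the current end of file, whatever the other writers did.
                printf 'writer %d line %d\n' "$w" "$n" >> "$out"
                n=$((n + 1))
            done
        ) &
        w=$((w + 1))
    done
    wait                            # let every writer finish
}
```

Running append_demo /tmp/demo.txt 4 100 and then wc -l /tmp/demo.txt should report 400 lines. The order of the lines is whatever the scheduler produced, so only the count is predictable.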

Re^4: Search through multiple files and output results to new file
by BrowserUk (Patriarch) on Aug 24, 2010 at 21:05 UTC
    Oh. I never got the impression that the OP wanted this. I fail to see why that's a benefit.

    If you start two concurrent instances of grep with a wildcard filespec (win or *nix), then both instances will expand that wildcard to the same list.

    So then the problem becomes, how do you distribute the file list across multiple grep instances?
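One possible answer on *nix, as a sketch rather than a recommendation: expand the wildcard once and let xargs hand slices of the list to several greps. The name parallel_grep and the batch size of 8 are assumptions for illustration. The -P and -0 options are GNU/BSD extensions to xargs, not POSIX.

```shell
# parallel_grep PATTERN JOBS FILE...   (hypothetical helper, illustration only)
# Fans the file list out over up to JOBS concurrent grep processes.
parallel_grep() {
    pattern=$1
    jobs=$2
    shift 2
    # NUL-separate the names so spaces in file names survive the trip to xargs.
    # -n 8 : give each grep at most 8 files (arbitrary batch size)
    # -P   : run up to $jobs greps at the same time
    printf '%s\0' "$@" | xargs -0 -n 8 -P "$jobs" grep -h -e "$pattern" --
}
```

Usage would be something like parallel_grep "$search_string" 4 *.txt > retuned_Results.sl. Two caveats: xargs exits with status 123 if any batch found no match, and because all the greps share one stdout, large outputs can interleave in buffer-sized chunks rather than whole lines.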

    Windows doesn't allow two processes to open the same file?

    It does, but not by default. And the permissions required are not available from Perl.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
      So then the problem becomes, how do you distribute the file list across multiple grep instances?
      There are several ways - the more you know about the file(s), the more options you have. However, that still uses grep.

      Perhaps the OP can first elaborate on what he's trying to achieve before offering solutions based on guesses.

        Perhaps the OP can first elaborate on what he's trying to achieve before offering solutions based on guesses.

        It seemed, and still seems to me, quite clear from the OP what he is hoping to do.

        Run the equivalent of grep -h %search_string% *.txt > retuned_Results.sl in a manner that utilises his multiple cores to reduce the runtime of the query.

        I.e. process the file list generated from *.txt in parallel on all his cores, rather than serially on only one as the standard grep command would do, and collate the results into a single output file. I don't see any need for guesses.
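For illustration, here is a rough *nix sketch of that reading: deal the files out round-robin to a few workers, give each worker a private output file, and concatenate the pieces at the end. The name grep_collate and its argument order are invented here. It assumes mktemp -d and xargs -0 are available and that no file name contains a newline.

```shell
# grep_collate PATTERN JOBS OUTFILE FILE...   (hypothetical helper)
# Runs up to JOBS greps in parallel and collates their matches into OUTFILE.
grep_collate() {
    pattern=$1
    jobs=$2
    out=$3
    shift 3
    [ "$#" -gt 0 ] || { : > "$out"; return 0; }
    tmp=$(mktemp -d) || return 1
    i=0
    for f in "$@"; do
        # Round-robin: file number i goes on list (i mod jobs).
        printf '%s\n' "$f" >> "$tmp/list.$((i % jobs))"
        i=$((i + 1))
    done
    for list in "$tmp"/list.*; do
        # One background worker per list, each writing to its own file,
        # so no two processes ever write to the same output.
        { tr '\n' '\0' < "$list" |
              xargs -0 grep -h -e "$pattern" -- > "$list.out"; } &
    done
    wait
    # Collate: the results come out grouped by worker, not in file order.
    cat "$tmp"/list.*.out > "$out"
    rm -rf "$tmp"
}
```

Something like grep_collate "$search_string" 4 retuned_Results.sl *.txt would then stand in for the serial command. Private per-worker files sidestep the shared-output question entirely, at the cost of one extra pass to concatenate them.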

        Whether this actually saves any runtime depends upon the size of the files and the complexity of the search parameters. It is actually quite difficult to compare like with like because of the effects of file system caching.

