Gangabass has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,
I need your help.

I want to write a program that searches for some text and replaces it with other text in all files in a folder (and its subfolders). There are many files (several thousand), and this is a CGI script, so its running time is limited.

So my first idea was File::Find, but I need some way to continue the search the next time the script starts.

So

  1. I get a list of all filenames in the search folder (and its subfolders) and write it to a file (FILELIST.txt).
  2. In the main loop I take one line from FILELIST.txt and run the search and replace on the file it names.
  3. If everything goes fine, I delete that (first) line from FILELIST.txt and save the list to disk.
  4. Go back to step 2.
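
A minimal sketch of the four steps above, assuming a literal (non-regex) search string and a list file kept outside the search folder. The helper names (`build_list`, `save_list`, `replace_in_file`, `process_list`) and the `$budget` time limit in seconds are hypothetical, introduced only for illustration; filenames containing newlines are not handled.

```perl
use strict;
use warnings;
use File::Find;

# Write the remaining filenames to the list file via a temp file plus rename,
# so an interrupted run is less likely to leave a half-written list behind.
sub save_list {
    my ($list_file, @files) = @_;
    my $tmp = "$list_file.tmp";
    open my $out, '>', $tmp or die "Cannot write $tmp: $!";
    print {$out} "$_\n" for @files;
    close $out or die "Cannot close $tmp: $!";
    rename $tmp, $list_file or die "Cannot rename $tmp: $!";
}

# Step 1: collect every regular file under $root into the list file.
sub build_list {
    my ($root, $list_file) = @_;
    my @files;
    find({ no_chdir => 1,
           wanted   => sub { push @files, $File::Find::name if -f } }, $root);
    save_list($list_file, @files);
    return scalar @files;
}

# Step 2: replace the literal string $from with $to in one file.
sub replace_in_file {
    my ($path, $from, $to) = @_;
    open my $in, '<', $path or die "Cannot read $path: $!";
    my $text = do { local $/; <$in> };    # slurp the whole file
    close $in;
    $text = '' unless defined $text;
    my $count = ($text =~ s/\Q$from\E/$to/g);
    return 0 unless $count;               # nothing matched, leave file alone
    open my $out, '>', $path or die "Cannot write $path: $!";
    print {$out} $text;
    close $out or die "Cannot close $path: $!";
    return $count;
}

# Steps 2 to 4: work through the list until it is empty or the time budget
# (in seconds) runs out. Returns the number of files still to do.
sub process_list {
    my ($list_file, $from, $to, $budget) = @_;
    open my $in, '<', $list_file or die "Cannot read $list_file: $!";
    chomp(my @todo = <$in>);
    close $in;
    my $deadline = time + $budget;
    while (@todo && time < $deadline) {
        replace_in_file($todo[0], $from, $to);
        shift @todo;                      # step 3: drop the finished entry
        save_list($list_file, @todo);     # ...and persist the shorter list
    }
    return scalar @todo;
}
```

A first request would call `build_list` once; every later request calls `process_list` until it returns 0. Rewriting the list after every file keeps the checkpoint exact but costs one full rewrite per file, so saving every N files instead would trade a little repeated work after a crash for speed.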

My questions are:

Re: Find and Replace script
by hangon (Deacon) on Aug 19, 2007 at 04:28 UTC

    Assuming that this is a utility script to globally update a web site, and will not be run by visitors, I would go with your first inclination: use File::Find to walk the directory and complete the operation in one call to the CGI script. Just print some feedback to the browser (even a single space should do) after each file update and you will not have to worry about timing out. I've used this technique often for long-running CGI utility scripts. Just be sure to print a completion message so you know when the job is done.
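
    The single-call approach described above might look roughly like the sketch below. The helper name `replace_tree`, the progress messages, and the literal (non-regex) matching are assumptions made for illustration, not hangon's actual code. Whether per-file output really prevents a timeout depends on the server and any proxies in between, and it does nothing about server-side CPU or run-time limits.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not from the thread): walk $root, replace the literal
# string $from with $to in every regular file, and print one progress line
# per file to the handle $out. Returns the number of files changed.
sub replace_tree {
    my ($root, $from, $to, $out) = @_;
    my $changed = 0;
    find({
        no_chdir => 1,    # keep $_ as the full path
        wanted   => sub {
            my $path = $_;
            return unless -f $path;

            open my $in, '<', $path or die "Cannot read $path: $!";
            my $text = do { local $/; <$in> };    # slurp the whole file
            close $in;
            $text = '' unless defined $text;

            if ($text =~ s/\Q$from\E/$to/g) {
                open my $fh, '>', $path or die "Cannot write $path: $!";
                print {$fh} $text;
                close $fh or die "Cannot close $path: $!";
                $changed++;
                print {$out} "updated $path\n";
            }
            else {
                print {$out} "skipped $path\n";
            }
        },
    }, $root);
    return $changed;
}

# Possible use in a CGI script (the path is a placeholder):
#   $| = 1;                                   # unbuffer STDOUT
#   print "Content-Type: text/plain\n\n";
#   my $n = replace_tree('/some/folder', 'some text', 'another text', \*STDOUT);
#   print "Done: $n file(s) changed.\n";
```

    Setting `$|` to a true value matters here: without it the progress lines can sit in Perl's output buffer instead of reaching the browser as each file is finished.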

Re: Find and Replace script
by jwkrahn (Abbot) on Aug 19, 2007 at 08:03 UTC
    This may work for you:
    use POSIX '_exit';
    use File::Find;

    my $search_folder = '/some/folder';

    find sub {
        return unless -f;
        local @ARGV = $_;    # the file for the child's <> loop to edit
        defined( my $pid = fork ) or die "Cannot fork: $!";
        unless ( $pid ) {
            # Child process: edit the file in place.
            # An empty $^I means no backup copy is kept.
            local $^I = '';
            while ( <> ) {
                s/some text/another text/g;
                print;
            }
            _exit( 0 );
        }
    }, $search_folder;
    I did some testing but it may not work on your system.
Re: Find and Replace script
by moritz (Cardinal) on Aug 19, 2007 at 07:15 UTC
    I'm not quite sure I understand what you want to achieve.

    Do you want to modify FILELIST.txt, or the files that are listed there?

    If you want to modify only one file, use Tie::File.
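
    As a rough illustration of the Tie::File suggestion, the sketch below treats the list file as an array and removes its first line only after that entry has been handled. The helper name `take_next` and the callback interface are hypothetical. Note that removing the first record still means the rest of the file has to be moved, so this is convenient rather than fast on a very long list.

```perl
use strict;
use warnings;
use Tie::File;

# Hypothetical helper: pass the first line of $list_file to $handler and
# remove that line once the handler returns. Returns 1 if an entry was
# processed, 0 if the list was already empty.
sub take_next {
    my ($list_file, $handler) = @_;
    tie my @lines, 'Tie::File', $list_file
        or die "Cannot tie $list_file: $!";
    my $done = 0;
    if (@lines) {
        $handler->($lines[0]);   # if this dies, the line stays in the file
        shift @lines;            # Tie::File writes the change to disk
        $done = 1;
    }
    untie @lines;
    return $done;
}
```

    A caller could loop with `1 while take_next('FILELIST.txt', \&process_one);`, where `process_one` stands for whatever routine performs the replacement on a single file.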

    But maybe you could tell us the underlying problem, perhaps there's a better solution to it.

      I want to modify both FILELIST and the files that are listed there.

      In order to continue from the previous program execution, I need to know where the program stopped.

Re: Find and Replace script
by misc (Friar) on Aug 19, 2007 at 11:11 UTC
    Hm, why do you say the working time is limited?
    Because of the timeout?
      Yes. The buyer says that CGI execution time is limited. I also want to ensure that I can continue the search if something bad happens (abnormal program termination).
        I'm not getting the picture.
        A limit on the execution time makes no sense to me.

        If you have to deal with the browser timeout (I didn't phrase my question precisely enough), I can think of a script that runs as a server, independent of the web server.
        You could start, control, and stop the replace script through the web interface.

        I'd also say that you have more control over a separate replace script this way:
        the replace script could be restarted by a cron job if it's killed,
        or you could run it as a different user than the web server, or with a different priority.

        [find and replace script]<--IPC-->[webserver cgi script]

        The web interface could even update the progress and results through some JavaScript (Ajax).

        This is at least how I would do this.

        Besides this, storing the progress of the traversal in a filelist file seems like a good idea to me.
Re: Find and Replace script
by Anonymous Monk on Dec 02, 2008 at 16:57 UTC
    Hi Gangabass,

    I am aware that this thread is pretty old, but I am in the same situation as you. I didn't see how this thread ended, so I wanted to ask how you got this working. I have a file containing a list of files that should be searched for a particular regex, and the matched pattern should be changed. Again, the list is several thousand files long. Can you please advise which approach I should take?
    Thanks.