in reply to Strange problem trying to clean garbage from start of mailbox file

The main culprit seems to be this line:
system("sed -e '1d' $path" . "$_[0] | more > $path" . $_[0]);

Oops! You want to edit a file "in place", but by redirecting output to the very file you are reading from, you effectively truncate that file before it can be processed.

When updating file contents you should make sure input and output don't interfere. One approach: since you already use perl to read the first line, why not just read on until you find a "From" line, then copy that line and everything after it to another file? Finally, move the result back to the original location.
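A minimal sketch of that approach, assuming the whole job is done in Perl rather than through sed. The sub name strip_leading_garbage and the temp file template are made up for illustration. The temporary file is created in the mailbox's own directory so that the final rename stays on one filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use File::Basename qw(dirname);

# Hypothetical helper: drop every line before the first one starting with "From".
sub strip_leading_garbage {
    my ($file) = @_;

    open my $in, '<', $file or die "Cannot read $file: $!";

    # Write to a separate file so reading and writing cannot interfere.
    my ($out, $tmp) = tempfile('mboxXXXXXX', DIR => dirname($file));

    my $copying = 0;
    while (my $line = <$in>) {
        $copying = 1 if $line =~ /^From/;   # start copying at the first "From" line
        print {$out} $line if $copying;
    }

    close $in;
    close $out or die "Cannot write $tmp: $!";

    # Move the cleaned copy back over the original.
    rename $tmp, $file or die "Cannot replace $file: $!";
    return;
}
```

Note that if no "From" line is found at all, the file ends up empty, and that the replacement file gets the permissions of the temporary file, not those of the original, so you may want to keep a backup and restore the mode afterwards.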

Of course, perl has builtins that can do most of the work for you. For example:

perl -n -i.bak -e 'print if /^From/..-1' mail_file
This snippet removes all lines before the first occurrence of a line starting with the four letters F, r, o, m from mail_file, leaving a backup of the original in mail_file.bak.

You should also make sure no mail is delivered while you are working on real-life mailbox hierarchies.

Re^2: Strange problem trying to clean garbage from start of mailbox file
by capoeiraolly (Initiate) on Feb 02, 2006 at 23:40 UTC
    I'll give the perl command a go, but the sed command does actually work... try it. If you have a text file with, say, three lines in it:

    line 1
    line 2
    line 3

    The result of that system call (I've tried it on both Debian and BSD) is:

    line 2
    line 3

    Of course I will make sure that no mail is delivered to the mailbox while I'm messing around with it :)
      Your shell command line might sometimes work, but the problem is precisely that it is not guaranteed to do so. The reason is that the >file part clobbers the very same file that is supposed to be read by the sed -e '1d' file part.

      If there were only one process involved, the outcome would be quite predictable. Since you constructed a pipeline of two processes, however, there is a chance that the first one wins the race and reads a portion of the file before the second one destroys it. As you already observed, that can happen, but you cannot rely on it.

      To solve that problem you can use a temporary file (like perl -i does behind the scenes) or read and write to the file through a single file handle in a single process, which may prove somewhat more difficult to get right.

      If you are interested anyway you may want to look up file access modes in perlopentut, specifically +<. You also might find the truncate function useful. The Perl Cookbook has excellent explanations of the different techniques.
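      For illustration, here is a rough sketch of the single file handle variant, assuming the mailbox is small enough to hold in memory. The sub name strip_in_place is made up for this example, and it does no locking, so it is only safe while nothing else touches the file.

```perl
use strict;
use warnings;
use Fcntl qw(SEEK_SET);

# Hypothetical helper: rewrite the file through one read/write handle.
sub strip_in_place {
    my ($file) = @_;

    # '+<' opens for reading and writing without truncating the file.
    open my $fh, '+<', $file or die "Cannot open $file: $!";

    # Collect the first "From" line and everything after it.
    my @keep;
    my $copying = 0;
    while (my $line = <$fh>) {
        $copying = 1 if $line =~ /^From/;
        push @keep, $line if $copying;
    }

    # Rewind, overwrite from the start, then cut off the leftover tail.
    seek $fh, 0, SEEK_SET or die "Cannot rewind $file: $!";
    print {$fh} @keep or die "Cannot write $file: $!";
    truncate $fh, tell($fh) or die "Cannot truncate $file: $!";

    close $fh or die "Cannot close $file: $!";
    return;
}
```

      Unlike the temporary file approach, this keeps the original inode, owner and permissions, but a crash halfway through the rewrite leaves a damaged mailbox and no backup.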

Re^2: Strange problem trying to clean garbage from start of mailbox file
by capoeiraolly (Initiate) on Feb 03, 2006 at 02:41 UTC
    Works beautifully. Thank you for the help :)