michaelp has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone, I'm simply trying to chomp every element of an array inside a subroutine. The array is passed as a reference:
foreach my $line (@$array) {
    chomp $line;
    print "$line**";
    my @array = split(/\t/,$line);
    ...
}
The chomp, which is supposed to remove the newline character, doesn't do anything. I read somewhere that you have to dereference the variable first, but $line should be dereferenced by dereferencing the array..? What am I doing wrong? Thanks, Michael

Re: chomp - reference
by choroba (Cardinal) on Jun 03, 2010 at 23:02 UTC
    Your code works for me. Try Data::Dumper to see what really is inside the $array.

    Sidenote: Using both $array and @array in your code decreases its readability and makes it error prone.

Re: chomp - reference
by davido (Cardinal) on Jun 04, 2010 at 12:30 UTC

    I wanted to go through your simple example line by line to try to shed light:

    foreach my $line (@$array) {

    Here $line is becoming an alias for each element in the array referenced by $array, one iteration and one element at a time. Any changes made to the contents of $line will propagate back to the corresponding element in @{$array}. So, you're on the right track.
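As a small illustration of the aliasing described here, the following sketch contrasts chomping the loop variable with chomping a copy of it. The sample data and the names $copy and @untouched are made up for the example.

```perl
use strict;
use warnings;

my $array = [ "one\n", "two\n" ];    # hypothetical sample data

# $line is an alias for each element, so chomp changes the element itself.
foreach my $line (@$array) {
    chomp $line;
}

# A copy is independent: chomping $copy leaves the element alone.
my @untouched = ("three\n");
foreach my $element (@untouched) {
    my $copy = $element;
    chomp $copy;
}

print join('|', @$array), "\n";       # prints "one|two"
print length($untouched[0]), "\n";    # prints 6: "three" plus its newline
```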

    chomp $line;

    Assuming $line contains a trailing "newline" character ( \n ), and also assuming that the special variable, $/ (the input record separator) hasn't been tampered with, and is set to the default of \n, chomp will remove the trailing newline character. Here is one area where you may have a problem. If $/ has been changed from its default, you may be expecting to chomp a newline, when in reality you're chomping some other value. More on this later.
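To make the dependence on $/ concrete, here is a minimal sketch. The separator "END" and the sample strings are assumptions chosen only for illustration.

```perl
use strict;
use warnings;

my $default = "field1\tfield2\n";
chomp $default;                  # $/ is "\n" here, so the newline is removed

my $changed = "field1\tfield2\n";
{
    local $/ = "END";            # hypothetical non-default separator
    chomp $changed;              # string does not end in "END": nothing removed
}

printf "default: %d chars\n", length $default;   # 13
printf "changed: %d chars\n", length $changed;   # 14, newline still there
```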

    print "$line**";

    Now in your output you're not printing a newline character. Since you've already chomped the input record separator out of $line (again assuming $/ eq "\n"), there won't be a newline printed between iterations. This can cause the output buffer to not immediately flush. This isn't necessarily related to your problem, but it could be leading to difficult-to-read output, unless there's something later on in the loop we don't know about that is outputting a newline.

    my @array = split(/\t/, $line);

    This is just a style point. We have a loop based on @{$array}, and a lexical within the loop named @array. With this sort of naming strategy you're not doing yourself any favors with respect to clarity. In particular, the inner lexical may be given a more descriptive name that avoids visual confusion with @{$array}.

    ...}

    The omitted code paraphrased by "..." may contain something relevant here. The question may be, "How do you know chomp isn't working?" Whatever comes next in the loop might help us, or may not. But the point is that nothing contained within your example would explain why chomp isn't doing what you expect it to be doing.

    My theory is that you may have one of the following issues:

    • $/ is set to something other than newline.
    • Your test to see if newline is being chomped could be flawed, and if that's the case, we might see the flaw in the omitted code that comprises the remainder of your loop.
    • You may have more than one trailing newline, with $/ set to "\n" instead of "".
    • There may be some other whitespace or non-printing character between two newline characters. chomp, in paragraph mode, will remove all trailing newlines. But if you have a trailing "\n\t\n", only the last "\n" is removed, since the first one isn't "trailing" (there's a tab after it).

    I would start by wrapping that loop in the following code:

    {
        local $/ = ""; # Set paragraph mode
        foreach my $line .............
            ....}
    }

    In other words, localize the special variable $/, and set it to paragraph mode to see if that resolves your issue. It's an easy test, and while it may not be the problem, it only takes a few seconds to check so that it can be eliminated as an issue. In Paragraph Mode chomp will remove all trailing newlines, even if there's more than one.
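A quick way to see the difference between the default and paragraph mode is a sketch like the one below. The three sample strings are invented for the test.

```perl
use strict;
use warnings;

my $normal    = "text\n\n\n";
my $paragraph = "text\n\n\n";
my $embedded  = "text\n\t\n";

chomp $normal;                   # default $/ = "\n": removes one newline only

{
    local $/ = "";               # paragraph mode, restored at end of block
    chomp $paragraph;            # removes all three trailing newlines
    chomp $embedded;             # removes only the final "\n"; "\n\t" stays
}

print length($normal), "\n";     # 6
print length($paragraph), "\n";  # 4
print length($embedded), "\n";   # 6
```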


    Dave

Re: chomp - reference
by ikegami (Patriarch) on Jun 03, 2010 at 23:43 UTC

    Perhaps the value doesn't end in a newline, or perhaps someone changed $/. You can check the values of $line and $/ using

    use Data::Dumper;
    {
        local $Data::Dumper::Useqq = 1;
        print(Dumper($var));
    }
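For example, with a hypothetical line that carries a Windows line ending, the Useqq setting makes the otherwise invisible characters show up as escapes. The sample string and the use of Terse are assumptions added for this sketch.

```perl
use strict;
use warnings;
use Data::Dumper;

my $line = "a\tb\r\n";           # hypothetical line with a Windows line ending

my $dump;
{
    local $Data::Dumper::Useqq = 1;   # show control characters as escapes
    local $Data::Dumper::Terse = 1;   # omit the "$VAR1 = " prefix
    $dump = Dumper($line);
}
print $dump;                     # the \t, \r and \n appear as visible escapes
```

The same call on $/ shows whether the separator is still "\n".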

    You might have better luck with the following:

    $line =~ s/\s+\z//;
    It removes all forms of whitespace from the end of the line (except NBSP U+00A0 in some circumstances).
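A short sketch of that substitution on a few invented inputs, including a "\r\n" ending that a plain chomp would only partly remove:

```perl
use strict;
use warnings;

# Hypothetical samples: Unix ending, Windows ending, mixed trailing whitespace.
my @lines = ("a\tb\n", "a\tb\r\n", "a\tb \t\n\n");

for my $line (@lines) {
    $line =~ s/\s+\z//;          # strips "\n", "\r\n", and trailing spaces/tabs
}

print join('|', map { join(',', split /\t/) } @lines), "\n";   # a,b|a,b|a,b
```

Note that this also strips trailing tabs, which may matter if empty trailing fields in tab-separated data are significant.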
Re: chomp - reference
by GrandFather (Saint) on Jun 04, 2010 at 00:52 UTC

    You could chomp @$array outside the loop, which has the same effect as the chomp inside the loop but makes it obvious that that is happening. $line is aliased to each element of @$array, so the contents of @$array are altered by the loop in any case.
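A minimal sketch of that approach, with made-up sample data and @fields as an illustrative name for the inner array:

```perl
use strict;
use warnings;

my $array = [ "a\tb\n", "c\td\n" ];   # hypothetical sample data

chomp @$array;                   # chomps every element in place, up front

for my $line (@$array) {
    my @fields = split /\t/, $line;
    print join(',', @fields), "\n";
}
```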

    True laziness is hard work
Re: chomp - reference
by Marshall (Canon) on Jun 03, 2010 at 23:23 UTC
    #!/usr/bin/perl -w
    use strict;

    my @a = ( "asdfadsf df\n", "qrwtqtre ab\n");
    my $ref = \@a;

    print "simple print\n";
    foreach my $line (@a)
    {
        print $line;
    }

    print "\nnow with chomp\n";
    foreach my $line (@$ref)
    {
        chomp $line;   ## this modifies @a
        print "$line**";
    }

    print "\nnow again simple print\n";
    foreach my $line (@a)
    {
        print $line;
    }
    __END__
    simple print
    asdfadsf df
    qrwtqtre ab

    now with chomp
    asdfadsf df**qrwtqtre ab**
    now again simple print
    asdfadsf dfqrwtqtre ab

    Looks like it works. Maybe there is some other piece of code that isn't shown? Perhaps try s/\s*$//; to get rid of ALL whitespace at the end of the line.
Re: chomp - reference
by sierpinski (Chaplain) on Jun 04, 2010 at 12:39 UTC
    This may be a wild stab, but I've seen it happen before. If you take a file created in Windows and transfer it to Unix, it can add extra newlines, and if you're only chomping once, you're only getting the last \n from your data. Have you tried printing your raw data before processing to verify what it looks like?

    It's a simple thing, but sometimes it's overlooked. I didn't catch that anyone had mentioned it, so I wanted to throw that out there.
      Thanks again for all the help! The lines ended in \r\n, so a substitution did the trick. Cheers, Michael
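The thread doesn't show which substitution was used. One possibility, sketched here on an invented sample line, is to strip an optional "\r" together with the "\n":

```perl
use strict;
use warnings;

my $line = "a\tb\r\n";           # hypothetical line from a Windows-created file

my $chomped = $line;
chomp $chomped;                  # removes "\n" only; the "\r" remains

(my $cleaned = $line) =~ s/\r?\n\z//;   # removes "\n" or "\r\n"

my @fields = split /\t/, $cleaned;
print scalar(@fields), " fields, last is '$fields[-1]'\n";   # 2 fields, last is 'b'
```

With only chomp, the last field would still carry the "\r", and printing "$line**" to a terminal would return the cursor to the start of the line, which can make it look as if chomp did nothing.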