in reply to Misunderstood array behavior

Thank you everyone for your suggestions. I need to clarify a little more.

wfsp and GrandFather: I apologize for misplacing the 'if' statement. I accidentally pasted it outside the inner loop, but in my code it is inside the inner loop. The condition works in every instance, except on the last item in either array.

GrandFather: I thought about using a hash, but I need to gather the files in order (by column) so that I can correctly order the second one. I cannot think of how to do that with a hash. It seems that a 2D array would be optimal. Do you have a suggestion on how to do it with a hash?

AnomolousMonk: By the comment about "eventually" reading the whole file, I mean that I will eventually read it into a 2D array during that loop, but I simplified it for the posting. The simplified version still shows the same behavior. I kept the loop to preserve the structure I will need later. Is there a better way to read in each row and column?

jethro: Thank you for suggesting Dumper; I was not aware of it. It also shows my problem perfectly. When it prints the last item in both arrays, it's all messed up:

#### BEGIN ####
$VAR215 = 'MS02-19196-A6-DCIS';
$VAR216 = 'MS02-19196-A6-INVASIVE';
$VAR217 = 'MS01-9167-A7-DCIS';
';AR218 = 'MS06-1878-D2-DCIS
#### END ####

That is exactly how it prints. Also, when it gets to the if condition, the condition fails. However, at that very moment, I can print the value in the debugger with "p $sample1"

AnomolousMonk suggested that it could be a problem with extra tab(s) at the end of the line, but I have double-checked that. This is really confusing to me. I also tried another file that is totally unrelated to what I'm doing, and it had the same behavior.

I would like to post the file, but it has 218 columns and I don't see a way to upload it.

Thanks for your help.

Re^2: Misunderstood array behavior
by jethro (Monsignor) on Sep 20, 2008 at 16:18 UTC
    $VAR217 = 'MS01-9167-A7-DCIS';
    ';AR218 = 'MS06-1878-D2-DCIS

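    If this is exactly what you get from Data::Dumper, then there is a carriage return at the end of the line (hex 0D). The terminal moves the cursor back to the start of the line when it prints that character, so the closing '; overwrites the first two characters of $VAR218.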

    It might mean that you are using an MS-DOS file on Unix and your chomp removes only the line feed, not the carriage return. See the man page of chomp and its dependence on $/. Setting $/ to "\r\n" would correct that, but then real Unix files would not work. If you need both file types to work, use a regex instead of chomp.
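
    A minimal sketch of the regex approach, assuming tab-separated lines like the ones in the original post (the sample data and variable names here are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical sample lines: one with a DOS ending, one with a Unix ending.
my @raw = (
    "MS01-9167-A7-DCIS\tMS06-1878-D2-DCIS\r\n",
    "MS02-19196-A6-DCIS\tMS02-19196-A6-INVASIVE\n",
);

my @rows;
for my $line (@raw) {
    # Remove an optional carriage return plus the line feed at the end of the string.
    (my $clean = $line) =~ s/\r?\n\z//;
    push @rows, [ split /\t/, $clean ];
}

# The last field now compares equal regardless of the original line ending.
print "match\n" if $rows[0][1] eq 'MS06-1878-D2-DCIS';
```

    The substitution works the same on both kinds of file, so $/ can stay at its default.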

    About GrandFather's suggestion: Is the ordering of both files important to the result? If not, you might put the second file into a hash instead of the first. But if you want helpful answers to that question, you might open a new thread and tell us exactly what you want to do with those two files.
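
    One possible shape for that idea, purely as a sketch: keep the first file's column order in an array and look each name up in a hash built from the second file. The sample names come from the post, but the values and the variable names @order, %second and @reordered are assumptions, since the actual task has not been described.

```perl
use strict;
use warnings;

# Hypothetical data: column order taken from file 1.
my @order = ('MS02-19196-A6-DCIS', 'MS01-9167-A7-DCIS', 'MS06-1878-D2-DCIS');

# Hypothetical data: values from file 2, keyed by sample name, in any order.
my %second = (
    'MS06-1878-D2-DCIS'  => 'c',
    'MS02-19196-A6-DCIS' => 'a',
    'MS01-9167-A7-DCIS'  => 'b',
);

# Walk file 1's order and pull the matching entry out of file 2's hash,
# skipping names that file 2 does not contain.
my @reordered = map { $second{$_} } grep { exists $second{$_} } @order;

print "@reordered\n";
```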

      For Pete's sake...I never would have suspected that because it seemed to be stomping on memory. I've dealt with these different line endings before, but never ran into that behavior.

      Thanks a ton to everyone who helped. I have to mention that this has been the most pleasant forum I've ever worked with. Thanks!

      By the way, I'm using tchomp (http://cpan.uwinnipeg.ca/htdocs/Text-Chomp/Text/Chomp.pm.html) to solve the problem. Do you see any reason not to always use tchomp in place of chomp?

        If your separator isn't a standard newline at all, tchomp would fail. I posted an example using '>' recently.
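
        For comparison, a small sketch of how plain chomp follows $/ when the record separator is '>' (the record contents are invented, and the in-memory filehandle just stands in for a real file):

```perl
use strict;
use warnings;

# Hypothetical records separated by '>' instead of a newline.
my $data = "seq1>seq2>seq3>";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = '>';            # read up to and including each '>'
    while (my $rec = <$fh>) {
        chomp $rec;            # chomp removes the current value of $/, here '>'
        push @records, $rec;
    }
}
close $fh;

print join(',', @records), "\n";
```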

Re^2: Misunderstood array behavior
by tinita (Parson) on Sep 21, 2008 at 10:44 UTC
    #### BEGIN ####
    $VAR215 = 'MS02-19196-A6-DCIS';
    $VAR216 = 'MS02-19196-A6-INVASIVE';
    $VAR217 = 'MS01-9167-A7-DCIS';
    ';AR218 = 'MS06-1878-D2-DCIS
    #### END ####
    This is why I always recommend $Data::Dumper::Useqq in such situations. Putting quotes or some other kind of delimiter around the variables you want to debug is of course a good thing, but if you're dealing with *lines* and have a problem, just use
    use Data::Dumper;
    $Data::Dumper::Useqq = 1; # shows all non-printable characters
    print Dumper \@lines;
    (I also prefer to dump a reference, this avoids the big mess of many $VAR314159...)
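    As a self-contained illustration with made-up data, the following dumps an array whose last element still carries a carriage return. With Useqq enabled, the carriage return appears as a visible \r inside double quotes instead of moving the cursor:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;    # print control characters as escapes such as \r

# Hypothetical lines; the second still has a trailing carriage return.
my @lines = ("MS01-9167-A7-DCIS", "MS06-1878-D2-DCIS\r");

# Dump a reference so the whole array appears under a single $VAR1.
my $dump = Dumper(\@lines);
print $dump;
```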
    edit: I even have a useful mapping for vim on my homenode which lets you debug with only very few keystrokes. (for emacs it looks a bit more complicated)
      Thank you. That is a valuable tip.

      Do you see any reason not to always use tchomp in place of chomp?

        AZed did, when you asked the same question above.