faozhi:

Others are assisting you with the question you asked. I'm going to fly off on a couple of tangents, instead, about basic coding practices.

Commenting

Generally, if you write your code clearly, the need for comments is greatly reduced. For example, in this section of your code, the variable names are clearly file names, so the comment is redundant.

#declare all filenames $filename1 = 'one.txt'; $filename2 = 'two.txt'; $filename3 = 'three.txt';

In this section, the comment is pretty much a duplication of what the code says, so it's not helpful.

#open text file 1 open (FILE1, $filename1) or die "Unable to open $filename1 because $!\ +n";

If I felt a comment necessary, I would instead have stated in my comment the effect of what I was doing, like this:

# Store the contents of the first file into $hash{col1_col2} open (FILE1, $filename1) or die "Unable to open $filename1 because $!\ +n"; while ($line = <FILE1>) { chomp ($line); ($chrX, $chrpos, $value1, $value2) = split (/\t/, $line); $key1 = join ("_", $chrX, $chrpos); $hash{$key1}++; }; close FILE1;

Indentation

The use of indentation is supposed to clarify the structure of the code, so you can see which statements are bundled together, and to make it simple to tell which code is associated with which control-flow structure. By having all your1 control-flow statements aligned to the left margin, you make it more difficult to see the logical structure of the program. You should change from this:

while ($line = <FILE2>) { chomp ($line); ($chrX, $chrpos, $value11, $value22) = split (/\t/, $line); $key2 = join ("_", $chrX, $chrpos); if (exists $hash{$key2} > 0) { $hash{key2} = $value11 + $value1; $hash{key2} = $value22 + $value2; $hash{key2}++ } };

to this:

while ($line = <FILE2>) { chomp ($line); ($chrX, $chrpos, $value11, $value22) = split (/\t/, $line); $key2 = join ("_", $chrX, $chrpos); if (exists $hash{$key2} > 0) { $hash{key2} = $value11 + $value1; $hash{key2} = $value22 + $value2; $hash{key2}++ } };

Obviously, it's not a problem in this particular program, as you don't have anything complicated going on. But when you have a page full of code with a lot of flow-control going on, you're going to find it difficult to maintain your code.

Semicolons

While not harmful, you're putting extra semicolons in your code (specifically at the end of your while loops. It doesn't hurt anything in this case, but since they're unexpected, it *does* make the code slightly harder to read.

Subroutines

When you start writing the same code repeatedly, you should start thinking about how you can use subroutines to simplify your task. For example, this code:

open (FILE2, $filename2) or die "Unable to open $filename2 because $!\ +n"; while ($line = <FILE2>) { chomp ($line); ($chrX, $chrpos, $value11, $value22) = split (/\t/, $line); $key2 = join ("_", $chrX, $chrpos); if (exists $hash{$key2} > 0) { $hash{key2} = $value11 + $value1; $hash{key2} = $value22 + $value2; $hash{key2}++ } }; close FILE2;

is nearly identical to the code you use to process file 3. So you should think about using a subroutine to process the files. For example, you could create a subroutine like this:

sub process_file { my $filename = shift or die "Missing filename!"; open (FILE, $filename) or die "Unable to open $filename because $! +\n"; while ($line = <FILE2>) { chomp ($line); ($chrX, $chrpos, $value11, $value22) = split (/\t/, $line); $key2 = join ("_", $chrX, $chrpos); if (exists $hash{$key2} > 0) { $hash{key2} = $value11 + $value1; $hash{key2} = $value22 + $value2; $hash{key2}++ } } close FILE; }

then, in your code, you can process your second and third files like2:

process_file($filename2); process_file($filename3);

Due to correctness issues in your code, I can't tell whether it's possible or not, but frequently in programs like this, you can use the same subroutine for your first file as well--the if statements will degenerate to a single case, and only be used for the successive files. Once you clean up the other bits of your code, you might be able to take advantage of it.

use strict; use warnings;

I haven't checked to see whether the strict or warnings modules would help in this case or not, but it would be to your advantage to put them into your program before anything else. They will catch many programming errors for you. You may even find "use diagnostics" helpful. (I generally only put in "use diagnostics" when I don't understand what the error message is trying to tell me.)

I hope you find some of this useful.

...roboticus

Updates: (marked by superscripts in the above text)

  1. Changed 'you' to 'your'
  2. In the next code snippet, I corrected the second line, changing 'process-file' to 'process_file'

In reply to Re: How to add values of hash by reading from different text files by roboticus
in thread How to add values of hash by reading from different text files by faozhi

Title:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":



  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, details, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, summary, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.