in reply to regular expessions question: (replacing words)

Less example data would have been better, less scrolling to do ;-)

Also, please use 'use warnings' and 'use strict'.

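One error I can see is in this line: @storage = $second_num; It assigns a single scalar to the whole array, so every time it runs it throws away whatever was in @storage and leaves exactly one element. In short, the line does nothing sensible, even prevents the array from getting filled, and should be removed completely.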
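To show why that assignment keeps the array from filling up, here is a minimal sketch. The input values are made up, and I am assuming the line sat inside the loop that reads the numbers; only @storage and $second_num are names from the quoted line.

```perl
use strict;
use warnings;

# Hypothetical values standing in for the numbers read from the file.
my @numbers = (11, 22, 33);

my @storage;
for my $second_num (@numbers) {
    @storage = $second_num;       # replaces the whole array with one element
}
my $assigned_count = @storage;    # array in scalar context gives its length

my @pushed;
for my $second_num (@numbers) {
    push @pushed, $second_num;    # appends, so the array grows
}
my $pushed_count = @pushed;

print "assignment: $assigned_count element(s), push: $pushed_count element(s)\n";
```

With the assignment only the last value survives; with push all three are kept.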

Another problem: my ($first, $second) = split(/^\s{1}/, $organized); Anchoring a split pattern with '^' would only make sense if you wanted to split a multi-line string into separate lines (you also would need the regex modifier m). But since you want to split one line, the pattern can only match at the very start of the string, so unless the line begins with whitespace nothing is split at all. Also "\s{1}" is identical to "\s". So you could just write "split(/\s/, ..."
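A small sketch of the difference, using a made-up sample line (the real data may look different). It compares the anchored pattern with a plain /\s/ and with the special ' ' pattern, which splits on runs of whitespace.

```perl
use strict;
use warnings;

my $organized = "5 suf 6";    # made-up sample line

# Anchored pattern: can only match at the very start of the string, and
# this line does not start with whitespace, so nothing is split.
my @anchored = split /^\s{1}/, $organized;

# Unanchored: splits at every single whitespace character.
my @plain = split /\s/, $organized;

# The special ' ' pattern splits on runs of whitespace and ignores
# leading whitespace.
my @awk_style = split ' ', $organized;

printf "anchored: %d field(s), plain: %d field(s), ' ': %d field(s)\n",
    scalar @anchored, scalar @plain, scalar @awk_style;
```

The anchored version returns the whole line as one field, so $second would end up undefined.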

@final_storage seems superfluous, but in case you need it, copying the array would be as easy as @final_storage= @storage;

Finally, all this stuff you are doing step by step could be done by a simple regex if it is guaranteed that all elements except the first one have a space before them:

while (my $organized = <DATA2>) {
    $organized =~ s/(\s)\w+/$1z/g;
    print $organized;
}

UPDATE: Thanks to jwkrahn for noticing that \s in the replacement doesn't work. Changed to use parentheses and $1.

Re^2: regular expessions question: (replacing words)
by jwkrahn (Abbot) on Sep 27, 2010 at 14:12 UTC
    $organized=~s/\s\w+/\sz/g;

    You are, for example, replacing the string " something" with the string "sz" instead of " z".
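A quick way to see this, with a made-up sample string: the replacement side of s/// is treated like a double-quoted string, where "\s" is not a whitespace class but just the letter "s" (perl warns about the unrecognized escape under 'use warnings'). Capturing the whitespace and putting it back with $1 avoids that.

```perl
use strict;
use warnings;

my $line = "5 something else";    # made-up sample

# "\s" in the replacement is just a literal "s"; perl warns about it.
(my $broken = $line) =~ s/\s\w+/\sz/g;

# Capture the whitespace and put it back with $1.
(my $fixed = $line) =~ s/(\s)\w+/$1z/g;

print "broken: '$broken'\n";
print "fixed:  '$fixed'\n";
```

The first version glues everything together as '5szsz', the second gives '5 z z'.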

Re^2: regular expessions question: (replacing words)
by $new_guy (Acolyte) on Sep 27, 2010 at 12:36 UTC
    Hi Jethro, thanks a lot for the useful reply. I will follow your advice and also try the script. I also just came up with this script and it worked. Any comments on it? (I would really appreciate it!!)
    #!usr/bin/perl

    my $FILENAME4 = "organized.txt";
    open(DATA2, $FILENAME4);

    #remove all previous re-organized files
    my $remove_reorganized = "re-organized.txt";
    if (unlink($remove_reorganized) == 1) {
        print "Existing \"re-organized.txt\" file was removed\n";
    }

    #now make a file for the output
    my $outputfile = "re-organized.txt";
    if (! open(POS, ">>$outputfile") ) {
        print "Cannot open file \"$outputfile\" to write to!!\n\n";
        exit;
    }

    while (my $organized = <DATA2>) {
        #do some re-organizing
        #sort out the group numbers first
        $organized =~ s/(\w+)[^(^\d+)(\s)]/z/g;
        my $organized2 = $organized;
        $organized2 =~ s/z(\d+)/z/g;
        print POS $organized2;
    }
    Thanks, $new_guy
      $organized =~ s/(\w+)[^(^\d+)(\s)]/z/g;

      In a regular expression [ ... ] is a character class, i.e. a set of characters. What you told perl to look for is a word followed by ONE character that is not a '(', ')', '^' or '+' and neither a digit nor a whitespace character.

      Since the first word in your lines seems to be a single-digit number (at least in your sample data), it is just coincidence that it isn't replaced. Any word of length 1 will not be replaced. Also, any word (aka element) ending in a digit or one of the other characters above would not be replaced as a whole.

      In short, if these lines work for you, it is probably just a coincidence.

      Maybe you should use more variable test data to check for edge cases, for example try:

      5 suf 6 va7 7dra de) e+f ed ed 5z5 nu3 b +4 s 5 + 33 44 55 z5 zb zzz zb z5 4zz
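To see what the pattern does word by word, here is a small sketch that runs it over a few of the words from the test line above. The helper name class_replace is mine and not part of the original script.

```perl
use strict;
use warnings;

# The substitution from the script, applied to a single word.
# class_replace is a hypothetical helper name used only for this demo.
sub class_replace {
    my ($word) = @_;
    $word =~ s/(\w+)[^(^\d+)(\s)]/z/g;
    return $word;
}

# A few words taken from the test line above.
for my $word (qw(b ed va7 e+f)) {
    printf "%-4s -> %s\n", $word, class_replace($word);
}
```

Here b and e+f come back untouched, ed becomes z, and va7 becomes z7, because the character class always consumes exactly one extra character after the word part.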

      PS: Please reread my first post; I had to correct an error in my regex.

      my $FILENAME4 = "organized.txt";
      open(DATA2, $FILENAME4);

      You should always verify that the file opened correctly.

      my $FILENAME4 = "organized.txt";
      open DATA2, '<', $FILENAME4 or die "Cannot open '$FILENAME4' $!";

      #remove all previous re-organized files
      my $remove_reorganized = "re-organized.txt";
      if (unlink($remove_reorganized) == 1) {
          print "Existing \"re-organized.txt\" file was removed\n";
      }

      #now make a file for the output
      my $outputfile = "re-organized.txt";
      if (! open(POS, ">>$outputfile") ) {
          print "Cannot open file \"$outputfile\" to write to!!\n\n";
          exit;
      }

      If you open the file for output ('>') instead of append ('>>'), you don't have to delete the file first: opening for output truncates an existing file as a side effect.

      # make a file for the output
      my $outputfile = "re-organized.txt";
      open POS, '>', $outputfile or die "Cannot open file '$outputfile' to write to because: $!";
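A minimal sketch of the truncation behaviour, assuming a scratch file in a temporary directory rather than the real "re-organized.txt". The helpers write_lines and line_count are made up for the demo, and it uses lexical filehandles, which is my choice and not something the original script does.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch location so the demo never touches a real file.
my $dir        = tempdir(CLEANUP => 1);
my $outputfile = "$dir/re-organized.txt";

# Hypothetical helper: open in the given mode and write one line per item.
sub write_lines {
    my ($mode, @lines) = @_;
    open my $out, $mode, $outputfile
        or die "Cannot open file '$outputfile' to write to because: $!";
    print {$out} "$_\n" for @lines;
    close $out or die "Cannot close '$outputfile': $!";
}

# Hypothetical helper: count the lines currently in the file.
sub line_count {
    open my $in, '<', $outputfile or die "Cannot open '$outputfile': $!";
    my @lines = <$in>;
    close $in;
    return scalar @lines;
}

write_lines('>', 'old 1', 'old 2', 'old 3');    # pretend an earlier run left data
write_lines('>', 'new 1');                      # '>' empties the file first
my $after_write = line_count();

write_lines('>>', 'new 2');                     # '>>' keeps what is there
my $after_append = line_count();

print "after '>': $after_write line(s), after '>>': $after_append line(s)\n";
```

After the second '>' open only the new line is left, so no unlink is needed before writing.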

      $organized =~ s/(\w+)[^(^\d+)(\s)]/z/g;

      You are using a regular expression that says: match one or more word characters followed by a single character that is not '(' or '^' or any digit or '+' or ')' or any whitespace, which does not make sense. Could it be that you do not understand how character classes work?

      Any comments on it?
      Yeah, use an indentation style that makes sense. Indentation is there for *humans* only. It isn't some sort of magical lube that makes your program run faster, where all that matters is to have some of it.
        Dear Perl monks,

        I have a follow-up question. How do I select two columns at random and count ONLY the z's common to both columns?

        I would like to repeat this, say, 10 times and finally get the mean of all counts (i.e. of the 10 random selections).

        It gets more complicated. In the next round of random selection, I want to pick 3 columns and count the z's common to all of them, and repeat this ten times. Then do the same for 4 columns and so on, until say n = 18 columns, getting the mean at the end of each round. At the moment I have no idea how to go about it. A hint would be really appreciated.

        Thanks