angela2 has asked for the wisdom of the Perl Monks concerning the following question:

So what I hope to manage to do is:

1. read in a file

2. find some entries

3. use those entries to proceed

So first I'm changing to the directory where those files are. They are in a series of folders and subfolders, and I am looping (rest of script not shown) to find them all.

Then I'm checking if this file exists and, if it does, I'm opening it for read.

Then, what I want to do but don't know how to implement, is find the 3 entries that exist in this file. These 3 entries are space-delimited, and I have already initialized them as variables earlier in my script. My files look like this, for example:

FOLDING SHAPE OXYGEN

So I want to match those three entries (folding, shape, oxygen) using split. These three entries already correspond to the variables $pattern, $cyclo and $group. In a different file, these 3 entries might have different values, which is why I can't match the actual words "folding" and so on.

chdir "$results$filepath" or print "cannot chdir to $results$filepath! $!";
if (-e "test.txt") {
    open (my $test, '<', "test.txt") or print "Can't open file: $!";
    while (my $line = <$test>) {
        chomp $line;
        split(" ", $line);
        print $line;
    }
}
Obviously this is wrong. I get the message I think I need to find a way to say "split those entries and put them in those three variables", but I don't know how. Can anyone suggest how to proceed?

Re: split file and put contents in variables?
by choroba (Cardinal) on Feb 02, 2016 at 15:14 UTC
    Don't just print a message if the file can't be opened. Either stop the whole program (die) or skip the reading loop.

    open my $test, '<', 'test.txt' or die "Can't open file: $!";
    while (my $line = <$test>) {
        # chomp not needed with split on ' '.
        my ($pattern, $cyclo, $group) = split ' ', $line, 4;
        print $line if 'FOLDING' eq $pattern
                    && 'SHAPE'   eq $cyclo
                    && 'OXYGEN'  eq $group;
    }

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re: split file and put contents in variables?
by toolic (Bishop) on Feb 02, 2016 at 15:12 UTC
    You can either split into an array variable or 3 separate variables:
    my @entries = split /\s+/, $line;
    my ($x, $y, $z) = split /\s+/, $line;
Re: split file and put contents in variables?
by Corion (Patriarch) on Feb 02, 2016 at 15:10 UTC

    Why is your code "obviously wrong"?

    If you let it, Perl will tell you some things that seem fishy about your code. Add use warnings; as the top line of your script.

    split is a function, so you should assign its return value to something.
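A minimal sketch of that last point, assuming a line shaped like the example in the question. The sample string and the helper name parse_entries are made up for illustration. split returns a list, so calling it in void context throws the fields away, while assigning the result keeps them:

```perl
use strict;
use warnings;

# Hypothetical helper: split one line into its three fields.
sub parse_entries {
    my ($line) = @_;
    chomp $line;
    # split returns a list; assign it, otherwise the fields are discarded.
    my ($pattern, $cyclo, $group) = split /\s+/, $line;
    return ($pattern, $cyclo, $group);
}

my @fields = parse_entries("FOLDING SHAPE OXYGEN\n");
print join(', ', @fields), "\n";    # prints: FOLDING, SHAPE, OXYGEN
```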

Re: split file and put contents in variables?
by angela2 (Sexton) on Feb 02, 2016 at 15:26 UTC

    Hi all, thanks for your effort! I managed to do this, which I think does what I want:

    if (-e "test.txt") {
        open (my $test, '<', "test.txt") or print "Can't open file: $!";
        print "opening file \n";
        while ( my $line = <$test> ) {
            my ($a, $b, $c) = split / /, $line;
            print "$a, $b, $c \n";
        }
    }

    What do you think? I've replaced my variable names with $a, $b, $c for testing, just to be quicker.

    But I found that I have to match 5 spaces, so I was wondering if I can do that with a /d{5}/ or something like that, instead of matching on a pattern that needs the spacebar pressed 5 times and doesn't look as nice?

      The split pattern behaves mostly like a regular expression, with the differences well documented in split. There is no such thing as a "d" meta-character-class, so /d{5}/ matches five literal "d" characters. With proper metasymbol escaping, /\d{5}/ is valid, but \d matches digits. To match five whitespace characters, use /\s{5}/ (or / {5}/ for five literal spaces).

      And any quantifier that is valid for regular expressions would be valid for the pattern used in split. If you wish, you might use +, {0,5}, {5,}, and so on.


      Dave

      What do you think?
      I think you should use the code I showed :)
      my ($x, $y, $z) = split /\s+/, $line;
      • \s+ grabs a variable number of spaces: 1, 5, whatever. This is documented in the link I showed you: split.
      • $a and $b have special meaning to sort. Don't use them here. You should probably use more meaningful names as well.
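A small sketch of both points, assuming a line with five spaces between entries as described earlier in the thread (the sample data is made up). It uses the question's own variable names instead of $a and $b, and compares a single-space regex with \s+:

```perl
use strict;
use warnings;

# Made-up sample line with five spaces between entries.
my $line = "FOLDING     SHAPE     OXYGEN\n";
chomp $line;

# A single-space regex splits on every space, leaving empty fields between entries.
my @by_single_space = split / /, $line;

# \s+ treats each run of whitespace as one separator.
my ($pattern, $cyclo, $group) = split /\s+/, $line;

print scalar(@by_single_space), " fields with / /\n";
print "$pattern, $cyclo, $group\n";
```

With / /, the three words end up separated by empty strings, which is why three variables on the left-hand side do not receive the three words.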

        Thanks toolic :)

        I have one more question, I just remembered actually that the first entry in my file, namely FOLDING, is made up from two variables, as in I have assigned it to two variables, $pattern and $match, used as $pattern$match. I know it looks weird but I needed to do it that way.

        So if in a different file my FOLDING pattern is "W" and matches a lipid, I would have an entry as "Wlipid". Hopefully that makes sense?

        So my question is how do I match this now? Do I just match it as my ($w$x, $y, $z)? Or does that make no sense?
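A list like my ($w$x, $y, $z) is not valid Perl, because each slot in the list has to be a single variable. One possible approach is sketched below, assuming $pattern and $match are already set earlier in the script and the goal is to check whether a line's first entry equals their concatenation. The helper name and the sample values are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: true if the line's first field equals $pattern . $match.
sub first_field_matches {
    my ($line, $pattern, $match) = @_;
    # Read the combined entry into one variable, then compare it as a whole.
    my ($first, $cyclo, $group) = split ' ', $line;
    return defined $first && $first eq "$pattern$match";
}

print first_field_matches("Wlipid SHAPE OXYGEN\n", 'W', 'lipid')
    ? "match\n"
    : "no match\n";
```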

      The single-space string ' ' (not the regex / /!) that I suggested is special in split, as documented. It behaves like /\s+/, except that leading whitespace is skipped as well, and it's shorter and more readable.
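A short sketch of how the two forms differ, using a made-up line with leading whitespace and a trailing newline:

```perl
use strict;
use warnings;

# Made-up line with leading whitespace and a trailing newline.
my $line = "  FOLDING  SHAPE  OXYGEN\n";

my @with_string = split ' ',   $line;   # leading whitespace is skipped
my @with_regex  = split /\s+/, $line;   # leading whitespace yields an empty first field

print scalar(@with_string), "\n";   # prints 3
print scalar(@with_regex),  "\n";   # prints 4
```

In both cases the trailing newline counts as a separator and the empty field after it is dropped, which is why chomp is not needed here.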

      Moreover, don't get into the bad habit of using "$a" for anything other than sort (q.v.), and even more emphatically, do use meaningful names for your variables.

      That will cost you a very small bit of extra effort when writing the program... and save you or some future maintainer a huge PITA when trying to understand or modify your program some-when in the distant future... like, say, next week.


      Come, let us reason together: Spirit of the Monastery