fasoli has asked for the wisdom of the Perl Monks concerning the following question:

Hello wise monks,

I'm new here and I'm (captain obvious) new to Perl. Basically I'm a bit confused even about how to describe what I want to do. I obviously don't expect you to hand out ready code for me but more like to help me understand where to look for reading.

So, I am preparing** a bunch of files (let's say 50) by opening each one, writing to it and closing it. However, I need to be able to tell which file is which from their contents, which are just atomic coordinates, so not much help from that. The contents are arranged in columns and the first column is the name of my (chemical) molecule, let's say 1ACx, which goes down for 250 lines. Now I want this x (from 1ACx) to actually be 1 for the whole first file that I'm creating (as in 250 entries of 1AC1), 2 for the second file and so on. So in the end I want to end up with 50 different files, where their first column would go from 1AC1 to 1AC50, but not incrementing within any one file.

**while I am in a foreach loop, where I am fishing out data from all the files in the foreach loop and saving them in my files.

What I've managed to do so far is a few syntax errors and, mostly, a counter that counts all my atoms, so in each file it goes from 1AC1 to 1AC250. Another thing that I managed to do was print all the lines of the files where I am fishing the data out from.

I honestly can't wrap my head around the concept of how to do it and what it's actually called :( I would be grateful if you could explain it a bit, as in, for example, should I be reading on how to make counters for elements in arrays?

Thank you so much in advance.

Replies are listed 'Best First'.
Re: counter of files? something else?
by kennethk (Abbot) on Apr 22, 2015 at 18:24 UTC
    What you need to do here is have your counter increment outside the scope of your main for loop. If I did this, I might code it like:
    for my $file_no (1 .. 50) {
        my $filetag = "1AC$file_no";
        open my $fh, '>', "$filetag.mol" or die "Open fail for $filetag: $!";
        my @lines = generate_lines($file_no);
        for my $line (@lines) {
            # write to the file just opened, not to STDOUT
            print $fh "$filetag $line\n";
        }
    }
    If this is off point, try explaining how this algorithm (in pseudocode if necessary) missed. Reading Foreach Loops in perlsyn might be informative.

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: counter of files? something else?
by kcott (Archbishop) on Apr 23, 2015 at 01:06 UTC

    G'day fasoli,

    Welcome to the Monastery.

    I think your first port of call should be perlintro — Perl introduction for beginners. It's made up of a dozen very short sections which just give a brief overview of their topic; however, each also typically has multiple links to more detailed information. For the detailed information, you may want to arm yourself with the Perl Glossary.

    Next, take a look at the online perl manpage. As you can see, Perl has a lot of documentation: I'd recommend just aiming to get a feel for where information can be found. Some suggestions to get you started. I've only included what I thought to be relevant based on your question; browse the others at your leisure, if they're of interest.

    Overview
    • perlintro — refer back to it for those links to detailed information
    • perlrun — just remember: shebang line, command switches, environment variables
    Tutorials
    • perlreftut — this is short and you will eventually need to know it - recommend you read it quickly now and refer back as needed
    • perldsc — the Data Structures Cookbook - decide how you'll structure your data - this will tell you how to create, access and modify it
    • perllol — multidimensional arrays - refer as needed
    • perlfaq — the Perl FAQ is broken up into nine separate FAQs - this page has links to them, it also lists all the questions - suggest you read the questions on this page, try to answer them, then compare with the FAQ answer
    Reference Manual

    You'll probably only use a handful or so of these with any frequency. Let perlintro be your guide as to what to read first. Here's two of particular note.

    • perlopentut — a tutorial on opening files - probably should be in the Tutorials section
    • perlfunc [Caveat!] — this page is very large; I've had problems in various browsers; I avoid it (YMMV) - instead, try the Functions link in the Reference section of the sidebar - an index of links to functions and each function gets its own [nice and small] webpage.
    Miscellaneous
    • perlbook — more sources of information
    • perldoc — all this documentation from the command line

    Your question talks about columns. Are you working with a spreadsheet of some description? If so, I'd strongly recommend Text::CSV.

    -- Ken

Re: counter of files? something else?
by Marshall (Canon) on Apr 23, 2015 at 02:38 UTC
    I don't know where to start.

    If you can show us a simple example, maybe with 10-20 carefully selected lines? That would be helpful. Very helpful. Your code so far would also be helpful, very helpful.

    Enclose the data and program with

    <c> and </c> tags
Re: counter of files? something else?
by soonix (Chancellor) on Apr 23, 2015 at 07:49 UTC

    one of the monks here has a signature that says in essence: the best way to start such a task is to think how you would do it by hand.

    So, try to explain the task to a (possibly pre-school) kid, who can count, read/write letters and digits:

    how should he/she go about doing this for you with 50 sheets of paper instead of files?
    To make the task easier, the paper is appropriately sized :-)

Re: counter of files? something else?
by fasoli (Beadle) on Apr 23, 2015 at 11:50 UTC

    Thank you all for your feedback. I really appreciate it. Let me be a bit more clear (hopefully..) about what I want to do.

    Here is a part of what I've written so far

    #count the files in my foreach loop (I'm in a foreach loop,
    #not shown, that loops through @files
    my $count = @files;
    my @count = (); # I'l explain what I want to do with this - Note 1.
    push (@count, $count);
    print $count;
    print "\n";
    #now this is where I am creating my new files, getting data from the
    #files in the foreach loop
    my $pdb;
    open (my $pdb, '>', "$filename.pdb");
    my $p;
    my $w;
    my $title = "$filename";
    my $bx = 50.000;
    my $by = 50.000;
    my $bz = 50.000;
    printf $pdb "%s$title";
    printf $pdb "\n";
    printf $pdb ("%5d",$atomcount);
    printf $pdb "\n";
    for ($p=1; $p<=$atomcount; $p++) {
        my $carboxyl = "@count[$w]"; #here is all that I want to do-Note 2
        printf $pdb("%5d%-5s%5s%5d%8.3f%8.3f%8.3f",1,$carboxyl,@atom[$p],$p,@x[$p],@y[$p],@z[$p]);
        printf $pdb "\n";
    }
    printf $pdb("%10.5f%10.5f%10.5f",$bx,$by,$bz);
    printf $pdb "\n";
    close $pdb/;

    Note 1: My idea is that I'll first count the files I'm looping through (they're 40) and then I'll be able to tell perl to show me the first file (that is file 1), then the second file (file 2), all the way until 40. That's why I'm putting $count in an array, in order to tell perl "ok now show me the first time you find this type of file; then the second time and so on". That way... (takes us to note 2)

    Note 2: ...I will be able to get this $carboxyl entry unique every time. In the first pdb file it will be equal to 1, then 2, up to 40. So I will never have two same $carboxyl values in any of my pdb files. That way I will be able to distinguish between them by looking at the contents: the first one will have $carboxyl = 1 all the way down (250 lines), the second one will have $carboxyl = 2 all the time and so on.

    The problem is that I'm doing it wrong because obviously I'm just printing the actual count number. Also I think the approach is wrong, because it might be better to actually assign numbers to $carboxyl based on how many pdb files I open and create, that would make more sense, right? Rather than how many files I loop through to get data FOR the pdb file I'll create later.

    Am I at least getting close in terms of understanding this? I know I have to do reading - I already have two books (Learning Perl and Beginning Perl) and opened all the links you've recommended (in order to read, not just to close them back). I'm panicking though because this will take me days to correct - been trying since yesterday for a simple counter thing :'(

      • Does @files change during the loop? otherwise, you'll end up with $count having always the same value.
      • if you are inside a loop, the my variables are only for that one iteration. Is this what you want?
      • $count is not a very good variable name
      I think you want:
      • before the foreach loop: my $file_number = 0;
      • in the loop: $file_number += 1;
      Plus you seem to get the sigils wrong - see the first part of perldata:
      @whatever is an array, a single element of that array is $whatever[123]. @whatever[123] is an array slice.
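      A minimal sketch of the sigil rules just described, plus the reason $count never changed in the posted code. The file names are made up for illustration; nothing here comes from the original script.

```perl
use strict;
use warnings;

# Hypothetical file names, only for illustration.
my @files = qw(a.gro b.gro c.gro);

my $how_many = @files;        # array in scalar context: the number of elements
my $second   = $files[1];     # a single element takes the $ sigil
my @pair     = @files[0, 1];  # a slice is a list of elements, so it keeps the @ sigil

print "count=$how_many second=$second pair=@pair\n";
# prints: count=3 second=b.gro pair=a.gro b.gro
```

      Because @files does not change while looping over it, "my $count = @files;" yields the same total on every iteration, which is why it cannot serve as a per-file number.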

        - No, @files doesn't change through the loop.

        - That's a good point. I'll try it differently.

        - Ok, thank you for the suggestion.

        I'll try the two final bullet points!

        As for arrays/elements/slices: thank you, I normally get them right, I mixed it up a bit when trying to explain myself.

      I am not sure why you are worried so much about counting files. If you have an array of files, perl will happily step through them, one at a time, in order, using for. There is lots of stuff here, though, that is not accounted for. You are not opening any files to read, and $atomcount is a mystery. And the @atom array (probably better used as $atom[$p]) is also unmentioned before now.

      But anyway, here is a go at wrangling your code a little. It probably won't compile, but may give you some useful hints. If you can post a more complete listing of your code, we may be able to help more

      # $atomcount, @atom, @x, @y and @z are assumed to be set earlier in your script
      my @files = qw(file01 file02 file03 file04 file05);
      my $carboxyl = 0; # you may want to start at 1 instead
      for my $filename (@files) {
          open my $pdb, '>', "$filename.pdb" or die "Cannot open $filename.pdb: $!";
          my $bx = 50; # box dimensions, printed on the last line
          my $by = 50;
          my $bz = 50;
          print $pdb "\t$filename\n";
          printf $pdb "%5d\n", $atomcount;
          for my $p (1 .. $atomcount) { # Perl likes to help
              # single array elements take the $ sigil: $atom[$p], not @atom[$p]
              printf $pdb "%5d%-5s%5s%5d%8.3f%8.3f%8.3f\n",
                  1, $carboxyl, $atom[$p], $p, $x[$p], $y[$p], $z[$p];
          }
          $carboxyl++; # same value for every line of one file, next value for the next file
          printf $pdb "%10.5f%10.5f%10.5f\n", $bx, $by, $bz;
          close $pdb;
      }

      Update

      is 'pdb' http://en.wikipedia.org/wiki/Protein_Data_Bank_(file_format) ?

      Cheers,
      R.

      Pereant, qui ante nos nostra dixerunt!

        I want the files counted as I think of it as the only way to get different $carboxyl entries. All I want is just a number, from 1 to whatever, next to my $carboxyl entry, so that I will be able to tell the files apart.

        This number doesn't have to be associated with the files, it doesn't have to be a file counter, it just has to be consecutive and different every time, for each of the resulting pdb files.
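        If the number only has to be consecutive and unique per output file, one option is to derive it from the position in the list being looped over. This is only a sketch: the names in @files and the helper tag_for are hypothetical, and the "1AC" prefix is taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical input names; in the real script these would be the entries of @files.
my @files = qw(run_a run_b run_c);

# Hypothetical helper: build the residue tag for the file at a zero-based position.
sub tag_for {
    my ($index) = @_;
    return '1AC' . ($index + 1);
}

for my $i (0 .. $#files) {
    my $tag = tag_for($i);
    # Every line written for this file would carry the same $tag.
    print "$files[$i] -> $tag\n";
}
```

        The tag stays fixed while one file is being written and changes only when the loop moves on to the next file.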

        It's just because I'm a newbie that I thought that associating my goal with a file counter would make more sense and would be easier to implement.

        $atomcount and @atom aren't unaccounted for, they're just further up in my script (sorry about that, I didn't know how to include a concise part without copying the whole thing). Also, I'm opening the pdb files and printing coordinates (x,y,z) in them, from the files that are already open (again, opened further up in my script and these are the files of the foreach loop)

        Thank you (all) for your help. I'll try your suggestions and update you, hopefully I'll get it right - at least in my mind or on paper.

        Update: Yes, pdb is protein data bank files. Although I'm an idiot and instead of pdb it should read gro, as I changed to gro files (and thus the format is wrong for pdb files). I'm really sorry about that, thankfully it doesn't change my question though so not a disaster (I'm embarrassed)

Re: counter of files? something else?
by fasoli (Beadle) on Apr 23, 2015 at 14:31 UTC

    Thank you so much, soonix!

    This worked:

    before the foreach loop: my $file_number = 0;

    in the loop: $file_number += 1;

    Now all I need to do is understand why it worked (reading, homework) and then wonder why I never thought about it if it was a matter of 2 lines (more reading, more homework).
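    One way to see why those two lines work is to compare a variable declared before the loop with one declared inside it. The sketch below uses made-up file names and a throwaway variable $reset_each_time purely for contrast; it is not taken from the original script.

```perl
use strict;
use warnings;

my @files = qw(f1 f2 f3);   # hypothetical names

my $file_number = 0;        # declared once, before the loop, so it keeps its value
my @seen;

for my $filename (@files) {
    my $reset_each_time = 0;    # declared inside: a fresh variable on every iteration
    $reset_each_time += 1;      # always ends up as 1
    $file_number     += 1;      # 1, 2, 3, ... across iterations

    push @seen, "$filename: inside=$reset_each_time outside=$file_number";
}

print "$_\n" for @seen;
```

    The variable declared with my inside the loop body starts over each time around, while the one declared before the loop carries its value from one file to the next.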

    As a general comment on my short Perl experience, I found it quite easy to stop making syntax errors and quite easy to begin to understand what's happening. For example I had a much worse time when attempting to learn some Python, I found it too complex. Perl seems a bit "easier" and I like the regex. Although for a chemist (me) it all is a bit confusing. Been coding on and off for about 4 months now and not very happy with my progress, I hope that the books will help.

    Thank you all so so much, I appreciate your help and suggestions A LOT :)

    Off to more homework :)