agustina_s has asked for the wisdom of the Perl Monks concerning the following question:

Hi perlmonks. I have a question about a flat-file database whose records are separated by "//\n" (a double slash and a newline). I have a very large database (input.db), part of which looks like this:

    COMMERCIAL SUPPLIERS
    SEQUENCE
    /toxin_start="19-94"
    /translation="MKFFLMCLIIFPIMGVLGKKNGYPL
    CWGACYCFGLEDDKPIGPMKDITKKYCDVQIIPS"
    AA ALIGNMENT
    full-length{%scorpProtein}, sodium-family{%scorpNA}, putative-group1{%Nag01}, putative-group1c{%Nag01c}
    ORIGIN
    1 acaaaataaa gtgaacttct gaaatcagca cgataaaaag aaacgaaaat gaaatttttc
    //
    COMMERCIAL SUPPLIERS
    SEQUENCE
    /toxin_start="19-94"
    /fragment_start="?"
    CWGACYCFGLEDDKPIGPMKDITKKYCDVQIIPS"
    AA ALIGNMENT
    full-length{%scorpProtein}, sodium-family{%scorpNA}, putative-group1{%Nag01}, putative-group1c{%Nag01c}
    ORIGIN
    1 aaaataaagt gaacttctga aatcagcacg ataaaaagaa acgaaaatga aatttttctt
    61 aatgtgtctt atcatcttcc caattatggg agtgcttggc aaaaagaacg gatatcctct
    //

I'm going to read the input, make some changes and then write the result to an output file. I am a little confused about the input record separator: I first set it to //\n, but then I want to examine each record line by line.

I have tried putting a counter in my code, but the program never stops running.

I run the program using: perl prog.pl input.db result

    #!/usr/bin/perl
    my $input = $ARGV[0];
    my $output = ">" . $ARGV[1];
    my $counter=1;
    open(INPUT, $input) or die "Can't open $input.";
    open(OUTPUT, $output) or die "Can't open $output.";
    $/="\/\/\n";
    # Use shorthand for reading file.
    while (<>) {
        print OUTPUT $_;
        print $counter;
        $counter++;
    }
    print "Success\n";
    close (INPUT);
    close (OUTPUT);

I have tried

    while(<INPUT>)

instead of just while(<>), but then it only prints the counter once. Is there something wrong with the code? And if I want to read/modify each line in the record, where should I set $/ back to \n? Thank you in advance.

Sincerely

Replies are listed 'Best First'.
Re: about separator
by tachyon (Chancellor) on Jan 31, 2002 at 13:00 UTC

    What you want to do is called in-place editing. Here is an example:

        #!/usr/bin/perl -i.bak
        {
            local $/ = "//\n";
            while (<>) {          # grab a record into $_
                s/this/that/;     # munge away
                print;            # output the munged record to the file
            }
            continue {            # runs after every pass through the while loop
                print STDOUT ++$counter, "\n";
            }
        }

    The -i.bak switch tells perl to edit the argument files in place and to write a backup called <file>.bak, where <file> is the argument file.

    You set the input separator as shown; it is localised to the block. The while then loops through each "line" (here, each record ending in //\n), assigning it to $_. We can modify $_, and the print then writes it back to the original file in place of the original record. The unmodified contents of the original file are saved in <file>.bak. We can still print to STDOUT by naming it explicitly, and use a continue block to do something (like print the counter) after each record.

    Just call the script like this: perl munge.pl <file1> <file2>
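
    To see what the localised separator does without touching a real file, here is a minimal sketch that reads from an in-memory string. The helper name read_records and the sample text are made up for illustration; they are not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into records that end in "//\n".
sub read_records {
    my ($text) = @_;
    # Open a read handle on the string itself (in-memory file).
    open(my $fh, '<', \$text) or die "Can't open in-memory handle: $!";
    my @records;
    {
        local $/ = "//\n";           # changed only inside this block
        while (my $record = <$fh>) {
            chomp $record;           # chomp strips the current $/, i.e. "//\n"
            push @records, $record;
        }
    }                                # $/ is back to its previous value here
    close $fh;
    return @records;
}

my @records = read_records("a\nb\n//\nc\n//\n");
print scalar(@records), " records\n";
```

    Each element of @records still contains its internal newlines; only the trailing //\n is removed by chomp.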

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: about separator
by Hofmator (Curate) on Jan 31, 2002 at 13:46 UTC

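    The construct while (<>) {do_something} reads either from STDIN or automatically opens the files in @ARGV one by one and reads from them. So this is not what you want in your code: @ARGV still holds both input.db and result, so after input.db perl goes on to read result, the very file you are writing to, which is probably why the program never stops. You need the while (<INPUT>) {...} construct.

    Alternatively, you can keep <> and take the output file off @ARGV first, something like this: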
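
    A minimal sketch of the explicit-filehandle version, assuming lexical handles and three-argument open in place of the bareword INPUT/OUTPUT; copy_records is a hypothetical name chosen for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: copy "//\n"-separated records from one file to
# another through explicit filehandles; returns the number of records.
sub copy_records {
    my ($in_path, $out_path) = @_;
    open(my $in,  '<', $in_path)  or die "Can't open $in_path: $!";
    open(my $out, '>', $out_path) or die "Can't open $out_path: $!";
    local $/ = "//\n";               # restored when the sub returns
    my $count = 0;
    while (my $record = <$in>) {     # reads only from $in, never from @ARGV
        print {$out} $record;
        $count++;
    }
    close $in;
    close $out or die "Can't close $out_path: $!";
    return $count;
}

# Run as: perl prog.pl input.db result
if (@ARGV == 2) {
    my $n = copy_records(@ARGV);
    print "Copied $n records\n";
}
```

    Because the loop reads from $in only, the output file named on the command line is never opened for reading.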

        #!/usr/bin/perl
        # call this with: perl prog.pl output_file input.db
        my $output = shift;   # remove output_file from @ARGV
        open(OUTPUT, ">", $output) or die "Can't open $output: $!";
        $/ = "//\n";
        # now read from the remaining files in @ARGV (or STDIN)
        # and let perl handle the opening ...
        while (<>) {
            # do something
        }

    To process your record linewise, you can split it on newline:

        # to be inserted into the while loop above
        my @lines = split /\n/, $_;

    Setting back the input line separator $/ is not an option, because you are no longer reading the record from a file. You just read it and now want to do something linewise with it.
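
    The split approach can be sketched as a round trip: strip the //\n terminator, split into lines, change each line, and join the record back together. The helper edit_record and the callback interface are assumptions made for this example, not something from the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: apply $edit_line to every line of one record and
# return the rebuilt record, keeping a trailing "//\n" if there was one.
sub edit_record {
    my ($record, $edit_line) = @_;
    my $sep = "//\n";
    my $had_sep = $record =~ s/\Q$sep\E\z//;   # strip the terminator if present
    my @lines = split /\n/, $record;           # trailing empty lines are dropped
    $_ = $edit_line->($_) for @lines;          # modify each line in place
    my $out = @lines ? join("\n", @lines) . "\n" : '';
    $out .= $sep if $had_sep;
    return $out;
}

my $record = "SEQUENCE\n/toxin_start=\"19-94\"\nORIGIN\n//\n";
print edit_record($record, sub {
    my ($line) = @_;
    $line =~ s/^ORIGIN$/ORIGIN (checked)/;     # made-up edit for illustration
    return $line;
});
```

    Inside the while loop you would call it as edit_record($_, ...) and print the result to the output handle.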

    Note also that you might not have to break the record up into lines at all. A substitution, for example, is happy to work on multiline strings as well (but see the /s and /m modifiers).
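
    A small sketch of what the two modifiers change when a substitution runs on a whole record. The record text and the sub names are invented for illustration.

```perl
use strict;
use warnings;

my $record = "SEQUENCE\nORIGIN\n1 acaaaataaa\n//\n";

# Without /m, ^ anchors only at the start of the whole string,
# so the ORIGIN line in the middle of the record is not found.
sub tag_without_m { my ($r) = @_; $r =~ s/^ORIGIN$/ORIGIN (checked)/;  return $r }

# With /m, ^ and $ also match at the start and end of every line.
sub tag_with_m    { my ($r) = @_; $r =~ s/^ORIGIN$/ORIGIN (checked)/m; return $r }

# Without /s, . stops at a newline: only the rest of that line is removed.
sub cut_without_s { my ($r) = @_; $r =~ s/ORIGIN.*//;  return $r }

# With /s, . also matches newlines: everything from ORIGIN onwards is removed.
sub cut_with_s    { my ($r) = @_; $r =~ s/ORIGIN.*//s; return $r }

print tag_with_m($record);
print cut_with_s($record);
```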

    -- Hofmator

Re: about separator
by talexb (Chancellor) on Jan 31, 2002 at 13:24 UTC
    You can combine
    print $counter; $counter++;
    into the single line
    print $counter++;
    because the ++ does a post-increment -- the current value is output, then the counter is incremented.
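
    A tiny sketch of the difference between the post-increment and pre-increment forms (the variable names are only for illustration):

```perl
use strict;
use warnings;

my $counter = 1;
my $post = $counter++;   # $post gets the old value 1; $counter becomes 2
my $pre  = ++$counter;   # $counter becomes 3 first; $pre gets 3
print "$post $pre $counter\n";   # prints "1 3 3"
```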

    --t. alex

    "Of course, you realize that this means war." -- Bugs Bunny.