Re: Perl Parser, need help
by QM (Parson) on May 14, 2005 at 04:21 UTC
It would have been nice if you'd specified exactly what you wanted, plus the example. But guessing at the format (untested):
#!/your/perl/here
use strict;
use warnings;

my $sequence;
while (my $line = <>)
{
    # Sequence keyword
    if ($line =~ /^Sequence:\s+(contig\d+)/)
    {
        $sequence = $1; # save the sequence
        next;
    }
    # Parameter keyword: skip the line
    next if ($line =~ /^Parameters:\s+(.*)$/);
    # print any other non-blank lines (!~ is the negated match operator)
    if ($line !~ /^\s*$/)
    {
        print "$sequence $line";
        next;
    }
}
I don't know what your spec is for line breaks, or whether blank lines have meaning, or what to do for lines that don't begin with Parameter or Sequence keywords.
Please use <code> tags to show code and output, otherwise we have to check the page source to see the actual format. (And I still might have got the input format wrong.)
Update: Corrected Parameters: error as pointed out by Ovid. The Parameters: line is just skipped.
Update 2: Adding missing semicolon to Parameter line.
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] [d/l] [select] |
#!/your/perl/here
use strict;
use warnings;

my $sequence;
while (my $line = <>)
{
    # Sequence keyword
    if ($line =~ /^Sequence:\s+(contig\d+)/)
    {
        $sequence = $1; # save the sequence
        next;
    }
    # Parameter keyword
    #next if ($line =~ /^Parameters:\s+(.*)$/)
    # print any other non-blank lines
    if ($line != /^\s*$/)
    {
        print "$sequence $line";
        next;
    }
}
QM,
I tried your script. Before I commented out the Parameters line, the script had a compilation error at line 24. After commenting that line out, I managed to run the script, but with a warning: isn't numeric in numeric ne (!=) at trf.pl line 19. I still managed to get the output after editing it in Excel. The funny thing is that results starting with 1 (e.g. 1 45 7 6.4 7 97 0 72 35 44 4 15 1.67 CCTAAAC CCTAAACCCTAAACCCTAAACCCTAAACCCTAAGCCCTAAGCCCT) don't show up in the parsed result. I wonder what's going on there. Anyway, thank you so much!
| [reply] |
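A possible explanation for both the warning and the missing rows, offered as a sketch rather than a confirmed diagnosis: `!=` is Perl's numeric inequality operator, while the negated pattern match is `!~`. With `!=`, the bare regex appears to be matched against `$_` (which this loop never sets) instead of `$line`, and `$line` is then compared numerically with the result of that match. A row whose text begins with 1 would compare as equal and be dropped, which fits the symptom reported above. The helper name `is_non_blank` and the `contig1` label below are hypothetical; the first sample row is abbreviated from the row quoted above and the others are made up.

```perl
use strict;
use warnings;

# Hypothetical helper: true for any line containing a non-whitespace character.
sub is_non_blank {
    my ($line) = @_;
    # !~ binds the regex to $line and negates the match;
    # != would instead compare $line numerically with the result of a match on $_.
    return $line !~ /^\s*$/;
}

# Made-up sample rows: two data rows, one empty line, one whitespace-only line.
my @rows = (
    "1 45 7 6.4 7 97 0 72 35 44 4 15 1.67 CCTAAAC\n",
    "46 90 7\n",
    "\n",
    "   \n",
);

for my $line (@rows) {
    # 'contig1' stands in for the saved sequence name.
    print "contig1 $line" if is_non_blank($line);
}
```

Only the two data rows are printed, including the one that starts with 1, and no "isn't numeric" warning is raised because nothing is compared numerically.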
Re: Perl Parser, need help
by Ovid (Cardinal) on May 14, 2005 at 04:11 UTC
Since this is science related, I want to help, but help from you would be good. First, I would recommend listing the code you used to solve your first problem. Second, I would more clearly state the steps necessary to transform your input to your output.
I took a wild swing at solving your problem, but here are the assumptions I made (and they could be erroneous).
- This isn't homework (I hope it's not!)
- You appear to be discarding the parameter information so I did, too.
- The input and output formats were vague, so I guessed after looking at the raw source of your node.
- Data will always start with a "Sequence: ..." (and sequences will not overlap)
- The "contig$digits" information will prepend every subsequent line until the next "Sequence".
My stab at it:
| [reply] [d/l] |
Re: Perl Parser, need help
by davidrw (Prior) on May 14, 2005 at 14:34 UTC
The above approaches are probably more robust, but here's a quick & dirty alternative:
# DOS quoting:
perl -ne "$contig=$1 if /^Sequence:\s+(\S+)/; print \"$contig $_\" if $_!~/^(\S+:)/ && /\S/" dat.txt
# *nix quoting:
perl -ne '$contig=$1 if /^Sequence:\s+(\S+)/; print "$contig $_" if $_!~/^(\S+:)/ && /\S/' dat.txt
Either redirect the output to a file (or pipe it elsewhere), or use the -i switch (see perlrun) to edit the file in place.
| [reply] [d/l] |
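For anyone unfamiliar with the -n switch, which wraps the code in a `while (<>) { ... }` loop, the one-liner can be unrolled roughly as follows. This is only a sketch: `tag_lines` and the sample rows are invented for illustration, the logic is copied from the one-liner, and it has not been checked against the real data file.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the two statements of the one-liner.
sub tag_lines {
    my @lines = @_;
    my $contig = '';
    my @out;
    for (@lines) {
        # Remember the name that follows "Sequence:".
        $contig = $1 if /^Sequence:\s+(\S+)/;
        # Keep lines that do not start with a "Keyword:" token and are not blank.
        push @out, "$contig $_" if $_ !~ /^\S+:/ && /\S/;
    }
    return @out;
}

# Made-up input in the guessed format.
my @sample = (
    "Sequence: contig1\n",
    "Parameters: 2 7 7 80 10 50 500\n",
    "\n",
    "1 45 7 6.4\n",
    "Sequence: contig2\n",
    "3 20 2 9.0\n",
);

print tag_lines(@sample);
```

Unlike the longer scripts, this version skips every line whose first token ends in a colon, not just the Sequence and Parameters lines, so it relies on data rows never starting that way.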
Maybe it's just too late in my caffeine cycle, but I didn't think you could simply escape double-quotes in DOS, can you? That would make it too easy, and I'm pretty sure I remember otherwise.
-QM
--
Quantum Mechanics: The dreams stuff is made of
| [reply] |
I too was shocked to see it work, and yesterday was a caffeine-less day for me, so it must be true! Yesterday's test was on Win2k, and I just tried it with success on Win98 as well!
C:\temp>echo "blah \" ad"
"blah \" ad"
C:\temp>echo blah \" ad
blah \" ad
| [reply] [d/l] |