Re: file reading issues
by jdporter (Paladin) on Aug 03, 2005 at 18:47 UTC
|
First of all, I would re-write
$line = <FILENAME>;
while ($line ne "")
{
print "$line";
$line = <FILENAME>;
}
as
while (<FILENAME>)
{
print ;
}
if the expected behavior is to print out every line of the file.
One way to tweak the above to get your desired output is as follows:
while (<FILENAME>)
{
last if /<!-- Begin -->/;
}
while (<FILENAME>)
{
last if /<!-- End -->/;
print ;
}
| [reply] [d/l] [select] |
|
I would think that any text after <!-- Begin --> on the same line would not be printed, and neither would any text before <!-- End --> on its line.
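If the text sharing a line with either marker should be kept, one option is to trim the marker lines instead of skipping them. This is only a rough sketch: the helper name text_between and the sample data are made up for illustration, and it assumes each marker appears once.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): returns everything
# between the two markers, including text that shares a line with a marker.
sub text_between {
    my ($fh) = @_;
    my $out    = '';
    my $inside = 0;
    while (my $line = <$fh>) {
        if (!$inside) {
            # Skip lines until the Begin marker; drop everything up to it.
            next unless $line =~ s/^.*?<!-- Begin -->//;
            $inside = 1;
        }
        if ($line =~ s/<!-- End -->.*//s) {
            # Keep what precedes the End marker, then stop reading.
            $out .= $line;
            last;
        }
        $out .= $line;
    }
    return $out;
}

# Made-up sample data, read through an in-memory filehandle.
my $sample = "head\nx<!-- Begin -->a\nb\nc<!-- End -->y\ntail\n";
open my $fh, '<', \$sample or die "Cannot open in-memory file: $!";
print text_between($fh), "\n";
close $fh;
```

Both markers may also sit on the same line; the Begin trim runs first, then the End trim.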
--------------------------------
An idea is not responsible for the people who believe in it...
| [reply] |
|
OK, this one worked! Thanks, jdporter!
_____________________________________________________mojobozo
word (wûrd)
interj. Slang. Used to express approval or an affirmative response to something. Sometimes used with up. Source
| [reply] |
Re: file reading issues
by sgifford (Prior) on Aug 03, 2005 at 20:45 UTC
|
The .. (dot-dot) operator was designed for this:
while (<>)
{
if (/<!-- Begin -->/ .. /<!-- End -->/)
{
print;
}
}
Or, more concisely:
(/<!-- Begin -->/ .. /<!-- End -->/) && print while (<>);
See Range Operators in perlop(1) for more information.
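Note that both versions print the marker lines as well. In scalar context the range operator returns a sequence number (1 on the first line of the range, with "E0" appended on the last), so the marker lines can be skipped if that is wanted. A small sketch, with a made-up helper name and made-up sample lines:

```perl
use strict;
use warnings;

# Hypothetical helper: keep only the lines strictly between the marker lines.
sub lines_between {
    my @kept;
    for (@_) {
        # Scalar context makes .. act as a flip-flop that returns a
        # sequence number while inside the range.
        my $seq = /<!-- Begin -->/ .. /<!-- End -->/;
        next unless $seq;          # empty string outside the range
        next if $seq == 1;         # first line of the range: the Begin marker
        next if $seq =~ /E0\z/;    # last line of the range: the End marker
        push @kept, $_;
    }
    return @kept;
}

# Made-up sample input.
print lines_between(
    "head\n", "<!-- Begin -->\n", "one\n", "two\n", "<!-- End -->\n", "tail\n",
);
```

This assumes each marker sits on a line of its own; text sharing a line with a marker is dropped along with it.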
| [reply] [d/l] [select] |
Re: file reading issues
by dtr (Scribe) on Aug 03, 2005 at 20:08 UTC
|
All manner of horrible things will happen with your current code if the page you're editing contains the string "</textarea>" anywhere.
You should escape the HTML that you print inside the text box to get around this. At a minimum, replacing all instances of "<" with "&lt;" should do the trick. There are modules such as HTML::Sanitizer which try to do this in a more sophisticated way.
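A minimal sketch of the escaping step, assuming a hand-rolled helper (escape_html is a made-up name, not part of any module). The CPAN module HTML::Entities offers encode_entities for the same job, if installing it is an option.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the characters that matter inside a textarea.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;    # must come first, or the entities added below get escaped again
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    $text =~ s/"/&quot;/g;
    return $text;
}

print escape_html('</textarea> & "more"'), "\n";
# prints: &lt;/textarea&gt; &amp; &quot;more&quot;
```

The browser turns the entities back into the original characters when the form is submitted, so the saved text is unchanged.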
| [reply] |
Re: file reading issues
by kwaping (Priest) on Aug 03, 2005 at 18:45 UTC
|
There are a couple ways I can think of to do that, offhand. If it's a small file and you can read it all into memory, you can do something like this:
my $file = '/path/to/file.txt';
open(IN, '<', $file) || die "Cannot open $file: $!";
read IN, my $html, -s $file;
close(IN);
$html =~ s/<!-- Begin -->(.*?)<!-- End -->/$1/s;
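Note that the substitution above removes only the two marker comments; anything before Begin or after End stays in $html. If just the text between them is wanted, a capturing match is one way to get it. A sketch with a made-up helper name and sample string:

```perl
use strict;
use warnings;

# Hypothetical helper: return the text between the markers, or undef if
# the markers are not found.
sub extract_between {
    my ($html) = @_;
    # /s lets . match newlines; .*? stops at the first End marker.
    my ($between) = $html =~ /<!-- Begin -->(.*?)<!-- End -->/s;
    return $between;
}

# Made-up sample data.
my $sample = "header\n<!-- Begin -->Line one\nLine two<!-- End -->\nfooter\n";
my $found  = extract_between($sample);
print defined $found ? "$found\n" : "markers not found\n";
```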
Or, if the file is large and you'd like to read it line by line, you might want to try setting a flag when Begin is encountered, then turning it off again when End is hit.
my $flag = 0;
while (<FILE>) {
$flag = 1 if (/<!-- Begin -->/);
$flag = 0 if (/<!-- End -->/);
process_line($_) if ($flag);
}
| [reply] [d/l] [select] |
|
read IN, my $html, -s $file; # your code
my $html = do { local $/; <IN> }; # the way I usually do it
Other monks... can anybody tell me if there is an advantage to one way or the other?
Update: The second line of code above is the way I am used to seeing... I forgot to complete the comment. My question is about the difference between using $/ and using read to slurp in an entire file.
They say that time changes things, but you actually have to change them yourself. Andy Warhol
| [reply] [d/l] [select] |
|
What is the way you're used to seeing? Maybe I am the one in need of enlightenment and your way is superior.
To answer your question, there is no reason why I do it that way except that's the way I learned how to do it. Maybe there was a good reason to do it like that back then which has now been made moot by advances in Perl - I don't know.
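For what it's worth, here is a quick sketch comparing the two on a small temporary file. The sub names are made up for illustration. As far as I can tell both return the same data for a plain file; the read version depends on -s reporting a usable size, which it won't for a pipe, while the $/ version works on any filehandle.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: slurp using read and the file size from -s.
sub slurp_with_read {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    read($in, my $data, -s $file);
    close $in;
    return $data;
}

# Hypothetical helper: slurp by undefining the input record separator.
sub slurp_with_rs {
    my ($file) = @_;
    open my $in, '<', $file or die "Cannot open $file: $!";
    my $data = do { local $/; <$in> };
    close $in;
    return $data;
}

# Write a small scratch file to compare against.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line one\nline two\n";
close $tmp_fh or die "close failed: $!";

print slurp_with_read($path) eq slurp_with_rs($path) ? "same\n" : "different\n";
```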
| [reply] |
Re: file reading issues
by bofh_of_oz (Hermit) on Aug 03, 2005 at 19:13 UTC
|
A regex will handle this just fine. Just read the whole file into a variable, then do this:
#Sample data, multiline
$line = "<!-- Begin -->Line one\nLine two\nthree\nfour<!-- End -->";
#process them
$line =~ s/<!-- Begin -->(.*)<!-- End -->/$1/s;
print $line;
I tried to do the same while reading the file line by line... The code was so ugly that I simply recommend reading in the whole file at once and using the multiline regexp above...
HTH
| [reply] [d/l] |
Re: file reading issues
by mojobozo (Monk) on Aug 03, 2005 at 19:17 UTC
|
Follow-up question: How can I strip off the leading blank spaces on each line? Keep in mind that I'm using jdporter's snippet of code for grabbing the text between the comments. I like to indent my HTML for readability (mine) but don't need all those spaces in the form.
| [reply] |
|
Using jdporter's code:
while (<FILENAME>)
{
last if /<!-- Begin -->/;
}
while (<FILENAME>)
{
last if /<!-- End -->/;
s/^[ \t]+//; #<- new line here (spaces and tabs only, so blank lines keep their newline)
print ;
}
| [reply] [d/l] |
Re: file reading issues
by wfsp (Abbot) on Aug 04, 2005 at 10:21 UTC
|
I have comments in the file and want to grab the stuff between them.
I would consider using HTML::TokeParser. I use the following.
#!/bin/perl5
use strict;
use warnings;
use HTML::TokeParser;
my $file = 'index.html';
my $tp = HTML::TokeParser->new($file)
or die "Couldn't parse $file: $!";
my ($start, $html);
while (my $tag = $tp->get_token) {
if (
$tag->[0] eq 'C' and
$tag->[1] eq '<!-- article start -->'
)
{
$start++;
next;
}
next unless $start;
if (
$tag->[0] eq 'C' and
$tag->[1] eq '<!-- article end -->'
)
{
last;
}
$html .= $tag->[4] if $tag->[0] eq 'S';
$html .= $tag->[1] if $tag->[0] eq 'T' or $tag->[0] eq 'C';
$html .= $tag->[2] if $tag->[0] eq 'E';
}
print "$html\n";
# ["S", $tag, $attr, $attrseq, $text]
# ["E", $tag, $text]
# ["T", $text, $is_data]
# ["C", $text]
# ["D", $text]
# ["PI", $token0, $text]
| [reply] [d/l] |