
A hash of arrays (HoA) might help (see Perl Data Structures Cookbook).
#!/usr/bin/perl

use strict;
use warnings;

my @files = qw{file1.txt file2.txt};
my %paragraphs;    # file name => array of wrapped paragraphs

for my $file (@files){
    open my $fh, q{<}, $file
        or die qq{can't open *$file* to read: $!\n};
    local $/ = q{};    # paragraph mode: a record ends at one or more blank lines
    while (my $para = <$fh>){
        chomp $para;   # in paragraph mode this strips all trailing newlines
        push @{$paragraphs{$file}}, sprintf(q{<p>%s</p>}, $para);
    }
}

for my $file (sort keys %paragraphs){
    printf qq{\n*** %s ***\n\n}, $file;
    for my $para (@{$paragraphs{$file}}){
        print qq{$para\n\n};
    }
}
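The line doing the real work above is the push through @{$paragraphs{$file}}. A minimal sketch of just that idea follows, in case the hash of arrays looks opaque. The file names and paragraph strings here are made up for illustration and are not read from disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical data, for illustration only.
my %paragraphs;

# Each hash value is a reference to an array. Pushing through
# @{ ... } creates that array the first time a key is used
# (autovivification), so no set-up is needed per file.
push @{ $paragraphs{'a.txt'} }, '<p>first</p>';
push @{ $paragraphs{'a.txt'} }, '<p>second</p>';
push @{ $paragraphs{'b.txt'} }, '<p>only</p>';

for my $file (sort keys %paragraphs) {
    my @paras = @{ $paragraphs{$file} };    # copy of the list stored for this file
    printf "%s has %d paragraph(s)\n", $file, scalar @paras;
    print "  $_\n" for @paras;
}
```

Think of the hash as a set of labelled drawers: the label is the file name and each drawer holds an ordered list of paragraphs.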
Two text files.

file1.txt

Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi consectetur, tortor pulvinar dapibus sollicitudin, est sem porttitor orci, vestibulum mollis mauris purus et lacus. Curabitur cursus imperdiet eleifend.

Sed porttitor ligula vel leo venenatis bibendum. Phasellus ultricies euismod quam non posuere. Suspendisse faucibus tortor in neque dictum ornare.

Suspendisse egestas dui erat, sed placerat diam. Sed vulputate porttitor dapibus. Pellentesque blandit, est a viverra imperdiet, turpis enim ornare elit, consequat porttitor arcu velit id massa.

Suspendisse metus nisl, malesuada id ultricies id, posuere a odio. Duis convallis interdum dolor, vel rhoncus ante adipiscing ut. Ut non ultrices tortor.

Morbi at erat velit. Sed iaculis aliquam nunc et accumsan. Nunc vitae augue ac ligula pharetra malesuada.

Etiam id massa sit amet orci aliquam porta. Etiam nec enim dui. Donec quis sapien at justo lacinia semper eu non lacus.

Aliquam libero lorem, blandit eu pretium ut, convallis at leo. Aenean lobortis sagittis ipsum, a molestie justo laoreet interdum. Aliquam volutpat, libero vel condimentum dictum, nunc turpis commodo massa, at porttitor mauris sem ut arcu.

Donec a lacus diam, vitae auctor quam.

and file2.txt

Nam mollis aliquam nunc, eu tristique lectus euismod nec. Nunc porttitor rutrum consectetur. Ut sem dui, luctus nec iaculis sit amet, ornare quis est.

Duis et diam augue, in cursus metus. Donec non nibh eu magna ornare volutpat. Aliquam erat volutpat.

Proin sit amet mattis dui. Vestibulum vel molestie ante. Suspendisse tempus nibh ut enim mattis eu bibendum neque tincidunt.

Maecenas vel turpis augue. Duis neque nibh, interdum eu consequat vitae, sagittis eu sapien. Fusce eget risus at mi faucibus vestibulum nec nec augue.

Ut volutpat lectus et velit rhoncus vitae hendrerit nibh pellentesque. In at libero nisi, a ultrices est. Etiam non nisi eros.

Proin eu augue quis risus scelerisque commodo. Vestibulum ornare turpis non orci mollis porta. Aliquam aliquam rhoncus eros, id ultricies urna sollicitudin sed.

Nunc mattis, mauris vel aliquam fringilla, risus diam vulputate sapien, id congue enim sem a massa. Vivamus id nulla sit amet arcu rutrum tempor eu eu tellus. Ut massa dui, scelerisque non sollicitudin in, auctor non eros.

Cras pulvinar diam eu arcu suscipit ut dictum leo interdum. Etiam tristique metus ut nunc laoreet varius at et arcu. Suspendisse potenti.

Vivamus ut tellus purus, sed rutrum elit. In hac habitasse platea dictumst. Duis ultrices mi ut diam laoreet congue.

output:

*** file1.txt ***

<p>Lorem ipsum dolor sit amet, consectetur adipiscing elit. Morbi consectetur, tortor pulvinar dapibus sollicitudin, est sem porttitor orci, vestibulum mollis mauris purus et lacus. Curabitur cursus imperdiet eleifend.</p>

<p>Sed porttitor ligula vel leo venenatis bibendum. Phasellus ultricies euismod quam non posuere. Suspendisse faucibus tortor in neque dictum ornare.</p>

<p>Suspendisse egestas dui erat, sed placerat diam. Sed vulputate porttitor dapibus. Pellentesque blandit, est a viverra imperdiet, turpis enim ornare elit, consequat porttitor arcu velit id massa.</p>

<p>Suspendisse metus nisl, malesuada id ultricies id, posuere a odio. Duis convallis interdum dolor, vel rhoncus ante adipiscing ut. Ut non ultrices tortor.</p>

<p>Morbi at erat velit. Sed iaculis aliquam nunc et accumsan. Nunc vitae augue ac ligula pharetra malesuada.</p>

<p>Etiam id massa sit amet orci aliquam porta. Etiam nec enim dui. Donec quis sapien at justo lacinia semper eu non lacus.</p>

<p>Aliquam libero lorem, blandit eu pretium ut, convallis at leo. Aenean lobortis sagittis ipsum, a molestie justo laoreet interdum. Aliquam volutpat, libero vel condimentum dictum, nunc turpis commodo massa, at porttitor mauris sem ut arcu.</p>

<p>Donec a lacus diam, vitae auctor quam.</p>


*** file2.txt ***

<p>Nam mollis aliquam nunc, eu tristique lectus euismod nec. Nunc porttitor rutrum consectetur. Ut sem dui, luctus nec iaculis sit amet, ornare quis est.</p>

<p>Duis et diam augue, in cursus metus. Donec non nibh eu magna ornare volutpat. Aliquam erat volutpat.</p>

<p>Proin sit amet mattis dui. Vestibulum vel molestie ante. Suspendisse tempus nibh ut enim mattis eu bibendum neque tincidunt.</p>

<p>Maecenas vel turpis augue. Duis neque nibh, interdum eu consequat vitae, sagittis eu sapien. Fusce eget risus at mi faucibus vestibulum nec nec augue.</p>

<p>Ut volutpat lectus et velit rhoncus vitae hendrerit nibh pellentesque. In at libero nisi, a ultrices est. Etiam non nisi eros.</p>

<p>Proin eu augue quis risus scelerisque commodo. Vestibulum ornare turpis non orci mollis porta. Aliquam aliquam rhoncus eros, id ultricies urna sollicitudin sed.</p>

<p>Nunc mattis, mauris vel aliquam fringilla, risus diam vulputate sapien, id congue enim sem a massa. Vivamus id nulla sit amet arcu rutrum tempor eu eu tellus. Ut massa dui, scelerisque non sollicitudin in, auctor non eros.</p>

<p>Cras pulvinar diam eu arcu suscipit ut dictum leo interdum. Etiam tristique metus ut nunc laoreet varius at et arcu. Suspendisse potenti.</p>

<p>Vivamus ut tellus purus, sed rutrum elit. In hac habitasse platea dictumst. Duis ultrices mi ut diam laoreet congue.</p>

update: pasted the wrong output - fixed.
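The qw{} list is hard-coded only to keep the example short. With many files, one option is to let glob build the list instead. This is a sketch under an assumption: that the input files live in the current directory and match *.txt, which may not be how your files are named.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Assumption: the files to process are in the current directory
# and end in .txt; adjust the pattern to suit.
my @files = sort glob q{*.txt};

printf "found %d file(s)\n", scalar @files;
print "$_\n" for @files;
```

The resulting @files can be dropped into the for loop in place of the qw{} list. If the names come from somewhere else (a listing file or a shell command, say), chomp each one first so a trailing newline does not end up in the name passed to open.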

Re^6: question on multi line pattern matching for html formatting
by tallCoolOne (Initiate) on May 20, 2009 at 06:14 UTC
    OK, confession time. Still a newbie. Hashes scare me. I am still struggling to get my head around the basic concept of hashes, and have sat through several bad explanations of what they are (and have the cranial stretch marks to prove it). But your example code above helps a fair amount. Still and all, I can't say that I exactly understand what is taking place, except that, judging from your output, it's what I am after. Which is fantastic, of course. But secondly, I am running this script on 430+ files, so listing them as you have in the braces would be a bit problematic. I have the list of text files in an array @textFile, which I get from:

     @textFile = `cat textfiles*.txt`;

    I do some line-by-line processing on the first few lines of each file, and then the rest of the file winds up in the array @ThisFileArray. At this point, it's just paragraphs of text, which I am trying to wrap with the paragraph tags. If I can use my array of text file names @textFile inside the braces, that would appear to be a better way to get the list of file names into the code structure. Then I need to stare long and hard at your code to better understand what takes place there.
    It would seem that I could change your

    my @files = qw{file1.txt file2.txt};

    to my @textFile;, and go from there. I just have to understand what going from there entails, and where I would be going.
    I really appreciate your help and support with all this, and it will go a long way to help me realize my efforts to put together a clean, cool looking website.
    Thanks very much for your time and efforts to help me out.
    Mark