I have written the following code to turn brace-delimited text like this
{bob is {a cool guy}}
into XML like this
<bob> is <a> cool guy </a></bob>
Here's the code I've written (trimmed of the parts I know are fine, user output, etc.):
#!/usr/bin/perl -w
use diagnostics;
use Parse::RecDescent;
use Text::Balanced qw(
extract_bracketed
);
#input
print "Name of file to be inputted? :";
$infile=<>;
print "What should the output file be named? The file will be automatically created. :";
$outfile=<>;
chomp $infile;
chomp $outfile;
$/=undef;
open INFILE, "<$infile";
$text=<INFILE>;
close INFILE;
#processing
my $counter = 0;
while($next = (extract_bracketed($text, '{}', '[^{}]*' ))[0])
{
$holder = $next;
while($bext = (extract_bracketed($next, '{}', '(?s).*?(?=\{code)' ))[0])
{
$bolder = $bext;
while($cext = (extract_bracketed($bext, '{}', '(?s).*?(?=\{escape)' ))[0])
{
$colder = $cext;
$cext =~ s/\{([^ \s|\}]*?)\}/<$1\/>/gix;
$cext =~ s/\{([\w|-]*)(.*)\}/<$1>$2<\/$1>/osi;
$bext =~ s/$colder/$cext/sgi;
}
$bext =~ s/\{(\w*?)\s(.*)\}/\<$1\>$2<\/$1>/gosix;
$bext =~ s/\{metavar(.*?)\}/<metavar>$1<\/metavar>/gosix;
$bext =~ s/\}/ebrac/g;
$bext =~ s/\{/obrac/g;
$next =~ s/$bolder/$bext/sgi;
}
$next =~ s/\{([^ \s|\}]*?)\}/<$1\/>/gix;
$next =~ s/\{([\w|-]*)(.*)\}/<$1>$2<\/$1>/osi;
$text =~ s/$holder/$next/sgi;
print "Sync check \#$counter\n";
print "$next\n";
$counter++;
}
#output
open FILEOUT, ">$outfile";
print FILEOUT $text;
close FILEOUT;
print "\nYour result is stored in file $outfile\nGoodbye.\n";
OK, so this code does the job fine. The problem is that this program needs to be broadly applicable, including processing some truly huge files. When I run the script on a really big file (my test file is 19,000 lines), it runs until a certain point and then stalls and goes no further. The first two times this happened it stopped on the exact same call/line number, which seemed pretty fishy.
So as an experiment (and to make sure the input wasn't at fault) I cut out about 50 lines surrounding the stalling line and ran everything again. This time it got about 10 more calls/60 more lines further and stalled. I ran it again and it stalled in the same place. So, in a fit of annoyance (from start to stall takes about 30 minutes), I cut the original input file down the center and made it into 2 files, which ran through perfectly.
So I'm thinking maybe a memory leak in my program or something(?). I've read a bit about the subject, but in terms of how it manifests in Perl I'm pretty lost. For the short term, I suppose I can just cut up especially big files, but in the long term I hope not to be the sole user of this program, and I don't want to have to tell everyone that they need to butcher their input.
So my question is... SOLVE MY PROBLEM FOR ME! DO IT! DROP WHAT YOU'RE DOING AND DO IT NOW! Hahaha, just kidding, but if some knowledgeable Monk should come along and see something in my code, I'd sure appreciate a helpful suggestion or two. I'd assume that since Text::Balanced is a well-respected and widely used module, my problem does not originate there. Any ideas? Thanks, Monks!
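In case it helps narrow things down: the substitutions that write each processed chunk back (for example s/$holder/$next/sgi) interpolate the extracted text as a regex pattern, so any metacharacters in the input are read as pattern syntax rather than as plain text. I have not confirmed that this is related to the stall. Below is a minimal sketch of a literal replacement using \Q...\E; the replace_literal helper and the sample strings are made up for illustration and are not part of my script.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): replace the first
# occurrence of $old inside $text with $new. \Q...\E quotes regex
# metacharacters, so $old is matched as plain text, not as a pattern.
sub replace_literal {
    my ($text, $old, $new) = @_;
    $text =~ s/\Q$old\E/$new/;
    return $text;
}

# Made-up sample data containing characters that are special in a regex.
my $text   = 'see {note (a+b)*} here';
my $holder = '{note (a+b)*}';
my $next   = '<note> (a+b)*</note>';

print replace_literal($text, $holder, $next), "\n";
```

Unlike the /sgi substitutions in the script, this sketch replaces a single, case-sensitive occurrence; whether that difference matters depends on the input.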