artist has asked for the wisdom of the Perl Monks concerning the following question:
I need help with a text conversion job.
The heart of the program is this loop:
    while (<IN>) {
        # Parse one input line into its sections
        my $source = Source->new(_text => $_);
        $source->parsing;
        # Build the destination parts from the parsed source and print them
        my $destination = Destination->new(_source => $source);
        $destination->conversion;
        $destination->print;
    }

$source->parsing takes the text, divides it into several small segments, and creates hash entries such as $source->{_section1}, $source->{_section2}, etc. $source->{_section2} is in turn subdivided into $source->{_section21}, $source->{_section22}, etc.
$destination->conversion takes the source object and creates its own parts, such as $destination->{_part1}, $destination->{_part2}, etc.
$destination->print combines the different parts of the destination and prints the final text appropriately.
Here is the problem.
If I process a single line of text input at a time, it works fine. But some input lines are continuations of the previous line. They can be identified by a special marker sign '\' at the beginning of $source->{_section2}, which is available only after $source->parsing has run.
There can be zero, one, or more continuation lines.
What I would like to do is append the 'new' $source->{_section2} to the previous $source->{_section2} whenever the new one has a continuation marker, so that the $source object I pass to $destination is a single object that includes the data from all of its continuation lines. In other words, I would like to wait for the next input line to see whether it continues the current line.
Also note that nothing in the current line indicates whether a continuation follows.
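One way the look-ahead described above could be sketched is to hold the last parsed record in a "pending" slot and release it only when the next line turns out not to be a continuation (or at end of file). This is a minimal sketch, not the actual classes: the Source package below is a hypothetical stand-in whose parsing simply splits on '|', and is_continuation, absorb, and process_records are names invented for illustration. It also assumes that a continuation line contributes only its _section2 and that the marker itself is dropped when merging.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real Source class. The real parsing rules are
# not shown in the post, so this version just splits "section1|section2".
package Source;

sub new {
    my ($class, %args) = @_;
    return bless {%args}, $class;
}

sub parsing {
    my $self = shift;
    my $text = $self->{_text};
    chomp $text;
    @{$self}{qw(_section1 _section2)} = split /\|/, $text, 2;
    $self->{_section2} = '' unless defined $self->{_section2};
    return $self;
}

# True when _section2 begins with the continuation marker '\'.
sub is_continuation {
    my $self = shift;
    return $self->{_section2} =~ /^\\/;
}

# Append another record's _section2 (minus its marker) to this record.
sub absorb {
    my ($self, $other) = @_;
    (my $rest = $other->{_section2}) =~ s/^\\//;
    $self->{_section2} .= $rest;
    return $self;
}

package main;

# Read lines from $fh and call $handler once per complete logical record.
sub process_records {
    my ($fh, $handler) = @_;
    my $pending;    # last parsed record, held until we know it is complete
    while (my $line = <$fh>) {
        my $source = Source->new(_text => $line);
        $source->parsing;
        if ($pending && $source->is_continuation) {
            $pending->absorb($source);       # still the same logical record
            next;
        }
        $handler->($pending) if $pending;    # previous record is now complete
        $pending = $source;
    }
    $handler->($pending) if $pending;        # flush the final record at EOF
    return;
}

# Demo input: the lines starting with 'c' and 'd' continue the 'b' record.
my $input = join '', map { "$_\n" }
    'a|alpha', 'b|beta', 'c|\\ gamma', 'd|\\ delta', 'e|epsilon';
open my $fh, '<', \$input or die "Cannot open in-memory handle: $!";

my @done;
process_records($fh, sub { push @done, $_[0]{_section2} });
close $fh;

print "$_\n" for @done;
# prints:
# alpha
# beta gamma delta
# epsilon
```

In the real program the handler would be where the Destination object is built and its conversion and print methods are called. Only one pending record is held at a time, so memory use does not grow with the size of the file. A continuation marker on the very first line has nothing to attach to, and this sketch simply treats that line as a new record.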
Size: A single line is at most about 400 characters, and there are 200,000 lines in the file. Each section can be from 0 to 300 characters.
Frequency: This is not a one-time job.
I would appreciate any suitable architecture.
Thanks,
Artist

Replies are listed 'Best First'.

Re: Text Conversion
by talexb (Chancellor) on Dec 30, 2002 at 20:20 UTC

Re: Text Conversion
by Mr. Muskrat (Canon) on Dec 30, 2002 at 20:41 UTC

Re: Text Conversion
by pg (Canon) on Dec 30, 2002 at 20:22 UTC
by artist (Parson) on Dec 30, 2002 at 20:54 UTC
by artist (Parson) on Dec 30, 2002 at 21:52 UTC

Re: Text Conversion
by poj (Abbot) on Dec 30, 2002 at 22:06 UTC

Re: Text Conversion
by John M. Dlugosz (Monsignor) on Dec 30, 2002 at 21:36 UTC