I would like to learn Perl by working through specific cases where I need it. This is the first such case. I have a situation much like the one described in an earlier discussion (Extracting blocks of text). Specifically, I have a number of old WordStar files in plain text. Each such file contains multiple .pa-delimited documents (consisting of various numbers of lines and paragraphs of text) that should be broken out into separate files. For example, one of these WordStar files might contain something like this:

Text text text
.pa
Other text text text
.pa

In that example, resulting file no. 1 would contain "Text text text," and resulting file no. 2 would contain "Other text text text."

I assume, but am not certain, that every .pa appears at the left margin, and is followed by no other characters on the same line.
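As a rough sketch of what that assumption means in code: if each delimiter really is a line holding only .pa, a split pattern can be anchored to the start and end of the line. The sample string below is just the example from above, and the allowance for a trailing carriage return is an assumption about DOS-era line endings, not something confirmed for these files.

```perl
use strict;
use warnings;

# Sample input mirroring the example above; real data would come from a file.
my $text = "Text text text\n.pa\nOther text text text\n.pa\n";

# Split on lines consisting solely of ".pa". The /m flag lets ^ match at the
# start of every line; \r? tolerates DOS line endings (an assumption), and
# \z covers a final ".pa" that has no newline after it.
my @docs = grep { /\S/ } split /^\.pa\r?(?:\n|\z)/m, $text;

# grep { /\S/ } drops any chunk that is empty or whitespace-only.
print scalar(@docs), " documents found\n";
print "[$_]" for @docs;
```

If some .pa lines turn out to have trailing spaces or other characters after them, the pattern would need loosening, for example by allowing optional whitespace before the line end.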

The earlier discussion suggested this solution, where the delimiter was the word "term" rather than ".pa":

#! perl -slw
use strict;

my @array = split 'term', do{ local $/; <DATA> };
shift @array; ## Discard leading null

print '---', "\n", $_, "\n" for @array;

__DATA__
term {
yada yada
12345
() ...
}

term only occurs here {
could be 30 lines here
but never that word again until
another block starts
yadada
}

term, etc.

My questions, from that example:

1. That old discussion mentioned RAM concerns when slurping. My system has 16GB RAM. The files I am working on are small. But I may adapt the solution to other, larger files. When does RAM become an issue?

2. How would I adapt this solution to refer to a separate input file? In the suggested solution, the Perl code seems to be added to the start of the text file. I would rather have a separate Perl script and specify the target file at runtime.

3. What would be the best reference source for interpreting the Perl constructs used in that solution?

4. Which version of Perl should I install, to run this code?
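To make questions 1 and 2 concrete, here is a minimal sketch of one possible shape for a standalone script. It is not the approach from the earlier discussion: instead of slurping, it reads the input one line at a time, so memory use should not grow with file size. The sub name `split_on_pa`, the script name `split_pa.pl`, and the output naming scheme (prefix plus a three-digit counter) are all hypothetical choices made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: copy each .pa-delimited document in $in_path to its own
# file named "$prefix-001.txt", "$prefix-002.txt", and so on.
# Returns the list of file names written.
sub split_on_pa {
    my ($in_path, $prefix) = @_;
    open my $in, '<', $in_path or die "Cannot open $in_path: $!";

    my ($out, @written);
    while (my $line = <$in>) {
        # A line holding only ".pa" (plus optional CR/LF) ends the current document.
        if ($line =~ /^\.pa\r?\n?\z/) {
            if ($out) {
                close $out or die "Cannot close output: $!";
                undef $out;
            }
            next;
        }
        # Open the next output file lazily, on the first line of a new document,
        # so that empty documents produce no file.
        unless ($out) {
            my $name = sprintf '%s-%03d.txt', $prefix, @written + 1;
            open $out, '>', $name or die "Cannot write $name: $!";
            push @written, $name;
        }
        print {$out} $line;
    }
    close $out if $out;
    close $in;
    return @written;
}

# Usage: perl split_pa.pl input.txt [output-prefix]
if (@ARGV) {
    my ($in_path, $prefix) = @ARGV;
    $prefix = $in_path unless defined $prefix;
    print "$_\n" for split_on_pa($in_path, $prefix);
}
```

Run as `perl split_pa.pl oldfile.txt`, this would write `oldfile.txt-001.txt`, `oldfile.txt-002.txt`, and so on next to the input, and print each name as it goes. It uses only core Perl features (lexical filehandles and three-argument `open`), so any reasonably recent Perl 5 should be able to run it.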

Many thanks.

In reply to Learning Perl by Doing by raywood
