I am in the middle of a project that requires me to modify a ~30MB file. My script will act as an lpr filter that modifies incoming PostScript jobs. Several instances of the script may run simultaneously, so I can't slurp the file into an array. I have examined a variety of methods to solve the following problem.
When the current line is a chapter end, I have to do three things:
- Insert a chapterization command
- Insert a command to change the paper output tray after the next page start
- On the third page following the end of a chapter, insert a command to send the newly started chapter to the binder
I can't know ahead of time how many lines or pages will be in these chapters. I have no problem using simple regexes to match and change the text. It is the chapter-level process that has me stumped.
Here is my current code:
#!/usr/bin/perl
use strict;
use warnings;

#since first page gets treated like the end of a chapter
#we start with end_chapter being true.
my $end_chapter = 1;
my $time = time();
#line required to redirect output to postprocessor (ie perfectbinder)
my $to_pp = '<</OutputType(postprocessor)>>setpagedevice \n';
#line to force chapterization
my $chapter = 'true [] /110pProcs /ProcSet findresource /setchapters get exec \n';
#my $outfile = "/var/spool/drop_box/autoq/".$time.".print";
my $outfile = "d:/customer files/WPS/Quint.out";
my $infile = $ARGV[0];
open (OUT, ">".$outfile) or die "Can't create temp file!!!!!!!!!";
open (IN, "<".$infile);
while (<IN>){
    #if its the KDKHost line, skip it
    my $line = del_KDKHost($_);
    #handle chapter endings - currently denoted by null OutputType
    $line = chapterize($line);
    if ($end_chapter) {
        handleSeps();
    }
    else {
        print OUT "$line";
    }
}

sub del_KDKHost {
    #if this the KDKHost line, delete it
    my $line = shift;
    if ($line =~ m/^%KDKHost:/){
        $line = "";
    }
    return $line;
}

sub chapterize {
    my $line = shift;
    if ($line =~ m!<</OutputType \(\)>>setpagedevice!) {
        $line = $chapter;
    }
    $end_chapter = 1;
    return $line;
}

sub handleSeps {
    #if we just made a chapter, or this is the first page of the file
    #we have to make the next 2 pages come out of the top exit
    my $counter= 0;
    #if this is the pagenumber line, increment counter
    #we only need work with 2 pages
    while (<IN> && $counter<=3) {
        #if we have started a new page
        #increment page counter and if we are on the 3rd page
        #since chapter break, insert line for output to perfect binder
        if ($_ =~ /%%BeginPageSetup/){
            print "STARTED NEW PAGE";
            $counter++;
            if ($counter==3){
                $_ .= "\n $to_pp";
                $counter = 0;
            }
            #if it is the OutputType line for this page, change to top output
            elsif ($_ =~ m!<</OutputType\(Stacker\)>>setpagedevice!){
                $_ =~ s/Stacker/top/;
            }
        }
        print OUT $_;
    }
    $end_chapter = 0;
}
Currently, I only get the first line of the file repeated. It's as if it never reads past line 1. Is it possible to read from a file within a sub that is called from a while loop that reads the same file? I got the idea, and the term "inner read", from this node. Am I insane? Is there an elegant way to do this?
As always, thanks for the pointers,
digger
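On the inner-read question: reading from the same filehandle inside a sub does work in Perl, because the handle keeps its position between reads. One thing worth checking in the code above is that `while (<IN> && $counter<=3)` reads a line but never assigns it to `$_`; the implicit assignment only happens when the readline is the entire `while` condition, so `$_` still holds the outer loop's line. Below is a minimal sketch of an inner read with explicit assignment. The helper names `handle_pages` and `filter_job` and the sample job are made up for illustration, and it only shows the binder insertion on the third page, not the chapterization or tray changes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Line that routes output to the binder (pattern taken from the post).
my $to_pp = "<</OutputType(postprocessor)>>setpagedevice\n";

# Hypothetical helper: inner read on the caller's filehandle. Copies lines
# through until the $pages-th page start, then appends the binder line.
sub handle_pages {
    my ($in, $out, $pages) = @_;
    my $seen = 0;
    while ($seen < $pages) {
        my $line = <$in>;              # explicit assignment, not $_
        last unless defined $line;     # stop at end of file
        if ($line =~ /%%BeginPageSetup/) {
            $seen++;
            $line .= $to_pp if $seen == $pages;
        }
        print {$out} $line;
    }
    return $seen;
}

# Hypothetical outer loop: one pass, same filehandle as the inner read.
sub filter_job {
    my ($in, $out) = @_;
    while (defined(my $line = <$in>)) {
        print {$out} $line;
        # Chapter end is the null OutputType line, as in the post.
        handle_pages($in, $out, 3)
            if $line =~ m!<</OutputType \(\)>>setpagedevice!;
    }
    return;
}

# Made-up sample job held in memory so the sketch runs on its own.
my $job = <<'END_JOB';
<</OutputType ()>>setpagedevice
%%BeginPageSetup
page 1
%%BeginPageSetup
page 2
%%BeginPageSetup
page 3
END_JOB

my $result = '';
open my $in,  '<', \$job    or die "Can't open sample input: $!";
open my $out, '>', \$result or die "Can't open output buffer: $!";
filter_job($in, $out);
close $in;
close $out;
print $result;
```

Because both loops pull from the same handle, the outer loop simply resumes at the line after the one where the inner loop stopped, and the file is still processed one line at a time without slurping.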