in reply to Re: how to read big postscript files
in thread how to read big postscript files

Hi Corion,

Thanks for your response.

I have pasted my code below:

use strict;
use warnings;
use Cwd;
use Win32;    # provides Win32::MsgBox

my $filename;
my $filepath;

# Split the argument into directory and base name; only *.ps is accepted.
if ($ARGV[0] =~ m/((.*)[\\\/])?(.*?)\.ps$/i) {
    $filename = $3;
    if (defined($1)) {
        $filepath = $1;
    }
    else {
        $filepath = cwd();
        $filepath =~ s!/!\\!gi;
        $filepath .= "\\";
    }
}
else {
    Win32::MsgBox("Incorrect argument, Please check", 0, "");
    exit;
}

# Slurp the whole input file into memory; stop if it cannot be opened.
open(F1, "<", $ARGV[0]) or do {
    Win32::MsgBox("Input File cannot be opened", 16, "Error Message");
    exit;
};
undef $/;
my $line = <F1>;
close F1;

# Replace each embedded image with an <imgN> placeholder and keep the
# removed text in @imgrem.
my @imgrem;
my $imgno = 0;
while ($line =~ s/\n\%\%BeginObject\: image(.*?)\n\%\%EndObject/<img$imgno>/msi) {
    my $tmp = $&;
    push(@imgrem, $tmp);
    $imgno++;
}

$line =~ s/\(\\266\)D r\n/\(\)D r\n/msgi;

# Turn URI links that are not web or mail addresses into Caret annotations.
while ($line =~ m/\[\/Action \<\< \/Subtype \/URI \/URI \((.+?)\) \>\> \/Rect \[(\d+) (\d+) (\d+) (\d+)\] \/Border \[0 0 0\] \/LNK pdfmark\n/gi) {
    my $temp = "$&";
    my $contents = $1;
    my $originalcontents = $contents;
    my $x1 = $2;
    my $y1 = $3;
    my $x2 = $4;
    my $y2 = $5;
    $y1 = $y1 - 100;
    if ($contents !~ /^(http|www|mailto)/i) {
        $contents =~ s/&ndash;/\-/gi;
        $contents =~ s/&equals;/\=/gi;
        $contents =~ s/&percnt;/\\%/gi;
        $contents =~ s/&ast;/\*/gi;
        $contents =~ s/&(l|r)squo;/\'/gi;
        $line =~ s/\[\/Action \<\< \/Subtype \/URI \/URI \((.+?)\) \>\> \/Rect \[(\d+) (\d+) (\d+) (\d+)\] \/Border \[0 0 0\] \/LNK pdfmark\n/\[\/Action \<\< \/Subtype \/Caret \/Contents \($contents\) \/Rect \[$x1 $y1 $x2 $y2\] \/Title \(Original Text\) \/Subj \(Inserted Text\) \/Border \[0 0 0\] \/Color \[0 0 1\] \/ANN pdfmark\n/i;
    }
}

#while($line =~ s/<img([0-9]+)>/$imgrem[$1]/si){};

open(F2, ">$filepath$filename-out.ps");
print F2 $line;
close F2;

print "\n\nEnd time ", time() - $^T;

The above code runs successfully on files up to a certain size; for example, it runs on a 100 MB file.

Thanks

srikrishnan R.

Re^3: how to read big postscript files
by Corion (Patriarch) on May 25, 2010 at 09:02 UTC

    So you don't have a problem with reading large Postscript files, you have a problem with processing them.

    Maybe it would be less hard on your machine if you didn't process the whole file in one go. For example, you could write the images to disk instead of keeping them around in memory. Also, you can do the replacements on the parts of the file instead of doing the replacements on the file at once.
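
    One way to read the suggestion above is to stream the file line by line and spool each image block to a side file, so that neither the document nor the images sit in memory at once. The following is only a sketch under assumptions: the helper name strip_images and its three path arguments are made up for illustration, the %%BeginObject and %%EndObject comments are assumed to start their own lines, and the image data is assumed to be line-oriented ASCII. The placeholder lands on its own line, which differs slightly from where the posted regex puts it.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): copy $in_path to
# $out_path one line at a time, diverting every
# "%%BeginObject: image ... %%EndObject" block to $img_path and leaving an
# <imgN> placeholder line in its place. Returns the number of images found.
sub strip_images {
    my ($in_path, $out_path, $img_path) = @_;

    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    open my $img, '>', $img_path or die "Cannot write $img_path: $!";

    my $in_image = 0;    # true while we are inside an image object
    my $imgno    = 0;    # number of image objects seen so far

    while (my $line = <$in>) {
        if (!$in_image && $line =~ /^%%BeginObject: image/i) {
            $in_image = 1;
            print {$out} "<img$imgno>\n";
            $imgno++;
        }
        if ($in_image) {
            # Image lines go to the side file, not to the main output.
            print {$img} $line;
            $in_image = 0 if $line =~ /^%%EndObject/i;
            next;
        }
        print {$out} $line;
    }

    close $_ for $in, $out, $img;
    return $imgno;
}
```

    Because only one line is held at a time, memory use no longer grows with the size of the file. Other per-line replacements could be applied to $line just before it is printed to the main output.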

    Also, I'm quite unclear what the replacement loop is supposed to be doing, but maybe you can rewrite that code using /ge from perlre. It seems to make heavy use of $&, which tends to be slow.
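
    As a rough illustration of the /ge idea, here is a sketch that assumes the loop is meant to rewrite each non-web link in place. The names rewrite_links, link_to_caret, %ENTITY and $LINK are invented for this example. The whole match is captured in a group instead of being read from $&, and a single s///ge pass replaces the nested while/s/// pair.

```perl
use strict;
use warnings;

# Entities that the posted loop translates inside link targets.
my %ENTITY = (
    ndash => '-', equals => '=', percnt => '\\%', ast => '*',
    lsquo => "'", rsquo  => "'",
);

# The link annotation, built from smaller pieces. Group 1 is the whole
# match (so $& is not needed), group 2 the target, groups 3..6 the rectangle.
my $URI  = qr{\[/Action << /Subtype /URI /URI \((.+?)\) >>}i;
my $RECT = qr{/Rect \[(\d+) (\d+) (\d+) (\d+)\]}i;
my $LINK = qr{($URI $RECT /Border \[0 0 0\] /LNK pdfmark\n)}i;

# Hypothetical helper: build the replacement text for one matched link.
sub link_to_caret {
    my ($whole, $contents, $x1, $y1, $x2, $y2) = @_;

    # Real web and mail links are left exactly as they were.
    return $whole if $contents =~ /^(?:http|www|mailto)/i;

    # Translate known entities; unknown ones are kept unchanged.
    $contents =~ s/&(\w+);/exists $ENTITY{lc $1} ? $ENTITY{lc $1} : "&$1;"/ge;
    $y1 -= 100;

    return "[/Action << /Subtype /Caret /Contents ($contents) "
         . "/Rect [$x1 $y1 $x2 $y2] /Title (Original Text) "
         . "/Subj (Inserted Text) /Border [0 0 0] /Color [0 0 1] "
         . "/ANN pdfmark\n";
}

# Hypothetical helper: rewrite all links in $text in a single pass.
sub rewrite_links {
    my ($text) = @_;
    $text =~ s{$LINK}{ link_to_caret($1, $2, $3, $4, $5, $6) }ge;
    return $text;
}
```

    With this shape every link is visited once and replaced where it stands, so the regex engine never restarts from the top of the string after a substitution. Something like $line = rewrite_links($line) would then stand in for the whole loop.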