in reply to how to read big postscript files

Perl has no problem reading big files, so it must be something your program does with the input that makes it slow. As you don't show any code or input data, it is quite hard for us to help you in a more concrete fashion. Consider testing whether the files open quickly and correctly in other programs. The embedded TIFF images are likely dumped directly as compressed binary data rather than as ASCII data, so reading the file line by line will likely not work.
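
For example, a minimal sketch of reading a file in fixed-size binary chunks instead of line by line (the file name and chunk size here are placeholders, not anything from your program):

use strict;
use warnings;

open(my $fh, '<', 'big.ps') or die "Cannot open big.ps: $!";
binmode($fh);    # the embedded images are binary, so avoid any line-ending translation

my $buf;
while (read($fh, $buf, 1024 * 1024)) {   # 1 MB per chunk
    # process $buf here
}
close($fh);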

Re^2: how to read big postscript files
by srikrishnan (Beadle) on May 25, 2010 at 08:53 UTC

    Hi Corion,

    Thanks for your response.

    Below I have pasted my code:

    use strict;
    use warnings;
    use Cwd;

    # Split the argument into directory and basename of the .ps file
    my $filename;
    my $filepath;
    if ($ARGV[0] =~ m/((.*)[\\\/])?(.*?)\.ps$/i) {
        $filename = $3;
        if (defined($1)) {
            $filepath = $1;
        }
        else {
            $filepath = cwd();
            $filepath =~ s!/!\\!gi;
            $filepath .= "\\";
        }
    }
    else {
        Win32::MsgBox("Incorrect argument, Please check", 0, "");
        exit;
    }

    # Slurp the whole file into memory
    open(F1, "$ARGV[0]")
        or Win32::MsgBox("Input File cannot be opened", 16, "Error Message");
    undef $/;
    my $line = <F1>;
    close F1;

    # Cut the embedded image objects out and replace them with <imgN> placeholders
    my @imgrem;
    my $imgno = 0;
    while ($line =~ s/\n\%\%BeginObject\: image(.*?)\n\%\%EndObject/<img$imgno>/msi) {
        my $tmp = $&;
        push(@imgrem, $tmp);
        $imgno++;
    }

    $line =~ s/\(\\266\)D r\n/\(\)D r\n/msgi;

    # Turn internal URI pdfmarks into Caret annotations
    while ($line =~ m/\[\/Action \<\< \/Subtype \/URI \/URI \((.+?)\) \>\> \/Rect \[(\d+) (\d+) (\d+) (\d+)\] \/Border \[0 0 0\] \/LNK pdfmark\n/gi) {
        my $temp = "$&";
        my $contents = $1;
        my $originalcontents = $contents;
        my $x1 = $2;
        my $y1 = $3;
        my $x2 = $4;
        my $y2 = $5;
        $y1 = $y1 - 100;
        if ($contents !~ /^(http|www|mailto)/i) {
            $contents =~ s/&ndash;/\-/gi;
            $contents =~ s/&equals;/\=/gi;
            $contents =~ s/&percnt;/\\%/gi;
            $contents =~ s/&ast;/\*/gi;
            $contents =~ s/&(l|r)squo;/\'/gi;
            $line =~ s/\[\/Action \<\< \/Subtype \/URI \/URI \((.+?)\) \>\> \/Rect \[(\d+) (\d+) (\d+) (\d+)\] \/Border \[0 0 0\] \/LNK pdfmark\n/\[\/Action \<\< \/Subtype \/Caret \/Contents \($contents\) \/Rect \[$x1 $y1 $x2 $y2\] \/Title \(Original Text\) \/Subj \(Inserted Text\) \/Border \[0 0 0\] \/Color \[0 0 1\] \/ANN pdfmark\n/i;
        }
    }

    # Put the images back (currently disabled)
    #while ($line =~ s/<img([0-9]+)>/$imgrem[$1]/si) {};

    open(F2, ">$filepath$filename-out.ps");
    print F2 $line;
    close F2;
    print "\n\nEnd time ", time() - $^T;

    The above code runs successfully on files up to a certain size; for example, it works on a 100 MB file.

    Thanks

    srikrishnan R.

      So you don't have a problem with reading large PostScript files; you have a problem with processing them.

      Maybe it would be less hard on your machine if you didn't process the whole file in one go. For example, you could write the images to disk instead of keeping them around in memory, as in the sketch below. Also, you could do the replacements on parts of the file instead of on the whole file at once.
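
      Rough, untested sketch, reusing $line, $filepath and $filename from your script; the "-imgN.tmp" naming is just something I made up:

      my $imgno = 0;
      while ($line =~ s/(\n%%BeginObject: image.*?\n%%EndObject)/<img$imgno>/si) {
          # spool the image object straight to a temp file instead of
          # pushing it onto @imgrem
          open(my $img, '>', "$filepath$filename-img$imgno.tmp")
              or die "Cannot write image part $imgno: $!";
          binmode($img);    # image data is binary
          print $img $1;
          close($img);
          $imgno++;
      }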

      Also, I'm not quite clear on what the replacement loop is supposed to be doing, but maybe you can rewrite that code using /ge from perlre. It also makes heavy use of $&, which tends to be slow: once $& appears anywhere in a program, Perl makes an extra copy of the matched text for every pattern match (see perlvar).
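
      Untested, but roughly like this: one s///ge pass that builds each replacement in a sub (make_annotation is a name I made up), so there is no second match against $line and no $&:

      $line =~ s{(\[/Action << /Subtype /URI /URI \((.+?)\) >> /Rect \[(\d+) (\d+) (\d+) (\d+)\] /Border \[0 0 0\] /LNK pdfmark\n)}
                {make_annotation($1, $2, $3, $4, $5, $6)}gie;

      sub make_annotation {
          my ($whole, $contents, $x1, $y1, $x2, $y2) = @_;
          # leave external links untouched
          return $whole if $contents =~ /^(http|www|mailto)/i;
          $y1 -= 100;
          $contents =~ s/&ndash;/\-/gi;
          $contents =~ s/&equals;/\=/gi;
          $contents =~ s/&percnt;/\\%/gi;
          $contents =~ s/&ast;/\*/gi;
          $contents =~ s/&(l|r)squo;/\'/gi;
          return "[/Action << /Subtype /Caret /Contents ($contents)"
               . " /Rect [$x1 $y1 $x2 $y2] /Title (Original Text)"
               . " /Subj (Inserted Text) /Border [0 0 0]"
               . " /Color [0 0 1] /ANN pdfmark\n";
      }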