I have hundreds of files in subdirectories. I need to go into each file in turn, create a backup copy of the original, change permissions on the original, and then pattern search the file for two specific lines. The first line always begins with <IMG SRC and ends with BR>; the second always begins with Figure followed by a space and one or more digits. The lines in between these two could contain any kind of character: spaces, ", (), /\<>., etc. I want to throw out everything between the two lines I need.
After I find the two lines, I want to put them in a hash, with the Figure line as the key and the <IMG SRC line as the value. Then I want to search the entire file for any other occurrences of the matching Figure line that exist without a matching <IMG SRC line, and create a link from the hash value to each Figure line found. After one file is completed, the script should proceed to the next file and start again, running through the 600 or so files. When complete, each "Figure digit" will be linked to the correct <IMG SRC line.
I have the first part working: I am drilling down to the correct directories, making the backup copies, and changing the permissions. I get no errors when I run this; however, it doesn't find anything. At least, nothing prints. I'm a beginner at this, and although I have been through the mountain of books I bought to help me, I'm still stuck. Any and all help would be more than appreciated. Thank you.
This is an excerpt from one file, showing the lines I need to get with the pattern search:
<IMG SRC="/CSS/tpubs_graphics/L-/5/7/L-57174.00000001.gif">
1. Check Valve
(whitespace) 2. Check Valve (Altair)
(whitespace) #82-22-02
(whitespace) Fuel Control Water Signal Check Valve REMOVAL-01
(whitespace) Figure 301 Page 302
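To make the target concrete, here is a minimal sketch of the pairing step run against text modelled on the excerpt above. It assumes the whole file has already been read into one string, that the (whitespace) markers stand for real leading whitespace, and that the hash key should be just "Figure" plus its number. The names pair_figures and %pairs are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: return a hash of "Figure N" => "<IMG SRC ...>" line.
sub pair_figures {
    my ($html) = @_;
    my %figure_to_img;
    # /s lets .*? cross newlines, /i ignores case, /g finds every pair.
    while ( $html =~ /(<IMG\s+SRC[^\n]*).*?(\bFigure\s+\d+)/gis ) {
        $figure_to_img{$2} = $1;
    }
    return %figure_to_img;
}

# Sample text modelled on the excerpt (leading whitespace is assumed).
my $sample = join "\n",
    '<IMG SRC="/CSS/tpubs_graphics/L-/5/7/L-57174.00000001.gif">',
    '1. Check Valve',
    '   2. Check Valve (Altair)',
    '   #82-22-02',
    '   Fuel Control Water Signal Check Valve REMOVAL-01',
    '   Figure 301 Page 302';

my %pairs = pair_figures($sample);
print "$_ => $pairs{$_}\n" for sort keys %pairs;
```

The non-greedy .*? is what throws out everything between the two lines: it skips as little as possible before the next "Figure N", so each image is paired with the nearest figure caption that follows it.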
# ARGUMENTS: engine_figurelinks.pl xx_manual_vvv
# where xx = manual code, vvv = version
#
# MODIFICATIONS:
#
#---------------------------------------------------------------------
use strict;
use warnings;
use diagnostics;
use Env qw(SERVER_NAME);
use CGI qw(:standard :netscape);
use File::Copy; # Perl supplied module for making copies
new CGI;
#---------------------------------------------------------------------
my ($manualdir_param) = @ARGV;
die "Usage: engine_figurelinks.pl xx_manual_vvv\n" unless defined $manualdir_param;
(my $working_dir = $manualdir_param) =~ s/manualdir=//i;
my $data_area = "/tmp";
my $html_dir = "$data_area/$working_dir";
#---------------------------------------------------------------------
# Loop to locate HTML files, change permissions, and make working temporary copies
# ($! holds the system error message; $1 is only a regex capture)
opendir( my $stories_dh, $html_dir ) || die "HTML dirs do not exist: $!";
my @FigureArray = grep { /^(09)(\w{1,5})(00)$/ } readdir($stories_dh);
closedir $stories_dh;
foreach my $FigFile (@FigureArray) {
    opendir( my $story_dh, "$html_dir/$FigFile" ) || die "Files do not exist: $!";
    my @FileArray = grep { /a\.htm$/ } readdir($story_dh);
    closedir $story_dh;
    foreach my $DirFile (@FileArray) {
        my $path = "$html_dir/$FigFile/$DirFile";
        copy( $path, "$path.bak" ) or die "Can not make backup copy of file: $!";
        chmod 0600, $path or warn "Can not change permissions on $path: $!";
        # Read the file contents; < "name" > is a glob and never opens the file
        open( my $fh, '<', $path ) or die "Can not open $path: $!";
        my $html = do { local $/; <$fh> }; # slurp the whole file into one string
        close $fh;
        # Key: the "Figure N" text, value: the <IMG SRC line that precedes it
        my %Figures;
        while ( $html =~ /(<IMG\s+SRC[^\n]*).*?(\bFigure\s+\d+)/gis ) {
            $Figures{$2} = $1;
        }
        print "$DirFile: $_ => $Figures{$_}\n" for sort keys %Figures;
    }
}
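The linking pass described at the top could look something like the following rough sketch. It assumes that "link" means an HTML anchor whose HREF is the path taken from the image's SRC attribute, and that figures are identified by number alone. The helpers make_link and link_figures are hypothetical names, not part of the script above.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap a figure reference in a link to its image.
sub make_link {
    my ( $text, $img_line ) = @_;    # copy the arguments before any new match
    my ($src) = $img_line =~ /SRC\s*=\s*"([^"]*)"/i;
    return defined $src ? qq{<A HREF="$src">$text</A>} : $text;
}

# Hypothetical helper: link every "Figure N" that is not an image's own caption.
sub link_figures {
    my ($html) = @_;

    # Pass 1: figure number => the <IMG SRC line that precedes its caption.
    my %img_for;
    while ( $html =~ /(<IMG\s+SRC[^\n]*).*?\bFigure\s+(\d+)/gis ) {
        $img_for{$2} = $1;
    }

    # Pass 2: the first alternative swallows an image together with its own
    # caption and puts it back unchanged; the second catches bare references.
    $html =~ s{(<IMG\s+SRC[^\n]*.*?\bFigure\s+\d+)|(\bFigure\s+(\d+))}
              { defined $1 ? $1
                : exists $img_for{$3} ? make_link( $2, $img_for{$3} )
                : $2 }gise;
    return $html;
}

# Made-up page text with one image and two bare references to its figure.
my $page = join "\n",
    'See Figure 301 for the valve location.',
    '<IMG SRC="/CSS/tpubs_graphics/L-/5/7/L-57174.00000001.gif">',
    '1. Check Valve',
    'Figure 301 Page 302',
    'Remove the valve (Figure 301).';

my $linked = link_figures($page);
print "$linked\n";
```

Doing this in two passes means a reference that appears before its image in the file still gets linked, because the hash is complete before any substitution happens.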
Edited by planetscape - removed unnecessary br tags