Dear Monks,
I am migrating a previously static website of nearly 4,000 .html pages into .php. I need to change all of the include calls from SSI to PHP.
For example
<!--#include virtual="/ssi/edfooter.txt"-->
to
<?php include($_SERVER['DOCUMENT_ROOT'].'/ssi/edfooter.txt'); ?>
The included file paths can remain the same; only what's wrapped around the paths needs to change. Some pages have more than one include to be updated.
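A minimal sketch of the substitution on its own, assuming each include directive sits on a single line. The helper name `ssi_to_php` is hypothetical, not something from an existing module. It uses `[^"]+` rather than `.*` so that two includes on the same line are matched separately, and `/g` so that every one of them is rewritten.

```perl
use strict;
use warnings;

# Hypothetical helper: convert every SSI include in a string to its PHP form.
# Returns the rewritten text followed by the list of included paths found.
sub ssi_to_php {
    my ($text) = @_;
    my @paths;
    # [^"]+ stops at the closing quote, so neighbouring includes are not
    # swallowed into one match; /e runs the replacement as code so the
    # path can be recorded as well as substituted.
    $text =~ s{<!--#include\s+virtual="([^"]+)"\s*-->}
              {push @paths, $1; "<?php include(\$_SERVER['DOCUMENT_ROOT'].'$1'); ?>"}ge;
    return ($text, @paths);
}

my ($out, @found) = ssi_to_php(
    '<!--#include virtual="/ssi/a.txt"--> and <!--#include virtual="/ssi/b.txt"-->'
);
print "$out\n";
print "found: @found\n";
```

Using `s{...}{...}` instead of `s/.../.../` means the slashes in the paths and the `?>` in the replacement need no escaping, which may be what was tripping up the command-line attempts.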

I understand that a command-line solution will probably be suggested, but when I tried one I could not get the regex to stop throwing errors about unescaped characters, so I'd rather have the code in script form. I think what I am doing now is one semi-correct way of doing it, but I'm never entirely sure.

What I'm not doing right is editing the file, either in place or with a temporary file. I keep either wiping out the file completely or not doing anything to it. This isn't a script that is going to be called on any regular basis, so I'm not worried about it being efficient in the long-term, though I would like to understand some of the better ways of doing this.
Thanks!

#!/usr/bin/perl
use warnings;
use strict;
use File::Find::Rule;

#find all html files in specified directory
#this is for a specific directory right now for testing, but
#will eventually be going through all the subdirectories under /htdocs
my $dir = "/home/devcorp/htdocs/plan/norms";
my $rule = File::Find::Rule->file->name("*.html")->start( $dir );

#keep track of the changed files in a file
open(OUTFILE,">changed_files.txt") || die "cant open changed_files.txt, $!\n";

while ( my $html_file = $rule->match ) {
    #open file to replace string in
    open FILE, "<$html_file";
    my @lines = <FILE>;
    for (@lines) {
        #replace <!--#include virtual="[document path]"-->
        #with <?php include($_SERVER['DOCUMENT_ROOT'].'[document path]'); ?>
        if (s/<!--#include virtual="(.*)"-->/<?php include(\$\_SERVER['DOCUMENT_ROOT'].'$1');?>/){
            my $result = $1;
            #print the file changed and the document path for the included file
            print OUTFILE "$html_file: $result\n";
        }
    }
    close FILE;
}
close OUTFILE;

Output returned in changed_files.txt is
/home/devcorp/htdocs/plan/norms/index.html: /ssi/edfooter.txt
but obviously nothing is changed in the file itself because I'm not doing that part right.
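
The script above only changes the copies of the lines held in @lines; nothing is ever written back to disk. One common way to do the write-back, sketched here under assumptions: write the converted lines to a temporary file next to the original, then move it over the original once everything has been written. The helper name `convert_file` and the `.tmp` suffix are hypothetical, and the demo at the end runs against a throwaway file rather than anything under /htdocs.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Temp qw(tempdir);

# Hypothetical helper: rewrite one file through a temporary copy.
# Returns the number of includes converted.
sub convert_file {
    my ($path) = @_;
    my $tmp = "$path.tmp";    # assumed not to clash with an existing file

    open my $in,  '<', $path or die "Can't read $path: $!";
    open my $out, '>', $tmp  or die "Can't write $tmp: $!";

    my $changed = 0;
    while (my $line = <$in>) {
        # s///g returns the number of substitutions made on this line
        my $n = ($line =~ s{<!--#include\s+virtual="([^"]+)"\s*-->}
                           {<?php include(\$_SERVER['DOCUMENT_ROOT'].'$1'); ?>}g);
        $changed += $n if $n;
        print {$out} $line;    # every line is written, changed or not
    }
    close $in;
    close $out or die "Can't close $tmp: $!";

    if ($changed) {
        # replace the original only after the new copy is complete
        move($tmp, $path) or die "Can't replace $path: $!";
    }
    else {
        unlink $tmp;           # nothing to do, leave the original untouched
    }
    return $changed;
}

# Demo on a scratch file so the sketch can be run safely.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/index.html";
open my $fh, '>', $file or die "Can't write $file: $!";
print {$fh} qq{<p>Body</p>\n<!--#include virtual="/ssi/edfooter.txt"-->\n};
close $fh;
printf "%d include(s) converted in %s\n", convert_file($file), $file;
```

In the original loop, `convert_file($html_file)` would take the place of the open/read/close block. Because the original is never opened for writing, a failure partway through leaves it intact, which avoids the wiped-out-file problem. Note that the replacement file gets default permissions, so the mode may need restoring with chmod if that matters.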


I learn more and more about less and less until eventually I know everything about nothing.

In reply to modify file in place in script? Regex for changing includes from SSI to PHP by hmbscully
