pindar has asked for the wisdom of the Perl Monks concerning the following question:

Dear monks, this is very much a newbie question, but I rely on your kindness: I have a set of ligature instructions that I want to add to TeX property-list files. The general idea is simple: the original file reads
(LABEL O 50) (KRN O 200 R -0.05) ...etc
and this should become
(LABEL O 50) (LIG O 50 O 200) (KRN O 200 R -0.05) ...etc
So I have this
$_ =~ s/\(LABEL O 50\)/\(LABEL O 50\)\n \(LIG O 50 O 200\)/;
The problem is: not all the LABEL O XXX lines are already in the original file. Sometimes, I will have LIG instructions for a LABEL that does not exist yet. I know, e.g., that the instructions for the as yet nonexisting LABEL O 61 etc... must be placed between existing LABEL O 60 and LABEL O 62, but how can I instruct perl to include these instructions at the right address? Thanks so much for your help!!

Re: adding lines at specific addresses
by sgifford (Prior) on Oct 10, 2005 at 16:06 UTC
    A regex may be too simple for what you're trying to do. One straightforward way to approach the problem would be to go through the lines in the file one at a time. When you see a (LABEL O n) line, if you have items with n less than that, print those first; if you have items with n equal to that line, print that instead (and arrange to skip all lines until the next (LABEL ...) line); otherwise just print that line and continue printing lines until the next (LABEL ...).
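The line-by-line approach described above could be sketched roughly as follows. This is only an illustration under some assumptions: one instruction per line, labels written as `(LABEL O n)` with n in octal, and labels appearing in ascending order. The sub name `merge_ligs`, the `%new_lig` table and the sample values are made up for the example. Unlike the reply, it adds the new lines right after an existing label instead of replacing that label's block, since that is what the original question asks for.

```perl
use strict;
use warnings;

# Merge pending LIG instructions into a list of property-list lines.
# $lines   : array ref of input lines (one instruction per line, assumed)
# $pending : hash ref, numeric label code => array ref of new lines
sub merge_ligs {
    my ($lines, $pending) = @_;
    my %todo = %$pending;    # shallow copy so the caller's hash is untouched
    my @out;
    for my $line (@$lines) {
        if ($line =~ /^\s*\(LABEL O ([0-7]+)\)/) {
            my $n = oct $1;    # label codes are octal
            # First emit any pending labels that sort before this one.
            for my $m (sort { $a <=> $b } grep { $_ < $n } keys %todo) {
                push @out, sprintf('(LABEL O %o)', $m), @{ delete $todo{$m} };
            }
            push @out, $line;
            # The label already exists: add its new lines right after it.
            push @out, @{ delete $todo{$n} } if exists $todo{$n};
            next;
        }
        push @out, $line;
    }
    # Labels higher than anything in the input end up last.
    for my $m (sort { $a <=> $b } keys %todo) {
        push @out, sprintf('(LABEL O %o)', $m), @{ $todo{$m} };
    }
    return @out;
}

# Hypothetical sample data.
my %new_lig = (
    oct('50') => ['(LIG O 50 O 200)'],
    oct('61') => ['(LIG O 61 O 201)'],
);
my @input = (
    '(LABEL O 50)', '(KRN O 200 R -0.05)',
    '(LABEL O 60)', '(KRN O 201 R -0.03)',
    '(LABEL O 62)', '(KRN O 202 R -0.02)',
);
print "$_\n" for merge_ligs(\@input, \%new_lig);
```

In a real property-list file the labels sit inside an enclosing list, so leftover labels would have to go before its closing parenthesis rather than at the very end. That detail is left out here.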
Re: adding lines at specific addresses
by Zaxo (Archbishop) on Oct 10, 2005 at 17:40 UTC

    Tie::File will be a big help - it will let you treat the file as an array of lines. You'll be able to splice in a line while examining the surrounding context, and the modifications will appear on disk. You simply address the lines with an array index.

    After Compline,
    Zaxo
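A minimal sketch of the Tie::File idea might look like this. It assumes one instruction per line and octal `(LABEL O n)` lines in ascending order. The sub name `insert_label_block` and its arguments are hypothetical. It handles only a label that is not yet in the file: the new block is spliced in before the first existing label with a higher code.

```perl
use strict;
use warnings;
use Tie::File;

# Insert a new label block into $file, in place.
# $code : octal label code as a string, e.g. '61' (hypothetical argument)
# @ligs : the instruction lines that belong under the new label
# Returns the line index at which the block was inserted.
sub insert_label_block {
    my ($file, $code, @ligs) = @_;
    tie my @lines, 'Tie::File', $file or die "Cannot tie $file: $!";
    my $at = @lines;    # default: append after the last line
    for my $i (0 .. $#lines) {
        if ($lines[$i] =~ /^\s*\(LABEL O ([0-7]+)\)/ and oct($1) > oct($code)) {
            $at = $i;   # first existing label with a higher code
            last;
        }
    }
    # splice on the tied array rewrites the file on disk
    splice @lines, $at, 0, "(LABEL O $code)", @ligs;
    untie @lines;
    return $at;
}

# Usage (file name is made up):
# insert_label_block('font.pl', '61', '(LIG O 61 O 201)');
```

Since the file is modified in place, it may be wise to try this on a copy first.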

Re: adding lines at specific addresses
by Skeeve (Parson) on Oct 10, 2005 at 16:06 UTC

    I don't know much about TeX, but it seems the best advice would be to search whether some CPAN module will help you in parsing your files to find the proper location.


    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e;`$_'`
      OK, I've been thinking about the problem some more and have sort of an idea; what do you think about it:

      1. define a list of all the LABELS I need and store them as variables $label1, $label2 etc.;

      2. with an if ... elsif chain, look for each of these labels in the file and store the match in a new variable; if a match is found, this variable is defined, if not, it is undefined;

      3. a number of if... statements: if variable defined, extract the data belonging to it, print the new lines, then the old data; if variable is undefined, print the new LIG instructions only.

      Not sure if I have expressed this in an understandable way and if it can work, but I'll see whether I can cook up something.

        I think your three-step plan is making things more complicated than they need to be. For one thing, the list of labels that you need (assuming that you really do need these) should be stored in a hash. Just figure out what piece of information about each label will make sense in your app as a hash key (i.e. the thing you'd want to use later for looking things up), and what information will be useful (if anything) as a hash value.

        Use more than one hash if that makes things easier (but often, a single hash will do).
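As a rough illustration of the hash suggestion, one possible layout uses the numeric label code as the key and the list of new instruction lines as the value. The pairs in `@new` and the sub name `index_by_label` are invented for the example. Converting the octal code with `oct` makes later numeric comparisons and sorting straightforward.

```perl
use strict;
use warnings;

# Hypothetical new instructions as [ octal label code, instruction line ] pairs.
my @new = (
    [ '50', '(LIG O 50 O 200)' ],
    [ '61', '(LIG O 61 O 201)' ],
    [ '50', '(LIG O 55 O 202)' ],
);

# Build a hash: numeric label code => array ref of instruction lines.
sub index_by_label {
    my @pairs = @_;
    my %ligs;
    push @{ $ligs{ oct($_->[0]) } }, $_->[1] for @pairs;
    return \%ligs;
}

my $ligs = index_by_label(@new);

# Look up what has to be added for each label, in ascending order.
for my $n (sort { $a <=> $b } keys %$ligs) {
    printf "O %o: %s\n", $n, join ' ', @{ $ligs->{$n} };
}
```

With this in place, `exists $ligs->{ oct('61') }` answers whether anything needs to be added for a given label, so no separate variable per label is needed.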

        As for working out how to construct the output file, it would be better, if possible, for the changes to be made and output produced as you're reading the input. For example, if you somehow know that you need to insert a "(LIG O 51 ...)" line and you input a line that has "(LABEL O 52)", do you have enough information at this point to output the line(s) that should precede "(LABEL O 52)" ?

        If you need to read the whole input before resolving that sort of problem, that's okay -- but again, it may be better to hold the input data in a suitable structure (AoH or HoA or somesuch) to reduce the risk of scrambling it beyond recognition.
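If the whole input does have to be read first, a hash of arrays keyed by label code is one structure that could hold it. The sketch below is only an assumption about how that might look: `parse_blocks` and `render` are invented names, one instruction per line is assumed, and the output is regenerated in ascending label order with the new lines placed directly after each label line.

```perl
use strict;
use warnings;

# Parse lines into { numeric label code => [ lines following that label ] }.
# Lines before the first label are returned separately as a preamble.
sub parse_blocks {
    my @lines = @_;
    my (%block, @preamble, $current);
    for my $line (@lines) {
        if ($line =~ /^\s*\(LABEL O ([0-7]+)\)/) {
            $current = oct $1;
            $block{$current} ||= [];
        }
        elsif (defined $current) {
            push @{ $block{$current} }, $line;
        }
        else {
            push @preamble, $line;
        }
    }
    return (\%block, \@preamble);
}

# Combine existing blocks with new instructions (same hash layout)
# and return the output lines in ascending label order.
sub render {
    my ($block, $preamble, $new) = @_;
    my %seen;
    my @codes = sort { $a <=> $b }
                grep { !$seen{$_}++ } (keys %$block, keys %$new);
    my @out = @$preamble;
    for my $n (@codes) {
        push @out, sprintf('(LABEL O %o)', $n),
                   @{ $new->{$n}   || [] },    # new lines first
                   @{ $block->{$n} || [] };    # then the old data
    }
    return @out;
}
```

Note that this rewrites the label lines themselves, so the original indentation of those lines is lost, and anything after the last label (such as a closing parenthesis) is treated as part of the last block.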

        You showed us an "easy" example in the OP, but you haven't given us a clear example of a "hard" case -- what it looks like on input, and what you'd like it to look like on output -- maybe I'm missing something, but this part isn't clear to me based on what you've said so far.