ddrew78 has asked for the wisdom of the Perl Monks concerning the following question:

Hello once again Monks, I hope this will be my last question for some time. My file looks something like this:
    Class control
    priority 5
    Class voip
    priority 30
    Class video
    priority 40

    Class control
    priority 10
    Class voip
    priority 25
    Class video
    priority 45
What I need to do is keep the first occurrence of the word 'priority' in each block, but replace every later occurrence before the next blank line with 'bandwidth'. Basically, it should look something like this:
    Class control
    priority 5
    Class voip
    bandwidth 30
    Class video
    bandwidth 40

    Class control
    priority 10
    Class voip
    bandwidth 25
    Class video
    bandwidth 45
While I know I can use 'sed' to replace it, I need to keep the first occurrence in every paragraph, for lack of a better term. I'm not even sure this is possible, since the first occurrence has to be kept in each paragraph. Also, this will be part of a 600+ line script, so I didn't add the code that got me to this point. Any ideas will be greatly appreciated; I don't even know where to start on this.

Replies are listed 'Best First'.
Re: Replacing duplicate string
by Corion (Patriarch) on Apr 15, 2009 at 17:10 UTC

    Never mind that in the end it's going to end up in a 600 line script. Show us the code you wrote for this small, self-contained task.

      Therein lies my problem, I've got nothing. sed will replace every occurrence, grep will find me every occurrence. What I need is something to get me started, that will find/replace all occurrences of the word 'priority' except the first one in each paragraph. I had even considered condensing each paragraph into one line and potentially doing it from there, but even then I would still have the same issue. I'm completely out of ideas and don't know how to even start.

        How about remembering whether you've seen the word 'priority' already, and if so, replacing it, and not replacing it if you haven't seen it at least once already? You have to forget that you've seen 'priority' already if you see an empty line.

      Ok, so I've found this online, but I keep getting compilation errors. At any rate, do you think this is something I can build on to do what I'm trying to accomplish?
      open(MYINPUTFILE, "cos2");
      $/ = '';
      while (<MYINPUTFILE>) {
          while ( /\b([\w'-]+) (\s+\1)+\b/gi ) {
              sed 's/$1/bandwidth/';
          }
      }
      I really appreciate the help on this, it's been driving me crazy.

        Maybe you can tell me where in the Perl documentation you found the sed command?

        As a next step, I would recommend that you add comments indicating where you wrote the code that you think implements the steps I outlined for you to undertake.

        Also, it often helps us to help you better if you also tell us what the error message is, possibly together with the input data.

        A minor note regarding modifying the $/ variable. While it is useful for slurping a file, it is best done with local inside a code block to minimize any possible problems later on in the script. e.g.,
        {
            local $/;    # $/ is undef inside this block only (slurp mode)
            while (<MYINPUTFILE>) {
                ...
            }
        }
        That way the changes are limited to that block only. Very useful when messing with the special variables.
        -----
        "Ask not what you can do for your country. Ask what's for lunch."
        -- Orson Welles
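The advice about `local` above also applies to the paragraph mode (`$/ = ''`) used earlier in this thread. A minimal sketch, assuming the input is already held in a string; the helper name `read_paragraphs` and the sample data are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: read a string in paragraph mode and return the chunks.
sub read_paragraphs {
    my ($text) = @_;
    open my $fh, '<', \$text or die "open: $!";
    local $/ = '';              # paragraph mode, only until this sub returns
    my @paragraphs = <$fh>;     # one element per blank-line-separated block
    close $fh;
    return @paragraphs;
}

my $data  = "a\nb\n\nc\nd\n";   # made-up sample: two paragraphs
my @paras = read_paragraphs($data);
print scalar(@paras), " paragraphs\n";
print "separator restored\n" if $/ eq "\n";
```

Because the assignment is localized, code that runs after the sub returns still reads line by line.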
        Here's some pseudocode to help you get started:
        while (readline into $line)
            if $line starts with 'priority'
                print $line if 'priority not already seen'
                remember that priority was seen
            end
            else if $line is blank
                forget that priority was seen
            end
            else
                print $line
            end
        end

        while (readline into $line)
            if $line starts with 'priority'
                replace 'priority' with 'bandwidth' if 'priority already seen'
                remember that priority was seen
            end
            else if $line is blank
                forget that priority was seen
            end
            print $line
        end

        It's basically what Corion already said, but it has the basic structure of what your code might look like. Now all you have to do is translate it.

        Update: Changed pseudocode to do substitution rather than skip printing.
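One possible Perl translation of the substitution pseudocode above, offered as a sketch rather than the only way to write it. It assumes the lines are already in a list; the helper name `keep_first_priority` is invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: takes a list of lines, returns the rewritten lines.
sub keep_first_priority {
    my @lines = @_;
    my $seen  = 0;              # have we met 'priority' in this paragraph?
    for my $line (@lines) {
        if ( $line =~ /^priority\b/ ) {
            # replace only when an earlier 'priority' was already seen
            $line =~ s/^priority\b/bandwidth/ if $seen;
            $seen = 1;
        }
        elsif ( $line =~ /^\s*$/ ) {
            $seen = 0;          # blank line: forget, a new paragraph starts
        }
    }
    return @lines;
}

# split /^/ keeps the trailing newline on every line
my @input = split /^/, <<'END';
Class control
priority 5
Class voip
priority 30

Class control
priority 10
Class voip
priority 25
END

print keep_first_priority(@input);
```

Reading from a filehandle instead of a list only changes the loop header; the flag logic stays the same.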

Re: Replacing duplicate string
by CountZero (Bishop) on Apr 15, 2009 at 21:44 UTC
    A solution which will depend on Class voip always coming directly after the first priority:
    use strict;
    use warnings;

    while (<DATA>) {
        if (/^Class voip/ .. /^\s*$/) {
            s/priority/bandwidth/;
        }
        print;
    }
    __DATA__
    Class control
    priority 5
    Class voip
    priority 30
    Class video
    priority 40

    Class control
    priority 10
    Class voip
    priority 25
    Class video
    priority 45
    Output:
    Class control
    priority 5
    Class voip
    bandwidth 30
    Class video
    bandwidth 40

    Class control
    priority 10
    Class voip
    bandwidth 25
    Class video
    bandwidth 45

    CountZero

    "A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James

Re: Replacing duplicate string
by johngg (Canon) on Apr 15, 2009 at 21:38 UTC

    Read the file in paragraph mode, as others have suggested, then process each paragraph by splitting it into words and the separating whitespace (capturing the whitespace you split on keeps it in the list), doing the substitution when appropriate.

    use strict;
    use warnings;

    open my $inFH, q{<}, \ <<EOD or die qq{open: <<HEREDOC: $!\n};
    Class control
    priority 5
    Class voip
    priority 30
    Class video
    priority 40

    Class control
    priority 10
    Class voip
    priority 25
    Class video
    priority 45
    EOD

    {
        local $/ = q{};
        while( <$inFH> )
        {
            my $foundPriority = 0;
            $_ = join q{},
                map
                {
                    unless( m{\bpriority\b} ) { ; }
                    elsif( $foundPriority ++ ) { s{priority}{bandwidth}; }
                    else { ; }
                    $_;
                }
                split m{(\s+)};
            print;
        }
    }

    close $inFH or die qq{close: <<HEREDOC: $!\n};

    The output

    Class control
    priority 5
    Class voip
    bandwidth 30
    Class video
    bandwidth 40

    Class control
    priority 10
    Class voip
    bandwidth 25
    Class video
    bandwidth 45

    I hope this is useful.

    Cheers,

    JohnGG

Re: Replacing duplicate string
by CountZero (Bishop) on Apr 16, 2009 at 09:43 UTC
    And this solution always works (it no longer depends on Class voip coming right after the first priority):
    use strict;
    use warnings;

    while (<DATA>) {
        if (((/^priority/ .. /^\s*$/) || 0) > 1) {
            s/priority/bandwidth/;
        }
        print;
    }
    __DATA__
    Class control
    priority 5
    Class voip
    priority 30
    Class video
    priority 40

    Class control
    priority 10
    Class voip
    priority 25
    Class video
    priority 45
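    It takes advantage of the fact that, in scalar context, the .. operator returns a sequence number: 1 on the line where the start condition is first met, then 2, 3, and so on until the end condition matches. Outside the range it returns the empty string, hence the || 0.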
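To see the values the .. operator hands back, here is a small sketch that records them for each line. The sample lines are a shortened copy of the data above, and the `@values` array exists only for this illustration:

```perl
use strict;
use warnings;

# split /^/ keeps the trailing newline on every line
my @lines = split /^/, <<'END';
Class control
priority 5
Class voip
priority 30

Class control
END

my @values;
for (@lines) {
    # The || puts the range operator in scalar context, making it a flip-flop.
    # Outside the range it yields '', which || 0 turns into 0.
    my $state = ( /^priority/ .. /^\s*$/ ) || 0;
    push @values, $state;
}
print join( ' ', @values ), "\n";
```

The first 'priority' line gets 1, so a test for "greater than 1" skips it and catches every later line in the range. The last value in a range carries an "E0" suffix, which still compares numerically as the plain sequence number.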

    CountZero


Re: Replacing duplicate string
by AnomalousMonk (Archbishop) on Apr 16, 2009 at 01:33 UTC
    You would have done well to have taken a few deep breaths and calmly considered the clear advice given by Corion in Re^2: Replacing duplicate string (and also lostjimmy's pseudocode), but since others have given detailed replies, so shall I.

    You don't say how the data is held (In a scalar? An array of lines? In an unopened file?), but you and others are going with processing a file line-by-line, which generalizes nicely to processing an array, so I'll take that approach.

    use warnings;
    use strict;

    my $bol_priority = qr{ \A priority }xms;  # 'priority' begins a line
    my $blank_line   = qr{ \A \s* \z }xms;

    my $saw_priority = '';
    while (<DATA>) {
        s{ $bol_priority }{bandwidth}xms if $saw_priority;
        $saw_priority = /$bol_priority/ .. /$blank_line/;
        print;
    }
    __DATA__
    Class control
    priority 5
    Class voip
    priority 30
    Class video
    priority 40

    Class control
    priority 10
    Class voip
    priority 25
    Class video
    priority 45
    Output:
    Class control
    priority 5
    Class voip
    bandwidth 30
    Class video
    bandwidth 40

    Class control
    priority 10
    Class voip
    bandwidth 25
    Class video
    bandwidth 45
    Update: Added reference to lostjimmy's reply.
Re: Replacing duplicate string
by apok (Acolyte) on Apr 15, 2009 at 18:52 UTC
    This should work:
    while (<FILE>) {
        # set modification flag to 0
        my $do_mod = 0;
        if ($do_mod) { # if modification flag is true (>0)
            # apply substitution if matches
            s/priority/bandwidth/g;
            # if line is empty, clear modification flag
            $do_mod = 0 if (/^$/);
        } else {
            # Found first priority, mark modification flag true
            $do_mod++ if (/priority/);
        }
        print;
    }
    Does that make sense?
    -----
    "Ask not what you can do for your country. Ask what's for lunch."
    -- Orson Welles
      I don't think that code does what you think it does.
        It prints to STDOUT instead of inline editing, but that's an easy redirect. Otherwise, it works just fine on my end.
        -----
        "Ask not what you can do for your country. Ask what's for lunch."
        -- Orson Welles
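One plausible reading of the objection, assuming the loop was posted exactly as shown: `my $do_mod = 0;` sits inside the `while` body, so the flag is reset on every line and the substitution branch is never reached. A minimal sketch with the flag declared once, before the loop; it works on a list of lines instead of a filehandle, and the sub name `priority_to_bandwidth` is invented for this example:

```perl
use strict;
use warnings;

# Hypothetical helper: same flag logic, but the flag survives between lines.
sub priority_to_bandwidth {
    my @lines  = @_;
    my $do_mod = 0;                     # declared once, outside the loop
    for (@lines) {
        if ($do_mod) {
            # a first 'priority' was already seen in this paragraph
            s/priority/bandwidth/g;
            # empty line ends the paragraph: clear the flag
            $do_mod = 0 if /^$/;
        }
        else {
            # first 'priority' of the paragraph: keep it, set the flag
            $do_mod++ if /priority/;
        }
    }
    return @lines;
}

print priority_to_bandwidth(
    "Class control\n", "priority 5\n", "Class voip\n", "priority 30\n",
    "\n",
    "Class control\n", "priority 10\n", "Class voip\n", "priority 25\n",
);
```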