in reply to Re^2: Substitute in a subparagraph
in thread Substitute in a subparagraph

I don't want to see more data. I want to see your code. Here's a start:

use warnings;
use strict;

...

while (<DATA>) {
    ...
}

...

__DATA__
285 blabla_data[28] OUTPUT (
286 REQUIRED (
287 lead_up 0.193118 br fclk clkdom(3)
288 early_lea clkdom(3)
289 late_trail_dn -0.084738 br fclk clkdom(3)
290 late_tra
291 )
292 cext %0.00151055
299 min_ceff_up %0.034
300 TEXT TO BE REMOVED
301 )
302 blabla5 [145] OUTPUT (
303 REQUIRED (
304 early qclk clkdom(6)
305 early_l qclk clkdom(7)
306 late_trail_dn -0.125163 bf qclk clkdom(7)
307 late_t(7)
308 TEXT TO BE REMOVED
309 )

Just fill in the dotted parts.


Perl reduces RSI - it saves typing

Re^4: Substitute in a subparagraph
by repellent (Priest) on Oct 19, 2008 at 07:28 UTC
    If your paragraphs go from "blabla" to "blabla", one trick is to set the input record separator $/ to "blabla", so that each read returns a record ending in "blabla" instead of a line ending in the usual newline "\n". Perhaps something like:
    use warnings;
    use strict;

    ...

    local $/ = "blabla";
    while (<DATA>) {
        s/TEXT TO BE REMOVED// unless m/^5/;
        print;
        ...
    }
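    To see what the records look like with that separator, here is a minimal sketch. The sample text is a cut-down, hypothetical version of the data above, and split_records is just an illustrative helper name, not something from a module.

```perl
use strict;
use warnings;

# Hypothetical sample, shaped like the data in the thread.
my $text = <<'END';
blabla_data[28] OUTPUT (
 cext %0.00151055
 TEXT TO BE REMOVED
)
blabla5 [145] OUTPUT (
 late_trail_dn -0.125163 bf qclk clkdom(7)
 TEXT TO BE REMOVED
)
END

# Illustrative helper: read a string record by record, where a record
# ends at each occurrence of the given separator.
sub split_records {
    my ($input, $separator) = @_;
    open my $fh, '<', \$input or die "Cannot read in-memory string: $!";
    local $/ = $separator;    # each read now stops right after the separator
    my @records = <$fh>;
    close $fh;
    return @records;
}

my @records = split_records($text, "blabla");
printf "%d: [%s]\n", $_, $records[$_] for 0 .. $#records;
```

    Note that the separator stays at the end of each record, so the paragraph name is split in two: the first record here is just "blabla", and the record holding the blabla5 paragraph begins with "5 [145]". That is why the test is written as m/^5/.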

      Ok, I think I need to be more specific. I have attached the exact data this time. I need to match the paragraphs whose headlines contain (string)clk(digit in brackets) and leave the "QUALIFIED" there. In all other paragraphs I should delete it. In this example I have 3 paragraphs. The first one matches what I want, so I should leave its "QUALIFIED" in place, but the third one does not, so I should delete it there.
      qclk[6] INPUT (
       ! "asdk fd sasd"
       VALID (
        late_lead 3 ar qclk slope 20
        late_lead 3 af qclk slope 04
        early_dn 8 ar qclk slope 6
        early_up 6 af qclk slope 6
       )
       cext %0.00394757
       cmax %0.005504
       QUALIFIED
      )
      clkout_qclk_61[3] OUTPUT (
      )
      clkout_qclk_61[2] OUTPUT (
       REQUIRED (
        earlyp 0.5 br qclk clm(2)
        latel_up 5 bf qclk clk(2)
       )
       REQUIRED (
        early_lead_dn 0.004 bf qclk clkdom(2)
        late_trail_dn 0.005 br qclk clkdom(2)
       )
       cext %0.0647336
       max_ceff_up %0.187
       QUALIFIED
      )
        Here's my go using a flag.
        #!/usr/bin/perl
        use warnings;
        use strict;

        my $del_qual;
        while (<DATA>){
            # Any line starting in column one (a header or a closing paren) sets the flag.
            if (/^\S/){
                # Headers such as qclk[6] keep their QUALIFIED line.
                if (/^\w+clk\[\d+\]/){
                    $del_qual = 0;
                }
                else{
                    $del_qual = 1;
                }
            }
            # Skip the indented QUALIFIED line while the flag is set.
            next if $del_qual and /\s+QUALIFIED/;
            print qq{$_};
        }
        __DATA__
        qclk[6] INPUT (
         ! "asdk fd sasd"
         VALID (
          late_lead 3 ar qclk slope 20
          late_lead 3 af qclk slope 04
          early_dn 8 ar qclk slope 6
          early_up 6 af qclk slope 6
         )
         cext %0.00394757
         cmax %0.005504
         QUALIFIED
        )
        clkout_qclk_61[3] OUTPUT (
        )
        clkout_qclk_61[2] OUTPUT (
         REQUIRED (
          earlyp 0.5 br qclk clm(2)
          latel_up 5 bf qclk clk(2)
         )
         REQUIRED (
          early_lead_dn 0.004 bf qclk clkdom(2)
          late_trail_dn 0.005 br qclk clkdom(2)
         )
         cext %0.0647336
         max_ceff_up %0.187
         QUALIFIED
        )

        Output:

        qclk[6] INPUT (
         ! "asdk fd sasd"
         VALID (
          late_lead 3 ar qclk slope 20
          late_lead 3 af qclk slope 04
          early_dn 8 ar qclk slope 6
          early_up 6 af qclk slope 6
         )
         cext %0.00394757
         cmax %0.005504
         QUALIFIED
        )
        clkout_qclk_61[3] OUTPUT (
        )
        clkout_qclk_61[2] OUTPUT (
         REQUIRED (
          earlyp 0.5 br qclk clm(2)
          latel_up 5 bf qclk clk(2)
         )
         REQUIRED (
          early_lead_dn 0.004 bf qclk clkdom(2)
          late_trail_dn 0.005 br qclk clkdom(2)
         )
         cext %0.0647336
         max_ceff_up %0.187
        )
        I've assumed that each record starts with a non-space character in the first column.

        I personally believe that since the text you're operating on seems highly structured, and more precisely a proper language, a fully reliable solution would be to write a parser. Perhaps one exists already. However, if you're fairly sure that your data is regular enough, then you may want to slurp it all at once and process it with somewhat naive regexen. The following program follows such an approach and does work as expected on your sample, but be warned that it may fail on the full data.
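        A minimal sketch of that slurp-and-regex idea, for illustration only and not necessarily the program posted in the thread. It assumes that every paragraph header starts in column one and that every paragraph ends with a ")" alone in column one; the name strip_qualified and the shortened sample are made up for the example.

```perl
use strict;
use warnings;

# Shortened sample based on the data in the thread; the indentation is assumed.
my $text = <<'END';
qclk[6] INPUT (
 cext %0.00394757
 QUALIFIED
)
clkout_qclk_61[3] OUTPUT (
)
clkout_qclk_61[2] OUTPUT (
 REQUIRED (
  early_lead_dn 0.004 bf qclk clkdom(2)
 )
 cext %0.0647336
 QUALIFIED
)
END

# Illustrative helper: work on the whole text at once, one paragraph per match.
sub strip_qualified {
    my ($input) = @_;
    (my $output = $input) =~ s{
        ^ ( \S [^\n]* \n )    # header line, starting in column one
        ( .*? )               # body, matched as lazily as possible
        ^ ( \) \n )           # closing paren alone in column one
    }{
        my ($header, $body, $close) = ($1, $2, $3);
        # Drop the QUALIFIED line unless the header looks like qclk[6].
        $body =~ s/^[ \t]+QUALIFIED[ \t]*\n//mg
            unless $header =~ /^\w+clk\[\d+\]/;
        $header . $body . $close;
    }gmsex;
    return $output;
}

print strip_qualified($text);
```

        On this sample the qclk[6] paragraph keeps its QUALIFIED line and the clkout_qclk_61[2] paragraph loses it. Anything that breaks the column-one assumption, such as a nested ")" that is not indented, would confuse it, which is the kind of failure the warning above is about.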

        --
        If you can't understand the incipit, then please check the IPB Campaign.
        You may refine this further:
        use warnings;
        use strict;

        local $/ = "\n)\n";
        while (<DATA>) {
            my $got_end = chomp;
            s/\s*QUALIFIED$// unless m/^.*?clk\[\d+\]/;
            print;
            print $/ if $got_end;
        }

        What is the simulation language you are working with? If it is widely used and the sort of editing you want to do is common, there may already be public domain code available.


Re^4: Substitute in a subparagraph
by guyov1 (Novice) on Oct 19, 2008 at 07:21 UTC
    Well, my code does not mean much at the moment because I don't know how to handle the file yet. All I know now is how to remove "QUALI..." from the entire text. I need to think of a way to match my certain string (blabla5), take its paragraph (bounded by ((...)); the string to be removed comes just after that), and leave the "QUALI..." there and in other places like it. Maybe I could start a new search from a given line number, or save the data after my match in some kind of database... Thanks.
    #!/usr/bin/perl -w
    print ("Please enter the location:\n");
    chomp ($stuff=<STDIN>);
    open STUFF, $stuff or die "Cannot open $stuff for read :$!";
    while (<STUFF>) {
        s/QUALI_C//g ;
        print;    # $_ still ends with its own newline ("/n" was a typo)
    } ;