in reply to newline in unix

But, when I add even one letter, it fails.

Maybe I don't understand what you are saying, so I must ask: When you add even one letter where?

Re^2: newline in unix
by Anonymous Monk on Jul 15, 2004 at 04:01 UTC
    This is my data
    
        CONNAME(163.231.99.129)                 CURRENT
        CHLTYPE(SVRCONN)                        STATUS(RUNNING)
        LSTMSGTI(03.45.47)                      LSTMSGDA(2003-11-21)
    

    Here is the regex (thanks for helping clean it up)
    
    if (/CONNAME\((\d.+)\)\s.*CURRENT\s.*/is){
    print $&;} 
    

    It's still not showing the brackets around my character classes in my post, sorry.

    Where it fails is when I try to match from CONNAME through CHLTYPE.
    I can match from CONNAME through to CURRENT, including the newline character.
    But if I try to match any characters on the next line, the program returns no output.


      How are you getting your data into your code? I ask because you might be doing this:

      while (<>) { .... }

      ...which gets a line at a time, explaining why adding one more character after your regex fails. I put your data into a file named /tmp/corpus.txt and did this, which worked:

      cat /tmp/corpus.txt | perl -le '$_ = join("", <>); \
      print $& if /CONNAME\((\d{1,3}(\.\d{1,3}){3})\)\s*CURRENT\s*CHL/;'
      CONNAME(163.231.99.129) CURRENT
      CHL

      The different regex (\d{1,3}(\.\d{1,3}){3}) is slightly better at validating an IP address. Probably not essential unless you think your data might get munged; it could still match bogus things like "999.99.9.999", but it'll filter out bits like "1.2.3.4.5" or "2555.254.0.3".
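      To make the claim above concrete, here is a small sketch that runs the four addresses mentioned against that pattern. The sub name conname_ip is made up for the illustration, and I swapped the inner group for a non-capturing (?:...) so that $1 is the only capture; neither is from the original one-liner.

```perl
use strict;
use warnings;

# The octet pattern from the post, anchored by the literal parentheses
# that surround the address in the CONNAME(...) field.
my $conname_re = qr/CONNAME\((\d{1,3}(?:\.\d{1,3}){3})\)/;

# Hypothetical helper: returns the captured address, or undef when the
# field does not match.
sub conname_ip {
    my ($text) = @_;
    return $text =~ $conname_re ? $1 : undef;
}

for my $candidate (qw(163.231.99.129 999.99.9.999 1.2.3.4.5 2555.254.0.3)) {
    my $ip = conname_ip("CONNAME($candidate)");
    printf "%-16s %s\n", $candidate, defined $ip ? "matches" : "rejected";
}
```

      The rejections depend on the literal parentheses on either side: without them the pattern could still find four valid-looking octets somewhere inside "1.2.3.4.5".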

      Note that loading $_ like that is often frowned upon; you might consider:

      my $corpus = join("", <>);
      print $& if $corpus =~ /CONNAME\((\d{1,3}(\.\d{1,3}){3})\)\s*CURRENT\s*CHL/;

      Hope that helps!

      --j

      First of all, look up Writeup Formatting Tips -- it explains how to post code coherently, which is like this:

      <code>

      # perl code here, with literal brackets intact: [blah]
      </code>

      As for the regex problem itself, now that I see what the data "really" looks like (though it's hard to be sure how many whitespace characters there really are), maybe something like this would work better:

      if ( /\w+.([\d.]+).\s+\w+\s+\w+/ ) { print $&, $/; }
      Or, if you really want to be specific about the characters you want to match:
      if ( /\w+.([\d.]+).\s+CURRENT\s+CHLTYPE/ ) { print $&, $/; }
      I did try those out on your data, and the print-out includes the linefeed where it belongs.

      Now, I presume that your real goal is something other than that odd looking output from print, and depending on what your real goal is, maybe a regex isn't your best choice -- e.g. how about using split()?
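      A rough sketch of the split() idea, assuming every field is either KEY(VALUE) or a bare word like CURRENT, and that values never contain whitespace. The sample record is copied from the post (exact spacing is a guess), and the %field hash is just a name picked for the example.

```perl
use strict;
use warnings;

# Sample record from the original post; the exact whitespace is assumed.
my $record = <<'END';
    CONNAME(163.231.99.129)                 CURRENT
    CHLTYPE(SVRCONN)                        STATUS(RUNNING)
    LSTMSGTI(03.45.47)                      LSTMSGDA(2003-11-21)
END

# split ' ' breaks on any run of whitespace, newlines included, and
# discards leading whitespace, so line boundaries stop mattering.
my %field;
for my $token (split ' ', $record) {
    if ($token =~ /^(\w+)\((.*)\)$/) {
        $field{$1} = $2;      # KEY(VALUE) pair
    }
    else {
        $field{$token} = 1;   # bare flag such as CURRENT
    }
}

print "$field{CONNAME} $field{CHLTYPE} $field{STATUS}\n";
```

      Once the fields are in a hash, the question of where the newlines fall goes away, which may or may not suit your real goal.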

      update: having seen ercparker's reply below, I should point out that I was assuming all along that you already had all three lines of text stored together in $_ -- but if you've actually been reading and matching one line at a time (as most people usually do), then ercparker is right: you can't match across a newline if $_ does not contain anything after the first newline.
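      The difference between the two ways of reading can be shown in a few lines. This is only a sketch: the record is the sample from the post with guessed whitespace, and the variable names are mine.

```perl
use strict;
use warnings;

# Sample record from the original post; the exact whitespace is assumed.
my $record = <<'END';
    CONNAME(163.231.99.129)                 CURRENT
    CHLTYPE(SVRCONN)                        STATUS(RUNNING)
    LSTMSGTI(03.45.47)                      LSTMSGDA(2003-11-21)
END

my $re = qr/CONNAME\(([\d.]+)\)\s+CURRENT\s+CHLTYPE/;

# Line at a time, the way while (<>) delivers it: no single line
# contains both CONNAME and CHLTYPE, so nothing can match.
my $line_hits = grep { /$re/ } split /^/, $record;

# Whole record in one string: \s+ can now cross the newline.
my $slurp_hit = $record =~ $re ? 1 : 0;

print "line-at-a-time matches: $line_hits\n";
print "slurped matches:        $slurp_hit\n";
```

      When reading from a file, join("", <>) as shown earlier in the thread, or setting local $/ to undef before the read, both give you the whole record in one string.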

      Here we go...

      
      if (/CONNAME\(([\d.]+)\)[\s.]*CURRENT[\s.]*/is){
      print $&;} 
      
      


        Sorry, I'm hopeless at forum formatting.

        This works:
        
        
         if (/CONNAME\(([\d.]+)\)[\s.]*CURRENT[\s.]*/is) {
         print $&;} 
         


        This does not:
        
         if (/CONNAME\((.*\..*\..*\..*)\)\[\s.\]*CURRENT\[\s.\]*CHL/is) {
         print $&;}
        


        My last post wasn't a solution to my problem, but was the actual regex with brackets included.