in reply to String contents

The problem that we all seem to be having with your question is that print already prints the contents of the string. If you want something different from what print does, you'll need to be more specific. Wanting to see "the structure of the string" is not explicit enough. Are you saying that you want whitespace converted to something more visible? Do you want one of those little arrow things everywhere a \n shows up? Do you want little dots in place of spaces, and long arrows in place of tabs? Do you want color-highlighted sentence structure charts?

Maybe you want to know all of the octets, dumped in hex format. Maybe you want HTML::Entities to encode your strings so that ampersands show up with the &amp; encoding.

Whatever it is that you want, you need to tell us exactly what it is. Otherwise, the best answer is print, because it prints the contents of a string.
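To make two of those options concrete, here is a minimal sketch. It assumes a made-up sample string (`$str` is not from the original question) and shows the string once with tabs and newlines escaped, using Data::Dumper's `Useqq` setting, and once as space-separated hex octets via `unpack`. Treat it as one possible reading of "the structure of the string", not as the answer to the original question.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical sample string with a tab and a newline in it.
my $str = "New\tYork\n";

# Useqq makes Dumper emit a double-quoted string with control characters
# escaped; Terse drops the leading "$VAR1 = ".
local $Data::Dumper::Useqq = 1;
local $Data::Dumper::Terse = 1;
my $visible = Dumper($str);

# '(H2)*' unpacks every octet into its own two-digit hex string.
my $hex = join ' ', unpack '(H2)*', $str;

print $visible;
print "$hex\n";   # 4e 65 77 09 59 6f 72 6b 0a
```

If the string holds decoded Unicode characters rather than octets, it would need to be encoded first (for example with Encode) before a hex dump of octets means anything.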


Dave

Re^2: String contents
by perlyr (Novice) on Jun 29, 2012 at 08:56 UTC
    Sorry about my earlier posts. Here is the regex in a nice format:
    /^\s*(\w+),\s*(\w+ \w+)(.+?\s*LLP)/m;
    My question now is: why did I extract "New" instead of "New York"? Any suggestions?

      My suggestion is to post in one node the sample text, and the regular expression that is failing to match where you expect it to. Post them well formatted, using the tips found in Writeup Formatting Tips, and be sure that you're posting actual full copy and pastes of the text and code that fail.

      When I tested the text you provided against the regexp from the preceding node, I got the following results:

      Capture variables:

      • Digit Captures
        • $1 => Melville
        • $2 => New York
        • $3 =>                             /s/KPMG LLP
      • ${^PREMATCH}  => ended June 30, 2001 in conformity with accounting principles generally accepted
        in the United States of America. Also in our opinion, the related financial
        statement schedule, when considered in relation to the basic consolidated
        financial statements taken as a whole, presents fairly, in all material
        respects, the information set forth therein.
        
      • ${^MATCH}     => 
        Melville, New York                            /s/KPMG LLP
      • ${^POSTMATCH} => 
        September 26, 2001
        STR
      • $^N           =>                             /s/KPMG LLP
      • @- => (352,354,364,372)
      • @+ => (411,362,372,411)
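For anyone who wants to reproduce output like the above without the tester, here is a rough sketch. It assumes a shortened stand-in for the text (`$text` below is abbreviated, so the offsets will differ from the ones listed) and uses the /p modifier so that `${^PREMATCH}` is populated. `@-` and `@+` hold the start and end offsets of the whole match (index 0) and of each capture group, so `substr` with those offsets recovers the same text as `$1`, `$2`, and `$3`.

```perl
use strict;
use warnings;

# Abbreviated stand-in for the original text; offsets differ from the post.
my $text = "respects, the information set forth therein.\n\n"
         . "Melville, New York      /s/KPMG LLP\n\nSeptember 26, 2001\n";

my ($city, $state, $firm, $pre, @starts, @ends);
if ($text =~ /^\s*(\w+),\s*(\w+ \w+)(.+?\s*LLP)/mp) {
    ($city, $state, $firm) = ($1, $2, $3);
    $pre    = ${^PREMATCH};   # everything before the match (needs /p)
    @starts = @-;             # start offsets: whole match, then each group
    @ends   = @+;             # end offsets, same order
}

# Rebuild the second capture from its offsets alone.
my $state_by_offset = substr $text, $starts[2], $ends[2] - $starts[2];

print "city=[$city] state=[$state] firm=[$firm]\n";
print "state via offsets=[$state_by_offset]\n";
```

Note that the match starts on the blank line before "Melville", because ^ under /m matches there and \s* then consumes the newline. That is why the match shown above begins with an empty line.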

      The text I used was exactly this:

      ended June 30, 2001 in conformity with accounting principles generally accepted
      in the United States of America. Also in our opinion, the related financial
      statement schedule, when considered in relation to the basic consolidated
      financial statements taken as a whole, presents fairly, in all material
      respects, the information set forth therein.

      Melville, New York                            /s/KPMG LLP

      September 26, 2001
      STR

      And the regexp I used was exactly this:

      /^\s*(\w+),\s*(\w+ \w+)(.+?\s*LLP)/m

      Try it yourself with my regexp tester, here: Perl Regex Tester


      Dave

        Sorry, first time here. Still trying to find my way around. Yes, that code works now, but when I was trying to generalize it, I failed. Here is the new code:
        /^\s*(\w+|\w+ \w+|\w+ \w+ \w+),\s*(\w+|\w+ \w+|\w+ \w+ \w+)\s*(.+?\s*LLP)/m
        I have to modify it because sometimes the state could be "District of Columbia" or "Virginia".
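A note on why the alternation version can capture only "New": Perl tries alternatives left to right, so `\w+` succeeds first with "New", and since `\s*` plus `(.+?\s*LLP)` happily swallows "York ... LLP", the engine never has a reason to backtrack into the longer alternatives. One possible fix, sketched below, is a greedy "one to three words" sub-pattern plus a requirement of at least two whitespace characters before the signature. Both the `$words` helper and the two-space rule are assumptions on my part, based only on the sample line in this thread; real filings may be laid out differently.

```perl
use strict;
use warnings;

# One to three words separated by single spaces; greedy, so it takes
# as many words as it can (hypothetical helper, not from the thread).
my $words = qr/\w+(?: \w+){0,2}/;

# Assumes the state and the signature are separated by 2+ whitespace chars.
my $re = qr/^\s*($words),\s*($words)\s{2,}(\S.*?\s*LLP)/m;

# Made-up sample lines modelled on the one in this thread.
my @samples = (
    "Melville, New York                  /s/KPMG LLP",
    "Washington, District of Columbia    /s/KPMG LLP",
    "McLean, Virginia                    /s/KPMG LLP",
);

my @states;
for my $line (@samples) {
    if ($line =~ $re) {
        push @states, $2;
        print "city=[$1] state=[$2] firm=[$3]\n";
    }
}
```

If the city or state can contain punctuation (for example "St. Louis"), `\w+` will not be enough and the sub-pattern would need widening.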