Here is a problem that may be harder or simpler than it looks.

The task is to extract double-quoted strings from another string. For the string

'"foo" & "bar"'

the matched strings would be "foo" and "bar" including the quotes.

What makes this a little tricky is that a double quote inside the quoted string is escaped by doubling it. I think this escape format comes from Basic. Thus, the string " He said "maybe" " would be escaped as follows:

'" He said ""maybe"" "'

This leads to some nice pathological cases such as '""""""""""' and '" """&" """'
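To make the escaping rule concrete, here is a minimal sketch of it. The helper name quote_escape() is hypothetical and not part of the puzzle; it just wraps a string in double quotes and doubles any quotes inside it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the original post): wrap a string in
# double quotes, doubling every embedded double quote.
sub quote_escape {
    my ($str) = @_;
    (my $escaped = $str) =~ s/"/""/g;
    return qq{"$escaped"};
}

print quote_escape(' He said "maybe" '), "\n";   # prints: " He said ""maybe"" "
```

Under this rule a value made up only of quotes turns into a long run of quotes, which is where the pathological cases come from.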

Here is a test framework if you care to bend your brain to this puzzle. I'll post my own answer later.

Problem. Write a function find_quote() which takes a string and adds <> around each double-quoted string.

For example

'"foo" & "bar"' -> '<"foo"> & <"bar">'

Double quotes within the string are escaped as a pair of double quotes. The quotes will always be balanced. Apart from double quotes there can be any arbitrary characters before or after the strings. The test below should demonstrate the types of string to expect.

#!/usr/bin/perl -w

use strict;
use Test::More 'no_plan';

# Test input and target output
my %strings = (
    ''                   => '',
    '""'                 => '<"">',
    '""""'               => '<"""">',
    '""&""&""&""&""'     => '<"">&<"">&<"">&<"">&<"">',
    '""""""""""'         => '<"""""""""">',
    '""""&""""'          => '<"""">&<"""">',
    '"""&"""'            => '<"""&""">',
    '"foo"&"bar"'        => '<"foo">&<"bar">',
    ' "foo" & "bar" '    => ' <"foo"> & <"bar"> ',
    '" "'                => '<" ">',
    '" "" "'             => '<" "" ">',
    '" """&" """'        => '<" """>&<" """>',
    '" "" "'             => '<" "" ">',
    '" "" "&" "" "'      => '<" "" ">&<" "" ">',
    '" "" & "" "'        => '<" "" & "" ">',
    '""&""""'            => '<"">&<"""">',
    ' "" '               => ' <""> ',
    ' """" '             => ' <""""> ',
    ' ""&""&""&""&"" '   => ' <"">&<"">&<"">&<"">&<""> ',
    ' """""""""" '       => ' <""""""""""> ',
    'test("foo","bar")'  => 'test(<"foo">,<"bar">)',
);

# Add your code here
sub find_quote {
    my $str = $_[0];

    # Broken example
    $str =~ s/(".*?")/<$1>/g;

    return $str;
}

# Run the tests
while (my($input, $result) = each %strings) {
    is(find_quote($input), $result, "for string:\t" . "'$input'");
}
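For anyone who wants a starting point, here is one possible sketch. It is not the answer promised above, just one common way to handle doubled-quote escapes. The broken example fails because .*? stops at the first closing quote, so '""""' comes out as '<""><"">'. The version below instead treats the body of a string as a run of either non-quote characters or doubled quotes, and it relies on the stated assumption that the quotes are always balanced.

```perl
use strict;
use warnings;

# A sketch only: match an opening quote, then any number of
# "non-quote character" or "pair of quotes", then the closing quote.
sub find_quote {
    my ($str) = @_;
    $str =~ s/("(?:[^"]|"")*")/<$1>/g;
    return $str;
}

print find_quote('"foo" & "bar"'), "\n";   # prints: <"foo"> & <"bar">
print find_quote('" """&" """'), "\n";     # prints: <" """>&<" """>
```

Trying the doubled-quote alternative inside the loop before the closing quote is what lets '"""&"""' come out as a single string rather than being split at the first inner quote.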

--
John.


In reply to Matching escaped quoted strings by jmcnamara
