Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone, I have an array which contains the following:
href="/something/something1/html" href="/abc.html" href="blah.html" href="/dir1/dir2/dir3/test"
My question: is there a way to parse the array so that I get only the values of href, like below?
/something/something1/html /abc.html /blah.html
and so on and so forth.

Please do help me out.
Thank you very much.

update (broquaint): added formatting

Replies are listed 'Best First'.
Re: Help on parsing an array
by broquaint (Abbot) on May 13, 2003 at 13:42 UTC
    Assuming your data is as simple as you present it
    my @hrefs = qw(
        href="/something/something1/html"
        href="/abc.html"
        href="blah.html"
        href="/dir1/dir2/dir3/test"
    );

    ## new array
    my @links = map /"(.*)"/, @hrefs;

    ## modify @hrefs
    s/href="(.*)"/$1/ for @hrefs;

    print "links - \n", map("\t$_\n", @links), $/;
    print "hrefs - \n", map("\t$_\n", @hrefs), $/;

    __output__
    links -
            /something/something1/html
            /abc.html
            blah.html
            /dir1/dir2/dir3/test

    hrefs -
            /something/something1/html
            /abc.html
            blah.html
            /dir1/dir2/dir3/test

    See map and perlre for more info on the code above.
    HTH

    _________
    broquaint

Re: Help on parsing an array
by Wonko the sane (Curate) on May 13, 2003 at 13:42 UTC
    There are at least 101 ways to do this. Here is one of them.
    my @urls = map { /href=([^\n\r]+)/ } @strings;

    I do wonder whether, at a higher level, your task might be better suited to one of the HTML Parser modules, though. :-)

    Wonko.

      I used the way you showed me. Is there a way to get rid of the "" as well? That would be really wonderful. Here is the output as I am getting it:
      "/forwardBookmark.jsp" "/search/searchhelp.jsp"
      Thanks again for your help.
      Got the quotes. I was just playing with it and figured it out. Thank you very much for your help.
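For anyone who hits the same snag: one way to drop the quotes is to keep them in the pattern but outside the capturing parentheses. The following is only a sketch, assuming @strings holds elements shaped like the ones quoted in the follow-up above; the poster did not say which fix they actually used.

```perl
use strict;
use warnings;

# Sample data in the shape shown in the follow-up (assumed contents of @strings).
my @strings = (
    'href="/forwardBookmark.jsp"',
    'href="/search/searchhelp.jsp"',
);

# The quotes sit outside the parentheses, so only the text between
# them is captured. In list context the match returns its captures,
# and map collects them; elements that do not match contribute nothing.
my @urls = map { /href="([^"\n\r]+)"/ } @strings;

print "$_\n" for @urls;
```

Because the quotes are part of the match but not of the capture, @urls ends up holding the bare paths.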
Re: Help on parsing an array
by Limbic~Region (Chancellor) on May 13, 2003 at 14:10 UTC
    Anonymous Monk,
    Let's forget for a second that this looks like a URL and you should be using a module from CPAN (see this search for some ideas). What you are really asking is to get what is between the first quote and the last quote. This assumes there are no quotes that are escaped in the data.
    /"([^"]*)"/
    This is called a negated character class. See Death to Dot Star! for more information.
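The difference between the two patterns only shows up when a string contains more than one pair of quotes. A small sketch, using a made-up input line ($line and its class attribute are hypothetical, not from the original question):

```perl
use strict;
use warnings;

# Hypothetical input: two quoted attributes in one string.
my $line = 'href="/abc.html" class="nav"';

# Greedy dot-star runs on to the LAST double quote in the string.
my ($greedy) = $line =~ /"(.*)"/;

# The negated character class cannot cross a double quote,
# so it stops at the NEXT one.
my ($negated) = $line =~ /"([^"]*)"/;

print "greedy:  $greedy\n";
print "negated: $negated\n";
```

With a single pair of quotes per element, as in the question, both patterns capture the same text.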

    Cheers - L~R

Re: Help on parsing an array
by hmerrill (Friar) on May 13, 2003 at 13:58 UTC
    Your question is not very clear - are you saying that you have an array like this:
    my @input_array = (
        "href=\"/something/something1/html\"",
        "href=\"/abc.html\"",
        "href=\"blah.html\"",
        "href=\"/dir1/dir2/dir3/test\"",
    );
    If that IS what you are starting with, then this should work - use 'split' to split each line:
    foreach my $array_element (@input_array) {
        # split on the first '=' only, in case the value itself contains one
        my ($href_text, $href_directory) = split /=/, $array_element, 2;

        ### now take off the leading and trailing double quotes ###
        my $href_dir_no_quotes = "";
        if ($href_directory =~ /^"(.*)"$/) {
            $href_dir_no_quotes = $1;
        }
        print "$href_dir_no_quotes\n";
    }
    Use with a grain of salt, since this is completely untested, but it should be close. I'm not sure if each double quote needs to be escaped in the regular expression.
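On the escaping question: a double quote has no special meaning inside a /.../ regular expression, so it does not need a backslash, although adding one is harmless. A quick check, using one value from the question as sample input (the variable names are illustrative only):

```perl
use strict;
use warnings;

# One quoted value in the shape produced by the split above.
my $quoted = '"/dir1/dir2/dir3/test"';

# Unescaped and escaped double quotes match the same literal character.
my ($plain)   = $quoted =~ /^"(.*)"$/;
my ($escaped) = $quoted =~ /^\"(.*)\"$/;

print "plain:   $plain\n";
print "escaped: $escaped\n";
```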

    HTH.