in reply to Extracting info from URL into an array

UPDATE:
I didn't see that you are pulling the URIs out of files the first time I read your question. There really is no reason to use URI::Find if you have already found the URIs. ;) This code reads all text files (.txt extension) in your test1 folder. I used an absolute path in the glob instead of the .. metacharacter. I also assume that each file has the URI on its first line (starting with a scheme) and that the file name always ends with the .txt extension.
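    use strict;
    use warnings;
    use URI;
    use File::Basename;

    my @suffix = qw(.jsp .html .asp .htm);

    for my $file (</path/to/test1/*.txt>) {
        # read only the first line, which is assumed to hold the URI
        open(my $fh, '<', $file) or die "Can't open $file: $!";
        my $line = <$fh>;
        close $fh;
        next unless defined $line;
        chomp $line;

        my $uri = URI->new($line);
        next unless $uri->scheme;

        # skip URIs that have no 'content' parameter
        my %q = $uri->query_form;
        next unless defined $q{content};

        my (undef, @key) = split(/\//, dirname($q{content}));
        push @key, basename($q{content}, @suffix);

        print "<Textfile>\n",
              "filename: {", basename($file), "}\n",
              "Keys: {", join(',', @key), "}\n",
              "</Textfile>\n";
    }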
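To see what the dirname/basename/split steps do on their own, here is a minimal sketch that skips the file reading and the URI parsing and works on a hard-coded content path. The helper name keys_from_path is made up for this illustration; only File::Basename (a core module) is used.

```perl
use strict;
use warnings;
use File::Basename;

my @suffix = qw(.jsp .html .asp .htm);

# Hypothetical helper (not part of the code above): turn the value of the
# 'content' parameter into a list of keys.
sub keys_from_path {
    my ($content) = @_;

    # dirname() returns everything before the last path component;
    # splitting it on "/" yields a leading empty string (because the path
    # starts with a slash), which the undef placeholder throws away.
    my (undef, @key) = split /\//, dirname($content);

    # basename() with a suffix list returns the file name minus its extension
    push @key, basename($content, @suffix);
    return @key;
}

my @key = keys_from_path('/asp/administrative/catalog/products/Network/benefits.jsp');
print join(',', @key), "\n";
```

For that sample path the directory parts become the first five keys and the file name without its extension becomes the last one.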

ORIGINAL POST:
Well, your request is confusing at best. If you want to parse URIs, URI::Find is a fine tool for doing so. Give the constructor a reference to a subroutine (or an anonymous sub), then pass find() a reference to a scalar holding your text (in my example I slurp the built-in DATA filehandle), and URI::Find will call your sub every time it encounters a URI. Here is some code that sort of Does What You Want. File::Basename is used to remove the extension ... but I am starting to think that a better approach would be to remove any extension and split on the forward slash. Anyways, it's a start:
    use strict;
    use warnings;
    use URI::Find;
    use File::Basename;

    # add more if needed
    my @suffix = qw(.jsp .html .asp .htm);

    # optionally open a file here and replace DATA
    # with the name of the filehandle you opened
    my $data = do {local $/;<DATA>};

    my $finder = URI::Find->new(\&call_back);
    $finder->find(\$data);

    sub call_back {
        my $uri = shift;
        my %q = $uri->query_form;
        my $content = $q{content};

        # using split like this is a hack ... improvements anyone?
        my (undef,@key) = split(/\//,dirname($content));

        # this will add the file name minus its extension
        push @key, basename($content,@suffix);

        # you could push these to an array instead of printing
        print "Filename: {", basename($content), "}\n";
        print "Keys: {", join(',',@key), "}\n\n";
    }

    __DATA__
    http://www.yyy.com/store/application/meraqf?origin=rrr.jsp&event=link(goto)&content=/asp/administrative/catalog/products/Network/benefits.jsp
    is this text automatically 'ignored'? yes, it is ;)
    http://foo.com/?content=/asp/management/catalog/products/Network/propaganda.asp
    http://foo.com/?content=/path/to/bar.html
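As for the "remove any extension and split on the forward slash" idea, a rough sketch could look like the following. This is only one way to do it, not tested against your real data: the helper name path_to_keys is made up, and it assumes the extension is whatever follows the last dot in the final path component, so no suffix list is needed.

```perl
use strict;
use warnings;

# Hypothetical helper: strip any trailing extension, then split on "/".
sub path_to_keys {
    my ($content) = @_;
    return unless defined $content;          # no 'content' parameter, no keys

    # drop a final ".ext", whatever the extension happens to be
    (my $path = $content) =~ s/\.[^.\/]+$//;

    # grep discards the empty piece produced by the leading slash
    return grep { length } split m{/}, $path;
}

for my $content (qw(
    /asp/management/catalog/products/Network/propaganda.asp
    /path/to/bar.html
)) {
    print "Keys: {", join(',', path_to_keys($content)), "}\n";
}
```

The trade-off is that a file name with no extension but a dot in a directory name is still handled (the regex refuses to cross a slash), while a name such as archive.tar.gz would only lose its last extension.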

jeffa

L-LL-L--L-LL-L--L-LL-L--
-R--R-RR-R--R-RR-R--R-RR
B--B--B--B--B--B--B--B--
H---H---H---H---H---H---
(the triplet paradiddle with high-hat)

Re: (jeffa) Re: Extracting info from URL into an array
by Anonymous Monk on May 27, 2003 at 15:07 UTC
    Jeffa
    Thank you very much for your help!! :). I will try it now!!