use split?

Replies are listed 'Best First'.
Re: use split? by fruiture (Curate) on Nov 01, 2002 at 11:28 UTC
Imho split() is NOT a good idea: `http://host.com/some/uri/whatever?some/query/string` -- http://fruiture.de
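To make the objection concrete, here is a minimal sketch in core Perl of what a bare split on `/` does to that URL. The helper names `last_segment_naive` and `last_segment_no_query` are made up for illustration and do not come from the thread. Presumably the point is that the slashes inside the query string get split as well, so the "last path segment" comes out wrong unless the query is removed first.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): take the last '/'-separated field.
sub last_segment_naive {
    my ($url) = @_;
    return ( split m{/}, $url )[-1];
}

# Hypothetical helper: cut the query string off first, then split on '/'.
sub last_segment_no_query {
    my ($url) = @_;
    my ($without_query) = split /\?/, $url, 2;
    return ( split m{/}, $without_query )[-1];
}

my $url = 'http://host.com/some/uri/whatever?some/query/string';
print last_segment_naive($url),    "\n";    # prints "string" (part of the query)
print last_segment_no_query($url), "\n";    # prints "whatever"
```

This only covers the one case raised here; fragments, userinfo and other URL parts would still need separate handling.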
Re: Re: use split? by Enlil (Parson) on Nov 01, 2002 at 23:02 UTC
The problem with either method is that there are special cases you might miss unless you understand exactly what a URL can look like (or, for that matter, any data you have to parse). Personally, I would use a module if someone has already done the legwork of working out what specifications a URL has to meet. I initially coded up a regex for this but didn't post it, because I don't wish to do someone else's homework; I posted the method I took instead, and I completely neglected the special case that fruiture mentions above. But I don't see a problem with using split(s). Anyhow, on to the code (no guarantees that it will work for all cases; I would use URI):

    use strict;
    use warnings;

    while ( my $url = <DATA> ) {
        chomp($url);
        my $dup_url = $url;
        if ( length($url) > 49 ) {
            # Regex version: keep scheme://host/ and the last path segment,
            # dropping the query string if there is one.
            $url =~ s!(?: (^https?://[^/]+/).*/(.*)\?.* )
                      |
                      (?: (^https?://[^/]+/).*/(.*) )
                     ! ($1||$3) . '(...)/' . ($2||$4) !ex;

            # Split version: strip the query first, then split the rest on '/'.
            my $http = (split /\/\//, $dup_url)[0];
            my ($url_start, $url_end) =
                (split /\//, (split /\?/, $dup_url)[0])[2,-1];
            $dup_url = "$http//$url_start/(...)/$url_end";
        }
        print "REGEX: $url\n";
        print "SPLIT: $dup_url\n\n";
    }
    __DATA__
    http://some-shop.com/dir1/dir2/buystuff.cgi?x=1&y=2&z=3
    http://somewhere/with/a/vastly/deep/structure/virus.exe
    http://host.com/some/uri/whatever?some/query/stringthatis/here
    https://some-shop.com/dir1/dir2/buystuff.cgi?x=1&y=2&z=3
    https://somewhere/with/a/vastly/deep/structure/virus.exe
    https://host.com/some/uri/whatever?some/query/stringthatis/here
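Since the post recommends URI without showing it, here is a rough sketch of the same shortening done with the CPAN URI module (not part of core Perl, so it has to be installed). The helper name `shorten_url` and its `$max` parameter are assumptions made for illustration; the 49-character threshold and the `(...)` output format are copied from the code above. Only `URI->new`, `scheme`, `authority` and `path_segments` are used.

```perl
use strict;
use warnings;
use URI;

# Hypothetical helper (not from the thread): shorten a long URL to
# scheme://host/(...)/last-path-segment.
sub shorten_url {
    my ($url, $max) = @_;
    $max = 49 unless defined $max;    # same threshold as the post above
    return $url if length($url) <= $max;

    my $uri = URI->new($url);
    # path_segments never includes the query string, so slashes in the
    # query cannot leak into the result. Drop the empty leading segment.
    my @segments = grep { length } $uri->path_segments;
    return $url unless @segments > 1;    # nothing in the middle to elide

    return sprintf '%s://%s/(...)/%s',
        $uri->scheme, $uri->authority, $segments[-1];
}

print shorten_url('http://host.com/some/uri/whatever?some/query/stringthatis/here'), "\n";
# prints "http://host.com/(...)/whatever"
```

The appeal of this route is that the module decides where the path ends and the query begins, so the special case from fruiture's reply needs no extra code.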
Missed pun opportunity! by cebrown (Pilgrim) on Nov 01, 2002 at 00:21 UTC
I should have said that I can't post code because I have to `split`.