in reply to Split on regex, don't match partial regex
There is more than one itag id and url in the example you gave.
Let's take the example as a single line, the way it is given in the OP. Please note that how those line(s) are input into (or "arrive" at, to use your word) the perl script was not shown.
You can do it like so:
    use warnings;
    use strict;

    my $line = 'itag=44&url=http://o-o---preferred---sn-u5a3u5a3-h5oe---v13---lscache3.c.youtube.com/videoplayback?upn=8kbZJLkF5PA&sparams=cp%2Cid%2Cip%2Cipbits%2Citag%2Cratebypass%2Csource%2Cupn%2Cexpire&fexp=927101%2C923006%2C922401%2C920704%2C912806%2C913419%2C913546%2C913556%2C919349%2C919351%2C925109%2C919003%2C920201%2C912706&key=yt1&expire=1348823962&itag=44&ipbits=8&sver=3&ratebypass=yes&mt=1348800611&ip=92.22.37.231&mv=m&source=youtube&ms=au&cp=U0hTTVhNUV9LTENOM19QR1VKOkFyQWNVSVFNbmNL&id=1100a4b92b939cd6&type=video/webm;+codecs="vp8.0,+vorbis"&fallback_host=tc.v13.cache3.c.youtube.com&sig=8353F6329CDA8168C4F7F29E20F2AE3F6509D85F.C582D63C02534232CE8E28D5ADC5B119AAEF2963&quality=large,itag=35&url=http://o-o---preferred---sn-u5a3u5a3-h5oe---v11---lscache4.c.youtube.com/videoplayback?upn=8kbZJLkF5PA&sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&fexp=927101%2C923006%2C922401%2C920704%2C912806%2C913419%2C913546%2C913556%2C919349%2C919351%2C925109%2C919003%2C920201%2C912706&expire=1348823962&algorithm=throttle-factor&burst=40&ip=92.22.37.231&itag=35&sver=3&key=yt1&mt=1348800611&mv=m&source=youtube&ms=au&ipbits=8&factor=1.25&cp=U0hTTVhNUV9LTENOM19QR1VKOkFyQWNVSVFNbmNL&id=1100a4b92b939cd6&type=video/x-flv&fallback_host=tc.v11.cache4.c.youtube.com&sig=885C9C098DF9D80E780177E01CF944BC4F9564FE.9A374618A2BE8C2E562C8622DCB449A7071E37BD&quality=large,itag= ...AND SO ON';

    # A global match in list context returns every captured (itag, url) pair.
    if ( my @arr = ($line) =~ m/itag=(.+?)&url=(.+?=large)/ig ) {
        print join "\n" => @arr;
    }
That prints:

    44
    http://o-o---preferred---sn-u5a3u5a3-h5oe---v13---lscache3.c.youtube.com/videoplayback?upn=8kbZJLkF5PA&sparams=cp%2Cid%2Cip%2Cipbits%2Citag%2Cratebypass%2Csource%2Cupn%2Cexpire&fexp=927101%2C923006%2C922401%2C920704%2C912806%2C913419%2C913546%2C913556%2C919349%2C919351%2C925109%2C919003%2C920201%2C912706&key=yt1&expire=1348823962&itag=44&ipbits=8&sver=3&ratebypass=yes&mt=1348800611&ip=92.22.37.231&mv=m&source=youtube&ms=au&cp=U0hTTVhNUV9LTENOM19QR1VKOkFyQWNVSVFNbmNL&id=1100a4b92b939cd6&type=video/webm;+codecs="vp8.0,+vorbis"&fallback_host=tc.v13.cache3.c.youtube.com&sig=8353F6329CDA8168C4F7F29E20F2AE3F6509D85F.C582D63C02534232CE8E28D5ADC5B119AAEF2963&quality=large
    35
    http://o-o---preferred---sn-u5a3u5a3-h5oe---v11---lscache4.c.youtube.com/videoplayback?upn=8kbZJLkF5PA&sparams=algorithm%2Cburst%2Ccp%2Cfactor%2Cid%2Cip%2Cipbits%2Citag%2Csource%2Cupn%2Cexpire&fexp=927101%2C923006%2C922401%2C920704%2C912806%2C913419%2C913546%2C913556%2C919349%2C919351%2C925109%2C919003%2C920201%2C912706&expire=1348823962&algorithm=throttle-factor&burst=40&ip=92.22.37.231&itag=35&sver=3&key=yt1&mt=1348800611&mv=m&source=youtube&ms=au&ipbits=8&factor=1.25&cp=U0hTTVhNUV9LTENOM19QR1VKOkFyQWNVSVFNbmNL&id=1100a4b92b939cd6&type=video/x-flv&fallback_host=tc.v11.cache4.c.youtube.com&sig=885C9C098DF9D80E780177E01CF944BC4F9564FE.9A374618A2BE8C2E562C8622DCB449A7071E37BD&quality=large

If you then need to put your data in a hash, you could simply do:

    my %hash_contain;
    if (...) {
        ...
        %hash_contain = @arr;
    }

Since you will always have itag ids and urls (unless there are exceptions, which we were not told about), your hash will then look like so:

    35 => ...,
    44 => ...,

Hope this helps.
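One thing to watch: the pattern above only captures urls that end in "=large". As a variation (a sketch under my own assumptions, not something the OP confirmed), you could split on a comma that is followed by "itag=" and fill the hash one record at a time. The sample line and the name %url_for below are hypothetical stand-ins, and I am assuming every record starts with "itag=<digits>&url=":

```perl
use strict;
use warnings;

# Hypothetical, shortened stand-in for the OP's line (not the real data).
my $line = 'itag=44&url=http://example.com/a?x=1&itag=44&quality=large,'
         . 'itag=35&url=http://example.com/b?y=2&itag=35&quality=medium';

my %url_for;

# Split only on a comma that is followed by "itag=", so other commas
# (such as the one in codecs="vp8.0,+vorbis") do not start a new record.
for my $record ( split /,(?=itag=)/, $line ) {

    # \d+ pins the id to digits; (.+) takes the rest of the record as the url.
    if ( $record =~ /\Aitag=(\d+)&url=(.+)\z/s ) {
        $url_for{$1} = $2;
    }
}

# Print the pairs ordered by numeric itag id.
print "$_ => $url_for{$_}\n" for sort { $a <=> $b } keys %url_for;
```

Note that a hash keeps only one url per itag id, so if the same id can show up in two records the later one overwrites the earlier one.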
Re^2: Split on regex, don't match partial regex
by aldo (Initiate) on Sep 28, 2012 at 08:18 UTC
by 2teez (Vicar) on Sep 28, 2012 at 09:02 UTC
by aldo (Initiate) on Sep 28, 2012 at 09:33 UTC