Bman70 has asked for the wisdom of the Perl Monks concerning the following question:

Hi geniuses,
I'm processing some video data from Google's API. I have an array, @vidids, that could have up to several thousand elements which are video IDs. I want to get the view count for each video ID. I want to loop through the array, taking 50 elements on each pass (which is the max allowed) and getting the view counts for those. These counts I want to stuff into array @counts, until the @vidids is empty. With limited understanding of perl, I imagine starting out something like:
my $idees = join ',', @vidids[0..49]; # the IDs must be comma separated

foreach (@vidids) {
    my $uri = "https://www.googleapis.com/youtube/v3/videos?part=items(viewCount)&id=$idees&key=$API_KEY";
    ## $idees is the 50-max comma separated string of video IDs
    my $result = get($uri);
    my $json = decode_json($result);
    for my $i ( @{$json->{items}} ) {
        push @counts, $i->{viewCount};
    }
    $idees = "the next 50 elements from @vidids..." # and keep looping until all are processed
}

I have all of it working except the looping through @vidids. I just don't have the grasp of perl operations to put it together well. Also, what if there aren't 50 left at the end? I don't want to create null elements. Thanks!

Replies are listed 'Best First'.
Re: Next 50 array elements during each loop?
by choroba (Cardinal) on Jun 03, 2017 at 09:47 UTC
    I'm not sure I understood everything, but you can use splice to shift more than one element from an array.

    while (my @section = splice @vidids, 0, 50) {
        my $idees = join ',', @section;
        my $uri = "https://www.googleapis.com/youtube/v3/videos"
                . "?part=items(viewCount)&id=$idees&key=$API_KEY";
        my $result = get($uri);
        my $json = decode_json($result);
        push @counts, map $_->{viewCount}, @{ $json->{items} };
    }
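
To see how the splice loop behaves at the end of the array, here is a minimal sketch with the network call stubbed out. The fetch_counts helper and the generated IDs are made up for illustration; only the while/splice shape comes from the reply above.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real API call: returns one fake count per ID.
sub fetch_counts {
    my @ids = @_;
    return map { length $_ } @ids;
}

my @vidids = map { "id$_" } 1 .. 120;    # made-up IDs for illustration
my ( @counts, @batch_sizes );

# splice removes and returns up to 50 elements; once @vidids is empty it
# returns an empty list and the loop ends. A short final batch is not padded.
while ( my @section = splice @vidids, 0, 50 ) {
    push @batch_sizes, scalar @section;
    push @counts, fetch_counts(@section);
}

print "batch sizes: @batch_sizes\n";
print "ids left in \@vidids: ", scalar(@vidids), "\n";
```

With 120 IDs the batches come out as 50, 50 and 20, and @vidids is empty afterwards, which is why the non-destructive variants further down matter if the list is needed again.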

    Update: If you don't want to destroy the array, you can use an index:

    my ($from, $to) = (0, 49);
    $to = $#vidids if $to > $#vidids;

    while ($from <= $#vidids) {
        my @section = @vidids[ $from .. $to ];
        my $idees = join ',', @section;
        my $uri = "https://www.googleapis.com/youtube/v3/videos"
                . "?part=items(viewCount)&id=$idees&key=$API_KEY";
        my $result = get($uri);
        my $json = decode_json($result);
        push @counts, map $_->{viewCount}, @{ $json->{items} };
    }
    continue {
        $from += 50;
        $to = $from + 49;                    # inclusive end index, so 50 elements per pass
        $to = $#vidids if $to > $#vidids;    # clamp the last, shorter batch
    }

      Alternative non-destructive approach:

      my $vidids_ = sub { \@_ }->(@vidids);
      while (my @section = splice @$vidids_, 0, 50) {
          ...
      }

      None of the elements are copied, so this is quite efficient.
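
A small sketch of the alias trick, using a batch size of 3 and made-up IDs so the result is easy to check by eye. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @vidids = map { "id$_" } 1 .. 7;      # made-up IDs for illustration

# Calling an anonymous sub and returning \@_ yields a reference to a new
# array whose elements are aliases of the elements of @vidids.
my $aliases = sub { \@_ }->(@vidids);

my @sizes;
while ( my @section = splice @$aliases, 0, 3 ) {
    push @sizes, scalar @section;
}

print "batch sizes: @sizes\n";
print "original still holds ", scalar(@vidids), " elements\n";
```

Only the alias array is emptied by splice; @vidids keeps all seven elements.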

      I like the conciseness of this one. I could just copy the array and destroy the copy. Thanks, I'll plug it in and see what happens.
Re: Next 50 array elements during each loop?(ICE & a slice.)
by BrowserUk (Patriarch) on Jun 03, 2017 at 11:51 UTC

    A perfect opportunity to use that much denigrated Infernal C Entity: the C-style for loop:

    for( my $i = 0; $i <= $#vidids; $i += 50 ) {
        my $last = $i + 49 > $#vidids ? $#vidids : $i + 49;   ## clamp so the final slice has no undefs
        my @chunk = @vidids[ $i .. $last ];
        ## use @chunk...
    }
    ## or avoid the copy and use the slice directly.
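
As a hedged illustration of the C-style loop, the sketch below uses min from the core List::Util module to clamp the end of each slice. The ID list is made up, and 101 elements are used on purpose so that the final batch holds a single ID.

```perl
use strict;
use warnings;
use List::Util qw(min);

my @vidids = map { "id$_" } 1 .. 101;    # made-up IDs; 101 forces a final batch of one
my @batches;

# <= (not <) so a last batch that starts exactly at $#vidids is still visited.
for ( my $i = 0; $i <= $#vidids; $i += 50 ) {
    my $last = min( $i + 49, $#vidids );    # inclusive end index, never past the array
    push @batches, join ',', @vidids[ $i .. $last ];
}

printf "%d batches, %d ids still in the array\n", scalar @batches, scalar @vidids;
```

The slice is used directly in join, so no intermediate copy of each batch is kept.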

Re: Next 50 array elements during each loop?
by huck (Prior) on Jun 03, 2017 at 15:41 UTC

    I have all of it working except the looping through @vidids.

    I'm not sure you have the rest working either. I use OAuth2, and so send an Authorization: Bearer ... header via LWP rather than an API key.

    You don't check for an error return. part=items(viewCount) returns content of

    "error": {
      "errors": [
        {
          "domain": "youtube.part",
          "reason": "unknownPart",
          "message": "items(viewCount)",
          "locationType": "parameter",
          "location": "part"
        }
      ],
      "code": 400,
      "message": "items(viewCount)"
    }

    and a return code of 400. You will want part=statistics instead. To limit the fields returned, use the fields= parameter. There is also no requirement that the API returns all of the IDs requested, or returns them in any particular order, so you will want the id field as well: fields=items/id,items/statistics/viewCount. That means saving both the id and the view count; a hash works well for this.

    # $json (a JSON decoder object) and $agrp (OAuth2 token data) come from the enclosing script
    sub api_videos {
        use Data::Dumper;
        my $ua = LWP::UserAgent->new( keep_alive => 1, timeout => 100 );
        my @vidlist0 = qw/
            bH4tEHg-Wtc ShI5jhe3bi4 bbpyN89Vquk EgJ5GGD_2QQ YoRnMC2MlqU 7M39G0Yxydk
            1ov2NQ2Sz5w HmourJ3O6Ss sWSzjBPI1cY czt_a0fKvFA x6GDUIUHSqg 4oEWS33lt9M
            KlTwG3UNqE8 ACVgSl9OpIQ mbsGOyNjsNQ L2qpMI6mNpU 6a1Ziw0-4X4 3Z08nsbhlAY
            vMLGUrVyyMk pjmyy0qKN2o qyXxrPTcFKE d2SFFHlHNCY akSp8m4JAnY AEDw975DAlM
            ypl033on69k LM7GxrK-hgo L6qq5aVOEmc tGEnzlNkGKI Aq2QQYY8YSA ZnnzFSGGJP4
            Nt1qRbAlrCs cnGQ2nE3USI YuzE5jomGLk wSz2kCpuZ6E v40RcxCpvgM bUVLymf4c00
            -_rQwhvJDd4 jmaB41hx8b0 lcLSo0dijv8 HtBbRvj2cfc lQVBX6aqC8o rbq4RZGP8YU
            qy5NfKg1fJo jtjOcSBBoiE aRhqb-pljGk iRbqaMfpOrA tSKGYQFwAHs 7sHXl8q7uDY
            d4u8Wj0xY7A Rtzk1I0BhSI
            x
        /;
        my $vidlist  = [];
        my $n        = 0;
        my @vidlists = ();
        my $testct   = 10;    # batch size used for this test
        my $back     = {};

        # split the ID list into batches of $testct
        for my $vid (@vidlist0) {
            $n++;
            if ( $n > $testct ) { push @vidlists, $vidlist; $n = 1; $vidlist = []; }
            push @{$vidlist}, $vid;
        }
        push @vidlists, $vidlist;

        my $head = 'https://www.googleapis.com/youtube/v3/videos?fields=items/id,items/statistics/viewCount&part=statistics&id=';
        for my $vl (@vidlists) {
            my $getline = $head . join( ',', @{$vl} );
            my $req = HTTP::Request->new(
                GET => $getline,
                HTTP::Headers->new( 'Authorization' => 'Bearer ' . $agrp->{access_token} )
            );
            my $request = $ua->request($req);
            unless ( $request->is_success ) {
                print "\n** Can't get -- :" . $request->status_line . "\n";
                print $getline . "\n";
            }
            else {
                my $jsondata;
                eval { $jsondata = $json->allow_nonref->decode( $request->content ); };
                if ($@) {
                    print 'json decode error:' . $@ . "\n";
                }
                elsif ( $jsondata->{error} ) {
                    print Dumper($jsondata);    # dump the decoded error body
                }
                else {
                    for my $item ( @{ $jsondata->{items} } ) {
                        $back->{ $item->{id} } = $item->{statistics}{viewCount};
                    }    # item
                }    # no error
            }    # success
        }    # vl

        # report any requested ID the API did not return
        for my $vid (@vidlist0) {
            unless ( exists $back->{$vid} ) { print "video $vid not returned\n"; }
        }
        print Dumper( { back => $back } );
    }

    result

    video x not returned
    $VAR1 = {
      'back' => {
        'bH4tEHg-Wtc' => '39',
        'EgJ5GGD_2QQ' => '40',
        'mbsGOyNjsNQ' => '150',
        'bUVLymf4c00' => '51',
        'qy5NfKg1fJo' => '20',
        'bbpyN89Vquk' => '22',
        'Rtzk1I0BhSI' => '87',
        'rbq4RZGP8YU' => '60',
        'qyXxrPTcFKE' => '46',
        'ZnnzFSGGJP4' => '66',
        'AEDw975DAlM' => '69',
        '7M39G0Yxydk' => '31',
        'ACVgSl9OpIQ' => '89',
        'wSz2kCpuZ6E' => '52',
        'L6qq5aVOEmc' => '69',
        'cnGQ2nE3USI' => '43',
        'v40RcxCpvgM' => '50',
        'ypl033on69k' => '63',
        'tSKGYQFwAHs' => '51',
        'jtjOcSBBoiE' => '170',
        'pjmyy0qKN2o' => '236',
        'vMLGUrVyyMk' => '151',
        'LM7GxrK-hgo' => '405',
        'ShI5jhe3bi4' => '30',
        'jmaB41hx8b0' => '186',
        'L2qpMI6mNpU' => '162',
        'czt_a0fKvFA' => '152',
        '1ov2NQ2Sz5w' => '19',
        'sWSzjBPI1cY' => '58',
        'YuzE5jomGLk' => '32',
        'Nt1qRbAlrCs' => '57',
        'akSp8m4JAnY' => '131',
        'x6GDUIUHSqg' => '77',
        'lcLSo0dijv8' => '62',
        'lQVBX6aqC8o' => '36',
        'tGEnzlNkGKI' => '93',
        'aRhqb-pljGk' => '181',
        'KlTwG3UNqE8' => '49',
        'd2SFFHlHNCY' => '48',
        '-_rQwhvJDd4' => '108',
        '6a1Ziw0-4X4' => '71',
        'HtBbRvj2cfc' => '52',
        'd4u8Wj0xY7A' => '51',
        'YoRnMC2MlqU' => '59',
        '3Z08nsbhlAY' => '86',
        '7sHXl8q7uDY' => '17',
        'HmourJ3O6Ss' => '147',
        '4oEWS33lt9M' => '38',
        'Aq2QQYY8YSA' => '101',
        'iRbqaMfpOrA' => '51'
      }
    };
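
The part=statistics and fields= advice can be tried without network access. The sketch below builds the request URL by plain string concatenation and then decodes a hand-written sample body with the core JSON::PP module. The API key placeholder, the three IDs and the sample response are all made up; the response shape (items, id, statistics, viewCount) is taken from the code above.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json);

my $API_KEY = 'YOUR_KEY';         # placeholder, not a real key
my @batch   = qw(aaa bbb ccc);    # made-up video IDs

my $uri = 'https://www.googleapis.com/youtube/v3/videos'
        . '?part=statistics'
        . '&fields=items/id,items/statistics/viewCount'
        . '&id=' . join( ',', @batch )
        . "&key=$API_KEY";

# Hand-written sample body: 'bbb' is missing and the order differs from
# the request, the two situations the id => viewCount hash is meant to handle.
my $sample = '{"items":[{"id":"ccc","statistics":{"viewCount":"7"}},'
           . '{"id":"aaa","statistics":{"viewCount":"39"}}]}';

my %views;
for my $item ( @{ decode_json($sample)->{items} } ) {
    $views{ $item->{id} } = $item->{statistics}{viewCount};
}

my @missing = grep { !exists $views{$_} } @batch;

print "$uri\n";
print "$_ => $views{$_}\n" for sort keys %views;
print "not returned: @missing\n";
```

Keying on the id means a reordered or incomplete response cannot shift counts onto the wrong video, which a plain @counts array cannot guarantee.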

      Great catch, you're right: the part is statistics, and the fields are items>statistics>viewCount. I constantly duplicate my .pl file to edit while saving my tracks, so older versions might not have all the URL syntax correct. Of course, in this case it's just the loop I'm after.

        I've been programmatically watching my view counts since 2007. Since they just changed the API yet again and I had to modify the code again, it stood out. I am currently checking about 1000 videos a few times an hour.

        Since you don't use LWP::UserAgent, and so don't have access to the return code, you need to test for valid JSON. I do that via the eval to trap the die. It is not uncommon for the API to barf or time out and return improper JSON.

        For the same reasons you also need to test for the error case where no items array gets returned. While not as common as barfing, there have been times the API seemed to have forgotten that statistics was a valid part for a while.
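
A minimal sketch of those checks, assuming the core JSON::PP module. The helper name items_from_body and the sample bodies are made up for illustration; the error body is shaped like the 400 response quoted earlier in this thread.

```perl
use strict;
use warnings;
use JSON::PP ();

# Hypothetical helper: returns (array ref of items, '') on success,
# or (undef, reason) when the body is unusable.
sub items_from_body {
    my ($body) = @_;

    # decode dies on malformed input; eval turns that into undef
    my $data = eval { JSON::PP->new->decode($body) };
    return ( undef, 'bad json' )
        if !defined $data || ref($data) ne 'HASH';
    return ( undef, "api error $data->{error}{code}" )
        if ref( $data->{error} ) eq 'HASH';
    return ( undef, 'no items array' )
        if ref( $data->{items} ) ne 'ARRAY';
    return ( $data->{items}, '' );
}

for my $body (
    '<html>gateway timeout</html>',
    '{"error":{"code":400,"message":"items(viewCount)"}}',
    '{"kind":"something else"}',
    '{"items":[{"id":"aaa","statistics":{"viewCount":"39"}}]}',
) {
    my ( $items, $why ) = items_from_body($body);
    print defined $items ? 'ok, ' . scalar(@$items) . " item(s)\n" : "skipped: $why\n";
}
```

Returning a reason string rather than dying lets the caller log the failed batch and carry on with the next one.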

        And if a video gets deleted or marked private, there will be no entry in the items array for that video, so the hash containing the id and count is useful. As I pointed out, at times the order of the items array has also been different from the order I asked for, so the id/viewcount hash is useful in that case too, rather than relying on the 1-1 correspondence to the asking order that your counts array assumes.

        Another thing to be aware of is negative deltas on the view count. It seems that at times they will back out some update and the delta will be negative, then an hour or so later they will reapply that transaction set and cancel it out. Negative deltas can also happen if they decide views were programmatically generated somehow.

        In regards to versioning, you may wish to consider a version control system instead of relying on copies. They made me use CVS at work about 17 years ago, so I've been using it at home since then as well. I don't do fancy things with it, so I haven't felt any need to move to one of the fancier systems like git and lose my prior history. I commit whenever I have decently working code, and that allows me to back out changes or see what I have changed since the last commit. It also allows me to test on one box and then easily roll out the updates to the "production" box when I am finished. And as I have a mixture of Linux and Windows boxes, it also takes care of the proper \n for the architecture when I check out or update.
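
One way to spot negative deltas is to keep the previous id => viewCount hash and compare it with the current one. This is only a sketch with made-up snapshot data and illustrative variable names, not the code I actually run.

```perl
use strict;
use warnings;

# Made-up snapshots of id => viewCount from two consecutive polls.
my %previous = ( aaa => 100, bbb => 50 );
my %current  = ( aaa => 97,  bbb => 58, ccc => 3 );

my ( %delta, @went_down );
for my $id ( sort keys %current ) {
    next unless exists $previous{$id};           # new video, nothing to compare yet
    $delta{$id} = $current{$id} - $previous{$id};
    push @went_down, $id if $delta{$id} < 0;     # count dropped since the last poll
}

print "$_: $delta{$_}\n" for sort keys %delta;
print "negative deltas: @went_down\n";
```

Whether a negative delta is logged, ignored or held until the next poll is a policy choice; the sketch only detects it.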

Re: Next 50 array elements during each loop?
by betterworld (Curate) on Jun 03, 2017 at 11:22 UTC
Re: Next 50 array elements during each loop?
by beech (Parson) on Jun 03, 2017 at 09:59 UTC

    Hi

    This is how you approach that: start a new file

    #!/usr/bin/perl --
    # file: array-next50.pl
    #
    use strict;
    use warnings;
    use Data::Dump qw/ dd /;

    my @vidids = 1 .. 422;
    my @counts;
    dd( \@vidids, \@counts );

    while( ... ){
        my @ids = ...;
        push @counts, GetCounts( \@ids );
    }
    dd( \@vidids, \@counts );

    sub GetCounts {
        my( $ids ) = @_;
        dd( $ids );
        return @$ids; ## its a test :D
    }
    __END__

    The ... parts are for you to fill in

    The condition in the while loop is where you check to see if the array is empty or if it has any more elements left

    The @ids assignment is where you remove 50 elements from @vidids, so you're not just removing the first 50, you're splicing them from the top

    For the answer to the ... parts you can search the Perl documentation, check the perlfaq, search the free Perl PDF book Modern Perl, or look in our Tutorials section under Arrays: A Tutorial/Reference

Re: Next 50 array elements during each loop?
by dbander (Scribe) on Jun 03, 2017 at 14:36 UTC

    my @Counts = ();
    my $vidcount = 0;
    my $idees = '';

    foreach my $vidid (@vidids) {
        if (!($vidcount % 50)) {
            # every 50th ID: send the batch collected so far and start a new one
            fetchCounts($idees);
            $idees = '';
        }
        else {
            $idees .= ',';
        }
        $idees .= $vidid;
        $vidcount++;
    }
    fetchCounts($idees);    # send the final, possibly shorter, batch
    exit;

    sub fetchCounts {
        my $idees = shift;
        if ($idees ne '') {
            ## $idees is the 50-max comma separated string of video IDs
            my $uri = "https://www.googleapis.com/youtube/v3/videos?part=items(viewCount)&id=$idees&key=$API_KEY";
            my $result = get($uri);
            my $json = decode_json($result);
            for my $i ( @{$json->{items}} ) {
                push @Counts, $i->{viewCount};
            }
        }
    }