PeterKaagman has asked for the wisdom of the Perl Monks concerning the following question:

Hi there

The school I work for used to create teams for classes/groups using something MS calls SDS. For all sorts of reasons (all my fault) this system broke. Lately I've been working on a replacement.

One of the things involved in this is getting all the teams (groups, in fact) from O365 which are in use for these classes. I'm using LWP to make requests to MS Graph to do this. After I download a list of groups I iterate through the list to get the owners (teachers) and members (students) of each group.

When I analysed the results I found there were (too) many groups without members. Checking up on them (through the normal admin interface in MS Entra) it turned out some of those memberless groups do in fact have members. So the result is untrustworthy.

I had already noticed delays in the requests, sometimes up to 120 seconds. My first thought was that I'm experiencing timeouts on the Graph service, but I also found that the default timeout for LWP is 180 seconds, and my (clumsy) error handling should have reported that. At one point I tested this by setting the LWP timeout to 1 second. My script stopped, complaining about a return code of 500 (and the endpoint not being connected). Not really sure what that means atm, but it did teach me that my script does in fact "fail" on an error: it does react to return codes other than 200.
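One way to tell such a 500 apart from one the Graph service actually sent: when LWP cannot complete a request (timeout, failed connection) it builds the response itself and marks it with a `Client-Warning: Internal response` header. A minimal sketch of that check follows. The helper name `is_internal_response` is hypothetical, and both responses are built by hand so nothing touches the network.

```perl
use strict;
use warnings;
use HTTP::Response;

# True when the response was generated by LWP itself (timeout,
# connection failure) rather than received from the remote server.
sub is_internal_response {
    my ($response) = @_;
    my $warning = $response->header('Client-Warning') // '';
    return $warning eq 'Internal response' ? 1 : 0;
}

# Simulate both cases without making a request.
my $from_lwp = HTTP::Response->new(500, 'read timeout');
$from_lwp->header('Client-Warning' => 'Internal response');

my $from_server = HTTP::Response->new(500, 'Internal Server Error');

for my $response ($from_lwp, $from_server) {
    printf "%s: %s\n",
        $response->status_line,
        is_internal_response($response) ? 'client side' : 'server side';
}
```

In a real script the same check would be applied to whatever `$ua->request(...)` returns, before deciding whether a retry makes sense.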

I also learned MS does throttling on the Graph endpoint. Not complaining about that, I would probably do the same if I were them. And since I'm requesting owners and members for 2700 groups (making it 5400 requests) I would assume that I'm a valid candidate for being throttled.

According to MS documentation on this I should receive a return code of 429 on being throttled. I'm not getting that return code at any time, and this made me wonder. Does LWP handle throttling on its own? I would suspect LWP just makes the request and returns the result, whatever it is, leaving the handling of it to me. But I'm kinda lost as to what is happening; I can't explain it.

I would appreciate any insight from you on this subject.

Finally some code snippets of what is going on in my script:

I've put all my MS Graph code in some objects/modules so I can reuse it. While looping over the list of groups I create a group object using the group id and subsequently request the owners and members:

# Create a group object to get owners and members
my $group_object = MsGroup->new(
    'app_id'         => $config{'APP_ID'},
    'app_secret'     => $config{'APP_PASS'},
    'tenant_id'      => $config{'TENANT_ID'},
    'login_endpoint' => $config{'LOGIN_ENDPOINT'},
    'graph_endpoint' => $config{'GRAPH_ENDPOINT'},
    'select'         => '$select=id,displayName,userPrincipalName',
    'id'             => $group->{'id'},
);

# Get the owners
my $owners = $group_object->fetch_owners(); # $owners is an AoH
foreach my $owner (@$owners){
    # do something useful with the owner
}

# Get the members
my $members = $group_object->fetch_members(); # $members is an AoH
foreach my $member (@$members){
    # do something useful with the members
}
} # closes the loop over the groups (loop header not shown)

The method fetch_members is responsible for composing the correct URL for the request. The example is for members; the one for owners is basically the same:

sub fetch_members { # {{{1
    my $self = shift;   # get a reference to the object itself
    my @members;        # an array to hold the result

    # compose a URL
    my $url = $self->_get_graph_endpoint . "/v1.0/groups/".$self->_get_id."/members/?";

    # add a filter if needed (not doing any filtering though)
    if ($self->_get_filter){
        $url .= $self->_get_filter."&";
    }

    # add a select if needed, have in fact a select => see object creation
    if ($self->_get_select){
        $url .= $self->_get_select;
    }

    $url .= '&$count=true';   # adding $count just to be sure

    do_fetch($self,$url, \@members);   # actual fetch is done in do_fetch()

    return \@members;   # return a reference to the result
}# }}}

After composing the URL, do_fetch is called. This function stores the results it gets in an array and can be called recursively if needed. This function also reacts to return codes other than 200:

sub do_fetch { # {{{1
    my $self  = shift;   # get a reference to the object
    my $url   = shift;   # get the URL from the function call
    my $found = shift;   # get the array reference which holds the result

    my $result = $self->callAPI($url, 'GET');   # do_fetch calls callAPI to do the HTTP request

    # Process if rc = 200
    if ($result->is_success){
        my $reply = decode_json($result->decoded_content);
        while (my ($i, $el) = each @{$$reply{'value'}}) {
            push @{$found}, $el;
        }

        # do a recursive call if @odata.nextLink is there
        if ($$reply{'@odata.nextLink'}){
            do_fetch($self,$$reply{'@odata.nextLink'}, $found);
        }
        #print Dumper $$reply{'value'};
    }else{
        # Error handling
        print Dumper $result;
        die $result->status_line;
    }
} # }}}

Finally the callAPI method is called with a URL and a method. This is where LWP comes into play:

sub callAPI { # {{{1
    my $self = shift;   # Get a reference to the object itself
    my $url  = shift;   # Get the URL from the function call
    my $verb = shift;   # Get the method from the function call
    my $try  = shift || 1;

    my $ua = LWP::UserAgent->new(   # Create a LWP useragent (beyond my scope, it's a CPAN module)
        'timeout' => 5,
    );

    # Create the header (an array reference, which is what HTTP::Request expects)
    my $header = [
        'Accept'           => '*/*',
        'Authorization'    => "Bearer ".$self->_get_access_token,
        'User-Agent'       => 'curl/7.55.1',
        'Content-Type'     => 'application/json',
        'Consistencylevel' => $self->_get_consistencylevel,
    ];

    # Create the request
    my $r = HTTP::Request->new(
        $verb => $url,
        $header,
    );

    # Let the useragent make the request
    my $result = $ua->request($r);

    # adding error handling
    # rc 429 is throttling
    # compare the code directly: "! $x eq '200'" negates $x first and never matches
    if ($result->code != 200){
        print Dumper $result;
    }

    return $result;
} # }}}

As you can see I already started putting some error handling in there. I was planning on reacting to an RC 429 by waiting the suggested time and redoing the request. But I've yet to see an RC 429.
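Waiting the suggested time and redoing the request could look roughly like the sketch below. It assumes the suggested wait arrives in a `Retry-After` header given in whole seconds, and falls back to exponential backoff when it does not. `request_with_retry` and its `max_tries` and `sleep` options are hypothetical names, and the responses are canned so the example runs without the network.

```perl
use strict;
use warnings;
use HTTP::Response;

# $do_request is a code ref returning an HTTP::Response,
# e.g. sub { $self->callAPI($url, 'GET') } in the real script.
sub request_with_retry {
    my ($do_request, %opt) = @_;
    my $max_tries = $opt{max_tries} // 5;
    my $sleeper   = $opt{sleep}     // sub { sleep $_[0] };

    my $response;
    for my $try (1 .. $max_tries) {
        $response = $do_request->();
        return $response unless $response->code == 429;   # not throttled
        last if $try == $max_tries;                       # give up

        # Use the server's hint if it is a plain number of seconds,
        # otherwise back off exponentially (2, 4, 8, ... seconds).
        my $wait = $response->header('Retry-After');
        $wait = 2 ** $try unless defined $wait && $wait =~ /^\d+$/;
        $sleeper->($wait);
    }
    return $response;   # still a 429 after $max_tries attempts
}

# Demo with canned responses: throttled once, then successful.
my @canned = (
    HTTP::Response->new(429, 'Too Many Requests', [ 'Retry-After' => 3 ]),
    HTTP::Response->new(200, 'OK'),
);
my @waits;
my $final = request_with_retry(
    sub { shift @canned },
    sleep => sub { push @waits, $_[0] },   # record instead of sleeping
);
print $final->code, " after waiting @waits second(s)\n";
```

Passing the sleep routine in as an option keeps the retry logic testable without actually pausing.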

Really hope you guys can give me some insight!

Peter

Re: Throttling while using LWP for MS Graph request?
by hippo (Archbishop) on May 14, 2024 at 09:06 UTC
    Does LWP handle throttling on its own?

    No, it has no hidden, automated throttling. That would break the principle of least surprise.

    The school I work for

    Schools (and similar institutions) often have draconian and perplexing network policies. Don't trust them not to mess with your traffic.

    According to MS documentation on this I should receive a return code of 429 on being throttled.

    In your shoes I would not trust MS any more than the school.

    Your code looks OK at first glance, if not entirely idiomatic. What I would recommend is taking all the bits which are not specific to the endpoint, copying them out to a different script/application and running them against a solid, reliable 3rd-party API from a client which is not within the school network (eg. from your own office or even from a cloud server). That should convince you one way or the other that your generalised API client itself is solid.

    Assuming that your client passes the above tests you can then simply allow for the fact that either the school or the MS service (or both) is causing the issue. You could debug this if you have great patience, or you could just handle those occurrences where the responses are not what you expect by binning them and re-requesting them. Having had to deal with both schools and MS in the past I would opt for the latter.

    Best of luck.


    🦛

      Ha ha ha... Draconian measures! Good thing that I'm fully aware of what we're doing, being the network administrator ;). We do tend to close things up for sure.

      I do trust our own network. That being said, I don't really have a choice but to use the school network. The other side of the data is a SIS system which I can only reach from within the school network. MS on the other hand.... I do not trust. Taking your advice seriously. Re-requesting suspicious results does sound like a good option.
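A minimal sketch of what re-requesting suspicious results might look like, assuming "suspicious" simply means an empty member list. `fetch_until_plausible` and its `max_tries`, `is_ok` and `pause` options are hypothetical names; in the real script the code ref would wrap something like `$group_object->fetch_members()`.

```perl
use strict;
use warnings;

# $fetch is a code ref returning an array ref of results.
# Returns the last result plus a flag saying whether it passed the check.
sub fetch_until_plausible {
    my ($fetch, %opt) = @_;
    my $max_tries = $opt{max_tries} // 3;
    my $is_ok     = $opt{is_ok}     // sub { @{ $_[0] } > 0 };   # non-empty list
    my $pause     = $opt{pause}     // sub { sleep 2 };

    my $result;
    for my $try (1 .. $max_tries) {
        $result = $fetch->();
        return ($result, 1) if $is_ok->($result);
        $pause->() if $try < $max_tries;   # wait before asking again
    }
    return ($result, 0);   # still suspicious: flag it for manual checking
}

# Demo: the first two answers come back empty, the third does not.
my @answers = ([], [], [ { displayName => 'Student A' } ]);
my ($members, $ok) = fetch_until_plausible(
    sub { shift @answers },
    pause => sub { },   # no real waiting in the demo
);
printf "%d member(s), plausible: %s\n", scalar @$members, $ok ? 'yes' : 'no';
```

Groups that really are empty will use up every attempt, so the flag is meant for logging them and checking by hand rather than for treating them as errors.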

Re: Throttling while using LWP for MS Graph request?
by etj (Priest) on May 14, 2024 at 13:22 UTC
    You're having to make 2700 requests to get each thing, having first got a list of the things. This sounds like a classic "1 + n" problem for REST APIs. Are you able to use GraphQL to get everything in one request? It looks like Microsoft experimented with that, see https://github.com/microsoftgraph/graphql-demo.

      It's even worse: 2700 requests for the owners and 2700 requests for the members. Firing them off as fast as I can. Did I mention I'm a candidate for throttling ;)
      I know, not really efficient. Reminds me of my SQL days before I learned about joins.

      Never heard of GraphQL, but will most certainly have a look. Thanks for the reply

        To be clear, GraphQL is a technology that's designed for this situation, but it has to be provided by the service. To learn more, see https://www.graphql.org/, and for the Perl implementation (ported by yours truly) see GraphQL.