in reply to Fetching 'http://search.cpan.org/uploads.rdf' (from cpan) LibXML error is triggered

Is this expected?

$ curl -LD - http://search.cpan.org/uploads.rdf
HTTP/1.1 301 Moved Permanently
Connection: keep-alive
Content-Length: 5
Content-Type: text/plain
Location: https://metacpan.org/recent.rdf
Cache-Control: max-age=31536000
Strict-Transport-Security: max-age=31536000; includeSubDomains
Via: 1.1 varnish, 1.1 varnish
Accept-Ranges: bytes
Age: 1292181
Date: Thu, 06 Nov 2025 00:03:16 GMT
X-Served-By: cache-fra-etou8220024-FRA, cache-yyz4534-YYZ
X-Cache: HIT, HIT
X-Cache-Hits: 606, 0
X-Timer: S1762387396.373864,VS0,VE4

HTTP/2 200
set-cookie: _fs_ch_st_FSBmUei20MqUiJb9=Aephdq6cZPgwr6cwu2UjlkQwWIqAvL1yxfsxYlbHcotSsvW5voanZ2qZLXp3LmvoCKcxHBrVeOl1ELcETsxUN_oeDechM414wPNIKKcPEkER4X7yF-6hFr2aQX2828d9zf-DYMr75ceP9YVWuJO8mRR9yR1xj17sfnyCHUX2cvgPz-ZjgucO9v4rKVad96rVCuOlWSLmN4EDqfEEJR5FWBsbpa3J8jOpOMc8Q1Wqk2pMts0_RJunCCVvUe3MBQF-ZpyL9rv5guZElmjyJKL6PBmN_envwRs0f2uLbNiHoh_lMSAnQzPHJQDP3DLsi2T3YgOCYfyi6QLuYYmnxN8ylriMPI2_plQvh7KVzA==; Max-Age=10; HttpOnly; Path=/
content-type: text/html; charset=utf-8
cache-control: private, no-store
accept-ranges: bytes
via: 1.1 varnish, 1.1 varnish
date: Thu, 06 Nov 2025 00:03:16 GMT
x-served-by: cache-fra-etou8220098-FRA, cache-fra-etou8220192-FRA, cache-yyz4561-YYZ
x-cache: MISS, MISS
x-cache-hits: 0, 0
x-timer: S1762387397.572748,VS0,VE109
vary: Accept-Encoding
strict-transport-security: max-age=31557600

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta http-equiv="Content-Security-Policy" content="default-src 'self'; img-src 'self' data:; media-src 'self' data:; object-src 'none'; style-src 'self' 'sha256-o4vzfmmUENEg4chMjjRP9EuW9ucGnGIGVdbl8d0SHQQ='; script-src 'self' 'sha256-KXex2o39zxtnzVWK4H5rW07g2+BlwSPtn+aguzsWkNg=';" />
    <link href="/_fs-ch-1T1wmsGaOgGaSxcX/assets/inter-var.woff2" rel="preload" as="font" type="font/woff2" crossorigin />
    <link href="/_fs-ch-1T1wmsGaOgGaSxcX/assets/styles.css" rel="stylesheet" />
    <meta name="viewport" content="width=device-width, initial-scale=1" />
    <title>Client Challenge</title>
    <style>
      #loading-error {
        font-size: 16px;
        font-family: 'Inter', sans-serif;
        margin-top: 10px;
        margin-left: 10px;
        display: none;
      }
    </style>
  </head>
  <body>
    <noscript>
      <div class="noscript-container">
        <div class="noscript-content">
          <img src="/_fs-ch-1T1wmsGaOgGaSxcX/assets/errorIcon.svg" alt="" role="presentation" class="error-icon" />
          <span class="noscript-span">JavaScript is disabled in your browser.</span>
          <p>Please enable JavaScript to proceed.</p>
        </div>
      </div>
    </noscript>
    <div id="loading-error" role="alert" aria-live="polite">
      A required part of this site couldn’t load. This may be due to a browser extension, network issues, or browser settings. Please check your connection, disable any ad blockers, or try using a different browser.
    </div>
    <script>
      function loadScript(src) {
        return new Promise((resolve, reject) => {
          const script = document.createElement('script');
          script.onload = resolve;
          script.onerror = (event) => {
            console.error('Script load error event:', event);
            document.getElementById('loading-error').style.display = 'block';
            loadingError.setAttribute('aria-hidden', 'false');
            reject(
              new Error(
                `Failed to load script: ${src}, Please contact the service administrator.`
              )
            );
          };
          script.src = src;
          document.body.appendChild(script);
        });
      }
      loadScript('/_fs-ch-1T1wmsGaOgGaSxcX/errors.js')
        .then(() => {
          const script = document.createElement('script');
          script.src = '/_fs-ch-1T1wmsGaOgGaSxcX/script.js?reload=true';
          script.onerror = (event) => {
            console.error('Script load error event:', event);
            const errorMsg = new Error(
              `Failed to load script: ${script.src}. Please contact the service administrator.`
            );
            console.error(errorMsg);
            handleScriptError();
          };
          document.body.appendChild(script);
        })
        .catch((error) => {
          console.error(error);
        });
    </script>
  </body>
</html>

Re^2: Fetching 'http://search.cpan.org/uploads.rdf' (from cpan) LibXML error is triggered
by hippo (Archbishop) on Nov 06, 2025 at 10:27 UTC

    If you mean the redirect, then yes. That's been in place since search.cpan.org was shuttered back in 2018.

    If you mean the JavaScript, then perhaps. MetaCPAN put the Fastly anti-LLM-bot challenge in front of their content a few months back. It's mostly a minor annoyance until you want to do something like this. Almost all automated access to MetaCPAN is now supposed to go through their API instead. However, I'm inclined to think that RSS/RDF endpoints on the main site should be excluded from the bot protection because, after all, we expect bots to hit them, don't we?
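
    Presumably the LibXML error comes from handing that HTML challenge page to an XML parser. A minimal sketch of a guard that refuses to parse anything that does not look like a feed follows. The helper names looks_like_feed and fetch_feed are made up for illustration, HTTP::Tiny is assumed as the client (it needs IO::Socket::SSL installed for https URLs), and the list of accepted root elements is a guess at what a feed reader would want:

    ```perl
    use strict;
    use warnings;
    use HTTP::Tiny;

    # Hypothetical helper: true only if the response looks like an XML/RDF feed.
    sub looks_like_feed {
        my ($content_type, $body) = @_;
        return 0 unless defined $body && length $body;
        # The challenge page is served as text/html, so reject that outright.
        return 0 if defined $content_type && $content_type =~ m{^text/html}i;
        # Accept an XML declaration or a typical feed root element,
        # optionally preceded by a UTF-8 byte order mark and whitespace.
        return $body =~ /\A(?:\xEF\xBB\xBF)?\s*<(?:\?xml|rdf:RDF|rss|feed)\b/ ? 1 : 0;
    }

    # Hypothetical helper: fetch a URL and die with a readable message
    # instead of letting the XML parser fail on HTML.
    sub fetch_feed {
        my ($url) = @_;
        my $res = HTTP::Tiny->new( timeout => 20 )->get($url);  # follows redirects
        die "GET $url failed: $res->{status} $res->{reason}\n"
            unless $res->{success};
        my $type = $res->{headers}{'content-type'};  # header names are lowercased
        die "Got '" . ( $type // 'unknown' ) . "' instead of a feed (bot challenge?)\n"
            unless looks_like_feed( $type, $res->{content} );
        return $res->{content};
    }
    ```

    Calling fetch_feed('https://metacpan.org/recent.rdf') inside an eval then gives a clear "instead of a feed" message rather than a parser error. It does not get past the challenge, it only reports it.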

    Update: There's also MetaCPAN::Client. MetaCPAN::Client->recent returns more detail than might be necessary, but in the meantime it at least makes the recent uploads available without the need for a JS-enabled, bot-detector-defeating browser.
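
    A rough sketch of that route, assuming MetaCPAN::Client is installed from CPAN and the API is reachable. The helper names format_release and recent_releases are invented here; the accessors used (distribution, version, author, date) are those of MetaCPAN::Client::Release:

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: format one release for display. $rel is expected
    # to be a MetaCPAN::Client::Release, or anything with the same accessors.
    sub format_release {
        my ($rel) = @_;
        return sprintf '%s %s by %s (%s)',
            $rel->distribution, $rel->version, $rel->author, $rel->date;
    }

    # Hypothetical helper: fetch the $count most recent releases via the API.
    # Needs network access.
    sub recent_releases {
        my ($count) = @_;
        require MetaCPAN::Client;    # loaded lazily; not a core module
        my $recent = MetaCPAN::Client->new->recent($count);
        my @lines;
        while ( my $rel = $recent->next ) {
            push @lines, format_release($rel);
        }
        return @lines;
    }
    ```

    Something like print "$_\n" for recent_releases(10); would then list the latest uploads, which covers roughly what the RDF feed used to provide.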


    🦛