hankcoder has asked for the wisdom of the Perl Monks concerning the following question:

I would like to ask which method of reading a feed (RSS) consumes fewer resources (CPU) and is faster and more efficient: (1) opening and reading the file directly, or (2) making an HTTP request with something like LWP::UserAgent?

The feed data is on the same website (server) that the Perl script runs on.

For method (1), I'm not sure whether it is a good idea to open and read the file frequently.

Any suggestions are very much appreciated. Thanks.

Re: Which feed method?
by Your Mother (Archbishop) on May 08, 2016 at 05:20 UTC

    WAT? Reading and processing a file is probably always going to be less intensive than an HTTP request. You have to parse the RSS either way. Don’t read the file repeatedly all the time; either respect the TTL from the feed or stat the file to see if it’s fresh. The HTTP request also requires, depending on how the webserver is set up, a server-side file read, so there are additional hidden costs there. How you parse the RSS is going to matter too. I am a bit out of the loop but I would assume that XML::RSS::LibXML would be the most performant choice. You know what they say about assumptions and performance. :P

      Thank you, man. I'm not very good with technical details. However, I have learned something new from your suggestions. I didn't know I could check a file's modification age with stat.

      Could you correct me if I'm wrong: is it done with -M $filename > n (where n is in days)? -M (modification age)

        Yeppers. -X: -M Script start time minus file modification time, in days. See also stat.

        Actually, as YourMother pointed out, -M is script start time minus file modification time, in days. So, you will get different results every time your script is started.

        Using stat, however, will give you the actual modification time in seconds since the epoch [1]. You will always get the most recent time the file was modified, no matter when your script was started.

        [1] If you don't know what the epoch is, don't worry, it's just a fixed point in time defined in a standard (on Unix-like systems, 00:00:00 UTC on January 1, 1970).
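
        To make the difference between -M and stat concrete, here is a minimal sketch using only core Perl. The temporary file stands in for the real feed file, and feed_changed() is a hypothetical helper name, not something from this thread; the "two hours ago" timestamp is set artificially with utime just for the demonstration.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical stand-in for the feed file; in practice use the real path.
my ($fh, $feed_file) = tempfile(UNLINK => 1);
print {$fh} "<rss/>\n";
close $fh;

# Pretend the file was last modified two hours ago.
my $two_hours_ago = time - 2 * 60 * 60;
utime $two_hours_ago, $two_hours_ago, $feed_file
    or die "utime failed: $!";

# -M: age in days, measured from script start time ($^T), not from "now".
my $age_days = -M $feed_file;

# stat: absolute modification time in epoch seconds (element 9 of the list).
my $mtime = (stat $feed_file)[9];

# Age measured from the current moment, in seconds.
my $age_seconds = time - $mtime;

# Hypothetical helper: true if the file was modified after the mtime
# we recorded the last time we read it.
sub feed_changed {
    my ($file, $last_seen_mtime) = @_;
    my $current = (stat $file)[9];
    return defined $current && $current > $last_seen_mtime;
}

printf "-M says %.3f days; stat says mtime=%d (%d s ago)\n",
    $age_days, $mtime, $age_seconds;
```

        In a long-running process the -M value drifts further from reality the longer the script runs, while the stat-based comparison stays accurate.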

Re: Which feed method?
by hankcoder (Scribe) on May 08, 2016 at 07:03 UTC

    Another question: where should I store the last modified time? Whenever the feed/page loads, I will need to check whether the last modified time has changed. My current thought is to store it in a browser cookie. Are there any better options?

      …Mmmmmmaybe… not? Could you describe your whole problem? If you are monitoring feeds on your own server there are probably better ways than to get browsers involved.

        Sir, please suggest. I can only think of either storing it in a file (which I think is not a good option, as I would need to open and read it every time just to compare) or storing it in a browser cookie.

        I'm open to any suggestions. This is my first time working with feeds. My entire site is generated by Perl, and of course I can read a file or a cookie to decide what to do before the entire page is printed. Or I can use Ajax if the entire page is not reloaded.

      An RSS feed is carried over HTTP. As such, it is expected that the client use the HTTP "if-modified-since" request header to indicate what it wants. Usually the time/date in this field is the same as the time/date in the "last-modified" response header the last time the same resource (file, or in your case, RSS feed) was received via HTTP.

      An RSS feed can be as simple as a file, for example, "feed.rss", updated as needed. In that case, the file can be served by an ordinary webserver and the value of the "last-modified" response header will be the modified time of the file.

      Or, an RSS feed can be generated on demand. In that case, the "last-modified" response header will often be the time of generation. It could also be the time of the most recent item the feed generator is tracking.

      Either way, the RSS client can specify what it last "saw" using the "if-modified-since" request header. The webserver or feed generator can then act accordingly.
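
      As a rough illustration of "act accordingly" on the generator side, here is a minimal sketch using only the core Time::Piece module. The function names (http_date, parse_http_date, feed_response) are hypothetical, and it assumes the client sends the date in the usual "Sun, 06 Nov 1994 08:49:37 GMT" form; a real webserver or framework would normally handle this for a static file.

```perl
use strict;
use warnings;
use Time::Piece;

# Date layout used in HTTP headers (always GMT).
my $HTTP_DATE = '%a, %d %b %Y %H:%M:%S GMT';

# Format epoch seconds as an HTTP date string.
sub http_date {
    my ($epoch) = @_;
    return gmtime($epoch)->strftime($HTTP_DATE);
}

# Parse an HTTP date string to epoch seconds; undef if it does not parse.
sub parse_http_date {
    my ($str) = @_;
    my $t = eval { Time::Piece->strptime($str, $HTTP_DATE) };
    return $t ? $t->epoch : undef;
}

# Decide the response for a feed whose content last changed at $mtime,
# given the client's If-Modified-Since header value (may be undef).
# Returns (status code, hashref of response headers).
sub feed_response {
    my ($mtime, $if_modified_since) = @_;
    my $since = defined $if_modified_since
        ? parse_http_date($if_modified_since)
        : undef;
    my %headers = ('Last-Modified' => http_date($mtime));

    # Client already has the current version: send no body.
    return (304, \%headers) if defined $since && $mtime <= $since;

    # Otherwise the caller should send the full feed.
    return (200, \%headers);
}

my ($status, $headers) = feed_response(time, undef);
print "Status: $status\nLast-Modified: $headers->{'Last-Modified'}\n";
```

      With this approach the "last seen" time lives with each client, which sends it back in the request, so the server does not need to store it per visitor.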

        Thanks RonW, that gives me clearer view about feeds.