Re: designing a program - your wisdom needed

by bliako (Monsignor)
on Jan 20, 2022 at 18:58 UTC (id://11140655)


in reply to designing a program - your wisdom needed

I would also go with what other Monks suggested: use a sub for the logic of each of your proposed scripts. One of each sub's input parameters should be the DB handle. Once you have the three subs, you can still create 3 scripts, as per your original proposal, each of which reads the DB credentials, creates a DB handle and calls the appropriate sub. So you get the best of both worlds (for whatever reason). (Edit: or call the three subs from one script which asks for or reads the DB credentials once, at the beginning.)
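A minimal sketch of that layout, assuming three steps (stage, join, fetch). The sub names, table names and column names are all hypothetical, since the thread does not show the actual schema; only the DBI calls (`do`, `selectall_arrayref`, `connect`) are real.

```perl
use strict;
use warnings;

# Hypothetical step 1: copy the rows of interest into a staging table.
sub stage_rows {
    my ($dbh) = @_;
    return $dbh->do('INSERT INTO staging (id, name) SELECT id, name FROM source_a');
}

# Hypothetical step 2: let the DB do the join.
sub join_rows {
    my ($dbh) = @_;
    return $dbh->do(
        'INSERT INTO joined (id, name, amount) '
      . 'SELECT s.id, s.name, b.amount FROM staging s JOIN source_b b ON b.id = s.id'
    );
}

# Hypothetical step 3: fetch the final rows as an array of hash references.
sub fetch_results {
    my ($dbh) = @_;
    return $dbh->selectall_arrayref(
        'SELECT id, name, amount FROM joined',
        { Slice => {} },
    );
}

# Call all three steps with one shared handle.
sub run_all {
    my ($dbh) = @_;
    stage_rows($dbh);
    join_rows($dbh);
    return fetch_results($dbh);
}

# In a script, the handle would be created once and passed in, e.g.:
#   use DBI;
#   my $dbh  = DBI->connect($dsn, $user, $password, { RaiseError => 1 });
#   my $rows = run_all($dbh);
```

Because every sub receives the handle as a parameter, the same subs can be called from three small scripts or from one script that connects once.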

One point is unclear to me: you mention "join". Does it refer to an SQL JOIN or to concatenation?

Regarding your proposed "temporary table": it may not be necessary to create temporary tables if all you need is to pull results from the DB, filter or concatenate them, and save. Perhaps you can keep these results in Perl variables and do the transformations in Perl, but only if you are operating from a single Perl script calling the 3 subs (and the data is small, though SQL can do it better in the DB, if one can write it, that is ;) ). I don't know whether this is preferable to using the DB as temporary storage; it depends on your SQL knowledge and on the size of the results to be transformed.

Also, I have recently used Redis, which is a NoSQL (temporary) data store that more or less acts like a Perl hash, but is accessible from many programs on the same machine (or remotely). I used it to share temp data between scripts (like your proposed 3 scripts) when I was too lazy to write SQL and do it within the DB. That data was a string blob, a JSON string, etc., and Redis did not care about its structure, which made storing and retrieving it very convenient: no questions asked. If you find this idea interesting but you are not allowed to install Redis, here is a very lame implementation in pure Perl: Simple data-store with Perl. Note that you can store Perl data structures, e.g. nested hashes, in files or DB blobs by using Data::Serializer or Storable.
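As a small illustration of the last point, here is a sketch using Storable's `freeze` and `thaw` to turn a nested hash into a string blob and back. The data is made up for the example; the resulting blob could go into a DB column or any key-value store that accepts binary strings.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Made-up nested structure standing in for one script's results.
my %results = (
    script1 => {
        status => 'done',
        rows   => [ [ 1, 'alice' ], [ 2, 'bob' ] ],
    },
);

# Serialise to a binary string, then restore an independent copy.
my $blob = freeze(\%results);
my $copy = thaw($blob);

print $copy->{script1}{rows}[1][1], "\n";    # prints "bob"
```

To write straight to a file instead, Storable also provides `store` and `retrieve`.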

Re^2: designing a program - your wisdom needed
by SpaceCowboy (Acolyte) on Jan 20, 2022 at 21:59 UTC
    Thank you for your message. I meant an SQL join. As other monks and yourself have suggested, I will look at creating subroutines and get back to the thread. The tables have a few million rows each.
    So, you are using Redis as staging? I believe Redis is memory hungry and may take a lot of physical memory.
    I was unable to fully understand the "simple data store with Perl" link. What I gather is that it connects with a remote server and loops over some kind of regex? Pardon my ignorance, I would love to learn your intention here and how this applies to relational tables.
      I meant an sql join.

      So, it's better to do that within the DB, at least between scripts 1 and 2. (And to be clear about what I wrote above: doing things in the DB is usually much better than retrieving data and transforming it in your Perl script, *provided you write the correct SQL*, which I find almost impenetrable. That's why I am always looking for alternative, albeit roundabout and perhaps inefficient, ways. Bottom line: sticking with the DB is usually better.)

      I believe Redis is memory hungry and may take a lot of physical memory.

      I have not noticed anything abnormal there. You can always try it out and see. It was quite fast for me. I can't say anything about memory usage.

      The "simple data store" I linked acts as a server which clients (let's say your scripts) contact in order to either store a string by key or retrieve a string by key. The string can be some JSON or XML or, with a minor modification, a binary serialised Perl hash or array, possibly nested. The regex you refer to implements the simplest API to do that. That is, it checks whether the client sent something like KEY1=VALUE1, in which case it stores it. Alternatively, the client can send KEY1=, in which case the server looks in its private hash-store for KEY1 and, if it exists, sends the stored value back to the client. Warning: all the checks a DB performs when storing and retrieving data are absent, e.g. handling of race conditions. Also, there is no encryption or password protection: all scripts can see all data, provided they know the key.
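The KEY=VALUE logic described above can be sketched as a single sub. This is not the code from the linked node, only an illustration of the idea with the networking left out; the sub name and the `OK` and `ERROR` replies are assumptions made for the example.

```perl
use strict;
use warnings;

my %store;    # the server's private hash-store

# Handle one request line and return the reply string.
#   "KEY=VALUE" stores VALUE under KEY
#   "KEY="      returns the value stored under KEY (empty string if none)
sub handle_request {
    my ($line) = @_;
    $line =~ s/\r?\n\z//;    # drop a trailing line ending, if any

    # The key is everything up to the first '='; the value may contain '='.
    if ( $line =~ /^([^=\s]+)=(.*)$/s ) {
        my ( $key, $value ) = ( $1, $2 );
        if ( length $value ) {
            $store{$key} = $value;
            return 'OK';     # assumed acknowledgement
        }
        return exists $store{$key} ? $store{$key} : '';
    }
    return 'ERROR';          # assumed reply for malformed input
}

print handle_request(qq{results={"rows":3}\n}), "\n";    # prints "OK"
print handle_request("results=\n"), "\n";                # prints {"rows":3}
```

A real server would wrap this in a socket loop and, as noted, would still lack locking, authentication and encryption.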

      The "simple data store" does not apply to relational tables directly; it just offers a way for separate, independent scripts or programs (in various languages) to share some temporary data between them. It does not replace a database. It only lets you avoid using the DB as a temp data store. One example of use: script1 does data scraping and processing at irregular intervals and places the raw results in the data store. script2 runs as a "CGI", checks whether results exist in the data store and converts them to HTML for viewing. For me, script1 was in C and script2 in Perl. Doing it that way instead of with temp DB tables saved me a lot of trouble (ok, sqlite is a bit easier).

      That's my experience, which is not industrial. Others here have industry experience.
