Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi I have been assigned a task of writing a parsing engine for a legacy system. I will be extracting information from reports which are dumped as text files and need to move the information into a MSFT SQL Database.

The infornmation comes in multiple formats so I want to define xml files with the information about the file to be parsed. It will hold header info, field names, delimeters,etc.

Then I want to pass the config.xml file and the text dump file to the engine to output to the database.

Does this sound like a job for perl? I have been told this is the best approach as this is what perl does best.

Any pointers to some code which would get me started on this would be appreciated.

Sal

  • Comment on looking for some guidance before I dive in

Replies are listed 'Best First'.
Re: looking for some guidance before I dive in
by jdtoronto (Prior) on Oct 21, 2003 at 13:55 UTC
    Whilst chuckling at DrHydes sledgehammer reference (I think he has a good point there!) I think you have been well advised. One of my clients handles vast amounts of data that comes in a variety of formats. Some of it even as Excel spreadsheets! A small fistful of Perl tools including a 'universal data format convertor' I wrote recently to handle pretty much any text file layout allows him to easily get the files and put them into a database.

    For routine jobs that are done every day we handle the files from an email server, the incoming files are always sent to an appropriate user name and a Perl programme collects the email, reads the attachment, processes the data and adds it to a database to be checked before it is made available to the larger pool of users.

    This is indeed the stuff at which Perl excells!

    jdtoronto

Re: looking for some guidance before I dive in
by DrHyde (Prior) on Oct 21, 2003 at 13:30 UTC
    With the caveat that I wouldn't use the sledgehammer of XML to crack this sort of nut, yes, probably. Perl excels at reading, parsing, mangling and writing text.
      Thanks for your reply. We use XML configuration files as a standard here and I just figured I would carry them over to this project as well. This way the developer who is working on the web front end to create and modify the configuration file is not confused (Which seems like an easy think to do). Regards Sal
        That's sensible thinking. I trust that you had the same thoughts about your developers' skills when figgering out what language to use :-)
Re: looking for some guidance before I dive in
by inman (Curate) on Oct 21, 2003 at 14:55 UTC
    Converting the reports into a data format that can be sucked into MS SQL sounds just the job for Perl.

    You can either write a system that uses configuration files or just a number of similar scripts. The choice here depends on whether you are doing a one-time transition of legacy data or something that needs to be repeated and maintained.

    In terms of the 'getting data into SQL Server' question, I would look at writing a Perl script that used the Text::CSV module as a means of writing correctly formatted CSV files. These files can then be sucked into SQL server using the Bulk Import facility which is much better than trying to do SQL Updates.

    If you are looking to configure your script using XML files then take a look at XML::Simple which should allow you to read XML files with the minimum of fuss.

    Enjoy!
    Inman

      With the caveat that you lose a lot of control by going with Bulk Importing facilities, I would agree with you. If you need any sort of special handling, especially for errors, I would take the time and build the SQL. It doesn't take that long, frankly.

      ------
      We are the carpenters and bricklayers of the Information Age.

      The idea is a little like C++ templates, except not quite so brain-meltingly complicated. -- TheDamian, Exegesis 6

      ... strings and arrays will suffice. As they are easily available as native data types in any sane language, ... - blokhead, speaking on evolutionary algorithms

      Please remember that I'm crufty and crochety. All opinions are purely mine and all code is untested, unless otherwise specified.