Foxyirish1987 has asked for the wisdom of the Perl Monks concerning the following question:

OK, so as a start, I am a complete beginner at all of this. My name is Luke, a 22-year-old from Ireland. I have a text file and I need to get certain data out of it. The text file is hundreds of lines long. It is all about testers etc. What I need to do is take certain data from the text file and put it into an Excel file in certain columns and rows. It is sort of hard to explain what I need without having the report uploaded somewhere, so is there anywhere I can do this as a start? I have been on to some people and they put me on to Perl as the best way to do it, but they're all busy so they can't help. I repeat, I'm useless at programming, so any help would be much appreciated.

  • Comment on Gettin certain lines of text from a text file!

Re: Gettin certain lines of text from a text file!
by Utilitarian (Vicar) on Sep 15, 2009 at 11:26 UTC
    How do you eat an elephant?
    One bite at a time

    You need to break this problem down into subsections that are manageable. Perhaps like this:

    • extract data from a text file
      • CSV or plain text?
      • How is the data identifiable?
      • Have you managed to extract the data yet, if not how far have you got?
      • Show us what you have tried
    • Add it to the Excel file
      • Select the appropriate worksheet in the Excel file
      • Select the correct row and column
      • Write your data as a string/number to this cell
    This is the heart of what programming is, breaking a problem down to manageable sub-problems. Of course there are those who can do this faster, "take larger bites", see an inspired way to grease the elephant and swallow it in one go (stop with the metaphors already). However baby steps first ;)

      ok here is the first part of the text file. i need to extract the date from this

      From: g027778@msx.stp.guidant.com
      Sent: Thursday, September 03, 2009 08:33
      To: Steidl, Charles (STP); Gubbins, John (CLN); Horan, Laurence (CLN); Duppong, Lawrence (STP); Donovan, Leighton (CLN); Deasy, Luke (CLN); Walmsley, Mark (STP); Smith, Noel (CLN); Keating, Patrick (CLN); Doyle, PJ (CLN); Heineman, Scott (STP); Maher, Stephen (CLN); Williams, Thomas (STP)(Test Eng); Nguyen, Thuan (STP)
      Subject: ** Summary of yield for Product: Insignia__Hyb_E1
      **** Yield Report ****
      **** Product Name : Insignia__Hyb_E1_581394
      **** Software Product : 581394-201-A
      **** Test Level : Hyb_E1
      **** Part Number : 401666-312
      **** Model Number : h666
      **** RMI DataBase : prod
      **** Analysis Date Range : 09/02/2009 00:00:00 to 09/02/2009 23:59:59
      **** Tested First/Last : 09/02/2009 02:06:04 to 09/02/2009 19:20:16
      After Analysis Date Range I need to take out the 09/02/2009. This date will change every day, so searching for it will not work. I'm trying to use an array to find Analysis Date Range and then take the next 'x' amount of characters and output them to the Excel file. This would be the first step. What I'm coming up with is that I can find the Analysis Date Range, but my array etc. is not working. And yes, I was dropped royally in it: given this 2 weeks ago, and I'm finishing work next week. Only an intern here for the summer, an electrical engineering undergraduate, and we don't do this sort of stuff often/ever. Thanks a mil for all the help though. And I have it broken down; above is the first part!
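The "find the label, then take the next 'x' characters" idea described above can be sketched with Perl's built-in index() and substr(), with no array needed. This is only a minimal sketch: the helper name date_after_label is hypothetical, and it assumes the date is always exactly 10 characters long (two digits, slash, two digits, slash, four digits), as in the sample report.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the 10 characters that follow the
# "Analysis Date Range :" label, or nothing if the label is absent.
sub date_after_label {
    my ($line) = @_;
    my $label = 'Analysis Date Range';

    my $pos = index($line, $label);
    return if $pos < 0;                    # label not on this line

    # Everything after the label, minus the " : " separator
    my $rest = substr($line, $pos + length $label);
    $rest =~ s/^\s*:\s*//;

    return if length($rest) < 10;          # not enough text for a date
    return substr($rest, 0, 10);           # the next 10 characters
}

my $sample = '**** Analysis Date Range : 09/02/2009 00:00:00 to 09/02/2009 23:59:59';
print date_after_label($sample), "\n";
```

Counting characters works only while the layout stays fixed; a regular expression is more forgiving if the spacing or the number of digits ever changes.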

        So, when you're looking for the date range, you can find the "Analysis Date Range", but then how do you decide what is the date coming after that?

        I would say that you're looking for:

        1. First, the string "Analysis Date Range : " with an arbitrary amount of whitespace around the ':' (/Analysis Date Range\s*:\s*/)
        2. Then, at least one digit, which we want to remember (/([0-9]+)/)
        3. Then, a slash (which needs a backslash to escape it) followed by at least one digit, and record the digits but not the slash (/\/([0-9]+)/)
        4. Then, another slash followed by at least one digit, and again record the digits not the slash (/\/([0-9]+)/)
        5. And the rest doesn't matter
        Combining that all into one:

        # i modifier to make it case insensitive, just in case.
        if ($line =~ /Analysis Date Range\s*:\s*([0-9]+)\/([0-9]+)\/([0-9]+)/i) {
            print "I found an Analysis date with month/day $1, $2 in year $3\n";
        } else {
            print "This line did not have an analysis date that I recognize in it\n";
        }

        Another way to go, would be to grab larger chunks and then split them later:

        1. "Analysis Date Range : "
        2. Then a pile of digits-and-slashes (start date) ([0-9\/]+)
        3. Then whitespace and digits-and-colons (start time) \s([0-9:]+)
        4. Then more whitespace and some digits-and-slashes (end date)
        5. Finally, whitespace then digits-and-colons (end time)
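The "larger chunks, split later" steps above might look something like the following. It is a sketch, not a definitive implementation: the helper name parse_date_range and the hash keys are hypothetical, and the literal "to" between the two timestamps is taken from the sample line in the report.

```perl
use strict;
use warnings;

# Hypothetical helper: grab the start/end date and time as four chunks,
# then split the date chunks on the slash afterwards.
sub parse_date_range {
    my ($line) = @_;
    return unless $line =~ m{
        Analysis \s+ Date \s+ Range \s* : \s*
        ([0-9/]+) \s+ ([0-9:]+)      # start date, start time
        \s+ to \s+
        ([0-9/]+) \s+ ([0-9:]+)      # end date, end time
    }xi;
    my ($start_date, $start_time, $end_date, $end_time) = ($1, $2, $3, $4);

    return {
        start_date  => $start_date,
        start_time  => $start_time,
        end_date    => $end_date,
        end_time    => $end_time,
        start_parts => [ split m{/}, $start_date ],   # three numeric pieces
        end_parts   => [ split m{/}, $end_date ],
    };
}

my $range = parse_date_range(
    '**** Analysis Date Range : 09/02/2009 00:00:00 to 09/02/2009 23:59:59');
print "Start: $range->{start_date} at $range->{start_time}\n" if $range;
```

Which of the three pieces is the month and which is the day is left open here, since that depends on the answer to the question in the PS.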

        PS: Is that date string "month day year" or "day month year"?

        So you need to extract a date once which appears after Sent and keep this value and use it in combination with the Analysis Date Range for each report?
        open(my $fh, "<", "datafile") or die "Cannot open datafile: $!";
        my $sent_date = "";
        while (<$fh>) {
            # Could be more precise, but you're only expecting one instance of Sent at the start of a line?
            $sent_date = $1 if /^Sent: (.*)$/;
            # Only add a row when this line carries an Analysis Date Range
            if (/^\*{4}\s+Analysis Date Range\s+:\s+(\d{2}\/\d{2}\/\d{4})/) {
                add_to_spreadsheet($sent_date, $1);
            }
        }
        close($fh);
        So, half way there ;) Now you need to implement add_to_spreadsheet(); Answer the questions and use the module suggested initially, it's easy enough to use once you know the steps you intend to take.
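While add_to_spreadsheet() is still unwritten, one way to test the extraction is to emit comma-separated rows that Excel can open, as suggested elsewhere in this thread. The sketch below is an assumption, not the module-based approach referred to above: csv_field and csv_row are hypothetical helper names, and the quoting is needed because the Sent value itself contains commas.

```perl
use strict;
use warnings;

# Hypothetical helper: quote one value for CSV (wrap in double quotes,
# double any embedded quotes) so a value with commas stays in one cell.
sub csv_field {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/"/""/g;
    return qq{"$value"};
}

# Stand-in for add_to_spreadsheet(): formats one row as a CSV line.
sub csv_row {
    my @fields = @_;
    return join(',', map { csv_field($_) } @fields) . "\n";
}

# Prints to standard output; redirect to a .csv file to load it in Excel.
print csv_row('Sent', 'Analysis Date');
print csv_row('Thursday, September 03, 2009 08:33', '09/02/2009');
```

Once the rows look right, the body of csv_row can be swapped for calls to whichever Excel-writing module is chosen, without touching the extraction loop.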
Re: Gettin certain lines of text from a text file!
by Zen (Deacon) on Sep 15, 2009 at 13:52 UTC
    If you can't program, you should tell your manager to adjust their expectations. Pulling out a solution from PM every time isn't a solution. In the coming semester, maybe you could focus on it if it's something you expect to be able to do. It sounds like the hard part for you is understanding how to develop an algorithm, though, not perl.
Re: Gettin certain lines of text from a text file!
by Anonymous Monk on Sep 15, 2009 at 10:43 UTC

      Read them already, that's why I have not posted until now, trying to figure it out. Thanks, but I'm just useless so they did not help at all lol, only helped to give me an idea how to do it but still nowhere near what I need. At it 2 weeks now, so said I'd post and see what happens. Thank you though.

        but i'm just useless

        STOP THAT!
        You are not useless, you have been dropped in it.

        Describe the problem to yourself, in writing, in English (or Gaelic, or whatever). Look at the data. How do you choose the data to extract? Describe that in natural language, then you might try converting that into a regular expression, if one is needed. If regular expressions make you weak at the knees then that probably means you are normal - but you can always ask here.

        You might start with very simple code reading one line at a time then just printing them out, then gradually refine what it is that you are printing. Once you have what you are looking for then try writing into a spreadsheet. Or create a CSV and manually load it. Small steps.
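The "read one line at a time, print, then refine" approach could start out like this. It is a minimal sketch: the helper name interesting_lines is hypothetical, and an in-memory string stands in for the real report file so the example runs on its own.

```perl
use strict;
use warnings;

# Hypothetical helper: read a filehandle line by line and return only
# the lines worth a closer look.
sub interesting_lines {
    my ($fh) = @_;
    my @keep;
    while (my $line = <$fh>) {
        chomp $line;
        # First version: print every line here, just to prove reading works.
        # Refined version: keep only the lines that mention what we are after.
        push @keep, $line if $line =~ /Analysis Date Range/;
    }
    return @keep;
}

# An in-memory filehandle stands in for the real report file here.
my $report = "**** Model Number : h666\n"
           . "**** Analysis Date Range : 09/02/2009 00:00:00 to 09/02/2009 23:59:59\n";
open(my $fh, '<', \$report) or die "Cannot open in-memory report: $!";
print "$_\n" for interesting_lines($fh);
close($fh);
```

For the real file, replace the reference to $report in open() with the file's name; the loop itself stays the same.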
        You're telling me the "Files and I/O" and "Simple matching" sections didn't help? Maybe you need a book.