in reply to Columnwise parsing of a file

BrowserUk got in ahead of me. Here is one way to implement his solution:

#! perl
use strict;
use warnings;
use Data::Dump;

my @matrix;
my $row = 0;

push @{$matrix[$row++]}, split while <DATA>;

dd @matrix;

print 'element at col 3, row 2 is ', get_element(\@matrix, 3, 2), "\n";

sub get_element
{
    my ($matrix_ref, $col, $row) = @_;

    return $matrix_ref->[$row - 1][$col - 1];
}

__DATA__
20 30 40
60 70 80
90 100 49

Output:

23:50 >perl 548_SoPW.pl
([20, 30, 40], [60, 70, 80], [90, 100, 49])
element at col 3, row 2 is 80
23:50 >

Update: A simpler syntax for populating the array:

my @matrix;
push @matrix, [ split ] while <DATA>;

Hope that helps,

Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

Re^2: Columnwise parsing of a file
by ghosh123 (Monk) on Feb 26, 2013 at 09:51 UTC

    Hi

    Thanks for replying! It surely helps.
    But what if the data file has the following content:

    Product ID Product Name Product Type Cost
    1 TV set Entertainment 10k
    How, in this case, would I handle the spaces that each cell's content contains (e.g. col 0, row 1)?

    Please notice the space in the cell name 'Product ID'; it is not 'ProductID'. On the other hand, 'Cost' is a cell name with no space in it.

    Here col 2, row 1 should give me: Product Name
    and
    col 2, row 2 should give: TV set

      But what if the data file has the following content:

      If you have fields with embedded spaces, separated by spaces, and no quoting, you're stuffed.

      Are you producing this file or getting it from someone else?

      Are you sure that the fields are separated by spaces and not tabs?
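
If the fields do turn out to be tab-separated, the problem mostly goes away. Here is a minimal sketch, assuming (this is not confirmed by the poster) that the two sample lines use a single tab between fields. The `@lines` data is made up for illustration, and `get_element` is the same 1-based accessor as in the earlier post.

```perl
use strict;
use warnings;

# Hypothetical sample: the poster's two lines, ASSUMING tab-separated fields.
my @lines = (
    "Product ID\tProduct Name\tProduct Type\tCost",
    "1\tTV set\tEntertainment\t10k",
);

# Split on tabs only, so spaces inside a field are preserved.
my @matrix = map { [ split /\t/ ] } @lines;

# Same 1-based accessor as in the earlier post.
sub get_element {
    my ($matrix_ref, $col, $row) = @_;
    return $matrix_ref->[$row - 1][$col - 1];
}

print 'col 2, row 1: ', get_element(\@matrix, 2, 1), "\n";   # Product Name
print 'col 2, row 2: ', get_element(\@matrix, 2, 2), "\n";   # TV set
```

If the file really uses spaces throughout, this will not help, and every line will come back as a single field.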


      With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.

      If your data is formatted that liberally, you are completely on your own. There ought to be rules for determining where fields/columns start and end. If there are no rules, you cannot parse. Period.

      Is the current "format" the only possible format? Can the "data" be generated as something that does have rules, like CSV? When the data is well-formatted CSV, you can use Text::CSV_XS to parse the data and use all advice already given, or even easier, use Spreadsheet::Read (in combination with Text::CSV_XS) to get direct access to every "cell" in your dataset.
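
As a rough sketch of the CSV route, assuming the data could be regenerated as comma-separated values: the `@lines` content below is a hypothetical CSV rendition of the poster's sample, not real data from the thread. It uses the `parse` and `fields` methods of Text::CSV_XS, which must be installed from CPAN.

```perl
use strict;
use warnings;
use Text::CSV_XS;

# Hypothetical CSV version of the poster's data (an assumption for illustration).
my @lines = (
    'Product ID,Product Name,Product Type,Cost',
    '1,TV set,Entertainment,10k',
);

my $csv = Text::CSV_XS->new({ binary => 1 });

my @matrix;
for my $line (@lines) {
    # parse() returns false if the line is not valid CSV.
    $csv->parse($line) or die "Cannot parse line: $line\n";
    push @matrix, [ $csv->fields ];
}

# Arrays are 0-based, so col 2, row 2 is $matrix[1][1].
print "col 2, row 2: $matrix[1][1]\n";   # TV set
```

When reading from a real file, the module's `getline` method is probably the better choice, since line-by-line `parse` cannot cope with fields that contain embedded newlines.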


      Enjoy, Have FUN! H.Merijn

      In the absence of field delimiters you need to make use of the structure of your data.

      In this case, is the Product ID always an integer? Is the Cost always a single word ( [\w\d]+ )? And is the Product Type a single word?

      Without some such pattern, a general solution may not be possible.
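
To make that concrete, here is a minimal sketch that relies on exactly those assumptions, none of which the poster has confirmed: the ID is an integer, Product Type and Cost are each a single word, and the Product Name is whatever lies in between. `parse_row` is a made-up helper name, and the header is hard-coded because it follows no such pattern.

```perl
use strict;
use warnings;

# ASSUMED rules: ID is an integer, Product Type and Cost are single words,
# and Product Name is everything between the ID and the Product Type.
sub parse_row {
    my ($line) = @_;
    my ($id, $name, $type, $cost) =
        $line =~ /^\s*(\d+)\s+(.+?)\s+(\S+)\s+(\S+)\s*$/
        or return;                      # line does not fit the pattern
    return [ $id, $name, $type, $cost ];
}

# The header cannot be parsed by the same rule, so it is supplied by hand.
my @matrix = ( [ 'Product ID', 'Product Name', 'Product Type', 'Cost' ] );

my $row = parse_row('1 TV set Entertainment 10k')
    or die "Row does not follow the assumed pattern\n";
push @matrix, $row;

print "col 2, row 2: $matrix[1][1]\n";   # TV set
```

The moment a Product Type contains a space, this breaks silently, which is why a real delimiter is the better fix.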

      How, in this case, would I handle the spaces that each cell's content contains (e.g. col 0, row 1)?

      If you are using spaces as field delimiters, and you have (non-escaped) spaces in your data, you don't handle it. You're asking the wrong question and trying to solve the wrong problem.

      The problem isn't how to parse the data, it's how to get valid data. Data in a format that can't be cleanly parsed is what we usually call garbage data.

      A well-known maxim in the database world (and elsewhere in IT) is "Garbage in, garbage out". If you can't provide good data to process, or come up with a way to clean up your data before processing, you will never get valid, reliable, trustworthy results out.

      Additionally, once you find a way to either get clean data or properly clean up your data, the parsing will likely be much simpler to figure out.
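
One cheap way to spot this kind of garbage before parsing is to check that every row splits into the same number of fields as the first. The following is only a sketch: `inconsistent_rows` is a hypothetical helper name, and a matching field count does not prove the data is clean, it only catches the obvious cases. It needs Perl 5.10 or later for `//=`.

```perl
use strict;
use warnings;

# Return the 1-based numbers of rows whose field count differs from row 1.
sub inconsistent_rows {
    my ($sep, @lines) = @_;
    my @bad;
    my $expected;
    for my $i (0 .. $#lines) {
        my @fields = split $sep, $lines[$i], -1;
        $expected //= scalar @fields;           # first row sets the expectation
        push @bad, $i + 1 if @fields != $expected;
    }
    return @bad;
}

# The poster's sample, split on whitespace as in the original solution.
my @lines = (
    'Product ID Product Name Product Type Cost',
    '1 TV set Entertainment 10k',
);

my @bad = inconsistent_rows(qr/\s+/, @lines);
print "Rows needing cleanup: @bad\n";   # Rows needing cleanup: 2
```

Here the header splits into 7 pieces and the data row into 5, so the mismatch is flagged straight away.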

      Christopher Cashell