Yes, it is possible. Did you try and if so, how? (see How do I post a question effectively?)

update after the OP's update

Documentation

You get, as you might know already, documentation for split and map by typing e.g. perldoc -f split on the command line of your shell. Hashes are not a function, so they are described in perldoc perldata instead.

About perl being cryptic

The crypticness of perl expressions will vanish as you become familiar with its main concepts, two of them being data types (scalars, arrays, hashes and references, which are, well, scalars) and context. The latter is perl's curse and blessing and the origin of much of its perceived crypticness, because many functions behave differently depending on the context they are invoked in. Even a simple assignment does different things to the right-hand side before the assignment is done to the left-hand side of =, depending on what type the lvalue is.
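To make the point about context concrete, here is a minimal sketch. The array @hobbits and the other variable names are made up for illustration; the only built-in used is localtime, which returns a list of fields in list context and a date string in scalar context.

```perl
use strict;
use warnings;

my @hobbits = ('Bilbo', 'Sam', 'Frodo');

# Scalar lvalue: the array is evaluated in scalar context and yields its size.
my $count = @hobbits;

# List lvalue (note the parens): the array is evaluated in list context,
# so its first element lands in $first.
my ($first) = @hobbits;

# The same built-in returns different things depending on context.
my @parts = localtime(0);   # list context: (sec, min, hour, mday, ...)
my $stamp = localtime(0);   # scalar context: a human-readable date string

print "count=$count first=$first\n";    # prints "count=3 first=Bilbo"
print scalar(@parts), " fields vs. '$stamp'\n";
```

The only difference between the two assignments from @hobbits is the pair of parens on the left-hand side, which is exactly the kind of thing that looks cryptic until context becomes second nature.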

The text file contains just one record per line, right?

    Bilbo Baggins, Under The Hill
    Sam Gamgee, Bagshot Row

in, say, a file named addrfile.txt; then I would say e.g. (TIMTOWTDI - There's More Than One Way To Do It)

     1  #!/usr/bin/perl
     2
     3  my $file = 'addrfile.txt';
     4
     5  open I, '<', $file
     6      or die "Can't open '$file' for reading: $!\n";
     7
     8  chop (%hash = map { split /\s*,\s*/,$_,2 } grep (!/^$/,<I>));
     9
    10  # print out that hash
    11
    12  print "$_ => $hash{$_}\n" for keys %hash;
which results in (in no guaranteed order, since hashes are unordered)
    Bilbo Baggins => Under The Hill
    Sam Gamgee => Bagshot Row
and looks pretty cryptic.

Explanation, per line (except empty ones :-)

Line 1 tells the OS which interpreter to use.
Line 3 assigns the file to be processed to the variable $file.
Line 5 tries to open the file for reading, associating it with the filehandle I;
line 6 aborts the program (die) if that fails.

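Line 8 is where things get interesting. It "just" contains an invocation of chop (LIST).
chop operates on what is inside the outer round parens, which is the result of an assignment: the %hash. So chop removes the last character, here the line ending, from each value of the hash %hash (chomp, which removes only a trailing newline, would be the safer choice if the last line might lack one). chop treats its argument according to its type: given a hash, it chops the values but not the keys.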
The right hand side (rhs) of the assignment inside the round parens for chop is a map statement: map BLOCK LIST. The LIST is returned by grep which operates on a LIST. So, the second argument to the grep function is evaluated in list context, which forces the <> operator (which is a funny way to say readline(FILEHANDLE)) to return a list containing the lines of addrfile.txt.
The first argument to grep (!/^$/) says "gimme all that isn't an empty line" - see perlre. This list is passed to map.
In map each element is processed by what is contained in BLOCK ({ }), and the results of that processing (the results from the last evaluated statement) are returned as a list - which in this case are key/value pairs resulting from the split operation inside the block.
Now, split (split /\s*,\s*/,$_,2) just splits each line as returned from grep (and assigned to $_ inside map) into at most two elements via the regular expression /\s*,\s*/, meaning "zero or more whitespace chars, a comma, and zero or more whitespace chars" is taken as the boundary between elements. Because of the limit 2, any further commas stay in the second element.
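The effect of the third argument to split is easiest to see side by side. This is only a sketch; the record in $line is a made-up example with two commas in it.

```perl
use strict;
use warnings;

# Hypothetical record whose address part contains a comma of its own.
my $line = "Frodo Baggins , Bag End, Hobbiton\n";

# With a limit of 2, split stops after the first separator.
my @limited = split /\s*,\s*/, $line, 2;

# Without a limit, every comma is a boundary.
my @unlimited = split /\s*,\s*/, $line;

# @limited   is ("Frodo Baggins", "Bag End, Hobbiton\n")
# @unlimited is ("Frodo Baggins", "Bag End", "Hobbiton\n")
print scalar(@limited), " vs. ", scalar(@unlimited), " fields\n";   # prints "2 vs. 3 fields"
```

Note that split leaves the trailing newline in the last field, which is why the line ending still has to be removed afterwards.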

There. The result of map is a sequence of key/value pairs, which are assigned to a hash, which is then chopped.
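As a small sketch of that last step, the following builds a hash from the flat list map returns and then compares chop with chomp on its values. The two input strings are made up; the second one deliberately has no trailing newline, as the last line of a file might.

```perl
use strict;
use warnings;

# map returns one flat list; assigning it to a hash pairs the items up
# as key, value, key, value.
my %hash = map { split /\s*,\s*/, $_, 2 }
    ("Bilbo Baggins, Under The Hill\n", "Sam Gamgee, Bagshot Row");

my %chopped = %hash;
my %chomped = %hash;

chop(%chopped);     # removes the last character of every value, whatever it is
chomp(%chomped);    # removes a trailing newline only, if there is one

print "chop:  '$chopped{'Sam Gamgee'}'\n";    # prints "chop:  'Bagshot Ro'"
print "chomp: '$chomped{'Sam Gamgee'}'\n";    # prints "chomp: 'Bagshot Row'"
```

Both functions leave the keys alone, so for a file that ends in a newline the two give the same result.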

Line 12 just prints out the hash as key => value.

The above could be written more verbosely like this:

    while (my $line = <I>) {
        chop $line;                     # remove the line ending
        push(@lines,$line);
    }
    @lines = grep (!/^$/, @lines);      # drop empty lines
    foreach my $line (@lines) {
        my ($name, $addr) = split /\s*,\s*/,$line,2;
        $hash{$name} = $addr;
    }
but in order to address most of the points in your post I showed you the "cryptic" one first :-)

Also, if you examine this code, you will find that both my examples hold the data in memory twice: first as a list, then again when that list is assigned to the hash. For small files that's ok, but for larger files you'd rather say

    while (my $line = <I>) {
        chop $line;
        next if $line =~ /^$/;          # skip empty lines
        my ($name, $addr) = split /\s*,\s*/,$line,2;
        $hash{$name} = $addr;
    }
as shown by other replies to your post.
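For completeness, here is a sketch of the line-by-line approach under use strict and use warnings, with a lexical filehandle and chomp instead of chop. These are my substitutions, not what the code above uses. To keep it runnable on its own, it reads from an in-memory string ($data, a stand-in for addrfile.txt) rather than from a real file.

```perl
use strict;
use warnings;

# Stand-in for the contents of addrfile.txt, including one empty line.
my $data = "Bilbo Baggins, Under The Hill\n\nSam Gamgee, Bagshot Row\n";

# Opening a reference to a scalar gives an in-memory filehandle.
open my $fh, '<', \$data
    or die "Can't open in-memory file: $!\n";

my %hash;
while (my $line = <$fh>) {
    chomp $line;                        # strip the trailing newline only
    next if $line =~ /^\s*$/;           # skip empty or whitespace-only lines
    my ($name, $addr) = split /\s*,\s*/, $line, 2;
    $hash{$name} = $addr;
}
close $fh;

# Sort the keys so the output order is predictable.
print "$_ => $hash{$_}\n" for sort keys %hash;
```

To read the real file, replace \$data with the file name; the loop itself stays the same.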

--shmem

update: small fixes (grammar and such)

_($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                              /\_¯/(q    /
----------------------------  \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}

In reply to Re: Using Split to load a hash by shmem
in thread Using Split to load a hash by Grey Fox
