Grabbing first column of text

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I have a problem -- I have a text file like this:

This is a line.
Another line is here.
  Here's a line.

Line five is this one.

What I need to do is get the first column of each of these (columns being ended by spaces), so that my output looks like this:

This
Another


Line

As you can see, if the line begins with spaces or is empty, I just want to print out a blank line so I can tell that there was something there. I would do this on my own, but this is just a little part of a much bigger project of mine, and I'm still just learning Perl and I'm really bad with regular expressions. Any help would be much appreciated (and I promise to learn from your answers before I return!). Thanks!

Re: Grabbing first column of text by japhy (Canon) on Aug 02, 2001 at 20:15 UTC
You simply want to use `($first) = $line =~ /(\S*)/;`
_____________________________________________________
Jeff `japhy` Pinyan: Perl, regex, and perl hacker.
`s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;`
Re: Re: Grabbing first column of text by twerq (Deacon) on Aug 02, 2001 at 22:40 UTC
"You simply want to use `($first) = $line =~ /(\S*)/;`"

I believe that the original poster also wanted to capture a series of blank spaces, if that was what occupied the first element of a line. `\S` matches non-whitespace characters only. The split solutions seem much better suited.

--twerq
Re: Re: Re: Grabbing first column of text by japhy (Canon) on Aug 02, 2001 at 22:53 UTC
No, the original post says: "As you can see, if the line begins with spaces or is empty, I just want to print out a blank line so I can tell that there was something there." He either wants the column of text, or nothing. That is what I give him.
_____________________________________________________
Jeff `japhy` Pinyan: Perl, regex, and perl hacker.
`s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;`
Re: Re: Re: Grabbing first column of text by Anonymous Monk on Aug 02, 2001 at 22:52 UTC
No, if you'll read the entire post, you'll notice that I said "... if the line begins with spaces or is empty, I just want to print out a blank line..." Anyway, the first two examples here do just what I'm looking for. Thanks, guys!
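The point argued in this sub-thread is easy to check. What follows is a minimal sketch, not code from any of the posters: `first_column` is a hypothetical helper name, and the sample lines are the ones from the original question. It applies the unanchored `/(\S*)/` match to each line. Because `\S*` is allowed to match zero characters, the match succeeds at position 0 and captures an empty string whenever the line is empty or starts with whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return the first column of a
# line, or '' when the line is empty or begins with whitespace.
sub first_column {
    my ($line) = @_;
    # \S* may match zero characters, so the match always succeeds at
    # position 0; it captures '' when the first character is whitespace.
    my ($first) = $line =~ /(\S*)/;
    return $first;
}

# Sample lines taken from the original question.
my @lines = (
    "This is a line.",
    "Another line is here.",
    "  Here's a line.",
    "",
    "Line five is this one.",
);

print first_column($_), "\n" for @lines;
```

Swapping `\S*` for `\S+` would change the behaviour: an unanchored `/(\S+)/` skips leading whitespace and captures `Here's` from the third line, which is why the `\S+` versions elsewhere in the thread are anchored with `^`.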
Re: Grabbing first column of text by CheeseLord (Deacon) on Aug 02, 2001 at 20:13 UTC
Try one of these one-liners (they have the same output, but I thought the second might be a little easier to understand):

`perl -ple '($_) = /^\S+/g' filename`
`perl -ple '$_ = (/^\S+/g)[0]' filename`

Basically, each one reads a line, sets it to the first group of non-whitespace characters at the beginning of the line, and then prints the changed line out. Hope this helps!

His Royal Cheeziness
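For readers new to the command-line switches, here is a rough long-hand sketch of what the second one-liner does. It is an approximation under stated assumptions, not an exact expansion: `first_word` is a hypothetical helper name, the input comes from an in-memory filehandle instead of `filename`, and the `defined` check is added only to keep `use warnings` quiet. `-p` supplies the read-and-print loop, `-l` strips the newline on input and adds it back on output, and `-e` supplies the code.

```perl
use strict;
use warnings;

# Hypothetical helper: the body of the second one-liner, applied to one line.
sub first_word {
    my ($line) = @_;
    chomp $line;                          # -l removes the trailing newline
    my $first = ($line =~ /^\S+/g)[0];    # first match, or undef if none
    return defined $first ? $first : '';  # nothing matched: blank line
}

# Sample text from the original question, read through an in-memory
# filehandle so the sketch needs no external file.
my $text = "This is a line.\nAnother line is here.\n  Here's a line.\n\nLine five is this one.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

while (my $line = <$fh>) {                # -p supplies this loop
    print first_word($line), "\n";        # -p prints; -l re-adds the newline
}
close $fh;
```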
Re: Grabbing first column of text by tachyon (Chancellor) on Aug 02, 2001 at 20:37 UTC
Here is a nice simple example. We read from the DATA filehandle, but it could just as well be a filehandle you open with `open FILE, "<file.txt" or die "Oops $!\n"`.

`while (<DATA>) {
    ($first) = $_ =~ m/^(\S+)/;
    print "$first\n";
}
__DATA__
This is a line.
Another line is here.
  Here's a line.

Line five is this one.`

The while loop iterates over the filehandle, assigning each line to $_. The next line is a standard Perl idiom to capture a regex match. We match all non-whitespace at the beginning of each line: the ^ anchors the match at the beginning, \S matches a non-whitespace character, and the + after \S means one or more (and as many as possible).

You can also do this with split. Split returns a list of values, so (split/\s/,$_)[0] is the first value when we split $_ on whitespace. Note that in the example I don't bother to specify $_, as this is the default string that split works on.

`while (<DATA>) {
    $first = (split/\s/)[0];
    print "$first\n";
}`

Getting completely carried away, you can also do it using substr and index. index gets you the position of the first space and substr gets you the string up to it.

`while (<DATA>) {
    $first = substr $_, 0, index $_, " ";
    print "$first\n";
}`

Hope this helps.

cheers

tachyon

s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print
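The three techniques above can be put side by side. This is a sketch under assumptions, not tachyon's code: the helper names `by_regex`, `by_split` and `by_index` are hypothetical, the lines are passed in without their trailing newline, and the `defined` checks exist only to avoid warnings. One difference worth noting: `index` returns -1 when the line contains no space, and `substr` with a length of -1 drops the last character. On lines read from a file that character is normally the newline, so the loop version still works, but on a chomped line it would cut off a real character. The guard in `by_index` is an addition to handle that case.

```perl
use strict;
use warnings;

# Hypothetical helpers, one per technique described above.

# Anchored regex capture; no match means no first column.
sub by_regex {
    my ($line) = @_;
    my ($first) = $line =~ m/^(\S+)/;
    return defined $first ? $first : '';
}

# split on single whitespace characters; leading whitespace yields an
# empty first field, and an empty line yields an empty list.
sub by_split {
    my ($line) = @_;
    my $first = (split /\s/, $line)[0];
    return defined $first ? $first : '';
}

# index/substr; index returns -1 when there is no space, in which case
# the whole line is the first column (guard added for this sketch).
sub by_index {
    my ($line) = @_;
    my $pos = index $line, ' ';
    return $pos < 0 ? $line : substr($line, 0, $pos);
}

for my $line ("This is a line.", "  Here's a line.", "", "word") {
    printf "regex=[%s] split=[%s] index=[%s]\n",
        by_regex($line), by_split($line), by_index($line);
}
```

Note that `by_index` only recognises a literal space as the column separator, so tab-separated input would need one of the other two approaches.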
Re: Grabbing first column of text by Hofmator (Curate) on Aug 02, 2001 at 20:47 UTC
Your question has already been answered. I just want to give you some hints for the more general case (the n-th column, or when you need more than one column). Don't try to use a regex there; use split:

`my @columns = split /\s/, $line;
# or
my $col3 = (split /\s/, $line)[2];`

Depending on what you would like to get, split on / /, /\s+/ or ' ', or whatever your column delimiter might be.

-- Hofmator
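The choice of split pattern matters more than it may appear. The sketch below is illustrative only: the sample string and the helper name `nth_column` are made up for this example. It splits one line that has leading spaces, a double space and a tab, using four different patterns.

```perl
use strict;
use warnings;

# Made-up sample: two leading spaces, a double space, and a tab.
my $line = "  alpha  beta\tgamma";

my @single = split /\s/,  $line;  # every whitespace character is a delimiter
my @space  = split / /,   $line;  # only a literal space is a delimiter
my @runs   = split /\s+/, $line;  # runs of whitespace; leading empty field kept
my @awk    = split ' ',   $line;  # special case: leading whitespace skipped

printf "%-6s %d fields\n", @$_
    for ['\s', scalar @single], ['/ /', scalar @space],
        ['\s+', scalar @runs],  ["' '", scalar @awk];

# Hypothetical helper: 1-based n-th column using the ' ' special case,
# or '' when the line has fewer than $n columns.
sub nth_column {
    my ($text, $n) = @_;
    my $col = (split ' ', $text)[$n - 1];
    return defined $col ? $col : '';
}

print nth_column($line, 2), "\n";
```

For the original question, where a line that starts with whitespace should give an empty first column, `/\s/` or `/\s+/` preserves that empty leading field, while `' '` would skip it and return the first word instead.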