in reply to Capturing Space with Split

Listen to the Brethren: show us your problem! The code below is taken from "I know what I mean. Why don't you?" and shows a very simple way to include test data in your post to demonstrate a problem.

    while (<DATA>) {
        print $_;
    }
    __DATA__
    Hello world

Let's edit that a little to perhaps get closer to the question you want to ask:

    use warnings;
    use strict;

    while (<DATA>) {
        chomp;
        my @sentences = split /(?<=\.)\s*/;
        print '>', join ("<\n>", @sentences), '<';
    }
    __DATA__
    Hello world. Hello Bretheren. Goodbye spaces. All I really want are sentences.

Which prints:

    >Hello world.<
    >Hello Bretheren.<
    >Goodbye spaces.<
    >All I really want are sentences.<

Now, your job is to modify that to demonstrate the problem you are having and to show us the output you expect to get.


DWIM is Perl's answer to Gödel

Re^2: Capturing Space with Split
by Gavin (Archbishop) on Mar 13, 2006 at 21:20 UTC
    Perhaps I could have been more specific at the outset, but explaining what you would like the code to do when you are not very sure yourself how to go about the task is rather difficult!
        use warnings;
        use strict;

        while (<DATA>) {
            chomp;
            my @sentences = split /(?<=\.)\s*/;
            #print '>', join ("<\n>", @sentences), '<';
            foreach (@sentences) {
                print "$_\n";
            }
        }
        __DATA__
        Hello world. Hello Bretheren. Goodbye spaces. All I really want are sentences.

    Which prints:

        Hello world.
        Hello Bretheren.
        Goodbye spaces.
        All I really want are sentences.

    Your code:

        >Hello world.<
        >Hello Bretheren.<
        >Goodbye spaces.<
        >All I really want are sentences.<

    I would like:

        Hello world.
        Hello Bretheren.
        Goodbye spaces.
        All I really want are sentences.
    Your code, print '>', join ("<\n>", @sentences), '<';, does exactly what I want except for the > and <. I would like the data without the space to the left on the lines after the first.

      If I understand your reply, you now have a solution to your problem. The "trick" was to gobble up the spaces following the full stop with the \s* in split's regex: split /(?<=\.)\s*/.
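
      To make the difference concrete, here is a minimal sketch that compares a split which only looks behind for the full stop with one that also consumes the white space after it. The helper names split_keep_spaces and split_sentences are made up for this illustration and are not from the original post.

```perl
use strict;
use warnings;

# Hypothetical helpers, named for this illustration only.

# Split after each full stop but leave the following white space in place.
sub split_keep_spaces {
    my ($text) = @_;
    return split /(?<=\.)/, $text;
}

# Split after each full stop and swallow any white space that follows it.
sub split_sentences {
    my ($text) = @_;
    return split /(?<=\.)\s*/, $text;
}

my $line = 'Hello world. Goodbye spaces.';

print "[$_]\n" for split_keep_spaces($line);
# [Hello world.]
# [ Goodbye spaces.]

print "[$_]\n" for split_sentences($line);
# [Hello world.]
# [Goodbye spaces.]
```

      Because the look-behind is zero width, the full stop stays attached to its sentence; only what \s* matches is thrown away as the separator.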

      The reason for the angle brackets around the text was to demonstrate that there were no hidden spaces - white space can be hard to see. Printing without them is as simple as: print join "\n", @sentences;.
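
      As a small sketch of that debugging habit, the snippet below wraps each field in markers so that a stray leading space shows up, then prints the same list without them. The helper name bracketed and the sample data are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: wrap each field in > and < so that leading or
# trailing white space becomes visible.
sub bracketed {
    my @fields = @_;
    return join "\n", map { ">$_<" } @fields;
}

# Sample data with a deliberate leading space on the second element.
my @sentences = ('Hello world.', ' Goodbye spaces.');

print bracketed(@sentences), "\n";
# >Hello world.<
# > Goodbye spaces.<

# Once the data looks right, drop the markers.
print join("\n", @sentences), "\n";
```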

      While you are exploring the Monastery I strongly recommend that you wander into the tutorials section and have a good browse there. There is a lot of material that should help someone starting out on their Perl journey.

      Oh, and welcome to The Monastery.

