uajith has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks, I am writing a Perl script to manipulate a big text which contains around 5 paragraphs. Is it advisable to store the entire text in a single variable? Or which will be the best approach?

Replies are listed 'Best First'.
Re: Big paragraphs in Perl
by Happy-the-monk (Canon) on Jun 21, 2013 at 11:39 UTC

    which will be the best approach

    It depends (TM)... on how big "big" is, and on how you are going to manipulate the text.

    If it isn't actually 4 GB of XML but simple text, just try any way that seems to be the right way
    ...and deal with any problems at the time they arise. The monks will be here for future consultation.

    Cheers, Sören

    (hooked on the Perl Programming language)

      Et tu, Sören? Trademarking stifles!

Re: Big paragraphs in Perl
by Eily (Monsignor) on Jun 21, 2013 at 15:21 UTC

    As irah said, processing the document line by line could be a good idea. But there's probably no problem with doing otherwise.

    A Perl "trick" you may find useful is changing your script's definition of what a line is. For example, you could read your input file paragraph by paragraph instead of line by line, if a single line isn't enough information for you to work with. Check the documentation on $/ for that. If your paragraphs are separated by a ----- line, you could write:

    {
        # We make sure to localize the reading behaviour to the inner block
        local $/ = "\n-----\n";
        while (my $paragraph = <$yourInputFile>) {
            # code that processes the data
        }
    }
    # At this point we go back to a normal reading behaviour
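
As a rough, self-contained sketch of the separator trick above: the sample text and the names $text, $in and @paragraphs are made up for illustration, and an in-memory filehandle stands in for a real input file. It also shows that chomp removes the current value of $/, so the separator does not stay attached to each paragraph.

```perl
use strict;
use warnings;

# Hypothetical sample data; the "-----" separator line follows the post above.
my $text = "First paragraph,\nline two.\n-----\nSecond paragraph.\n-----\nThird paragraph.\n";

# An in-memory filehandle lets the sketch run without a real file.
open my $in, '<', \$text or die "Cannot open in-memory handle: $!";

my @paragraphs;
{
    local $/ = "\n-----\n";    # a "line" now ends at the separator
    while (my $paragraph = <$in>) {
        chomp $paragraph;      # strips a trailing "\n-----\n" if present
        push @paragraphs, $paragraph;
    }
}
close $in;

# The last record has no separator after it, so it keeps its final "\n".
printf "Read %d paragraphs\n", scalar @paragraphs;
```

Because $/ is localized, any reads after the block see ordinary newline-terminated lines again.
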
Re: Big paragraphs in Perl
by kcott (Archbishop) on Jun 22, 2013 at 00:53 UTC

    G'day uajith,

    Welcome to the monastery.

    Replies already posted regarding what your concept of "big" is are valid.

    I'll just point out that Perl provides a way to read files in paragraph mode, as follows:

    {
        local $/ = "";
        while (<$file_handle>) {
            # $_ contains a paragraph of text read from $file_handle
            # Process each paragraph here
        }
    }

    Search for $INPUT_RECORD_SEPARATOR in perlvar for details.

    -- Ken
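
A minimal runnable sketch of paragraph mode, assuming paragraphs are separated by one or more blank lines. The sample text and the names $text, $fh and @paragraphs are invented for illustration, and an in-memory filehandle replaces the real $file_handle.

```perl
use strict;
use warnings;

# Hypothetical sample data: three paragraphs, with extra blank lines after the first.
my $text = "Para one, line one.\nPara one, line two.\n\n\n\nPara two.\n\nPara three.\n";

# An in-memory filehandle lets the sketch run without a real file.
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my @paragraphs;
{
    local $/ = "";             # paragraph mode: runs of blank lines end a record
    while (my $para = <$fh>) {
        chomp $para;           # in paragraph mode, chomp strips trailing newlines
        push @paragraphs, $para;
    }
}
close $fh;

printf "Read %d paragraphs\n", scalar @paragraphs;
```

Note that consecutive blank lines are treated as a single separator here, which differs from setting $/ to "\n\n".
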

Re: Big paragraphs in Perl
by irah (Pilgrim) on Jun 21, 2013 at 11:52 UTC

    If you are going to manipulate a text, then my choice would be to parse it line by line instead of paragraph by paragraph.
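
For comparison, a line-by-line pass might look like the sketch below. The sample text, the counters and the blank-line check are made up for illustration; the real processing depends on what the OP needs to do.

```perl
use strict;
use warnings;

# Hypothetical sample data with one blank line in it.
my $text = "first line\nsecond line\n\nfourth line\n";

# An in-memory filehandle lets the sketch run without a real file.
open my $fh, '<', \$text or die "Cannot open in-memory handle: $!";

my ($line_count, $blank_count) = (0, 0);
while (my $line = <$fh>) {
    chomp $line;                          # drop the trailing newline
    $line_count++;
    $blank_count++ if $line =~ /^\s*$/;   # count empty or whitespace-only lines
}
close $fh;

print "Lines: $line_count, blank: $blank_count\n";
```

Only one line is held in memory at a time, which matters for very large inputs but hardly at all for five paragraphs.
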

Re: Big paragraphs in Perl
by locked_user sundialsvc4 (Abbot) on Jun 21, 2013 at 13:00 UTC

    But even then, you could still split() the big text by paragraph if you come up with a suitable regular expression by which to do it. Really, the answer to this question is, “it’s up to you.” Perl certainly won’t blink at strings or other data structures that are many megabytes in size, including “paragraphs” which seem to you to be “big.”

    OP, if you will give us more details about what you are actually facing, we can step away from this water-cooler and help you.   Details, please.   :-)
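
One possible reading of the split() suggestion, as a sketch only: the whole text sits in a single scalar and is split on blank lines. The regular expression, the sample text and the names $text and @paragraphs are assumptions for illustration, not something the thread specifies.

```perl
use strict;
use warnings;

# Hypothetical sample data held in one variable; the second separator
# line contains a stray space to show why \s* is in the pattern.
my $text = "Alpha line one.\nAlpha line two.\n\nBeta.\n \nGamma.\n";

# Split wherever a newline is followed by optional whitespace and another newline.
my @paragraphs = split /\n\s*\n/, $text;

# Trim trailing whitespace, since the last piece keeps its final newline.
s/\s+\z// for @paragraphs;

printf "Split into %d paragraphs\n", scalar @paragraphs;
```

For text of this size, keeping everything in one scalar and splitting it costs next to nothing.
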

Re: Big paragraphs in Perl
by space_monk (Chaplain) on Jun 21, 2013 at 19:19 UTC
    That's not a big text... this is a big text!

    Crocodile Dundee quotes aside, it depends on what you are doing with your text. 5 paragraphs is nothing in terms of data size, but it's possible that if you tell us what you want to do with the text, we can tell you how it should be stored.

    If you spot any bugs in my solutions, it's because I've deliberately left them in as an exercise for the reader! :-)