in reply to Re: Efficiency: Finding if a file contains a paragraph
in thread Efficiency: Finding if a file contains a paragraph

>Are any of your files huge?

No, there are just a lot of them.

>Are any of your paragraphs huge?

No.

>Are your paragraphs delineated with a single blank line?

Yes.

>Are you looking for an exact paragraph match (eg exact spaces, punctuation, and newlines)?

Yes. Basically, I'm writing configuration paragraphs into configuration files. If the paragraph already exists (a distinct possibility), I don't want to rewrite it, but if it doesn't, I want to add it.
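For illustration, the "add it only if it isn't already there" step could be sketched roughly as below. This is a minimal sketch and not code from this thread: it is in Python purely as an example, the names `contains_paragraph` and `ensure_paragraph` are made up, and it assumes the files are small enough to read whole and that paragraphs are separated by a single blank line, as described above.

```python
from pathlib import Path


def contains_paragraph(text: str, paragraph: str) -> bool:
    """Return True if `paragraph` is one of the blank-line-delimited
    paragraphs of `text`, compared character for character."""
    target = paragraph.strip("\n")
    # Only surrounding newlines are ignored; spaces and punctuation must match.
    return any(chunk.strip("\n") == target for chunk in text.split("\n\n"))


def ensure_paragraph(path: Path, paragraph: str) -> bool:
    """Append `paragraph` to the file unless it is already present.
    Returns True if the file was changed (hypothetical helper)."""
    text = path.read_text() if path.exists() else ""
    if contains_paragraph(text, paragraph):
        return False
    body = text.rstrip("\n")
    separator = "\n\n" if body else ""  # keep exactly one blank line between paragraphs
    path.write_text(body + separator + paragraph.strip("\n") + "\n")
    return True
```

Calling `ensure_paragraph(Path("some.conf"), new_paragraph)` over each file would then leave files that already hold the paragraph untouched. Comparing whole paragraphs, rather than searching for a substring, avoids a false hit when the new paragraph happens to be a prefix of a longer one.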

CT

Charles Thomas
Madison, WI

Re: Re: Re: Efficiency: Finding if a file contains a paragraph
by revdiablo (Prior) on Jun 02, 2004 at 20:03 UTC
    Are you looking for an exact paragraph match (eg exact spaces, punctuation, and newlines)?
    Yes. Basically I'm writing configuration paragraphs into configuration files.

    The problem with an exact match is that two configuration paragraphs may be semantically equivalent but syntactically different, e.g. with their lines in a different order or with different whitespace. It seems to me that the only foolproof way to avoid duplicates would be an actual parser for the config files. Unless you can ensure that semantically equal paragraphs will always be syntactically identical, you might end up creating duplicates. And with as many files as you say you have, cleaning up any such mistakes afterwards would be just as hard.
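Short of a real parser, one partial workaround is to compare paragraphs in a normalized form. The sketch below is only an illustration (in Python, with made-up names `canonical` and `equivalent`) and rests on a strong assumption: that line order and the amount of whitespace never change a paragraph's meaning. Many config formats are order-sensitive, so this would not be safe for them.

```python
def canonical(paragraph: str) -> tuple:
    """Reduce a paragraph to a form that ignores line order, blank lines,
    and runs of whitespace (assumed, not known, to be insignificant)."""
    lines = (" ".join(line.split()) for line in paragraph.splitlines())
    return tuple(sorted(line for line in lines if line))


def equivalent(a: str, b: str) -> bool:
    """True if both paragraphs normalize to the same set of lines."""
    return canonical(a) == canonical(b)
```

Note that this still treats `x = 1` and `x=1` as different, since it only collapses existing whitespace. Catching that kind of difference is where a parser for the actual config syntax becomes necessary.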

    Just something to think about...