in reply to Match a string that can contain a carriage return in a random position.

Keep in mind that you are reading an LDIF file. This means that the line splitting:

1. Will not always happen at the 80th character
2. Can happen to any LDAP attribute, not just the DN

The following two ideas may help you:
1. Use a Perl API for reading LDIF (perldap-1.4.1 can do this). You will have to experiment with the API calls to keep them from reading the whole 2 GB file at once.

Pros: It should work, and you can even write the resulting LDAP Entry objects back to the server.

Cons: The Mozilla LDAP API is built on compiled C libraries, but I have no idea how fast the resulting code will be.

2. As far as I'm aware, when a line in LDIF is split, the next line begins with a space. It should be very easy for you to write a filter program that will 'unsplit' a big LDIF file. (BTW: personally, I think that changing the line break separator is totally the wrong way to go here.)
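
To illustrate the 'unsplit' filter idea, here is a minimal sketch in Python (the same logic ports to Perl in a few lines). It assumes the usual LDIF folding convention, where a continuation line starts with exactly one space that gets dropped when the line is rejoined. The function name `unfold` is made up for this example. It reads one line at a time, so the whole file never has to fit in memory.

```python
import sys

def unfold(lines):
    """Yield logical LDIF lines, joining folded continuation lines.

    Assumption: a line starting with a single space continues the
    previous line, and that leading space is not part of the value.
    """
    pending = None  # the logical line being assembled
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(" ") and pending is not None:
            # Continuation: drop the leading space and append.
            pending += line[1:]
        else:
            # A new logical line starts; emit the finished one.
            if pending is not None:
                yield pending
            pending = line
    if pending is not None:
        yield pending

if __name__ == "__main__":
    # Filter mode: stream stdin to stdout one logical line at a time.
    for logical in unfold(sys.stdin):
        sys.stdout.write(logical + "\n")
```

Saved as, say, unfold.py, it would be used as a plain filter: `python unfold.py < big.ldif > unfolded.ldif`. After that, every attribute sits on a single line and ordinary line-based matching works.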


P.S.: Oh, and you can try looking for a command-line switch that will prevent your whatever2ldif utility from doing the splitting ;-)