topperge has asked for the wisdom of the Perl Monks concerning the following question:

I have an ldif file that I need to edit:

dn: cn=MTOP,cn=users,dc=myco,dc=com
orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
cn: MTOP
orclactivestartdate: 20031028060637Z
objectclass: top
objectclass: person
objectclass: inetOrgPerson
objectclass: organizationalPerson
objectclass: orclUser
objectclass: orclUserV2
sn: Joe
givenname: Blow
o: myo
userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

dn: cn=me,cn=users,dc=polk,dc=com
orcldefaultprofilegroup: cn=user_grp,cn=portal_groups,cn=groups,dc=myco,dc=com
userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==
orclactivestartdate: 20031030114555Z
objectclass: top
objectclass: person
objectclass: inetOrgPerson
objectclass: organizationalPerson
objectclass: orclUser
objectclass: orclUserV2
cn: me
uid: me
sn: Administrator
givenname: me
o: myco

I need to take every line I find with
cn: <name here>
copy it and add a line right after it that reads
uid: <name here>

i.e. the final output should look like

dn: cn=MTOP,cn=users,dc=myco,dc=com
orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
cn: MTOP
uid: MTOP
orclactivestartdate: 20031028060637Z
objectclass: top
objectclass: person
objectclass: inetOrgPerson
objectclass: organizationalPerson
objectclass: orclUser
objectclass: orclUserV2
sn: Joe
givenname: Blow
o: myo
userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

In some records there is already a uid, in others there is only a cn, and in others there is no cn at all; that last case needs to be caught as well. The number of lines in a record varies too. I'm stumped on this one, guys. Any ideas?

Re: Perl File Manipulation
by davido (Cardinal) on Jul 09, 2004 at 03:53 UTC

    As a one liner...

    perl -pi.bak -e 's/^cn:\s*(.+)\n$/cn: $1\nuid: $1\n/;' file.name

    See perlrun, perlrequick, and perlretut for more information on this one-liner's technique.

    The -p switch causes Perl to iterate through all the files in @ARGV, line by line, reading each line into $_ and printing $_ at the end of each loop iteration. The -i switch specifies "in-place editing": while a file is being processed, the printed output goes to a new file rather than to STDOUT. Because an extension was given (.bak), the original is kept with '.bak' appended to its name, and the new file ends up with the original file's name. So when you're all done, it looks like your original file got modified, and its pre-modification version was saved with the .bak filename extension.
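    For comparison, the switches can be spelled out as an ordinary script. This is only a rough sketch of the equivalent loop, not what perl literally generates; the sub name add_uid_in_place is made up for illustration, and $^I is the variable that holds the -i extension.

```perl
use strict;
use warnings;

# Hypothetical helper: spell out what -p and -i.bak do for a single file.
sub add_uid_in_place {
    my ($file) = @_;
    local $^I   = '.bak';      # same effect as -i.bak
    local @ARGV = ($file);     # <> reads the files named in @ARGV
    while (<>) {                               # -p: read each line into $_
        # the -e code; \s* keeps the leading space out of the captured value
        s/^cn:\s*(.+)\n$/cn: $1\nuid: $1\n/;
        print;                                 # -p: print $_ (lands in the edited file)
    }
    return;
}
```

    Calling add_uid_in_place('file.name') should leave the edited text in file.name and the untouched original in file.name.bak.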

    Update: I just realized this doesn't solve the issue of the uid: already existing in some places. Sorry.


    Dave

Re: Perl File Manipulation
by Zaxo (Archbishop) on Jul 09, 2004 at 04:11 UTC

    The one-liner:

    perl -ni -e'print; print "uid:", $1, $/ if /^cn:(.*)/;' foo.file

    Update: Argh, I missed that bit, too. Off the top of my head (untested):

    perl -ni -e'print; if (/^cn:(.*)/) { my $uid = "uid:$1\n"; my $nl = <>; print $nl =~ /^uid:/ ? "": $uid, $nl}' foo.file

    After Compline,
    Zaxo

Re: Perl File Manipulation
by wfsp (Abbot) on Jul 09, 2004 at 07:22 UTC
    Would this help?
    use strict;
    use warnings;

    open OUTPUT, '>', 'output.txt' or die;

    my @record;
    my $uid_flag = 0;

    while (<DATA>){
        chomp;
        unless ( /^$/ ){
            push @record, $_;
            $uid_flag++ if /^uid/;
        }
        else{
            for ( @record ){
                print OUTPUT "$_\n";
                if ( /^cn/ and ! $uid_flag ){
                    my ( $field, $value ) = split ':';
                    print OUTPUT "uid:$value\n";
                }
            }
            @record = ();
            $uid_flag = 0;
            print OUTPUT "\n";
        }
    }
    close OUTPUT;
    __DATA__
    dn: cn=MTOP,cn=users,dc=myco,dc=com
    orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
    cn: MTOP
    orclactivestartdate: 20031028060637Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    sn: Joe
    givenname: Blow
    o: myo
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

    dn: cn=me,cn=users,dc=polk,dc=com
    orcldefaultprofilegroup: cn=user_grp,cn=portal_groups,cn=groups,dc=myco,dc=com
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==
    orclactivestartdate: 20031030114555Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    cn: me
    uid: me
    sn: Administrator
    givenname: me
    o: myco

    Produces...
    dn: cn=MTOP,cn=users,dc=myco,dc=com
    orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
    cn: MTOP
    uid: MTOP
    orclactivestartdate: 20031028060637Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    sn: Joe
    givenname: Blow
    o: myo
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

    dn: cn=me,cn=users,dc=polk,dc=com
    orcldefaultprofilegroup: cn=user_grp,cn=portal_groups,cn=groups,dc=myco,dc=com
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==
    orclactivestartdate: 20031030114555Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    cn: me
    uid: me
    sn: Administrator
    givenname: me
    o: myco
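
    The same buffering idea can also be wrapped in a sub that takes the LDIF text and returns the result, which makes it easy to flush the last record even when the input does not end with a blank line. This is a sketch only: the name add_uid_buffered is hypothetical, and it assumes blank-line-separated records, lowercase attribute names, and no folded (continued) lines.

```perl
use strict;
use warnings;

# Hypothetical helper: buffer one record at a time and add a uid line after
# the cn line, but only when the record has no uid line of its own.
sub add_uid_buffered {
    my ($ldif) = @_;
    my ( @record, @out );

    # Emit the buffered record, inserting "uid: <cn value>" where needed.
    my $flush = sub {
        return unless @record;
        my $has_uid = grep { /^uid:/ } @record;
        for my $line (@record) {
            push @out, $line;
            push @out, "uid: $1" if !$has_uid && $line =~ /^cn:\s*(.*)/;
        }
        @record = ();
    };

    # The -1 limit keeps trailing empty fields, so blank lines survive.
    for my $line ( split /\n/, $ldif, -1 ) {
        if ( $line eq '' ) { $flush->(); push @out, '' }
        else               { push @record, $line }
    }
    $flush->();    # last record, in case there was no trailing blank line
    return join "\n", @out;
}
```

    Usage would be along the lines of print add_uid_buffered($text), with $text holding the contents of the LDIF file.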
Re: Perl File Manipulation
by graff (Chancellor) on Jul 09, 2004 at 12:57 UTC
    If the data format is consistent in terms of always having at least one blank line between consecutive records, and never having a blank line within a record, then you can set the INPUT_RECORD_SEPARATOR ($/) to read one whole record at a time into $_, and this can simplify things a lot:

    {
        local $/ = "";  # set to empty string -- see perldoc perlvar
        while (<>) {
            if ( /(\ncn: (\S+))\n(\S+)/ ) {
                my ( $cnline, $cnval, $nexttag ) = ( $1, $2, $3 );
                if ( $nexttag ne "uid" ) {
                    s/$cnline/$cnline\nuid: $cnval/;
                }
            }
            else {
                warn "Record $. has no cn value\n";
            }
            print;
        }
    }

    (not tested)
      The line:
      if ( $nexttag ne "uid" ) {
      needs to be:
      if ( $nexttag ne "uid:" ) {
      (tested!)
      Picky or what!
      I'm not familiar with the 'LDAP Data Interchange Format' (to say the least) but is 'uid' always bound to follow 'cn'?
      wfsp
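
      A small check of why the colon matters in that comparison: \S+ runs up to the next whitespace, so the tag it captures includes the trailing colon. The sub name tag_after_cn is made up for this illustration; the pattern is the one from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: return whatever the third capture group picks up
# on the line following the cn line, or undef if the pattern does not match.
sub tag_after_cn {
    my ($record) = @_;
    return $record =~ /(\ncn: (\S+))\n(\S+)/ ? $3 : undef;
}

# prints "uid:" (with the colon), which is why comparing against "uid" never matched
print tag_after_cn("dn: cn=me\ncn: me\nuid: me\nsn: Administrator\n"), "\n";
```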
        Thanks for the fix (and the test).
        I'm not familiar with the 'LDAP Data Interchange Format' (to say the least) but is 'uid' always bound to follow 'cn'?

        I don't know either, but if it varies, reading the data one whole record at a time will make it a lot easier to handle the variation.
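
        As a sketch of that point: with the whole record in hand, the check for an existing uid no longer depends on where the uid line sits. The sub name process_record is hypothetical, and the code assumes lowercase attribute names, one cn per record, and no folded lines.

```perl
use strict;
use warnings;

# Hypothetical helper: given one whole record, add "uid: <cn value>" right
# after the cn line unless the record already has a uid line anywhere.
sub process_record {
    my ($rec) = @_;
    return $rec if $rec =~ /^uid:/m;             # uid present, leave record alone
    $rec =~ s/^(cn:[ \t]*(.*))$/$1\nuid: $2/m    # first cn line only
        or warn "Record has no cn value\n";
    return $rec;
}

# Demo on an in-memory file; the second record has uid before cn.
my $ldif = "dn: cn=MTOP\ncn: MTOP\nsn: Joe\n\ndn: cn=me\nuid: me\ncn: me\n";
open my $in, '<', \$ldif or die "Cannot open in-memory file: $!";
{
    local $/ = "";    # paragraph mode: one record per read
    print process_record($_) while <$in>;
}
close $in;
```

        To run it over real files, the in-memory handle would be replaced by <> (or a handle opened on the LDIF file).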