Perl File Manipulation

topperge has asked for the wisdom of the Perl Monks concerning the following question:

I have an ldif file that I need to edit:

dn: cn=MTOP,cn=users,dc=myco,dc=com
orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
cn: MTOP
orclactivestartdate: 20031028060637Z
objectclass: top
objectclass: person
objectclass: inetOrgPerson
objectclass: organizationalPerson
objectclass: orclUser
objectclass: orclUserV2
sn: Joe
givenname: Blow
o: myo
userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

dn: cn=me,cn=users,dc=polk,dc=com
orcldefaultprofilegroup: cn=user_grp,cn=portal_groups,cn=groups,dc=myco,dc=com
userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==
orclactivestartdate: 20031030114555Z
objectclass: top
objectclass: person
objectclass: inetOrgPerson
objectclass: organizationalPerson
objectclass: orclUser
objectclass: orclUserV2
cn: me
uid: me
sn: Administrator
givenname: me
o: myco

I need to take every line I find with
cn: <name here>
copy it and add a line right after it that reads
uid: <name here>

i.e. the final output should look like

dn: cn=MTOP,cn=users,dc=myco,dc=com
orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
cn: MTOP
uid: MTOP
orclactivestartdate: 20031028060637Z
objectclass: top
objectclass: person
objectclass: inetOrgPerson
objectclass: organizationalPerson
objectclass: orclUser
objectclass: orclUserV2
sn: Joe
givenname: Blow
o: myo
userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

In some cases there is already a uid, in other cases there is only a cn, and in other cases there is not, and that needs to be captured. The number of lines in a record is variable as well. I'm stumped on this one, guys. Any ideas?

Re: Perl File Manipulation by davido (Cardinal) on Jul 09, 2004 at 03:53 UTC
As a one liner...

`perl -pi.bak -e 's/^cn:(.+)\n$/cn: $1\nuid: $1\n/;' file.name`

See perlrun, perlreintro, and perlretut for more information on this one liner's technique.

The -p switch causes Perl to iterate through all the files in @ARGV, line by line, reading each line into $_, and then at the end of each loop iteration printing $_. The -i switch specifies "inline editing", which is implemented by opening a tempfile for output, and the input file for...input. After each iteration, $_ is printed to the temp file. Then at the end of script execution the original file is renamed with '.bak' appended to its name, and the tempfile is renamed to what the original file's name was. So when you're all done, it looks like your original file got modified, and its pre-modification version was saved with the .bak filename extension.

Update: I just realized this doesn't solve the issue of the uid: already existing in some places. Sorry.

Dave

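For anyone who has not met -p before, the loop it wraps around the -e code can be written out by hand. What follows is only a minimal sketch of that idea, not what perl does internally: the helper name filter_cn and the in-memory sample are made up for illustration, and the `\s*` after `cn:` is there so the captured name does not carry the leading space. Like the one liner, it does not check whether a uid line already exists.

```perl
use strict;
use warnings;

# Hypothetical helper: reads $in line by line, applies the substitution,
# and prints every line (changed or not) to $out, much as the implicit
# -p loop does with $_.
sub filter_cn {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        # Turn "cn: NAME\n" into "cn: NAME\nuid: NAME\n"; other lines pass through.
        $line =~ s/^cn:\s*(.+)\n$/cn: $1\nuid: $1\n/;
        print {$out} $line or die "print failed: $!\n";
    }
}

# Demo on an in-memory "file" (abbreviated from the question's data),
# so nothing on disk is touched.
my $sample = "dn: cn=MTOP,cn=users,dc=myco,dc=com\ncn: MTOP\nsn: Joe\n";
open my $in, '<', \$sample or die "cannot open in-memory input: $!";
my $result = '';
open my $out, '>', \$result or die "cannot open in-memory output: $!";
filter_cn($in, $out);
close $out;
print $result;
```

To run it against a real file, the two in-memory handles could be swapped for handles on the input file and a new output file; the .bak renaming that -i provides would have to be done separately.
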
Re: Perl File Manipulation by Zaxo (Archbishop) on Jul 09, 2004 at 04:11 UTC
The one-liner,

`perl -ni -e'print; print "uid:", $1, $/ if /^cn:(.*)/;' foo.file`

Update: Argh, I missed that bit, too. Off the top of my head,

`perl -ni -e'print; if (/^cn:(.*)/) { my $uid = "uid:$1\n"; my $nl = <>; print $nl =~ /^uid:/ ? "": $uid, $nl}' foo.file`

Untested.

After Compline,
Zaxo

Re: Perl File Manipulation by wfsp (Abbot) on Jul 09, 2004 at 07:22 UTC
Would this help?

    use strict;
    use warnings;

    open OUTPUT, '>', 'output.txt' or die;

    my @record;
    my $uid_flag = 0;

    # A record is written out when the blank line after it is read,
    # so the input must end with a blank line.
    while (<DATA>){
      chomp;
      unless ( /^$/ ){
        push @record, $_;
        $uid_flag++ if /^uid/;
      }
      else{
        for ( @record ){
          print OUTPUT "$_\n";
          if ( /^cn/ and ! $uid_flag ){
            my ( $field, $value ) = split ':';
            print OUTPUT "uid:$value\n";
          }
        }
        @record = ();
        $uid_flag = 0;
        print OUTPUT "\n";
      }
    }
    close OUTPUT;
    __DATA__
    dn: cn=MTOP,cn=users,dc=myco,dc=com
    orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
    cn: MTOP
    orclactivestartdate: 20031028060637Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    sn: Joe
    givenname: Blow
    o: myo
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

    dn: cn=me,cn=users,dc=polk,dc=com
    orcldefaultprofilegroup: cn=user_grp,cn=portal_groups,cn=groups,dc=myco,dc=com
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==
    orclactivestartdate: 20031030114555Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    cn: me
    uid: me
    sn: Administrator
    givenname: me
    o: myco

Produces...

    dn: cn=MTOP,cn=users,dc=myco,dc=com
    orcldefaultprofilegroup: cn=myco,cn=portal_groups,cn=groups,dc=myco,dc=com
    cn: MTOP
    uid: MTOP
    orclactivestartdate: 20031028060637Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    sn: Joe
    givenname: Blow
    o: myo
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==

    dn: cn=me,cn=users,dc=polk,dc=com
    orcldefaultprofilegroup: cn=user_grp,cn=portal_groups,cn=groups,dc=myco,dc=com
    userpassword: {MD4}oLezu+AceuHBDEBHFuVDwg==
    orclactivestartdate: 20031030114555Z
    objectclass: top
    objectclass: person
    objectclass: inetOrgPerson
    objectclass: organizationalPerson
    objectclass: orclUser
    objectclass: orclUserV2
    cn: me
    uid: me
    sn: Administrator
    givenname: me
    o: myco

Re: Perl File Manipulation by graff (Chancellor) on Jul 09, 2004 at 12:57 UTC
If the data format is consistent in terms of always having at least one blank line between consecutive records, and never having a blank line within a record, then you can set the INPUT_RECORD_SEPARATOR to read one whole record at a time into $_, and this can simplify things a lot:

    {
        local $/ = "";  # set to empty string -- see perldoc perlvar
        while (<>) {
            if ( /(\ncn: (\S+))\n(\S+)/ ) {
                my ( $cnline, $cnval, $nexttag ) = ( $1, $2, $3 );
                if ( $nexttag ne "uid" ) {
                    s/$cnline/$cnline\nuid: $cnval/;
                }
            }
            else {
                warn "Record $. has no cn value\n";
            }
            print;
        }
    }

(not tested)

Re^2: Perl File Manipulation by wfsp (Abbot) on Jul 09, 2004 at 18:03 UTC
The line: `if ( $nexttag ne "uid" ) {` needs to be: `if ( $nexttag ne "uid:" ) {` (tested!)

Picky or what!

I'm not familiar with the 'LDAP Data Interchange Format' (to say the least) but is 'uid' always bound to follow 'cn'?

wfsp

Re^3: Perl File Manipulation by graff (Chancellor) on Jul 09, 2004 at 22:20 UTC
Thanks for the fix (and the test).

"I'm not familiar with the 'LDAP Data Interchange Format' (to say the least) but is 'uid' always bound to follow 'cn'?"

I don't know either, but if it varies, reading the data one whole record at a time will make it a lot easier to handle the variation.

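To make the "one whole record at a time" point concrete, here is a rough sketch that looks for a uid line anywhere in the record instead of only on the line after cn. It is not the code posted above: the helper name add_uid and the abbreviated sample data are hypothetical, and it assumes plain `cn: value` lines (base64 `cn::` values and folded lines are not handled).

```perl
use strict;
use warnings;

# Hypothetical helper: takes one record (a block of lines with no blank
# line inside it) and returns it with "uid: NAME" added right after the
# first cn line, unless the record already has a uid line somewhere.
sub add_uid {
    my ($record) = @_;
    return $record if $record =~ /^uid:/mi;    # uid already present, leave as is
    # /m lets ^ and $ match at each line inside the record.
    $record =~ s/^(cn:[ \t]*(.*))$/$1\nuid: $2/mi
        or warn "record has no cn line\n";
    return $record;
}

# Abbreviated sample: the second record already has a uid, placed before cn.
my $ldif = <<'END';
dn: cn=MTOP,cn=users,dc=myco,dc=com
cn: MTOP
sn: Joe

dn: cn=me,cn=users,dc=polk,dc=com
uid: me
cn: me
END

open my $in, '<', \$ldif or die "cannot open in-memory input: $!";
my $output = '';
{
    local $/ = "";    # paragraph mode: each read returns one whole record
    while (my $record = <$in>) {
        $output .= add_uid($record);
    }
}
print $output;
```

The first record gains a uid line after its cn line and the second is passed through unchanged. Reading from `<>` instead of the in-memory handle would let the same loop run over a file named on the command line.
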
Re^4: Perl File Manipulation by topperge (Initiate) on Jul 11, 2004 at 17:44 UTC
Re^5: Perl File Manipulation by graff (Chancellor) on Jul 12, 2004 at 02:00 UTC