khippy has asked for the wisdom of the Perl Monks concerning the following question:

Hi perlmongers,

my current problem is the following:

I have an LDIF input file from an LDAP database, from which I want to extract the group names, GIDs and members.

The structure is like this:

<some irrelevant data or the same like downwards>

dn: cn=groupname1,dc=domain,dc=com
cn: groupname1
gidNumber: 122
memberUid: member1
memberUid: member2
memberUid: member3
memberUid: member4
memberUid: member6
memberUid: member7
userPassword:: Kg==
objectClass: top
objectClass: posixGroup
creatorsName: uid=cyrus,dc=domain,dc=com
createTimestamp: 20040408083004Z
modifiersName: uid=cyrus,dc=domain,dc=com
modifyTimestamp: 20040408083004Z

<more irrelevant data or the same like above>

dn: cn=groupname2,dc=domain,dc=com
cn: groupname2
gidNumber: 113
userPassword:: Kg==
objectClass: top
objectClass: posixGroup
creatorsName: uid=cyrus,dc=domain,dc=com
createTimestamp: 20031208140152Z
memberUid: member1
memberUid: member2
memberUid: member3
memberUid: member4
memberUid: member6
memberUid: member7
description: some irrelevant description
modifiersName: uid=cyrus,dc=domain,dc=com
modifyTimestamp: 20040404112251Z

<more irrelevant data or the same like above>

Each group of information is separated from the others by an empty line, as you can see.

Now I want to run a regexp across the file and, from each group of information starting with

dn: cn=<groupname>,dc=domain,dc=com

collect the following data:
cn: <groupname>
gidNumber: <guid>
memberUid: <membername>

The aim is to collect this data for another Cyrus IMAP mail server, which gets its group information from a MySQL database.

How can I do this?

--

there are no silly questions

killerhippy

Replies are listed 'Best First'.
Re: regexp across a ldap ldif file, collecting cn: guid: memberUID:
by bart (Canon) on Aug 21, 2004 at 20:44 UTC
    Like I said in the CB: read the file in paragraph mode and extract the sections with /^PATTERN/m. Example code:
    #! perl -w

    $/ = "";  # paragraph mode
    while(<DATA>) {
        if(/^dn:/) {  # skip irrelevant paragraphs
            my($cn) = /^cn:\s*(.*)/m;
            my($gid) = /^gidNumber:\s*(\d+)/m;
            my(@member) = /^memberUid:\s*(.+)/mg;
            print "cn: $cn\ngid: $gid\nmembers: @member\n\n";
        }
    }
    __DATA__
    *** Your sample data follows ***
    The printout I get is:
    cn: groupname1
    gid: 122
    members: member1 member2 member3 member4 member6 member7

    cn: groupname2
    gid: 113
    members: member1 member2 member3 member4 member6 member7
      Thank you for your code and hint. Updated, see below.

      I tried the code as is, and I get a lot of
      Use of uninitialized value in concatenation (.) or string at ./data line 9, <DATA> chunk <number of data line>.
      and a lot of empty hits
      cn:
      gid:
      members:
      as well as mixed-up data
      cn: username
      gid: 102
      members:
      but also hits that are correct.
      This is probably caused by dn: appearing in irrelevant data blocks.

      How can I look at a block only if its very first line starts with "dn: cn"? That would skip the irrelevant data blocks.

      Update:
      if(/^dn: cn/) { # skip irrelevant paragraphs
      does exactly what I need. Paragraph mode is the one!
      supercool
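For anyone who wants to collect the data instead of printing it, here is a minimal sketch of the same paragraph-mode approach. The helper name parse_groups, the in-memory sample and the tab-separated output are only assumptions for illustration and are not part of bart's code. It tests for "dn: cn=" without /m, so ^ anchors at the start of the paragraph only, and it skips entries that lack a cn or gidNumber, which should avoid the "uninitialized value" warnings.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: parse LDIF text, return one hashref per group entry.
sub parse_groups {
    my ($ldif) = @_;
    # Read from an in-memory filehandle so the sketch needs no real file.
    open my $fh, '<', \$ldif or die "Cannot open in-memory handle: $!";
    local $/ = "";    # paragraph mode: one blank-line-separated entry per read
    my @groups;
    while (my $entry = <$fh>) {
        # No /m here, so ^ matches only at the very start of the entry.
        next unless $entry =~ /^dn: cn=/;
        my ($cn)  = $entry =~ /^cn:\s*(.*)/m;
        my ($gid) = $entry =~ /^gidNumber:\s*(\d+)/m;
        next unless defined $cn && defined $gid;    # skip incomplete entries
        my @members = $entry =~ /^memberUid:\s*(.+)/mg;
        push @groups, { cn => $cn, gid => $gid, members => \@members };
    }
    close $fh;
    return @groups;
}

# Made-up sample: one non-group entry, then one group entry.
my $sample = <<'LDIF';
dn: uid=someone,dc=domain,dc=com
uid: someone
gidNumber: 102

dn: cn=groupname1,dc=domain,dc=com
cn: groupname1
gidNumber: 122
memberUid: member1
memberUid: member2
LDIF

# One tab-separated row per (group, gid, member).
for my $g (parse_groups($sample)) {
    print join("\t", $g->{cn}, $g->{gid}, $_), "\n" for @{ $g->{members} };
}
```

This sketch does not handle folded LDIF lines (continuation lines starting with a space) or base64 values written with "::". If the real file contains those, a dedicated parser such as Net::LDAP::LDIF from CPAN is probably the safer choice.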
      --

      there are no silly questions

      killerhippy