in reply to Need Some help with finding a word in a file

If you know what comes directly before the information you need, try changing the input record separator ($/) to that text. And any time you think "unique", you most likely want a hash.

#!/usr/bin/perl
use warnings;
use strict;

# Split the input on the text that precedes each value,
# so every record after the first starts with the quoted alias.
$/ = 'authDataAlias=';

my %no_dupes;

foreach my $line (<DATA>) {
    if ($line =~ m/^"(.*?_DM\S+)"/i) {
        # Hash keys are unique, so duplicates collapse automatically.
        $no_dupes{$1} = 0;
    }
}

print "$_\n" for keys %no_dupes;

__DATA__
<factories xmi:type="resources.jdbc:CMPConnectorFactory" xmi:id="CMPConnectorFactory_1195273978412" name="dataSource" authMechanismPreference="BASIC_PASSWORD" authDataAlias="cell-tstc-65_DM/userQ" connectionDefinition="ConnectionDefinition_1054132487569" cmpDatasource="DataSource_1195273954323"><factories xmi:type="resources.jdbc:CMPConnectorFactory" xmi:id="CMPConnectorFactory_1195273978412" name="dataSource" authMechanismPreference="BASIC_PASSWORD" authDataAlias="cell-tstc-65_DM/userQ" connectionDefinition="ConnectionDefinition_1054132487569" cmpDatasource="DataSource_1195273954323"><factories xmi:type="resources.jdbc:CMPConnectorFactory" xmi:id="CMPConnectorFactory_1195273978412" name="dataSource" authMechanismPreference="BASIC_PASSWORD" authDataAlias="cell-tstc-65_DM/userF" connectionDefinition="ConnectionDefinition_1054132487569" cmpDatasource="DataSource_1195273954323">
<factories xmi:type="resources.jdbc:CMPConnectorFactory" xmi:id="CMPConnectorFactory_1195273978412" name="dataSource" authMechanismPreference="BASIC_PASSWORD" authDataAlias="node-tstc-65_DM/userF" connectionDefinition="ConnectionDefinition_1054132487569" cmpDatasource="DataSource_1195273954323">
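
One caveat worth sketching: assigning to $/ at file scope changes it for every later read in the program. A minimal variation, assuming you are happy to wrap the loop in a sub, is to use local so the old separator comes back automatically. The sub name unique_aliases and the sample string are hypothetical, made up for illustration; the regex is the one from the script above.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: returns the unique *_DM aliases read from a filehandle.
sub unique_aliases {
    my ($fh) = @_;

    # local restores the previous value of $/ when the sub returns.
    local $/ = 'authDataAlias=';

    my %seen;
    while (my $record = <$fh>) {
        # Each record after the first starts with the quoted alias value.
        $seen{$1} = 1 if $record =~ m/^"(.*?_DM\S+)"/i;
    }
    return sort keys %seen;
}

# Demo on an in-memory filehandle; the sample text is invented for this sketch.
my $sample = 'a authDataAlias="cell-tstc-65_DM/userQ" b authDataAlias="" '
           . 'c authDataAlias="cell-tstc-65_DM/userQ" d authDataAlias="node-tstc-65_DM/userF"';
open my $fh, '<', \$sample or die "Cannot open in-memory handle: $!";
print "$_\n" for unique_aliases($fh);
close $fh;
```

For a real file, open it with open my $fh, '<', $path and pass the handle in the same way.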

Re^2: Need Some help with finding a word in a file
by was6guy (Initiate) on Nov 29, 2007 at 21:33 UTC
    I think I need help with my regex. Some of the strings have a null value and some have a value; I need the ones that look like this:

    authDataAlias="cell-tstc-65_DM/userQ"

    I'm only concerned with: cell-tstc-65_DM/userQ

    If I do this, sed returns a blank line in the file, since one of the authDataAlias strings is set to ="":

    ($line =~ m/authDataAlias=\"([^\"]*)\"/i)

    cell-tstc-65_DM/userQ
    cell-tstc-65_DM/user1

    If I run this sed command, it ignores the empty string, but returns too much of the line:

    ($line =~ m/authDataAlias=(.*-.*-.*_DM\/.*)/i)

    "cell-tstc-65_DM/userQ" connectionDefinition="ConnectionDefinition_1054132487569" cmpDatasource="DataSource_1195273954323">
    "cell-tstc-65_DM/user1" relationalResourceAdapter="builtin_rra" statementCacheSize="10" datasourceHelperClassname="com.ibm.websphere.rsadapter.DB2UniversalDataStoreHelper">
      Your example above works perfectly. THANK YOU!
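
For anyone hitting the same two problems with the regexes in this reply, here is a rough sketch of what is probably going on. [^"]* is allowed to match nothing, so the empty authDataAlias="" still matches and yields an empty capture. In the second attempt the trailing .* is greedy and nothing stops it at the closing quote, so it runs to the end of the line. Requiring the _DM/ part and using [^"] on both sides of it should avoid both. The helper name alias_from_line is hypothetical, and the sample lines are shortened from the ones quoted above.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper: returns the alias from one line, or undef when the
# attribute is missing, empty, or not of the form something_DM/something.
sub alias_from_line {
    my ($line) = @_;
    # [^"]* cannot cross the closing quote, so the capture ends with the
    # value; [^"]+ after the slash requires a non-empty user part.
    return $line =~ m/authDataAlias="([^"]*_DM\/[^"]+)"/i ? $1 : undef;
}

my @lines = (
    'authDataAlias="cell-tstc-65_DM/userQ" connectionDefinition="ConnectionDefinition_1054132487569"',
    'authDataAlias="" connectionDefinition="ConnectionDefinition_1054132487569"',
    'authDataAlias="cell-tstc-65_DM/user1" relationalResourceAdapter="builtin_rra"',
);

for my $line (@lines) {
    my $alias = alias_from_line($line);
    print "$alias\n" if defined $alias;
}
```

With these three sample lines it prints the two non-empty aliases without the surrounding quotes and skips the empty one. To drop duplicates as well, store each result as a hash key, as in the script at the top of the thread.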