in reply to simulating bash

I saw your post on the perl-beginners list, but deleted it, so I was hoping someone else would reply and I could work off that, so I'm glad you posted it here, too.

Here's my take:

# parse variable assignments out of the file
while ($ebuildcontents =~ /\b([-A-Z0-9_]+)=\"(.*?)\"{1}?/sgc) {
  $ebuildvars{$1} = $2;
}

I'm curious about your \"{1}? part there. Are quotes optional? I'd need to see how the config file is formatted to help you extract these variables appropriately. But let's move on to the troublesome part:

foreach (keys %ebuildvars) {
  $ebuildvars{$_} =~ s/\$\{?([-A-Z0-9_]+)\}?/$ebuildvars{$1}/gs;
}

The problem is that if VAR1 = "say $VAR2" and VAR2 = "hi $VAR3" and VAR3 = "jeff", you want to be able to expand ALL of these, so you get VAR1 = "say hi jeff", VAR2 = "hi jeff", and VAR3 = "jeff", right?

If so, let me know, and I'll help you write a solution. It's kind of related to a dependency tree...

_____________________________________________________
Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a job (NYC-area)
s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

Re: Re: simulating bash
by agaffney (Beadle) on May 13, 2004 at 15:21 UTC
    Well, the '\"{1}?' part was my attempt to make the match non-greedy. Although, I think the '(.*?)' right before it already accomplishes that. I just never removed it.
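A quick way to check that point: `{1}?` after the quote still means "exactly one quote", so it should not change what the pattern matches, and the non-greedy behaviour comes from `(.*?)`. The sketch below compares the three variants. The sample string and the helper name `values_for` are made up for illustration and are not taken from a real ebuild.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Collect the second capture (the value) of every match of $re in $text.
# values_for() is a hypothetical helper for this comparison only.
sub values_for {
    my ($re, $text) = @_;
    my @vals;
    while ($text =~ /$re/g) {
        push @vals, $2;
    }
    return @vals;
}

# Hypothetical sample input with two assignments on one line.
my $sample = qq{A="one" B="two"\n};

my @with_quant    = values_for(qr/\b([-A-Z0-9_]+)="(.*?)"{1}?/s, $sample);
my @without_quant = values_for(qr/\b([-A-Z0-9_]+)="(.*?)"/s,     $sample);
my @greedy        = values_for(qr/\b([-A-Z0-9_]+)="(.*)"/s,      $sample);

print "with {1}?   : @with_quant\n";
print "without     : @without_quant\n";
print "greedy (.*) : @greedy\n";
```

The first two lists should come out identical, while the greedy variant runs on to the last quote in the string.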

    Quote:
    The problem is that if VAR1 = "say $VAR2" and VAR2 = "hi $VAR3"
    and VAR3 = "jeff", you want to be able to expand ALL of these,
    so you get VAR1 = "say hi jeff", VAR2 = "hi jeff", and
    VAR3 = "jeff", right?
    

    That's exactly what I want to do.
      Ok, then. Here's a code block that demonstrates how to do it by using a dependency structure:

      #!/usr/bin/perl

      use strict;
      use warnings;

      my %vars;
      my %depend;

      while (<DATA>) {
        chomp;
        if (/^\s*(\w+)\s*=\s*(?:"([^"]*)"|(.*))/) {
          my ($name, $val) = ($1, $+);
          $vars{$name} = $val;
          $depend{$name} = [ $vars{$name} =~ /\$(\w+)/g ];
        }
      }

      expand(\%vars, \%depend);

      sub expand {
        my ($varhash, $dephash, $queue) = @_;
        $queue ||= [ keys %$varhash ];

        # for each item in the queue
        for (@$queue) {
          # make sure its dependencies are expanded
          expand($varhash, $dephash, $dephash->{$_});

          # then interpolate the variables in this item
          $varhash->{$_} =~ s/\$(\w+)/$varhash->{$1}/g;
        }
      }

      use Data::Dumper;
      print Dumper(\%vars);

      __DATA__
      VAR1 = "say $VAR2"
      VAR2 = "hi $VAR3"
      VAR3 = jeff

      You can see I parse my variables a little differently. If it's more complex for you, then so be it, but I used simple cases. I used the $+ variable because I want to match the string INSIDE quotes (not the quotes themselves) or else an un-quoted string, and I don't know which of these matches, so $+ gives me the last matching capture group. It's like writing defined($2) ? $2 : $3 in this case, but I didn't want to do that.
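To make the `$+` point concrete, here is a small sketch that runs the same pattern over a quoted and an unquoted line and checks that `$+` agrees with the explicit `defined($2) ? $2 : $3` form. The sub name `parse_assignment` is hypothetical and is not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Parse one "NAME = value" line; the value may be quoted or bare.
# Returns (name, value), or an empty list if the line does not match.
sub parse_assignment {
    my ($line) = @_;
    if ($line =~ /^\s*(\w+)\s*=\s*(?:"([^"]*)"|(.*))/) {
        my $via_plus    = $+;                     # highest group that took part in the match
        my $via_ternary = defined($2) ? $2 : $3;  # the explicit spelling
        die "unexpected mismatch" unless $via_plus eq $via_ternary;
        return ($1, $via_plus);
    }
    return;
}

for my $line ('VAR1 = "say $VAR2"', 'VAR3 = jeff') {
    my ($name, $val) = parse_assignment($line);
    print "$name => [$val]\n";
}
```

For the quoted line the value comes back without its surrounding quotes; for the bare line it is whatever follows the equals sign.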

      You can see how expand() looks similar to the post-order traversal code I showed you a day or two ago. It's a similar concept. The function, when called with only the two hashrefs, uses the keys of the %vars hash as its queue. Each recursive call uses the dependent variables of the current one as its queue. I hope it's understandable.
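One property of this scheme worth spelling out: each recursive call relies on `$dephash->{$_}` being an array ref, even an empty one. If an entry is missing, `$queue ||= ...` falls back to every key again, and a variable that refers to itself leads straight back to itself. Below is a rough sketch of the same post-order idea with guards for both cases. It is only one possible approach, not the code from this post: `build_depend`, `expand_guarded`, the `$state` hash, and the `LOOP` test variable are all names invented for the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the dependency lists the same way as above: every $NAME found in a value.
sub build_depend {
    my ($varhash) = @_;
    my %depend;
    $depend{$_} = [ $varhash->{$_} =~ /\$(\w+)/g ] for keys %$varhash;
    return \%depend;
}

# Post-order expansion with two guards (both are assumptions of this sketch):
#  - a missing dependency list is treated as empty, not as "all keys"
#  - $state marks each name 1 (in progress) or 2 (done), which stops cycles
sub expand_guarded {
    my ($varhash, $dephash, $queue, $state) = @_;
    $queue ||= [ keys %$varhash ];
    $state ||= {};

    for my $name (@$queue) {
        next unless exists $varhash->{$name};   # unknown variable: nothing to expand
        next if $state->{$name};                # already done, or part of a cycle
        $state->{$name} = 1;

        # make sure the dependencies are expanded first
        expand_guarded($varhash, $dephash, $dephash->{$name} || [], $state);

        # interpolate only fully expanded names; anything else stays literal
        $varhash->{$name} =~
            s{\$(\w+)}{ ($state->{$1} || 0) == 2 ? $varhash->{$1} : "\$$1" }ge;

        $state->{$name} = 2;
    }
}

my %vars = (
    VAR1 => 'say $VAR2',
    VAR2 => 'hi $VAR3',
    VAR3 => 'jeff',
    LOOP => '$LOOP again',    # self-reference, hypothetical test case
);
expand_guarded(\%vars, build_depend(\%vars));
print "$_ = $vars{$_}\n" for sort keys %vars;
```

Leaving a cyclic or unknown reference as literal text is a design choice of this sketch. A real shell would substitute whatever value the variable held at that point, which may be empty.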

        I modified your code slightly to incorporate it into my program.

        sub expand_bash_var {
          my ($varhash, $dephash, $queue) = @_;
          $queue ||= [ keys %$varhash ];

          # for each item in the queue
          for (@$queue) {
            # make sure its dependencies are expanded
            expand_bash_var($varhash, $dephash, $dephash->{$_});

            # then interpolate the variables in this item
            $varhash->{$_} =~ s/\$\{?(\w+)\}?/$varhash->{$1}/g;
          }
        }

        sub get_depend {
          my $ebuildfname = shift;
          my $ebuildcontents;
          my %ebuildvars;
          my $pkgname = $ebuildfname;
          my %vardepend;

          # print "Getting depend for '$ebuildfname'\n";
          $pkgname =~ s|/usr/portage/||;
          $pkgname =~ s|(.+)/.+/(.+).ebuild|$1/$2|;
          my $pkg = parse_package_name($pkgname);
          $pkg->{version} =~ s/^-//;
          $ebuildvars{PV} = "$pkg->{version}$pkg->{suffix}";
          open EBUILD, "< $ebuildfname" or die "Couldn't open '$ebuildfname' to get DEPEND\n";
          while(<EBUILD>) {
            $ebuildcontents .= $_;
          }
          close EBUILD;
          while($ebuildcontents =~ /\b([-A-Z0-9_]+)\s*=\s*\"(.*?)\"/sgc) {
            $ebuildvars{$1} = $2;
          }
          print "Calling expand_bash_var() for '$ebuildfname'\n";
          expand_bash_var(\%ebuildvars, \%vardepend);
          print "Done with expand_bash_var() for '$ebuildfname'\n";
          my $depend = $ebuildvars{'DEPEND'} || '';
          $depend .= " $ebuildvars{'RDEPEND'}" if(defined $ebuildvars{'RDEPEND'});
          $depend =~ s/(\s+|\n+)/ /gs;
          print "$depend\n";
          return $depend;
        }

        which gives me the output:

        Deep recursion on subroutine "main::expand_bash_var" at ./portage.pl line 124.
        

        The file it is processing contains:

        RESTRICT="nostrip"
        IUSE="3dfx sse mmx 3dnow xml2 truetype nls cjk doc ipv6 debug static pam sdk bindist"

        filter-flags "-funroll-loops"
        ALLOWED_FLAGS="-fstack-protector -march -mcpu -O -O1 -O2 -O3 -pipe"
        strip-flags

        if [ "${ARCH}" = "x86" ]
        then
            if [ -z "${SYNAPTICS}" ]
            then
                SYNAPTICS="yes"
            fi
        else
            unset SYNAPTICS
        fi

        USE_SNAPSHOT="no"
        PATCH_VER="2.1.25.4"
        FT2_VER="2.1.3"
        XCUR_VER="0.3.1"
        SISDRV_VER="311003-1"
        SAVDRV_VER="1.1.27t"
        MGADRV_VER="1_3_0beta"
        VIADRV_VER="0.1"
        SYNDRV_VER="0.12.0"

        BASE_PV="${PV}"
        MY_SV="${BASE_PV//\.}"
        S="${WORKDIR}/xc"
        SYNDIR="${WORKDIR}/synaptics"
        SRC_PATH="mirror://xfree/${BASE_PV}/source"
        HOMEPAGE="http://www.xfree.org"

        X_PATCHES="http://dev.gentoo.org/~spyderous/xfree/patchsets/${PV}/XFree86-${PV}-patches-${PATCH_VER}.tar.bz2
            http://www.cpbotha.net/files/dri_resume/xfree86-dri-resume-v8.patch"

        X_DRIVERS="http://people.mandrakesoft.com/~flepied/projects/wacom/xf86Wacom.c.gz
            http://www.probo.com/timr/savage-${SAVDRV_VER}.zip
            http://www.winischhofer.net/sis/sis_drv_src_${SISDRV_VER}.tar.gz
            http://w1.894.telia.com/~u89404340/touchpad/files/synaptics-${SYNDRV_VER}.tar.bz2"

        MS_COREFONTS="./andale32.exe ./arial32.exe
            ./arialb32.exe ./comic32.exe
            ./courie32.exe ./georgi32.exe
            ./impact32.exe ./times32.exe
            ./trebuc32.exe ./verdan32.exe
            ./webdin32.exe"
        MS_FONT_URLS="${MS_COREFONTS//\.\//mirror://sourceforge/corefonts/}"

        SRC_URI="${SRC_PATH}/X${MY_SV}src-1.tgz
            ${SRC_PATH}/X${MY_SV}src-2.tgz
            ${SRC_PATH}/X${MY_SV}src-3.tgz
            ${SRC_PATH}/X${MY_SV}src-4.tgz
            ${SRC_PATH}/X${MY_SV}src-5.tgz
            doc? ( ${SRC_PATH}/X${MY_SV}src-6.tgz
                ${SRC_PATH}/X${MY_SV}src-7.tgz )"

        SRC_URI="${SRC_URI}
            ${X_PATCHES}
            ${X_DRIVERS}
            nls? ( mirror://gentoo/gemini-koi8-u.tar.bz2 )
            mirror://gentoo/eurofonts-X11.tar.bz2
            mirror://gentoo/xfsft-encodings.tar.bz2
            mirror://gentoo/XFree86-compose.dir-0.1.bz2
            mirror://gentoo/XFree86-en_US.UTF-8.old.bz2
            mirror://gentoo/XFree86-locale.alias.bz2
            mirror://gentoo/XFree86-locale.dir.bz2
            mirror://gentoo/gentoo-cursors-tad-${XCUR_VER}.tar.bz2
            truetype? ( !bindist? ( ${MS_FONT_URLS} ) )"

        LICENSE="Adobe-X CID DEC DEC-2 IBM-X NVIDIA-X NetBSD SGI UCB-LBL XC-2
            bigelow-holmes-urw-gmbh-luxi christopher-g-demetriou national-semiconductor
            nokia tektronix the-open-group todd-c-miller x-truetype xfree86-1.0
            MIT SGI-B BSD FTL | GPL-2 MSttfEULA"
        SLOT="0"
        KEYWORDS="x86 ppc sparc alpha mips hppa amd64 ia64"

        DEPEND=">=sys-apps/baselayout-1.8.3
            >=sys-apps/portage-2.0.50_pre9
            >=sys-libs/ncurses-5.1
            >=sys-libs/zlib-1.1.3-r2
            >=sys-devel/flex-2.5.4a-r5
            >=dev-libs/expat-1.95.3
            >=media-libs/freetype-${FT2_VER}-r2
            >=media-libs/fontconfig-2.1-r1
            >=x11-base/opengl-update-1.4
            >=x11-misc/ttmkfdir-3.0.4
            >=sys-apps/sed-4
            >=sys-devel/patch-2.5.9
            sys-apps/util-linux
            dev-lang/perl
            media-libs/libpng
            app-arch/unzip
            pam? ( >=sys-libs/pam-0.75 )
            truetype? ( !bindist? ( app-arch/cabextract ) )
            !virtual/x11
            !x11-libs/xft"

        PDEPEND="x86? ( 3dfx? ( >=media-libs/glide-v3-3.10 ) )"

        PROVIDE="virtual/x11
            virtual/opengl
            virtual/glu
            virtual/xft"

        DESCRIPTION="XFree86: free X server"

        PATCH_DIR=${WORKDIR}/patch

        That is pretty cool. I assume this is far faster than trying to pass the values through bash and reading the output back in.