G'day myfrndjk,

You have a number of issues with your regex.

($domain) = $url =~ m|www.([A-Z a-z 0-9]+.{3}).|x;

As $domain is assigned at the start of each iteration, and used within the body of the loop, it would make sense to get this part fixed first.

I'm guessing that, if the input was "www.example.com.au", the expected output should be either "example.com.au" or "example.com". Please clarify. (FYI: your code produces "example.co", see below.)

[Please take a look at the guidelines in "How do I post a question effectively?" for information on useful materials to include with your post. (In this specific instance, sample input and expected output would have been on the list.)]

This test code:

#!/usr/bin/env perl -l
use strict;
use warnings;

my $url = 'www.example.com.au';
my ($domain) = $url =~ m|www.([A-Z a-z 0-9]+.{3}).|x;
print '$domain=[', defined $domain ? $domain : '<undef>', ']';

produces this output:

$domain=[example.co]

Add these additional lines of code:

my ($alpha_num, $any_three, $final_dot) = $url =~ m|www.([A-Z a-z 0-9]+)(.{3})(.)|x;
print '$alpha_num=[', defined $alpha_num ? $alpha_num : '<undef>', ']';
print '$any_three=[', defined $any_three ? $any_three : '<undef>', ']';
print '$final_dot=[', defined $final_dot ? $final_dot : '<undef>', ']';

and the output now shows which parts of the regex are capturing which parts of the domain:

$alpha_num=[example]
$any_three=[.co]
$final_dot=[m]

The dot ('.') is a metacharacter in regexes: it matches any character except newline (or any character, including newline, if the /s modifier is used).
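Here's a minimal sketch of the difference between an unescaped and an escaped dot. The sample strings and the matches() helper are made up purely for illustration; they're not from your code.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample strings, chosen only to illustrate the point.
my @samples = ('www.example.com', 'wwwXexample.com');

# Returns 1 if the pattern matches the string, 0 otherwise.
sub matches {
    my ($string, $re) = @_;
    return $string =~ $re ? 1 : 0;
}

for my $string (@samples) {
    printf "%-16s  unescaped=%d  escaped=%d\n",
        $string,
        matches($string, qr/www./),     # '.' matches any character
        matches($string, qr/www\./);    # '\.' matches a literal dot only
}
```

The second string matches only the unescaped pattern, because the 'X' satisfies "any character" but is not a literal dot.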

You seem to have used it, expecting a literal dot, in "m|www.". I'm not sure what's intended with ".{3}).", hence the request for clarification earlier. Anyway, this problem needs fixing.
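Pending your clarification, here's one possible sketch of a fix. It assumes the wanted result is everything after "www." (i.e. "example.com.au") and that each dot-separated part is purely alphanumeric; both are guesses on my part, and extract_domain() is just a name I've made up for the example.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Assumption: the wanted result is everything after a literal "www.",
# made of dot-separated alphanumeric labels (hyphens are not handled).
sub extract_domain {
    my ($url) = @_;
    my ($domain) = $url =~ m{
        www \.                      # literal dot after www
        (                           # capture the domain
            [A-Za-z0-9]+            # first label
            (?: \. [A-Za-z0-9]+ )+  # one or more further labels
        )
    }x;
    return $domain;                 # undef if there was no match
}

my $url    = 'www.example.com.au';
my $domain = extract_domain($url);
print '$domain=[', defined $domain ? $domain : '<undef>', "]\n";
```

With the same input as the test code above, this prints "$domain=[example.com.au]". If "example.com" is what you actually want, the second part of the capture would need to be restricted accordingly.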

You also have a problem with spaces in "[A-Z a-z 0-9]". I suspect this is the result of a misunderstanding about the /x modifier:

"/x tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a character class." [my emphasis]

Decide whether you want domains with spaces or not; modify the character class to have no spaces or just one space.
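To see the effect, here's a small sketch comparing the two character classes under /x. The input string and the first_capture() helper are hypothetical, invented only to show the behaviour.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical input containing a space, used only for illustration.
my $text = 'www.my site.com';

# Returns the first capture, or '<undef>' if the pattern does not match.
sub first_capture {
    my ($string, $re) = @_;
    my ($capture) = $string =~ $re;
    return defined $capture ? $capture : '<undef>';
}

# Under /x, spaces outside [...] are ignored but spaces inside are not,
# so this class matches a space.
print 'with spaces=[',
    first_capture($text, qr/www \. ([A-Z a-z 0-9]+)/x), "]\n";

# With no spaces in the class, matching stops at the first space.
print 'without spaces=[',
    first_capture($text, qr/www \. ([A-Za-z0-9]+)/x), "]\n";
```

The first pattern captures "my site"; the second captures only "my".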

[See also: perlrequick, perlretut, perlre, strict, warnings, autodie, open()]

-- Ken


In reply to Re: if/else loop prints extra values by kcott
in thread if/else loop prints extra values by myfrndjk
