Greetings wise monks,

I want to write a script that sorts fansubs. For that purpose I need to find out which group made the file, the name of the file, etc.

My problem is the following:
I start by trying to find out which group released the file.

Let's say the file is:
[Ureshii]_Amatsuki_-_07_[VORBIS-H264][BCCEA15E].mkv
which I store in $teststring (for the moment, of course ;) )

The result has to be "Ureshii"

I now have the following code:
$teststring =~ /([A-Z]*)]/i;
print $teststring."\n";
print $1."\n";
Works just fine.
The real problem starts when I have more than one group (more than one way of naming the releases).

The file could also be:
Ureshii Amatsuki - 07 [VORBIS-H264][BCCEA15E].mkv (or something like that)
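
To make the problem concrete, here is a minimal sketch that runs the pattern from above on both file names. The helper name group_v1 is made up for this illustration; only the pattern itself comes from my code.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the first pattern from this post.
# Returns the capture on a match, undef if nothing matched.
sub group_v1 {
    my ($name) = @_;
    return $name =~ /([A-Z]*)]/i ? $1 : undef;
}

for my $file (
    '[Ureshii]_Amatsuki_-_07_[VORBIS-H264][BCCEA15E].mkv',
    'Ureshii Amatsuki - 07 [VORBIS-H264][BCCEA15E].mkv',
) {
    my $group = group_v1($file);
    printf "%s => '%s'\n", $file, defined $group ? $group : 'no match';
}
```

For the first name this gives 'Ureshii'. For the second name the pattern still matches, but only at the first "]" (the one right after "H264"), and because the character before it is a digit the capture is the empty string.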

There are various possibilities and I can't find a way to create a RegEx that retrieves the first word (the group).
I tried variants of the code because I thought "the RegEx tries to retrieve something that matches my 'input'". So first comes something that's in [A-Z] (with /i I tried to include [a-z] as well and keep the regex shorter --> easier to understand), then an unknown number of further characters (which should also be in [A-Z]). Then it finds a space, a "]" or a "_" and the search stops (or should :P).

Does anyone know how I can do that, or know of a site that explains RegEx? I visited selfhtml http://de.selfhtml.org/perl/sprache/regexpr.htm (a German site about all kinds of languages), but after 3-5 hours of trying to "fix" my RegEx I feel kinda... helpless :(

Any hint would really help me, so I hope someone finds the time to give me some advice :)

Bye for now,
timesink

-----------------------
Edit says: "problem solved"


thanks apl for the page :D forgot about the documentation there :(
selfhtml is a nice site but (of course) this one is much more "complete"
so with this site & the help of another programmer I updated my regex to:
$teststring =~ /([A-Z]*)[_\][:space:]]/i
I have to add every new "end" character to the character class myself, but I think that's ok :)
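
As a quick check, here is a small sketch that runs the updated pattern against both naming styles from above. The helper name group_name is made up for this illustration; the pattern is the one I ended up with.

```perl
use strict;
use warnings;

# Hypothetical helper: wraps the updated pattern.
# The class [_\][:space:]] lists the allowed "end" characters:
# an underscore, a closing bracket, or any whitespace.
sub group_name {
    my ($name) = @_;
    return $name =~ /([A-Z]*)[_\][:space:]]/i ? $1 : undef;
}

for my $file (
    '[Ureshii]_Amatsuki_-_07_[VORBIS-H264][BCCEA15E].mkv',
    'Ureshii Amatsuki - 07 [VORBIS-H264][BCCEA15E].mkv',
) {
    my $group = group_name($file);
    printf "%s => '%s'\n", $file, defined $group ? $group : 'no match';
}
```

Both names give 'Ureshii'. Two caveats: a group name containing digits or hyphens would need those characters added to [A-Z], and swapping * for + would rule out an empty capture.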

I tried to set up something like this before but it didn't work because I had a syntax error in it.

Around three hours of 'work' and then something like this xD But that's programming, huh ;)?

Now I'll try to include the other info, like the codec used, the checksum, etc., but I think that with the solution above I will be able to solve this "remaining" problem on my own.
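
One possible starting point for that, purely as a sketch: collect everything that sits between square brackets and pick the checksum out by its shape. The helper names bracket_fields and checksum are made up, and the assumption that the checksum is always eight hex digits in brackets is taken only from the example file name above.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the contents of every [...] pair.
# Call it in list context to get all fields.
sub bracket_fields {
    my ($name) = @_;
    return $name =~ /\[([^\]]*)\]/g;
}

# Hypothetical helper: first bracketed field that is exactly
# eight hex digits (assumed checksum format), or undef.
sub checksum {
    my ($name) = @_;
    return $name =~ /\[([0-9A-Fa-f]{8})\]/ ? $1 : undef;
}

my $file   = '[Ureshii]_Amatsuki_-_07_[VORBIS-H264][BCCEA15E].mkv';
my @fields = bracket_fields($file);

print join(', ', @fields), "\n";   # Ureshii, VORBIS-H264, BCCEA15E
print checksum($file), "\n";       # BCCEA15E
```

For file names that don't put the group in brackets, the first field would of course be something else, so the group still has to come from the regex above.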

Thanks to everyone who helped me, I'm really grateful to you all :)

Bye for now,
timesink

In reply to Problem with RegEx & various "endings" by timesink
