Sunday, March 27, 2011

Regex: Do not match if pattern at the end of the string

I have the following regex, with which I want to match any literal dot followed by one or more of these tags:

<b> <i> <u> </b> </i> </u>

I would like this Regex to NOT match this pattern if it occurs at the end of the string.

string = Regex.Replace(string, "\.((<[\/biu]+>)+)", ".$1||")

Ex:

This <b>should match.</b> allright.

This <i><b>shouldn't match.</b></i>
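
To make the problem concrete, here is a minimal sketch of the original replacement applied to both examples. It is written in Java rather than the .NET call used in the question, on the assumption that the two engines treat this particular pattern and the `$1` replacement the same way; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; names are illustrative and not from the original post.
class OriginalRegexDemo {
    // The asker's pattern: a literal dot followed by one or more tag-like groups.
    static final Pattern ORIGINAL = Pattern.compile("\\.((<[\\/biu]+>)+)");

    // Put back the dot and the captured tags, then append the "||" marker.
    static String mark(String input) {
        return ORIGINAL.matcher(input).replaceAll(".$1||");
    }

    public static void main(String[] args) {
        // Wanted: marker inserted after "</b>".
        System.out.println(mark("This <b>should match.</b> allright."));
        // Not wanted: the tags sit at the end of the string, yet the marker is still added.
        System.out.println(mark("This <i><b>shouldn't match.</b></i>"));
    }
}
```

The first line comes out as `This <b>should match.</b>|| allright.`, and the second as `This <i><b>shouldn't match.</b></i>||`, which is the case the question wants to avoid.
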
From Stack Overflow
  • "\.((<[\/biu]+>)+)(?!$)"
    

    Use a negative lookahead assertion with `$` to reject a match that ends at the end of the string. (Remember, `$` matches at the end of the string, which is exactly the position you do not want to match.)

    Vincent : Thanks, but it still matches "." in the following: This shouldn't match.
    Evan Fosmark : You could always make it non-greedy by introducing the `?` symbol. That would probably make it not match what you wrote. (I don't have any resource to test with right now)
  • Require at least one more character after the last tag, and make sure that what follows is not itself a tag.

    "\.((<[\/biu]+>)+)[^<>]+"
    
  • You could use atomic grouping:

    \.(?>(?:<\/?[biu]>)+)(?!$)
    
    Tomalak : The question was to match any *dot*: "\.(?=(?>(?:<\/?[biu]>)+)(?!$))". :-) Otherwise, +1
    Alan Moore : The OP was capturing the tags and plugging them back in with $1, so he should add capturing parens instead of a lookahead. Also, this is the only answer that corrects the OP's mistake WRT matching the tags.
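
The three suggestions above can be compared side by side. The sketch below is in Java rather than .NET, assuming the two engines agree on lookaheads, atomic groups and `$1` for these patterns. The third pattern adds capturing parentheses around the atomic group, as suggested in the comments, so that `$1` still works. The class, constant and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical comparison harness; names are illustrative and not from the thread.
class EndOfStringDemo {
    static final String REPLACEMENT = ".$1||";

    // Answer 1: plain negative lookahead. The tag group can backtrack and give
    // up its last tag, so the lookahead succeeds just before that tag.
    static final Pattern LOOKAHEAD = Pattern.compile("\\.((<[\\/biu]+>)+)(?!$)");

    // Answer 2: require trailing non-tag text. The trailing text is part of
    // the match, so the replacement drops it unless it is captured as well.
    static final Pattern TRAILING_TEXT = Pattern.compile("\\.((<[\\/biu]+>)+)[^<>]+");

    // Answer 3 with capturing parens added: the atomic group never gives back
    // a tag once matched, so the lookahead is tested only after the last tag.
    static final Pattern ATOMIC = Pattern.compile("\\.((?>(?:<\\/?[biu]>)+))(?!$)");

    static String mark(Pattern pattern, String input) {
        return pattern.matcher(input).replaceAll(REPLACEMENT);
    }

    public static void main(String[] args) {
        String mid = "This <b>should match.</b> allright.";
        String end = "This <i><b>shouldn't match.</b></i>";
        for (Pattern p : new Pattern[] {LOOKAHEAD, TRAILING_TEXT, ATOMIC}) {
            System.out.println(p.pattern());
            System.out.println("  " + mark(p, mid));
            System.out.println("  " + mark(p, end));
        }
    }
}
```

With the plain lookahead, the second example becomes `This <i><b>shouldn't match.</b>||</i>`, which is the behaviour Vincent reported. The trailing-text pattern leaves the second example alone but turns the first into `This <b>should match.</b>||`, losing ` allright.`. The atomic version leaves the second example unchanged and produces `This <b>should match.</b>|| allright.` for the first.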
