Wednesday, April 13, 2011

Regex replace consecutive non-alpha chars with single char

I want to replace all non-alpha characters in a string with a plus sign ('+'), while making sure that a run of consecutive non-alpha characters is replaced by only one plus sign.

I had thought the following might work but apparently not:

System.Text.RegularExpressions.Regex.Replace(name, @"[^\w]*?", "+")
From Stack Overflow
  • Try System.Text.RegularExpressions.Regex.Replace(name, @"\W+", "+")

    For the input "sasa-==[]&^asdsa2435", the pattern matches the whole run -==[]&^ as a single match, so the result is "sasa+asdsa2435".

    LukeH : One small caveat: using \W will exclude letters, numbers *and* underscores from the match. Use [^A-Za-z] instead if you only want to exclude letters, or [^0-9A-Za-z] to exclude alphanumerics (but include underscores).
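  • You should not disable greediness, and you want 1 or more, not 0 or more. A lazy "zero or more" quantifier is already satisfied by the empty string, so the pattern matches nothing at each position instead of consuming the non-alpha characters. Replace "*?" with "+".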
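
The three patterns suggested above can be compared side by side. The following is a minimal sketch, not the posters' own code: the class name PlusCollapser and its method names are made up for illustration, and the second sample string "a_b--c" is an assumed input chosen to show how the underscore is treated.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class PlusCollapser
{
    // \W+ : one or more non-word characters (letters, digits and '_' are kept).
    public static string CollapseNonWord(string name) =>
        Regex.Replace(name, @"\W+", "+");

    // [^A-Za-z]+ : one or more characters that are not ASCII letters.
    public static string CollapseNonLetters(string name) =>
        Regex.Replace(name, @"[^A-Za-z]+", "+");

    // [^0-9A-Za-z]+ : one or more characters that are not ASCII letters or digits.
    public static string CollapseNonAlphanumerics(string name) =>
        Regex.Replace(name, @"[^0-9A-Za-z]+", "+");

    public static void Main()
    {
        string name = "sasa-==[]&^asdsa2435";
        Console.WriteLine(CollapseNonWord(name));              // sasa+asdsa2435
        Console.WriteLine(CollapseNonLetters(name));           // sasa+asdsa+
        Console.WriteLine(CollapseNonWord("a_b--c"));          // a_b+c
        Console.WriteLine(CollapseNonAlphanumerics("a_b--c")); // a+b+c
    }
}
```

Because each pattern ends in the greedy "+" quantifier, a whole run of excluded characters is consumed as one match and therefore produces exactly one plus sign. Which character class to use depends on whether digits and underscores should survive.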
