Monday, April 11, 2011

Compiled replace regular expression

Hi!

I'd like to build an assembly of the common regular expressions I use in my project. I use these regular expressions to match a pattern and to replace it. This is the code that builds the assembly:

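using System.Reflection;
using System.Text.RegularExpressions;

// Name of the assembly that will hold the compiled regexes
AssemblyName an = new AssemblyName("MyRegExp");

// One entry per regex: pattern, options, type name, namespace, public visibility
RegexCompilationInfo[] rciList = { 
    new RegexCompilationInfo(@"\<b\>(.+?)\<\/b\>", RegexOptions.IgnoreCase, "BoldCode", "MyRegExp", true),
    new RegexCompilationInfo(@"\<i\>(.+?)\<\/i\>", RegexOptions.IgnoreCase, "ItalicCode", "MyRegExp", true)
};

// Compile the regexes into the MyRegExp assembly
Regex.CompileToAssembly(rciList, an);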

But I can't find where to specify the replace string. I'll use these regexes to replace tags in an HTML file with a fixed replacement, so the replace string is also constant.

I don't want the calling assembly to specify the replace string, as it's always the same across the different calling assemblies.

Thanks in advance for any advice, Fabian

EDIT1:

Maybe I explained what I need poorly. I have several regular expressions that are always replaced with the same pattern: every match of the <b> regex gets one fixed replacement, every match of the <i> regex gets another, and so on.

The compiled regexes are great, but the replacement pattern is missing from the compiled assembly. I managed to build a workaround with a Helper class that builds an array of Transformation objects.

My initial question was more like this: is there a way to specify the replacement string in the compiled regex?
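
The helper-class workaround mentioned in the edit could look roughly like the sketch below. The post does not show the actual code, so the names Transformation and HtmlTransformations, and both replacement strings, are assumptions made for illustration. Plain Regex instances stand in for the compiled MyRegExp.BoldCode and MyRegExp.ItalicCode types so that the sketch compiles on its own.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper type: pairs a regex with its fixed replacement pattern.
public sealed class Transformation
{
    private readonly Regex _regex;
    private readonly string _replacement;

    public Transformation(Regex regex, string replacement)
    {
        _regex = regex;
        _replacement = replacement;
    }

    // Replaces every match of the regex in the input with the fixed replacement.
    public string Apply(string input)
    {
        return _regex.Replace(input, _replacement);
    }
}

public static class HtmlTransformations
{
    // Assumption: the replacement strings are placeholders, the post does not
    // say what they are. In the real project, each "new Regex(...)" would be
    // an instance of a compiled type such as "new MyRegExp.BoldCode()".
    private static readonly Transformation[] All =
    {
        new Transformation(new Regex(@"<b>(.+?)</b>", RegexOptions.IgnoreCase), "<strong>$1</strong>"),
        new Transformation(new Regex(@"<i>(.+?)</i>", RegexOptions.IgnoreCase), "<em>$1</em>"),
    };

    // Runs every transformation over the input, in declaration order.
    public static string ApplyAll(string input)
    {
        foreach (Transformation t in All)
        {
            input = t.Apply(input);
        }
        return input;
    }
}
```

Calling assemblies would then only call HtmlTransformations.ApplyAll(html) and never see the replacement strings, which is the property the question asks for.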

From stackoverflow
  • .NET Reflector is very helpful for things like this. Taking a look at an assembly created by Regex.CompileToAssembly:

    The created types derive from Regex. So you use them exactly as you would a Regex instance created inline.


    Re. the EDIT in the question: the answer appears to be that there is no way of including specified strings in the generated assembly.

    However, using CompileToAssembly already implies a multi-step build process (create the assembly generator, run it to create an assembly, and then reference that assembly), so it is possible to extend this process to add other content. Create the regex assembly, create a separate assembly holding the replacement strings, then use ILMerge to combine them into one.

    Fabian Vilers : I added a clarification on my initial question.
  • Hi,

    It doesn't look like this is directly supported by the "CompileToAssembly" method, so you'll have to find some other way to associate the replacement string with the regex. If you want to store the replacement string in the generated assembly, one option I can think of is to specify it in custom attributes (the third parameter to "CompileToAssembly").

    I think this generates custom attributes for the assembly (and not for single Regexes), but you could for example use something like (Note: you'd have to declare this attribute yourself):

    [RegexReplaceString("RegexName", "Replacement")]
    

    When using the generated DLL from your application, you'd have to add some handling to load the replacement strings and store them together with the compiled Regex objects in some class. This looks a bit difficult, but at least it lets you store the replacement strings in the generated DLL, if that's what you're aiming for.

    Fabian Vilers : Good idea. I've found the same thing as you: Regex doesn't support knowing the replacement string at compile time. In the meantime, I managed to write a Helper class that does all the work.
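
The custom-attribute idea from the second answer could be sketched as follows. This is only an illustration under stated assumptions: RegexReplaceStringAttribute is the self-declared attribute the answer mentions, and the ReplaceStrings class with its Describe and Load methods is a hypothetical name invented here. The attribute type has to live in an assembly that both the generator and the consuming application can reference.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Reflection.Emit;

// Hypothetical attribute: you have to declare it yourself, as the answer notes.
[AttributeUsage(AttributeTargets.Assembly, AllowMultiple = true)]
public sealed class RegexReplaceStringAttribute : Attribute
{
    public RegexReplaceStringAttribute(string regexName, string replacement)
    {
        RegexName = regexName;
        Replacement = replacement;
    }

    public string RegexName { get; private set; }
    public string Replacement { get; private set; }
}

public static class ReplaceStrings
{
    // Generator side: describes one attribute instance so that it can be
    // passed in the third argument of Regex.CompileToAssembly.
    public static CustomAttributeBuilder Describe(string regexName, string replacement)
    {
        ConstructorInfo ctor = typeof(RegexReplaceStringAttribute)
            .GetConstructor(new[] { typeof(string), typeof(string) });
        return new CustomAttributeBuilder(ctor, new object[] { regexName, replacement });
    }

    // Consumer side: reads the attributes back from the generated assembly
    // and maps each regex type name to its replacement string.
    public static Dictionary<string, string> Load(Assembly regexAssembly)
    {
        var map = new Dictionary<string, string>();
        foreach (RegexReplaceStringAttribute attr in
                 regexAssembly.GetCustomAttributes(typeof(RegexReplaceStringAttribute), false))
        {
            map[attr.RegexName] = attr.Replacement;
        }
        return map;
    }
}
```

On the generator side, the call from the question would become something like Regex.CompileToAssembly(rciList, an, new[] { ReplaceStrings.Describe("BoldCode", "<strong>$1</strong>") }), where the replacement string is again a placeholder. Note that, as far as I know, Regex.CompileToAssembly is only usable on .NET Framework. The application would then call ReplaceStrings.Load on the generated assembly and pair each entry with the matching compiled Regex type.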
