Thursday, May 5, 2011

Convert HTML in a database to XHTML using ASP.NET

I have a large amount of non-compliant HTML stored in database tables, and I need to make it validate as XHTML.

I thought of pulling it into an inline editor like X-Standard that would do a conversion, but is there an easier way to do this via VB.NET?

From Stack Overflow
  • I would look into HTML Tidy.

    From tidy's documentation:

    Tidy reads HTML, XHTML and XML files and writes cleaned up markup. For HTML variants, it detects and corrects many common coding errors and strives to produce visually equivalent markup that is both W3C compliant and works on most browsers. A common use of Tidy is to convert plain HTML to XHTML.

  • HTML Tidy is probably the best option.

    If it's a one-off conversion, it might be easier to do the work in a PHP script (where Tidy is built in); otherwise you'll have to wrap a COM object to use it with VB.NET (more info here if you want to do that).

    mmcglynn : The environment only supports ASP.NET.
  • By embedding a WYSIWYG editor (TinyMCE) on a detail page, I was able to load the bad HTML and let the editor do the work of producing code that is very close to valid.
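
To make the kind of cleanup described in the answers above concrete, here is a minimal Python sketch for a one-off, offline pass. It is not HTML Tidy and does not call it: it is a simplified stand-in built on the standard-library `html.parser`, and the names `XhtmlFixer` and `to_xhtml_fragment` are made up for this illustration. It assumes the stored values are HTML fragments, and it only lowercases tags, quotes attributes, self-closes void elements, and closes anything left open. Comments and doctypes are dropped and `<script>`/`<style>` content is not treated specially, so Tidy remains the better choice for real data.

```python
from html import escape
from html.parser import HTMLParser

# Elements that never take a closing tag; XHTML writes them self-closed.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "param", "source", "track", "wbr"}


class XhtmlFixer(HTMLParser):
    """Re-serializes tag soup as well-formed markup (hypothetical helper, not Tidy)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments, joined at the end
        self.stack = []  # names of the elements currently open

    def _attrs(self, attrs):
        # Quote every attribute; a valueless attribute becomes name="name".
        return "".join(
            ' %s="%s"' % (name, escape(name if value is None else value, quote=True))
            for name, value in attrs
        )

    def handle_starttag(self, tag, attrs):
        # HTMLParser has already lowercased tag and attribute names.
        if tag == "p" and self.stack and self.stack[-1] == "p":
            self._close("p")  # simplified rule: a new <p> ends the open <p>
        if tag in VOID:
            self.out.append("<%s%s />" % (tag, self._attrs(attrs)))
        else:
            self.out.append("<%s%s>" % (tag, self._attrs(attrs)))
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID or tag not in self.stack:
            return  # stray or redundant closing tag: drop it
        self._close(tag)

    def _close(self, tag):
        # Close every element still open inside `tag`, then `tag` itself.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append("</%s>" % open_tag)
            if open_tag == tag:
                break

    def handle_data(self, data):
        # Character references were decoded by the parser; re-escape the text.
        self.out.append(escape(data, quote=False))

    def close(self):
        super().close()  # flush any buffered text
        while self.stack:  # close whatever the source left open
            self.out.append("</%s>" % self.stack.pop())


def to_xhtml_fragment(html):
    """Return a well-formed version of one stored HTML fragment."""
    fixer = XhtmlFixer()
    fixer.feed(html)
    fixer.close()
    return "".join(fixer.out)
```

For example, `to_xhtml_fragment('<P>Hello<BR>world<p>Second <B>bold')` returns `<p>Hello<br />world</p><p>Second <b>bold</b></p>`. Well-formed output is not the same as valid XHTML, so the converted rows would still need to go through a validator.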
