Tuesday, April 5, 2011

How should I handle URLs inside XML?

I'm creating an XML document and I want to store URLs inside a node like so:

<ns:url>http://example.com</ns:url>

My question is, do I need to encode/escape the URL in any way?

If I do, will the .NET XmlDocument object handle this for me?

From Stack Overflow
  • I doubt there will be any need to escape it in general. However, you might have some application level requirement to do so.

    You should absolutely encode the URL as XML text (for example, it shouldn't contain >).

    Clarification: This does not mean you should pass the encoded text to the DOM implementation. All XML APIs I know do that for you; so for all practical situations, you wouldn't need to do it manually. I just said it should be done.

    Jeffrey Hantin : You're just adding complexity if you do. It's the responsibility of whatever converts the DOM to a text file to handle that escaping. By the way, there's no real need to escape ">" in text content because it doesn't introduce a tag.
    Mehrdad Afshari : I think it's a requirement for a valid XML document. And I didn't say you should do it *manually*. You should do it, by whatever means.
    Robert Rossney : It is, which is why every DOM XML implementation does it for you. The only people who should be generating XML through direct string manipulation are people *writing* DOM XML implementations.
    Mehrdad Afshari : @Robert: I agree, but it is "done". Yes. I would *never* write such a thing in an app. I'd use XElement. But whatever I use will eventually do that, so it has to be done. That's all I stated, nothing more. I didn't ever say, you should manually encode it and give it to your DOM object.
  • In general, most DOM implementations (including XmlDocument) handle any necessary escaping of text content by default.

    Ishmael : Exactly. Create your document via the DOM or one of the XmlWriters and you will be fine.
  • The DOM/XmlWriter/whatever you are using should handle that for you. One minor point: you might find it easier to use XDocument (if you have .NET 3.5); the namespace usage is much simpler (IMO):

    XNamespace ns = "http://consoso/foobar";
    XDocument doc = new XDocument(
        new XElement("Foo",
            new XAttribute(XNamespace.Xmlns + "ns", ns), // alias
            new XElement("Bar", "abc"),
            new XElement(ns + "url", "http://foo/bar")
        )
    );
    string s = doc.ToString();
    

    Which creates:

    <Foo xmlns:ns="http://consoso/foobar">
      <Bar>abc</Bar>
      <ns:url>http://foo/bar</ns:url>
    </Foo>
    
    tpower : Thanks, XDocument is so much easier. I didn't know about it until now.
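
To tie the answers back to the original question about XmlDocument, here is a minimal sketch of the same idea using the DOM directly. The class name UrlInXmlSketch and its Build and ReadUrl methods are hypothetical, the namespace URI is simply reused from the XDocument answer, and the sample URLs are made up. It assumes you assign the raw, unescaped URL to InnerText and let the serializer do the escaping.

```csharp
using System;
using System.Xml;

public static class UrlInXmlSketch
{
    // Assumption: namespace URI borrowed from the XDocument answer above.
    private const string Ns = "http://consoso/foobar";

    // Builds a <Foo> document containing one <ns:url> element and
    // returns the serialized XML. The URL is passed in raw (unescaped).
    public static string Build(string url)
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("Foo");
        root.SetAttribute("xmlns:ns", Ns); // declare the "ns" prefix on the root
        doc.AppendChild(root);

        XmlElement urlNode = doc.CreateElement("ns", "url", Ns);
        urlNode.InnerText = url; // no manual escaping here
        root.AppendChild(urlNode);

        return doc.OuterXml; // markup characters are escaped on serialization
    }

    // Parses the XML again and returns the text of the first <ns:url> element.
    public static string ReadUrl(string xml)
    {
        var doc = new XmlDocument();
        doc.LoadXml(xml);
        var nsmgr = new XmlNamespaceManager(doc.NameTable);
        nsmgr.AddNamespace("ns", Ns);
        return doc.SelectSingleNode("//ns:url", nsmgr).InnerText;
    }
}
```

Calling UrlInXmlSketch.Build("http://example.com/search?q=xml&page=2") should give XML in which the ampersand is written as &amp;, and ReadUrl on that XML returns the original URL unchanged. This also shows why pre-escaping is a mistake: if you pass in a URL that already contains &amp;, the serializer escapes the ampersand a second time, and reading the node back gives you the string with &amp; still in it instead of the real URL.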
