I enjoyed reading Effective XML: 50 Specific Ways to Improve Your XML, by Elliotte Rusty Harold. Fortunately, I have not been engaging in particularly ineffective XML, but I haven't found practical advice like this anywhere else. Even the introduction was useful, as it clarified often-misused terms like “element” and “tag”.
Part 1, on Syntax, encourages us to include an XML declaration and use ASCII for tags, neither of which I can disagree with. I was surprised that he didn't insist that I include the “standalone” attribute in my XML declarations; I seldom create documents that aren't standalone, and feel guilty that I don't include that attribute... His advice to “Stay with XML 1.0” was good to hear, particularly since XML 1.1 was just released; his arguments seem valid to me. I tend to stay away from DTDs – I suppose I just haven't felt that the added complexity would be worth the trouble – so the related items weren't as important to me. I like the concept of using standard named entity references (e.g. &Ecaron;) instead of numeric character references (e.g. &#283;), but the named form only works if the entity is declared in a DTD, and dealing with DTDs is too much for me. Also, though I've preferred hyphen-delimited element and attribute names in the past – the first XML-related specifications led me down that path – I'm coming around to appreciate the use of “camel case” for names, as it is probably more readable.
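To make the DTD trade-off concrete, here is a small Python sketch of my own (not an example from the book). It assumes the standard library's ElementTree parser and uses a hypothetical `name` element; the entity name `ecaron` is declared by hand in an internal DTD subset rather than pulled from a standard entity set.

```python
import xml.etree.ElementTree as ET

# A numeric character reference needs no DTD: the parser resolves it directly.
numeric = ET.fromstring("<name>&#283;</name>")
numeric_text = numeric.text

# A named entity is only defined by a DTD. With no declaration available,
# the parser rejects the document instead of guessing.
try:
    ET.fromstring("<name>&ecaron;</name>")
    named_ok = True
except ET.ParseError:
    named_ok = False

# Declaring the entity in an internal DTD subset makes the name usable.
with_dtd = ET.fromstring(
    '<!DOCTYPE name [<!ENTITY ecaron "&#283;">]><name>&ecaron;</name>'
)
with_dtd_text = with_dtd.text
```

The readable name costs a DOCTYPE on every document that uses it, which is the part I find hard to justify.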
Part 2, on Structure, gives good advice on how to decide between attributes and elements, whether a date should be encoded as a single string (“2004-02-23”) or three elements (“<year>2004</year>...”), and even the proper use of processing instructions. He reminds us that mixed content is still important, and suggests using XHTML as the standard for rich text (or “narrative content”). For the sake of interoperability with parsers that don't read the DTD, he comes close to contradicting the Part 1 advice on entity references that I disagreed with. I learned the finer points of URIs, URNs, and URLs, as used by namespaces. He gives a little help on picking a schema language, and clearly isn't fond of the W3C XML Schema Language. There are also a few guidelines that seemed obvious to me, but they must be in response to common abuses of XML.
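The date question is easier to weigh with both encodings side by side. This is a rough Python sketch of my own, assuming a hypothetical `date` wrapper element and `month` and `day` children to go with the `year` element quoted above.

```python
import datetime
import xml.etree.ElementTree as ET

# Encoding 1: the whole date as one ISO 8601 string.
single = ET.fromstring("<date>2004-02-23</date>")

# Encoding 2: one child element per component (element names are assumed).
split = ET.fromstring(
    "<date><year>2004</year><month>02</month><day>23</day></date>"
)

# The single string needs a second parsing step after the XML parser is done.
from_single = datetime.date.fromisoformat(single.text)

# With child elements, the markup itself exposes each component.
year_only = split.findtext("year")
from_split = datetime.date(
    int(split.findtext("year")),
    int(split.findtext("month")),
    int(split.findtext("day")),
)
```

Both forms yield the same date; the difference is whether the structure lives in the markup, where XPath and schemas can see it, or inside a string that only application code understands.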
Part 3, on Semantics, starts with a great summary of the many XML-related technologies that have become available, in each case giving an unapologetic opinion as to the general usefulness of that technology. Needless to say, I'm now less concerned about the fact that I haven't yet learned XLink, XPointer, etc. He touches on the joy of XPath; I certainly can't imagine using XML without it, and the productivity gains have been worth the “non-portable” nature of the SelectNodes methods of Microsoft's XML parsers. He also covers the difference between “push” and “pull” XML parsers; unless performance demands it, I'd certainly recommend sticking with “pull” – the “push” style of SAX simply takes too much development effort in many cases. The book encourages me to validate documents, a lesson that I learned long ago with SGML but have yet to bring into the world of XML, probably because DTDs just don't seem like a good fit for XML with namespaces, and no clear winner has yet emerged from the battle of the schemas...
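To show what I mean by development effort, here is the same small task (collecting the text of every `title` element) written both ways. It is only a sketch using Python's standard library, with `xml.sax` standing in for the push style and `xml.dom.pulldom` for the pull style; the sample document and its element names are made up for illustration.

```python
import xml.sax
from xml.dom import pulldom

DOC = (
    "<books>"
    "<book><title>First</title></book>"
    "<book><title>Second</title></book>"
    "</books>"
)


class TitleHandler(xml.sax.ContentHandler):
    """Push style: the parser calls us, so we track state between callbacks."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
push_titles = handler.titles

# Pull style: we drive the loop and ask for the events we care about.
pull_titles = []
events = pulldom.parseString(DOC)
for event, node in events:
    if event == pulldom.START_ELEMENT and node.tagName == "title":
        events.expandNode(node)  # read the rest of this element into the node
        pull_titles.append("".join(child.data for child in node.childNodes))
```

The push version needs a class and three pieces of bookkeeping state; the pull version is a single loop that reads top to bottom.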
Part 4, on Implementation, made me curious to learn more about “native XML databases,” and wisely suggests that XML has not replaced relational databases. Beyond that, Part 4 didn't really hold my interest, probably because my focus is on Microsoft technologies, and I'm basically locked in to using their tools and techniques. For the sake of the first three parts alone, Effective XML is definitely worth reading, and I recommend it to anyone developing with XML.