The closest MS implementation (MSXML .text), as documented in MSDN, claims:
NODE_DOCUMENT: "Returns a string representing the value of the node. This is the concatenated text of all subnodes with entities expanded."

But what are "subnodes", and what is "text"?
<?xml version="1.0" encoding="utf-8" ?>
<!-- Document level comment -->
<!-- TODO: NOTATION -->
<!DOCTYPE root [ <!ENTITY ent1 "expanded ent1"> ]>
<?pi1 ?>
<root attribute="attribute.value">
  element.text.1
  <e1><![CDATA[cdata.content]]></e1>
  <e2><!--comment.content--></e2>
  <e3>&ent1;</e3>
  element.text.2
</root>

The Remarks section clarifies something:
"When concatenated, the text represents the contents of text or CDATA nodes. All concatenated text nodes are normalized according to xml:space attributes and the value of the preserveWhiteSpace switch. Concatenated CDATA text is not normalized. (Child nodes that contain NODE_COMMENT and NODE_PROCESSING_INSTRUCTION nodes are not concatenated.)"

"Retrieves and sets the string representing the text contents of this node or the concatenated text representing this node and its descendants."

"For more precise control over text manipulation in an XML document, use the lower-level nodeValue property, which returns the raw text associated with a NODE_TEXT node."

.text trims the whitespace on the edges of the result and "normalizes" \r\n => \n, but otherwise just concatenates text.

For this sample it returns:
element.text.1 cdata.content expanded ent1 element.text.2

Both comments are skipped, OK, but I still miss the text of my NODE_ENTITY.
If requested directly, NODE_ENTITY.text returns:
expanded ent1

So I would expect:
expanded ent1 element.text.1 cdata.content expanded ent1 element.text.2

Why is NODE_ENTITY.text missing from NODE_DOCUMENT.text? Maybe because it is inside NODE_DOCUMENT_TYPE, which claims to return .text as ""? Or because "text" does not mean text but nodeValue, which is defined as null for both NODE_DOCUMENT_TYPE and NODE_ENTITY?
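For comparison, here is a minimal sketch of where the entity declaration ends up in a different DOM implementation. It uses Python's xml.dom.minidom as a stand-in, which is an assumption made purely for illustration: it is not MSXML and says nothing definitive about MSXML internals. The XML is a reduced version of the sample above.

```python
from xml.dom import minidom

# Reduced sample: one internal entity and one reference to it.
XML = '<!DOCTYPE root [ <!ENTITY ent1 "expanded ent1"> ]><root><e3>&ent1;</e3></root>'

doc = minidom.parseString(XML)
doctype = doc.doctype
entity = doctype.entities.getNamedItem("ent1")

# In minidom the doctype node has no child nodes at all;
# entity declarations are kept in a separate map (doctype.entities).
doctype_children = len(doctype.childNodes)

# The entity node itself does carry its replacement text as a child.
entity_value = "".join(child.data for child in entity.childNodes)

# nodeValue is None for both node types in this DOM.
node_values = (doctype.nodeValue, entity.nodeValue)

# The reference inside <e3> has been expanded into ordinary text by the parser.
e3_text = doc.getElementsByTagName("e3")[0].firstChild.data

print(doctype_children, repr(entity_value), node_values, repr(e3_text))
```

In this stand-in DOM, a walk over the document's descendants never reaches the entity declaration, because it is not a child of anything. Whether MSXML skips it for the same reason is still an open question.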
From my quick tests, Document.text behaves the same as Document.documentElement.text. If anyone can show how they may differ, I would be pleased. Until then, I consider this bad design, a useless W3 deviation, and insufficient documentation.
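The concatenation rule quoted from the Remarks section can be sketched outside MSXML to see what it implies for the sample. The code below is an illustration in Python's xml.dom.minidom, not MSXML itself; the helper concat_text and the whitespace collapsing used for comparison are my own assumptions, not part of any documented API.

```python
from xml.dom import minidom, Node

SAMPLE = """<?xml version="1.0" encoding="utf-8" ?>
<!-- Document level comment -->
<!-- TODO: NOTATION -->
<!DOCTYPE root [ <!ENTITY ent1 "expanded ent1"> ]>
<?pi1 ?>
<root attribute="attribute.value">
  element.text.1
  <e1><![CDATA[cdata.content]]></e1>
  <e2><!--comment.content--></e2>
  <e3>&ent1;</e3>
  element.text.2
</root>"""

# Node types that the quoted rule says are not concatenated.
SKIPPED = (Node.COMMENT_NODE, Node.PROCESSING_INSTRUCTION_NODE)


def concat_text(node):
    """Hypothetical helper: join text and CDATA descendants, skip comments and PIs."""
    if node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        return node.data
    if node.nodeType in SKIPPED:
        return ""
    return "".join(concat_text(child) for child in node.childNodes)


def collapse(text):
    """Collapse whitespace runs so results are easy to compare (an assumption)."""
    return " ".join(text.split())


doc = minidom.parseString(SAMPLE)
doc_text = collapse(concat_text(doc))
root_text = collapse(concat_text(doc.documentElement))

print(doc_text)
print(doc_text == root_text)
```

Under these assumptions the document and its root element yield the same string, since everything outside the root element is a comment, a processing instruction, or the childless doctype. That matches the observation above, but it only shows the rule is self-consistent, not how MSXML implements it.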