I fixed this "naive article" back in spring 2009,
and it contained my proposal for "HTML encoding".
A week ago I discovered a mistake in my code:
2 chars which I should have excluded from the output
were not excluded and were output in encoded form instead.
I wanted to update the algorithm on the web and, surprise:
HTML Entity Encoding is not enough to stop XSS in web applications. Please see
XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet for more information.
So let's see what OWASP's update is.
The article is named: XSS_(Cross_Site_Scripting)_Prevention_Cheat_Sheet.
OK, it covers more in one place, excellent....
Why Can't I Just HTML Entity Encode Untrusted Data?
HTML entity encoding is okay for untrusted data that you put in the body of the HTML document, such as inside a div tag. It
even sort of works for untrusted data that goes into attributes, particularly if you're religious about using quotes around
your attributes. But HTML entity encoding doesn't work if you're putting untrusted data inside a script tag anywhere, or an
event handler attribute like onmouseover, or inside CSS, or in a URL. So even if you use an HTML entity encoding method
everywhere, you are still most likely vulnerable to XSS. You MUST use the escape syntax for the part of the HTML document
you're putting untrusted data into. That's what the rules below are all about.
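
To make the quoted point concrete, here is a minimal sketch using only the Python standard library. The payload strings are hypothetical examples made up for illustration, and `html.unescape` is used only as a rough stand-in for the decoding a browser applies to an attribute value before running it as script.

```python
import html
from urllib.parse import quote

# Hypothetical attacker-controlled values, for illustration only.
url_payload = "javascript:alert(1)"
handler_payload = "');alert(1);//"

# HTML body context: entity encoding neutralises the markup characters.
body_safe = html.escape("<script>alert(1)</script>")
print(body_safe)

# URL context: the payload contains no HTML-special characters,
# so entity encoding returns it completely unchanged.
print(html.escape(url_payload) == url_payload)

# A URL parameter value needs percent-encoding, a different escape syntax.
url_safe = quote(url_payload, safe="")
print(url_safe)

# Event handler context: the attribute value is entity-decoded before it
# is run as script, so the encoded payload comes back in its original form.
encoded = html.escape(handler_payload)
print(html.unescape(encoded) == handler_payload)
```

The same untrusted string is harmless in one context and dangerous in another, which is why a single HTML entity encoder applied everywhere does not cover script, event handler, CSS or URL contexts.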
It introduces "terms" like "HTML Escape" or "Attribute Escape"....
and, no surprise, it is a strong promotion of
ESAPI and the ESAPI reference implementation.
BEWARE. Check the code here:
The latest version of my "pseudo-code" is still kept in the OWASP wiki history.
Compare and decide .....
Mine works for the "Supplementary Multilingual Plane".
It uses numeric character references, not character entity references,
and it's immune to client charset switching..
We will probably hear more about ESAPI, since they "amuse and scare me" more and more every day....
BUG FIX: The two extra chars to remove are 0x0B and 0x0C (switch the ifs or add an extra else-if line). Sorry....
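
For readers who want to see the idea, here is a rough Python sketch of an encoder with the properties claimed above. It is not the pseudo-code from the wiki history. The function name `ncr_encode`, the pass-through set and the choice of decimal references are all assumptions made for this illustration. Control characters below 0x20 are dropped except tab, LF and CR, so 0x0B and 0x0C are removed as the bug fix requires.

```python
import string

# Assumption: only plain ASCII letters, digits and basic whitespace pass through.
SAFE = set(string.ascii_letters + string.digits + " \t\n\r")
# Assumption: tab, LF and CR are the only control characters worth keeping.
KEEP_CONTROLS = {0x09, 0x0A, 0x0D}


def ncr_encode(text: str) -> str:
    """Encode text for an HTML body using decimal numeric character references."""
    out = []
    for ch in text:  # Python iterates by code point, not by UTF-16 unit
        cp = ord(ch)
        if cp < 0x20 and cp not in KEEP_CONTROLS:
            continue  # dropped, this includes 0x0B and 0x0C
        if ch in SAFE:
            out.append(ch)
        else:
            out.append(f"&#{cp};")  # numeric reference, never a named entity
    return "".join(out)


print(ncr_encode("<b>caf\u00e9</b>"))
print(ncr_encode("a\x0bb\x0cc"))
print(ncr_encode("\U00010400"))
```

Because every character outside the pass-through set becomes a numeric reference, the output is pure ASCII. The browser resolves each reference by code point, so the result does not depend on which charset the client decides to use, and code points above U+FFFF need no special handling.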