As I understand it, CKEditor works by simply replacing a <textarea>, and the HTML is then sent in the POST request when a user hits submit. Of course, a user could then simply submit their own HTML in a custom POST request, and this HTML could contain JavaScript, tags I don't allow, and invalid XHTML.
So clearly I need something on the server side to ensure that only certain XHTML tags have been submitted and that the XHTML is valid and free of JavaScript. How do others handle this? Is there a good library for this, preferably in Java?
Re: How best to deal with security issues and CKEditor?
Probably the simplest solution would be to try to prevent the user from doing anything custom, for instance by removing the Source plugin. The editor can also be configured to produce output for a number of different DTDs, including HTML 4.01 Strict and Transitional as well as XHTML 1.0 Strict and Transitional.
Using those two measures together should produce somewhat better results.
Re: How best to deal with security issues and CKEditor?
I wonder how common it is for web apps to rely completely on a client-side library like CKEditor to do all the validation, and how easy it is for others to take advantage of that.
Re: How best to deal with security issues and CKEditor?
Treat it as if it were a simple textarea. There is no validation on the CKEditor side by default, and in any case you should never rely on client-side validation because it is easily circumvented.
I did a couple of quick Google searches to give you an idea of how to do server-side validation in a couple of server-side languages.
PHP
The first is PHP, which has a fairly nice strip_tags function that strips out all HTML tags except those on a whitelist of allowable tags. This lets you treat tags such as bold, italic, and underline as safe; whitelists are always safer than blacklists.
In addition to stripping out tags, you should probably run the XHTML from CKEditor through HTML Tidy, which can repair anything malformed. Here is a devshed article that elaborates upon implementing Tidy in a PHP environment.
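One caveat with tag-only whitelisting: PHP's strip_tags leaves the attributes of allowed tags untouched, so an allowed <p> can still carry an onclick handler. The sketch below is a rough Java imitation of tag-only stripping, written only to illustrate that gap. It is not PHP's actual implementation, and the class and method names (TagOnlyStrip, stripTags) are made up for this example.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TagOnlyStrip {
    // Matches an opening or closing tag and captures its name.
    // Deliberately naive: it looks only at tag names, never at attributes.
    private static final Pattern TAG =
        Pattern.compile("</?([A-Za-z][A-Za-z0-9]*)[^>]*>");

    // Removes every tag whose name is not in `allowed`; text content is kept.
    static String stripTags(String html, Set<String> allowed) {
        return TAG.matcher(html).replaceAll(m ->
            allowed.contains(m.group(1).toLowerCase(Locale.ROOT))
                ? Matcher.quoteReplacement(m.group()) // keep the tag verbatim
                : "");                                // drop the tag
    }
}
```

Calling TagOnlyStrip.stripTags("<p onclick=\"steal()\">hi</p><script>x</script>", Set.of("p")) drops the script tags but keeps the onclick attribute on <p>, which is why attributes need their own whitelist.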
ASP.NET
For ASP.NET there are probably a few more options. But to keep things lightweight, here is a WebProNews article that has a nice function that does roughly the same thing as PHP's strip_tags function. Here is the function for ease of copy/paste and viewing:
Re: How best to deal with security issues and CKEditor?
Checks based on regular expressions are usually too simple to stop a real attacker. The best way to do it is to take the incoming data, parse it into a DOM tree, and then filter out any node or attribute that isn't whitelisted. Then you serialize the tree back to an HTML string, and that's the final data.
Any other approach is vulnerable to browser bugs that allow code to be executed in ways you didn't expect.
For example, I think that some site suffered an attack because something like
is executed in IE. Maybe that's not exactly the correct syntax, but it was something that a simple regexp looking for "script" couldn't find. If the cleanup is based only on allowing a set of tags and attributes, then it will be safe (or at least much safer; nothing is perfect).
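The parse-then-filter approach described above might look roughly like this in Java, using only the JDK's built-in XML parser. This is a minimal sketch under some assumptions: the input is a well-formed XHTML fragment (anything the parser rejects throws an exception and should be refused), the whitelist in ALLOWED is a hypothetical policy, and the class name XhtmlWhitelist is made up. Disallowed elements are dropped together with their contents, and comments, CDATA sections and processing instructions are dropped as well.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class XhtmlWhitelist {
    // Hypothetical policy: allowed tag name -> allowed attribute names.
    private static final Map<String, Set<String>> ALLOWED = Map.of(
        "p", Set.of(), "strong", Set.of(), "ul", Set.of(),
        "ol", Set.of(), "li", Set.of(), "a", Set.of("href"));

    static String sanitize(String fragment) throws Exception {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        // Refuse DOCTYPE declarations so external entities cannot be pulled in.
        f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Wrap the fragment in a dummy root so it parses as one document.
        Document doc = f.newDocumentBuilder().parse(
            new InputSource(new StringReader("<root>" + fragment + "</root>")));
        clean(doc.getDocumentElement());

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc.getDocumentElement()), new StreamResult(out));
        String xml = out.toString();
        // Strip the dummy root again; an empty root serializes as <root/>.
        if (!xml.startsWith("<root>")) return "";
        return xml.substring("<root>".length(), xml.length() - "</root>".length());
    }

    private static void clean(Node parent) {
        // Snapshot the children first, because removal changes the live list.
        List<Node> children = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            children.add(c);
        }
        for (Node c : children) {
            if (c.getNodeType() == Node.TEXT_NODE) continue; // plain text is kept
            Set<String> attrs = c.getNodeType() == Node.ELEMENT_NODE
                ? ALLOWED.get(((Element) c).getTagName()) : null;
            if (attrs == null) {           // not a whitelisted element
                parent.removeChild(c);     // drop it with its whole subtree
                continue;
            }
            Element e = (Element) c;
            NamedNodeMap map = e.getAttributes();
            for (int i = map.getLength() - 1; i >= 0; i--) {
                Attr a = (Attr) map.item(i);
                if (!attrs.contains(a.getName()) || !isSafe(a)) e.removeAttributeNode(a);
            }
            clean(e);
        }
    }

    // URLs are whitelisted by scheme too, which rejects javascript: links.
    private static boolean isSafe(Attr a) {
        if (!a.getName().equals("href")) return true;
        String v = a.getValue().trim().toLowerCase(Locale.ROOT);
        return v.startsWith("http://") || v.startsWith("https://") || v.startsWith("mailto:");
    }
}
```

The string returned by XhtmlWhitelist.sanitize(...) is what would be stored. Because the filter works on the parsed tree rather than on the raw text, obfuscated spellings that the parser decodes (for example character references inside an href) are checked in their decoded form.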
Re: How best to deal with security issues and CKEditor?
What are good DOM parsers for this approach in Java?
Re: How best to deal with security issues and CKEditor?
If CKEditor can do this, then I think the best approach is simply to design an XML schema and validate the submitted XHTML document against it on the server side. If validation fails, I report an error back to the browser, since the user has very likely circumvented CKEditor...
If CKEditor cannot do this, then I still like the XML schema approach, but I want to tell the user exactly why their document failed (e.g. we don't support the script tag or the onsubmit attribute). Is there a decent Java XML schema validator that can report the reasons for a validation failure?
Does this make sense?
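For what it's worth, the JDK's own javax.xml.validation package can validate against an XSD and hand each failure to an ErrorHandler, which is one way to collect human-readable reasons. The sketch below assumes a deliberately tiny, hypothetical schema (a <body> holding <p> elements that may contain text and <strong>); a real schema for the allowed XHTML subset would be much larger. The class name SchemaCheck is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class SchemaCheck {
    // Hypothetical schema. No attributes are declared, so any attribute
    // (onclick, onsubmit, ...) and any undeclared element is an error.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='body'><xs:complexType><xs:sequence>"
      + "<xs:element name='p' minOccurs='0' maxOccurs='unbounded'>"
      + "<xs:complexType mixed='true'><xs:sequence>"
      + "<xs:element name='strong' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    // Returns one message per validation problem; an empty list means valid.
    static List<String> validate(String xhtml) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        Validator v = schema.newValidator();
        // Block external DTDs and schemas referenced by untrusted input.
        v.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        v.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");

        List<String> problems = new ArrayList<>();
        v.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { add(e); }
            // Not rethrowing here lets validation continue and report more errors.
            @Override public void error(SAXParseException e) { add(e); }
            @Override public void fatalError(SAXParseException e) throws SAXException {
                add(e);
                throw e; // malformed XML: validation cannot continue
            }
            private void add(SAXParseException e) {
                problems.add("line " + e.getLineNumber() + ", column "
                    + e.getColumnNumber() + ": " + e.getMessage());
            }
        });
        try {
            v.validate(new StreamSource(new StringReader(xhtml)));
        } catch (SAXException fatal) {
            // Already recorded by fatalError above.
        }
        return problems;
    }
}
```

The messages come from the parser and are fairly technical, so they may need translating into friendlier wording before being shown to an end user.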
Re: How best to deal with security issues and CKEditor?
Here's a product that actually validates the user input against an XML schema: http://xopus.com/. This is one of their features:
Pre-validating
* Pre-validation makes it impossible to create invalid content
* Supports XML Schema (XSD)
The problem with it is that it doesn't yet support Chrome or Safari.
Does CKEditor have a similar feature, or any kind of schema validation of the user input? Does anyone know of any other products that have this feature but also support Chrome and Safari?
Re: How best to deal with security issues and CKEditor?
Could you please elaborate on what exactly this means: "take the incoming data, parse it into a DOM tree and then filter out any node or attribute that it isn't whitelisted"?
On the server side I do the following to validate the submitted values:
$allowed_html="<strong><p><a><ul><ol>";
$value = mysql_real_escape_string(strip_tags(html_entity_decode($_POST['value']),$allowed_html));
I have disabled the SOURCE button in CKEditor, but users can still type or paste in HTML tags. The editor encodes these, so I decode the value, strip all non-allowed tags, make it database safe, and write it into the database. Am I missing anything essential to make it secure?
Thanks, Jens