Mon, 01/11/2010 - 01:21

scott

Joined: 11/01/2010

Posts: 5

How best to deal with security issues and CKEditor?

As I understand it, CKEditor simply replaces a <textarea>, and the HTML is sent in the POST request when the user hits submit. Of course, a user could just as easily submit their own HTML in a custom POST request, and that HTML could contain JavaScript, tags I don't allow, and invalid XHTML.

So clearly I need something on the server side to ensure that only certain XHTML tags have been submitted and that the XHTML is valid and free of JavaScript. How do others handle this? Is there a good library for this, preferably in Java?

Mon, 01/11/2010 - 04:07

yavin

Joined: 23/12/2009

Posts: 65

Re: How best to deal with security issues and CKEditor?

I would just recommend using some regular expressions to filter out the information you don't want. It should be relatively simple to take care of most of the issues, particularly style and script tags that could produce hazardous results.

Probably the simplest solution would be to try to prevent the user from doing anything custom, for instance by removing the Source plugin. By default the editor can produce markup for a number of different DTDs, including HTML 4.01 Strict and Transitional as well as XHTML 1.0 Strict and Transitional.

Using the two in concert (filtering with regular expressions and restricting the editor) should produce somewhat more favorable results.

Mon, 01/11/2010 - 07:32

scott

Joined: 11/01/2010

Posts: 5

Re: How best to deal with security issues and CKEditor?

Thank you for the response, yavin, though I feel uncomfortable filtering out potentially malicious HTML myself. I would much rather have a proven library validate and clean the HTML.

I wonder how common it is for web apps to rely completely on a client library like CKEditor to do all the validation, and how easy it is for others to take advantage of that.

Mon, 01/11/2010 - 08:22

yavin

Joined: 23/12/2009

Posts: 65

Re: How best to deal with security issues and CKEditor?

Treat it as if it were a simple textarea. There is no validation on the CKEditor side by default, and more importantly you should never rely on client-side validation, because it is easily circumvented.

I did a couple of quick Google searches to give you an idea of how to do server-side validation in a couple of server-side languages.

PHP
The first is PHP, which has a pretty nice strip_tags function that strips out all HTML tags except those on a whitelist of allowable tags. This lets you mark tags such as bold, italic, and underline as safe, and whitelists are always safer than blacklists.

In addition to stripping out tags, you should probably run the XHTML from CKEditor through HTML Tidy, which can repair anything malformed. Here is a devshed article that elaborates upon implementing Tidy in a PHP environment.

ASP.NET
For ASP.NET there are probably a few more options. But to keep things lightweight, here is a WebProNews article with a nice function that does roughly the same thing as PHP's strip_tags function. Here is the function, for ease of copy/paste and viewing:

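public static string StripHtml(string html, bool allowHarmlessTags)
{
    // Nothing to strip from a null or empty string.
    if (string.IsNullOrEmpty(html))
        return string.Empty;

    // The pattern for this branch is empty as posted here, so the input
    // comes back unchanged.
    if (allowHarmlessTags)
        return System.Text.RegularExpressions.Regex.Replace(html, "", string.Empty);

    // Otherwise remove everything that looks like a tag.
    return System.Text.RegularExpressions.Regex.Replace(html, "<[^>]*>", string.Empty);
}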

Mon, 01/11/2010 - 08:53

alfonsoml

Joined: 31/12/2006

Posts: 3737

Re: How best to deal with security issues and CKEditor?

Safety checks must always be done on the server side. Removing things like the Source button doesn't help at all, because anyone who wants to attack your site will send the data directly, bypassing whatever cleanup your JavaScript performs.

Checks based on regular expressions are usually too simple to stop a real attacker. The best approach is to take the incoming data, parse it into a DOM tree, and filter out any node or attribute that isn't whitelisted. Then serialize the tree back to an HTML string, and that is the final data.

Any other approach is vulnerable to browser bugs that allow code to execute in ways you didn't expect.
For example, I think some site suffered an attack because something like

<scri 
pt>alert('hello');</scri
pt>

is executed in IE. Maybe that's not exactly the correct syntax, but it was something that a simple regexp looking for "script" couldn't find. If the cleanup is based only on allowing a set of tags and attributes, then it will be safe (or at least much safer; nothing is perfect).
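
In Java, the parse, filter, and serialize steps described above could be sketched with the JDK's built-in XML APIs. This is a minimal illustration, not a vetted sanitizer: the class name WhitelistSanitizer, the whitelist contents, and the rule that href must start with http://, https://, or mailto: are all assumptions made for the example. It accepts only well-formed XHTML fragments that use the predefined XML entities, so markup containing &nbsp; or unclosed tags is rejected rather than repaired, and a non-whitelisted element is dropped together with its contents.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class WhitelistSanitizer {
    // Hypothetical whitelist: element name -> attribute names allowed on it.
    private static final Map<String, Set<String>> ALLOWED = new HashMap<>();
    static {
        for (String tag : Arrays.asList("p", "strong", "em", "ul", "ol", "li")) {
            ALLOWED.put(tag, new HashSet<String>());
        }
        ALLOWED.put("a", new HashSet<>(Arrays.asList("href", "title")));
    }

    /** Returns the filtered markup; throws IllegalArgumentException if the input is not well-formed. */
    static String sanitize(String fragment) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so entity tricks (XXE) are rejected up front.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            // Turn CDATA sections into ordinary text nodes.
            factory.setCoalescing(true);
            // Wrap the fragment in a synthetic root so several top-level elements parse.
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader("<root>" + fragment + "</root>")));
            Element root = doc.getDocumentElement();
            filterChildren(root);
            if (!root.hasChildNodes()) {
                return "";
            }
            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            String xml = out.toString();
            // Strip the synthetic <root> wrapper again.
            return xml.substring("<root>".length(), xml.length() - "</root>".length());
        } catch (Exception e) {
            throw new IllegalArgumentException("Submitted markup rejected", e);
        }
    }

    private static void filterChildren(Node parent) {
        // Copy the children first: removing nodes while walking siblings is error-prone.
        List<Node> children = new ArrayList<>();
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            children.add(c);
        }
        for (Node child : children) {
            if (child.getNodeType() == Node.TEXT_NODE) {
                continue; // plain text is kept; the serializer escapes it
            }
            if (child.getNodeType() != Node.ELEMENT_NODE || !ALLOWED.containsKey(child.getNodeName())) {
                parent.removeChild(child); // comments, PIs, and unknown elements all go
                continue;
            }
            Element el = (Element) child;
            Set<String> allowedAttrs = ALLOWED.get(el.getNodeName());
            NamedNodeMap attrs = el.getAttributes();
            // Walk backwards because removing an attribute shifts later indices.
            for (int i = attrs.getLength() - 1; i >= 0; i--) {
                Attr attr = (Attr) attrs.item(i);
                boolean ok = allowedAttrs.contains(attr.getName())
                    && (!attr.getName().equals("href") || isSafeUrl(attr.getValue()));
                if (!ok) {
                    el.removeAttributeNode(attr);
                }
            }
            filterChildren(el);
        }
    }

    // Assumed policy for this sketch: only absolute http(s) and mailto links survive.
    private static boolean isSafeUrl(String url) {
        String u = url.trim().toLowerCase(Locale.ROOT);
        return u.startsWith("http://") || u.startsWith("https://") || u.startsWith("mailto:");
    }
}
```

Calling WhitelistSanitizer.sanitize on <p onclick="evil()">Hi <script>alert(1)</script></p> should give back <p>Hi </p>, while the split <scri pt> example above is not well-formed XML and is rejected with an IllegalArgumentException. Because the check is "is this name on the list" rather than "does this look like script", attributes such as onclick and javascript: links are removed too, which a tag-only whitelist would leave in place.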

Mon, 01/11/2010 - 09:38

scott

Joined: 11/01/2010

Posts: 5

Re: How best to deal with security issues and CKEditor?

alfonsoml, I agree with you. Reading the input into a DOM, filtering it, and then writing the DOM back out sounds like a reasonable approach. I just have to be careful that the result is consistent (the attributes must not change order randomly), because my app is a wiki-like app.

What are good DOM parsers for this approach in Java?

Mon, 01/11/2010 - 16:13

alfonsoml

Joined: 31/12/2006

Posts: 3737

Re: How best to deal with security issues and CKEditor?

Sorry, I have no idea about Java. I would have to search the web, and then I wouldn't even know whether you could use whatever that search turned up.

Mon, 01/11/2010 - 18:48

scott

Joined: 11/01/2010

Posts: 5

Re: How best to deal with security issues and CKEditor?

From reading the docs, it seems pretty easy to limit which buttons are displayed in the toolbar. What about the XHTML entered by the user in source mode (it is important for me to provide this ability)? Can I limit it so that if someone tries to enter a tag that's not allowed, CKEditor will either tell them to change it, HTML-encode it, or erase it?

If CKEditor can do this, then I think the best approach is simply to design an XML schema and compare the submitted XHTML document against it on the server side. If validation fails, I report an error back to the browser, since the user very likely circumvented CKEditor.

If CKEditor cannot do this, then I still like the XML schema approach, but I want to tell the user exactly why their document failed (e.g. we don't support the script tag or the onsubmit attribute). Is there a decent Java XML schema validator that can report the reasons for failing to validate?

Does this make sense?
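
One way to get per-problem reasons out of schema validation in Java is the JDK's own javax.xml.validation package with a custom ErrorHandler that collects messages instead of stopping at the first one. The sketch below assumes a deliberately tiny, hypothetical schema (a body holding p elements that may mix text with strong, with no attributes declared); the class name SchemaCheck and the message format are likewise made up for illustration, and the exact wording of each reason comes from the validator.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class SchemaCheck {
    // Hypothetical schema: <body> contains <p> elements; each <p> may mix text
    // with <strong>. No attributes are declared, so any attribute is an error.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
      + "<xs:element name='body'><xs:complexType><xs:sequence>"
      + "<xs:element name='p' minOccurs='0' maxOccurs='unbounded'>"
      + "<xs:complexType mixed='true'><xs:sequence>"
      + "<xs:element name='strong' type='xs:string' minOccurs='0' maxOccurs='unbounded'/>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:sequence></xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns one human-readable entry per validation problem; empty means valid. */
    static List<String> validate(String xml) {
        final List<String> problems = new ArrayList<>();
        try {
            SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = factory.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // Do not let submitted documents pull in external DTDs or schemas.
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
            validator.setProperty(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
            validator.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) {
                    // Warnings are ignored in this sketch.
                }
                public void error(SAXParseException e) {
                    // Recoverable: record it and let validation continue.
                    problems.add(describe(e));
                }
                public void fatalError(SAXParseException e) throws SAXException {
                    throw e; // not well-formed: nothing more can be checked
                }
            });
            validator.validate(new StreamSource(new StringReader(xml)));
        } catch (Exception e) {
            problems.add("Not well-formed: " + e.getMessage());
        }
        return problems;
    }

    private static String describe(SAXParseException e) {
        return "line " + e.getLineNumber() + ", column " + e.getColumnNumber() + ": " + e.getMessage();
    }
}
```

SchemaCheck.validate returns an empty list for <body><p>Hi <strong>there</strong></p></body> and at least one entry for a document that contains a script element or an onclick attribute. The messages are written for developers, so they would probably need translating into friendlier wording before being shown to wiki users.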

Mon, 01/11/2010 - 23:23

florin

Joined: 11/01/2010

Posts: 3

Re: How best to deal with security issues and CKEditor?

Here's a product that actually validates user input against an XML schema: http://xopus.com/. This is one of their features:

Pre-validating

* Pre-validation makes it impossible to create invalid content
* Supports XML Schema (XSD)

The problem with it is that it doesn't yet support Chrome or Safari.

Does CKEditor have a similar feature, or any kind of schema validation of user input? Does anyone know of other products that have this feature but also support Chrome and Safari?

Fri, 02/26/2010 - 10:41

#10

mvl

Joined: 26/02/2010

Posts: 2

Re: How best to deal with security issues and CKEditor?

Client-side filtering is of course no replacement for input validation at the server, so it is not very useful for dealing with security issues. But there are other good reasons to filter in the browser. One is that some tags can break usability when CKEditor is used in a browser-based application. Although from a technical perspective this could be fixed by a server-side filter, from a usability perspective filtering at the client should also be supported. CKEditor has the structure to implement this. The 'paste' event does no filtering at all by default. This should be fixed to allow configuration of an 'allowed tags' list, optionally combined with a custom filter.

Sun, 05/16/2010 - 05:50

#11

jelo

Joined: 15/05/2010

Posts: 3

Re: How best to deal with security issues and CKEditor?

I agree with mvl. A client-side filter would help, because users would be notified straight away about disallowed tags.

Could you please elaborate on what exactly this means: "take the incoming data, parse it into a DOM tree and then filter out any node or attribute that it isn't whitelisted"?

On the server side I do the following to validate the submitted values:

$allowed_html="<strong><p><a><ul><ol>";
$value = mysql_real_escape_string(strip_tags(html_entity_decode($_POST['value']),$allowed_html));

I have disabled the Source button in CKEditor, but users can still type or paste in HTML tags. The editor encodes these, so I decode the value, strip all disallowed tags, make it database safe, and write it into the database. Am I missing anything essential to make it secure?

Thanks, Jens