TinyMCE is a WYSIWYG rich-text editor implemented in JavaScript. Many web developers embed TinyMCE within content management systems to give clients an easy way to create and edit content. In fact, we use TinyMCE in our own content management system. However, it is dangerous to provide users a powerful content editing tool like TinyMCE without adequate training. Invalid markup added to TinyMCE can destroy an otherwise perfect website.

I lose countless development hours correcting invalid markup created within TinyMCE. A majority of the invalid markup is created when users copy and paste text from Microsoft Word™ or Microsoft Office™; users unknowingly paste an inordinate amount of hidden Microsoft™ meta-data into TinyMCE that causes rendering errors in various web browsers (the current stable version of TinyMCE does not properly remove this meta-data even if using TinyMCE's Paste from Word feature). This is not the user's fault. It is unreasonable to expect a user to know how to write valid XHTML. I needed a solution using TinyMCE to convert user input into valid XHTML 1.0 Strict markup. After hours of research, here is what I found.

Project Setup

For this tutorial, I assume you are using the following directory structure:

[PROJECT ROOT]
-->index.html
-->styles.css
-->js/

Install TinyMCE

Download TinyMCE to your computer. Extract the TinyMCE ZIP archive into this project's js/ directory. The final directory structure should look like this:

[PROJECT ROOT]
-->index.html
-->styles.css
-->js/
---->tinymce_3_2_4_1/
------>jscripts/
-------->tiny_mce/

When you unzip the TinyMCE file, the TinyMCE directory may be named differently than shown above.

Prepare the HTML File

We will start with simple HTML and CSS files. index.html uses the XHTML 1.0 Strict doctype. It has a simple form with a textarea. styles.css contains some simple styles for TinyMCE. We will reference this CSS file later. To save time, go ahead and grab the HTML source code and CSS styles. Paste the HTML source code into index.html. Paste the CSS styles into styles.css.

Initialize TinyMCE

First, add this line into the HEAD of index.html to import the TinyMCE library.

<script src="js/tinymce_3_2_4_1/jscripts/tiny_mce/tiny_mce.js" type="text/javascript"></script>

Be sure the path to tiny_mce.js is accurate based on the TinyMCE file you downloaded and extracted. Next, we need to initialize TinyMCE. Add this code immediately below the aforementioned code.

<script type="text/javascript">
tinyMCE.init({
  mode: "textareas",
  theme: "advanced"
});
</script>

This will convert the textarea within index.html into a TinyMCE editor. You can view your progress by opening index.html in a web browser. If you are using the Safari web browser on Mac OS X, use this code instead. Be sure the Safari plugin is loaded for all subsequent examples, too.

<script type="text/javascript">
tinyMCE.init({
  mode: "textareas",
  theme: "advanced",
  plugins: "safari"
});
</script>

Our Objectives

1. Ensure client input is converted into XHTML 1.0 Strict markup
2. Remove unused classes from markup
3. Remove empty HTML elements**
4. Remove Microsoft™ meta-data
5. Encode HTML entities (<, >, &)

**We do not want to remove all empty elements. Blank div elements are sometimes used as placeholders for dynamic content. In this tutorial we will only remove empty p, em, and strong elements.

Objective 1: XHTML 1.0 Strict Markup

First, we tell the TinyMCE editor to use the XHTML 1.0 Strict doctype. To do so, we use the doctype parameter when initializing TinyMCE. The TinyMCE initialization code looks like this:

<script type="text/javascript">
tinyMCE.init({
  mode: "textareas",
  theme: "advanced",
  doctype: "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' "
    + "'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>"
});
</script>

The doctype parameter accepts a string containing the entire doctype you wish to use. We also need to define a white list of valid XHTML elements and attributes; TinyMCE removes any element or attribute that is not on the list. We define the white list by setting the valid_elements parameter to a string that follows a specific syntax. The syntax is beyond the scope of this article, but you can read more about it on the TinyMCE Wiki. I found a preset XHTML white list on TinyMCE's Wiki and tweaked it to ensure it conformed to the Strict doctype. My modified white list should be added to the TinyMCE initialization code like this:

<script type="text/javascript"> tinyMCE.init({ mode:"textareas", theme:"advanced", doctype:"<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' " + "'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>", valid_elements : "" +"a[accesskey|charset|class|coords|dir<ltr?rtl|href|hreflang|id|lang|name" +"|onblur|onclick|ondblclick|onfocus|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|rel|rev" +"|shape<circle?default?poly?rect|style|tabindex|title|type]," +"abbr[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"acronym[class|dir<ltr?rtl|id|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"address[class|align|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title]," +"area[accesskey|alt|class|coords|dir<ltr?rtl|href|id|lang|nohref<nohref" +"|onblur|onclick|ondblclick|onfocus|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup" +"|shape<circle?default?poly?rect|style|tabindex|title|target]," +"base[href|target]," +"basefont[color|face|id|size]," +"bdo[class|dir<ltr?rtl|id|lang|style|title]," +"big[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"blockquote[cite|class|dir<ltr?rtl|id|lang|onclick|ondblclick" +"|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout" +"|onmouseover|onmouseup|style|title]," +"body[class|dir<ltr?rtl|id|lang|link|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onload|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|onunload|style|title]," +"br[class|id|style|title]," +"button[accesskey|class|dir<ltr?rtl|disabled<disabled|id|lang|name|onblur" 
+"|onclick|ondblclick|onfocus|onkeydown|onkeypress|onkeyup|onmousedown" +"|onmousemove|onmouseout|onmouseover|onmouseup|style|tabindex|title|type" +"|value]," +"caption[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"cite[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"code[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"col[align<center?char?justify?left?right|char|charoff|class|dir<ltr?rtl|id" +"|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown" +"|onmousemove|onmouseout|onmouseover|onmouseup|span|style|title" +"|valign<baseline?bottom?middle?top|width]," +"colgroup[align<center?char?justify?left?right|char|charoff|class|dir<ltr?rtl" +"|id|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown" +"|onmousemove|onmouseout|onmouseover|onmouseup|span|style|title" +"|valign<baseline?bottom?middle?top|width]," +"dd[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style|title]," +"del[cite|class|datetime|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title]," +"dfn[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"dir[class|compact<compact|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title]," +"div[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," 
+"dl[class|compact<compact|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title]," +"dt[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style|title]," +"em/i[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"embed[height|src|type|width|class|contenteditable|contextmenu|dir|draggable|id|irrelevant|lang" +"|ref|registrationmark|tabindex|template|title|onabort|onbeforeunload|onblur|onchange|onclick|oncontextmenu" +"|ondblclick|ondrag|ondragend|ondragcenter|ondragleave|ondragover|ondragstart|ondrop|onerror|onfocus|onkeydown" +"|onkeypress|onkeyup|onload|onmessage|onmousedown|onmousemove|onmouseover|onmouseout|onmouseup|onmousewheel|onresize" +"|onscroll|onselect|onsubmit|onunload]," +"fieldset[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"form[accept|accept-charset|action|class|dir<ltr?rtl|enctype|id|lang" +"|method<get?post|name|onclick|ondblclick|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|onreset|onsubmit" +"|style|title]," +"h1[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"h2[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"h3[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"h4[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" 
+"|onmouseout|onmouseover|onmouseup|style|title]," +"h5[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"h6[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"head[dir<ltr?rtl|lang|profile]," +"html[dir<ltr?rtl|lang|version]," +"img[alt=''|class|dir<ltr?rtl|height" +"|id|ismap<ismap|lang|longdesc|name|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|src|style|title|usemap|width]," +"input[accept|accesskey|alt" +"|checked<checked|class|dir<ltr?rtl|disabled<disabled|id|ismap<ismap|lang" +"|maxlength|name|onblur|onclick|ondblclick|onfocus|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|onselect" +"|readonly<readonly|size|src|style|tabindex|title" +"|type<button?checkbox?file?hidden?image?password?radio?reset?submit?text" +"|usemap|value]," +"ins[cite|class|datetime|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title]," +"isindex[class|dir<ltr?rtl|id|lang|prompt|style|title]," +"kbd[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"label[accesskey|class|dir<ltr?rtl|for|id|lang|onblur|onclick|ondblclick" +"|onfocus|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout" +"|onmouseover|onmouseup|style|title]," +"legend[accesskey|class|dir<ltr?rtl|id|lang" +"|onclick|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"li[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style|title|type" +"|value]," 
+"link[charset|class|dir<ltr?rtl|href|hreflang|id|lang|media|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|rel|rev|style|title|type]," +"map[class|dir<ltr?rtl|id|lang|name|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"meta[content|dir<ltr?rtl|http-equiv|lang|name|scheme]," +"noscript[class|dir<ltr?rtl|id|lang|style|title]," +"object[archive|class|classid" +"|codebase|codetype|data|declare|dir<ltr?rtl|height|id|lang|name" +"|onclick|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|standby|style|tabindex|title|type|usemap" +"|width]," +"ol[class|compact<compact|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|start|style|title|type]," +"optgroup[class|dir<ltr?rtl|disabled<disabled|id|label|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"option[class|dir<ltr?rtl|disabled<disabled|id|label|lang|onclick|ondblclick" +"|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout" +"|onmouseover|onmouseup|selected<selected|style|title|value]," +"-p[class|dir<ltr?rtl|id|lang|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|style|title]," +"param[id|name|type|value|valuetype<DATA?OBJECT?REF]," +"pre/listing/plaintext/xmp[align|class|dir<ltr?rtl|id|lang|onclick|ondblclick" +"|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout" +"|onmouseover|onmouseup|style|title|width]," +"q[cite|class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"s[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup" 
+"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style|title]," +"samp[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"script[charset|defer|language|src|type]," +"select[class|dir<ltr?rtl|disabled<disabled|id|lang|multiple<multiple|name" +"|onblur|onchange|onclick|ondblclick|onfocus|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|size|style" +"|tabindex|title]," +"small[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"span[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title]," +"strike[class|class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" +"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title]," +"strong/b[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"style[dir<ltr?rtl|lang|media|title|type]," +"sub[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"sup[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]," +"table[bgcolor|border|cellpadding|cellspacing|class" +"|dir<ltr?rtl|frame|height|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|rules" +"|style|summary|title|width]," +"tbody[char|class|charoff|dir<ltr?rtl|id" +"|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown" +"|onmousemove|onmouseout|onmouseover|onmouseup|style|title" +"|valign<baseline?bottom?middle?top]," 
+"td[abbr|axis|bgcolor|char|charoff|class" +"|colspan|dir<ltr?rtl|headers|height|id|lang|nowrap<nowrap|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|rowspan|scope<col?colgroup?row?rowgroup" +"|style|title|valign<baseline?bottom?middle?top|width]," +"textarea[accesskey|class|cols|dir<ltr?rtl|disabled<disabled|id|lang|name" +"|onblur|onclick|ondblclick|onfocus|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|onselect" +"|readonly<readonly|rows|style|tabindex|title]," +"tfoot[char|charoff|class|dir<ltr?rtl|id" +"|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown" +"|onmousemove|onmouseout|onmouseover|onmouseup|style|title" +"|valign<baseline?bottom?middle?top]," +"th[abbr|axis|bgcolor|char|charoff|class" +"|colspan|dir<ltr?rtl|headers|height|id|lang|nowrap<nowrap|onclick" +"|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown|onmousemove" +"|onmouseout|onmouseover|onmouseup|rowspan|scope<col?colgroup?row?rowgroup" +"|style|title|valign<baseline?bottom?middle?top|width]," +"thead[char|charoff|class|dir<ltr?rtl|id" +"|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup|onmousedown" +"|onmousemove|onmouseout|onmouseover|onmouseup|style|title" +"|valign<baseline?bottom?middle?top]," +"title[dir<ltr?rtl|lang]," +"tr[abbr|bgcolor|char|charoff|class" +"|rowspan|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title|valign<baseline?bottom?middle?top]," +"tt[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style|title]," +"u[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress|onkeyup" +"|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style|title]," +"ul[class|compact<compact|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown" 
+"|onkeypress|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover" +"|onmouseup|style|title|type]," +"var[class|dir<ltr?rtl|id|lang|onclick|ondblclick|onkeydown|onkeypress" +"|onkeyup|onmousedown|onmousemove|onmouseout|onmouseover|onmouseup|style" +"|title]" }); </script>

A note for XHTML 1.0 Strict evangelists: I did include embed in the white list. Technically, embed is not part of the XHTML 1.0 Strict DTD. However, embed is necessary to support older browsers, and it is a valid element in HTML5. We now have a fully functional TinyMCE editor that enforces XHTML 1.0 Strict markup (with that one exception). However, we still have four remaining objectives.
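Because a white list this long is hard to maintain as one concatenated string, one option is to generate it from a plain object. The sketch below only relies on the two separators visible in the list above (a comma between elements, a pipe between attributes). buildValidElements and coreAttributes are hypothetical names of my own, not part of TinyMCE, and the rules shown are a small sample rather than the full list.

```javascript
// Hypothetical helper (not part of TinyMCE): builds a valid_elements string
// from a map of element rule -> array of allowed attributes.
function buildValidElements(rules) {
  return Object.keys(rules).map(function (element) {
    var attributes = rules[element];
    // Elements without attributes are written as a bare name
    return attributes.length
      ? element + "[" + attributes.join("|") + "]"
      : element;
  }).join(",");
}

// Attributes shared by most elements in the white list (assumed sample)
var coreAttributes = ["class", "dir<ltr?rtl", "id", "lang", "style", "title"];

var validElements = buildValidElements({
  "a": coreAttributes.concat(["href", "rel"]),
  "-p": coreAttributes,
  "strong/b": coreAttributes,
  "br": ["class", "id", "style", "title"]
});
```

The resulting validElements string can then be passed as the valid_elements value in place of the hand-written concatenation.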

Objective 2: Remove unused classes from markup

It is recommended that you provide a CSS file to style the contents of the TinyMCE editor. By styling the contents of the TinyMCE editor, the user will see what the final styled code will look like while using TinyMCE. We tell TinyMCE to use our CSS file by specifying the content_css parameter during TinyMCE initialization. This code looks like this:

<script type="text/javascript">
tinyMCE.init({
  mode: "textareas",
  theme: "advanced",
  doctype: "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' "
    + "'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>",
  valid_elements: "omitted for brevity",
  content_css: "styles.css"
});
</script>

NOTE: the valid_elements value is omitted for the sake of brevity. Be sure you still specify the valid elements in your own code! The value of content_css is a path to your CSS file relative to the current HTML file. Next, we tell TinyMCE to remove any class in the client's markup that is not defined in our CSS file. The code looks like this:

<script type="text/javascript">
tinyMCE.init({
  mode: "textareas",
  theme: "advanced",
  doctype: "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' "
    + "'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>",
  valid_elements: "omitted for brevity",
  content_css: "styles.css",
  verify_css_classes: true
});
</script>

For example, if a client provided this markup:

<div class="myClass">...</div>

And our CSS file did NOT include the CSS class selector .myClass{}, then TinyMCE would remove the unused class and output this code:

<div>...</div>
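To make this behavior concrete, here is a rough standalone sketch of the idea. It is not TinyMCE's implementation: stripUnknownClasses and its allowedClasses argument are hypothetical names, and the sketch assumes class attributes are written with double quotes.

```javascript
// Hypothetical helper (not part of TinyMCE): keeps only the class names found
// in allowedClasses and drops the class attribute when nothing is left.
function stripUnknownClasses(html, allowedClasses) {
  return html.replace(/\sclass="([^"]*)"/g, function (match, classList) {
    var kept = classList.split(/\s+/).filter(function (name) {
      return name !== "" && allowedClasses.indexOf(name) !== -1;
    });
    return kept.length ? ' class="' + kept.join(" ") + '"' : "";
  });
}

// Assume styles.css defines only the .intro selector
var cleaned = stripUnknownClasses('<div class="myClass">...</div>', ["intro"]);
var mixed = stripUnknownClasses('<p class="intro myClass">...</p>', ["intro"]);
```

Here cleaned loses its class attribute entirely, while mixed keeps only the intro class.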

Objective 3: Remove empty HTML elements

We will now remove empty p, em, and strong elements from the TinyMCE editor with the cleanup_callback parameter. This parameter's value is the name of a custom JavaScript function. Let's call this function myCustomCleanup and define it now.

function myCustomCleanup(type,value){}

This function accepts two parameters: the type of callback (ignored in this tutorial, but you can read more about this parameter on the TinyMCE Wiki), and the final HTML markup of the TinyMCE editor. Let's further define the implementation of the myCustomCleanup function.

function myCustomCleanup(type, value) {
  value = value + ""; // Ensure value is a string
  // Remove p, em, and strong elements that contain nothing but whitespace
  return value.replace(/<(p|em|strong)\b[^>]*>\s*<\/\1>/ig, "");
}

This function uses a regular expression to remove all empty p, em, and strong elements from the TinyMCE editor's markup. We tell TinyMCE to call our custom cleanup method during TinyMCE initialization. Our new TinyMCE initialization code looks like this:

<script type="text/javascript">
tinyMCE.init({
  mode: "textareas",
  theme: "advanced",
  doctype: "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' "
    + "'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>",
  valid_elements: "omitted for brevity",
  content_css: "styles.css",
  verify_css_classes: true,
  cleanup_callback: "myCustomCleanup"
});
</script>
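One thing a single regular-expression pass cannot handle is nesting: removing an empty strong from inside a p leaves an empty p behind. A possible variation, sketched here under the hypothetical name removeEmptyElements, repeats the replacement until the markup stops changing. It can be tried in any JavaScript console without loading TinyMCE.

```javascript
// Hypothetical variation on the cleanup step: repeats the replacement so that
// elements which become empty after an inner removal are removed as well.
function removeEmptyElements(html) {
  var pattern = /<(p|em|strong)\b[^>]*>\s*<\/\1>/ig;
  var previous;
  do {
    previous = html;
    html = html.replace(pattern, "");
  } while (html !== previous);
  return html;
}

var flat = removeEmptyElements('<p>Hello</p><p> </p><em></em><div></div>');
var nested = removeEmptyElements('<p class="x"><strong></strong></p><p>Kept</p>');
```

In this sketch flat keeps the paragraph with text and the empty div, and nested keeps only the second paragraph.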

Objective 4: Remove Microsoft™ Meta-data

From my own experience, most Microsoft Word™ and Microsoft Office™ meta-data is contained within HTML comments. We will define a new function to remove HTML comments from the TinyMCE editor markup. This function looks like this:

// Citation: http://www.faqts.com/knowledge_base/view.phtml/aid/21761/fid/53
function removeHtmlComments(source) {
  var html = source + ""; // Ensure source is a string
  var regX = /<(?:!(?:--[\s\S]*?--\s*)?(>)\s*|(?:script|style|SCRIPT|STYLE)[\s\S]*?<\/(?:script|style|SCRIPT|STYLE)>)/g;
  // $1 is set only when a comment matched; script and style blocks are returned unchanged
  return html.replace(regX, function (m, $1) {
    return $1 ? "" : m;
  });
}

This method accepts the final HTML markup from the TinyMCE editor and removes all single and multi-line HTML comments except those within script and style elements. We now add this function to our myCustomCleanup method so it is called by TinyMCE. Our new myCustomCleanup method looks like this:

function myCustomCleanup(type, value) {
  value = value + ""; // Ensure value is a string
  // Remove p, em, and strong elements that contain nothing but whitespace
  value = value.replace(/<(p|em|strong)\b[^>]*>\s*<\/\1>/ig, "");
  return removeHtmlComments(value);
}

Note: You can force TinyMCE to run the cleanup_callback function at any time by clicking the Broom icon in the TinyMCE editor toolbar.
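To sanity-check the comment stripping outside the editor, the function can be run against a string that imitates pasted Word markup. The snippet below repeats removeHtmlComments so that it runs on its own; the sample input is invented for illustration and is far shorter than what Word really produces.

```javascript
// Repeated from above so this snippet is self-contained
function removeHtmlComments(source) {
  var html = source + "";
  var regX = /<(?:!(?:--[\s\S]*?--\s*)?(>)\s*|(?:script|style|SCRIPT|STYLE)[\s\S]*?<\/(?:script|style|SCRIPT|STYLE)>)/g;
  return html.replace(regX, function (m, $1) {
    return $1 ? "" : m;
  });
}

// Invented sample: a Word-style conditional comment plus a style block
// that contains a comment of its own
var pasted = '<p>Text</p>'
  + '<!--[if gte mso 9]><xml>junk</xml><![endif]-->'
  + '<style>p{}<!-- keep --></style>';

var stripped = removeHtmlComments(pasted);
```

With this input the conditional comment disappears, while the comment inside the style element is left alone.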

Objective 5: Encode HTML Entities

Last, we ensure characters like <, >, and & are encoded into their HTML entity equivalents. For example, & will become &amp;. To do this, we specify the entity_encoding parameter during TinyMCE initialization. Our TinyMCE initialization code looks like this:

<script type="text/javascript">
tinyMCE.init({
  mode: "textareas",
  theme: "advanced",
  doctype: "<!DOCTYPE html PUBLIC '-//W3C//DTD XHTML 1.0 Strict//EN' "
    + "'http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd'>",
  valid_elements: "omitted for brevity",
  content_css: "styles.css",
  verify_css_classes: true,
  cleanup_callback: "myCustomCleanup",
  entity_encoding: "named"
});
</script>
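For readers who want to see what this encoding amounts to for the three characters mentioned above, here is a minimal standalone sketch. encodeBasicEntities is a hypothetical name and this is not how TinyMCE implements entity_encoding; it only illustrates the substitution for text content.

```javascript
// Hypothetical illustration (not TinyMCE code): encodes &, <, and > as
// named entities. The ampersand is replaced first so that the entities
// produced by the later replacements are not encoded a second time.
function encodeBasicEntities(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

var encoded = encodeBasicEntities("Fish & Chips <b>");
```

The order of the replacements matters: encoding the ampersand last would turn &lt; into &amp;lt;.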

The Final Result

We now have a TinyMCE installation that produces valid XHTML 1.0 Strict markup regardless of client input. It also removes unused classes from markup, deletes empty HTML elements, strips Microsoft™ meta-data, and encodes HTML entities! I hope this tutorial gets you moving in the right direction. This tutorial is not meant to be a final implementation. I am still tweaking this implementation to produce even better results. If you see room for improvement, kindly let me know by posting a comment. I'd love to hear your feedback.

Download a ZIP archive containing the final code