Along with the upgrade to Rails 3, there have been significant changes and improvements to our HTML sanitizing and parsing in Release 0.8.2. These changes should make things clearer for authors and much faster for readers!
Here is a quick breakdown for those who just want the highlights, followed by a more detailed explanation of what was changed and how it all works.
Highlights
- Blank lines and carriage returns will now be converted to paragraph (<p></p>) and line-break (<br />) tags in the text editor.
- The text will automatically be parsed and "cleaned up" -- any tags that were left open get closed, any mis-nested tags get fixed, etc.
- The text will be sanitized, to remove any elements that are potentially harmful to our server.
- This change fixes the known bug where switching from HTML mode to Rich Text mode causes all your paragraphs to disappear. (Yay!)
- This change will also allow users to embed video from YouTube, Vimeo, blip.tv, Dailymotion, Viddler, Metacafe, and 4shared. (Yay!)
What's Behind the Scenes
The new back end for content works in three steps.
- There is now a paragraph-adder that converts blank lines and carriage returns into paragraph tags (<p></p>) and break tags (<br />) based on a few simple rules:
  - Two pieces of text separated by a blank line will be made into separate paragraphs:
    Here is paragraph one.

    Here is paragraph two.
    will become:
    <p>Here is paragraph one.</p>
    <p>Here is paragraph two.</p>
  - A carriage return or newline in the middle of text will add a break tag:
    Here is a line
    with a carriage return in the middle.
    will become:
    Here is a line <br />
    with a carriage return in the middle.
  - We will also preserve extra blank lines -- if you have TWO blank lines in a row, we will add in an empty paragraph:
    Here is paragraph one, and I want extra space after it.


    Here is paragraph two.
    will become:
    <p>Here is paragraph one, and I want extra space after it.</p>
    <p> </p>
    <p>Here is paragraph two.</p>
  - Note: The paragraph-adder will put <br /> tags at the end of each line whenever there is a carriage return, even in things like lists. So a nice chunk of HTML in your story that you coded up by hand like this will get a <br /> tag at the end of each line:
    <ul>
    <li>Item one.</li>
    <li>Item two.</li>
    </ul>
    You can avoid having <br /> tags added by putting the list into a single line with no carriage returns instead:
    <ul><li>Item one.</li><li>Item two.</li></ul>
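For readers who like to see rules as code, here is a minimal Ruby sketch of the paragraph-adder rules described above. It is not the Archive's actual code: the method name add_paragraphs, the filler used for the empty paragraph, and the choice to wrap every block of text in <p> tags are all assumptions made for illustration.

```ruby
# Hypothetical sketch of the paragraph-adder rules; not the Archive's actual code.
def add_paragraphs(text)
  empty_paragraph = "<p> </p>" # assumed filler for each extra blank line
  # Wrap a group of lines in <p>, joining them with <br /> tags.
  wrap = ->(group) { "<p>" + group.join("<br />\n") + "</p>" }

  # Normalize Windows (\r\n) and bare carriage-return (\r) line endings.
  lines = text.gsub(/\r\n?/, "\n").strip.split("\n", -1)

  output = []
  current = []   # lines of the paragraph being built
  blank_run = 0  # blank lines seen since the last line of text

  lines.each do |line|
    if line.strip.empty?
      blank_run += 1
      next
    end
    if blank_run > 0
      # A blank line ends the previous paragraph.
      output << wrap.call(current)
      current = []
      # Every blank line beyond the first becomes an empty paragraph.
      (blank_run - 1).times { output << empty_paragraph }
      blank_run = 0
    end
    current << line.strip
  end

  output << wrap.call(current) unless current.empty?
  output.join("\n")
end

puts add_paragraphs("Here is paragraph one.\n\nHere is paragraph two.")
```

Under these assumptions, the call at the end prints the two paragraphs each wrapped in their own <p> tags, and a single newline inside a block of text turns into a <br /> tag.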
The next step is a Ruby gem (basically a kind of plugin) called Nokogiri, which parses the text and gives it back to us as a well-formed chunk of XHTML. Among other things, this means that:
- any tags that were left open get closed
- any mis-nested tags get fixed (e.g., if you do <strong><em>foo!</strong></em>, Nokogiri will turn that into the correct version, <strong><em>foo!</em></strong>)
- any attribute values that aren't properly in quotes get fixed
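To give a feel for what "closing open tags" and "fixing mis-nested tags" involves, here is a deliberately simplified Ruby sketch that keeps a stack of open tags. It does not use Nokogiri and is not how Nokogiri works internally; a real parser also handles attributes, entities, comments, and many edge cases that this ignores. The method name balance_tags and the short list of self-closing tags are assumptions for illustration.

```ruby
# Simplified illustration only; this is NOT Nokogiri and skips most of
# what a real HTML parser does (attribute quoting, entities, comments...).
def balance_tags(html)
  void_tags = %w[br hr img] # assumed list of tags that never need closing
  open_stack = []           # names of tags opened but not yet closed
  output = []

  # Split the input into tags and runs of plain text.
  html.scan(/<[^>]+>|[^<]+/) do |token|
    if (m = token.match(%r{\A</\s*([a-zA-Z][a-zA-Z0-9]*)\s*>\z}))
      name = m[1].downcase
      # A closing tag with no matching open tag is dropped.
      next unless open_stack.include?(name)
      # Close everything opened after the matching tag, then the tag itself.
      loop do
        top = open_stack.pop
        output << "</#{top}>"
        break if top == name
      end
    elsif (m = token.match(/\A<([a-zA-Z][a-zA-Z0-9]*)\b[^>]*>\z/))
      name = m[1].downcase
      open_stack << name unless void_tags.include?(name) || token.end_with?("/>")
      output << token
    else
      output << token # plain text passes through unchanged
    end
  end

  # Close anything still left open at the end of the text.
  output << "</#{open_stack.pop}>" until open_stack.empty?
  output.join
end

puts balance_tags("<strong><em>foo!</strong></em>")
# prints <strong><em>foo!</em></strong>
```

The stack is what makes the nesting come out right: tags are always closed in the reverse of the order they were opened.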
Finally, we use the gem Sanitize to clean up this XHTML and take out anything that is legal but not necessarily safe. Sanitize uses a whitelist, meaning that only the tags and attributes we specifically allow are let through. It's very customizable, and we have been able to write special rules for Sanitize to safely allow video embeds from specific sites (currently YouTube, Vimeo, blip.tv, Dailymotion, Viddler, Metacafe, and 4shared). Once Sanitize is done, the final version is saved into the database.
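The whitelist idea behind the embed rules can be sketched in plain Ruby as a check on where an embed's source URL points. This is not Sanitize's API and not the Archive's actual rule set: the method name allowed_embed? and the exact hostnames are assumptions based on the site names listed above.

```ruby
require "uri"

# Hypothetical whitelist check for embed sources; the hostnames below are
# assumed from the site names in the post, not taken from the real rules.
def allowed_embed?(src)
  allowed_hosts = %w[
    youtube.com vimeo.com blip.tv dailymotion.com
    viddler.com metacafe.com 4shared.com
  ]
  uri = URI.parse(src)
  # Only plain web URLs qualify; this rejects things like javascript: links.
  return false unless %w[http https].include?(uri.scheme)

  host = uri.host.to_s.downcase
  # Accept the listed host itself or any subdomain of it (e.g. www.).
  allowed_hosts.any? { |allowed| host == allowed || host.end_with?(".#{allowed}") }
rescue URI::InvalidURIError
  false # anything that is not a parseable URL is refused
end

puts allowed_embed?("http://www.youtube.com/embed/example") # prints true
puts allowed_embed?("http://evil.example/youtube.com")      # prints false
```

The key design point of a whitelist is that anything not explicitly listed is refused by default, so a new or unknown site cannot slip through.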
There is plenty of documentation available for Nokogiri and Sanitize on their respective sites.
What you see when editing
- If you are working in a field that allows you to use the Rich Text Editor (like content in the Post New Work form), the <p> and <br /> tags will show. Without them, switching to the Rich Text Editor would do that horrible thing where your whitespace disappears and your text all runs together into one giant blob!
- If you manually put in <p> tags with extra attributes on them, like "<p align=center>", those tags will show.
- However, the <p> and <br /> tags will not show when you edit fields like notes and summary, where there is no option to use the Rich Text Editor.
Here's an example of how the tags will look on content in the Post New Work form:

