rails/rails-html-sanitizer
This gem is responsible for sanitizing HTML fragments in Rails applications. Specifically, this is the set of sanitizers used to implement the Action View `SanitizerHelper` methods `sanitize`, `sanitize_css`, `strip_tags` and `strip_links`.
Rails HTML Sanitizer is only intended to be used with Rails applications. If you need similar functionality but aren't using Rails, consider using the underlying sanitization library Loofah directly.
All sanitizers respond to `sanitize`, and are available in variants that use either HTML4 or HTML5 parsing, under the `Rails::HTML4` and `Rails::HTML5` namespaces, respectively.
NOTE: The HTML5 sanitizers are not supported on JRuby. Users may programmatically check for support by calling `Rails::HTML::Sanitizer.html5_support?`.

    full_sanitizer = Rails::HTML5::FullSanitizer.new
    full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
    # => Bold no more! See more here...

or, if you insist on parsing the content as HTML4:

    full_sanitizer = Rails::HTML4::FullSanitizer.new
    full_sanitizer.sanitize("<b>Bold</b> no more! <a href='more.html'>See more here</a>...")
    # => Bold no more! See more here...

    link_sanitizer = Rails::HTML5::LinkSanitizer.new
    link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
    # => Only the link text will be kept.

or, if you insist on parsing the content as HTML4:

    link_sanitizer = Rails::HTML4::LinkSanitizer.new
    link_sanitizer.sanitize('<a href="example.com">Only the link text will be kept.</a>')
    # => Only the link text will be kept.

The `SafeListSanitizer` is also available as an HTML4 variant, but for simplicity we'll document only the HTML5 variant below.

    safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new

    # sanitize via an extensive safe list of allowed elements
    safe_list_sanitizer.sanitize(@article.body)

    # sanitize only the supplied tags and attributes
    safe_list_sanitizer.sanitize(@article.body, tags: %w(table tr td), attributes: %w(id class style))

    # sanitize via a custom scrubber
    safe_list_sanitizer.sanitize(@article.body, scrubber: ArticleScrubber.new)

    # prune nodes from the tree instead of stripping tags and leaving inner content
    safe_list_sanitizer = Rails::HTML5::SafeListSanitizer.new(prune: true)

    # the sanitizer can also sanitize css
    safe_list_sanitizer.sanitize_css('background-color: #000;')

Scrubbers are objects responsible for removing nodes or attributes you don't want in your HTML document.
This gem includes two scrubbers: `Rails::HTML::PermitScrubber` and `Rails::HTML::TargetScrubber`.
`Rails::HTML::PermitScrubber` allows you to permit only the tags and attributes you want.

    scrubber = Rails::HTML::PermitScrubber.new
    scrubber.tags = ['a']

    html_fragment = Loofah.fragment('<a><img/ ></a>')
    html_fragment.scrub!(scrubber)
    html_fragment.to_s # => "<a></a>"

By default, inner content is left, but it can be removed as well.

    scrubber = Rails::HTML::PermitScrubber.new
    scrubber.tags = ['a']

    html_fragment = Loofah.fragment('<a><span>text</span></a>')
    html_fragment.scrub!(scrubber)
    html_fragment.to_s # => "<a>text</a>"

    scrubber = Rails::HTML::PermitScrubber.new(prune: true)
    scrubber.tags = ['a']

    html_fragment = Loofah.fragment('<a><span>text</span></a>')
    html_fragment.scrub!(scrubber)
    html_fragment.to_s # => "<a></a>"

Where `PermitScrubber` picks out tags and attributes to permit in sanitization, `Rails::HTML::TargetScrubber` targets them for removal. See https://github.com/flavorjones/loofah/blob/main/lib/loofah/html5/safelist.rb for the tag list.
Note: by default, it will scrub anything that is not part of the permitted tags from loofah `HTML5::Scrub.allowed_element?`.

    scrubber = Rails::HTML::TargetScrubber.new
    scrubber.tags = ['img']

    html_fragment = Loofah.fragment('<a><img/ ></a>')
    html_fragment.scrub!(scrubber)
    html_fragment.to_s # => "<a></a>"

Similarly to `PermitScrubber`, nodes can be fully pruned.

    scrubber = Rails::HTML::TargetScrubber.new
    scrubber.tags = ['span']

    html_fragment = Loofah.fragment('<a><span>text</span></a>')
    html_fragment.scrub!(scrubber)
    html_fragment.to_s # => "<a>text</a>"

    scrubber = Rails::HTML::TargetScrubber.new(prune: true)
    scrubber.tags = ['span']

    html_fragment = Loofah.fragment('<a><span>text</span></a>')
    html_fragment.scrub!(scrubber)
    html_fragment.to_s # => "<a></a>"

You can also create custom scrubbers in your application.

    class CommentScrubber < Rails::HTML::PermitScrubber
      def initialize
        super
        self.tags = %w( form script comment blockquote )
        self.attributes = %w( style )
      end

      def skip_node?(node)
        node.text?
      end
    end

See the `Rails::HTML::PermitScrubber` documentation to learn more about which methods can be overridden.
Using the `CommentScrubber` from above, you can use this in a Rails view like so:

    <%= sanitize @comment, scrubber: CommentScrubber.new %>

Rails HTML sanitizers are intended to be used by the view layer, at page-render time. They are not intended to sanitize persisted strings that will be sanitized again at page-render time.
Proper HTML sanitization will replace some characters with HTML entities. For example, text containing a `<` character will be updated to contain `&lt;` to ensure that the markup is well-formed.
This is important to keep in mind because HTML entities will render improperly if they are sanitized twice.
Imagine the user is asked to enter their employer's name, which will appear on their public profile page. Then imagine they enter `JPMorgan Chase & Co.`.
If you sanitize this before persisting it in the database, the stored string will be `JPMorgan Chase &amp; Co.`
When the page is rendered, if this string is sanitized a second time by the view layer, the HTML will contain `JPMorgan Chase &amp;amp; Co.`, which will render as "JPMorgan Chase &amp; Co.".
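The double-encoding effect can be reproduced with Ruby's standard library alone. The sketch below uses `CGI.escapeHTML` as a stand-in for the entity escaping a sanitizer performs; that substitution is an assumption made for illustration, not a description of how this gem works internally. The variable names are likewise illustrative.

```ruby
require "cgi"

raw_input = "JPMorgan Chase & Co."

# First pass: escaping before persisting (stand-in for sanitizing too early).
stored = CGI.escapeHTML(raw_input)

# Second pass: the view layer escapes the already-escaped string.
rendered_html = CGI.escapeHTML(stored)

# A browser decodes entities exactly once when displaying the page.
displayed = CGI.unescapeHTML(rendered_html)

puts stored        # JPMorgan Chase &amp; Co.
puts rendered_html # JPMorgan Chase &amp;amp; Co.
puts displayed     # JPMorgan Chase &amp; Co.
```

Escaping only once, at render time, leaves `displayed` equal to the raw input.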
Another problem that can arise is rendering the sanitized string in a non-HTML context (for example, if it ends up being part of an SMS message). In this case, it may contain inappropriate HTML entities.
You might simply choose to persist the untrusted string as-is (the raw input), and then ensure that the string will be properly sanitized by the view layer.
That raw string, if rendered in a non-HTML context (like SMS), must also be sanitized by a method appropriate for that context. You may wish to look into using Loofah or Sanitize to customize how this sanitization works, including omitting HTML entities in the final string.
If you really want to sanitize the string that's stored in your database, you may wish to look into Loofah::ActiveRecord rather than use the Rails HTML sanitizers.
In versions < 1.6, the only module defined by this library was `Rails::Html`. Starting in 1.6, we define three additional modules:

- `Rails::HTML` for general functionality (replacing `Rails::Html`)
- `Rails::HTML4` containing sanitizers that parse content as HTML4
- `Rails::HTML5` containing sanitizers that parse content as HTML5 (if supported)

The following aliases are maintained for backwards compatibility:

- `Rails::Html` points to `Rails::HTML`
- `Rails::HTML::FullSanitizer` points to `Rails::HTML4::FullSanitizer`
- `Rails::HTML::LinkSanitizer` points to `Rails::HTML4::LinkSanitizer`
- `Rails::HTML::SafeListSanitizer` points to `Rails::HTML4::SafeListSanitizer`

Add this line to your application's Gemfile:

    gem 'rails-html-sanitizer'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install rails-html-sanitizer

| branch | ruby support | actively maintained | security support |
|---|---|---|---|
| 1.6.x | >= 2.7 | yes | yes |
| 1.5.x | >= 2.5 | no | while Rails 6.1 is in security support |
| 1.4.x | >= 1.8.7 | no | no |
Loofah is what underlies the sanitizers and scrubbers of rails-html-sanitizer.
The `node` argument passed to some methods in a custom scrubber is an instance of `Nokogiri::XML::Node`.
Rails HTML Sanitizers is the work of many contributors. You're encouraged to submit pull requests, propose features and discuss issues.
See CONTRIBUTING.
Trying to report a possible security vulnerability in this project? Please check out the Rails project's security policy for instructions.
Rails HTML Sanitizers is released under the MIT License.