HTML parser used internally by the sanitize_html() function.
(collections.Set) The set of disallowed URI schemes, e.g. javascript:.
(re.RegexObject) The regular expression pattern that matches disallowed CSS properties.
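As a rough illustration of how a scheme set and a style pattern like these might be used, here is a minimal sketch. The constant values and the helper names `has_disallowed_scheme` and `remove_disallowed_styles` are assumptions made for this example; the actual values are not given in this reference.

```python
import re

# Hypothetical values, assumed for illustration only.
DISALLOWED_SCHEMES = frozenset(['javascript:', 'vbscript:', 'data:'])

# Matches a whole ``display: none`` declaration at the start of the style
# string or directly after a semicolon.
DISALLOWED_STYLE_PATTERN = re.compile(
    r'(?:^|(?<=;))\s*display\s*:\s*none\s*(?:;|$)',
    re.IGNORECASE,
)


def has_disallowed_scheme(uri):
    """Return True if ``uri`` starts with one of the disallowed schemes."""
    normalized = uri.strip().lower()
    return normalized.startswith(tuple(DISALLOWED_SCHEMES))


def remove_disallowed_styles(style):
    """Drop the declarations matched by the pattern from a style string."""
    return DISALLOWED_STYLE_PATTERN.sub('', style).strip()
```

A caller could use the first helper to decide whether to drop an `href` value and the second to clean a `style` attribute before writing it back out.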
HTML parser used internally by the clean_html() function.
Strip all markup tags from the html string. In other words, it converts the given html document to plain text.
Parameters: | html (str) – html string to clean |
---|---|
Returns: | cleaned plain text |
Return type: | str |
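The tag-stripping behaviour described above can be approximated with the standard library's `html.parser.HTMLParser`. This is a minimal sketch, not the library's own implementation; `TagStripper` and `strip_tags` are hypothetical names chosen for the example.

```python
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical parser that keeps text nodes and drops every tag."""

    def __init__(self):
        # convert_charrefs=True turns references such as &amp; into text.
        super().__init__(convert_charrefs=True)
        self.fragments = []

    def handle_data(self, data):
        # Called for text between tags; the tags themselves are ignored
        # because no start/end tag handlers are overridden.
        self.fragments.append(data)


def strip_tags(html):
    """Return the text content of ``html`` with all markup removed."""
    parser = TagStripper()
    parser.feed(html)
    parser.close()
    return ''.join(parser.fragments)
```

Note that this sketch also keeps the text inside `<script>` and `<style>` elements, since the parser reports it as ordinary data; a stricter version would need to track those elements and skip their content.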
Sanitize the given html string. It removes tags and attributes that are neither secure nor useful for RSS reader layout.
Parameters: | html (str) – html string to sanitize |
---|---|
Returns: | sanitized html string |
Return type: | str |
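To make the idea of sanitizing concrete, here is a minimal sketch built on the standard library's `html.parser.HTMLParser`. It is not the library's implementation. The rules it applies (dropping `<script>` elements, `on*` event attributes, and `href`/`src` values with a disallowed scheme) are assumptions for illustration, as are the names `SketchSanitizer`, `sanitize` and `ASSUMED_SCHEMES`.

```python
from html import escape
from html.parser import HTMLParser

# Assumed scheme list, for illustration only.
ASSUMED_SCHEMES = ('javascript:', 'vbscript:', 'data:')


class SketchSanitizer(HTMLParser):
    """Hypothetical sanitizer that re-emits only the markup it keeps."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.in_script = False  # True while inside a <script> element

    def handle_starttag(self, tag, attrs):
        if tag == 'script':
            self.in_script = True
            return
        kept = []
        for name, value in attrs:
            if name.startswith('on'):
                continue  # event handler attribute such as onclick
            if name in ('href', 'src') and value is not None:
                if value.strip().lower().startswith(ASSUMED_SCHEMES):
                    continue  # drop URIs with a disallowed scheme
            if value is None:
                kept.append(' ' + name)
            else:
                kept.append(' {}="{}"'.format(name, escape(value, quote=True)))
        self.out.append('<{}{}>'.format(tag, ''.join(kept)))

    def handle_endtag(self, tag):
        if tag == 'script':
            self.in_script = False
            return
        self.out.append('</{}>'.format(tag))

    def handle_data(self, data):
        if not self.in_script:
            # Text was unescaped by the parser, so escape it again.
            self.out.append(escape(data, quote=False))


def sanitize(html):
    """Return ``html`` with the assumed insecure parts removed."""
    parser = SketchSanitizer()
    parser.feed(html)
    parser.close()
    return ''.join(parser.out)
```

Unlike the tag-stripping function, this approach returns HTML: markup that passes the checks is written back out, and only the rejected tags and attributes disappear.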