Feature goals compared to Bleach (/ full ammonia API)? #10
Comments
Neither feature parity with Bleach nor full ammonia API is the goal for |
Others will likely opine but FWIW looking at our current use of bleach, Ammonia seems to support most of what we need (though some of it by hand-rolling through the ultra generic
We also customise the serialization compared to the bleach default (which is just html5lib's as far as I can tell), but html5ever doesn't seem to have tuning knobs there (or at least not any which is relevant to what we configured) so there's definitely nothing you could do. FWIW for the first two items nh3 might be able to provide bespoke whitelists from which it'd compose the relevant |
Added |
Added |
❤️ |
Bleach offers the ability to pass |
It's not entirely clear how to add And is it possible to rename an element, for example |
I don't think you can do either of those via Ammonia (and thus nh3). For the second request in particular, it's not a general-purpose HTML-rewriting device. You can see all the operations Ammonia supports at Your first request would, I think, be rust-ammonia/ammonia#163. |
Would it be possible to allow a
I'm trying to sanitize data from CKEditor5 and multiple tags can for example contain |
nh3 is a thin layer over ammonia so it's limited to what ammonia provides. For But there's no such Unless you can get Ammonia to add a |
Ah, right, I missed that in the Rust docs.
Luckily performance is not an issue for me, so I'll do it myself.
That is fair enough. I assume I'd implement that by checking
Does this filter apply before or after cleaning?
Thank you for the quick reply.
Edit: I've resorted to just implementing a loop to update the tag_attribute_values with a tag whitelist:
for tag in tag_whitelist:
    tag_attribute_values.update([
        (tag, tag_attribute_values['*']),
    ])
|
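The loop above can be sketched as a small standalone helper. This is only an illustration: the name `expand_wildcard` and the sample rules are hypothetical, and it assumes the mapping has the shape tag -> attribute -> set of allowed values. Unlike the original loop, it merges the shared `'*'` rules into any existing per-tag rules instead of overwriting them, and it drops the `'*'` key from the result.

```python
from typing import Dict, Iterable, Set

Rules = Dict[str, Dict[str, Set[str]]]


def expand_wildcard(tag_attribute_values: Rules, tag_whitelist: Iterable[str]) -> Rules:
    """Return a copy in which every whitelisted tag also gets the rules stored under '*'."""
    shared = tag_attribute_values.get("*", {})
    # Copy the per-tag rules so the caller's mapping is not mutated.
    expanded: Rules = {
        tag: {attr: set(values) for attr, values in rules.items()}
        for tag, rules in tag_attribute_values.items()
        if tag != "*"
    }
    for tag in tag_whitelist:
        per_tag = expanded.setdefault(tag, {})
        for attr, values in shared.items():
            # Merge rather than overwrite, so tag-specific values survive.
            per_tag.setdefault(attr, set()).update(values)
    return expanded


# Hypothetical sample data, for illustration only.
rules = {"*": {"class": {"text-big", "text-small"}}, "a": {"target": {"_blank"}}}
expanded = expand_wildcard(rules, ["p", "a"])
```

The result could then be passed wherever the per-tag mapping is expected. Whether a `'*'` entry needs to be kept or removed for the sanitizer in use is something to check against its documentation.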
One thing from bleach that I'm missing (or maybe I'm just missing it in the docs) is the |
Isn't that just
>>> bleach.clean('<div><foo>xxx</foo></div>', tags={'div'})
'<div>&lt;foo&gt;xxx&lt;/foo&gt;</div>'
>>> bleach.clean('<div><foo>xxx</foo></div>', tags={'div'}, strip=True)
'<div>xxx</div>'
>>> nh3.clean('<div><foo>xxx</foo></div>', tags={'div'})
'<div>xxx</div>'
|
That makes it unusable for use cases where you want to treat the string as plaintext where people may be writing stuff that looks like HTML (i.e. it has |
|
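For the pure-plaintext case described above, one possible workaround is to escape the input with the standard library instead of sanitizing it. This is a sketch, not an nh3 feature: the helper name `as_plaintext` is hypothetical, and unlike Bleach's default behaviour it escapes every tag, allowed or not.

```python
import html


def as_plaintext(user_input: str) -> str:
    """Escape all markup so the input renders as literal text."""
    # quote=True also escapes double and single quotes, which matters
    # if the result ends up inside an attribute value.
    return html.escape(user_input, quote=True)


escaped = as_plaintext("<div><foo>xxx</foo></div>")
print(escaped)  # &lt;div&gt;&lt;foo&gt;xxx&lt;/foo&gt;&lt;/div&gt;
```

This only helps when no markup at all should survive. It does not reproduce the mixed behaviour of keeping allowed tags while escaping the rest.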
Is there an equivalent for |
No. |
Discussions are not enabled so opening it here, sorry 'bout it.
With the recent deprecation of bleach (mostly on grounds of html5lib being unmaintained), unless someone has the time to e.g. rebuild the html5lib API on top of an existing html5 parser and the maintainer of bleach decides to use that, ammonia/nh3 seems well positioned as a migration target (there's already one package which has done that visible from the linked Bleach PR).
One issue there is that nh3 currently provides rather limited tuning knobs compared to Ammonia and Bleach (not sure how the two relate as I have not looked yet), but the README doesn't really say what your eventual goals would be on that front as maintainer. If you do aim to favor such support & migration, maybe an issue or even project (kanban) about full Ammonia support and / or Bleach feature parity (if not API compatibility) could be a consideration?
Another possible issue (though more internal) concerns exposing customisations which allow arbitrary callables (attribute_filter seems to be the only one currently): nh3 currently releases the GIL during cleanup, which wouldn't allow calling Python functions, and thus exposing a generic attribute_filter. I don't know whether Ammonia has parallelism built in or how much you care about parallel cleaning (though I figure having two paths and only keeping the GIL if callbacks were actually provided would always be an option, if a somewhat more annoying one).
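To make the "arbitrary callable" concrete, here is a minimal sketch of what such a Python-side filter might look like. It assumes the callback receives the tag name, attribute name and attribute value and returns either a replacement value or None to drop the attribute, mirroring the shape of Ammonia's attribute_filter. The function name and the allowed class names are hypothetical, and whether a given nh3 version accepts such a callable should be checked against its documentation.

```python
from typing import Optional

# Hypothetical whitelist, for illustration only.
ALLOWED_CLASSES = {"text-big", "text-small"}


def keep_allowed_classes(tag: str, attribute: str, value: str) -> Optional[str]:
    """Keep only whitelisted class names; leave every other attribute untouched."""
    if attribute != "class":
        return value
    kept = [name for name in value.split() if name in ALLOWED_CLASSES]
    # Returning None signals that the attribute should be removed entirely.
    return " ".join(kept) if kept else None
```

Because a function like this runs Python code for every attribute, a binding that calls it has to hold the GIL for the duration of the call, which is the trade-off raised above.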