etree.strip_tags() does not remove all instances of a defined tag
I'm a little hesitant to file a bug report: this seems like a pretty obvious flaw in strip_tags(), so it's quite likely I'm doing something wrong.
I'm running the following Python code with lxml 2.2.2 and libxml2 2.7.6 on FreeBSD 7.2:
from lxml import etree
html = """
<div>
<div>
I like <strong>
<br/>
I like lots of <strong>
<br/>
Click <a href="www.
<br/>
</div>
</div>
"""
element = etree.fromstrin
etree.strip_
print etree.tostring(
which prints:
<div>
<div>
I like <strong>
I like lots of <strong>
</div>
</div>
I would expect *all* the "br" and "a" tags to be removed.
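A minimal sketch of how that expectation could be checked. It assumes a small made-up document (not the one above), and the names `sample` and `leftover` are only for illustration. If strip_tags() behaves the way I expect, `leftover` should come out empty and the text that followed each stripped tag should still be there.

```python
from lxml import etree

# Hypothetical sample document, not the one from the question.
sample = (
    "<div><div>"
    "I like <strong>beer</strong>.<br/>"
    "Click <a href='http://example.com/'>here</a>.<br/>"
    "</div></div>"
)

root = etree.fromstring(sample)

# strip_tags() should drop the named elements themselves while
# merging their text and tail into the surrounding content.
etree.strip_tags(root, "br", "a")

# Collect every element that still carries one of the stripped tag names.
leftover = [el.tag for el in root.iter() if el.tag in ("br", "a")]

print(leftover)
print(etree.tostring(root).decode("ascii"))
```

A non-empty `leftover` list would reproduce the problem described here without having to read the serialized output by eye.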
Another example, use "etree.
<div>
<div>
I like beer.
I like lots of <strong>
</div>
</div>
Again, I would expect all the specified tags to be stripped. Surely this can't be correct behavior?
Thanks
Question information
- Language:
- English
- Status:
- Solved
- For:
- lxml
- Assignee:
- No assignee
- Solved by:
- phatfish