java - How to keep link title attribute with jsoup?
Using Jsoup.clean(), jsoup turns the title attribute of an HTML link from:
<a href="" title="test &lt;br /&gt;">test</a>
into:
<a href="" title="test <br />">test</a>
This is the demo application:
    import org.jsoup.Jsoup;
    import org.jsoup.safety.Whitelist;

    // Allow only <a> tags with href and title attributes.
    Whitelist whitelist = new Whitelist();
    whitelist.addTags("a");
    whitelist.addAttributes("a", "href", "title");
    String input = "<a href=\"\" title=\"test &lt;br /&gt;\">test</a>";
    System.out.println("input: " + input);
    String output = Jsoup.clean(input, whitelist);
    System.out.println("output: " + output);
which prints:
input: <a href="" title="test &lt;br /&gt;">test</a>
output: <a href="" title="test <br />">test</a>
I tried adding OutputSettings with an EscapeMode:
    OutputSettings outputSettings = new OutputSettings();
    outputSettings.escapeMode(EscapeMode.xhtml);
EscapeMode.base and EscapeMode.extended have no effect. EscapeMode.xhtml prints the following:
input: <a href="" title="test &lt;br /&gt;">test</a>
output: <a href="" title="test &lt;br />">test</a>
Any idea how to make jsoup not manipulate the title attribute?
This is a known issue/behavior: https://github.com/jhy/jsoup/issues/684 (marked as "won't fix" by the jsoup team).
There's not a bug here.
When serializing (i.e. in your example, when you're printing out the XML/HTML), jsoup escapes only the few characters that are necessary. That is why > is not escaped as &gt;: because it's in a quoted attribute, there's no ambiguity that it's closing a tag, so it doesn't need to be escaped.
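To make the rule above concrete, here is a minimal sketch of attribute escaping in plain Java. It is not jsoup's actual implementation: the class name AttributeEscapeSketch and the helper escapeAttribute are hypothetical, and the xhtml flag only mimics the difference the question observed between EscapeMode.base and EscapeMode.xhtml. The assumption is that inside a double-quoted attribute only & and " must be escaped, and that the XHTML-style mode additionally escapes <.

```java
final class AttributeEscapeSketch {

    // Hypothetical helper (not part of jsoup): escapes an already-decoded
    // attribute value so it can be written between double quotes.
    static String escapeAttribute(String value, boolean xhtml) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':
                    out.append("&amp;");   // always ambiguous, always escaped
                    break;
                case '"':
                    out.append("&quot;");  // would otherwise end the attribute
                    break;
                case '<':
                    // Assumption: only the XHTML-style mode escapes '<' here.
                    out.append(xhtml ? "&lt;" : "<");
                    break;
                default:
                    out.append(c);         // '>' is unambiguous inside quotes
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The parser has already decoded &lt;br /&gt; to this value.
        String decoded = "test <br />";
        System.out.println("title=\"" + escapeAttribute(decoded, false) + "\"");
        // prints: title="test <br />"
        System.out.println("title=\"" + escapeAttribute(decoded, true) + "\"");
        // prints: title="test &lt;br />"
    }
}
```

Either way the attribute value is the same once a parser reads it back; only its serialized spelling differs.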