jsoup is a Java library that makes it easy to work with real-world HTML and XML. It offers an easy-to-use API for URL fetching, data parsing, extraction, and manipulation using DOM API methods, CSS, ...
Cleaning the URL http://blog.sina.com.cn/s/blog_501a5b1f0102dx6z.html: the page has too many wbr tags. When I searched the page source, I found 24205 of them. I looked at org.jsoup.safety ...
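A count like the one reported above can be checked independently of jsoup. The following is only a minimal sketch using the Java standard library: the class name `WbrCount`, its two helper methods, and the sample string are hypothetical, and the regular expression is a rough approximation that matches bare `wbr` tags only (it does not handle attributes and is not an HTML parser). It is not how `org.jsoup.safety` works.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of jsoup.
class WbrCount {
    // Matches <wbr>, <WBR>, <wbr/>, <wbr /> and </wbr>, case-insensitively.
    // Tags that carry attributes are deliberately not matched.
    private static final Pattern WBR =
            Pattern.compile("</?wbr\\s*/?>", Pattern.CASE_INSENSITIVE);

    // Counts the wbr tags found in the given markup.
    static int count(String html) {
        Matcher m = WBR.matcher(html);
        int n = 0;
        while (m.find()) {
            n++;
        }
        return n;
    }

    // Returns the markup with every matched wbr tag removed.
    static String strip(String html) {
        return WBR.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        // Made-up sample input; a real page source would be read from a file or URL.
        String sample = "long<wbr>word<WBR/>here";
        System.out.println(count(sample));
        System.out.println(strip(sample));
    }
}
```

Running the count on the saved page source before and after cleaning would show whether the cleaning step is actually dropping the tags.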
Author, DevRel, Blogger, Open Source Hacker, Java Rockstar, Conference Speaker, Instructor and Entrepreneur ...