JTidy is a Java port of HTML Tidy, an HTML syntax checker and pretty printer. Like its non-Java cousin, JTidy can be used as a tool for cleaning up malformed and faulty HTML. In addition, JTidy provides a DOM interface to the document being processed, which effectively lets you use JTidy as a DOM parser for real-world HTML.
JTidy was written by Andy Quick, who later stepped down from the maintainer position. Now JTidy is maintained by a group of volunteers.
More information on JTidy can be found on the JTidy SourceForge project page.
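To illustrate the DOM interface mentioned above, here is a minimal sketch of walking an `org.w3c.dom.Document` to collect link targets. It rests on some assumptions: the class and method names (`LinkExtractor`, `extractLinks`, `parseWellFormed`) are made up for this example, and so that it runs with only the JDK, the document is built by the JDK's own XML parser from already well-formed markup. With JTidy on the classpath you would instead obtain the `Document` from `Tidy.parseDOM(InputStream, OutputStream)`, which is the step that copes with messy HTML.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class LinkExtractor {

    /** Collects the href value of every <a> element that has one. */
    public static List<String> extractLinks(Document doc) {
        List<String> links = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Element anchor = (Element) anchors.item(i);
            String href = anchor.getAttribute("href");
            // Named anchors (<a name="...">) carry no href, so skip them.
            if (href != null && !href.isEmpty()) {
                links.add(href);
            }
        }
        return links;
    }

    /**
     * Stand-in for the JTidy parsing step (assumption: input is well-formed).
     * With JTidy, something like new Tidy().parseDOM(inputStream, null)
     * would produce the Document from real-world HTML instead.
     */
    public static Document parseWellFormed(String xhtml) throws Exception {
        byte[] bytes = xhtml.getBytes(StandardCharsets.UTF_8);
        return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
    }

    public static void main(String[] args) throws Exception {
        String xhtml = "<html><body><p>"
                + "<a href=\"http://example.org/\">a link</a> and "
                + "<a name=\"top\">a named anchor</a>"
                + "</p></body></html>";
        for (String link : extractLinks(parseWellFormed(xhtml))) {
            System.out.println(link);
        }
    }
}
```

The traversal deliberately sticks to basic DOM calls (`getElementsByTagName`, `getAttribute`), so `extractLinks` does not care which parser produced the `Document`.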
New subproject: jtidyservlet
The JTidy Servlet library is an open source suite of custom tags and servlets that integrate the JTidy HTML syntax checker and pretty printer functionality into a Servlet/JSP container.
JTidyservlet is managed by Vlad Skarzhevskyy, who recently joined the JTidy project.
Two major new features expected for the next release have finally been committed to CVS!
Tidy.setMessageListener(): register a listener and be notified of error, warning and summary messages.
What is missing before a release?
More than 50% of the tests are now working, and hundreds of fixes and new features have been ported from the C version. XML/XHTML output is now considerably more robust. Check out a nightly build and report any bugs you find!
Nightly builds are now generated automatically every day, and the whole website is refreshed at the same time. One third of the implemented tests are working now. Two years of reported bugs are difficult to catch up on, but the change log is starting to become "important"...
Do you wanna play with a recent build? Get the source or binary distribution from the nightly builds page.
Site updated using the latest maven version: test report is a lot more readable now (formatting has been fixed in the latest junit-report plugin)... new site layout (using a tweaked version of the maven xdoc plugin: xhtml + tableless CSS)
183 test cases fully implemented now. All the test cases from Tidy and some new tests for JTidy have been added.
All the test cases which caused JTidy to crash or loop have been fixed! Priority (1) is done; now there are another 139 tests failing. Note that most of the tests fail on the first lines because of differences in doctype handling and formatting in Tidy (the latest Tidy release has been used to produce the output files for comparison).
These are the priorities before a release:
Mh, formatting in the maven-generated junit report is really bad. I just submitted a bug report to maven: error messages are escaped twice, newlines are not preserved and random whitespace is added. I think I should spend some time fixing the junit report plugin bugs if I want to be able to fix JTidy bugs...
179 test cases for JTidy have been partially implemented and added!
All the test cases for the non-Java version of Tidy have been integrated. "Partially" because most of them don't yet check the output or warnings produced by Tidy, but simply test that JTidy doesn't crash or loop.
Well, actually, as you can see in the junit report, we have 1 test causing an NPE and 4 causing infinite loops! These bugs will take precedence over any incorrect-output bug (fixing these will probably be worth a new release; you don't want your software to hang using JTidy, right?).
Anyway, in the TidyCrashingBugsTest (tests that crashed the C version of Tidy) 21 of the 24 tests work without problems... not as bad as expected.
See testcases if you wanna help JTidy by supplying tests or fixes.
Thanks to the Clover team for the free license for the JTidy project!
The new JTidy website is online!
The project is starting again after two years without a release. I (Fabrizio Giustina) just joined the project as the new administrator and developer.
Main targets are now:
A note about mailing lists: there are two new mailing lists specific to jtidy; see project mailing lists. You can find previous discussions in the html-tidy@w3.org archives (common to tidy and jtidy).