org.w3c.tidy
Class Tidy

java.lang.Object
  extended by org.w3c.tidy.Tidy
All Implemented Interfaces:
java.io.Serializable

public class Tidy
extends java.lang.Object
implements java.io.Serializable

HTML parser and pretty printer.

Version:
$Revision: 807 $ ($Author: fgiust $)
Author:
Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
See Also:
Serialized Form

Constructor Summary
Tidy()
          Instantiates a new Tidy instance.
 
Method Summary
static org.w3c.dom.Document createEmptyDocument()
          Creates an empty DOM Document.
 java.lang.String getAltText()
          alt-text- default text for alt attribute.
 boolean getAsciiChars()
          ascii-chars- convert quotes and dashes to nearest ASCII char.
 boolean getBreakBeforeBR()
          break-before-br - output newline before <br>.
 boolean getBurstSlides()
          split- create slides on each h2 element.
 Configuration getConfiguration()
          Returns the actual configuration
 java.lang.String getDocType()
          doctype- user specified doctype.
 boolean getDropEmptyParas()
          drop-empty-paras- discard empty p elements.
 boolean getDropFontTags()
          drop-font-tags- discard presentation tags.
 boolean getDropProprietaryAttributes()
          drop-proprietary-attributes- discard proprietary attributes.
 boolean getEmacs()
          gnu-emacs- if true format error output for GNU Emacs.
 boolean getEncloseBlockText()
          enclose-block-text - if true, text in blocks is wrapped in <p>'s.
 boolean getEncloseText()
          enclose-text- if true text at body is wrapped in <p>'s.
 java.lang.String getErrfile()
          Errfile - file name to write errors to.
 java.io.PrintWriter getErrout()
          Errout - the error output stream.
 boolean getEscapeCdata()
          escape-cdata - replace CDATA sections with escaped text.
 boolean getFixBackslash()
          fix-backslash- fix URLs by replacing \ with /.
 boolean getFixComments()
          fix-bad-comments- fix comments with adjacent hyphens.
 boolean getFixUri()
          fix-uri - fix uri references applying URI encoding if necessary.
 boolean getForceOutput()
          force-output- output document even if errors were found.
 boolean getHideComments()
          hide-comments- hides all (real) comments in output.
 boolean getHideEndTags()
          hide-endtags - suppress optional end tags.
 boolean getIndentAttributes()
          indent-attributes- newline+indent before each attribute.
 boolean getIndentCdata()
          indent-cdata- indent CDATA sections.
 boolean getIndentContent()
          indent - indent content of appropriate tags.
 java.lang.String getInputEncoding()
          input-encoding - the character encoding used for input.
 java.lang.String getInputStreamName()
           
 boolean getJoinClasses()
          join-classes- join multiple class attributes.
 boolean getJoinStyles()
          join-styles- join multiple style attributes.
 boolean getKeepFileTimes()
          keep-time- if true last modified time is preserved.
 boolean getLiteralAttribs()
          literal-attributes- if true attributes may use newlines.
 boolean getLogicalEmphasis()
          logical-emphasis- replace i by em and b by strong.
 boolean getLowerLiterals()
          lower-literals- folds known attribute values to lower case.
 boolean getMakeBare()
          make-bare - remove Microsoft cruft.
 boolean getMakeClean()
          make-clean - remove presentational clutter.
 boolean getNumEntities()
          numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
 boolean getOnlyErrors()
          only-errors - if true normal output is suppressed.
 java.lang.String getOutputEncoding()
          output-encoding - the character encoding used for output.
 int getParseErrors()
          ParseErrors - the number of errors that occurred in the most recent parse operation.
 int getParseWarnings()
          ParseWarnings - the number of warnings that occurred in the most recent parse operation.
 boolean getPrintBodyOnly()
          print-body-only- output BODY content only.
 boolean getQuiet()
          quiet - no 'Parsing X', guessed DTD or summary.
 boolean getQuoteAmpersand()
          quote-ampersand - output naked ampersand as &amp;.
 boolean getQuoteMarks()
          quote-marks- output " marks as &quot;.
 boolean getQuoteNbsp()
          quote-nbsp- output non-breaking space as entity.
 boolean getRawOut()
          output-raw- avoid mapping values > 127 to entities.
 int getRepeatedAttributes()
          repeated-attributes- keep first or last duplicate attribute.
 boolean getReplaceColor()
          replace-color- replace hex color attribute values with names.
 int getShowErrors()
          show-errors- number of errors to put out.
 boolean getShowWarnings()
          show-warnings - show warnings?
 boolean getSmartIndent()
          SmartIndent - does text/block level content affect indentation.
 int getSpaces()
          indent-spaces- default indentation.
 java.io.PrintWriter getStderr()
           
 int getTabsize()
          tab-size- tab size in chars.
 boolean getTidyMark()
          tidy-mark- add meta element indicating tidied doc.
 boolean getTrimEmptyElements()
          trim-empty-elements- trim empty elements.
 boolean getUpperCaseAttrs()
          uppercase-attributes - output attributes in upper case.
 boolean getUpperCaseTags()
          uppercase-tags - output tags in upper case.
 boolean getWord2000()
          word-2000- draconian cleaning for Word2000.
 boolean getWrapAsp()
          wrap-asp- wrap within ASP pseudo elements.
 boolean getWrapAttVals()
          wrap-attributes- wrap within attribute values.
 boolean getWrapJste()
          wrap-jste- wrap within JSTE pseudo elements.
 int getWraplen()
          wrap- default wrap margin.
 boolean getWrapPhp()
          wrap-php- wrap within PHP pseudo elements.
 boolean getWrapScriptlets()
          wrap-script-literals- wrap within JavaScript string literals.
 boolean getWrapSection()
          wrap-sections - wrap within <![ ... ]> section tags.
 boolean getWriteback()
          writeback - if true then output tidied markup.
 boolean getXHTML()
          output-xhtml - output extensible HTML.
 boolean getXmlOut()
          output-xml - create output as XML.
 boolean getXmlPi()
          add-xml-pi - add <?xml?> for XML docs.
 boolean getXmlPIs()
          assume-xml-procins This option specifies if Tidy should change the parsing of processing instructions to require ?
 boolean getXmlSpace()
          add-xml-space- if set to yes adds xml:space attr as needed.
 boolean getXmlTags()
          input-xml - treat input as XML.
static void main(java.lang.String[] argv)
          Command line interface to parser and pretty printer.
protected  int mainExec(java.lang.String[] argv)
          Main method, but returns the return code as an int instead of calling System.exit(code).
 Node parse(java.io.InputStream in, java.io.OutputStream out)
          Reads from the given input and returns the root Node.
 Node parse(java.io.InputStream in, java.io.Writer out)
          Reads from the given input and returns the root Node.
 Node parse(java.io.Reader in, java.io.OutputStream out)
          Reads from the given input and returns the root Node.
 Node parse(java.io.Reader in, java.io.Writer out)
          Reads from the given input and returns the root Node.
 org.w3c.dom.Document parseDOM(java.io.InputStream in, java.io.OutputStream out)
          Parses InputStream in and returns a DOM Document node.
 void pprint(org.w3c.dom.Document doc, java.io.OutputStream out)
          Pretty-prints a DOM Document.
 void pprint(org.w3c.dom.Node node, java.io.OutputStream out)
          Pretty-prints a DOM Node.
 void setAltText(java.lang.String altText)
          alt-text- default text for alt attribute.
 void setAsciiChars(boolean asciiChars)
          ascii-chars- convert quotes and dashes to nearest ASCII char.
 void setBreakBeforeBR(boolean breakBeforeBR)
          break-before-br - output newline before <br>.
 void setBurstSlides(boolean burstSlides)
          split- create slides on each h2 element.
 void setConfigurationFromFile(java.lang.String filename)
          Sets the configuration from a configuration file.
 void setConfigurationFromProps(java.util.Properties props)
          Sets the configuration from a properties object.
 void setDocType(java.lang.String doctype)
          doctype- user specified doctype.
 void setDropEmptyParas(boolean dropEmptyParas)
          drop-empty-paras- discard empty p elements.
 void setDropFontTags(boolean dropFontTags)
          drop-font-tags- discard presentation tags.
 void setDropProprietaryAttributes(boolean dropProprietaryAttributes)
          drop-proprietary-attributes- discard proprietary attributes.
 void setEmacs(boolean emacs)
          gnu-emacs- if true format error output for GNU Emacs.
 void setEncloseBlockText(boolean encloseBlockText)
          enclose-block-text- if true text in blocks is wrapped in <p>'s.
 void setEncloseText(boolean encloseText)
          enclose-text- if true text at body is wrapped in <p>'s.
 void setErrfile(java.lang.String errfile)
          Errfile - file name to write errors to.
 void setErrout(java.io.PrintWriter out)
           
 void setEscapeCdata(boolean escapeCdata)
          escape-cdata- replace CDATA sections with escaped text.
 void setFixBackslash(boolean fixBackslash)
          fix-backslash- fix URLs by replacing \ with /.
 void setFixComments(boolean fixComments)
          fix-bad-comments- fix comments with adjacent hyphens.
 void setFixUri(boolean fixUri)
          fix-uri- fix uri references applying URI encoding if necessary.
 void setForceOutput(boolean forceOutput)
          force-output- output document even if errors were found.
 void setHideComments(boolean hideComments)
          hide-comments- hides all (real) comments in output.
 void setHideEndTags(boolean hideEndTags)
          hide-endtags - suppress optional end tags.
 void setIndentAttributes(boolean indentAttributes)
          indent-attributes- newline+indent before each attribute.
 void setIndentCdata(boolean indentCdata)
          indent-cdata- indent CDATA sections.
 void setIndentContent(boolean indentContent)
          indent - indent content of appropriate tags.
 void setInputEncoding(java.lang.String encoding)
          input-encoding - the character encoding used for input.
 void setInputStreamName(java.lang.String name)
          InputStreamName - the name of the input stream (printed in the header information).
 void setJoinClasses(boolean joinClasses)
          join-classes- join multiple class attributes.
 void setJoinStyles(boolean joinStyles)
          join-styles- join multiple style attributes.
 void setKeepFileTimes(boolean keepFileTimes)
          keep-time- if true last modified time is preserved.
 void setLiteralAttribs(boolean literalAttribs)
          literal-attributes- if true attributes may use newlines.
 void setLogicalEmphasis(boolean logicalEmphasis)
          logical-emphasis- replace i by em and b by strong.
 void setLowerLiterals(boolean lowerLiterals)
          lower-literals- folds known attribute values to lower case.
 void setMakeBare(boolean makeBare)
          make-bare - remove Microsoft cruft.
 void setMakeClean(boolean makeClean)
          make-clean - remove presentational clutter.
 void setMessageListener(TidyMessageListener listener)
          Attach a TidyMessageListener which will be notified for messages and errors.
 void setNumEntities(boolean numEntities)
          numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.
 void setOnlyErrors(boolean onlyErrors)
          only-errors - if true normal output is suppressed.
 void setOutputEncoding(java.lang.String encoding)
          output-encoding - the character encoding used for output.
 void setPrintBodyOnly(boolean bodyOnly)
          print-body-only- output BODY content only.
 void setQuiet(boolean quiet)
          quiet - no 'Parsing X', guessed DTD or summary.
 void setQuoteAmpersand(boolean quoteAmpersand)
          quote-ampersand - output naked ampersand as &amp;.
 void setQuoteMarks(boolean quoteMarks)
          quote-marks- output " marks as &quot;.
 void setQuoteNbsp(boolean quoteNbsp)
          quote-nbsp- output non-breaking space as entity.
 void setRawOut(boolean rawOut)
          output-raw- avoid mapping values > 127 to entities.
 void setRepeatedAttributes(int repeatedAttributes)
          repeated-attributes- keep first or last duplicate attribute.
 void setReplaceColor(boolean replaceColor)
          replace-color- replace hex color attribute values with names.
 void setShowErrors(int showErrors)
          show-errors- set the number of errors to put out.
 void setShowWarnings(boolean showWarnings)
          show-warnings - show warnings?
 void setSmartIndent(boolean smartIndent)
          SmartIndent - does text/block level content affect indentation.
 void setSpaces(int spaces)
          indent-spaces- default indentation.
 void setTabsize(int tabsize)
          tab-size- tab size in chars.
 void setTidyMark(boolean tidyMark)
          tidy-mark- add meta element indicating tidied doc.
 void setTrimEmptyElements(boolean trimEmpty)
          trim-empty-elements- trim empty elements.
 void setUpperCaseAttrs(boolean upperCaseAttrs)
          uppercase-attributes - output attributes in upper case.
 void setUpperCaseTags(boolean upperCaseTags)
          uppercase-tags - output tags in upper case.
 void setWord2000(boolean word2000)
          word-2000- draconian cleaning for Word2000.
 void setWrapAsp(boolean wrapAsp)
          wrap-asp- wrap within ASP pseudo elements.
 void setWrapAttVals(boolean wrapAttVals)
          wrap-attributes- wrap within attribute values.
 void setWrapJste(boolean wrapJste)
          wrap-jste- wrap within JSTE pseudo elements.
 void setWraplen(int wraplen)
          wrap- default wrap margin.
 void setWrapPhp(boolean wrapPhp)
          wrap-php- wrap within PHP pseudo elements.
 void setWrapScriptlets(boolean wrapScriptlets)
          wrap-script-literals- wrap within JavaScript string literals.
 void setWrapSection(boolean wrapSection)
          wrap-sections - wrap within <![ ... ]> section tags.
 void setWriteback(boolean writeback)
          writeback - if true then output tidied markup.
 void setXHTML(boolean xhtml)
          output-xhtml - output extensible HTML.
 void setXmlOut(boolean xmlOut)
          output-xml - create output as XML.
 void setXmlPi(boolean xmlPi)
          add-xml-pi - add <?xml?> for XML docs.
 void setXmlPIs(boolean xmlPIs)
          assume-xml-procins This option specifies if Tidy should change the parsing of processing instructions to require ?
 void setXmlSpace(boolean xmlSpace)
          add-xml-space- if set to yes adds xml:space attr as needed.
 void setXmlTags(boolean xmlTags)
          input-xml - treat input as XML.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Tidy

public Tidy()
Instantiates a new Tidy instance. It is recommended to use a new instance for each parse.

Method Detail

getConfiguration

public Configuration getConfiguration()
Returns the actual configuration

Returns:
tidy configuration

getStderr

public java.io.PrintWriter getStderr()

getParseErrors

public int getParseErrors()
ParseErrors - the number of errors that occurred in the most recent parse operation.

Returns:
number of errors that occurred in the most recent parse operation.

getParseWarnings

public int getParseWarnings()
ParseWarnings - the number of warnings that occurred in the most recent parse operation.

Returns:
number of warnings that occurred in the most recent parse operation.

setInputStreamName

public void setInputStreamName(java.lang.String name)
InputStreamName - the name of the input stream (printed in the header information).

Parameters:
name - input stream name

getInputStreamName

public java.lang.String getInputStreamName()

getErrout

public java.io.PrintWriter getErrout()
Errout - the error output stream.

Returns:
error output stream.

setErrout

public void setErrout(java.io.PrintWriter out)
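
Since setErrout accepts any java.io.PrintWriter, diagnostics can presumably be collected in memory instead of being printed to the console. The following is a minimal sketch using only standard-library classes; the class name ErroutCaptureSketch and its members are hypothetical, and it assumes Tidy writes its messages to the writer it is given.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper, not part of JTidy: an in-memory target for setErrout.
class ErroutCaptureSketch {
    // In-memory sink, so nothing is written to System.err.
    private final StringWriter buffer = new StringWriter();

    // autoFlush=true so println output reaches the buffer immediately.
    final PrintWriter errout = new PrintWriter(buffer, true);

    /** Returns everything written to errout so far. */
    String captured() {
        errout.flush();
        return buffer.toString();
    }
}
```

A caller would pass the writer with tidy.setErrout(sketch.errout) before parsing and read sketch.captured() afterwards.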

setConfigurationFromFile

public void setConfigurationFromFile(java.lang.String filename)
Sets the configuration from a configuration file.

Parameters:
filename - configuration file name/path.
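
As an illustration, a configuration file could be generated before it is handed to setConfigurationFromFile. This sketch assumes the file uses one "name: value" pair per line, with the option names listed in the method summary and yes/no for boolean values; the class name TidyConfigFileSketch and the chosen values are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of JTidy: writes a small configuration file.
class TidyConfigFileSketch {
    /** Writes a few options to a temporary file and returns its path. */
    static Path writeConfig() throws IOException {
        Path file = Files.createTempFile("tidy", ".properties");
        Files.write(file, Arrays.asList(
                "indent: yes",        // see setIndentContent
                "indent-spaces: 2",   // see setSpaces
                "wrap: 80",           // see setWraplen
                "tidy-mark: no"),     // see setTidyMark
                StandardCharsets.ISO_8859_1);
        return file;
    }
}
```

The resulting path would then be passed as tidy.setConfigurationFromFile(file.toString()).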

setConfigurationFromProps

public void setConfigurationFromProps(java.util.Properties props)
Sets the configuration from a properties object.

Parameters:
props - Properties object
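
The Properties object can be built in code. The sketch below assumes the property keys are the option names shown in the method summary (for example indent-spaces for setSpaces) and that boolean options accept yes/no; the class name TidyPropsSketch and the values are hypothetical.

```java
import java.util.Properties;

// Hypothetical helper, not part of JTidy: collects options in a Properties object.
class TidyPropsSketch {
    /** Builds a Properties object keyed by Tidy option names. */
    static Properties buildTidyProps() {
        Properties props = new Properties();
        props.setProperty("indent", "yes");        // see setIndentContent
        props.setProperty("indent-spaces", "4");   // see setSpaces
        props.setProperty("wrap", "100");          // see setWraplen
        props.setProperty("output-xhtml", "yes");  // see setXHTML
        props.setProperty("quiet", "yes");         // see setQuiet
        return props;
    }
}
```

A caller would apply the options with tidy.setConfigurationFromProps(TidyPropsSketch.buildTidyProps()).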

createEmptyDocument

public static org.w3c.dom.Document createEmptyDocument()
Creates an empty DOM Document.

Returns:
a new org.w3c.dom.Document

parse

public Node parse(java.io.InputStream in,
                  java.io.OutputStream out)
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.

Parameters:
in - input
out - optional destination for pretty-printed document
Returns:
parsed org.w3c.tidy.Node

parse

public Node parse(java.io.Reader in,
                  java.io.OutputStream out)
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.

Parameters:
in - input
out - optional destination for pretty-printed document
Returns:
parsed org.w3c.tidy.Node

parse

public Node parse(java.io.Reader in,
                  java.io.Writer out)
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.

Parameters:
in - input
out - optional destination for pretty-printed document
Returns:
parsed org.w3c.tidy.Node

parse

public Node parse(java.io.InputStream in,
                  java.io.Writer out)
Reads from the given input and returns the root Node. If out is non-null, pretty prints to out. Warning: caller is responsible for calling close() on input and output after calling this method.

Parameters:
in - input
out - optional destination for pretty-printed document
Returns:
parsed org.w3c.tidy.Node
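
Because every parse overload leaves closing to the caller, a try-with-resources block is one way to guarantee it. The sketch below keeps to standard-library types: the parser argument is a hypothetical stand-in for a call such as (in, out) -> tidy.parse(in, out), and the class name ParseAndCloseSketch is not part of JTidy. It also assumes the output is UTF-8, which the caller would have to request separately (for example with setOutputEncoding).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.function.BiConsumer;

// Hypothetical helper, not part of JTidy: owns and closes both streams.
class ParseAndCloseSketch {
    /** Feeds html to the parser and returns whatever it wrote to the output. */
    static String run(String html, BiConsumer<InputStream, OutputStream> parser)
            throws IOException {
        // Both streams are closed when the block exits, even on an exception.
        try (InputStream in = new ByteArrayInputStream(html.getBytes(StandardCharsets.UTF_8));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            parser.accept(in, out);
            // Assumption: the parser was configured to emit UTF-8.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }
}
```

Closing in-memory streams has no effect, but the same structure applies unchanged to file or socket streams, where a missed close() leaks a handle.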

parseDOM

public org.w3c.dom.Document parseDOM(java.io.InputStream in,
                                     java.io.OutputStream out)
Parses InputStream in and returns a DOM Document node. If out is non-null, pretty prints to OutputStream out.

Parameters:
in - input stream
out - optional output stream
Returns:
parsed org.w3c.dom.Document
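
The returned object is a standard org.w3c.dom.Document, so it can be inspected with ordinary DOM calls. The sketch below collects the href values of all a elements; it assumes the Document supports getElementsByTagName and Element.getAttribute and that tag names are lower case in the parsed tree. The class name DomLinksSketch is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of JTidy: reads link targets from a DOM tree.
class DomLinksSketch {
    /** Returns the non-empty href values of all a elements, in document order. */
    static List<String> linkTargets(Document doc) {
        List<String> targets = new ArrayList<>();
        NodeList anchors = doc.getElementsByTagName("a");
        for (int i = 0; i < anchors.getLength(); i++) {
            Node node = anchors.item(i);
            if (node instanceof Element) {
                String href = ((Element) node).getAttribute("href");
                // A missing attribute may come back as "" or null; skip both.
                if (href != null && !href.isEmpty()) {
                    targets.add(href);
                }
            }
        }
        return targets;
    }
}
```

A caller would use it as DomLinksSketch.linkTargets(tidy.parseDOM(in, null)), passing null for out when no pretty-printed copy is needed.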

pprint

public void pprint(org.w3c.dom.Document doc,
                   java.io.OutputStream out)
Pretty-prints a DOM Document. Must be an instance of org.w3c.tidy.DOMDocumentImpl. Caller is responsible for closing the outputStream after calling this method.

Parameters:
doc - org.w3c.dom.Document
out - output stream

pprint

public void pprint(org.w3c.dom.Node node,
                   java.io.OutputStream out)
Pretty-prints a DOM Node. Caller is responsible for closing the outputStream after calling this method.

Parameters:
node - org.w3c.dom.Node. Must be an instance of org.w3c.tidy.DOMNodeImpl.
out - output stream

main

public static void main(java.lang.String[] argv)
Command line interface to parser and pretty printer.

Parameters:
argv - command line parameters

mainExec

protected int mainExec(java.lang.String[] argv)
Main method, but returns the return code as an int instead of calling System.exit(code). Needed for testing main method without shutting down tests.

Parameters:
argv - command line parameters
Returns:
return code

setMessageListener

public void setMessageListener(TidyMessageListener listener)
Attach a TidyMessageListener which will be notified for messages and errors.

Parameters:
listener - TidyMessageListener implementation

setSpaces

public void setSpaces(int spaces)
indent-spaces- default indentation.

Parameters:
spaces - number of spaces used for indentation
See Also:
Configuration.spaces

getSpaces

public int getSpaces()
indent-spaces- default indentation.

Returns:
number of spaces used for indentation
See Also:
Configuration.spaces

setWraplen

public void setWraplen(int wraplen)
wrap- default wrap margin.

Parameters:
wraplen - default wrap margin
See Also:
Configuration.wraplen

getWraplen

public int getWraplen()
wrap- default wrap margin.

Returns:
default wrap margin
See Also:
Configuration.wraplen

setTabsize

public void setTabsize(int tabsize)
tab-size- tab size in chars.

Parameters:
tabsize - tab size in chars
See Also:
Configuration.tabsize

getTabsize

public int getTabsize()
tab-size- tab size in chars.

Returns:
tab size in chars
See Also:
Configuration.tabsize

setErrfile

public void setErrfile(java.lang.String errfile)
Errfile - file name to write errors to.

Parameters:
errfile - file name to write errors to
See Also:
Configuration.errfile

getErrfile

public java.lang.String getErrfile()
Errfile - file name to write errors to.

Returns:
error file name
See Also:
Configuration.errfile

setWriteback

public void setWriteback(boolean writeback)
writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.

Parameters:
writeback - true= output tidied markup
See Also:
Configuration.writeback

getWriteback

public boolean getWriteback()
writeback - if true then output tidied markup. NOTE: this property is ignored when parsing from an InputStream.

Returns:
true if tidy will output tidied markup in input file
See Also:
Configuration.writeback

setOnlyErrors

public void setOnlyErrors(boolean onlyErrors)
only-errors - if true normal output is suppressed.

Parameters:
onlyErrors - if true normal output is suppressed.
See Also:
Configuration.onlyErrors

getOnlyErrors

public boolean getOnlyErrors()
only-errors - if true normal output is suppressed.

Returns:
true if normal output is suppressed.
See Also:
Configuration.onlyErrors

setShowWarnings

public void setShowWarnings(boolean showWarnings)
show-warnings - show warnings? (errors are always shown).

Parameters:
showWarnings - if false warnings are not shown
See Also:
Configuration.showWarnings

getShowWarnings

public boolean getShowWarnings()
show-warnings - show warnings? (errors are always shown).

Returns:
false if warnings are not shown
See Also:
Configuration.showWarnings

setQuiet

public void setQuiet(boolean quiet)
quiet - no 'Parsing X', guessed DTD or summary.

Parameters:
quiet - true= don't output summary, warnings or errors
See Also:
Configuration.quiet

getQuiet

public boolean getQuiet()
quiet - no 'Parsing X', guessed DTD or summary.

Returns:
true if tidy will not output summary, warnings or errors
See Also:
Configuration.quiet

setIndentContent

public void setIndentContent(boolean indentContent)
indent - indent content of appropriate tags.

Parameters:
indentContent - indent content of appropriate tags
See Also:
Configuration.indentContent

getIndentContent

public boolean getIndentContent()
indent - indent content of appropriate tags.

Returns:
true if tidy will indent content of appropriate tags
See Also:
Configuration.indentContent

setSmartIndent

public void setSmartIndent(boolean smartIndent)
SmartIndent - does text/block level content affect indentation.

Parameters:
smartIndent - true if text/block level content should affect indentation
See Also:
Configuration.smartIndent

getSmartIndent

public boolean getSmartIndent()
SmartIndent - does text/block level content affect indentation.

Returns:
true if text/block level content should affect indentation
See Also:
Configuration.smartIndent

setHideEndTags

public void setHideEndTags(boolean hideEndTags)
hide-endtags - suppress optional end tags.

Parameters:
hideEndTags - true= suppress optional end tags
See Also:
Configuration.hideEndTags

getHideEndTags

public boolean getHideEndTags()
hide-endtags - suppress optional end tags.

Returns:
true if tidy will suppress optional end tags
See Also:
Configuration.hideEndTags

setXmlTags

public void setXmlTags(boolean xmlTags)
input-xml - treat input as XML.

Parameters:
xmlTags - true if tidy should treat input as XML
See Also:
Configuration.xmlTags

getXmlTags

public boolean getXmlTags()
input-xml - treat input as XML.

Returns:
true if tidy will treat input as XML
See Also:
Configuration.xmlTags

setXmlOut

public void setXmlOut(boolean xmlOut)
output-xml - create output as XML.

Parameters:
xmlOut - true if tidy should create output as xml
See Also:
Configuration.xmlOut

getXmlOut

public boolean getXmlOut()
output-xml - create output as XML.

Returns:
true if tidy will create output as xml
See Also:
Configuration.xmlOut

setXHTML

public void setXHTML(boolean xhtml)
output-xhtml - output extensible HTML.

Parameters:
xhtml - true if tidy should output XHTML
See Also:
Configuration.xHTML

getXHTML

public boolean getXHTML()
output-xhtml - output extensible HTML.

Returns:
true if tidy will output XHTML
See Also:
Configuration.xHTML

setUpperCaseTags

public void setUpperCaseTags(boolean upperCaseTags)
uppercase-tags - output tags in upper case.

Parameters:
upperCaseTags - true if tidy should output tags in upper case (default is lowercase)
See Also:
Configuration.upperCaseTags

getUpperCaseTags

public boolean getUpperCaseTags()
uppercase-tags - output tags in upper case.

Returns:
true if tidy will output tags in upper case
See Also:
Configuration.upperCaseTags

setUpperCaseAttrs

public void setUpperCaseAttrs(boolean upperCaseAttrs)
uppercase-attributes - output attributes in upper case.

Parameters:
upperCaseAttrs - true if tidy should output attributes in upper case (default is lowercase)
See Also:
Configuration.upperCaseAttrs

getUpperCaseAttrs

public boolean getUpperCaseAttrs()
uppercase-attributes - output attributes in upper case.

Returns:
true if tidy will output attributes in upper case
See Also:
Configuration.upperCaseAttrs

setMakeClean

public void setMakeClean(boolean makeClean)
make-clean - remove presentational clutter.

Parameters:
makeClean - true to remove presentational clutter
See Also:
Configuration.makeClean

getMakeClean

public boolean getMakeClean()
make-clean - remove presentational clutter.

Returns:
true if tidy will remove presentational clutter
See Also:
Configuration.makeClean

setMakeBare

public void setMakeBare(boolean makeBare)
make-bare - remove Microsoft cruft.

Parameters:
makeBare - true to remove Microsoft cruft
See Also:
Configuration.makeBare

getMakeBare

public boolean getMakeBare()
make-bare - remove Microsoft cruft.

Returns:
true if tidy will remove Microsoft cruft
See Also:
Configuration.makeBare

setBreakBeforeBR

public void setBreakBeforeBR(boolean breakBeforeBR)
break-before-br - output newline before <br>.

Parameters:
breakBeforeBR - true if tidy should output a newline before <br>
See Also:
Configuration.breakBeforeBR

getBreakBeforeBR

public boolean getBreakBeforeBR()
break-before-br - output newline before <br>.

Returns:
true if tidy will output a newline before <br>
See Also:
Configuration.breakBeforeBR

setBurstSlides

public void setBurstSlides(boolean burstSlides)
split- create slides on each h2 element.

Parameters:
burstSlides - true if tidy should create slides on each h2 element
See Also:
Configuration.burstSlides

getBurstSlides

public boolean getBurstSlides()
split- create slides on each h2 element.

Returns:
true if tidy will create slides on each h2 element
See Also:
Configuration.burstSlides

setNumEntities

public void setNumEntities(boolean numEntities)
numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.

Parameters:
numEntities - true if tidy should output entities in the numeric form.
See Also:
Configuration.numEntities

getNumEntities

public boolean getNumEntities()
numeric-entities- output entities other than the built-in HTML entities in the numeric rather than the named entity form.

Returns:
true if tidy will output entities in the numeric form.
See Also:
Configuration.numEntities

setQuoteMarks

public void setQuoteMarks(boolean quoteMarks)
quote-marks- output " marks as &quot;.

Parameters:
quoteMarks - true if tidy should output " marks as &quot;
See Also:
Configuration.quoteMarks

getQuoteMarks

public boolean getQuoteMarks()
quote-marks- output " marks as &quot;.

Returns:
true if tidy will output " marks as &quot;
See Also:
Configuration.quoteMarks

setQuoteNbsp

public void setQuoteNbsp(boolean quoteNbsp)
quote-nbsp- output non-breaking space as entity.

Parameters:
quoteNbsp - true if tidy should output non-breaking space as entity
See Also:
Configuration.quoteNbsp

getQuoteNbsp

public boolean getQuoteNbsp()
quote-nbsp- output non-breaking space as entity.

Returns:
true if tidy will output non-breaking space as entity
See Also:
Configuration.quoteNbsp

setQuoteAmpersand

public void setQuoteAmpersand(boolean quoteAmpersand)
quote-ampersand - output naked ampersand as &amp;.

Parameters:
quoteAmpersand - true if tidy should output naked ampersand as &amp;
See Also:
Configuration.quoteAmpersand

getQuoteAmpersand

public boolean getQuoteAmpersand()
quote-ampersand - output naked ampersand as &amp;.

Returns:
true if tidy will output naked ampersand as &amp;
See Also:
Configuration.quoteAmpersand

setWrapAttVals

public void setWrapAttVals(boolean wrapAttVals)
wrap-attributes- wrap within attribute values.

Parameters:
wrapAttVals - true if tidy should wrap within attribute values
See Also:
Configuration.wrapAttVals

getWrapAttVals

public boolean getWrapAttVals()
wrap-attributes- wrap within attribute values.

Returns:
true if tidy will wrap within attribute values
See Also:
Configuration.wrapAttVals

setWrapScriptlets

public void setWrapScriptlets(boolean wrapScriptlets)
wrap-script-literals- wrap within JavaScript string literals.

Parameters:
wrapScriptlets - true if tidy should wrap within JavaScript string literals
See Also:
Configuration.wrapScriptlets

getWrapScriptlets

public boolean getWrapScriptlets()
wrap-script-literals- wrap within JavaScript string literals.

Returns:
true if tidy will wrap within JavaScript string literals
See Also:
Configuration.wrapScriptlets

setWrapSection

public void setWrapSection(boolean wrapSection)
wrap-sections- wrap within <![ ... ]> section tags

Parameters:
wrapSection - true if tidy should wrap within <![ ... ]> section tags
See Also:
Configuration.wrapSection

getWrapSection

public boolean getWrapSection()
wrap-sections- wrap within <![ ... ]> section tags

Returns:
true if tidy will wrap within <![ ... ]> section tags
See Also:
Configuration.wrapSection

setAltText

public void setAltText(java.lang.String altText)
alt-text- default text for alt attribute.

Parameters:
altText - default text for alt attribute
See Also:
Configuration.altText

getAltText

public java.lang.String getAltText()
alt-text- default text for alt attribute.

Returns:
default text for alt attribute
See Also:
Configuration.altText

setXmlPi

public void setXmlPi(boolean xmlPi)
add-xml-pi- add <?xml?> for XML docs.

Parameters:
xmlPi - true if tidy should add <?xml?> for XML docs
See Also:
Configuration.xmlPi

getXmlPi

public boolean getXmlPi()
add-xml-pi- add <?xml?> for XML docs.

Returns:
true if tidy will add <?xml?> for XML docs
See Also:
Configuration.xmlPi

setDropFontTags

public void setDropFontTags(boolean dropFontTags)
drop-font-tags- discard presentation tags.

Parameters:
dropFontTags - true if tidy should discard presentation tags
See Also:
Configuration.dropFontTags

getDropFontTags

public boolean getDropFontTags()
drop-font-tags- discard presentation tags.

Returns:
true if tidy will discard presentation tags
See Also:
Configuration.dropFontTags

setDropProprietaryAttributes

public void setDropProprietaryAttributes(boolean dropProprietaryAttributes)
drop-proprietary-attributes- discard proprietary attributes.

Parameters:
dropProprietaryAttributes - true if tidy should discard proprietary attributes
See Also:
Configuration.dropProprietaryAttributes

getDropProprietaryAttributes

public boolean getDropProprietaryAttributes()
drop-proprietary-attributes- discard proprietary attributes.

Returns:
true if tidy will discard proprietary attributes
See Also:
Configuration.dropProprietaryAttributes

setDropEmptyParas

public void setDropEmptyParas(boolean dropEmptyParas)
drop-empty-paras- discard empty p elements.

Parameters:
dropEmptyParas - true if tidy should discard empty p elements
See Also:
Configuration.dropEmptyParas

getDropEmptyParas

public boolean getDropEmptyParas()
drop-empty-paras- discard empty p elements.

Returns:
true if tidy will discard empty p elements
See Also:
Configuration.dropEmptyParas

setFixComments

public void setFixComments(boolean fixComments)
fix-bad-comments- fix comments with adjacent hyphens.

Parameters:
fixComments - true if tidy should fix comments with adjacent hyphens
See Also:
Configuration.fixComments

getFixComments

public boolean getFixComments()
fix-bad-comments- fix comments with adjacent hyphens.

Returns:
true if tidy will fix comments with adjacent hyphens
See Also:
Configuration.fixComments

setWrapAsp

public void setWrapAsp(boolean wrapAsp)
wrap-asp- wrap within ASP pseudo elements.

Parameters:
wrapAsp - true if tidy should wrap within ASP pseudo elements
See Also:
Configuration.wrapAsp

getWrapAsp

public boolean getWrapAsp()
wrap-asp- wrap within ASP pseudo elements.

Returns:
true if tidy will wrap within ASP pseudo elements
See Also:
Configuration.wrapAsp

setWrapJste

public void setWrapJste(boolean wrapJste)
wrap-jste- wrap within JSTE pseudo elements.

Parameters:
wrapJste - true if tidy should wrap within JSTE pseudo elements
See Also:
Configuration.wrapJste

getWrapJste

public boolean getWrapJste()
wrap-jste- wrap within JSTE pseudo elements.

Returns:
true if tidy will wrap within JSTE pseudo elements
See Also:
Configuration.wrapJste

setWrapPhp

public void setWrapPhp(boolean wrapPhp)
wrap-php- wrap within PHP pseudo elements.

Parameters:
wrapPhp - true if tidy should wrap within PHP pseudo elements
See Also:
Configuration.wrapPhp

getWrapPhp

public boolean getWrapPhp()
wrap-php- wrap within PHP pseudo elements.

Returns:
true if tidy will wrap within PHP pseudo elements
See Also:
Configuration.wrapPhp

setFixBackslash

public void setFixBackslash(boolean fixBackslash)
fix-backslash- fix URLs by replacing \ with /.

Parameters:
fixBackslash - true if tidy should fix URLs by replacing \ with /
See Also:
Configuration.fixBackslash

getFixBackslash

public boolean getFixBackslash()
fix-backslash- fix URLs by replacing \ with /.

Returns:
true if tidy will fix URLs by replacing \ with /
See Also:
Configuration.fixBackslash
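
As an illustration of what fix-backslash describes, the sketch below rewrites a URL by replacing each \ with /. It is a stand-alone stand-in, not JTidy's own code; the class and method names are hypothetical.

```java
// Illustrative only: mimics the documented effect of fix-backslash.
// FixBackslashSketch and fixBackslash are hypothetical names, not JTidy API.
final class FixBackslashSketch {

    /** Returns the URL with every backslash replaced by a forward slash. */
    static String fixBackslash(String url) {
        return url.replace('\\', '/');
    }

    public static void main(String[] args) {
        System.out.println(fixBackslash("images\\icons\\logo.gif"));
    }
}
```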

setIndentAttributes

public void setIndentAttributes(boolean indentAttributes)
indent-attributes- newline+indent before each attribute.

Parameters:
indentAttributes - true if tidy should output a newline+indent before each attribute
See Also:
Configuration.indentAttributes

getIndentAttributes

public boolean getIndentAttributes()
indent-attributes- newline+indent before each attribute.

Returns:
true if tidy will output a newline+indent before each attribute
See Also:
Configuration.indentAttributes

setDocType

public void setDocType(java.lang.String doctype)
doctype- user specified doctype.

Parameters:
doctype - omit | auto | strict | loose | fpi, where fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for an fpi, include the double quotes in the string.
See Also:
Configuration.docTypeStr, Configuration.docTypeMode

getDocType

public java.lang.String getDocType()
doctype- user specified doctype.

Returns:
omit | auto | strict | loose | fpi, where fpi is a string similar to "-//ACME//DTD HTML 3.14159//EN". Note: for an fpi, include the double quotes in the string.
See Also:
Configuration.docTypeStr, Configuration.docTypeMode
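
The accepted doctype values can be pictured with the sketch below, which sorts a string into one of the five documented forms. The Mode enum, the class name, the case-insensitive matching and the exception for unknown values are all assumptions made for illustration; how JTidy itself maps the string to Configuration.docTypeMode is not shown here.

```java
import java.util.Locale;

// Illustrative only: classifies a doctype argument as described above.
// The enum and method are hypothetical, not part of JTidy.
final class DocTypeArgumentSketch {

    enum Mode { OMIT, AUTO, STRICT, LOOSE, USER }

    static Mode classify(String doctype) {
        String value = doctype.trim();
        // An fpi is passed with its double quotes included in the string.
        if (value.startsWith("\"")) {
            return Mode.USER;
        }
        switch (value.toLowerCase(Locale.ROOT)) { // case-insensitivity is an assumption
            case "omit":   return Mode.OMIT;
            case "auto":   return Mode.AUTO;
            case "strict": return Mode.STRICT;
            case "loose":  return Mode.LOOSE;
            default:
                throw new IllegalArgumentException("unrecognised doctype: " + doctype);
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("strict"));
        // Note the escaped quotes: they are part of the value itself.
        System.out.println(classify("\"-//ACME//DTD HTML 3.14159//EN\""));
    }
}
```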

setLogicalEmphasis

public void setLogicalEmphasis(boolean logicalEmphasis)
logical-emphasis- replace i by em and b by strong.

Parameters:
logicalEmphasis - true if tidy should replace i by em and b by strong
See Also:
Configuration.logicalEmphasis

getLogicalEmphasis

public boolean getLogicalEmphasis()
logical-emphasis- replace i by em and b by strong.

Returns:
true if tidy will replace i by em and b by strong
See Also:
Configuration.logicalEmphasis

setXmlPIs

public void setXmlPIs(boolean xmlPIs)
assume-xml-procins- specifies whether Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is set automatically if the input is XML.

Parameters:
xmlPIs - true if tidy should expect a ?> at the end of processing instructions
See Also:
Configuration.xmlPIs

getXmlPIs

public boolean getXmlPIs()
assume-xml-procins- specifies whether Tidy should change the parsing of processing instructions to require ?> as the terminator rather than >. This option is set automatically if the input is XML.

Returns:
true if tidy will expect a ?> at the end of processing instructions
See Also:
Configuration.xmlPIs
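
The difference between the two terminators matters when a processing instruction contains a > of its own. The sketch below, a simplified stand-in rather than JTidy's lexer, cuts a processing instruction out of the input using either rule; the class and method names are hypothetical.

```java
// Illustrative only: shows where a processing instruction ends under each rule.
// ProcInsTerminatorSketch and readProcIns are hypothetical names, not JTidy API.
final class ProcInsTerminatorSketch {

    /**
     * Returns the processing instruction at the start of input.
     * With xmlPIs true the instruction ends at "?>", otherwise at the first ">".
     */
    static String readProcIns(String input, boolean xmlPIs) {
        if (!input.startsWith("<?")) {
            throw new IllegalArgumentException("input does not start with <?");
        }
        String terminator = xmlPIs ? "?>" : ">";
        int end = input.indexOf(terminator, 2);
        if (end < 0) {
            return input; // unterminated: treat the rest of the input as the instruction
        }
        return input.substring(0, end + terminator.length());
    }

    public static void main(String[] args) {
        String input = "<?php echo 1 > 0 ?><p>";
        System.out.println(readProcIns(input, false));
        System.out.println(readProcIns(input, true));
    }
}
```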

setEncloseText

public void setEncloseText(boolean encloseText)
enclose-text- if true text at body is wrapped in <p>'s.

Parameters:
encloseText - true if tidy should wrap text at body in <p>'s.
See Also:
Configuration.encloseBodyText

getEncloseText

public boolean getEncloseText()
enclose-text- if true text at body is wrapped in <p>'s.

Returns:
true if tidy will wrap text at body in <p>'s.
See Also:
Configuration.encloseBodyText

setEncloseBlockText

public void setEncloseBlockText(boolean encloseBlockText)
enclose-block-text- if true text in blocks is wrapped in <p>'s.

Parameters:
encloseBlockText - true if tidy should wrap text in blocks in <p>'s.
See Also:
Configuration.encloseBlockText

getEncloseBlockText

public boolean getEncloseBlockText()
enclose-block-text- if true text in blocks is wrapped in <p>'s.

Returns:
true if tidy will wrap text in blocks in <p>'s.

See Also:
Configuration.encloseBlockText

setWord2000

public void setWord2000(boolean word2000)
word-2000- draconian cleaning for Word2000.

Parameters:
word2000 - true if tidy should clean word2000 documents
See Also:
Configuration.word2000

getWord2000

public boolean getWord2000()
word-2000- draconian cleaning for Word2000.

Returns:
true if tidy will clean word2000 documents
See Also:
Configuration.word2000

setTidyMark

public void setTidyMark(boolean tidyMark)
tidy-mark- add meta element indicating tidied doc.

Parameters:
tidyMark - true if tidy should add meta element indicating tidied doc
See Also:
Configuration.tidyMark

getTidyMark

public boolean getTidyMark()
tidy-mark- add meta element indicating tidied doc.

Returns:
true if tidy will add meta element indicating tidied doc
See Also:
Configuration.tidyMark

setXmlSpace

public void setXmlSpace(boolean xmlSpace)
add-xml-space- if set to yes adds xml:space attr as needed.

Parameters:
xmlSpace - true if tidy should add xml:space attr as needed
See Also:
Configuration.xmlSpace

getXmlSpace

public boolean getXmlSpace()
add-xml-space- if set to yes adds xml:space attr as needed.

Returns:
true if tidy will add xml:space attr as needed
See Also:
Configuration.xmlSpace

setEmacs

public void setEmacs(boolean emacs)
gnu-emacs- if true format error output for GNU Emacs.

Parameters:
emacs - true if tidy should format error output for GNU Emacs
See Also:
Configuration.emacs

getEmacs

public boolean getEmacs()
gnu-emacs- if true format error output for GNU Emacs.

Returns:
true if tidy will format error output for GNU Emacs
See Also:
Configuration.emacs

setLiteralAttribs

public void setLiteralAttribs(boolean literalAttribs)
literal-attributes- if true attributes may use newlines.

Parameters:
literalAttribs - true if attributes may use newlines
See Also:
Configuration.literalAttribs

getLiteralAttribs

public boolean getLiteralAttribs()
literal-attributes- if true attributes may use newlines.

Returns:
true if attributes may use newlines
See Also:
Configuration.literalAttribs

setPrintBodyOnly

public void setPrintBodyOnly(boolean bodyOnly)
print-body-only- output BODY content only.

Parameters:
bodyOnly - true = print only the document body
See Also:
Configuration.bodyOnly

getPrintBodyOnly

public boolean getPrintBodyOnly()
print-body-only- output BODY content only.

Returns:
true if tidy will print only the document body

setFixUri

public void setFixUri(boolean fixUri)
fix-uri- fix uri references applying URI encoding if necessary.

Parameters:
fixUri - true = fix uri references
See Also:
Configuration.fixUri

getFixUri

public boolean getFixUri()
fix-uri- fix uri references applying URI encoding if necessary.

Returns:
true if tidy will fix uri references

setLowerLiterals

public void setLowerLiterals(boolean lowerLiterals)
lower-literals- folds known attribute values to lower case.

Parameters:
lowerLiterals - true = folds known attribute values to lower case
See Also:
Configuration.lowerLiterals

getLowerLiterals

public boolean getLowerLiterals()
lower-literals- folds known attribute values to lower case.

Returns:
true if tidy will fold known attribute values to lower case

setHideComments

public void setHideComments(boolean hideComments)
hide-comments- hides all (real) comments in output.

Parameters:
hideComments - true = hides all comments in output
See Also:
Configuration.hideComments

getHideComments

public boolean getHideComments()
hide-comments- hides all (real) comments in output.

Returns:
true if tidy will hide all comments in output

setIndentCdata

public void setIndentCdata(boolean indentCdata)
indent-cdata- indent CDATA sections.

Parameters:
indentCdata - true = indent CDATA sections
See Also:
Configuration.indentCdata

getIndentCdata

public boolean getIndentCdata()
indent-cdata- indent CDATA sections.

Returns:
true if tidy will indent CDATA sections

setForceOutput

public void setForceOutput(boolean forceOutput)
force-output- output document even if errors were found.

Parameters:
forceOutput - true = output document even if errors were found
See Also:
Configuration.forceOutput

getForceOutput

public boolean getForceOutput()
force-output- output document even if errors were found.

Returns:
true if tidy will output document even if errors were found

setShowErrors

public void setShowErrors(int showErrors)
show-errors- set the number of errors to put out.

Parameters:
showErrors - number of errors to put out
See Also:
Configuration.showErrors

getShowErrors

public int getShowErrors()
show-errors- number of errors to put out.

Returns:
the number of errors tidy will put out

setAsciiChars

public void setAsciiChars(boolean asciiChars)
ascii-chars- convert quotes and dashes to nearest ASCII char.

Parameters:
asciiChars - true = convert quotes and dashes to nearest ASCII char
See Also:
Configuration.asciiChars

getAsciiChars

public boolean getAsciiChars()
ascii-chars- convert quotes and dashes to nearest ASCII char.

Returns:
true if tidy will convert quotes and dashes to nearest ASCII char
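
To make "nearest ASCII char" concrete, the sketch below maps curly quotes and en/em dashes to plain ASCII. The exact set of characters JTidy converts is not listed in this page, so the table here is an assumption; the class and method names are hypothetical.

```java
// Illustrative only: one plausible quote/dash to ASCII mapping.
// The character table is an assumption, not taken from JTidy.
final class AsciiCharsSketch {

    static String toAscii(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '\u2018': // left single quotation mark
                case '\u2019': // right single quotation mark
                    out.append('\'');
                    break;
                case '\u201C': // left double quotation mark
                case '\u201D': // right double quotation mark
                    out.append('"');
                    break;
                case '\u2013': // en dash
                case '\u2014': // em dash
                    out.append('-');
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toAscii("\u201CHi\u201D \u2014 it\u2019s"));
    }
}
```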

setJoinClasses

public void setJoinClasses(boolean joinClasses)
join-classes- join multiple class attributes.

Parameters:
joinClasses - true = join multiple class attributes
See Also:
Configuration.joinClasses

getJoinClasses

public boolean getJoinClasses()
join-classes- join multiple class attributes.

Returns:
true if tidy will join multiple class attributes

setJoinStyles

public void setJoinStyles(boolean joinStyles)
join-styles- join multiple style attributes.

Parameters:
joinStyles - true = join multiple style attributes
See Also:
Configuration.joinStyles

getJoinStyles

public boolean getJoinStyles()
join-styles- join multiple style attributes.

Returns:
true if tidy will join multiple style attributes

setTrimEmptyElements

public void setTrimEmptyElements(boolean trimEmpty)
trim-empty-elements- trim empty elements.

Parameters:
trimEmpty - true = trim empty elements
See Also:
Configuration.trimEmpty

getTrimEmptyElements

public boolean getTrimEmptyElements()
trim-empty-elements- trim empty elements.

Returns:
true if tidy will trim empty elements

setReplaceColor

public void setReplaceColor(boolean replaceColor)
replace-color- replace hex color attribute values with names.

Parameters:
replaceColor - true = replace hex color attribute values with names
See Also:
Configuration.replaceColor

getReplaceColor

public boolean getReplaceColor()
replace-color- replace hex color attribute values with names.

Returns:
true if tidy will replace hex color attribute values with names
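
A minimal sketch of the replacement replace-color describes, assuming a lookup from hex values to HTML color names. Only four well-known pairs are included; the table JTidy actually uses is not shown in this page, and the class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Illustrative only: replaces a hex color with its name when one is known.
// The four entries are standard HTML color names; the table is deliberately partial.
final class ReplaceColorSketch {

    private static final Map<String, String> NAMES = new HashMap<>();
    static {
        NAMES.put("#000000", "black");
        NAMES.put("#ffffff", "white");
        NAMES.put("#ff0000", "red");
        NAMES.put("#0000ff", "blue");
    }

    /** Returns the color name for a known hex value, or the value unchanged. */
    static String replaceColor(String value) {
        String name = NAMES.get(value.toLowerCase(Locale.ROOT));
        return name != null ? name : value;
    }

    public static void main(String[] args) {
        System.out.println(replaceColor("#FF0000"));
        System.out.println(replaceColor("#123456"));
    }
}
```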

setEscapeCdata

public void setEscapeCdata(boolean escapeCdata)
escape-cdata- replace CDATA sections with escaped text.

Parameters:
escapeCdata - true = replace CDATA sections with escaped text
See Also:
Configuration.escapeCdata

getEscapeCdata

public boolean getEscapeCdata()
escape-cdata- replace CDATA sections with escaped text.

Returns:
true if tidy will replace CDATA sections with escaped text

setRepeatedAttributes

public void setRepeatedAttributes(int repeatedAttributes)
repeated-attributes- keep first or last duplicate attribute.

Parameters:
repeatedAttributes - Configuration.KEEP_FIRST | Configuration.KEEP_LAST
See Also:
Configuration.duplicateAttrs

getRepeatedAttributes

public int getRepeatedAttributes()
repeated-attributes- keep first or last duplicate attribute.

Returns:
Configuration.KEEP_FIRST | Configuration.KEEP_LAST
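
The two policies can be pictured with the sketch below, which resolves duplicate attribute names by keeping either the first or the last value. The Policy enum is a stand-in for the int constants Configuration.KEEP_FIRST and Configuration.KEEP_LAST, whose numeric values are not given here; the class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: keep-first versus keep-last handling of duplicate attributes.
// Policy stands in for the int constants; it is not part of JTidy.
final class RepeatedAttributesSketch {

    enum Policy { KEEP_FIRST, KEEP_LAST }

    /** Each attribute is a {name, value} pair, given in document order. */
    static Map<String, String> resolve(List<String[]> attributes, Policy policy) {
        Map<String, String> resolved = new LinkedHashMap<>();
        for (String[] attr : attributes) {
            if (policy == Policy.KEEP_FIRST) {
                resolved.putIfAbsent(attr[0], attr[1]); // later duplicates are ignored
            } else {
                resolved.put(attr[0], attr[1]);         // later duplicates overwrite
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        List<String[]> attrs = java.util.Arrays.asList(
                new String[] {"align", "left"},
                new String[] {"align", "right"});
        System.out.println(resolve(attrs, Policy.KEEP_FIRST));
        System.out.println(resolve(attrs, Policy.KEEP_LAST));
    }
}
```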

setKeepFileTimes

public void setKeepFileTimes(boolean keepFileTimes)
keep-time- if true last modified time is preserved.

Parameters:
keepFileTimes - true if tidy should preserve the last modified time of the input file.
See Also:
Configuration.keepFileTimes

getKeepFileTimes

public boolean getKeepFileTimes()
keep-time- if true last modified time is preserved.

Returns:
true if tidy will preserve the last modified time of the input file.
See Also:
Configuration.keepFileTimes

setRawOut

public void setRawOut(boolean rawOut)
output-raw- avoid mapping values > 127 to entities. This has the same effect as specifying a "raw" encoding in the original version of tidy.

Parameters:
rawOut - avoid mapping values > 127 to entities
See Also:
Configuration.rawOut

getRawOut

public boolean getRawOut()
output-raw- avoid mapping values > 127 to entities.

Returns:
true if tidy will not map values > 127 to entities
See Also:
Configuration.rawOut
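
The effect of output-raw can be sketched as follows: with the option off, characters above 127 are written as entities; with it on, they pass through unchanged. This stand-in always emits numeric character references, which is an assumption (the real output may use named entities), and it does not escape markup characters. The class and method names are hypothetical.

```java
// Illustrative only: maps values > 127 to numeric character references
// unless rawOut is set. Not JTidy's own output routine.
final class RawOutSketch {

    static String encode(String text, boolean rawOut) {
        if (rawOut) {
            return text; // leave values > 127 untouched
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (cp > 127) {
                out.append("&#").append(cp).append(';');
            } else {
                out.appendCodePoint(cp);
            }
            i += Character.charCount(cp);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("caf\u00e9", false));
        System.out.println(encode("caf\u00e9", true));
    }
}
```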

setInputEncoding

public void setInputEncoding(java.lang.String encoding)
input-encoding- the character encoding used for input.

Parameters:
encoding - a valid java encoding name

getInputEncoding

public java.lang.String getInputEncoding()
input-encoding- the character encoding used for input.

Returns:
the java name of the encoding currently used for input

setOutputEncoding

public void setOutputEncoding(java.lang.String encoding)
output-encoding- the character encoding used for output.

Parameters:
encoding - a valid java encoding name

getOutputEncoding

public java.lang.String getOutputEncoding()
output-encoding- the character encoding used for output.

Returns:
the java name of the encoding currently used for output
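
Since both encoding setters expect a valid Java encoding name, a caller may want to check a name before passing it in. The sketch below uses the standard java.nio.charset.Charset API for that check. Whether JTidy accepts exactly the names the running JVM recognises is an assumption; the class and method names are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Illustrative only: pre-checks an encoding name against the running JVM.
// EncodingNameCheck is a hypothetical helper, not part of JTidy.
final class EncodingNameCheck {

    /** Returns true if the JVM knows a charset with this name or alias. */
    static boolean isUsableEncoding(String name) {
        if (name == null) {
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            return false; // the name contains characters no charset name may have
        }
    }

    public static void main(String[] args) {
        System.out.println(isUsableEncoding("UTF-8"));
        System.out.println(isUsableEncoding("no such encoding!"));
    }
}
```

A caller could run this check first and fall back to a default such as "UTF-8" when it fails.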


Copyright © 2000-2006 sourceforge. All Rights Reserved.