org.w3c.tidy
Class Configuration

java.lang.Object
  extended by org.w3c.tidy.Configuration
All Implemented Interfaces:
java.io.Serializable

public class Configuration
extends java.lang.Object
implements java.io.Serializable

Reads a configuration file and manages configuration properties. A configuration file associates each property name with a value, using the format of a Java .properties file.
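
As a rough illustration of the .properties format mentioned above, the sketch below parses a small configuration text with `java.util.Properties`. The class name `ConfigFormatSketch` and the option names (`indent`, `wrap`, `tidy-mark`) are assumptions chosen for illustration; check which option names your JTidy version accepts before relying on them.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

class ConfigFormatSketch {
    // Sample text in .properties syntax; the option names are illustrative assumptions.
    static final String SAMPLE =
        "# comment lines start with '#'\n"
        + "indent: yes\n"
        + "wrap = 72\n"
        + "tidy-mark=no\n";

    // Parses .properties text into name/value pairs.
    static Properties load(String text) {
        Properties props = new Properties();
        try (StringReader reader = new StringReader(text)) {
            props.load(reader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load(SAMPLE);
        for (String name : props.stringPropertyNames()) {
            System.out.println(name + " -> " + props.getProperty(name));
        }
    }
}
```

Both `:` and `=` are accepted as separators by `Properties.load`, and whitespace around the separator is ignored.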

Version:
$Revision: 807 $ ($Author: fgiust $)
Author:
Dave Raggett dsr@w3.org, Andy Quick ac.quick@sympatico.ca (translation to Java), Fabrizio Giustina
See Also:
Serialized Form

Field Summary
protected  java.lang.String altText
          default text for alt attribute.
static int ASCII
          Deprecated.  
protected  boolean asciiChars
          convert quotes and dashes to nearest ASCII char.
static int BIG5
          Deprecated.  
protected  boolean bodyOnly
          output BODY content only.
protected  boolean breakBeforeBR
          output a newline before br or not?
protected  boolean burstSlides
          create slides on each h2 element.
protected  java.lang.String cssPrefix
          CSS class naming for -clean option.
protected  int definedTags
          track what types of tags user has defined to eliminate unnecessary searches.
static int DOCTYPE_AUTO
          treatment of doctype: auto.
static int DOCTYPE_LOOSE
          treatment of doctype: loose.
static int DOCTYPE_OMIT
          treatment of doctype: omit.
static int DOCTYPE_STRICT
          treatment of doctype: strict.
static int DOCTYPE_USER
          treatment of doctype: user.
protected  int docTypeMode
          see doctype property.
protected  java.lang.String docTypeStr
          user specified doctype.
protected  boolean dropEmptyParas
          discard empty p elements.
protected  boolean dropFontTags
          discard presentation tags.
protected  boolean dropProprietaryAttributes
          discard proprietary attributes.
protected  int duplicateAttrs
          Keep first or last duplicate attribute.
protected  boolean emacs
          if true format error output for GNU Emacs.
protected  boolean encloseBlockText
          if yes text in blocks is wrapped in p's.
protected  boolean encloseBodyText
          if yes text at body is wrapped in p's.
protected  java.lang.String errfile
          file name to write errors to.
protected  boolean escapeCdata
          replace CDATA sections with escaped text.
protected  boolean fixBackslash
          fix URLs by replacing \ with /.
protected  boolean fixComments
          fix comments with adjacent hyphens.
protected  boolean fixUri
          properly escape URLs.
protected  boolean forceOutput
          output document even if errors were found.
protected  boolean hideComments
          hides all (real) comments in output.
protected  boolean hideEndTags
          suppress optional end tags.
protected  boolean htmlOut
          output plain-old HTML, even for XHTML input.
protected  boolean indentAttributes
          newline+indent before each attribute.
protected  boolean indentCdata
          indent CDATA sections.
protected  boolean indentContent
          indent content of appropriate tags.
static int ISO2022
          Deprecated.  
protected  boolean joinClasses
          join multiple class attributes.
protected  boolean joinStyles
          join multiple style attributes.
static int KEEP_FIRST
          Keep first duplicate attribute.
static int KEEP_LAST
          Keep last duplicate attribute.
protected  boolean keepFileTimes
          if yes last modified time is preserved.
protected  java.lang.String language
          RJ language property.
static int LATIN1
          Deprecated.  
protected  boolean literalAttribs
          if true attributes may use newlines.
protected  boolean logicalEmphasis
          replace i by em and b by strong.
protected  boolean lowerLiterals
          folds known attribute values to lower case.
static int MACROMAN
          Deprecated.  
protected  boolean makeBare
          Make bare HTML: remove Microsoft cruft.
protected  boolean makeClean
          remove presentational clutter.
protected  boolean ncr
          allow numeric character references.
protected  char[] newline
          bytes for the newline marker.
protected  boolean numEntities
          use numeric entities.
protected  boolean onlyErrors
          if true normal output is suppressed.
protected  boolean quiet
          no 'Parsing X', guessed DTD or summary.
protected  boolean quoteAmpersand
          output naked ampersand as &amp;.
protected  boolean quoteMarks
          output " marks as &quot;.
protected  boolean quoteNbsp
          output non-breaking space as entity.
static int RAW
          Deprecated. use Tidy.setRawOut(true) for raw output
protected  boolean rawOut
          Avoid mapping values > 127 to entities.
protected  boolean replaceColor
          replace hex color attribute values with names.
protected  java.lang.String replacementCharEncoding
          char encoding used when replacing illegal SGML chars, regardless of specified encoding.
protected  Report report
          Report instance.
static int SHIFTJIS
          Deprecated.  
protected  int showErrors
          number of errors to put out.
protected  boolean showWarnings
          show warnings (errors are always shown).
protected  java.lang.String slidestyle
          Deprecated. does nothing
protected  boolean smartIndent
          does text/block level content affect indentation.
protected  int spaces
          default indentation.
protected  int tabsize
          default tab size (8).
protected  boolean tidyMark
          add meta element indicating tidied doc.
protected  boolean trimEmpty
          trim empty elements.
protected  TagTable tt
          TagTable associated with this Configuration.
protected  boolean upperCaseAttrs
          output attributes in upper not lower case.
protected  boolean upperCaseTags
          output tags in upper not lower case.
static int UTF16
          Deprecated.  
static int UTF16BE
          Deprecated.  
static int UTF16LE
          Deprecated.  
static int UTF8
          Deprecated.  
static int WIN1252
          Deprecated.  
protected  boolean word2000
          draconian cleaning for Word2000.
protected  boolean wrapAsp
          wrap within ASP pseudo elements.
protected  boolean wrapAttVals
          wrap within attribute values.
protected  boolean wrapJste
          wrap within JSTE pseudo elements.
protected  int wraplen
          default wrap margin (68).
protected  boolean wrapPhp
          wrap within PHP pseudo elements.
protected  boolean wrapScriptlets
          wrap within JavaScript string literals.
protected  boolean wrapSection
          wrap within CDATA section tags.
protected  boolean writeback
          if true then output tidied markup.
protected  boolean xHTML
          output extensible HTML.
protected  boolean xmlOut
          create output as XML.
protected  boolean xmlPi
          add <?xml?> for XML docs.
protected  boolean xmlPIs
          If set to yes PIs must end with ?>.
protected  boolean xmlSpace
          if set to yes adds xml:space attr as needed.
protected  boolean xmlTags
          treat input as XML.
 
Constructor Summary
protected Configuration(Report report)
          Instantiates a new Configuration.
 
Method Summary
 void addProps(java.util.Properties p)
          adds configuration Properties.
 void adjust()
          Ensure that config is self-consistent.
protected  java.lang.String convertCharEncoding(int code)
          Convert a char encoding from the deprecated tidy constant to a standard java encoding name.
protected  java.lang.String getInCharEncodingName()
          Getter for inCharEncodingName.
protected  java.lang.String getOutCharEncodingName()
          Getter for outCharEncodingName.
static boolean isKnownOption(java.lang.String name)
          Is the given String a valid configuration flag?
 void parseFile(java.lang.String filename)
          Parses a property file.
protected  void setInCharEncoding(int encoding)
          Deprecated. use setInCharEncodingName(String)
protected  void setInCharEncodingName(java.lang.String encoding)
          Setter for inCharEncodingName.
protected  void setInOutEncodingName(java.lang.String encoding)
          Setter for inOutCharEncodingName.
protected  void setOutCharEncoding(int encoding)
          Deprecated. use setOutCharEncodingName(String)
protected  void setOutCharEncodingName(java.lang.String encoding)
          Setter for outCharEncodingName.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

RAW

public static final int RAW
Deprecated. use Tidy.setRawOut(true) for raw output
character encoding = RAW.

See Also:
Constant Field Values

ASCII

public static final int ASCII
Deprecated. 
character encoding = ASCII.

See Also:
Constant Field Values

LATIN1

public static final int LATIN1
Deprecated. 
character encoding = LATIN1.

See Also:
Constant Field Values

UTF8

public static final int UTF8
Deprecated. 
character encoding = UTF8.

See Also:
Constant Field Values

ISO2022

public static final int ISO2022
Deprecated. 
character encoding = ISO2022.

See Also:
Constant Field Values

MACROMAN

public static final int MACROMAN
Deprecated. 
character encoding = MACROMAN.

See Also:
Constant Field Values

UTF16LE

public static final int UTF16LE
Deprecated. 
character encoding = UTF16LE.

See Also:
Constant Field Values

UTF16BE

public static final int UTF16BE
Deprecated. 
character encoding = UTF16BE.

See Also:
Constant Field Values

UTF16

public static final int UTF16
Deprecated. 
character encoding = UTF16.

See Also:
Constant Field Values

WIN1252

public static final int WIN1252
Deprecated. 
character encoding = WIN1252.

See Also:
Constant Field Values

BIG5

public static final int BIG5
Deprecated. 
character encoding = BIG5.

See Also:
Constant Field Values

SHIFTJIS

public static final int SHIFTJIS
Deprecated. 
character encoding = SHIFTJIS.

See Also:
Constant Field Values

DOCTYPE_OMIT

public static final int DOCTYPE_OMIT
treatment of doctype: omit.

See Also:
Constant Field Values

DOCTYPE_AUTO

public static final int DOCTYPE_AUTO
treatment of doctype: auto.

See Also:
Constant Field Values

DOCTYPE_STRICT

public static final int DOCTYPE_STRICT
treatment of doctype: strict.

See Also:
Constant Field Values

DOCTYPE_LOOSE

public static final int DOCTYPE_LOOSE
treatment of doctype: loose.

See Also:
Constant Field Values

DOCTYPE_USER

public static final int DOCTYPE_USER
treatment of doctype: user.

See Also:
Constant Field Values

KEEP_LAST

public static final int KEEP_LAST
Keep last duplicate attribute.

See Also:
Constant Field Values

KEEP_FIRST

public static final int KEEP_FIRST
Keep first duplicate attribute.

See Also:
Constant Field Values

spaces

protected int spaces
default indentation.


wraplen

protected int wraplen
default wrap margin (68).


tabsize

protected int tabsize
default tab size (8).


docTypeMode

protected int docTypeMode
see doctype property.


duplicateAttrs

protected int duplicateAttrs
Keep first or last duplicate attribute.
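
To make the two modes concrete, here is a minimal sketch of how keep-first versus keep-last resolution of duplicate attributes could behave. The class `DuplicateAttrSketch`, its `resolve` method, and the constant values are hypothetical stand-ins, not JTidy's implementation; the real constant values are listed under Constant Field Values.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class DuplicateAttrSketch {
    // Hypothetical stand-ins for Configuration.KEEP_LAST and KEEP_FIRST; real values may differ.
    static final int KEEP_LAST = 0;
    static final int KEEP_FIRST = 1;

    // Resolves duplicate names in a list of {name, value} pairs according to the mode.
    static Map<String, String> resolve(List<String[]> attrs, int mode) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String[] attr : attrs) {
            if (mode == KEEP_FIRST) {
                // The first value seen for a name wins; later duplicates are ignored.
                result.putIfAbsent(attr[0], attr[1]);
            } else {
                // The last value seen for a name wins; it overwrites earlier ones.
                result.put(attr[0], attr[1]);
            }
        }
        return result;
    }
}
```

For the input `class="a" id="x" class="b"`, keep-first yields `class="a"` and keep-last yields `class="b"`; `id` is unaffected in both cases.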


altText

protected java.lang.String altText
default text for alt attribute.


slidestyle

protected java.lang.String slidestyle
Deprecated. does nothing
style sheet for slides.


language

protected java.lang.String language
RJ language property.


docTypeStr

protected java.lang.String docTypeStr
user specified doctype.


errfile

protected java.lang.String errfile
file name to write errors to.


writeback

protected boolean writeback
if true then output tidied markup.


onlyErrors

protected boolean onlyErrors
if true normal output is suppressed.


showWarnings

protected boolean showWarnings
show warnings (errors are always shown).


quiet

protected boolean quiet
no 'Parsing X', guessed DTD or summary.


indentContent

protected boolean indentContent
indent content of appropriate tags.


smartIndent

protected boolean smartIndent
does text/block level content affect indentation.


hideEndTags

protected boolean hideEndTags
suppress optional end tags.


xmlTags

protected boolean xmlTags
treat input as XML.


xmlOut

protected boolean xmlOut
create output as XML.


xHTML

protected boolean xHTML
output extensible HTML.


htmlOut

protected boolean htmlOut
output plain-old HTML, even for XHTML input. Yes means set explicitly.


xmlPi

protected boolean xmlPi
add <?xml?> for XML docs.


upperCaseTags

protected boolean upperCaseTags
output tags in upper not lower case.


upperCaseAttrs

protected boolean upperCaseAttrs
output attributes in upper not lower case.


makeClean

protected boolean makeClean
remove presentational clutter.


makeBare

protected boolean makeBare
Make bare HTML: remove Microsoft cruft.


logicalEmphasis

protected boolean logicalEmphasis
replace i by em and b by strong.


dropFontTags

protected boolean dropFontTags
discard presentation tags.


dropProprietaryAttributes

protected boolean dropProprietaryAttributes
discard proprietary attributes.


dropEmptyParas

protected boolean dropEmptyParas
discard empty p elements.


fixComments

protected boolean fixComments
fix comments with adjacent hyphens.


trimEmpty

protected boolean trimEmpty
trim empty elements.


breakBeforeBR

protected boolean breakBeforeBR
output a newline before br or not?


burstSlides

protected boolean burstSlides
create slides on each h2 element.


numEntities

protected boolean numEntities
use numeric entities.


quoteMarks

protected boolean quoteMarks
output " marks as &quot;.


quoteNbsp

protected boolean quoteNbsp
output non-breaking space as entity.


quoteAmpersand

protected boolean quoteAmpersand
output naked ampersand as &amp;.


wrapAttVals

protected boolean wrapAttVals
wrap within attribute values.


wrapScriptlets

protected boolean wrapScriptlets
wrap within JavaScript string literals.


wrapSection

protected boolean wrapSection
wrap within CDATA section tags.


wrapAsp

protected boolean wrapAsp
wrap within ASP pseudo elements.


wrapJste

protected boolean wrapJste
wrap within JSTE pseudo elements.


wrapPhp

protected boolean wrapPhp
wrap within PHP pseudo elements.


fixBackslash

protected boolean fixBackslash
fix URLs by replacing \ with /.


indentAttributes

protected boolean indentAttributes
newline+indent before each attribute.


xmlPIs

protected boolean xmlPIs
If set to yes PIs must end with ?>.


xmlSpace

protected boolean xmlSpace
if set to yes adds xml:space attr as needed.


encloseBodyText

protected boolean encloseBodyText
if yes text at body is wrapped in p's.


encloseBlockText

protected boolean encloseBlockText
if yes text in blocks is wrapped in p's.


keepFileTimes

protected boolean keepFileTimes
if yes last modified time is preserved.


word2000

protected boolean word2000
draconian cleaning for Word2000.


tidyMark

protected boolean tidyMark
add meta element indicating tidied doc.


emacs

protected boolean emacs
if true format error output for GNU Emacs.


literalAttribs

protected boolean literalAttribs
if true attributes may use newlines.


bodyOnly

protected boolean bodyOnly
output BODY content only.


fixUri

protected boolean fixUri
properly escape URLs.


lowerLiterals

protected boolean lowerLiterals
folds known attribute values to lower case.


replaceColor

protected boolean replaceColor
replace hex color attribute values with names.


hideComments

protected boolean hideComments
hides all (real) comments in output.


indentCdata

protected boolean indentCdata
indent CDATA sections.


forceOutput

protected boolean forceOutput
output document even if errors were found.


showErrors

protected int showErrors
number of errors to put out.


asciiChars

protected boolean asciiChars
convert quotes and dashes to nearest ASCII char.


joinClasses

protected boolean joinClasses
join multiple class attributes.


joinStyles

protected boolean joinStyles
join multiple style attributes.


escapeCdata

protected boolean escapeCdata
replace CDATA sections with escaped text.


ncr

protected boolean ncr
allow numeric character references.


cssPrefix

protected java.lang.String cssPrefix
CSS class naming for -clean option.


replacementCharEncoding

protected java.lang.String replacementCharEncoding
char encoding used when replacing illegal SGML chars, regardless of specified encoding.


tt

protected TagTable tt
TagTable associated with this Configuration.


report

protected Report report
Report instance. Used for messages.


definedTags

protected int definedTags
track what types of tags user has defined to eliminate unnecessary searches.


newline

protected char[] newline
bytes for the newline marker.


rawOut

protected boolean rawOut
Avoid mapping values > 127 to entities.

Constructor Detail

Configuration

protected Configuration(Report report)
Instantiates a new Configuration. This constructor should be called by Tidy only.

Parameters:
report - Report instance
Method Detail

addProps

public void addProps(java.util.Properties p)
adds configuration Properties.

Parameters:
p - Properties

parseFile

public void parseFile(java.lang.String filename)
Parses a property file.

Parameters:
filename - file name

isKnownOption

public static boolean isKnownOption(java.lang.String name)
Is the given String a valid configuration flag?

Parameters:
name - configuration parameter name
Returns:
true if the given String is a valid config option
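
One plausible use of such a check is to validate property names before handing them to the configuration. The sketch below assumes a simple set lookup. `KnownOptionSketch`, its option subset, and the `unknownKeys` helper are hypothetical and only mirror the documented contract (name in, boolean out); they are not JTidy's option table.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class KnownOptionSketch {
    // Illustrative subset only; the real option table is assumed to be larger.
    private static final Set<String> KNOWN =
        new HashSet<>(Arrays.asList("indent", "wrap", "tab-size", "tidy-mark"));

    // Mirrors the documented contract: true if the name is a recognised option.
    static boolean isKnownOption(String name) {
        return name != null && KNOWN.contains(name);
    }

    // Hypothetical helper: lists unrecognised property names in sorted order.
    static List<String> unknownKeys(Properties props) {
        List<String> unknown = new ArrayList<>();
        for (String name : new TreeSet<>(props.stringPropertyNames())) {
            if (!isKnownOption(name)) {
                unknown.add(name);
            }
        }
        return unknown;
    }
}
```

With the real class, the equivalent check would call the static `Configuration.isKnownOption(name)` for each key.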

adjust

public void adjust()
Ensure that config is self-consistent.
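
The page does not say which rules are enforced, so the following is only a sketch of what a self-consistency pass over a few of the fields above might look like. Every rule in `AdjustSketch` is an assumption inferred from the field descriptions (for example, that XHTML output implies XML output and lower-case names), not a transcription of JTidy's code.

```java
class AdjustSketch {
    // A small subset of the documented fields, with the documented default wrap margin.
    boolean encloseBlockText, encloseBodyText;
    boolean smartIndent, indentContent;
    boolean xmlTags, xmlOut, xHTML;
    boolean upperCaseTags, upperCaseAttrs;
    boolean quoteAmpersand, hideEndTags;
    int wraplen = 68;

    // Assumed rules: each dependent option is forced to a value compatible with its parent.
    void adjust() {
        if (encloseBlockText) {
            encloseBodyText = true;      // enclosing block text implies enclosing body text
        }
        if (smartIndent) {
            indentContent = true;        // smart indentation only makes sense when indenting
        }
        if (wraplen == 0) {
            wraplen = Integer.MAX_VALUE; // treat 0 as "never wrap"
        }
        if (xmlTags) {
            xHTML = false;               // generic XML input rules out XHTML output
            xmlOut = true;               // XML in, XML out
        }
        if (xHTML) {
            xmlOut = true;               // XHTML is XML, written in lower case
            upperCaseTags = false;
            upperCaseAttrs = false;
        }
        if (xmlOut) {
            quoteAmpersand = true;       // XML needs escaped ampersands and all end tags
            hideEndTags = false;
        }
    }
}
```

The point of such a pass is that callers can set options in any order and rely on one final call to reconcile conflicting combinations.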


getInCharEncodingName

protected java.lang.String getInCharEncodingName()
Getter for inCharEncodingName.

Returns:
Returns the inCharEncodingName.

setInCharEncodingName

protected void setInCharEncodingName(java.lang.String encoding)
Setter for inCharEncodingName.

Parameters:
encoding - The inCharEncodingName to set.

getOutCharEncodingName

protected java.lang.String getOutCharEncodingName()
Getter for outCharEncodingName.

Returns:
Returns the outCharEncodingName.

setOutCharEncodingName

protected void setOutCharEncodingName(java.lang.String encoding)
Setter for outCharEncodingName.

Parameters:
encoding - The outCharEncodingName to set.

setInOutEncodingName

protected void setInOutEncodingName(java.lang.String encoding)
Setter for inOutCharEncodingName.

Parameters:
encoding - The CharEncodingName to set.

setOutCharEncoding

protected void setOutCharEncoding(int encoding)
Deprecated. use setOutCharEncodingName(String)

Setter for outCharEncoding.

Parameters:
encoding - The outCharEncoding to set.

setInCharEncoding

protected void setInCharEncoding(int encoding)
Deprecated. use setInCharEncodingName(String)

Setter for inCharEncoding.

Parameters:
encoding - The inCharEncoding to set.

convertCharEncoding

protected java.lang.String convertCharEncoding(int code)
Convert a char encoding from the deprecated tidy constant to a standard java encoding name.

Parameters:
code - encoding code
Returns:
encoding name
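
A conversion of this kind can be pictured as a lookup from integer code to charset name. In the sketch below, the class `EncodingNameSketch`, the integer values, and the returned names are all assumptions: the real constant values are listed under Constant Field Values, and JTidy may return different (for example legacy) encoding names. The names used here are the canonical `java.nio.charset` names that every Java platform is required to support.

```java
import java.nio.charset.Charset;

class EncodingNameSketch {
    // Hypothetical codes standing in for the deprecated constants; real values may differ.
    static final int ASCII = 1;
    static final int LATIN1 = 2;
    static final int UTF8 = 3;
    static final int UTF16LE = 6;
    static final int UTF16BE = 7;
    static final int UTF16 = 8;

    // Maps a legacy integer code to a Java charset name, or null if the code is unknown.
    static String convertCharEncoding(int code) {
        switch (code) {
            case ASCII:   return "US-ASCII";
            case LATIN1:  return "ISO-8859-1";
            case UTF8:    return "UTF-8";
            case UTF16LE: return "UTF-16LE";
            case UTF16BE: return "UTF-16BE";
            case UTF16:   return "UTF-16";
            default:      return null;
        }
    }

    public static void main(String[] args) {
        String name = convertCharEncoding(UTF8);
        // Resolving the name confirms that the platform knows this charset.
        System.out.println(name + " supported: " + Charset.isSupported(name));
    }
}
```

New code should prefer the String-based setters (`setInCharEncodingName`, `setOutCharEncodingName`) over the deprecated integer constants, as the deprecation notes above advise.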


Copyright © 2000-2006 sourceforge. All Rights Reserved.