java.lang.Object org.w3c.tidy.EncodingUtils
public final class EncodingUtils
Field Summary | |
---|---|
static int |
FSM_ASCII
States for ISO 2022. A document in an ISO-2022 based encoding uses ESC sequences called "designators" to switch character sets. |
static int |
FSM_ESC
state ESC. |
static int |
FSM_ESCD
state ESCD. |
static int |
FSM_ESCDP
state ESCDP. |
static int |
FSM_ESCP
state ESCP. |
static int |
FSM_NONASCII
state NONASCII. |
static int |
HIGH_UTF16_SURROGATE
UTF-16 high surrogate. |
static int |
LOW_UTF16_SURROGATE
UTF-16 low surrogate. |
static int |
MAX_UTF16_FROM_UCS4
Max UTF-16 value. |
static int |
MAX_UTF8_FROM_UCS4
Max UTF-8 valid char value. |
static int |
UNICODE_BOM
the default (big-endian) UNICODE BOM. |
static int |
UNICODE_BOM_BE
the big-endian (default) UNICODE BOM. |
static int |
UNICODE_BOM_LE
the little-endian UNICODE BOM. |
static int |
UNICODE_BOM_UTF8
the UTF-8 UNICODE BOM. |
static int |
UTF16_HIGH_SURROGATE_BEGIN
UTF-16 surrogate pair areas: high surrogates begin. |
static int |
UTF16_HIGH_SURROGATE_END
UTF-16 surrogate pair areas: high surrogates end. |
static int |
UTF16_LOW_SURROGATE_BEGIN
UTF-16 surrogate pair areas: low surrogates begin. |
static int |
UTF16_LOW_SURROGATE_END
UTF-16 surrogate pair areas: low surrogates end. |
static int |
UTF16_SURROGATES_BEGIN
UTF-16 surrogates begin. |
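
The `FSM_*` fields above name the states of a small state machine that tracks ISO 2022 designator sequences. The following is a minimal sketch of how such a machine might work, not the library's actual implementation. The class name, the state values, and the transition rules are assumptions, modeled on the common ISO-2022-JP designators `ESC $ B`, `ESC ( B`, and `ESC $ ( D`.

```java
final class Iso2022StateSketch {
    // Hypothetical state values; this page does not list the real constants' values.
    static final int ASCII = 0, ESC = 1, ESCD = 2, ESCDP = 3, ESCP = 4, NONASCII = 5;

    /** Returns the state reached after reading byte c while in the given state. */
    static int next(int state, int c) {
        if (c == 0x1B) {
            return ESC; // ESC always starts a new designator sequence
        }
        switch (state) {
            case ESC:
                if (c == '$') return ESCD;  // ESC $ : multi-byte set follows
                if (c == '(') return ESCP;  // ESC ( : single-byte set follows
                return ASCII;
            case ESCD:
                // ESC $ ( X is a longer designator; ESC $ X is complete here
                return c == '(' ? ESCDP : NONASCII;
            case ESCDP:
                return NONASCII;            // final byte of ESC $ ( X
            case ESCP:
                return ASCII;               // final byte of ESC ( X
            default:
                return state;               // ASCII and NONASCII persist until the next ESC
        }
    }

    /** Runs the machine over a byte sequence, starting in ASCII. */
    static int finalState(byte[] bytes) {
        int state = ASCII;
        for (byte b : bytes) {
            state = next(state, b & 0xFF);
        }
        return state;
    }
}
```

Under these assumptions, feeding the bytes `ESC $ B` leaves the machine in the non-ASCII state, and a following `ESC ( B` returns it to ASCII.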
Method Summary | |
---|---|
protected static int |
decodeMacRoman(int c)
Function to convert from MacRoman to Unicode. |
protected static int |
decodeWin1252(int c)
Function for conversion from Windows-1252 to Unicode. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int UNICODE_BOM_BE
public static final int UNICODE_BOM
public static final int UNICODE_BOM_LE
public static final int UNICODE_BOM_UTF8
public static final int FSM_ASCII
public static final int FSM_ESC
public static final int FSM_ESCD
public static final int FSM_ESCDP
public static final int FSM_ESCP
public static final int FSM_NONASCII
public static final int MAX_UTF8_FROM_UCS4
public static final int MAX_UTF16_FROM_UCS4
public static final int LOW_UTF16_SURROGATE
public static final int UTF16_SURROGATES_BEGIN
public static final int UTF16_LOW_SURROGATE_BEGIN
public static final int UTF16_LOW_SURROGATE_END
public static final int UTF16_HIGH_SURROGATE_BEGIN
public static final int UTF16_HIGH_SURROGATE_END
public static final int HIGH_UTF16_SURROGATE
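
The surrogate fields above describe the ranges used when a code point above the Basic Multilingual Plane is encoded in UTF-16. As an illustration, the sketch below splits a supplementary code point into a surrogate pair and recombines it. The class and constant names are hypothetical, and the numeric values are the ones defined by the Unicode standard, not values read from `EncodingUtils`, whose field values this page does not list.

```java
final class SurrogateSketch {
    // Values from the Unicode standard; names are illustrative only.
    static final int SUPPLEMENTARY_BEGIN = 0x10000;
    static final int MAX_CODE_POINT = 0x10FFFF;
    static final int HIGH_BEGIN = 0xD800;
    static final int HIGH_END = 0xDBFF;
    static final int LOW_BEGIN = 0xDC00;
    static final int LOW_END = 0xDFFF;

    /** Splits a supplementary code point into {high, low} surrogates. */
    static int[] split(int codePoint) {
        if (codePoint < SUPPLEMENTARY_BEGIN || codePoint > MAX_CODE_POINT) {
            throw new IllegalArgumentException("not a supplementary code point");
        }
        int offset = codePoint - SUPPLEMENTARY_BEGIN; // 20-bit value
        int high = HIGH_BEGIN + (offset >> 10);       // top 10 bits
        int low = LOW_BEGIN + (offset & 0x3FF);       // bottom 10 bits
        return new int[] {high, low};
    }

    /** Combines a valid high/low surrogate pair back into a code point. */
    static int combine(int high, int low) {
        if (high < HIGH_BEGIN || high > HIGH_END || low < LOW_BEGIN || low > LOW_END) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return SUPPLEMENTARY_BEGIN + ((high - HIGH_BEGIN) << 10) + (low - LOW_BEGIN);
    }
}
```

For example, `SurrogateSketch.split(0x1F600)` yields the pair `0xD83D`, `0xDE00`, which matches what `Character.toChars(0x1F600)` produces.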
Method Detail |
---|
protected static int decodeWin1252(int c)
Parameters: c - char to decode
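
Because `decodeWin1252` is `protected` in a `final` class, it cannot be called from outside the `org.w3c.tidy` package. The standalone sketch below illustrates the kind of mapping such a function performs. It is not the library's code. The class name is hypothetical, and passing the five unassigned bytes (0x81, 0x8D, 0x8F, 0x90, 0x9D) through unchanged is an assumption; the library may treat them differently.

```java
final class Win1252Sketch {
    // Unicode code points for Windows-1252 bytes 0x80..0x9F.
    // Unassigned slots keep their own value (an assumption of this sketch).
    private static final int[] HIGH_TABLE = {
        0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
        0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
        0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
        0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
    };

    /** Maps one Windows-1252 byte value (0..255) to a Unicode code point. */
    static int decode(int c) {
        if (c >= 0x80 && c <= 0x9F) {
            return HIGH_TABLE[c - 0x80];
        }
        // Everything else coincides with ISO-8859-1, and so with Unicode.
        return c;
    }
}
```

With this table, `Win1252Sketch.decode(0x80)` returns `0x20AC` (the euro sign) and `Win1252Sketch.decode(0x93)` returns `0x201C` (left double quotation mark).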
protected static int decodeMacRoman(int c)
Parameters: c - char to decode
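
A similar conversion for MacRoman can be approximated with the JDK's own charset support rather than a hand-written table. The sketch below is an illustration under stated assumptions, not the library's implementation. The class name and the `-1` return for out-of-range input are hypothetical. The `x-MacRoman` charset is an extended charset that may be absent from a trimmed-down runtime, so the code checks for it first.

```java
import java.nio.charset.Charset;

final class MacRomanSketch {
    /**
     * Maps one MacRoman byte value (0..255) to a Unicode code point.
     * Returns -1 for out-of-range input or when the charset is unavailable.
     */
    static int decode(int c) {
        if (c < 0 || c > 0xFF) {
            return -1;
        }
        if (c < 0x80) {
            return c; // the lower half of MacRoman is plain ASCII
        }
        if (!Charset.isSupported("x-MacRoman")) {
            return -1; // extended charsets are optional in the JDK
        }
        String decoded = new String(new byte[] {(byte) c}, Charset.forName("x-MacRoman"));
        return decoded.codePointAt(0);
    }
}
```

Where the charset is available, `MacRomanSketch.decode(0x80)` returns `0x00C4` (Latin capital letter A with diaeresis).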