public final class UriEncoder extends Object
Per Section 2.1 of RFC 3986, URIs should contain only characters that are part of US-ASCII, and some characters are further reserved to delimit components or subcomponents; therefore, characters outside the allowed set need to be encoded. This is done using escape sequences of the form "%XX", where XX is the two-digit hexadecimal value of one byte of the character's encoded representation; a character that occupies several bytes yields several consecutive escape sequences.
This encoding format is used for the application/x-www-form-urlencoded content type, as defined by section 17.13.4 of the W3C's HTML 4.01 Specification.
For example, the Unicode string "flambé" is represented as the byte
sequence [0x66, 0x6c, 0x61, 0x6d, 0x62, 0xe9] in ISO-8859-1. In
UTF-8, it is represented as [0x66, 0x6c, 0x61, 0x6d, 0x62, 0xc3,
0xa9]. The first five characters are unreserved and do not require encoding,
but the last character is not, so the URI representation is "flamb%E9" in
ISO-8859-1 and "flamb%C3%A9" in UTF-8. Escape sequences are not
case-sensitive.
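The byte-wise escaping described above can be sketched in a few lines. This is a minimal illustration of strict RFC 3986 percent-encoding, not the implementation of UriEncoder: the class name PercentEncodingSketch and its methods are hypothetical, and unlike the form-oriented methods of this class it does not turn spaces into '+'.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: percent-encodes every byte that is not an
// RFC 3986 unreserved character. Assumes an ASCII-compatible charset
// (such as UTF-8 or ISO-8859-1), because the check is done per byte.
class PercentEncodingSketch {

    // RFC 3986 Section 2.3: ALPHA / DIGIT / "-" / "." / "_" / "~"
    static boolean isUnreserved(int b) {
        return (b >= 'A' && b <= 'Z') || (b >= 'a' && b <= 'z')
            || (b >= '0' && b <= '9')
            || b == '-' || b == '.' || b == '_' || b == '~';
    }

    static String percentEncode(String s, Charset charset) {
        StringBuilder out = new StringBuilder();
        for (byte raw : s.getBytes(charset)) {
            int b = raw & 0xFF;                 // treat the byte as unsigned
            if (isUnreserved(b)) {
                out.append((char) b);           // copied through unchanged
            } else {
                out.append('%').append(String.format("%02X", b));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "flamb\u00e9";            // "flambé"
        System.out.println(percentEncode(text, StandardCharsets.ISO_8859_1)); // flamb%E9
        System.out.println(percentEncode(text, StandardCharsets.UTF_8));      // flamb%C3%A9
    }
}
```

The same input yields one escape sequence in ISO-8859-1 and two in UTF-8 because the final character occupies one byte in the former and two in the latter.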
| Modifier and Type | Field and Description |
|---|---|
| static Charset | DEFAULT_ENCODING: The default character encoding, UTF-8, per Section 2.5 of RFC 3986. |

| Modifier and Type | Method and Description |
|---|---|
| static String | decode(String string): Percent-decodes a US-ASCII string into a Unicode string. |
| static String | decode(String string, Charset encoding): Percent-decodes a US-ASCII string into a Unicode string. |
| static String | encode(String string): Percent-encodes a Unicode string into a US-ASCII string. |
| static String | encode(String string, Charset encoding): Percent-encodes a Unicode string into a US-ASCII string. |
public static String encode(String string)
Percent-encodes a Unicode string into a US-ASCII string. DEFAULT_ENCODING, UTF-8, is used to determine how non-US-ASCII and reserved characters should be represented as consecutive sequences of the form "%XX".
This method replaces ' ' with '+', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
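To illustrate why the space-to-plus rule matters, the sketch below uses the JDK's java.net.URLEncoder (with the Charset overload available since Java 10) as a stand-in, on the assumption that UriEncoder.encode behaves similarly for form data; that equivalence is not confirmed by this page. The class FormEncodeSketch and its helper methods are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch using java.net.URLEncoder as a stand-in for
// a form-style encoder that writes ' ' as '+'.
class FormEncodeSketch {

    // application/x-www-form-urlencoded style: ' ' becomes '+'.
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Common workaround for a path segment, where a space must be "%20".
    // A literal '+' in the input has already become "%2B", so every '+'
    // left in the encoded text stands for a space.
    static String pathSegmentEncode(String value) {
        return formEncode(value).replace("+", "%20");
    }

    public static void main(String[] args) {
        System.out.println(formEncode("caf\u00e9 au lait"));        // caf%C3%A9+au+lait
        System.out.println(pathSegmentEncode("caf\u00e9 au lait")); // caf%C3%A9%20au%20lait
    }
}
```

The workaround only addresses spaces; it is not a complete RFC 3986 path encoder, since the form-style rules differ from the URI rules for a few other characters as well.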
Parameters:
string - a Unicode string
Throws:
NullPointerException - if string is null

public static String encode(String string, Charset encoding)
Percent-encodes a Unicode string into a US-ASCII string.
This method replaces ' ' with '+', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
Parameters:
string - a Unicode string
encoding - a character encoding
Throws:
NullPointerException - if any argument is null

public static String decode(String string)
Percent-decodes a US-ASCII string into a Unicode string. DEFAULT_ENCODING, UTF-8, is used to determine what characters are represented by any consecutive sequences of the form "%XX".
This method replaces '+' with ' ', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
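The effect of the plus-to-space rule on decoding can be seen with the JDK's java.net.URLDecoder, used here as a stand-in on the assumption that UriEncoder.decode treats '+' the same way. The class FormDecodeSketch and its method are hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch using java.net.URLDecoder as a stand-in for
// a form-style decoder that reads '+' as ' '.
class FormDecodeSketch {

    static String formDecode(String value) {
        return URLDecoder.decode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // In form data, '+' means space and a literal plus arrives as %2B.
        System.out.println(formDecode("a+b%2Bc"));        // a b+c

        // A path segment such as "C++" is not form data; decoding it
        // this way silently turns both plus signs into spaces.
        System.out.println("[" + formDecode("C++") + "]"); // [C  ]
    }
}
```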
Parameters:
string - a percent-encoded US-ASCII string
Throws:
NullPointerException - if string is null

public static String decode(String string, Charset encoding)
Percent-decodes a US-ASCII string into a Unicode string.
This method replaces '+' with ' ', so it should not be used for strings that are not application/x-www-form-urlencoded, such as host and path components.
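The following sketch shows how the choice of character encoding and an invalid escape sequence affect decoding. It uses java.net.URLDecoder as a stand-in, which reports a malformed sequence with IllegalArgumentException; this page only states that UriEncoder throws a RuntimeException, so the exact exception type here is an assumption. The class CharsetDecodeSketch and the helper tryDecode are hypothetical.

```java
import java.net.URLDecoder;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

// Hypothetical sketch: decoding with an explicit charset, and turning
// the stand-in decoder's failure on a bad escape into an empty Optional.
class CharsetDecodeSketch {

    static Optional<String> tryDecode(String value, Charset encoding) {
        try {
            return Optional.of(URLDecoder.decode(value, encoding));
        } catch (IllegalArgumentException e) {
            // Thrown by URLDecoder for malformed sequences such as "%HH".
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        // The charset must match the one used when encoding.
        System.out.println(tryDecode("flamb%E9", StandardCharsets.ISO_8859_1));    // Optional[flambé]
        System.out.println(tryDecode("flamb%C3%A9", StandardCharsets.UTF_8));      // Optional[flambé]

        // Decoding UTF-8 escapes as ISO-8859-1 yields two wrong characters.
        System.out.println(tryDecode("flamb%C3%A9", StandardCharsets.ISO_8859_1)); // Optional[flambÃ©]

        // "HH" is not valid hexadecimal, so decoding fails.
        System.out.println(tryDecode("%HH", StandardCharsets.UTF_8));              // Optional.empty
    }
}
```

Wrapping the failure in an Optional is only one possible design; callers that prefer to propagate the exception can call the decoder directly.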
Parameters:
string - a percent-encoded US-ASCII string
encoding - a character encoding
Throws:
NullPointerException - if any argument is null
RuntimeException - if decoding fails because a % escape sequence is invalid (for example, "%HH")

Copyright © 2012. All Rights Reserved.