utf_convert library

Support for encoding and decoding Unicode characters in UTF-8, UTF-16, and UTF-32.
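
As a rough illustration of how the three encoding forms differ, the sketch below encodes one code point outside the Basic Multilingual Plane. It uses only dart:core and dart:convert, not this library's functions, so it shows the encodings themselves rather than this package's API.

```dart
import 'dart:convert' show utf8;

void main() {
  // U+1F600 lies outside the Basic Multilingual Plane.
  const text = '\u{1F600}';

  // UTF-8: four 8-bit code units.
  print(utf8.encode(text)); // [240, 159, 152, 128]

  // UTF-16: a surrogate pair of two 16-bit code units.
  print(text.codeUnits); // [55357, 56832]

  // UTF-32: a single 32-bit value, the code point itself.
  print(text.runes.toList()); // [128512]
}
```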

Classes

IterableUtf16Decoder
Return type of decodeUtf16AsIterable and its variants. The Iterable creates an iterator on demand, and that iterator translates bytes only as its consumer requests them. (Note: results are not cached.)
IterableUtf32Decoder
Return type of decodeUtf32AsIterable and its variants. The Iterable creates an iterator on demand, and that iterator translates bytes only as its consumer requests them. (Note: results are not cached.)
IterableUtf8Decoder
Return type of decodeUtf8AsIterable and its variants. The Iterable creates an iterator on demand, and that iterator translates bytes only as its consumer requests them. (Note: results are not cached.)
Utf16beBytesToCodeUnitsDecoder
Convert UTF-16BE encoded bytes to UTF-16 code units by grouping 1-2 bytes to produce each code unit (0 to 2^16 - 1).
Utf16BytesToCodeUnitsDecoder
Convert UTF-16 encoded bytes to UTF-16 code units by grouping 1-2 bytes to produce each code unit (0 to 2^16 - 1). Relies on the BOM to determine endianness, and defaults to big-endian.
Utf16CodeUnitDecoder
An Iterator
Utf16leBytesToCodeUnitsDecoder
Convert UTF-16LE encoded bytes to UTF-16 code units by grouping 1-2 bytes to produce each code unit (0 to 2^16 - 1).
Utf32beBytesDecoder
Convert UTF-32BE encoded bytes to codepoints by grouping 4 bytes to produce each Unicode codepoint.
Utf32BytesDecoder
Abstract parent class that converts encoded bytes to codepoints.
Utf32leBytesDecoder
Convert UTF-32LE encoded bytes to codepoints by grouping 4 bytes to produce each Unicode codepoint.
Utf8Decoder
Provides an iterator of Unicode codepoints from UTF-8 encoded bytes. The parameters can set an offset into a list of bytes (as int), limit the length of the values to be decoded, and override the default Unicode replacement character. Set the replacementCharacter to null to throw an ArgumentError rather than replace the bad value. The return value from this method can be used as an Iterable (e.g. in a for-loop).
Utf8DecoderTransformer
StringTransformer that decodes a stream of UTF-8 encoded bytes.
Utf8EncoderTransformer
StringTransformer that UTF-8 encodes a stream of strings.
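
The lazy, uncached behaviour described for the Iterable decoders, and the byte grouping described for the UTF-16 byte decoders, can be sketched with a Dart sync* generator. This is a minimal illustration and not this library's implementation: utf16CodeUnits and decodedUnits are hypothetical names, and the sketch ignores a trailing unpaired byte and does no replacement of bad values.

```dart
// Counts how many code units the sketch has produced so far.
var decodedUnits = 0;

// Hypothetical helper, not part of utf_convert: lazily groups pairs of bytes
// into UTF-16 code units. A trailing unpaired byte is ignored in this sketch.
Iterable<int> utf16CodeUnits(List<int> bytes, {bool bigEndian = true}) sync* {
  for (var i = 0; i + 1 < bytes.length; i += 2) {
    final first = bytes[i] & 0xFF;
    final second = bytes[i + 1] & 0xFF;
    decodedUnits++;
    yield bigEndian ? (first << 8) | second : (second << 8) | first;
  }
}

void main() {
  // "Hi!" as UTF-16BE bytes without a BOM.
  const bytes = [0x00, 0x48, 0x00, 0x69, 0x00, 0x21];
  final units = utf16CodeUnits(bytes);

  // Nothing is decoded until an iterator is advanced.
  print(decodedUnits); // 0

  // Requesting two values decodes exactly two code units.
  print(String.fromCharCodes(units.take(2).toList())); // Hi
  print(decodedUnits); // 2

  // A second pass starts over: earlier results were not cached.
  print(String.fromCharCodes(units.toList())); // Hi!
  print(decodedUnits); // 5
}
```

Because each pass re-reads the bytes, a caller that needs the decoded values more than once would typically collect them with toList() first.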

Functions

codepointsToString(List<int> codepoints) String
Generate a string from the provided Unicode codepoints.
codepointsToUtf8(List<int?> codepoints, [int offset = 0, int? length]) List<int?>
Encode code points as UTF-8 code units.
decodeUtf16(List<int> bytes, [int offset = 0, int? length, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) String
Produce a String from a sequence of UTF-16 encoded bytes. This method always strips a leading BOM. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value. The default value for the replacementCodepoint is U+FFFD.
decodeUtf16AsIterable(List<int> bytes, [int offset = 0, int? length, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) IterableUtf16Decoder
Decodes the UTF-16 bytes as an iterable, so the consumer can convert only as much of the input as needed. Determines the byte order from the BOM, or uses big-endian as a default. This method always strips a leading BOM. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value. The default value for the replacementCodepoint is U+FFFD.
decodeUtf16be(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) String
Produce a String from a sequence of UTF-16BE encoded bytes. This method strips a leading BOM by default, but can be overridden by setting the optional parameter stripBom to false. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value. The default value for the replacementCodepoint is U+FFFD.
decodeUtf16beAsIterable(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) IterableUtf16Decoder
Decodes the UTF-16BE bytes as an iterable, so the consumer can convert only as much of the input as needed. This method strips a leading BOM by default; set the optional parameter stripBom to false to keep it. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value. The default value for the replacementCodepoint is U+FFFD.
decodeUtf16le(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) String
Produce a String from a sequence of UTF-16LE encoded bytes. This method strips a leading BOM by default, but can be overridden by setting the optional parameter stripBom to false. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value. The default value for the replacementCodepoint is U+FFFD.
decodeUtf16leAsIterable(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) IterableUtf16Decoder
Decodes the UTF-16LE bytes as an iterable, so the consumer can convert only as much of the input as needed. This method strips a leading BOM by default; set the optional parameter stripBom to false to keep it. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value. The default value for the replacementCodepoint is U+FFFD.
decodeUtf32(List<int> bytes, [int offset = 0, int? length, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) String
Produce a String from a sequence of UTF-32 encoded bytes. The parameters can set an offset into a list of bytes (as int), limit the length of the values to be decoded, and override the default Unicode replacement character. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
decodeUtf32AsIterable(List<int> bytes, [int offset = 0, int? length, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) IterableUtf32Decoder
Decodes the UTF-32 bytes as an iterable, so the consumer can convert only as much of the input as needed. Determines the byte order from the BOM, or uses big-endian as a default. This method always strips a leading BOM. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
decodeUtf32be(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) String
Produce a String from a sequence of UTF-32BE encoded bytes. The parameters can set an offset into a list of bytes (as int), limit the length of the values to be decoded, and override the default Unicode replacement character. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
decodeUtf32beAsIterable(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) IterableUtf32Decoder
Decodes the UTF-32BE bytes as an iterable, so the consumer can convert only as much of the input as needed. This method strips a leading BOM by default; set the optional parameter stripBom to false to keep it. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
decodeUtf32le(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) String
Produce a String from a sequence of UTF-32LE encoded bytes. The parameters can set an offset into a list of bytes (as int), limit the length of the values to be decoded, and override the default Unicode replacement character. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
decodeUtf32leAsIterable(List<int> bytes, [int offset = 0, int? length, bool stripBom = true, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) IterableUtf32Decoder
Decodes the UTF-32LE bytes as an iterable, so the consumer can convert only as much of the input as needed. This method strips a leading BOM by default; set the optional parameter stripBom to false to keep it. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
decodeUtf8(List<int> bytes, [int offset = 0, int? length, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) String
Produce a String from a List of UTF-8 encoded bytes. The parameters can set an offset into a list of bytes (as int), limit the length of the values to be decoded, and override the default Unicode replacement character. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
decodeUtf8AsIterable(List<int> bytes, [int offset = 0, int? length, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) IterableUtf8Decoder
Decodes the UTF-8 bytes as an iterable, so the consumer can convert only as much of the input as needed. Set the replacementCodepoint to null to throw an ArgumentError rather than replace the bad value.
encodeUtf16(String str) List<int?>
Produce a list of UTF-16 encoded bytes. This method prefixes the resulting bytes with a big-endian byte-order marker (BOM).
encodeUtf16be(String str, [bool writeBOM = false]) List<int?>
Produce a list of UTF-16BE encoded bytes. By default, this method produces UTF-16BE bytes with no BOM.
encodeUtf16le(String str, [bool writeBOM = false]) List<int?>
Produce a list of UTF-16LE encoded bytes. By default, this method produces UTF-16LE bytes with no BOM.
encodeUtf32(String str) List<int?>
Produce a list of UTF-32 encoded bytes. This method prefixes the resulting bytes with a big-endian byte-order marker (BOM).
encodeUtf32be(String str, [bool writeBOM = false]) List<int?>
Produce a list of UTF-32BE encoded bytes. By default, this method produces UTF-32BE bytes with no BOM.
encodeUtf32le(String str, [bool writeBOM = false]) List<int?>
Produce a list of UTF-32LE encoded bytes. By default, this method produces UTF-32LE bytes with no BOM.
encodeUtf8(String str) List<int?>
Produce a sequence of UTF-8 encoded bytes from the provided string.
hasUtf16beBom(List<int> utf16EncodedBytes, [int offset = 0, int? length]) bool
Identifies whether a List of bytes starts (based on offset) with a big-endian byte-order marker (BOM).
hasUtf16Bom(List<int> utf32EncodedBytes, [int offset = 0, int? length]) bool
Identifies whether a List of bytes starts (based on offset) with a byte-order marker (BOM).
hasUtf16leBom(List<int> utf16EncodedBytes, [int offset = 0, int? length]) bool
Identifies whether a List of bytes starts (based on offset) with a little-endian byte-order marker (BOM).
hasUtf32beBom(List<int> utf32EncodedBytes, [int offset = 0, int? length]) bool
Identifies whether a List of bytes starts (based on offset) with a big-endian byte-order marker (BOM).
hasUtf32Bom(List<int> utf32EncodedBytes, [int offset = 0, int? length]) bool
Identifies whether a List of bytes starts (based on offset) with a byte-order marker (BOM).
hasUtf32leBom(List<int> utf32EncodedBytes, [int offset = 0, int? length]) bool
Identifies whether a List of bytes starts (based on offset) with a little-endian byte-order marker (BOM).
stringToCodepoints(String str) List<int?>
Provide a list of Unicode codepoints for a given string.
utf8ToCodepoints(List<int> utf8EncodedBytes, [int offset = 0, int? length, int replacementCodepoint = UNICODE_REPLACEMENT_CHARACTER_CODEPOINT]) List<int?>
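
The BOM handling described for decodeUtf16 and the hasUtf16Bom family (detect a leading marker, take the byte order from it, fall back to big-endian, strip the marker) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: startsWithMarker, utf16beBom, utf16leBom, and decodeUtf16Sketch are hypothetical names, and the sketch does not replace or reject malformed input.

```dart
// Hypothetical helpers, not part of utf_convert.
const utf16beBom = [0xFE, 0xFF];
const utf16leBom = [0xFF, 0xFE];

// True if [bytes] holds [marker] starting at [offset].
bool startsWithMarker(List<int> bytes, List<int> marker, [int offset = 0]) {
  if (offset < 0 || offset + marker.length > bytes.length) return false;
  for (var i = 0; i < marker.length; i++) {
    if (bytes[offset + i] != marker[i]) return false;
  }
  return true;
}

// Decodes UTF-16 bytes, taking the byte order from a leading BOM and
// falling back to big-endian. A leading BOM is always stripped.
String decodeUtf16Sketch(List<int> bytes) {
  var bigEndian = true;
  var start = 0;
  if (startsWithMarker(bytes, utf16beBom)) {
    start = 2;
  } else if (startsWithMarker(bytes, utf16leBom)) {
    bigEndian = false;
    start = 2;
  }
  final units = <int>[];
  for (var i = start; i + 1 < bytes.length; i += 2) {
    final first = bytes[i] & 0xFF;
    final second = bytes[i + 1] & 0xFF;
    units.add(bigEndian ? (first << 8) | second : (second << 8) | first);
  }
  return String.fromCharCodes(units);
}

void main() {
  // Little-endian BOM, then "Hi".
  print(decodeUtf16Sketch([0xFF, 0xFE, 0x48, 0x00, 0x69, 0x00])); // Hi
  // Big-endian BOM, then "Hi".
  print(decodeUtf16Sketch([0xFE, 0xFF, 0x00, 0x48, 0x00, 0x69])); // Hi
  // No BOM: big-endian is assumed.
  print(decodeUtf16Sketch([0x00, 0x48, 0x00, 0x69])); // Hi
}
```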