flutter_chardet

Detect the character encoding of text bytes in Flutter, then optionally convert those bytes into a Dart String.

This is useful when your app opens files, subtitles, logs, crawled pages, or other text that is not guaranteed to be UTF-8.

flutter_chardet uses the native uchardet detector through FFI. When you call autoDecode, the byte-to-string conversion is handled by charset_converter.

Install

flutter pub add flutter_chardet

Supported Platforms

  • Android
  • iOS
  • macOS
  • Windows
  • Linux

Web is not supported because detection uses native code. Conversion support is provided by charset_converter, which supports Android, iOS, macOS, Windows, and Linux.

Usage

import 'dart:io';

import 'package:flutter_chardet/flutter_chardet.dart';

// Wrapped in a function because Dart does not allow await at the top level.
Future<void> detectSubtitleEncoding() async {
  final bytes = await File('subtitle.srt').readAsBytes();

  // Only detect the most likely charset.
  final detection = await FlutterChardet.detect(bytes);
  print(detection.charset); // Possible output: SHIFT_JIS
  print(detection.confidence); // Possible output: 0.99
  print(detection.language); // Possible output: ja

  // Inspect candidate charsets. By default this returns the top 5 candidates.
  final candidates = await FlutterChardet.detectAll(bytes);
  print(candidates.map((candidate) => candidate.charset).toList());
  // Possible output: [SHIFT_JIS, EUC-JP, ISO-2022-JP]

  // Use top: 0 when you want every candidate reported by uchardet.
  final allCandidates = await FlutterChardet.detectAll(bytes, top: 0);
  print(allCandidates.length); // Possible output: 8

  // Detect and convert to a Dart String. By default this tries the top 5
  // candidates in order until charset_converter decodes one successfully.
  final decoded = await FlutterChardet.autoDecode(bytes);
  print(decoded.text); // Possible output: こんにちは
  print(decoded.charset); // Possible output: SHIFT_JIS
  print(decoded.confidence); // Possible output: 0.99
  print(decoded.language); // Possible output: ja
}

Other common charset outputs include UTF-8, windows-1251, Big5, EUC-JP, and ISO-8859-1. Very short text can be ambiguous, so treat the result as a best guess rather than a promise.
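
Because short input can produce a low-confidence guess, an app may want to fall back to a default charset when the top candidate is weak. The sketch below shows one way to do that in plain Dart. The Candidate class, the chooseCharset helper, the 0.5 threshold, and the UTF-8 fallback are all assumptions made for illustration; they are not part of flutter_chardet, and a real app would map the package's own result objects into something similar.

```dart
/// Hypothetical stand-in for one detection candidate. The real result type
/// comes from flutter_chardet and may have a different shape.
class Candidate {
  const Candidate(this.charset, this.confidence);

  final String charset;
  final double confidence;
}

/// Returns the charset of the highest-confidence candidate if it reaches
/// [minConfidence]; otherwise returns [fallback].
/// The default threshold and fallback are arbitrary illustrative choices.
String chooseCharset(
  List<Candidate> candidates, {
  double minConfidence = 0.5,
  String fallback = 'UTF-8',
}) {
  if (candidates.isEmpty) return fallback;
  final best = candidates.reduce(
    (a, b) => b.confidence > a.confidence ? b : a,
  );
  return best.confidence >= minConfidence ? best.charset : fallback;
}

void main() {
  // A confident guess is accepted as-is.
  print(chooseCharset(const [Candidate('SHIFT_JIS', 0.99)])); // SHIFT_JIS
  // A weak guess falls back to the default.
  print(chooseCharset(const [Candidate('ISO-8859-1', 0.2)])); // UTF-8
}
```

The right threshold depends on the kind of text your app handles, so it is worth tuning against your own samples.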

Compared with flutter_charset_detector

flutter_charset_detector is another Flutter package that can detect and decode text encodings. The table below uses the upstream uchardet fixtures in test/upstream as a reference set. A decoded string only counts as correct when it matches an independent UTF-8 conversion.

| Package | Supported platforms | Correct charset detection | Correct decoded text | Small text autoDecode | Large text autoDecode | Minimal Android APK | Minimal iOS app |
| --- | --- | --- | --- | --- | --- | --- | --- |
| flutter_chardet | Android, iOS, macOS, Windows, Linux | 152 / 158 | 148 / 158 | 0.118 ms | 135.5 ms | 14.1 MB | 12.4 MB |
| flutter_charset_detector 6.0.0 | Android, iOS, macOS, Web | 146 / 158 | 145 / 158 | 0.107 ms | 276.1 ms | 13.9 MB | 19.9 MB |

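On this fixture set, flutter_chardet detects the charset correctly in 152 of 158 cases (versus 146) and produces correct decoded text in 148 (versus 145). For very small text the two packages are similar (0.118 ms versus 0.107 ms). For the large Shift_JIS sample, flutter_chardet was about twice as fast in the local benchmark (135.5 ms versus 276.1 ms).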

Detailed result files are kept in the GitHub repository.

Notes

Detection is only a guess. If your format already declares an encoding, prefer that explicit value. Use charset detection when the input does not tell you what it is.
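
One common form of explicit declaration is a byte order mark (BOM) at the start of the data. As a minimal sketch, the plain-Dart helper below checks for the UTF-8 and UTF-16 marks before any detection is attempted. The charsetFromBom name is hypothetical and is not part of flutter_chardet; other declarations, such as an HTTP charset parameter or an XML encoding attribute, would need their own checks.

```dart
/// Returns the charset named by a leading byte order mark, or null when the
/// bytes carry no recognised BOM and detection is still needed.
/// Only the UTF-8 and UTF-16 marks are checked in this sketch.
String? charsetFromBom(List<int> bytes) {
  // True when [bytes] begins with every byte of [prefix], in order.
  bool startsWith(List<int> prefix) {
    if (bytes.length < prefix.length) return false;
    for (var i = 0; i < prefix.length; i++) {
      if (bytes[i] != prefix[i]) return false;
    }
    return true;
  }

  if (startsWith(const [0xEF, 0xBB, 0xBF])) return 'UTF-8';
  if (startsWith(const [0xFF, 0xFE])) return 'UTF-16LE';
  if (startsWith(const [0xFE, 0xFF])) return 'UTF-16BE';
  return null;
}

void main() {
  // UTF-8 BOM followed by the ASCII bytes for "hi".
  print(charsetFromBom([0xEF, 0xBB, 0xBF, 0x68, 0x69])); // UTF-8
  // No BOM: the caller should fall back to charset detection.
  print(charsetFromBom([0x68, 0x69])); // null
}
```

When the helper returns null, the input does not announce its encoding this way, and calling the detector is the reasonable next step.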