Phonemize (Dart)

A fast, multilingual Grapheme-to-Phoneme (G2P) phonemization library for Dart. This is a port of the original phonemize JavaScript library.

Features

  • Lightning fast - Pure rule-based processing, no ML overhead.
  • 🌍 Multilingual support - English, Chinese, Japanese, and Korean.
  • 🧠 Smart rule-based G2P - Advanced phonetic rules for unknown words.
  • 🌍 Multiple formats - IPA and ARPABET output.
  • 💻 Pure Dart - No native dependencies; runs on Flutter, web, and server.

Installation

Add phonemize to your pubspec.yaml:

dependencies:
  phonemize: ^1.0.0

Quick Start

import 'package:phonemize/phonemize.dart';

void main() {
  // Simple phonemization (IPA)
  print(phonemize("Hello world!")); 
  // Output: həˈloʊ wɜːrld!

  // English dialects
  print(phonemize("doctor", language: "en-GB")); // RP: ˈdɑktə
  print(phonemize("doctor", language: "en-US")); // GA: ˈdɑktɝ

  // Multilingual support
  print(phonemize("你好", language: "zh")); // ni˧˩˧ xɑʊ˧˩˧
  print(phonemize("konnichiwa", language: "ja")); // konnitɕiwa
  print(phonemize("Annyeong", language: "ko")); // annjʌŋ
}

API Reference

phonemize(text, {stripStress, format, separator, language})

Converts text to a phoneme string.

  • stripStress: Remove stress markers (default: false)
  • format: Output format, either "ipa" or "arpabet" (default: "ipa")
  • separator: Phoneme separator (default: " ")
  • language: Preferred language tag (e.g., "en-US", "en-GB", "zh", "ja", "ko")
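
The options above can be combined in a single call. The following is a minimal sketch, assuming they are ordinary Dart named parameters as the signature suggests, with stripStress taking a bool and the others taking strings. No expected output is shown, because the exact phoneme strings depend on the library's rules.

```dart
import 'package:phonemize/phonemize.dart';

void main() {
  // ARPABET output instead of the default IPA.
  print(phonemize("Hello world", format: "arpabet"));

  // IPA output with stress markers removed
  // (assumes stripStress is a bool, per its documented default of false).
  print(phonemize("Hello world", stripStress: true));

  // Pass a custom separator string in place of the default " ".
  print(phonemize("Hello world", separator: "-"));

  // Options can be combined; here a British English request without stress marks.
  print(phonemize("doctor", language: "en-GB", stripStress: true));
}
```

Any option left out keeps its documented default.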

Phonemizer class

Use the Phonemizer class when you need several independent instances or a custom configuration.

final phonemizer = Phonemizer(options: TokenizerOptions(format: "arpabet"));
print(phonemizer.phonemize("Hello")); // HH AH0 L OW1
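
As a sketch of multi-instance usage, two Phonemizer objects with different formats can be kept side by side and reused across inputs. This assumes that TokenizerOptions accepts format: "ipa" in the same way it accepts "arpabet" above; other TokenizerOptions fields are not shown because they are not documented here.

```dart
import 'package:phonemize/phonemize.dart';

void main() {
  // One instance per output format, created once and reused.
  final ipa = Phonemizer(options: TokenizerOptions(format: "ipa"));
  final arpabet = Phonemizer(options: TokenizerOptions(format: "arpabet"));

  for (final word in ["Hello", "world"]) {
    // Print both transcriptions of the same word on one line.
    print('$word: ${ipa.phonemize(word)} | ${arpabet.phonemize(word)}');
  }
}
```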

Credits

This project is a Dart port of the original phonemize JavaScript library. All phonetic rules and logic are derived from the original implementation.

License

MIT
