unicode_scripts_blocks 0.4.0

Unicode Scripts and Blocks #

A tool for checking whether a code point belongs to a Unicode Script or Block

Notice: This library is deprecated! It will not be maintained and could be deleted at any time. Please use Unicode Data instead.

Background #

Unicode code points are divided into blocks that generally contain characters from the same or related writing systems, for example Basic Latin or Arabic. However, the complete character set needed for a writing system is often spread across several blocks. This character set is referred to as a script. If you want to know which writing system a particular character belongs to, it is generally more accurate to use the Unicode script data than the block data. You can read more about the difference here.

This library provides a way to test whether a given code point belongs to some particular Unicode script or block. It was generated from the Unicode 12.0 Scripts.txt and Blocks.txt data files. This library is exhaustive in that it implements every script and block in those data files.

Usage #

A simple usage example:

import 'package:unicode_scripts_blocks/unicode_scripts_blocks.dart';

void main() {
  // Unicode Block
  int space = 0x0020;
  bool isBasicLatin = UnicodeBlock.isBasicLatin(space); // true

  // Unicode Script
  int thaiChar = 'ด'.codeUnitAt(0);
  bool isThai = UnicodeScript.isThai(thaiChar); // true
}
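
The usage example passes the result of codeUnitAt(0), which works for characters in the Basic Multilingual Plane such as Thai. For characters above U+FFFF a Dart string holds two UTF-16 code units, so the first code unit is a surrogate rather than the code point. A minimal sketch of the difference, using only dart:core; firstCodePoint is a hypothetical helper and not part of this package:

```dart
/// Hypothetical helper (not part of this package): returns the first
/// Unicode code point of a non-empty string, decoding surrogate pairs.
int firstCodePoint(String text) => text.runes.first;

void main() {
  // U+10330 GOTHIC LETTER AHSA lies outside the Basic Multilingual Plane.
  const gothic = '\u{10330}';

  // codeUnitAt(0) yields only the high surrogate of the UTF-16 pair.
  print(gothic.codeUnitAt(0).toRadixString(16)); // d800

  // runes iterates over whole code points.
  print(firstCodePoint(gothic).toRadixString(16)); // 10330
}
```

Passing the value from runes rather than codeUnitAt is probably the safer choice whenever the input may contain characters beyond U+FFFF.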

Contributing #

Your help and pull requests are welcome.

Here are some known issues:

  • Many of the single code point checks in UnicodeScripts are consecutive, meaning they could be consolidated into ranges, which would probably improve performance.
  • The lookup algorithm is O(n). I'm sure this could be improved with a better data structure.
  • There are tests for each code block and script but there isn't 100% code coverage. It would be good to at least test characters in scripts with code points higher than U+10000 to make sure they don't get accidentally omitted in future updates.
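
One possible way to address the first two issues is to store consolidated ranges in a sorted list and use binary search, which makes a lookup O(log n). The sketch below is not the package's implementation: BlockRange, blocks, and blockNameOf are hypothetical names, the table holds only four block ranges for illustration, and the nullable return type assumes null-safe Dart (2.12 or later).

```dart
/// Hypothetical range record; not part of the package API.
class BlockRange {
  const BlockRange(this.start, this.end, this.name);
  final int start; // first code point, inclusive
  final int end; // last code point, inclusive
  final String name;
}

// A small illustrative table, sorted by start and non-overlapping.
const List<BlockRange> blocks = [
  BlockRange(0x0000, 0x007F, 'Basic Latin'),
  BlockRange(0x0080, 0x00FF, 'Latin-1 Supplement'),
  BlockRange(0x0E00, 0x0E7F, 'Thai'),
  BlockRange(0x10330, 0x1034F, 'Gothic'),
];

/// Binary search over [blocks]. Returns the name of the range that
/// contains [codePoint], or null if no range in the table matches.
String? blockNameOf(int codePoint) {
  var low = 0;
  var high = blocks.length - 1;
  while (low <= high) {
    final mid = (low + high) >> 1;
    final range = blocks[mid];
    if (codePoint < range.start) {
      high = mid - 1;
    } else if (codePoint > range.end) {
      low = mid + 1;
    } else {
      return range.name;
    }
  }
  return null;
}

void main() {
  print(blockNameOf(0x0020)); // Basic Latin
  print(blockNameOf(0x0E14)); // Thai
  print(blockNameOf(0x0100)); // null (not in this small table)
}
```

The same table shape would work for script data, provided consecutive single code points are merged into ranges first.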

Features to add:

  • Return the Block name or Script property value as a string when given a code point.
  • A fully automatic code generator that takes the data files and produces the code would make updates for future versions of Unicode much easier.
  • The most recent version of the script and block data files can be found here: Scripts, Blocks.
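
As a rough starting point for such a generator, the sketch below parses lines in the Blocks.txt format (for example "0E00..0E7F; Thai") and emits one line of Dart source per block. It is only an illustration: toDartEntry is a hypothetical helper, the emitted BlockRange constructor is an assumed target type, and the sample text is a hand-written excerpt rather than the full data file. It assumes null-safe Dart and block names without single quotes.

```dart
// Matches lines such as "0E00..0E7F; Thai" from Blocks.txt.
final RegExp blockLine = RegExp(r'^([0-9A-F]+)\.\.([0-9A-F]+);\s*(.+)$');

/// Hypothetical helper: converts one data-file line into a line of Dart
/// source. Returns null for comments, blank lines, and unrecognized input.
String? toDartEntry(String line) {
  final match = blockLine.firstMatch(line.trim());
  if (match == null) return null;
  final start = match.group(1)!;
  final end = match.group(2)!;
  final name = match.group(3)!.trim();
  return "BlockRange(0x$start, 0x$end, '$name'),";
}

void main() {
  // Hand-written excerpt in the Blocks.txt line format.
  const sample = '''
# Blocks excerpt
0000..007F; Basic Latin
0E00..0E7F; Thai
10330..1034F; Gothic
''';
  for (final line in sample.split('\n')) {
    final entry = toDartEntry(line);
    if (entry != null) print(entry);
  }
}
```

Scripts.txt uses a different line layout with trailing comments, so it would need its own pattern.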

Change log #

0.4.0 #

0.3.0 #

  • Exposed Scripts class

0.2.0 #

  • Added a script list UnicodeScript.scripts

0.1.0 #

  • Initial version


import 'package:unicode_scripts_blocks/src/script.dart';
import 'package:unicode_scripts_blocks/unicode_scripts_blocks.dart';

void main() {
  final thaiChar = 'ด'.codeUnitAt(0);
  final latinChar = 'a'.codeUnitAt(0);

  if (UnicodeScript.isThai(thaiChar)) {
    print('this script is Thai');
  } else {
    print('not Thai');
  }

  if (UnicodeBlock.isBasicLatin(latinChar)) {
    print('this block is Basic Latin');
  } else {
    print('not Basic Latin');
  }

  // List every script range whose general category starts with 'P'.
  final scripts = UnicodeScript2.scripts;
  final punc = scripts.where((script) => script.category.startsWith('P'));
  for (Script p in punc) {
    print('${p.start} ${p.end} ${p.propertyValue} ${p.category}');
  }
  // String myString = ',;';
  // for (int codeUnit in myString.codeUnits) {
  //   if (isPunctuation(codeUnit)) {
  //     // do sth
  //   }
  // }
  //for (int i; i < myString.length; i++) {}
}

// bool isPunctuation(int codeUnit) {
//   final scripts = UnicodeScript2.scripts;
//   final punc = scripts.where((script) => script.category.startsWith('P'));
//   for (Script p in punc) {
//     print('${p.start} ${p.end} ${p.propertyValue} ${p.category}');
//   }
//   //UnicodeScript2.scripts.where((script) );
// }

Use this package as a library

1. Depend on it

Add this to your package's pubspec.yaml file:

dependencies:
  unicode_scripts_blocks: ^0.4.0

2. Install it

You can install packages from the command line:

with pub:

$ pub get

Alternatively, your editor might support pub get. Check the docs for your editor to learn more.

3. Import it

Now in your Dart code, you can use:

import 'package:unicode_scripts_blocks/unicode_scripts_blocks.dart';

This package is not analyzed, because it is discontinued.

Health issues and suggestions

Document public APIs. (-0.75 points)

463 out of 467 API elements have no dartdoc comment. Providing good documentation for libraries, classes, functions, and other API elements improves code readability and helps developers find and use your API.


Package     Constraint        Resolved   Available
Direct dependencies
Dart SDK    >=2.1.0 <3.0.0