unicode_data 0.1.1

Unicode Data

This library puts Unicode data in a format that can be manipulated programmatically. The current implementation includes Unicode block and script data.

Background

Unicode code points are divided into blocks that generally contain characters from the same or related writing systems, for example Basic Latin or Arabic. However, the complete character set needed for a writing system is often spread across several blocks. This character set is referred to as a script. If you want to know which writing system a particular character belongs to, it is generally more accurate to use the Unicode script data than the block data. You can read more about the difference here.
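
To make the block versus script distinction concrete, here is a minimal, self-contained sketch. It does not use this package: the NamedRange class and the nameOf helper are hypothetical names for illustration, and the few ranges are copied by hand from Blocks.txt and Scripts.txt. It shows that 'a' and '!' share the Basic Latin block but belong to different scripts.

```dart
// Hypothetical stand-in for a range entry; not this package's API.
class NamedRange {
  final int start;
  final int end;
  final String name;
  const NamedRange(this.start, this.end, this.name);

  bool contains(int codePoint) => codePoint >= start && codePoint <= end;
}

// A few entries copied by hand from the Unicode data files.
const blocks = [
  NamedRange(0x0000, 0x007F, 'Basic Latin'),
];

const scripts = [
  NamedRange(0x0021, 0x0023, 'Common'),
  NamedRange(0x0041, 0x005A, 'Latin'),
  NamedRange(0x0061, 0x007A, 'Latin'),
];

// Returns the name of the first range that contains the code point.
// Throws a StateError if no range in the list matches.
String nameOf(List<NamedRange> ranges, int codePoint) =>
    ranges.firstWhere((range) => range.contains(codePoint)).name;

void main() {
  for (final ch in ['a', '!']) {
    final codePoint = ch.runes.single;
    final block = nameOf(blocks, codePoint);
    final script = nameOf(scripts, codePoint);
    print('$ch: block=$block, script=$script');
  }
}
```

Both characters report Basic Latin as their block, but only 'a' is in the Latin script. The exclamation mark is in the Common script because it is shared by many writing systems.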

This library contains classes for Unicode scripts and blocks. It was generated from the Unicode 12.0 Scripts.txt and Blocks.txt data files. This library is exhaustive in that it includes every script and block in those data files.
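
For readers unfamiliar with the data files, the following sketch shows roughly how lines in the Blocks.txt and Scripts.txt formats can be turned into range entries. It is not the generator shipped with this library: the Entry class and the parseDataFile function are hypothetical names, and the sample text is a tiny hand-copied excerpt.

```dart
// Hypothetical record for one parsed line; not this package's API.
class Entry {
  final int start;
  final int end;
  final String name;
  final String category; // empty when the line carries no category
  const Entry(this.start, this.end, this.name, this.category);
}

// Parses lines such as "0000..007F; Basic Latin" (Blocks.txt) or
// "0061..007A ; Latin # L& [26] ..." (Scripts.txt).
List<Entry> parseDataFile(String text) {
  final entries = <Entry>[];
  for (final rawLine in text.split('\n')) {
    // Everything after '#' is a comment. In Scripts.txt its first token
    // is the general category of the range.
    final hash = rawLine.indexOf('#');
    final data = (hash < 0 ? rawLine : rawLine.substring(0, hash)).trim();
    if (data.isEmpty) continue; // blank line or comment-only line
    final comment = hash < 0 ? '' : rawLine.substring(hash + 1).trim();

    final fields = data.split(';');
    if (fields.length < 2) continue; // not a "range ; name" line

    // A range is either "XXXX..YYYY" or a single code point "XXXX".
    final range = fields[0].trim().split('..');
    final start = int.parse(range.first, radix: 16);
    final end = int.parse(range.last, radix: 16);

    final category =
        comment.isEmpty ? '' : comment.split(RegExp(r'\s+')).first;
    entries.add(Entry(start, end, fields[1].trim(), category));
  }
  return entries;
}

const sample = '''
# Blocks.txt style
0000..007F; Basic Latin
# Scripts.txt style
0061..007A    ; Latin # L&  [26] LATIN SMALL LETTER A..LATIN SMALL LETTER Z
00AA          ; Latin # Lo       FEMININE ORDINAL INDICATOR
''';

void main() {
  for (final e in parseDataFile(sample)) {
    print('${e.start}..${e.end} ${e.name} ${e.category}');
  }
}
```

A single code point such as 00AA is stored as a range whose start and end are equal, so lookups can treat every entry the same way.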

Usage

A simple usage example:

import 'package:unicode_data/unicode_data.dart';

void main() {
  unicodeBlockExamples();
  unicodeScriptExamples();
}

// Unicode Blocks
void unicodeBlockExamples() {
  // get a list of all blocks
  List<Block> blocks = UnicodeBlock.blocks;

  // find the block name for a code point
  final codePoint = 'a'.runes.single;
  final found = blocks
      .where((block) => codePoint >= block.start && codePoint <= block.end);
  final blockName = found.single.name; // Basic Latin

  // get the range for a specific block name
  final block = blocks.where((block) => block.name == 'Mongolian').single;
  final rangeStart = block.start; // 6144
  final rangeEnd = block.end; // 6319
}

// Unicode Scripts
void unicodeScriptExamples() {
  // get a list of all scripts
  List<Script> scripts = UnicodeScript.scripts;

  // find the script name and category for a code point
  final codePoint = 'a'.runes.single;
  final found = scripts.where(
      (script) => codePoint >= script.start && codePoint <= script.end);
  final script = found.single;
  final name = script.name; // Latin
  final category = script.category; // L&

  // find all script ranges for Latin
  final latinScripts = scripts.where((script) => script.name == 'Latin');

  // find all script ranges that are punctuation
  final punctRanges =
      scripts.where((script) => script.category.startsWith('P'));
}

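The category identifies the type of character in the range, such as a letter, punctuation, or some other type. The values follow the Unicode general category abbreviations, for example L& for cased letters.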
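
As a rough illustration of reading a category value, the sketch below maps the first letter of a general category abbreviation to its major class. The majorClasses map and the describeCategory function are hypothetical helpers, not part of this library. The sketch assumes the category strings are general category abbreviations like the L& shown above.

```dart
// Major classes of the Unicode general category, keyed by first letter.
const majorClasses = {
  'L': 'Letter',
  'M': 'Mark',
  'N': 'Number',
  'P': 'Punctuation',
  'S': 'Symbol',
  'Z': 'Separator',
  'C': 'Other',
};

// Hypothetical helper: returns the major class for an abbreviation such
// as 'L&', 'Po' or 'Nd', or 'Unknown' if the first letter is unrecognized.
String describeCategory(String category) {
  if (category.isEmpty) return 'Unknown';
  return majorClasses[category[0]] ?? 'Unknown';
}

void main() {
  for (final category in ['L&', 'Po', 'Nd', 'Zs']) {
    print('$category -> ${describeCategory(category)}');
  }
}
```

This is the same idea as the startsWith('P') check in the usage example, generalized to every major class.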

Contributing

Your help and pull requests are welcome.

  • When there is a new Unicode version, the code should be regenerated from the data files. See the generators folder in the source code.
  • Because the lists contain so much data, querying them can be expensive. I would appreciate advice or examples on how to do this more efficiently, and I am open to using a different data structure.
  • There are other types of Unicode data that could be included in this library in the future. You can open an issue if you have a request.
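
On the query cost mentioned above, one possible direction is a binary search over ranges sorted by their start value, which takes O(log n) comparisons per lookup instead of a full scan. This is only a sketch under assumptions: CodeRange and indexOfRange are hypothetical names, and the ranges are assumed not to overlap. Scripts.txt groups ranges by script rather than by code point, so a list built in file order would need sorting first.

```dart
// Hypothetical stand-in for a range entry; not this package's API.
class CodeRange {
  final int start;
  final int end;
  final String name;
  const CodeRange(this.start, this.end, this.name);
}

// Returns the index of the range containing codePoint, or -1 if none.
// Assumes ranges is sorted by start and the ranges do not overlap.
int indexOfRange(List<CodeRange> ranges, int codePoint) {
  var low = 0;
  var high = ranges.length - 1;
  while (low <= high) {
    final mid = low + ((high - low) >> 1);
    final range = ranges[mid];
    if (codePoint < range.start) {
      high = mid - 1; // target lies before this range
    } else if (codePoint > range.end) {
      low = mid + 1; // target lies after this range
    } else {
      return mid; // start <= codePoint <= end
    }
  }
  return -1;
}

// Deliberately unsorted sample, copied by hand from Blocks.txt.
const sample = [
  CodeRange(0x1800, 0x18AF, 'Mongolian'),
  CodeRange(0x0000, 0x007F, 'Basic Latin'),
  CodeRange(0x0080, 0x00FF, 'Latin-1 Supplement'),
];

void main() {
  // Sort a copy once, then reuse it for every lookup.
  final sorted = List<CodeRange>.of(sample)
    ..sort((a, b) => a.start.compareTo(b.start));

  final index = indexOfRange(sorted, 'a'.runes.single);
  print(index >= 0 ? sorted[index].name : 'not found');
}
```

The sort happens once, so its cost is amortized across lookups. A code point that falls in a gap between ranges, such as 0x0100 in this small sample, returns -1.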

Change log

0.1.1

  • Updated library description

0.1.0

  • Initial version
  • Contains Unicode scripts and blocks data

Use this package as a library

1. Depend on it

Add this to your package's pubspec.yaml file:


dependencies:
  unicode_data: ^0.1.1

2. Install it

You can install packages from the command line:

with pub:


$ pub get

with Flutter:


$ flutter pub get

Alternatively, your editor might support pub get or flutter pub get. Check the docs for your editor to learn more.

3. Import it

Now in your Dart code, you can use:


import 'package:unicode_data/unicode_data.dart';
  
Popularity: 9 (how popular the package is relative to other packages)
Health: 100 (code health derived from static analysis)
Maintenance: 100 (how tidy and up-to-date the package is)
Overall: 54 (weighted score of the above)

We analyzed this package on Oct 16, 2019, and provided a score, details, and suggestions below. The analysis completed successfully using:

  • Dart: 2.5.1
  • pana: 0.12.21

Platforms

Detected platforms: Flutter, web, other

No platform restriction found in primary library package:unicode_data/unicode_data.dart.

Health suggestions

Format lib/src/unicode_script.dart.

Run dartfmt to format lib/src/unicode_script.dart.

Dependencies

Package     Constraint        Resolved   Available
Direct dependencies
Dart SDK    >=2.1.0 <3.0.0
Dev dependencies
pedantic    ^1.0.0
test        ^1.0.0