encodeMinimally static method

Result encodeMinimally(
  1. Input input
)

Implementation

static Result encodeMinimally(Input input) {
  //@SuppressWarnings("checkstyle:lineLength")
  /* The minimal encoding is computed by Dijkstra. The acyclic graph is modeled as follows:
   * A vertex represents a combination of a position in the input and an encoding mode where position 0
   * denotes the position left of the first character, 1 the position left of the second character and so on.
   * Likewise the end vertices are located after the last character at position input.length().
   * For any position there might be up to six vertices, one for each of the encoding types ASCII, C40, TEXT, X12,
   * EDF and B256.
   *
   * As an example consider the input string "ABC123" then at position 0 there is only one vertex with the default
   * ASCII encodation. At position 3 there might be vertices for the types ASCII, C40, X12, EDF and B256.
   *
   * An edge leading to such a vertex encodes one or more of the characters left of the position that the vertex
   * represents. It encodes the characters in the encoding mode of the vertex that it ends on. In other words,
   * all edges leading to a particular vertex encode the same characters (the length of the suffix can vary) using the same
   * encoding mode.
   * As an example consider the input string "ABC123" and the vertex (4,EDF). Possible edges leading to this vertex
   * are:
   *   (0,ASCII)  --EDF(ABC1)--> (4,EDF)
   *   (1,ASCII)  --EDF(BC1)-->  (4,EDF)
   *   (1,B256)   --EDF(BC1)-->  (4,EDF)
   *   (1,EDF)    --EDF(BC1)-->  (4,EDF)
   *   (2,ASCII)  --EDF(C1)-->   (4,EDF)
   *   (2,B256)   --EDF(C1)-->   (4,EDF)
   *   (2,EDF)    --EDF(C1)-->   (4,EDF)
   *   (3,ASCII)  --EDF(1)-->    (4,EDF)
   *   (3,B256)   --EDF(1)-->    (4,EDF)
   *   (3,EDF)    --EDF(1)-->    (4,EDF)
   *   (3,C40)    --EDF(1)-->    (4,EDF)
   *   (3,X12)    --EDF(1)-->    (4,EDF)
   *
   * The edges leading to a vertex are stored in such a way that there is a fast way to enumerate the edges ending
   * on a particular vertex.
   *
   * The algorithm processes the vertices in order of their position thereby performing the following:
   *
   * For every vertex at position i the algorithm enumerates the edges ending on the vertex and removes all but the
   * shortest from that list.
   * Then it processes the vertices for the position i+1. If i+1 == input.length() then the algorithm ends
   * and chooses the the edge with the smallest size from any of the edges leading to vertices at this position.
   * Otherwise the algorithm computes all possible outgoing edges for the vertices at the position i+1
   *
   * Examples:
   * The process is illustrated by showing the graph (edges) after each iteration from left to right over the input:
   * An edge is drawn as follows "(" + fromVertex + ") -- " + encodingMode + "(" + encodedInput + ") (" +
   * accumulatedSize + ") --> (" + toVertex + ")"
   *
   * Example 1 encoding the string "ABCDEFG":
   *
   *
   * Situation after adding edges to the start vertex (0,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF)
   * (0,ASCII) C40(ABC) (3) --> (3,C40)
   * (0,ASCII) TEXT(ABC) (5) --> (3,TEXT)
   * (0,ASCII) X12(ABC) (3) --> (3,X12)
   * (0,ASCII) EDF(ABC) (4) --> (3,EDF)
   * (0,ASCII) EDF(ABCD) (4) --> (4,EDF)
   *
   * Situation after adding edges to vertices at position 1
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF)
   * (0,ASCII) C40(ABC) (3) --> (3,C40)
   * (0,ASCII) TEXT(ABC) (5) --> (3,TEXT)
   * (0,ASCII) X12(ABC) (3) --> (3,X12)
   * (0,ASCII) EDF(ABC) (4) --> (3,EDF)
   * (0,ASCII) EDF(ABCD) (4) --> (4,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) B256(B) (4) --> (2,B256)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) EDF(BC) (5) --> (3,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) C40(BCD) (4) --> (4,C40)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) TEXT(BCD) (6) --> (4,TEXT)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) X12(BCD) (4) --> (4,X12)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) EDF(BCD) (5) --> (4,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) EDF(BCDE) (5) --> (5,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) ASCII(B) (4) --> (2,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256)
   * (0,ASCII) B256(A) (3) --> (1,B256) EDF(BC) (6) --> (3,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) C40(BCD) (5) --> (4,C40)
   * (0,ASCII) B256(A) (3) --> (1,B256) TEXT(BCD) (7) --> (4,TEXT)
   * (0,ASCII) B256(A) (3) --> (1,B256) X12(BCD) (5) --> (4,X12)
   * (0,ASCII) B256(A) (3) --> (1,B256) EDF(BCD) (6) --> (4,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) EDF(BCDE) (6) --> (5,EDF)
   *
   * Edge "(1,ASCII) ASCII(B) (2) --> (2,ASCII)" is minimal for the vertex (2,ASCII) so that edge "(1,B256) ASCII(B) (4) --> (2,ASCII)" is removed.
   * Edge "(1,B256) B256(B) (3) --> (2,B256)" is minimal for the vertext (2,B256) so that the edge "(1,ASCII) B256(B) (4) --> (2,B256)" is removed.
   *
   * Situation after adding edges to vertices at position 2
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF)
   * (0,ASCII) C40(ABC) (3) --> (3,C40)
   * (0,ASCII) TEXT(ABC) (5) --> (3,TEXT)
   * (0,ASCII) X12(ABC) (3) --> (3,X12)
   * (0,ASCII) EDF(ABC) (4) --> (3,EDF)
   * (0,ASCII) EDF(ABCD) (4) --> (4,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) EDF(BC) (5) --> (3,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) C40(BCD) (4) --> (4,C40)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) TEXT(BCD) (6) --> (4,TEXT)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) X12(BCD) (4) --> (4,X12)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) EDF(BCD) (5) --> (4,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) EDF(BCDE) (5) --> (5,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256)
   * (0,ASCII) B256(A) (3) --> (1,B256) EDF(BC) (6) --> (3,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) C40(BCD) (5) --> (4,C40)
   * (0,ASCII) B256(A) (3) --> (1,B256) TEXT(BCD) (7) --> (4,TEXT)
   * (0,ASCII) B256(A) (3) --> (1,B256) X12(BCD) (5) --> (4,X12)
   * (0,ASCII) B256(A) (3) --> (1,B256) EDF(BCD) (6) --> (4,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) EDF(BCDE) (6) --> (5,EDF)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) ASCII(C) (5) --> (3,ASCII)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) B256(C) (6) --> (3,B256)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) EDF(CD) (7) --> (4,EDF)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) C40(CDE) (6) --> (5,C40)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) TEXT(CDE) (8) --> (5,TEXT)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) X12(CDE) (6) --> (5,X12)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) EDF(CDE) (7) --> (5,EDF)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF) EDF(CDEF) (7) --> (6,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) B256(C) (5) --> (3,B256)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) EDF(CD) (6) --> (4,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) C40(CDE) (5) --> (5,C40)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) TEXT(CDE) (7) --> (5,TEXT)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) X12(CDE) (5) --> (5,X12)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) EDF(CDE) (6) --> (5,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) EDF(CDEF) (6) --> (6,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) ASCII(C) (4) --> (3,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) B256(C) (4) --> (3,B256)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) EDF(CD) (6) --> (4,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) C40(CDE) (5) --> (5,C40)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) TEXT(CDE) (7) --> (5,TEXT)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) X12(CDE) (5) --> (5,X12)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) EDF(CDE) (6) --> (5,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) EDF(CDEF) (6) --> (6,EDF)
   *
   * Edge "(2,ASCII) ASCII(C) (3) --> (3,ASCII)" is minimal for the vertex (3,ASCII) so that edges "(2,EDF) ASCII(C) (5) --> (3,ASCII)"
   * and "(2,B256) ASCII(C) (4) --> (3,ASCII)" can be removed.
   * Edge "(0,ASCII) EDF(ABC) (4) --> (3,EDF)" is minimal for the vertex (3,EDF) so that edges "(1,ASCII) EDF(BC) (5) --> (3,EDF)"
   * and "(1,B256) EDF(BC) (6) --> (3,EDF)" can be removed.
   * Edge "(2,B256) B256(C) (4) --> (3,B256)" is minimal for the vertex (3,B256) so that edges "(2,ASCII) B256(C) (5) --> (3,B256)"
   * and "(2,EDF) B256(C) (6) --> (3,B256)" can be removed.
   *
   * This continues for vertices 3 thru 7
   *
   * Situation after adding edges to vertices at position 7
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256)
   * (0,ASCII) EDF(AB) (4) --> (2,EDF)
   * (0,ASCII) C40(ABC) (3) --> (3,C40)
   * (0,ASCII) TEXT(ABC) (5) --> (3,TEXT)
   * (0,ASCII) X12(ABC) (3) --> (3,X12)
   * (0,ASCII) EDF(ABC) (4) --> (3,EDF)
   * (0,ASCII) EDF(ABCD) (4) --> (4,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) C40(BCD) (4) --> (4,C40)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) TEXT(BCD) (6) --> (4,TEXT)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) X12(BCD) (4) --> (4,X12)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) EDF(BCDE) (5) --> (5,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256)
   * (0,ASCII) C40(ABC) (3) --> (3,C40) C40(DEF) (5) --> (6,C40)
   * (0,ASCII) X12(ABC) (3) --> (3,X12) X12(DEF) (5) --> (6,X12)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) C40(CDE) (5) --> (5,C40)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) TEXT(CDE) (7) --> (5,TEXT)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) X12(CDE) (5) --> (5,X12)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) EDF(CDEF) (6) --> (6,EDF)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) C40(BCD) (4) --> (4,C40) C40(EFG) (6) --> (7,C40)    //Solution 1
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) X12(BCD) (4) --> (4,X12) X12(EFG) (6) --> (7,X12)    //Solution 2
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) B256(C) (4) --> (3,B256)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII) ASCII(D) (4) --> (4,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII) TEXT(DEF) (8) --> (6,TEXT)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII) EDF(DEFG) (7) --> (7,EDF)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) B256(C) (4) --> (3,B256) B256(D) (5) --> (4,B256)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII) ASCII(D) (4) --> (4,ASCII) ASCII(E) (5) --> (5,ASCII)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII) ASCII(D) (4) --> (4,ASCII) TEXT(EFG) (9) --> (7,TEXT)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) B256(C) (4) --> (3,B256) B256(D) (5) --> (4,B256) B256(E) (6) --> (5,B256)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII) ASCII(D) (4) --> (4,ASCII) ASCII(E) (5) --> (5,ASCII) ASCII(F) (6) --> (6,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) B256(C) (4) --> (3,B256) B256(D) (5) --> (4,B256) B256(E) (6) --> (5,B256) B256(F) (7) --> (6,B256)
   * (0,ASCII) ASCII(A) (1) --> (1,ASCII) ASCII(B) (2) --> (2,ASCII) ASCII(C) (3) --> (3,ASCII) ASCII(D) (4) --> (4,ASCII) ASCII(E) (5) --> (5,ASCII) ASCII(F) (6) --> (6,ASCII) ASCII(G) (7) --> (7,ASCII)
   * (0,ASCII) B256(A) (3) --> (1,B256) B256(B) (3) --> (2,B256) B256(C) (4) --> (3,B256) B256(D) (5) --> (4,B256) B256(E) (6) --> (5,B256) B256(F) (7) --> (6,B256) B256(G) (8) --> (7,B256)
   *
   * Hence a minimal encoding of "ABCDEFG" is either ASCII(A),C40(BCDEFG) or ASCII(A), X12(BCDEFG) with a size of 5 bytes.
   */

  final inputLength = input.length;

  // Array that represents vertices. There is a vertex for every character and mode.
  // The last dimension in the array below encodes the 6 modes ASCII, C40, TEXT, X12, EDF and B256
  final List<List<Edge?>> edges = List.generate(
    inputLength + 1,
    (index) => List.filled(6, null),
  );
  addEdges(input, edges, 0, null);

  for (int i = 1; i <= inputLength; i++) {
    for (int j = 0; j < 6; j++) {
      if (edges[i][j] != null && i < inputLength) {
        addEdges(input, edges, i, edges[i][j]);
      }
    }
    //optimize memory by removing edges that have been passed.
    edges[i - 1].fillRange(0, 6, null);
  }

  int minimalJ = -1;
  int minimalSize = MathUtils.maxValue;
  for (int j = 0; j < 6; j++) {
    if (edges[inputLength][j] != null) {
      final edge = edges[inputLength][j]!;
      //C40, TEXT and X12 need an extra unlatch at the end
      final size = (j >= 1 && j <= 3)
          ? edge.cachedTotalSize + 1
          : edge.cachedTotalSize;
      if (size < minimalSize) {
        minimalSize = size;
        minimalJ = j;
      }
    }
  }

  if (minimalJ < 0) {
    // IllegalStateException
    throw StateError('Failed to encode "$input"');
  }
  return Result(edges[inputLength][minimalJ]!);
}