Class ImmutableStringTrie

java.lang.Object
com.google.inject.internal.aop.ImmutableStringTrie
All Implemented Interfaces:
ToIntFunction<String>

final class ImmutableStringTrie extends Object implements ToIntFunction<String>
Immutable space-efficient trie that provides a quick lookup index for a sorted set of non empty strings. It assumes only those strings will be queried and therefore may produce false-positive results for strings not in the array.

Each node of the tree is represented as a series of chars using this layout:

 +---------------------------------+
 | number of branches              |
 +---------------------------------+---------------------------------+----
 | char for branch 0               | char for branch 1               | ...
 +---------------------------------+---------------------------------+----
 | key-delta/leaf/bud for branch 0 | key-delta/leaf/bud for branch 1 | ...
 +---------------------------------+---------------------------------+----
 | offset to jump to branch 1      | offset to jump to branch 2      | ...
 +---------------------------------+---------------------------------+----
 
Each node is immediately followed by its child nodes according to branch order.

The key-delta is used to skip over a section of the input key when we know it should always match given the recently matched char (assumes only strings from the original list are queried).

Leaves mark a definite end of the match, while buds mark a potential end which could continue down the trie if there are more characters to match. The key-delta for buds is implicitly 1.

The jump for branch 0 is assumed to be 0 and is always ommitted, that is any continuation of the trie for branch 0 immediately follows the current node. The entire jump section is omitted when all the branches from a node are leaves.

Simple example trie with 2 strings "getValue" and "setValue":

 +---+---+---+--------+--------+
 | 2 | g | s | 0x8000 | 0x8001 |
 +---+---+---+--------+--------+
 
In this case the first character is enough to determine the index result.

Example of a trie with a 'bud' that contains 2 strings "getName" and "getNameAndValue":

 +---+---+---+---+---+--------+---+---+--------+
 | 1 | g | 6 | 1 | e | 0x4000 | 1 | A | 0x8001 |
 +---+---+---+---+---+--------+---+---+--------+
 
After matching 'g' we skip to the end of 'getName' before checking if there are any more characters to match.

More complex example with 3 strings "getName", "getValue", "getVersion":

 +---+---+---+---+---+---+--------+---+---+---+---+---+--------+--------+
 | 1 | g | 3 | 2 | N | V | 0x8000 | 1 | 0 | 2 | a | e | 0x8001 | 0x8002 |
 +---+---+---+---+---+---+--------+---+---+---+---+---+--------+--------+
 
After matching 'g' we skip past the 'get'. If the next character is 'N' we know this is 'getName' otherwise we skip over the 'V' and jump to the last check between '...alue' and '...ersion'.
  • Nested Class Summary

    Nested Classes
    Modifier and Type
    Class
    Description
    private static final class 
    Immutable trie that delegates searches that lie outside its range to an overflow trie.
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    private static final char
    Marks a 'bud' in the tree; the same as a leaf except the trie continues beneath it.
    private static final char
    Marks a leaf in the trie, where the rest of the bits are the index to be returned.
    private static final int
    Maximum number of rows that can be indexed by a single trie.
    private final char[]
    The compressed trie.
  • Constructor Summary

    Constructors
    Constructor
    Description
    ImmutableStringTrie(char[] data)
     
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    Returns the index assigned in the trie to the given string.
    private static void
    buildSubTrie(StringBuilder buf, String[] table, int column, int row, int rowLimit)
    Recursively builds a trie for a slice of rows at a particular column.
    private static ToIntFunction<String>
    buildTrie(StringBuilder buf, String[] table, int row, int rowLimit)
    Builds a trie, overflowing to additional tries if there are too many rows
    Builds an immutable trie that indexes the given table of strings.
    private static int
    nextPivotColumn(String[] table, int column, int row, int rowLimit)
    Finds the next column in the current row whose character differs in at least one other row.
    private static int
    nextPivotRow(String[] table, char pivot, int column, int row, int rowLimit)
    Finds the next row that has a different character in the selected column to the given one, or is too short to include the column.
    private static int
     

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • LEAF_MARKER

      private static final char LEAF_MARKER
      Marks a leaf in the trie, where the rest of the bits are the index to be returned.
      See Also:
    • BUD_MARKER

      private static final char BUD_MARKER
      Marks a 'bud' in the tree; the same as a leaf except the trie continues beneath it.
      See Also:
    • MAX_ROWS_PER_TRIE

      private static final int MAX_ROWS_PER_TRIE
      Maximum number of rows that can be indexed by a single trie.
      See Also:
    • trie

      private final char[] trie
      The compressed trie.
  • Constructor Details

    • ImmutableStringTrie

      ImmutableStringTrie(char[] data)
  • Method Details

    • singletonTrie

      private static int singletonTrie(String key)
    • applyAsInt

      public int applyAsInt(String key)
      Returns the index assigned in the trie to the given string.

      Note: a return value of -1 means the string is definitely not in the trie, but a non-negative index may be returned for strings that closely match those in the trie. This is acceptable because we will only call this method with strings that we know exist in the trie.

      Specified by:
      applyAsInt in interface ToIntFunction<String>
    • buildTrie

      public static ToIntFunction<String> buildTrie(Collection<String> table)
      Builds an immutable trie that indexes the given table of strings.

      The table of strings must be sorted in lexical order.

    • buildTrie

      private static ToIntFunction<String> buildTrie(StringBuilder buf, String[] table, int row, int rowLimit)
      Builds a trie, overflowing to additional tries if there are too many rows
    • buildSubTrie

      private static void buildSubTrie(StringBuilder buf, String[] table, int column, int row, int rowLimit)
      Recursively builds a trie for a slice of rows at a particular column.
    • nextPivotRow

      private static int nextPivotRow(String[] table, char pivot, int column, int row, int rowLimit)
      Finds the next row that has a different character in the selected column to the given one, or is too short to include the column. This determines the span of rows that fall under the given character in the trie.

      Returns the row just after the end of the range if all rows have the same character.

    • nextPivotColumn

      private static int nextPivotColumn(String[] table, int column, int row, int rowLimit)
      Finds the next column in the current row whose character differs in at least one other row. This helps identify the longest common prefix from the current pivot point to the next one.

      Returns the column just after the end of the current row if all rows are identical.