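public final class ByteQuadsCanonicalizer extends Object
Replacement for BytesToNameCanonicalizer which aims at more localized
memory access due to flattening of name quad data.
The performance improvement is modest for simple JSON document data binding (maybe 3%),
but should be larger for bigger symbol tables, or for binary formats like Smile.
Hash area is divided into 4 sections:
1. Primary area (1/2 of total size), direct match from hash (LSB)
2. Secondary area (1/4 of total size), match from hash (LSB) >> 1
3. Tertiary area (1/8 of total size), match from hash (LSB) >> 2
4. Spill-over area (remaining 1/8) with linear scan, insertion order
Within every area, entries are 4 ints, where 1 - 3 ints contain 1 - 12
UTF-8 encoded bytes of name (null-padded), and the last int is an offset into
_names, which contains the actual name Strings.

Modifier and Type | Field and Description |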
---|---|
protected int | _count - Total number of Strings in the symbol table; only used for child tables. |
protected boolean | _failOnDoS - Flag that indicates whether we should throw an exception if enough hash collisions are detected (true), or just work around them (false). |
protected int[] | _hashArea - Primary hash information area: consists of 2 * _hashSize entries of 16 bytes (4 ints), arranged in a cascading lookup structure (details of which may be tweaked depending on expected rates of collisions). |
protected boolean | _hashShared - Flag that indicates whether underlying data structures for the main hash area are shared or not. |
protected int | _hashSize - Number of slots for primary entries within _hashArea, which is at most 1/8 of the actual size of the underlying array (4-int slots, primary covers only half of the area; plus, additional area for longer symbols after hash area). |
protected boolean | _intern - Whether canonical symbol Strings are to be intern()ed before being added to the table or not. |
protected int | _longNameOffset - Offset within _hashArea that follows main slots and contains quads for longer names (13 bytes or longer), and points to the first available int that may be used for appending quads of the next long name. |
protected String[] | _names - Array that contains String instances matching entries in _hashArea. |
protected ByteQuadsCanonicalizer | _parent - Reference to the root symbol table, for child tables, so that they can merge table information back as necessary. |
protected int | _secondaryStart - Offset within _hashArea where secondary entries start. |
protected int | _seed - Seed value we use as the base to make hash codes non-static between different runs, but still stable for the lifetime of a single symbol table instance. |
protected int | _spilloverEnd - Pointer to the offset within the spill-over area where there is room for more spilled-over entries (if any). |
protected AtomicReference<com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer.TableInfo> | _tableInfo - Member that is only used by the root table instance: root passes immutable state info to child instances, and children may return new state if they add entries to the table. |
protected int | _tertiaryShift - Constant that determines size of buckets for tertiary entries: 1 << _tertiaryShift is the size, and the shift value is also used for translating from primary offset into tertiary bucket (shift right by 4 + _tertiaryShift). |
protected int | _tertiaryStart - Offset within _hashArea where tertiary entries start. |
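
The section sizes stated above (primary 1/2, secondary 1/4, tertiary 1/8, spill-over 1/8 of the fixed-size area, with long-name quads appended after it) imply a simple offset calculation. The following is a minimal sketch of that arithmetic, not the library's own code: the class name HashAreaLayout and the method sectionOffsets are hypothetical, and it assumes the underlying array holds exactly 8 ints per primary slot before any long names are appended.

```java
// Hypothetical helper, written only to illustrate the layout described above.
final class HashAreaLayout {

    /**
     * Returns start offsets (in ints) of the primary, secondary, tertiary and
     * spill-over sections, followed by the offset where long-name quads would
     * begin, for a table with the given number of primary slots.
     */
    static int[] sectionOffsets(int hashSize) {
        int totalInts = hashSize * 8;               // 2 * hashSize entries of 4 ints each
        int primaryStart = 0;                       // primary area: first half
        int secondaryStart = hashSize * 4;          // right after the primary area
        int tertiaryStart = secondaryStart + (secondaryStart >> 1); // secondary is 1/4 of total
        int spilloverStart = totalInts - hashSize;  // last 1/8 of the fixed-size area
        int longNameStart = totalInts;              // assumption: long names follow the hash area
        return new int[] {
            primaryStart, secondaryStart, tertiaryStart, spilloverStart, longNameStart
        };
    }

    public static void main(String[] args) {
        int[] offsets = sectionOffsets(64);
        String[] labels = { "primary", "secondary", "tertiary", "spill-over", "long names" };
        for (int i = 0; i < offsets.length; i++) {
            System.out.println(labels[i] + " starts at int offset " + offsets[i]);
        }
    }
}
```

With 64 primary slots this gives a 512-int fixed area, which matches the statement that _hashSize is at most 1/8 of the array size.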

Modifier and Type | Method and Description |
---|---|
protected void | _reportTooManyCollisions() |
String | addName(String name, int q1) |
String | addName(String name, int[] q, int qlen) |
String | addName(String name, int q1, int q2) |
String | addName(String name, int q1, int q2, int q3) |
int | bucketCount() |
int | calcHash(int q1) |
int | calcHash(int[] q, int qlen) |
int | calcHash(int q1, int q2) |
int | calcHash(int q1, int q2, int q3) |
static ByteQuadsCanonicalizer | createRoot() - Factory method to call to create a symbol table instance with a randomized seed value. |
protected static ByteQuadsCanonicalizer | createRoot(int seed) |
String | findName(int q1) |
String | findName(int[] q, int qlen) |
String | findName(int q1, int q2) |
String | findName(int q1, int q2, int q3) |
int | hashSeed() |
ByteQuadsCanonicalizer | makeChild(int flags) - Factory method used to create the actual symbol table instance to use for parsing. |
boolean | maybeDirty() - Method called to quickly check whether a child symbol table may have gotten additional entries. |
int | primaryCount() - Method mostly needed by unit tests; calculates number of entries that are in the primary slot set. |
void | release() - Method called by the using code to indicate it is done with this instance. |
int | secondaryCount() - Method mostly needed by unit tests; calculates number of entries in secondary buckets. |
int | size() |
int | spilloverCount() - Method mostly needed by unit tests; calculates number of entries in the shared spill-over area. |
int | tertiaryCount() - Method mostly needed by unit tests; calculates number of entries in tertiary buckets. |
String | toString() |
int | totalCount() |
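
The descriptions of _parent, _tableInfo and _hashShared, together with makeChild(), maybeDirty() and release(), outline a root/child pattern: the root hands out immutable state, a child copies it before its first modification, and on release it may hand new state back. The sketch below illustrates that pattern with a plain list of names instead of a hash area. It is an assumption-laden toy, not the library's implementation: SharedSymbolsSketch, Snapshot and mergeChild are hypothetical names, and the merge rule (keep the larger snapshot) is chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical toy model of the root/child sharing pattern described above.
final class SharedSymbolsSketch {

    /** Immutable state passed between root and children (stand-in for TableInfo). */
    static final class Snapshot {
        final List<String> names;
        Snapshot(List<String> names) {
            this.names = Collections.unmodifiableList(new ArrayList<>(names));
        }
    }

    private final SharedSymbolsSketch parent;           // null for the root
    private final AtomicReference<Snapshot> tableInfo;  // only used by the root
    private List<String> names;                         // only used by children
    private boolean shared = true;                      // true until the child copies its data

    private SharedSymbolsSketch() {                     // root constructor
        this.parent = null;
        this.tableInfo = new AtomicReference<>(new Snapshot(new ArrayList<>()));
    }

    private SharedSymbolsSketch(SharedSymbolsSketch root) { // child constructor
        this.parent = root;
        this.tableInfo = null;
        this.names = root.tableInfo.get().names;        // shared, read-only view
    }

    static SharedSymbolsSketch createRoot() { return new SharedSymbolsSketch(); }

    /** Must be called on the root in this sketch. */
    SharedSymbolsSketch makeChild() { return new SharedSymbolsSketch(this); }

    void addName(String name) {
        if (parent == null) {
            throw new IllegalStateException("root table is not modified directly");
        }
        if (shared) {                                   // copy-on-write before first change
            names = new ArrayList<>(names);
            shared = false;
        }
        names.add(name);
    }

    boolean maybeDirty() { return !shared; }

    void release() {
        if (parent != null && maybeDirty()) {
            parent.mergeChild(new Snapshot(names));
            shared = true;
        }
    }

    private void mergeChild(Snapshot child) {
        Snapshot current = tableInfo.get();
        // Assumption for this sketch: only a larger child state replaces the root state.
        if (child.names.size() > current.names.size()) {
            tableInfo.compareAndSet(current, child);
        }
    }

    int size() {
        return (parent == null) ? tableInfo.get().names.size() : names.size();
    }
}
```

A child that never adds a name stays clean, so releasing it leaves the root state untouched; a child that added names publishes them so later children start with them.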
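protected final ByteQuadsCanonicalizer _parent
Reference to the root symbol table, for child tables, so that they can merge table information back as necessary.
protected final AtomicReference<com.fasterxml.jackson.core.sym.ByteQuadsCanonicalizer.TableInfo> _tableInfo
Member that is only used by the root table instance: root passes immutable state info to child instances, and children may return new state if they add entries to the table.
protected final int _seed
Seed value we use as the base to make hash codes non-static between different runs, but still stable for the lifetime of a single symbol table instance.
protected boolean _intern
Whether canonical symbol Strings are to be intern()ed before being added to the table or not.
NOTE: non-final to allow disabling intern()ing in case of excessive collisions.
protected final boolean _failOnDoS
Flag that indicates whether we should throw an exception if enough hash collisions are detected (true), or just work around them (false).
protected int[] _hashArea
Primary hash information area: consists of 2 * _hashSize entries of 16 bytes (4 ints), arranged in a cascading lookup structure (details of which may be tweaked depending on expected rates of collisions).
protected int _hashSize
Number of slots for primary entries within _hashArea, which is at most 1/8 of the actual size of the underlying array (4-int slots, primary covers only half of the area; plus, additional area for longer symbols after hash area).
protected int _secondaryStart
Offset within _hashArea where secondary entries start.
protected int _tertiaryStart
Offset within _hashArea where tertiary entries start.
protected int _tertiaryShift
Constant that determines size of buckets for tertiary entries: 1 << _tertiaryShift is the size, and the shift value is also used for translating from primary offset into tertiary bucket (shift right by 4 + _tertiaryShift).
Default value is 2, for buckets of 4 slots; grows bigger with bigger table sizes.
protected int _count
Total number of Strings in the symbol table; only used for child tables.
protected String[] _names
Array that contains String instances matching entries in _hashArea.
protected int _spilloverEnd
Pointer to the offset within the spill-over area where there is room for more spilled-over entries (if any). The spill-over area is within the fixed-size portion of _hashArea.
protected int _longNameOffset
Offset within _hashArea that follows main slots and contains quads for longer names (13 bytes or longer), and points to the first available int that may be used for appending quads of the next long name.
protected boolean _hashShared
Flag that indicates whether underlying data structures for the main hash area are shared or not.
This flag needs to be checked both when adding new main entries and when adding new collision list queues (i.e. creating a new collision list head entry).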
public static ByteQuadsCanonicalizer createRoot()
protected static ByteQuadsCanonicalizer createRoot(int seed)
public ByteQuadsCanonicalizer makeChild(int flags)
flags - Bit flags of active JsonFactory.Features enabled.
public void release()
public int size()
public int bucketCount()
public boolean maybeDirty()
public int hashSeed()
public int primaryCount()
public int secondaryCount()
public int tertiaryCount()
public int spilloverCount()
public int totalCount()
public String findName(int q1)
public String findName(int q1, int q2)
public String findName(int q1, int q2, int q3)
public String findName(int[] q, int qlen)
public int calcHash(int q1)
public int calcHash(int q1, int q2)
public int calcHash(int q1, int q2, int q3)
public int calcHash(int[] q, int qlen)
protected void _reportTooManyCollisions()
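
The int q1, q2, q3 and int[] q arguments of findName, addName and calcHash carry the UTF-8 bytes of a name packed four to an int ("quads"); names of 1 - 12 bytes fit in 1 - 3 quads, and longer ones use the int[] variants. The sketch below shows one plausible way to build such quads. The byte order (earlier bytes in higher-order positions) and the zero padding of a partial last quad are assumptions made for illustration, and NameQuadsSketch and toQuads are hypothetical names, not part of the library.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper illustrating how name bytes might be packed into quads.
final class NameQuadsSketch {

    /**
     * Packs the UTF-8 bytes of a name into ints, four bytes per int.
     * Assumption: earlier bytes land in higher-order positions, and a
     * partial last quad keeps its unused high-order bytes as zero.
     */
    static int[] toQuads(String name) {
        byte[] utf8 = name.getBytes(StandardCharsets.UTF_8);
        int[] quads = new int[(utf8.length + 3) / 4];
        for (int i = 0; i < utf8.length; i++) {
            int index = i >> 2; // four bytes per quad
            quads[index] = (quads[index] << 8) | (utf8[i] & 0xFF);
        }
        return quads;
    }

    public static void main(String[] args) {
        for (String name : new String[] { "id", "name", "hello", "averyLongFieldName" }) {
            int[] quads = toQuads(name);
            StringBuilder sb = new StringBuilder(name).append(" ->");
            for (int q : quads) {
                sb.append(' ').append(String.format("0x%08X", q));
            }
            System.out.println(sb);
        }
    }
}
```

Under these assumptions a name of up to 4 bytes yields one quad, suitable for the single-int overloads, while a 13-byte name yields four quads and would need the int[] overloads.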
Copyright © 2008–2022 FasterXML. All rights reserved.