website/docs/api/basevectors.mdx
BaseVectors is an abstract class that supports the development of custom vector
implementations.
To use the vectors in training with StaticVectors, get_batch must be
implemented. For improved performance, use efficient batching in get_batch and
implement to_ops to copy the vector data to the current device. See an example
custom implementation for BPEmb subword embeddings.
Create a new vector store.
| Name | Description |
|---|---|
| keyword-only | |
| strings | The string store. A new string store is created if one is not provided. Defaults to None. |
Get a vector by key. If the key is not found in the table, a KeyError should
be raised.
| Name | Description |
|---|---|
| key | The key to get the vector for. |
| RETURNS | The vector for the key. |
Return the number of vectors in the table.
| Name | Description |
|---|---|
| RETURNS | The number of vectors in the table. |
Check whether there is a vector entry for the given key.
| Name | Description |
|---|---|
| key | The key to check. |
| RETURNS | Whether the key has a vector entry. |
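
The lookup contract described so far (item access that raises KeyError, a length, and a membership check) can be sketched with a small standalone class. This is a minimal illustration only: `ToyVectors`, its `_set` helper, and the list-based storage are hypothetical names invented here, and the class deliberately does not import spaCy. A real implementation would subclass BaseVectors and would normally hold its data in an array rather than Python lists.

```python
class ToyVectors:
    """Hypothetical stand-in for a custom vectors class (not spaCy code)."""

    def __init__(self, *, strings=None):
        # The documented constructor accepts a string store; a plain list
        # stands in for it here (assumption made for this sketch).
        self.strings = strings if strings is not None else []
        self._rows = {}  # key -> row index
        self._data = []  # one list of floats per row

    def _set(self, key, vector):
        # Helper for this sketch only; not part of the documented interface.
        self._rows[key] = len(self._data)
        self._data.append(list(vector))

    def __getitem__(self, key):
        # The dict lookup raises KeyError for unknown keys.
        return self._data[self._rows[key]]

    def __len__(self):
        # Number of vectors in the table.
        return len(self._data)

    def __contains__(self, key):
        # Whether the key has a vector entry.
        return key in self._rows


vectors = ToyVectors()
vectors._set("cat", [0.1, 0.2, 0.3])
```

With this sketch, `"cat" in vectors` is true, `len(vectors)` is 1, and `vectors["dog"]` raises KeyError.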
Add a key to the table, if possible. If no keys can be added, return -1.
| Name | Description |
|---|---|
| key | The key to add. |
| RETURNS | The row the vector was added to, or -1 if the operation is not supported. |
Get a (rows, dims) tuple giving the number of rows and the number of dimensions
in the vector table.
| Name | Description |
|---|---|
| RETURNS | A (rows, dims) pair. |
The vector size, i.e. rows * dims.
| Name | Description |
|---|---|
| RETURNS | The vector size. |
Whether the vectors table is full and no slots are available for new keys.
| Name | Description |
|---|---|
| RETURNS | Whether the vectors table is full. |
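
How add, the shape, the size, and the full check relate to one another can be shown with a fixed-capacity table. The sketch below is hypothetical and standalone: `FixedTable` is an invented name, exposing shape, size and is_full as properties is an assumption, and returning -1 from `add` once the table is full is a choice made for this example rather than documented behavior.

```python
class FixedTable:
    """Hypothetical fixed-capacity vector table (illustrative names only)."""

    def __init__(self, rows, dims):
        self._dims = dims
        self._data = [[0.0] * dims for _ in range(rows)]
        self._key2row = {}  # key -> row index

    @property
    def shape(self):
        # A (rows, dims) pair.
        return (len(self._data), self._dims)

    @property
    def size(self):
        # rows * dims
        rows, dims = self.shape
        return rows * dims

    @property
    def is_full(self):
        # Full once every row has been assigned to a key.
        return len(self._key2row) >= len(self._data)

    def add(self, key):
        if key in self._key2row:
            return self._key2row[key]
        if self.is_full:
            # Assumption of this sketch: signal "cannot add" with -1.
            return -1
        row = len(self._key2row)
        self._key2row[key] = row
        return row


table = FixedTable(rows=2, dims=3)
first = table.add("cat")
second = table.add("dog")
third = table.add("bird")  # no free slot left
```

Here `table.shape` is `(2, 3)`, `table.size` is 6, and the third call to `add` returns -1 because both rows are already taken.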
Get the vectors for the provided keys efficiently as a batch. Required to use
the vectors with StaticVectors for training.
| Name | Description |
|---|---|
| keys | The keys. |
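
One way to read "efficiently as a batch" is to resolve each distinct key once and reuse the result for repeated keys. The function below is a sketch of that idea under stated assumptions: it is a free function over a plain dict rather than a method on a BaseVectors subclass, the `table` and `dims` parameters are hypothetical, and mapping unknown keys to a zero vector is an assumption of this example. A real implementation would typically return a single 2D array instead of a list of lists.

```python
def get_batch(table, keys, dims):
    """Return one vector per key, resolving each distinct key only once."""
    resolved = {}
    for key in keys:
        if key not in resolved:
            # Assumption: keys without an entry get a zero vector.
            resolved[key] = table.get(key, [0.0] * dims)
    # Expand back to the original order, including repeated keys.
    return [resolved[key] for key in keys]


lookup = {"a": [1.0, 2.0]}
batch = get_batch(lookup, ["a", "b", "a"], dims=2)
```

The batch has one row per input key, in input order, so repeated keys yield repeated rows.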
Dummy method. Implement this to change the embedding matrix to use different Thinc ops.
| Name | Description |
|---|---|
| ops | The Thinc ops to switch the embedding matrix to. |
Dummy method to allow serialization. Implement to save vector data with the pipeline.
| Name | Description |
|---|---|
| path | A path to a directory, which will be created if it doesn't exist. Paths may be either strings or Path-like objects. |
Dummy method to allow serialization. Implement to load vector data from a saved pipeline.
| Name | Description |
|---|---|
| path | A path to a directory. Paths may be either strings or Path-like objects. |
| RETURNS | The modified vectors object. |
Dummy method to allow serialization. Implement to serialize vector data to a binary string.
| Name | Description |
|---|---|
| RETURNS | The serialized form of the vectors object. |
Dummy method to allow serialization. Implement to load vector data from a binary string.
| Name | Description |
|---|---|
| data | The data to load from. |
| RETURNS | The vectors object. |
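
The four serialization methods fit together naturally when the disk methods are built on the byte methods. The following standalone sketch shows one possible arrangement: `SerializableTable`, the JSON encoding, and the `table.json` file name are all hypothetical choices made for illustration, not what spaCy itself writes.

```python
import json
import tempfile
from pathlib import Path


class SerializableTable:
    """Hypothetical store; the JSON layout and file name are illustrative."""

    def __init__(self):
        self.data = {}  # key -> list of floats

    def to_bytes(self):
        # Serialize the table to a binary string.
        return json.dumps(self.data, sort_keys=True).encode("utf8")

    def from_bytes(self, data):
        self.data = json.loads(data.decode("utf8"))
        return self  # return the object itself, as the tables describe

    def to_disk(self, path):
        path = Path(path)  # accepts strings and Path-like objects
        path.mkdir(parents=True, exist_ok=True)  # create if missing
        (path / "table.json").write_bytes(self.to_bytes())

    def from_disk(self, path):
        return self.from_bytes((Path(path) / "table.json").read_bytes())


original = SerializableTable()
original.data["cat"] = [0.1, 0.2]

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "vectors"  # directory does not exist yet
    original.to_disk(target)
    restored = SerializableTable().from_disk(target)

from_bytes_copy = SerializableTable().from_bytes(original.to_bytes())
```

Returning the object from `from_disk` and `from_bytes` lets a caller construct and load in a single expression, as the last lines do.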