fsa/doc/fsa_file_format.html
| header | 256 bytes | | symbol table | size bytes | | state table | size*4 bytes | | data table | data_size bytes | | perfect hahs table (optional) | size*4 bytes |
| field | offset | size | descrption |
|---|---|---|---|
| magic | 0 | 4 (uint32) | Magic number (0x79832469) |
| version | 4 | 4 (uint32) | Version number of the fsa library used for building this fsa |
| checksum | 8 | 4 (uint32) | Checksum |
| size | 12 | 4 (uint32) | Size of fsa (cells) |
| start | 16 | 4 (uint32) | Start state |
| data_size | 20 | 4 (uint32) | Size of data (bytes) |
| data_type | 24 | 4 (uint32) | Type of data items (0=variable size, 1=fixed size) |
| fixed_data_size | 28 | 4 (uint32) | Data item size if fixed |
| has_perfect_hash | 32 | 4 (uint32) | Indicator for perfect hash (0=no, 1=yes) |
| serial | 36 | 4 (uint32) | Serial number |
| reserved | 40 | 216 (54*uint32) | Reserved (pads size to 256 bytes) |
The symbol table and state table contain the transitions of the automaton, each 1-byte entry in the symbol table corresponds to an uint32 entry in the state table. For each state, a list of at most 254 transistions is stored, as the symbol set is 8-bit characters, with 0x00 and 0xff reserved. Each state id is in fact an offset into these tables. For a given state state, there exists a valid transition for symbol sym if the symbol table contains sym at offset state+sym. 0x00 means the cell is empty, 0xff is a special symbol meaning that the given state is a final state, anything else means invalid transition (i.e. the cell is in use by some other state). For valid transitions, the corresponding entry in the state table yields the next state. For 0xff transitions, the state table entry contains the offset of the date item within the data store.
The data store contains the data items for the final states. The 'new state' entry of a final state transition in the state table (corresponding to the special final state symbol 0xff) contains the data store offset of the data item corresponding to that final state. If fixed size items are used, each item takes fixed_data_size bytes as defined in the header. Variable size items take 4 bytes (uint32 item_size) plus item_size bytes. The size of the data store is given in the header.
The perfect hash table has one uint32 entry for each transition in the symbol/state table, thus the size of the perfect hash table equals the size of the state table. The perfect hash value for a final state is calculated by adding all values in this table for the transitions along the path from the start state to the final state.