doc/dev.md
- Create a directory format/<name>.
- Copy some similar decoder, format/format/bson.go is quite small, to format/<name>/<name>.go.
- Rename format.BSON and add it to format/format.go, and don't forget to change the string constant.
- Add an import to format/all/all.go.
- *Decode*Len/Range/Limit functions. fq will also automatically add "gap" fields if it finds gaps.
- A helper like func decodeHeader(d *decode.D) can then be used as d.FieldStruct("header", decodeHeader), d.SeekRel(1234, decodeHeader) or d.SeekRel(1234, func(d *decode.D) { d.FieldStruct("header", decodeHeader) }).

Commits:
mp3: Validate sync correctly
Tests:
- If possible, use a pair of testdata/file and testdata/file.fqtest where file.fqtest is $ fq dv file or $ fq 'dv,torepr' file if there is torepr support.
- If dv produces a lot of output maybe use dv({array_truncate: 50}) etc.
- Run go test ./format -run TestFormats/<name> to test expected output.
- Run go test ./format -run TestFormats/<name> -update to update the expected output with the current output.

If you have format specific documentation:
- Put it in format/*/<name>.md and use //go:embed <name>.md/interp.RegisterFS(..) to embed/register it.
- Use simple markdown: sections (### Section), paragraphs, lists and links.
- The documentation is used by make doc and the fq cli help system.
- Add testdata/<name>_help.fqtest with just $ fq -h <name> to test CLI help.
- If in doubt, look at mp4.md/mp4.go etc.
- Run make README.md doc/formats.md to update md files.
- Run linter make lint
- Run fuzzer make fuzz GROUP=<name>, see usage in Makefile
*decode.D reader methods use this name convention:
<Field>?(<reader<length>?>|<type>Fn>)(...[, scalar.Mapper...]) <type>
- If the name starts with Field, a field will be added and the first argument will be the name of the field. If not, it will just read.
- <try>?<reader<length>?>|<try>?<type>Fn> a reader or a reader function
  - <try>? If prefixed with Try, the function returns an error instead of panicking on error.
  - <reader<length>?> Read bits using some decoder.
    - U16 unsigned 16 bit integer.
    - UTF8 UTF8 with byte length as argument.
  - <type>Fn> read using a func(d *decode.D) <type> function.

All Field functions take variadic scalar.Mapper arguments that will be applied after reading.
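
The naming convention can be illustrated with a small standalone sketch. It does not use fq itself: the D type below is a hypothetical byte-oriented toy (fq's *decode.D reads bits and has many more readers), and it only shows how a plain reader, its Try variant and its Field variant might relate to each other.

```go
package main

import (
	"errors"
	"fmt"
)

// field is one named value recorded by a Field* method.
type field struct {
	Name  string
	Value uint64
}

// D is a toy decoder for this sketch only, not fq's *decode.D.
type D struct {
	buf    []byte
	pos    int
	Fields []field
}

// TryU8 reads one byte and returns an error at end of input (Try prefix).
func (d *D) TryU8() (uint64, error) {
	if d.pos >= len(d.buf) {
		return 0, errors.New("unexpected end of input")
	}
	v := uint64(d.buf[d.pos])
	d.pos++
	return v, nil
}

// U8 reads one byte and panics on error (no Try prefix).
func (d *D) U8() uint64 {
	v, err := d.TryU8()
	if err != nil {
		panic(err)
	}
	return v
}

// FieldU8 reads one byte, records it under name and returns it (Field prefix).
func (d *D) FieldU8(name string) uint64 {
	v := d.U8()
	d.Fields = append(d.Fields, field{Name: name, Value: v})
	return v
}

func main() {
	d := &D{buf: []byte{0x01, 0x02}}
	d.FieldU8("type")   // reads and adds a field
	fmt.Println(d.U8()) // just reads
	_, err := d.TryU8() // returns an error instead of panicking
	fmt.Println(len(d.Fields), err)
}
```

Layering U8 on TryU8 and FieldU8 on U8 is a design choice of this sketch; it keeps the error handling in one place.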
<type> are these types:
| <type> | Go type | jq type |
|---|---|---|
| U | uint64 | number |
| S | int64 | number |
| F | float64 | number |
| Str | string | string |
| Bool | bool | boolean |
| Nil | nil | null |
TODO: there are some more (BitBuf etc, should be renamed)
To add a struct or array use d.FieldStruct(...) and d.FieldArray(...).
TODO: nested formats, buffers, own decoders, scalar mappers
TODO: seeking, framed/limited/range decode
For example this decoder:
// read 4 byte UTF8 string and add it as "magic", return a string
d.FieldUTF8("magic", 4)
// create a new struct and add it as "headers", returns a *decode.D
d.FieldStruct("headers", func(d *decode.D) {
	// read 8 bit unsigned integer, map it and add it as "type", returns a uint64
	d.FieldU8("type", scalar.UintMapSymStr{
		1: "start",
		// ...
	})
})
will produce something like this:
*decode.Value{
	Parent: nil,
	V: *decode.Compound{
		IsArray: false, // is struct
		Children: []*decode.Value{
			*decode.Value{
				Name: "magic",
				V: scalar.Str{
					Actual: "abcd", // read and set by UTF8 reader
				},
				Range: ranges.Range{Start: 0, Len: 32},
			},
			*decode.Value{
				Parent: &..., // ref parent *decode.Value
				Name: "headers",
				V: *decode.Compound{
					IsArray: false, // is struct
					Children: []*decode.Value{
						*decode.Value{
							Name: "type",
							V: scalar.Uint{
								Actual: uint64(1), // read and set by U8 reader
								Sym: "start", // set by UintMapSymStr scalar.Mapper
							},
							Range: ranges.Range{Start: 32, Len: 8},
						},
					},
				},
				Range: ranges.Range{Start: 32, Len: 8},
			},
		},
	},
	Range: ranges.Range{Start: 0, Len: 40},
}
and will look like this in jq/JSON:
{
  "magic": "abcd",
  "headers": {
    "type": "start"
  }
}
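
To make the step from the tree above to the JSON concrete, here is a minimal standalone sketch. The Scalar, Compound and Value types are hypothetical toys rather than fq's own types, and the sketch assumes that a symbolic value, when present, is what ends up in the JSON (which matches "type": "start" above).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Scalar is a toy scalar: an actual value and an optional symbolic one.
type Scalar struct {
	Actual any
	Sym    any // nil when no mapper set a symbolic value
}

// Compound is a toy struct or array of values.
type Compound struct {
	IsArray  bool
	Children []*Value
}

// Value is a toy node in the decode tree.
type Value struct {
	Name string
	V    any // Scalar or *Compound
}

// toJQ converts a toy tree to plain Go values that encoding/json can marshal.
// Assumption of this sketch: use the symbolic value if set, else the actual value.
func toJQ(v *Value) any {
	switch vv := v.V.(type) {
	case *Compound:
		if vv.IsArray {
			a := make([]any, 0, len(vv.Children))
			for _, c := range vv.Children {
				a = append(a, toJQ(c))
			}
			return a
		}
		m := map[string]any{}
		for _, c := range vv.Children {
			m[c.Name] = toJQ(c)
		}
		return m
	case Scalar:
		if vv.Sym != nil {
			return vv.Sym
		}
		return vv.Actual
	default:
		return nil
	}
}

// exampleTree rebuilds the "magic"/"headers" tree from the dump above.
func exampleTree() *Value {
	typ := &Value{Name: "type", V: Scalar{Actual: uint64(1), Sym: "start"}}
	headers := &Value{Name: "headers", V: &Compound{Children: []*Value{typ}}}
	magic := &Value{Name: "magic", V: Scalar{Actual: "abcd"}}
	return &Value{V: &Compound{Children: []*Value{magic, headers}}}
}

func main() {
	b, err := json.Marshal(toJQ(exampleTree()))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

Note that encoding/json sorts map keys, so the key order printed by this sketch differs from the decode order.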
*decode.D type

This is the main type used during decoding. It keeps track of, among other things, the *decode.Value where fields will be added.

New *decode.D are created during decoding when d.FieldStruct etc is used. It is also a kitchen sink of all kinds of functions for reading various standard number and string encodings etc.

Decoder authors do not have to create them.
*decode.Value type

This is what *decode.D produces, and it is used to represent the decoded structure. It can be an array, struct, number, string etc. It is the underlying type used by interp.DecodeValue, which implements gojq.JQValue to expose it as various jq types, which in turn are used to produce JSON.

It stores, among other things:
- The parent *decode.Value, unless it's a root.
- The value itself, a scalar.S or a *decode.Compound (struct or array).

Decoder authors will probably not have to create them.
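
In the example dump earlier in this document, each struct's Range covers its children: "headers" spans bits 32 to 40 and the root spans bits 0 to 40. The following standalone sketch shows that min/max union with a hypothetical toy Range type. It is an illustration of the arithmetic only, not fq's ranges package.

```go
package main

import "fmt"

// Range is a toy bit range: start offset and length, both in bits.
type Range struct{ Start, Len int64 }

// Stop returns the first bit offset after the range.
func (r Range) Stop() int64 { return r.Start + r.Len }

// union returns the smallest range that covers all given ranges.
func union(rs ...Range) Range {
	if len(rs) == 0 {
		return Range{}
	}
	start, stop := rs[0].Start, rs[0].Stop()
	for _, r := range rs[1:] {
		if r.Start < start {
			start = r.Start
		}
		if r.Stop() > stop {
			stop = r.Stop()
		}
	}
	return Range{Start: start, Len: stop - start}
}

func main() {
	magic := Range{Start: 0, Len: 32}          // 4 byte string
	headers := union(Range{Start: 32, Len: 8}) // struct with one 8 bit child
	fmt.Println(union(magic, headers))         // range of the root struct
}
```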
scalar.S type

Keeps track of:
- The actual value, represented using a Go type like uint64, string etc. For example, values read by a UTF-8 or a UTF-16 reader both end up as a string.
- An optional symbolic value. For example, scalar.UintMapSymStr would map an actual uint64 to a symbolic string.

The scalar package has scalar.Mapper implementations for all types, either to map an actual value to a whole scalar.S value (scalar.<type>ToScalar) or just to set the symbolic value (scalar.<type>ToSym<type>). There are also mappers to just set values or to change number representations, scalar.Hex/scalar.SymHex etc.

Decoder authors will probably not have to create them. But you might implement your own scalar.Mapper to modify them.
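
As a rough illustration of what a custom mapper could look like, here is a standalone sketch. The UintMapper interface, the Uint struct and both mappers are hypothetical and written for this example only; the real scalar.Mapper interfaces in fq have their own method names and types, so check the scalar package before copying anything.

```go
package main

import "fmt"

// Uint is a toy unsigned scalar with an actual and an optional symbolic value.
type Uint struct {
	Actual uint64
	Sym    any
}

// UintMapper is a hypothetical mapper interface used only in this sketch.
type UintMapper interface {
	MapUint(Uint) (Uint, error)
}

// SymStrMap sets a symbolic string for known actual values,
// in the spirit of scalar.UintMapSymStr.
type SymStrMap map[uint64]string

func (m SymStrMap) MapUint(s Uint) (Uint, error) {
	if sym, ok := m[s.Actual]; ok {
		s.Sym = sym
	}
	return s, nil
}

// RequireRange is a custom mapper that rejects values outside [Min, Max].
type RequireRange struct{ Min, Max uint64 }

func (r RequireRange) MapUint(s Uint) (Uint, error) {
	if s.Actual < r.Min || s.Actual > r.Max {
		return s, fmt.Errorf("value %d outside %d-%d", s.Actual, r.Min, r.Max)
	}
	return s, nil
}

// applyMappers runs the mappers in order after a value has been read.
func applyMappers(s Uint, ms ...UintMapper) (Uint, error) {
	for _, m := range ms {
		var err error
		if s, err = m.MapUint(s); err != nil {
			return s, err
		}
	}
	return s, nil
}

func main() {
	s, err := applyMappers(Uint{Actual: 1}, RequireRange{Min: 1, Max: 3}, SymStrMap{1: "start"})
	fmt.Println(s.Actual, s.Sym, err)
}
```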
*decode.Compound type

Used to store a struct or array of *decode.Value.

Decoder authors do not have to create them.
I usually use -d <format> and dv while developing; that way you get a decode tree
even if decoding fails. dv gives verbose output and also includes a stack trace.
go run . -d <format> dv file
If the format is inside some other format, it can be handy to first extract the bits and run
the decoder directly. For example, if working on an aac_frame decoder issue:
fq '.tracks[0].samples[1234] | tobytes' file.mp4 > aac_frame_1234
fq -d aac_frame dv aac_frame_1234
Sometimes nested decoding fails. A good way to investigate is then to temporarily change the
parent decoder to use d.RawLen() etc instead of d.FormatLen() etc to extract the bits. Hopefully
there will be some option to do this in the future.
When researching or investigating something, I can recommend using watchexec, modd etc to
make things more comfortable. Also, using vscode/delve for debugging should work fine once
launch args are set up.
watchexec "go run . -d aac_frame dv aac_frame"
Some different ways to run tests:
# run all tests
make test
# run all go tests
go test ./...
# run all tests for one format
go test -run TestFormats/mp4 ./format/
# update all expected outputs for tests
go test ./pkg/interp ./format -update
# update expected output for specific tests
go test ./format -run TestFormats/elf -update
# color diff
DIFF_COLOR=1 go test ...
To lint source use:
make lint
Generate documentation. Requires FFmpeg and Graphviz:
make doc
TODO: make fuzz
Split debug and normal output even when using repl:
Write log package output and stderr to a file that can be followed with tail -f in another terminal:
LOGFILE=/tmp/log go run . ... 2>>/tmp/log
gojq execution debug:
GOJQ_DEBUG=1 go run -tags debug . ...
Memory and CPU profile (will open a browser):
make memprof ARGS=". file"
make cpuprof ARGS=". test.mp3"
main:main()
    cli.Main(default registry)
        interp.New(registry, std os interp implementation)
        interp.(*Interp).Main()
            interp.jq _main/0:
                args.jq _args_parse/2
                populate filenames for input/0
                interp.jq inputs/0
                    foreach valid input/0 output
                        interp.jq open
                            funcs.go _open
                        interp.jq decode
                            funcs.go _decode
                                decode.go Decode(...)
                                    ...
                        interp.jq eval expr
                            funcs.go _eval
                        interp.jq display
                            funcs.go _display
                                for interp.(decodeValueBase).Display()
                                    dump.go
                                        print tree
                        empty output
*os.File, *bytes.Buffer
^
ctxreadseeker.Reader defers blocking io operations to a goroutine to make them cancellable
^
progressreadseeker.Reader approximates how much of a file has been read
^
aheadreadseeker.Reader does readahead caching
^
| (io.ReadSeeker interface)
|
bitio.IOBitReader (implements bitio.Bit* interfaces)
SectionBitReader
MultiBitReader
jq -n '[1,2,3,4] | .[null:], .[null:2], .[2:null], .[:null]'
git clone https://github.com/StefanScherer/windows-docker-machine.git
cd windows-docker-machine
vagrant up 2016-box
cd ../fq
docker --context 2016-box run --rm -ti -v "C:${PWD//\//\\}:C:${PWD//\//\\}" -w "$PWD" golang:1.18-windowsservercore-ltsc2016
Issues and PRs related to fq:
#43 Support for functions written in go when used as a library
#46 Support custom internal functions
#56 String format query with no operator using %#v or %#+v panics
#65 Try-catch with custom function
#67 Add custom iterator function support which enables implementing a REPL in jq
#81 path/1 behavior and path expression question
#86 ER: basic TCO
#109 jq halt_error behavior difference
#113 error/0 and error/1 behavior difference
#117 Negative number modulus *big.Int behaves differently to int
#118 Regression introduced by "remove fork analysis from tail call optimization (ref #86)"
#122 Slow performance for large error values that ends up using typeErrorPreview()
#125 improve performance of join by make it internal
#141 Empty array flatten regression since "improve flatten performance by reducing copy"
Run and follow instructions:
make release VERSION=1.2.3
Commits since release
git log --no-decorate --no-merges --oneline v0.0.4..wader/master | sort -t " " -k 2 | sed 's/\(.*\)/* \1/'