Maps - Miller — ContextQMD

# Maps

Miller data types are listed on the Data types page; here we focus specifically on maps.

On the whole, Miller maps behave like maps in most other programming languages. However, a few differences are noted below. They follow the Principle of Least Surprise and aim to reduce keystroking for Miller's most-used model, streaming record processing.

## Types of maps

Map literals are written in curly braces, with string keys and values of any Miller data type (including other maps, or arrays). Integers may also be given as keys, although they are stored as strings.

<pre class="pre-highlight-in-pair">
mlr -n put '
  end {
    x = {"a": 1, "b": {"x": 2, "y": [3,4,5]}, 99: true};
    dump x;
    print x[99];
    print x["99"];
  }
'
</pre>
<pre class="pre-non-highlight-in-pair">
{
  "a": 1,
  "b": {
    "x": 2,
    "y": [3, 4, 5]
  },
  "99": true
}
true
true
</pre>

As with arrays and argument-lists, trailing commas are supported:

<pre class="pre-highlight-in-pair">
mlr -n put '
  end {
    x = {
      "a" : 1,
      "b" : 2,
      "c" : 3,
    };
    print x;
  }
'
</pre>
<pre class="pre-non-highlight-in-pair">
{
  "a": 1,
  "b": 2,
  "c": 3
}
</pre>

The current record, accessible using $*, is a map.

<pre class="pre-highlight-in-pair">
mlr --csv --from example.csv head -n 2 then put -q '
  dump $*;
  print "Color is", $*["color"];
'
</pre>
<pre class="pre-non-highlight-in-pair">
{
  "color": "yellow",
  "shape": "triangle",
  "flag": "true",
  "k": 1,
  "index": 11,
  "quantity": 43.6498,
  "rate": 9.8870
}
Color is yellow
{
  "color": "red",
  "shape": "square",
  "flag": "true",
  "k": 2,
  "index": 15,
  "quantity": 79.2778,
  "rate": 0.0130
}
Color is red
</pre>

The collection of all out-of-stream variables, @*, is a map.

<pre class="pre-highlight-in-pair">
mlr --csv --from example.csv put -q '
  begin {
    @last_rates = {};
  }
  @last_rates[$shape] = $rate;
  @last_color = $color;
  end {
    dump @*;
  }
'
</pre>
<pre class="pre-non-highlight-in-pair">
{
  "last_rates": {
    "triangle": 5.8240,
    "square": 8.2430,
    "circle": 8.3350
  },
  "last_color": "purple"
}
</pre>

Also note that several built-in functions operate on maps and/or return maps.
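As a small illustration of map-oriented built-ins, the sketch below uses `mapsum`, `haskey`, `length`, and `mapexcept`. The variable names `m` and `n` are made up for this example; consult the functions reference for the full list and exact semantics.

```shell
mlr -n put '
  end {
    m = mapsum({"a": 1}, {"b": 2});  # merge two maps into one
    print haskey(m, "a");            # test whether a key is present
    print length(m);                 # number of key-value pairs
    n = mapexcept(m, "a");           # copy of m without the key "a"
    dump n;
  }
'
```

Here `mapsum` and `mapexcept` return new maps, while `haskey` and `length` take a map and return a boolean and an integer, respectively.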

## Insertion order is preserved

Miller maps preserve insertion order. So if you write @m["y"]=7 and then @m["x"]=3, any loop over the map @m will give you the keys "y" and "x" in that order.
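A minimal sketch of this behavior, using the same assignments in an `end` block so that no input file is needed:

```shell
mlr -n put '
  end {
    @m["y"] = 7;
    @m["x"] = 3;
    # Keys come back in insertion order, not sorted order.
    for (k, v in @m) {
      print k, v;
    }
  }
'
```

The loop visits "y" before "x", since that is the order in which the keys were first assigned.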

## String keys, with conversion from/to integer

All Miller map keys are strings. If a map is indexed with an integer, whether for read or for write (i.e. on either the right-hand side or the left-hand side of an assignment), the integer is converted to a string. So @m[3] is the same as @m["3"]. This supports idioms that operate on all records, where it's important to let people write @records[NR] = $*.
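The following sketch exercises both directions: writing with an integer index and reading with a string, and vice versa. The map name `@m` and its values are arbitrary.

```shell
mlr -n put '
  end {
    @m[3] = "three";    # integer index on write; the key is stored as "3"
    @m["4"] = "four";   # string index on write
    print @m["3"];      # read back with a string index
    print @m[4];        # read back with an integer index
    dump @m;            # both keys appear as strings
  }
'
```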

## Auto-create

Indexing any as-yet-unassigned local variable or out-of-stream variable auto-creates that variable as a map:

<pre class="pre-highlight-in-pair">
mlr --csv --from example.csv put -q '
  # You can do this but you do not need to:
  # begin { @last_rates = {} }
  @last_rates[$shape] = $rate;
  end {
    dump @last_rates;
  }
'
</pre>
<pre class="pre-non-highlight-in-pair">
{
  "triangle": 5.8240,
  "square": 8.2430,
  "circle": 8.3350
}
</pre>

This also means that auto-create results in maps, not arrays, even if keys are integers. If you want to auto-extend an array, initialize it explicitly to [].

<pre class="pre-highlight-in-pair">
mlr --csv --from example.csv head -n 4 then put -q '
  begin {
    @my_array = [];
  }
  @my_array[NR] = $quantity;
  @my_map[NR] = $rate;
  end {
    dump
  }
'
</pre>
<pre class="pre-non-highlight-in-pair">
{
  "my_array": [43.6498, 79.2778, 13.8103, 77.5542],
  "my_map": {
    "1": 9.8870,
    "2": 0.0130,
    "3": 2.9010,
    "4": 7.4670
  }
}
</pre>

## Auto-deepen

Similarly, maps are auto-deepened: you can write @m["a"]["b"]["c"]=3 without first setting @m["a"]={} and @m["a"]["b"]={}. The reason for this is data aggregation: for example, if you want to compute keyed sums, you can do that with a minimum of keystrokes.

<pre class="pre-highlight-in-pair">
mlr --icsv --opprint --from example.csv put -q '
  @quantity_sum[$color][$shape] += $rate;
  end {
    emit @quantity_sum, "color", "shape";
  }
'
</pre>
<pre class="pre-non-highlight-in-pair">
color  shape    quantity_sum
yellow triangle 9.8870
yellow circle   12.572000000000001
red    square   17.011
red    circle   2.9010
purple triangle 14.415
purple square   8.2430
</pre>
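To see the auto-deepened structure itself rather than emitted records, a minimal sketch is to make the three-level assignment from the text above and dump the result:

```shell
mlr -n put '
  end {
    # No prior @m = {}, @m["a"] = {}, or @m["a"]["b"] = {} is needed.
    @m["a"]["b"]["c"] = 3;
    dump @m;
  }
'
```

The dump shows a map nested three levels deep, with the intermediate maps created by the single assignment.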

## Looping

See single-variable for-loops and key-value for-loops.
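As a brief sketch of the two loop forms applied to a map (the local variable `m` is made up for this example): the single-variable form binds each key, and the key-value form binds each key together with its value.

```shell
mlr -n put '
  end {
    m = {"a": 1, "b": 2};
    # Single-variable for-loop: k is bound to each key.
    for (k in m) {
      print k;
    }
    # Key-value for-loop: k and v are bound to each key and its value.
    for (k, v in m) {
      print k, v;
    }
  }
'
```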

## Map-valued fields in CSV files

See the flatten/unflatten page.