Extend support for TKey in Dictionaries to non-string types

src/libraries/System.Text.Json/docs/KeyConverter_spec.md

Motivation

Most of our users who serialize dictionaries use Dictionary<string, TValue>; however, a significant number rely on Dictionary<TKey, TValue> where TKey is a primitive other than string, e.g. int or Guid. Most of them come from Newtonsoft.Json, which supports several types as the TKey. Other popular .NET serializers also support non-string keys; the most common are integers (int, uint, long, etc.) and enums (including Flags enums).

Goals

  • 80%+ of dictionaries with non-string keys work out of the box, especially if they can round-trip.
  • Remain high performance.

Non-goals

  • Complete parity with Newtonsoft.Json capabilities, especially in how string support is extended; any extension point can be through JsonConverter<MyDictionary<non-string, TValue>>.
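
The extension point named above can be sketched as a converter for one concrete dictionary type. This is a minimal illustration under assumptions, not part of the proposal: VersionKeyDictionaryConverter is a hypothetical name, and Version is used only as an example of a key type outside the supported list.

```cs
using System;
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

var options = new JsonSerializerOptions();
options.Converters.Add(new VersionKeyDictionaryConverter());

var input = new Dictionary<Version, string> { [new Version(1, 2)] = "stable" };
string json = JsonSerializer.Serialize(input, options);
Console.WriteLine(json); // {"1.2":"stable"}

Dictionary<Version, string> copy =
    JsonSerializer.Deserialize<Dictionary<Version, string>>(json, options);
Console.WriteLine(copy[new Version(1, 2)]); // stable

// Hypothetical converter: handles the whole dictionary, so the key format is
// entirely under the user's control.
sealed class VersionKeyDictionaryConverter : JsonConverter<Dictionary<Version, string>>
{
    public override Dictionary<Version, string> Read(
        ref Utf8JsonReader reader, Type typeToConvert, JsonSerializerOptions options)
    {
        if (reader.TokenType != JsonTokenType.StartObject)
        {
            throw new JsonException();
        }

        var result = new Dictionary<Version, string>();
        while (reader.Read())
        {
            if (reader.TokenType == JsonTokenType.EndObject)
            {
                return result;
            }
            if (reader.TokenType != JsonTokenType.PropertyName)
            {
                throw new JsonException();
            }

            // The JSON property name is parsed back into the key type.
            Version key = Version.Parse(reader.GetString());
            reader.Read();
            result.Add(key, reader.GetString());
        }
        throw new JsonException();
    }

    public override void Write(
        Utf8JsonWriter writer, Dictionary<Version, string> value, JsonSerializerOptions options)
    {
        writer.WriteStartObject();
        foreach (KeyValuePair<Version, string> pair in value)
        {
            // The key's string form becomes the JSON property name.
            writer.WriteString(pair.Key.ToString(), pair.Value);
        }
        writer.WriteEndObject();
    }
}
```

The cost of this route is one converter per dictionary type, which is why it is positioned as the escape hatch rather than the default path.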

Sample

```cs
using System;
using System.Collections.Generic;
using System.Text.Json;

// (De)serialize into a dictionary with a non-string key.
Dictionary<int, string> root = new Dictionary<int, string>();
root.Add(1, "value");

string json = JsonSerializer.Serialize(root);
// JSON
// {
//   "1":"value"
// }

Dictionary<int, string> rootCopy = JsonSerializer.Deserialize<Dictionary<int, string>>(json);
Console.WriteLine(rootCopy[1]);
// Prints
// value
```

Strawman proposal

KeyConverter

Implement an internal custom mechanism that converts a defined set of types so they can be used as the dictionary TKey. It works more or less like the internal JsonConverters, but maps dictionary keys to JSON property names and vice versa.

  • The alternative that offers the best performance.

  • We need to define criteria for choosing which types to support. I suggest following Utf8JsonReader/Writer and supporting the types supported by the Utf8Parser/Formatter.

  • Supported types (Types supported by Utf8Formatter/Parser + a few others that are popular):

    • Boolean
    • Byte
    • DateTime
    • DateTimeOffset
    • Decimal
    • Double
    • Enum
    • Guid
    • Int16
    • Int32
    • Int64
    • Object (Only on Serialization and if the runtime type is one of the supported types in this list)
    • SByte
    • Single
    • String
    • UInt16
    • UInt32
    • UInt64
  • https://github.com/Jozkee/runtime/tree/TKey_CustomConverter
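
The core of the idea can be sketched for a single key type: format the key straight to UTF-8 for the property name, and parse the UTF-8 property name back into the key. This is a rough sketch, not the code in the branch linked above; TryWriteKey and TryReadKey are hypothetical helper names, and only Int32 is shown.

```cs
using System;
using System.Buffers.Text;
using System.Text;

// Hypothetical helper: writes an Int32 key as UTF-8 text without creating a string.
static bool TryWriteKey(int key, Span<byte> destination, out int bytesWritten)
    => Utf8Formatter.TryFormat(key, destination, out bytesWritten);

// Hypothetical helper: parses a UTF-8 property name back into an Int32 key.
// The whole name must be consumed, so "12x" is rejected.
static bool TryReadKey(ReadOnlySpan<byte> propertyName, out int key)
    => Utf8Parser.TryParse(propertyName, out key, out int consumed)
       && consumed == propertyName.Length;

Span<byte> buffer = stackalloc byte[16]; // enough for "-2147483648"
bool written = TryWriteKey(-42, buffer, out int length);

// Decoded here only for display; a writer could consume the bytes directly.
string propertyName = Encoding.UTF8.GetString(buffer.Slice(0, length));

bool parsed = TryReadKey(buffer.Slice(0, length), out int roundTripped);
bool rejected = !TryReadKey(Encoding.UTF8.GetBytes("12x"), out _);

Console.WriteLine($"{written} {propertyName} {parsed} {roundTripped} {rejected}");
// True -42 True -42 True
```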

Alternative considered

TypeConverter

Use TypeConverter to parse and write the string representation of the type and use that as the JSON property name.
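
As a rough sketch of this alternative, the key's TypeConverter can be looked up through TypeDescriptor and used in both directions. KeyToPropertyName and PropertyNameToKey are hypothetical helper names chosen for illustration; the use of the invariant-culture overloads is also an assumption.

```cs
using System;
using System.ComponentModel;

// Hypothetical helper: key -> JSON property name via the type's TypeConverter.
// The key is passed as object, so value-type keys are boxed on every call.
static string KeyToPropertyName<TKey>(TKey key)
    => TypeDescriptor.GetConverter(typeof(TKey)).ConvertToInvariantString(key);

// Hypothetical helper: JSON property name -> key via the same TypeConverter.
static TKey PropertyNameToKey<TKey>(string propertyName)
    => (TKey)TypeDescriptor.GetConverter(typeof(TKey)).ConvertFromInvariantString(propertyName);

Guid id = Guid.Parse("0f8fad5b-d9cb-469f-a165-70867728950e");
string name = KeyToPropertyName(id);
Guid back = PropertyNameToKey<Guid>(name);
string intName = KeyToPropertyName(42);

Console.WriteLine(name);       // 0f8fad5b-d9cb-469f-a165-70867728950e
Console.WriteLine(back == id); // True
Console.WriteLine(intName);    // 42
```

This path always goes through System.String and System.Object, which is one plausible contributor to the extra time and allocations reported for it in the benchmarks.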

Benchmark results

Using a dictionary that contains 100 elements.

Serialize/Write

The custom KeyConverter that calls Utf8Parser underneath performs faster than calling TypeConverter (roughly half the mean time for Guid and Int32 keys in the results below). Keep in mind that KeyConverter is a naive implementation: it also calls Encoding.UTF8.GetString, since JsonNamingPolicy.ConvertName only takes strings. This could be fixed by adding an internal method that takes a ReadOnlySpan<byte>. Allocations are also currently very high; this might be alleviated by moving the KeyConverter store to the JsonSerializerOptions.
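
The string round trip described here can be illustrated with public APIs only. This is a simplified sketch, not the prototype's code: the key is formatted to UTF-8, but has to be decoded into a string before the naming policy can see it. A Boolean key and the camel-case policy are used purely as an example.

```cs
using System;
using System.Buffers.Text;
using System.Text;
using System.Text.Json;

Span<byte> utf8Key = stackalloc byte[8];

// Step 1: the key is formatted directly to UTF-8 ("True"), with no string involved.
bool formatted = Utf8Formatter.TryFormat(true, utf8Key, out int written);

// Step 2: JsonNamingPolicy.ConvertName only accepts a string, so the bytes
// must be decoded first. This allocates one string per key.
string keyAsString = Encoding.UTF8.GetString(utf8Key.Slice(0, written));

// Step 3: the policy produces another string, which the writer then has to
// encode back to UTF-8.
string propertyName = JsonNamingPolicy.CamelCase.ConvertName(keyAsString);

Console.WriteLine($"{keyAsString} -> {propertyName}"); // True -> true
```

An internal overload that accepts the UTF-8 bytes would let steps 2 and 3 be skipped whenever no policy is configured.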

Dictionary<String, TValue> results show the same numbers across branches since that still uses DictionaryOfStringTValueConverter.

main:

| Type | Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| WriteDictionary<Dictionary<String, Int32>> | SerializeToUtf8Bytes | 8.737 us | 0.1760 us | 0.1883 us | 8.743 us | 8.487 us | 9.030 us | 0.8867 | - | - | 3760 B |
| WriteDictionary<Dictionary<String, Int32>> | SerializeUtf8ObjectProperty | 9.908 us | 0.6205 us | 0.6897 us | 9.800 us | 8.994 us | 11.343 us | 0.9392 | - | - | 4048 B |

KeyConverter:

| Type | Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| WriteDictionary<Dictionary<Guid, Int32>> | SerializeToUtf8Bytes | 9.874 us | 0.1383 us | 0.1226 us | 9.891 us | 9.660 us | 10.057 us | 1.1874 | - | - | 5 KB |
| WriteDictionary<Dictionary<Int32, Int32>> | SerializeToUtf8Bytes | 8.877 us | 0.2902 us | 0.3105 us | 8.770 us | 8.534 us | 9.554 us | 0.5553 | - | - | 2.41 KB |
| WriteDictionary<Dictionary<String, Int32>> | SerializeToUtf8Bytes | 8.859 us | 0.2583 us | 0.2871 us | 8.828 us | 8.456 us | 9.484 us | 0.8803 | - | - | 3.67 KB |
| WriteDictionary<Dictionary<Guid, Int32>> | SerializeUtf8ObjectProperty | 10.155 us | 0.2284 us | 0.2136 us | 10.124 us | 9.818 us | 10.647 us | 1.2779 | - | - | 5.28 KB |
| WriteDictionary<Dictionary<Int32, Int32>> | SerializeUtf8ObjectProperty | 8.633 us | 0.2301 us | 0.2558 us | 8.640 us | 8.275 us | 9.143 us | 0.6482 | - | - | 2.69 KB |
| WriteDictionary<Dictionary<String, Int32>> | SerializeUtf8ObjectProperty | 8.845 us | 0.1065 us | 0.0831 us | 8.864 us | 8.666 us | 8.949 us | 0.9470 | - | - | 3.95 KB |

TypeConverter:

| Type | Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| WriteDictionary<Dictionary<Guid, Int32>> | SerializeToUtf8Bytes | 19.067 us | 0.6886 us | 0.7930 us | 18.971 us | 18.249 us | 20.809 us | 4.2226 | - | - | 17.5 KB |
| WriteDictionary<Dictionary<Int32, Int32>> | SerializeToUtf8Bytes | 16.022 us | 0.2106 us | 0.1970 us | 16.021 us | 15.726 us | 16.331 us | 2.1591 | - | - | 9.01 KB |
| WriteDictionary<Dictionary<String, Int32>> | SerializeToUtf8Bytes | 8.236 us | 0.1465 us | 0.1370 us | 8.232 us | 8.020 us | 8.477 us | 0.8754 | - | - | 3.67 KB |
| WriteDictionary<Dictionary<Guid, Int32>> | SerializeUtf8ObjectProperty | 18.688 us | 0.3887 us | 0.4476 us | 18.726 us | 18.111 us | 19.540 us | 4.3006 | - | - | 17.78 KB |
| WriteDictionary<Dictionary<Int32, Int32>> | SerializeUtf8ObjectProperty | 15.688 us | 0.2953 us | 0.3032 us | 15.658 us | 15.225 us | 16.271 us | 2.2737 | - | - | 9.29 KB |
| WriteDictionary<Dictionary<String, Int32>> | SerializeUtf8ObjectProperty | 8.690 us | 0.1620 us | 0.1591 us | 8.688 us | 8.435 us | 8.969 us | 0.9363 | - | - | 3.95 KB |

Deserialize/Read

main:

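| Type | Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ReadDictionary<Dictionary<String, Int32>> | DeserializeFromUtf8Bytes | 22.05 us | 0.439 us | 0.470 us | 22.10 us | 21.23 us | 23.16 us | 4.0872 | - | - | 17176 B |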

KeyConverter:

| Type | Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ReadDictionary<Dictionary<Guid, Int32>> | DeserializeFromUtf8Bytes | 29.35 us | 0.724 us | 0.805 us | 29.23 us | 28.50 us | 31.51 us | 5.0274 | - | - | 20.72 KB |
| ReadDictionary<Dictionary<Int32, Int32>> | DeserializeFromUtf8Bytes | 20.11 us | 0.313 us | 0.278 us | 20.08 us | 19.77 us | 20.50 us | 2.7725 | - | - | 11.48 KB |
| ReadDictionary<Dictionary<String, Int32>> | DeserializeFromUtf8Bytes | 21.68 us | 0.453 us | 0.522 us | 21.73 us | 20.97 us | 22.79 us | 4.0213 | - | - | 16.77 KB |

TypeConverter:

| Type | Method | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| ReadDictionary<Dictionary<Guid, Int32>> | DeserializeFromUtf8Bytes | 34.83 us | 0.669 us | 0.593 us | 34.83 us | 33.84 us | 36.17 us | 5.8045 | - | - | 23.84 KB |
| ReadDictionary<Dictionary<Int32, Int32>> | DeserializeFromUtf8Bytes | 26.39 us | 0.448 us | 0.419 us | 26.33 us | 25.79 us | 27.43 us | 3.3389 | - | - | 13.82 KB |
| ReadDictionary<Dictionary<String, Int32>> | DeserializeFromUtf8Bytes | 21.73 us | 0.378 us | 0.336 us | 21.77 us | 21.20 us | 22.29 us | 4.0248 | 0.1750 | - | 16.77 KB |

Prior-art

Newtonsoft.Json

On write:

  • If the TKey is a concrete primitive type*:

    • It calls Convert.ToString(), except for the following types:
      • DateTime (uses the DateFormatHandling specified in options)
      • DateTimeOffset
      • Double (uses double.ToString("R")) // 'R' stands for round-trip
      • Single
      • Enum (uses an internal helper method)
  • If the TKey is object or non-primitive:
    • It calls the TypeConverter of the TKey runtime type, except for:
      • Type, which returns the AssemblyQualifiedName.
    • If the type does not have a TypeConverter, it calls ToString() on the TKey instance.

* A primitive type is a type cataloged as such by Json.NET in this list.

On read:

  • If the TKey is a concrete type:
    • If it is a primitive that implements IConvertible:
      • It calls Convert.ChangeType(propertyName, concreteType), but first tries to manually convert these types:
        • Enum
        • DateTime
        • BigInteger
    • If the type does not implement IConvertible:
      • It tries to manually convert or cast to the concrete type using a few custom helper methods.
  • If the TKey is object, the entries' keys will be of type string if they are quoted (Newtonsoft supports unquoted properties).
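
The write and read rules above can be approximated with BCL calls alone. This is not Newtonsoft.Json's actual code; it only illustrates, under the assumption of invariant-culture formatting, why IConvertible primitives can go through Convert while a type such as Guid needs a manual path.

```cs
using System;
using System.Globalization;

// Write side: most primitives go through Convert.ToString.
string written = Convert.ToString(42, CultureInfo.InvariantCulture);

// Double is special-cased with the round-trip format.
string roundTripDouble = 0.1.ToString("R", CultureInfo.InvariantCulture);

// Read side: Convert.ChangeType works when the target implements IConvertible.
object readBack = Convert.ChangeType("42", typeof(int), CultureInfo.InvariantCulture);

// Guid does not implement IConvertible, so Convert.ChangeType cannot produce it
// from a string and a manual conversion (e.g. Guid.Parse) is needed instead.
bool guidNeedsManualConversion;
try
{
    Convert.ChangeType("0f8fad5b-d9cb-469f-a165-70867728950e", typeof(Guid),
        CultureInfo.InvariantCulture);
    guidNeedsManualConversion = false;
}
catch (InvalidCastException)
{
    guidNeedsManualConversion = true;
}

Console.WriteLine($"{written} {roundTripDouble} {readBack} {guidNeedsManualConversion}");
// 42 0.1 42 True
```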

Utf8Json

Supported types:

  • String
  • Numerics (int, float, etc.)
  • Enum
  • Guid
  • Boolean
  • Nullables of all the above types

Jil

Supported types:

  • String
  • Numerics (int, long, etc. Note: does not support floating point types)
  • Enum

Notes

  1. DictionaryKeyPolicy will apply to the resulting string of the non-string types.
  2. Should we provide a way for users to customize the EnumKeyConverter behavior, as is done in JsonStringEnumConverter? As of now KeyConverters are meant to be internal types; to enable the previously described behavior we would pass the options either through JsonSerializerOptions or through an attribute.
  3. Discuss support for object as the TKey type on deserialization: should we support it in this enhancement? object is treated as a JsonElement on deserialization and is not part of the supported types on the Utf8Parser/Formatter. Consider deferring it until we add support for intuitive types (parse keys as string, etc. instead of JsonElement).