Benchmark

Summary

  • JavaScript execution in napajs is on par with node when both use the same version of V8, as expected.
  • zone.execute scales linearly with the number of workers, as expected.
  • The overhead of calling zone.execute from node is around 0.1ms after warm-up. The cost of using an anonymous function is negligible.
  • transport.marshall on small plain JavaScript values costs about 3x as much as JSON.stringify.
  • The overhead of store.set and store.get is around 0.06ms plus the transport overhead on the objects.

These results were measured in the following environment:

| Name | Value |
| --- | --- |
| Processor | Intel(R) Xeon(R) CPU L5640 @ 2.27GHz, 8 virtual processors |
| System Type | x64-based PC |
| Physical Memory | 16.0 GB |
| OS version | Microsoft Windows Server 2012 R2 |

Napa vs. Node on JavaScript execution

Please refer to node-napa-perf-comparison.ts.

| node time | napa time |
| --- | --- |
| 3026.76 | 3025.81 |

Linear scalability

zone.execute scales linearly with the number of workers. We performed 1M CRC32 calls on a 1024-length string on each worker; the numbers are shown below. We still need to understand why runs with more workers in parallel can finish faster than runs with fewer workers.

|  | node | napa - 1 worker | napa - 2 workers | napa - 4 workers | napa - 8 workers |
| --- | --- | --- | --- | --- | --- |
| time | 8,649521600 | 6146.98 | 4912.57 | 4563.48 | 6168.41 |
| cpu% | ~15% | ~15% | ~27% | ~55% | ~99% |

Please refer to execute-scalability.ts for test details.
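
Because every worker performs the same 1M calls, "linear" here means that the wall-clock time stays roughly flat as workers are added. The short calculation below makes that explicit by turning the napa figures from the table into a speedup relative to one worker. It is only an arithmetic sketch: the variable and function names are made up for illustration, and the time unit is not stated in the source.

```typescript
// Figures copied from the table above; the time unit is not stated in the source.
const callsPerWorker = 1000000;
const singleWorkerTime = 6146.98;
const napaRuns = [
    { workers: 1, time: 6146.98 },
    { workers: 2, time: 4912.57 },
    { workers: 4, time: 4563.48 },
    { workers: 8, time: 6168.41 },
];

// Total CRC32 calls completed per unit of time across all workers.
function throughput(workers: number, time: number): number {
    return (workers * callsPerWorker) / time;
}

// Speedup relative to the single-worker run; ideal linear scaling gives 1, 2, 4, 8.
const baseline = throughput(1, singleWorkerTime);
const speedups = napaRuns.map(run => ({
    workers: run.workers,
    speedup: throughput(run.workers, run.time) / baseline,
}));

for (const s of speedups) {
    console.log(`${s.workers} worker(s): ${s.speedup.toFixed(2)}x`);
}
```

With these figures the ratios come out at roughly 1.00x, 2.50x, 5.39x and 7.97x. The 2-worker and 4-worker runs are above the ideal line, which is the open question noted above.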

Execute overhead

The overhead of zone.execute includes

  1. Marshalling cost of arguments in caller thread.
  2. Queuing time before a worker can execute.
  3. Unmarshalling cost of arguments in target worker.
  4. Marshalling cost of return value from target worker.
  5. Queuing time before caller callback is notified.
  6. Unmarshalling cost of return value in caller thread.

This section examines #2 and #5, so the test uses an empty function with no arguments and no return value.

Transport overhead (#1, #3, #4, #6) varies with the size and complexity of the payload and is benchmarked separately in the Transport overhead section.

Please refer to execute-overhead.ts for test details.

Overhead after warm-up

Average overhead is around 0.06ms to 0.12ms for zone.execute.

| repeat | zone.execute (ms) |
| --- | --- |
| 200 | 24.932 |
| 5000 | 456.893 |
| 10000 | 810.687 |
| 50000 | 3387.361 |

*10000 calls of zone.execute on an anonymous function take 807.241ms. The gap is within the range of benchmark noise.
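
The 0.06ms to 0.12ms range quoted above is the total time divided by the number of calls. The snippet below redoes that division on the table's figures and also expresses the anonymous-function gap per call. The names are illustrative only.

```typescript
// Totals copied from the table above: number of calls and total time in ms.
const measurements = [
    { repeat: 200, totalMs: 24.932 },
    { repeat: 5000, totalMs: 456.893 },
    { repeat: 10000, totalMs: 810.687 },
    { repeat: 50000, totalMs: 3387.361 },
];

// Average overhead of a single zone.execute call, in ms.
const perCallMs = measurements.map(m => m.totalMs / m.repeat);

// Gap between the two 10000-call runs (810.687ms vs. 807.241ms with an
// anonymous function), spread over the 10000 calls.
const anonymousGapPerCallMs = (810.687 - 807.241) / 10000;

console.log(perCallMs.map(v => v.toFixed(3)).join(", "));
// → 0.125, 0.091, 0.081, 0.068
console.log(anonymousGapPerCallMs.toFixed(4));
// → 0.0003
```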

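Overhead during warm-up

| Sequence of call | Time (ms) |
| --- | --- |
| 1 | 6.040 |
| 2 | 4.065 |
| 3 | 5.250 |
| 4 | 4.652 |
| 5 | 1.572 |
| 6 | 1.366 |
| 7 | 1.403 |
| 8 | 1.213 |
| 9 | 0.450 |
| 10 | 0.324 |
| 11 | 0.193 |
| 12 | 0.238 |
| 13 | 0.191 |
| 14 | 0.230 |
| 15 | 0.203 |
| 16 | 0.188 |
| 17 | 0.188 |
| 18 | 0.181 |
| 19 | 0.185 |
| 20 | 0.182 |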

Transport overhead

The overhead of transport.marshall includes

  1. Overhead of needing a replacer callback during JSON.stringify (even an empty callback slows down JSON.stringify significantly).
  2. Traversing every value during JSON.stringify to check the value type and get the cid to put into the payload.
    • a. The value doesn't need special care.
    • b. The value is a transportable object that needs special care.

2.b depends on the individual transportable class and may vary from class to class, so this test examines only #1 and #2.a.
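
Item #1 can be observed without napajs at all. The sketch below times plain JSON.stringify against JSON.stringify with a replacer that returns every value unchanged. The payload builder is a hypothetical stand-in (the actual payloads are defined in the test file), the names are illustrative, and the timings will differ from the table depending on machine and V8 version.

```typescript
// Hypothetical nested payload of plain integers; not the benchmark's own builder.
type Plain = number | { [key: string]: Plain };

function buildPayload(levels: number, width: number): Plain {
    if (levels === 0) {
        return 42;
    }
    const node: { [key: string]: Plain } = {};
    for (let i = 0; i < width; i++) {
        node["key" + i] = buildPayload(levels - 1, width);
    }
    return node;
}

// Replacer that changes nothing but forces one callback per visited value.
const identityReplacer = (_key: string, value: unknown): unknown => value;

// Run fn `repeat` times and return the elapsed wall-clock time in ms.
function timeMs(repeat: number, fn: () => void): number {
    const start = Date.now();
    for (let i = 0; i < repeat; i++) {
        fn();
    }
    return Date.now() - start;
}

const samplePayload = buildPayload(2, 10);
const plainMs = timeMs(1000, () => { JSON.stringify(samplePayload); });
const replacerMs = timeMs(1000, () => { JSON.stringify(samplePayload, identityReplacer); });

console.log(`JSON.stringify: ${plainMs}ms, with identity replacer: ${replacerMs}ms`);
```

Both calls produce the same string, so any difference in time comes from invoking the callback for every value.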

The overhead of transport.unmarshall includes

  1. Overhead of needing a reviver callback during JSON.parse.
  2. Traversing every value during JSON.parse to check whether the object has a _cid property.

We also evaluate only #1 and #2.a in this test.
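
To make the roles of the replacer and reviver concrete, here is a minimal sketch of the pattern described above. It is not napajs's implementation: `Point`, `rebuilders`, `sketchMarshall` and `sketchUnmarshall` are hypothetical names, and only the `_cid` property name is taken from the description. Plain values take the pass-through path (case 2.a), while a `Point` takes the special-care path (case 2.b).

```typescript
// Hypothetical class standing in for a transportable object.
class Point {
    x: number;
    y: number;
    constructor(x: number, y: number) {
        this.x = x;
        this.y = y;
    }
}

// Hypothetical registry: class id -> function that rebuilds an instance.
type Rebuilder = (payload: { [key: string]: unknown }) => unknown;
const rebuilders = new Map<string, Rebuilder>();
rebuilders.set("Point", p => new Point(Number(p["x"]), Number(p["y"])));

// The replacer runs for every value. Points are tagged with _cid (2.b),
// everything else is returned unchanged (2.a).
function sketchMarshall(value: unknown): string {
    return JSON.stringify(value, (_key, v) =>
        v instanceof Point ? { _cid: "Point", x: v.x, y: v.y } : v);
}

// The reviver also runs for every value and rebuilds only objects that
// carry a known _cid.
function sketchUnmarshall(json: string): unknown {
    return JSON.parse(json, (_key, v) => {
        if (v !== null && typeof v === "object" && typeof v._cid === "string") {
            const rebuild = rebuilders.get(v._cid);
            if (rebuild !== undefined) {
                return rebuild(v);
            }
        }
        return v;
    });
}

const roundTripped = sketchUnmarshall(sketchMarshall({ a: 1, p: new Point(2, 3) })) as { a: number; p: Point };
console.log(roundTripped.p instanceof Point, roundTripped.p.x, roundTripped.p.y);
```

Even when no value carries a `_cid`, both callbacks are still invoked once per value, which is the cost this benchmark isolates.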

Please refer to transport-overhead.ts for test details.

*All operations are repeated 1000 times.

| payload type | size | JSON.stringify (ms) | transport.marshall (ms) | JSON.parse (ms) | transport.unmarshall (ms) |
| --- | --- | --- | --- | --- | --- |
| 1 level - 10 integers | 91 | 4.90 | 18.05 (3.68x) | 3.50 | 17.98 (5.14x) |
| 1 level - 100 integers | 1081 | 65.45 | 92.78 (1.42x) | 20.45 | 122.25 (5.98x) |
| 10 level - 2 integers | 18415 | 654.40 | 2453.37 (3.75x) | 995.02 | 2675.72 (2.69x) |
| 2 level - 10 integers | 991 | 19.74 | 66.82 (3.39x) | 27.85 | 138.45 (4.97x) |
| 3 level - 5 integers | 1396 | 33.66 | 146.33 (4.35x) | 51.54 | 189.07 (3.67x) |
| 1 level - 10 strings - length 10 | 201 | 3.81 | 10.17 (2.67x) | 9.46 | 20.81 (2.20x) |
| 1 level - 100 strings - length 10 | 2191 | 76.53 | 115.74 (1.51x) | 77.71 | 181.24 (2.33x) |
| 2 level - 10 strings - length 10 | 2091 | 30.15 | 97.65 (3.24x) | 95.51 | 213.20 (2.23x) |
| 3 level - 5 strings - length 10 | 2646 | 41.95 | 155.42 (3.71x) | 123.82 | 227.90 (1.84x) |
| 1 level - 10 strings - length 100 | 1101 | 7.74 | 12.19 (1.57x) | 17.34 | 29.83 (1.72x) |
| 1 level - 100 strings - length 100 | 11191 | 66.17 | 112.83 (1.71x) | 197.67 | 282.63 (1.43x) |
| 2 level - 10 strings - length 100 | 11091 | 68.46 | 149.99 (2.19x) | 202.85 | 298.19 (1.47x) |
| 3 level - 5 integers | 13896 | 89.46 | 208.21 (2.33x) | 265.25 | 418.42 (1.58x) |
| 1 level - 10 booleans | 126 | 2.84 | 8.14 (2.87x) | 3.06 | 14.20 (4.65x) |
| 1 level - 100 booleans | 1341 | 20.28 | 59.36 (2.93x) | 21.59 | 121.15 (5.61x) |
| 2 level - 10 booleans | 1341 | 23.92 | 89.62 (3.75x) | 31.84 | 137.92 (4.33x) |
| 3 level - 5 booleans | 1821 | 36.15 | 138.24 (3.82x) | 55.71 | 195.50 (3.51x) |

Store access overhead

The overhead of store.set includes

  1. Overhead of calling transport.marshall on value.
  2. Overhead of putting the marshalled data and transport context into a C++ map (with exclusive_lock).

The overhead of store.get includes

  1. Overhead of getting marshalled data and transport context from C++ map (with shared_lock).
  2. Overhead of calling transport.unmarshall on marshalled data.

For store.set, the numbers below indicate that the cost beyond marshalling is around 0.07ms to 0.4ms per operation, varying with payload size (10B to 18KB). store.get takes a bit more: 0.06ms to 0.9ms over the same range of payload sizes. If a value in the store is not updated frequently, it is worth caching it on the JavaScript side.

Please refer to store-overhead.ts for test details.
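
The set/get steps listed above, and the reason caching helps, can be pictured with a small in-process stand-in. This is a sketch under loud assumptions: `SketchStore` is hypothetical, it uses a JavaScript `Map` and plain JSON in place of the C++ map, the locks and transport.marshall, and `getCached` is an illustrative helper rather than a napajs API.

```typescript
// Hypothetical stand-in for a store: values are kept in marshalled (string) form.
class SketchStore {
    private readonly entries = new Map<string, string>();

    // set = marshall the value, then put the marshalled data into the map.
    set(key: string, value: unknown): void {
        this.entries.set(key, JSON.stringify(value));
    }

    // get = fetch the marshalled data, then unmarshall it on every call.
    get(key: string): unknown {
        const marshalled = this.entries.get(key);
        return marshalled === undefined ? undefined : JSON.parse(marshalled);
    }
}

const sketchStore = new SketchStore();
sketchStore.set("config", { retries: 3 });

// Illustrative JavaScript-side cache: unmarshall once, then reuse the object.
const localCache = new Map<string, unknown>();
function getCached(key: string): unknown {
    if (!localCache.has(key)) {
        localCache.set(key, sketchStore.get(key));
    }
    return localCache.get(key);
}

console.log(sketchStore.get("config") === sketchStore.get("config")); // false: parsed twice
console.log(getCached("config") === getCached("config")); // true: parsed once
```

Every `get` in this sketch parses the stored string again and returns a fresh object, which is why caching a rarely updated value avoids repeated work. A cached copy will not see later `set` calls unless it is invalidated.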

*All operations are repeated 1000 times.

| payload type | size | transport.marshall (ms) | store.set (ms) | transport.unmarshall (ms) | store.get (ms) |
| --- | --- | --- | --- | --- | --- |
| 1 level - 1 integers | 10 | 2.54 | 73.85 | 3.98 | 65.57 |
| 1 level - 10 integers | 91 | 8.27 | 98.55 | 17.23 | 90.89 |
| 1 level - 100 integers | 1081 | 97.10 | 185.31 | 144.75 | 274.39 |
| 10 level - 2 integers | 18415 | 2525.18 | 2973.17 | 3093.06 | 3927.80 |
| 2 level - 10 integers | 991 | 71.22 | 174.01 | 154.76 | 276.04 |
| 3 level - 5 integers | 1396 | 127.06 | 219.73 | 182.27 | 337.59 |
| 1 level - 10 strings - length 10 | 201 | 14.43 | 79.68 | 31.28 | 84.71 |
| 1 level - 100 strings - length 10 | 2191 | 104.40 | 212.44 | 173.32 | 239.09 |
| 2 level - 10 strings - length 10 | 2091 | 79.54 | 188.72 | 189.29 | 252.83 |
| 3 level - 5 strings - length 10 | 2646 | 155.14 | 257.78 | 276.22 | 342.95 |
| 1 level - 10 strings - length 100 | 1101 | 15.22 | 89.84 | 30.87 | 88.18 |
| 1 level - 100 strings - length 100 | 11191 | 119.89 | 284.05 | 287.17 | 403.77 |
| 2 level - 10 strings - length 100 | 11091 | 137.10 | 299.32 | 244.13 | 297.12 |
| 3 level - 5 integers | 13896 | 183.84 | 310.89 | 285.80 | 363.50 |
| 1 level - 10 booleans | 126 | 5.74 | 49.89 | 22.69 | 97.27 |
| 1 level - 100 booleans | 1341 | 57.41 | 157.80 | 106.30 | 218.05 |
| 2 level - 10 booleans | 1341 | 76.93 | 150.25 | 104.02 | 185.82 |
| 3 level - 5 booleans | 1821 | 102.47 | 171.44 | 150.42 | 207.27 |
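
The per-operation store cost quoted earlier can be recovered from this table by subtracting the transport column from the store column and dividing by the 1000 repetitions. The snippet below does this for the smallest and largest payloads. The names are illustrative, and the store.set figures are taken from the table's store write column.

```typescript
// Totals in ms for 1000 operations, copied from the smallest and largest rows above.
const storeRows = [
    { payload: "1 level - 1 integers", marshall: 2.54, set: 73.85, unmarshall: 3.98, get: 65.57 },
    { payload: "10 level - 2 integers", marshall: 2525.18, set: 2973.17, unmarshall: 3093.06, get: 3927.80 },
];
const repetitions = 1000;

// Cost per operation beyond marshalling (set) and beyond unmarshalling (get), in ms.
const storeOverheads = storeRows.map(r => ({
    payload: r.payload,
    setMs: (r.set - r.marshall) / repetitions,
    getMs: (r.get - r.unmarshall) / repetitions,
}));

for (const o of storeOverheads) {
    console.log(`${o.payload}: set +${o.setMs.toFixed(2)}ms, get +${o.getMs.toFixed(2)}ms`);
}
// → 1 level - 1 integers: set +0.07ms, get +0.06ms
// → 10 level - 2 integers: set +0.45ms, get +0.83ms
```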