
CyberRT Performance Report



Introduction

This report compares the old and new versions of CyberRT (Apollo 9.0 and Apollo 10.0) for cross-process and cross-machine transmission under different transmission conditions. It also presents detailed transmission metrics for typical module and sensor workloads (Functional Module, Normal Sensor, and High-End Sensor). All results were generated with the cyber_benchmark benchmarking tool.

Environment

Platform 1:

<table> <tbody> <tr> <td rowspan="3">Hardware</td> <td>cpu</td> <td>Intel(R) Core(TM) i9-9900K</td> </tr> <tr> <td>memory</td> <td>2 x Innodisk M4S0-AGS1OCIK DDR4 16GiB 2667 MHz</td> </tr> <tr> <td>store</td> <td>Samsung SSD 980 500GB</td> </tr> <tr> <td rowspan="3">Software</td> <td>system</td> <td>Ubuntu 18.04.5 2021.09.12 LTS</td> </tr> <tr> <td>kernel</td> <td>5.4.0-150-generic</td> </tr> <tr> <td>CyberRT version</td> <td>Apollo 9.0 / Apollo 10.0</td> </tr> </tbody> </table>

Platform 2:

<table> <tbody> <tr> <td rowspan="3">Hardware</td> <td>cpu</td> <td>8 core Arm® Cortex®-A78AE v8.2</td> </tr> <tr> <td>memory</td> <td>32GiB 256 bit LPDDR5 Onboard Memory 204.8GB/s</td> </tr> <tr> <td>store</td> <td>KINGSTON OM8PGP41024Q-A0</td> </tr> <tr> <td rowspan="3">Software</td> <td>system</td> <td>Ubuntu 20.04.4 LTS</td> </tr> <tr> <td>kernel</td> <td>5.10.104-tegra</td> </tr> <tr> <td>CyberRT version</td> <td>Apollo 9.0 / Apollo 10.0</td> </tr> </tbody> </table>

Cross-process transmission performance test results

This test examines the performance of the old and new versions of CyberRT across different message sizes and sending frequencies.

Different message sizes

All tests were performed with message sizes of 16B, 1KB, 64KB, ..., 5MB, and 10MB, at a sending frequency of 100 Hz.
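To put this sweep in perspective, the load a writer offers to a channel is simply message size times sending frequency. The following is a minimal sketch of that arithmetic, not part of the benchmark itself. It assumes binary units (1 KB = 1024 B), which the report does not state, and uses only sizes that are named explicitly above; `offered_load_bytes_per_s` is a hypothetical helper name.

```python
# Offered load = message size x sending frequency.
# Assumption: binary units (1 KB = 1024 B); the report does not specify this.
KB = 1024
MB = 1024 * KB

# Only sizes named explicitly in the text; the elided sizes are left out.
SIZES_BYTES = {"16B": 16, "1KB": 1 * KB, "64KB": 64 * KB, "10MB": 10 * MB}
FREQUENCY_HZ = 100  # sending frequency used for the message-size sweep


def offered_load_bytes_per_s(size_bytes: int, frequency_hz: int) -> int:
    """Bytes per second a writer produces at the given size and rate."""
    return size_bytes * frequency_hz


for label, size in SIZES_BYTES.items():
    load = offered_load_bytes_per_s(size, FREQUENCY_HZ)
    print(f"{label:>5} @ {FREQUENCY_HZ} Hz -> {load / MB:.4f} MB/s")
```

Under these assumptions, the largest case (10MB at 100 Hz) corresponds to 1000 MB/s on a single channel, while the smallest (16B at 100 Hz) is only 1600 bytes per second.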

The following images show CPU usage, memory usage, message transfer latency, and packet loss for different message sizes on Platform 1. In this test, CyberRT in Apollo 10.0 has arena zero-copy communication enabled, with the arena shared memory configured to 1 GB:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

The following images show CPU usage, memory usage, message transfer latency, and packet loss for different message sizes on Platform 2. In this test, CyberRT in Apollo 10.0 has arena zero-copy communication enabled, with the arena shared memory configured to 1 GB:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

Different sending frequencies

All tests were performed at sending frequencies of 10 Hz, 20 Hz, 50 Hz, and 100 Hz, with a message size of 1 MB.

The following images show CPU usage, memory usage, message transfer latency, and packet loss rate at different sending frequencies on Platform 1. In this test, CyberRT in Apollo 10.0 has arena zero-copy communication enabled, with the arena shared memory configured to 1 GB:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

The following images show CPU usage, memory usage, message transfer latency, and packet loss rate at different sending frequencies on Platform 2. In this test, CyberRT in Apollo 10.0 has arena zero-copy communication enabled, with the arena shared memory configured to 1 GB:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

Performance of Apollo 10.0 CyberRT in different scenarios for cross-process transfers on Platform 1

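<table> <tbody>
<tr> <td>Scenario</td> <td>Message size / frequency</td> <td>CPU usage</td> <td>Latency</td> <td>Message loss rate</td> <td>Memory usage</td> </tr>
<tr> <td>Functional Module (perception, planning, etc.)</td> <td>64 KB / 10 Hz</td> <td>9.14%</td> <td>84.6 us</td> <td>0.0%</td> <td>250 MB</td> </tr>
<tr> <td>High-Frequency Functional Module (localization)</td> <td>64 KB / 100 Hz</td> <td>9.71%</td> <td>69.54 us</td> <td>0.0%</td> <td>250 MB + 1024 MB arena shared memory</td> </tr>
<tr> <td>Normal Sensor Module</td> <td>1 MB / 10 Hz</td> <td>8.47%</td> <td>82.29 us</td> <td>0.0%</td> <td>250 MB + 1024 MB arena shared memory</td> </tr>
<tr> <td>High-End Sensor Module</td> <td>10 MB / 10 Hz</td> <td>5.55%</td> <td>58.95 us</td> <td>0.0%</td> <td>250 MB + 1024 MB arena shared memory</td> </tr>
</tbody> </table>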
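One way to read the latencies above is relative to the sending period (1 / frequency), which indicates how much of each cycle is spent in transport. The sketch below is illustrative only: it copies the frequency and latency values from the table, and `latency_share_percent` is a hypothetical helper name, not part of cyber_benchmark.

```python
# Cross-process results on Platform 1, copied from the table above:
# scenario -> (sending frequency in Hz, measured latency in microseconds)
CROSS_PROCESS = {
    "Functional Module": (10, 84.6),
    "High-Frequency Functional Module": (100, 69.54),
    "Normal Sensor Module": (10, 82.29),
    "High-End Sensor Module": (10, 58.95),
}


def latency_share_percent(frequency_hz: float, latency_us: float) -> float:
    """Latency as a percentage of the sending period (1 / frequency)."""
    period_us = 1_000_000 / frequency_hz
    return latency_us / period_us * 100


for name, (freq_hz, latency_us) in CROSS_PROCESS.items():
    share = latency_share_percent(freq_hz, latency_us)
    print(f"{name}: {latency_us} us is {share:.4f}% of a {freq_hz} Hz period")
```

For all four scenarios, the measured transfer latency stays below 1% of the sending period.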

Cross-machine transfer performance test results

This test examines the performance of the old and new versions of CyberRT across different message sizes and sending frequencies.

Different message sizes

All tests were performed with message sizes of 1KB, 64KB, ..., 5MB, and 10MB, at a sending frequency of 100 Hz.

The following images show CPU usage, memory usage, message transfer latency, and packet loss rate for different message sizes on Platform 1. CyberRT in Apollo 10.0 is based on FastDDS 2.x, while CyberRT in Apollo 9.0 is based on fastrtps 1.5:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

The following images show CPU usage, memory usage, message transfer latency, and packet loss rate for different message sizes on Platform 2. CyberRT in Apollo 10.0 is based on FastDDS 2.x, while CyberRT in Apollo 9.0 is based on fastrtps 1.5:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

Different sending frequencies

All tests were performed at sending frequencies of 10 Hz, 20 Hz, 50 Hz, and 100 Hz, with a message size of 1 MB.

The following images show CPU usage, memory usage, message transfer latency, and packet loss rate at different sending frequencies on Platform 1. CyberRT in Apollo 10.0 is based on FastDDS 2.x, while CyberRT in Apollo 9.0 is based on fastrtps 1.5:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

The following images show CPU usage, memory usage, message transfer latency, and packet loss rate at different sending frequencies on Platform 2. CyberRT in Apollo 10.0 is based on FastDDS 2.x, while CyberRT in Apollo 9.0 is based on fastrtps 1.5:

<table> <tbody> <tr> <td></td> <td></td> </tr> <tr> <td></td> <td></td> </tr> </tbody> </table>

Performance of Apollo 10.0 CyberRT in different scenarios for cross-machine transfers on Platform 1

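<table> <tbody>
<tr> <td>Scenario</td> <td>Message size / frequency</td> <td>CPU usage</td> <td>Latency</td> <td>Message loss rate</td> <td>Memory usage</td> </tr>
<tr> <td>Functional Module (perception, planning, etc.)</td> <td>64 KB / 10 Hz</td> <td>7.75%</td> <td>391 us</td> <td>0.0%</td> <td>247 MB</td> </tr>
<tr> <td>High-Frequency Functional Module (localization)</td> <td>64 KB / 100 Hz</td> <td>10.3%</td> <td>369 us</td> <td>0.0%</td> <td>249 MB</td> </tr>
<tr> <td>Normal Sensor Module</td> <td>1 MB / 10 Hz</td> <td>8.84%</td> <td>2124 us</td> <td>0.0%</td> <td>251 MB</td> </tr>
<tr> <td>High-End Sensor Module</td> <td>10 MB / 10 Hz</td> <td>18.8%</td> <td>18886 us</td> <td>0.0%</td> <td>288 MB</td> </tr>
</tbody> </table>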
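The cross-machine latencies can be set against the cross-process latencies reported earlier for the same four scenarios on Platform 1. The sketch below only divides the published numbers. The two paths use different transports (arena shared memory versus FastDDS), so the ratio is an illustration of scale rather than a like-for-like benchmark. `latency_ratio` is a hypothetical helper name.

```python
# Latencies in microseconds for Apollo 10.0 on Platform 1,
# copied from the cross-process and cross-machine scenario tables.
CROSS_PROCESS_US = {
    "Functional Module": 84.6,
    "High-Frequency Functional Module": 69.54,
    "Normal Sensor Module": 82.29,
    "High-End Sensor Module": 58.95,
}
CROSS_MACHINE_US = {
    "Functional Module": 391.0,
    "High-Frequency Functional Module": 369.0,
    "Normal Sensor Module": 2124.0,
    "High-End Sensor Module": 18886.0,
}


def latency_ratio(cross_machine_us: float, cross_process_us: float) -> float:
    """How many times slower the cross-machine path is for one scenario."""
    return cross_machine_us / cross_process_us


ratios = {
    name: latency_ratio(CROSS_MACHINE_US[name], CROSS_PROCESS_US[name])
    for name in CROSS_PROCESS_US
}

for name, ratio in ratios.items():
    print(f"{name}: cross-machine latency is {ratio:.1f}x cross-process")
```

With these figures, the gap widens with message size: roughly 5x for the 64 KB scenarios, about 26x at 1 MB, and more than 300x at 10 MB.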