pinot-tools/src/main/resources/generator/README.md
Mock data has many use cases, from testing and benchmarking to portable application demos. The generator configs in this directory produce synthetic time series data for an imaginary website. These patterns can generate gigabytes of mock data if you need that much.
simpleWebsite generates non-dimensional data with views, clicks, and error count metrics
complexWebsite generates similar metrics with a 3-dimensional breakdown across countries, browsers, and platforms
The command-line examples below are meant to be run from the Pinot repository root. (They were tested with pinot-quickstart in batch mode and require DefaultTenant and a broker.)
This first step generates the raw data from a given generator file. By default, we generate the data as CSV, and you can have a look manually with your favorite spreadsheet tool.
(You may first need to run rm -rf ./myTestData to clear out existing mock data.)
./pinot-tools/target/pinot-tools-pkg/bin/pinot-admin.sh GenerateData \
-numFiles 1 -numRecords 354780 -format csv \
-schemaFile ./pinot-tools/src/main/resources/generator/complexWebsite_schema.json \
-schemaAnnotationFile ./pinot-tools/src/main/resources/generator/complexWebsite_generator.json \
-outDir ./myTestData
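If you prefer a quick terminal check over a spreadsheet, the following is a minimal sketch for inspecting the generated output. It assumes the generator writes files with a .csv extension whose first line is a header row and whose last line ends with a newline. The function name summarize_csv_dir is made up for this example.

```shell
# Print the data row count and the header line of every CSV file in a directory.
# Assumption: each file starts with a header line, so data rows = total lines - 1.
summarize_csv_dir() {
  dir="$1"
  for f in "$dir"/*.csv; do
    # If the glob matched nothing, "$f" is the literal pattern and does not exist.
    [ -e "$f" ] || { echo "no CSV files in $dir" >&2; return 1; }
    rows=$(($(wc -l < "$f") - 1))
    echo "$f: $rows data rows"
    head -n 1 "$f"
  done
}
```

Calling summarize_csv_dir ./myTestData after the GenerateData step should list one file per -numFiles value, with a row count close to the -numRecords value if the assumptions above hold.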
Now we turn the verbose CSV data into an efficiently packed segment, ready for upload into Pinot.
./pinot-tools/target/pinot-tools-pkg/bin/pinot-admin.sh CreateSegment \
-tableConfigFile ./pinot-tools/src/main/resources/generator/complexWebsite_config.json \
-format CSV -overwrite \
-schemaFile ./pinot-tools/src/main/resources/generator/complexWebsite_schema.json \
-dataDir ./myTestData \
-outDir ./myTestSegment
Before we push the segment, let's make sure the target table exists. You can skip this step if you already created the table earlier.
./pinot-tools/target/pinot-tools-pkg/bin/pinot-admin.sh AddTable -exec \
-tableConfigFile ./pinot-tools/src/main/resources/generator/complexWebsite_config.json \
-schemaFile ./pinot-tools/src/main/resources/generator/complexWebsite_schema.json
Now we upload the segment. After this step, the data should be available and queryable from the Pinot console and any connected applications.
./pinot-tools/target/pinot-tools-pkg/bin/pinot-admin.sh UploadSegment \
-tableName complexWebsite \
-segmentDir ./myTestSegment
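The same four steps can be repeated for another pattern by swapping the file names. The sketch below only prints the commands rather than running them. It assumes that each pattern ships files named like those of complexWebsite (pattern_schema.json, pattern_generator.json, pattern_config.json) and that the table name equals the pattern name. The helper print_pipeline is made up for this example, and the record count is simply carried over from the example above.

```shell
# Print the four pinot-admin.sh commands for a given pattern name (dry run).
# Assumptions: <pattern>_schema.json, <pattern>_generator.json and
# <pattern>_config.json exist in the generator directory, and the table
# is named after the pattern.
print_pipeline() {
  pattern="$1"
  admin=./pinot-tools/target/pinot-tools-pkg/bin/pinot-admin.sh
  dir=./pinot-tools/src/main/resources/generator
  schema="$dir/${pattern}_schema.json"
  config="$dir/${pattern}_config.json"
  echo "$admin GenerateData -numFiles 1 -numRecords 354780 -format csv -schemaFile $schema -schemaAnnotationFile $dir/${pattern}_generator.json -outDir ./myTestData"
  echo "$admin CreateSegment -tableConfigFile $config -format CSV -overwrite -schemaFile $schema -dataDir ./myTestData -outDir ./myTestSegment"
  echo "$admin AddTable -exec -tableConfigFile $config -schemaFile $schema"
  echo "$admin UploadSegment -tableName $pattern -segmentDir ./myTestSegment"
}
```

Running print_pipeline simpleWebsite shows the commands for review. Once they look right, they can be executed one by one from the repository root.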
Finally, we can check data availability, e.g. by using Pinot's built-in query console. If you're running a local pinot-quickstart image via Docker, the URL should be:
http://localhost:9000#
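To run the same check from a script instead of the browser, a query can be posted over HTTP. The sketch below assumes that the controller at localhost:9000 accepts SQL queries via POST on /sql with a JSON body of the form {"sql": "..."}. The endpoint, the CONTROLLER_URL variable, and the helper names pinot_query_payload and pinot_query are not taken from this README, so verify them against your Pinot version.

```shell
# Build the JSON request body for a SQL query.
# Assumption: the query contains no double quotes or backslashes.
pinot_query_payload() {
  printf '{"sql": "%s"}' "$1"
}

# Post a query and print the raw JSON response.
# CONTROLLER_URL and the /sql path are assumptions; adjust them to your setup.
pinot_query() {
  curl -s -X POST -H 'Content-Type: application/json' \
    -d "$(pinot_query_payload "$1")" \
    "${CONTROLLER_URL:-http://localhost:9000}/sql"
}
```

For example, pinot_query "SELECT COUNT(*) FROM complexWebsite" should return a count matching the -numRecords value used earlier (354780) if the single generated file was uploaded completely.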