This is an internal guide to setting up and working with the data warehouse for PostHog engineers. If you're a PostHog user, check out our data warehouse docs instead.
Looking to add a new source to data warehouse? We have a detailed guide in the codebase.
If you're a PostHog Cloud customer looking to import data into your project, you're likely looking for this section of the docs instead.
temporal-worker-data-warehouse will then import the data into your local MinIO instance.

All your data warehouse data is stored in your local MinIO instance. You can view all the files by going to http://localhost:19001/ and logging in with the username object_storage_root_user and the password object_storage_root_password. There should be a data-warehouse bucket that has a separate folder for each table you sync.
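If you'd rather check the bucket from a terminal than from the MinIO console, something like the following sketch may work. It assumes the AWS CLI is installed and that MinIO's S3 API is exposed on port 19000 (only the console port, 19001, is stated above), so adjust the endpoint to match your local setup.

```shell
# Assumption: MinIO's S3 API listens on port 19000 (the console is on 19001).
ENDPOINT="http://localhost:19000"
BUCKET="data-warehouse"

# MinIO accepts its root user and password as S3 access keys.
export AWS_ACCESS_KEY_ID="object_storage_root_user"
export AWS_SECRET_ACCESS_KEY="object_storage_root_password"
export AWS_DEFAULT_REGION="us-east-1"
# Fail fast instead of retrying when MinIO is not running.
export AWS_MAX_ATTEMPTS=1

if command -v aws >/dev/null 2>&1; then
  # Each synced table should appear as its own prefix (folder) in the bucket.
  aws --endpoint-url "$ENDPOINT" s3 ls "s3://$BUCKET/" \
    || echo "Could not list $BUCKET; is MinIO running at $ENDPOINT?"
else
  echo "aws CLI not found; browse http://localhost:19001/ instead"
fi
```

If the bucket is empty, no sync has completed yet.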
If you want to set up a local MySQL database as a source for the data warehouse, there are a few extra setup steps you'll need to complete:
First, install MySQL:
brew install mysql
brew services start mysql
Once MySQL is installed, create a database and table, insert a row, and create a user who can connect to it:
mysql -u root
CREATE DATABASE posthog_dw_test;
USE posthog_dw_test;
CREATE TABLE IF NOT EXISTS payments (id INT AUTO_INCREMENT PRIMARY KEY, timestamp DATETIME, distinct_id VARCHAR(255), amount DECIMAL(10,2));
INSERT INTO payments (timestamp, distinct_id, amount) VALUES (NOW(), 'user@example.com', 99.99);
CREATE USER 'posthog'@'%' IDENTIFIED BY 'posthog';
GRANT ALL PRIVILEGES ON posthog_dw_test.* TO 'posthog'@'%';
FLUSH PRIVILEGES;
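Before connecting PostHog, it can help to confirm that the new user can actually read the table. The sketch below is meant to be run from your shell after exiting the mysql prompt. It assumes MySQL is listening on its default port on 127.0.0.1, and the variable names are only illustrative.

```shell
# Connection details matching the user and database created above.
DB_USER="posthog"
DB_PASSWORD="posthog"
DB_NAME="posthog_dw_test"
DB_HOST="127.0.0.1"  # assumption: local MySQL on the default port

if command -v mysql >/dev/null 2>&1; then
  # Connect over TCP as the posthog user and read back the test row.
  mysql -h "$DB_HOST" -u "$DB_USER" -p"$DB_PASSWORD" "$DB_NAME" \
    -e "SELECT id, timestamp, distinct_id, amount FROM payments;" \
    || echo "Query failed; check that MySQL is running and the grants were applied"
else
  echo "mysql client not found; install MySQL first"
fi
```

If the grants were applied, the query should return the single row inserted earlier.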
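To verify everything is working as expected, add the MySQL database as a source in your local PostHog instance, connecting with the user created above (the username and password are both posthog).

After the job runs, clicking on the synced table name should take you to your data.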
You'll need to install MS SQL drivers for the PostHog app to connect to an MS SQL database. The entire process is described in posthog/warehouse/README.md. Without the drivers, you'll get the following error when connecting an MS SQL database to data warehouse:
symbol not found in flat namespace '_bcp_batch'