airbyte-integrations/connectors/source-tiktok-marketing/BEHAVIOR.md
This document describes the biggest non-obvious gotchas in source-tiktok-marketing that deviate from
standard declarative connector patterns. Read this before making changes to the connector.
The connector dynamically selects between two completely different API base URLs based on the authentication type in the config:
https://sandbox-ads.tiktok.com/open_api/v1.3/https://business-api.tiktok.com/open_api/v1.3/This is evaluated via a Jinja expression in the url_base:
"sandbox-ads" if config.get('credentials', {}).get('auth_type', "") == "sandbox_access_token" else "business-api".
Why this matters: The sandbox and production APIs have different data availability and rate limits.
When using a sandbox account, it is impossible to retrieve advertiser IDs via the API (the
oauth2/advertiser/get/ endpoint does not work), which is why the config allows specifying an
advertiser_id directly. If you test changes against a sandbox account, be aware that some streams may
behave differently or return no data compared to production.
The connector uses two custom partition routers that handle advertiser IDs differently depending on whether the ID is provided in the config or fetched from the API:
SingleAdvertiserIdPerPartition: Used by most streams. If advertiser_id is in the config,
yields a single partition with that ID and skips the parent stream entirely. Otherwise, yields one
partition per advertiser ID from the advertiser_ids parent stream.MultipleAdvertiserIdsPerPartition: Used only by the advertisers stream. Batches up to 100
advertiser IDs into a single JSON array partition (e.g., '["id1", "id2", ...]'), because TikTok's
advertiser info endpoint accepts multiple IDs per request.Both routers check multiple config paths in priority order (credentials.advertiser_id then
environment.advertiser_id).
Why this matters: The advertisers stream sends advertiser IDs as a JSON array string in the
request parameter, not as individual values. If you change how advertiser IDs are partitioned, the
advertisers stream needs the batched format while all other streams need individual IDs. Mixing these
up will cause either missing data (single ID sent where array expected) or API errors (array sent where
single ID expected).
TikTok's Reporting API returns the string "-" (a literal dash) for metrics that have no data, rather
than returning null or 0. The TransformEmptyMetrics custom transformation iterates over the
metrics object in each report record and converts any "-" values to null.
Why this matters: Without this transformation, downstream schemas expecting numeric types for
metrics like spend, clicks, and impressions would receive string values, causing type errors in
destinations. Every report stream (daily, hourly, lifetime, audience, by-country) applies this
transformation. If you add a new report stream, you must include the TransformEmptyMetrics
transformation or the stream will emit invalid metric types.
TikTok's API does not use standard HTTP 429 status codes for rate limiting. Instead, it returns HTTP
200 with a code field in the JSON response body set to 40100. The error handler uses a predicate
(response.get('code') == 40100) to detect rate limiting, and a separate predicate
(response.get('code') != 0) to detect general API errors.
Why this matters: Standard HTTP status code-based rate limit detection will not work with TikTok's API. If you modify the error handler, make sure the response body code checks remain intact. The error message also specifically warns about concurrent connections with the same credentials, as TikTok's rate limits are per-access-token.
The ads stream includes a RecordFilter that drops records where modify_time is None. This is
specifically to handle TikTok's Smart+ Ad records, which can be returned by the API without a
modify_time value. Since the stream uses modify_time as its incremental cursor, records without
this field would cause cursor comparison failures.
Why this matters: This filter silently drops valid ad records from the sync output. If a user
reports missing ads data, Smart+ ads without modify_time values are the likely cause. This is a known
trade-off to maintain incremental sync reliability.
The TikTok Sandbox account has a rate limit of 10 requests per second. If you run CATs in CI while simultaneously testing locally with the same credentials, you will exceed this limit and the credentials may be temporarily restricted — preventing all requests from succeeding. The restriction appears to last a couple of hours, and there is evidence that continued request attempts during the restriction period extend the lockout duration. There is no official TikTok documentation on this restriction behavior or its exact duration.
Why this matters: Unlike most API rate limits that simply queue or retry, exceeding TikTok's sandbox rate limit can completely lock out the credentials for hours. Never run CI tests and local tests concurrently against the sandbox account. If you experience sudden 100% request failures with sandbox credentials, stop all requests and wait before retrying.