The Bulk Data Import module syncs existing commercetools data into Klaviyo. It queries data held in commercetools via its APIs and syncs that data to Klaviyo. A set of API endpoints is provided to trigger the bulk import of customers, orders, and product information.
The Bulk Data Import module can be deployed in different ways. However, because imports can be long-running, it is less suited to serverless environments (see Timeout problems and CPU allocation).
Management APIs are provided that allow an ongoing import to be terminated. To avoid multiple imports of the same type (e.g. orders) running concurrently, the module creates a lock in commercetools using custom objects.
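The locking idea can be sketched roughly as follows. This is only an illustration, not the module's actual implementation: the container name, the key format, and the stored value are all hypothetical. Only the general shape of a commercetools custom object (`container`, `key`, `value`) comes from the commercetools API.

```typescript
// Hypothetical sketch: one lock per import type, stored as a custom object.
type ImportType = "customers" | "orders" | "categories" | "products";

interface LockDraft {
  container: string;
  key: string;
  value: { lockedAt: string };
}

// Assumed container name, for illustration only.
const LOCK_CONTAINER = "bulk-import-locks";

// Assumed key format: one key per import type.
function lockKey(type: ImportType): string {
  return `${type}-lock`;
}

// Builds the custom object draft that would represent the lock.
function buildLockDraft(type: ImportType, now: Date): LockDraft {
  return {
    container: LOCK_CONTAINER,
    key: lockKey(type),
    value: { lockedAt: now.toISOString() },
  };
}

// An import of a given type may start only if no lock with its key exists.
function canStartImport(type: ImportType, existingLockKeys: string[]): boolean {
  return !existingLockKeys.includes(lockKey(type));
}
```

With this shape, an orders import is refused while `orders-lock` exists, but a products import can still start.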
A sample Dockerfile is available in this repository. It can be used to build a Docker image that can be deployed, for example, on Cloud Run. Note that horizontally scaling the Bulk Data Import module should be avoided. If Bulk Data Import is only expected to run as a one-off or ad-hoc process, it can even be run on a local machine.
Serverless technologies typically limit the maximum execution time. If the amount of data to import is very large, the import might take longer than the timeout.
Possible solutions are:
- Run the module from a local machine if the bulk data import is a one-off task.
- Use a non-serverless deployment service, for example VMs that have CPU always allocated.
- Check the logs for the ID of the last item imported before the process timed out, then restart the import from that ID using the partial import APIs.
Some serverless technologies (e.g. Cloud Run) allocate CPU only during request processing by default. Once called, a bulk import API accepts the request and immediately returns an HTTP 202 response; the import process then runs in the background. In this case, CPU must be allocated to the service at all times to prevent the background import process from being killed.
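The "accept now, work later" behaviour described here can be illustrated with a small sketch. The function and type names are hypothetical and the response body is invented for the example; it only shows why the work continues after the HTTP response has been sent.

```typescript
// Sketch of the accept-then-run-in-background pattern (names are hypothetical).
type Job = () => Promise<void>;

interface AcceptedResponse {
  status: 202;
  body: { message: string };
}

function acceptAndRunInBackground(
  job: Job,
  onError: (error: unknown) => void,
): AcceptedResponse {
  // Deliberately not awaited: the job keeps running after this function returns.
  Promise.resolve().then(job).catch(onError);
  // The caller receives 202 immediately, before the job has done any work.
  return { status: 202, body: { message: "Import accepted" } };
}
```

Because the job outlives the request, a platform that throttles CPU once the response is sent can stall or kill it.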
The bulk data import module uses the following environment variables. Those marked as required must be set for the module to start:
NOTE: While `CT_SCOPE` is not mandatory for deployment, the listed scopes are required for the component to function correctly. The `manage_subscriptions` scope is not required for bulk import functionality. However, if you are sharing the same commercetools credentials with the realtime component, you should include `manage_subscriptions` in the scope to prevent post-deploy failures related to insufficient scope when making requests to commercetools. While specifying the scope in your environment variables is not required, you do need to make sure your API credentials are created with the necessary scopes.
| NAME | VALUE | Required | Example |
|---|---|---|---|
| CT_API_URL | commercetools API URL | Yes | https://api.us-central1.gcp.commercetools.com |
| CT_AUTH_URL | commercetools auth URL | Yes | https://auth.us-central1.gcp.commercetools.com |
| CT_PROJECT_ID | commercetools project ID | Yes | my-project-prod |
| CT_SCOPE | commercetools API client scopes. The following scopes are required for the bulk import module to function correctly: view_orders view_published_products view_products manage_key_value_documents view_customers view_payments | No | view_orders:project-key view_published_products:project-key view_products:project-key manage_key_value_documents:project-key view_customers:project-key view_payments:project-key |
| KLAVIYO_AUTH_KEY | Klaviyo private API key | Yes | pk_1234567890 |
| CT_API_CLIENT | commercetools API client ID and secret | Yes | {"clientId":"the-ct-client-id","secret":"the-ct-client-secret"} |
| APP_TYPE | BULK_IMPORT | No | Prevents the real-time sync module from being started |
| BULK_IMPORT_PORT | 6779 | No | To change the default (6779) bulk import API server port |
| PRODUCT_URL_TEMPLATE | https://example-store.com/products/{{productSlug}} | No | Set the template used for product URLs in Klaviyo, references frontend URLs (productSlug will be replaced by the product slug set in commercetools) |
| PREFERRED_LOCALE | your preferred locale for certain localized strings | No | Set your (optional) preferred locale to be used when getting strings from LocalizedString properties, like product/category names for the Klaviyo catalogue |
| PREFERRED_CURRENCY | your preferred currency for certain price object arrays | No | Set your (optional) preferred currency to be used when getting prices from products, for Klaviyo catalogue items and custom_metadata |
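
A startup check for these variables might look like the sketch below. It is not the module's own validation code: the function names are hypothetical, and the list of required names is simply taken from the rows marked "Yes" in the table above.

```typescript
// Names marked "Yes" in the Required column of the table above.
const REQUIRED_ENV = [
  "CT_API_URL",
  "CT_AUTH_URL",
  "CT_PROJECT_ID",
  "KLAVIYO_AUTH_KEY",
  "CT_API_CLIENT",
] as const;

type Env = Record<string, string | undefined>;

// Returns the required variables that are unset or empty.
function missingRequiredEnv(env: Env): string[] {
  return REQUIRED_ENV.filter((name) => !env[name]);
}

interface CtApiClient {
  clientId: string;
  secret: string;
}

// CT_API_CLIENT holds JSON, as shown in the Example column.
function parseCtApiClient(raw: string): CtApiClient {
  const parsed: unknown = JSON.parse(raw);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("CT_API_CLIENT must be a JSON object");
  }
  const { clientId, secret } = parsed as Record<string, unknown>;
  if (typeof clientId !== "string" || typeof secret !== "string") {
    throw new Error("CT_API_CLIENT must contain clientId and secret strings");
  }
  return { clientId, secret };
}
```

In a Node.js process, `missingRequiredEnv(process.env)` would list anything still missing before the server starts.
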
| Endpoint | Purpose | Notes |
|---|---|---|
| `/sync/customers` | Imports all existing customers into Klaviyo | |
| `/sync/orders` | Imports all applicable orders as Klaviyo events for each customer | Has a very high rate-limit, unlikely to cause issues |
| `/sync/categories` | Imports all categories into Klaviyo Catalog | Uses basic catalog endpoints from Klaviyo, might rate limit with high category counts |
| `/sync/products` | Imports all published products into Klaviyo Catalog | Uses job-based catalog endpoints from Klaviyo, should hold with large datasets |
For /sync/categories and /sync/products, the request body can include "deleteAll": true and "confirmDeletetion": "products" (or "categories") to trigger a complete deletion of these resources from the Klaviyo Catalog. Keep in mind this DOES NOT differentiate between data that came from the plugin and data that might have been imported or created from another source, which is why both properties are required in the request body to start the process.
Additionally, all endpoints shown above support adding /stop to the URL to cancel the process. This only stops the process: modifications already made are not reverted, and any import tasks still running on Klaviyo servers will still complete.
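As a rough sketch, the deletion body and the stop URL could be built as follows. The helper names are hypothetical. The property names are copied exactly as written in this document, including the spelling of `confirmDeletetion`. The HTTP method and content type for these calls are not stated here, so the sketch only builds the body and URL.

```typescript
type CatalogResource = "categories" | "products";
type SyncResource = "customers" | "orders" | CatalogResource;

// Body for a complete catalog deletion: both properties must be present.
// "confirmDeletetion" is spelled exactly as it appears in this document.
function deleteAllBody(resource: CatalogResource): string {
  return JSON.stringify({ deleteAll: true, confirmDeletetion: resource });
}

// URL that cancels a running import of the given type.
function stopUrl(baseUrl: string, resource: SyncResource): string {
  // Strip trailing slashes so the path is not doubled up.
  return `${baseUrl.replace(/\/+$/, "")}/sync/${resource}/stop`;
}
```

For a local deployment on the default port, `stopUrl("http://localhost:6779", "orders")` yields `http://localhost:6779/sync/orders/stop`.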
Setting up bulk import to run on a local machine is straightforward. Just follow these steps:
- Head to the `plugin` directory and run `yarn install` to install all dependencies.
- Copy the `.env.test` file to `.env` and set the required environment variables. Remove or change any other variables as needed. `.env.test` may contain variables that are not needed for your use case, or may be missing some. Double-check the environment variables above to avoid issues.
- Run `yarn run start-ts` to start the plugin. The port used for any of the components will be shown in your console.
- Open Postman or a similar tool and prepare a POST request with the right URL. For example: `http://localhost:6779/sync/customers`.
- Send the request. If all went well, you should get a `2XX` status code right away.
- Monitor progress in your console. You'll get a summary of imported/errored items at the end.
- Errors will be logged along the way, so a decently sized console buffer is recommended.
- Errors similar to `Product with ID <id> does not exist in Klaviyo` are expected, since checks are performed before creating/updating items in Klaviyo.
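The request in the steps above can also be sent from a short script instead of Postman. The following is a minimal sketch, assuming Node.js 18 or later (for the built-in `fetch`) and a module listening on the default port; the helper names are hypothetical.

```typescript
type ImportEndpoint = "customers" | "orders" | "categories" | "products";

// Builds the sync URL for one import type.
function syncUrl(baseUrl: string, endpoint: ImportEndpoint): string {
  return `${baseUrl.replace(/\/+$/, "")}/sync/${endpoint}`;
}

// Sends the POST request and resolves to the HTTP status code.
// Throws when the response is not a 2XX status.
async function triggerSync(baseUrl: string, endpoint: ImportEndpoint): Promise<number> {
  const response = await fetch(syncUrl(baseUrl, endpoint), { method: "POST" });
  if (!response.ok) {
    throw new Error(`Sync request failed with status ${response.status}`);
  }
  return response.status;
}

// Example (not executed here):
// triggerSync("http://localhost:6779", "customers").then((status) => console.log(status));
```

A 2XX status only means the import was accepted; progress and errors still appear in the module's console output.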
Also, do keep in mind there are sequences/rules that should be followed when importing data:
- Customers and Orders don't have a strict dependency on each other, but importing Customers first is strongly recommended.
- Categories must be imported before Products, since there's a dependency between them.
- Products must have at least one (1) image. Prices are optional, but recommended.
- Undefined prices will send a price of 0 (zero) to Klaviyo, regardless of currency.
- If you set `PREFERRED_CURRENCY`, at least one price must match that currency; otherwise the resulting price will be 0. If it is not set, the first price found will be picked.
- Prices with expiration dates that are still within range are preferred over basic prices.
- Prices with past expiration dates will be ignored. For future dates, the closest one will be used and the rest will be ignored.
- For products, in cases where more than one locale/currency/inventory channel is defined, only one will be chosen and imported based on configuration and priorities.
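The price rules above can be read as a small selection function. The sketch below is one possible interpretation, not the module's actual code: `SimplePrice` is a simplified, hypothetical shape rather than the commercetools price type, and the way the "first price found" rule combines with expiration dates is an assumption.

```typescript
// Simplified, hypothetical price shape (not the full commercetools type).
interface SimplePrice {
  currencyCode: string;
  amount: number;
  validUntil?: Date; // expiration date, if any
}

function selectPrice(prices: SimplePrice[], now: Date, preferredCurrency?: string): number {
  // Keep only the preferred currency when one is configured.
  const candidates = preferredCurrency
    ? prices.filter((p) => p.currencyCode === preferredCurrency)
    : prices;

  // Unexpired dated prices, closest expiration first; expired ones are dropped.
  const dated = candidates
    .filter(
      (p): p is SimplePrice & { validUntil: Date } =>
        p.validUntil !== undefined && p.validUntil.getTime() >= now.getTime(),
    )
    .sort((a, b) => a.validUntil.getTime() - b.validUntil.getTime());
  const [closest] = dated;
  if (closest) {
    return closest.amount;
  }

  // Otherwise use the first basic price (no expiration date), or 0 if none.
  const basic = candidates.find((p) => p.validUntil === undefined);
  return basic ? basic.amount : 0;
}
```

Under this reading, a product with no prices, or with no price in the preferred currency, is sent to Klaviyo with a price of 0.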
The bulk import component is intended as a one-off process, although it can be rerun periodically as needed. It doesn't ship with any options to run import jobs on a schedule.
Scheduling within the component itself would require code changes. As a workaround, any tool or combination of tools capable of performing requests on a schedule (e.g. a combination of cron and curl) would allow the user to schedule import jobs of any given type.
Regardless of the method used, keep in mind that logs need to be checked manually and that certain operations depend on existing data from other operations (see previous section).
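As an alternative to cron and curl, the same workaround could be scripted externally. The sketch below is hypothetical and not part of the component: it assumes Node.js 18 or later for the built-in `fetch`, triggers one import type once a day at a fixed UTC hour, and does not wait for the import itself to finish.

```typescript
// Milliseconds from `now` until the next occurrence of `hourUtc`:00 UTC.
function msUntilNextRun(now: Date, hourUtc: number): number {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hourUtc),
  );
  // If that time has already passed today, move to tomorrow.
  if (next.getTime() <= now.getTime()) {
    next.setUTCDate(next.getUTCDate() + 1);
  }
  return next.getTime() - now.getTime();
}

// Sends one POST per day to the given sync URL and logs the outcome.
function scheduleDailyImport(url: string, hourUtc: number): void {
  const run = async (): Promise<void> => {
    try {
      const response = await fetch(url, { method: "POST" });
      console.log(`Import request returned ${response.status}`);
    } catch (error) {
      console.error("Import request failed", error);
    } finally {
      // Re-arm the timer for the next day.
      setTimeout(run, msUntilNextRun(new Date(), hourUtc));
    }
  };
  setTimeout(run, msUntilNextRun(new Date(), hourUtc));
}

// Example (not executed here):
// scheduleDailyImport("http://localhost:6779/sync/customers", 3);
```

Since the endpoint responds before the import completes, dependent imports (for example categories before products) would still need to be spaced out or verified through the logs.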