Insight data

What is Insight data?

Insight data is storable information related to your customers’ purchasing histories and habits. It allows you to store collections of data against your contacts or account using our API.

So they’re additional contact data fields, extending the contact data fields we already store?

Yes – but with the prime difference being these are transaction-based ones. Any additional keyed data you have for a contact can be stored as Insight data, with very little restriction on the type of data it can be. As a broad rule, you can store anything that is serializable to JSON.

Just as you can currently segment upon contact data fields such as gender, age and geography, Insight data gives you the ability to segment upon types of items purchased, the regularity of purchases and amount spent on purchases, for instance. As ever, the quality of your Insight data dictates how well you can segment your contacts and personalize your campaigns and offers.

Dataset examples

Here are some examples of datasets that could be stored as Insight data:

  • For an auction site, a list of all bids a user has made and if they succeeded or not, including the date and time of the bid, the amount of the bid and the category and name of the item bid on.
  • For a travel agent, a list of destinations a user has visited via bookings on the site, including the number of bookings for the destination (solo, couple, family, etc.,), the amount spent and the type of accommodation.
  • A list of a user’s likes and dislikes.

The key point is that many different types of Insight data can be stored for each contact, and structured in a way of your choosing.

What can I do with this data?

Once Insight data is stored against your contacts, you can use it to write queries to segment your contacts against this data and create new lists. This enables you to send targeted, personalized campaigns based upon your contacts’ transactional history and habits. Using the above examples, you can run segmentations based on bids made, countries visited and likes and dislikes.

For more on segmenting Insight data, read our article on Insight data segmentation.

How is Insight data stored?

This section provides information and guidelines on storing your Insight data. This is a technical guide, so if you want a non-developers account of Insight data, check out this article.

Keys are used to refer to individual pieces of data

At its heart, Insight data is a key-value pair storage mechanism; you store a piece of data against a contact, using a key to refer to it. If you need to update or delete a piece of data at a later date, you use the key as a reference to retrieve the piece of data and alter it.

Restrictions on keys

Keys must:

  • Be unique (it is up to the user of the API to ensure uniqueness)

Keys must not:

  • Exceed 255 characters in length

Keys can only contain the following characters:

  • Alphanumeric (a-z, A-Z, 0-9)
  • Hyphen ("-")
  • Underscore ("_")

Data is stored against a contact or as account-scoped data

A contact identifier (email address, mobile number or ContactID) is submitted with each Insight data document uploaded. If the identifier used is "account" rather than an email address, then the record is stored as account-scoped Insight data. This is Insight data not linked to a specific contact record, for example, a product catalog.

JSON data representation

All data to be stored must be serializable as JSON. This means complex object structures can be used to represent data. For example, you could represent the contents of a user’s shopping basket like this:

{  
   "basketId": 4858,  
   "datePurchased": "2013-03-19T17:22:54.8042",  
   "productPurchased": [{  
           "name": "a dvd",  
           "cost": 10.99,  
           "isOnSale": false  
       }, {  
           "name": "a book",  
           "cost": 2.99,  
           "isOnSale": true  
       }  
    ]  
}

There are a few restrictions and extensions to the JSON format as outlined below.

Restrictions on collection names

Valid collection names can only contain the following characters:

  • Alphanumeric (a-z, A-Z, 0-9)
  • Hyphen ("-")
  • Underscore ("_")

Furthermore, collection names can't:

  • Begin with a number
  • Exceed 255 characters in length
  • Have exactly the same name as any other collection, even if the collections are differently scoped (e.g. one collection is contact-scoped and another collection is account-scoped)

Restrictions and extensions on values

  • String values are restricted to 1000 characters in length.
  • Date time values are also supported via a subset of the ISO 8601 standard (e.g. 2013-03-19T17:22:55.804Z)
  • Lists must be of the same type. Lists may only contain either Booleans, strings, numbers, objects, lists, etc.

See the JSON specification for all other value types.

📘

A note on date time formatting

Not all ISO 8601 formats are supported (for example the string "2023" would not be treated as a date). The following are valid for this implementation.

Complete date plus hours, minutes and seconds:  
YYYY-MM-DDThh:mm:ssTZD (e.g. 1997-07-16T19:20:30+01:00)  

Complete date plus hours, minutes, seconds and a decimal fraction of a second  
YYYY-MM-DDThh:mm:ss.sTZD (e.g. 1997-07-16T19:20:30.45+01:00)

where:

 YYYY = four-digit year
 MM   = two-digit month (01=January, etc.)
 DD   = two-digit day of month (01 through 31)
 hh   = two digits of hour (00 through 23) (am/pm NOT allowed)
 mm   = two digits of minute (00 through 59)
 ss   = two digits of second (00 through 59)
 s    = one or more digits representing a decimal fraction of a second
 TZD  = time zone designator (Z or +hh:mm or -hh:mm)

Note: All times without a time zone designator will be treated as UTC.

Retention policy

A retention policy is available for any insight data collection which includes a date type field in its records. The retention policy is a tool that allows you to expire insight data records after a period of time that you set. Expiring insight data means deleting it from your account. This tool can be helpful in managing your insight data allowance.

For a date field to be available to use for the purposes of setting a retention policy, it must be included in the root (top level) of the JSON structure.

JSON stores data in schema-bound collections

While you can store any JSON structure as Insight data, it is not strictly schemaless. Insight data requires that you store similarly structured data in a collection. Each collection may have a different schema, allowing you to store a variety of different data against contacts.

The schema of a collection is defined ‘on the fly’

Schemas are defined implicitly upon uploading data via the API for the first time. Furthermore, schemas are extendable but not editable.

For example, let’s say you’ve just uploaded the following object into a collection you’ve created called preferences:

{  
   "likes": {  
       "animals": [  
          {  
             "animal": "dogs"  
          },  
          {  
             "animal":"cats"  
          }  
       ],  
       "numbers": [  
          {  
             "number": 1,  
          },  
          {  
             "number": 2,  
          },  
          {  
             "number": 7  
          }  
       ]  
   },  
   "dislikes": {  
       "animals": [  
           {  
               "animal": "rats"  
           }  
        ],  
       "numbers": [13]

   }  
}

For all proceeding uploads, the preferences collection will expect an object with two properties, likes and dislikes, each of which refer to an object where each object has a property called animals which is a list of strings and numbers which is a list of numbers.

Types are immutable

Once you’ve defined a named property, its type can’t be changed. For example, if you upload this JSON object to define your schema for a collection:

{  
   "id": "1",  
   "name": "Tom Bloggs"  
}

You couldn’t then upload the following object:

{  
   "id": 2,  
   "name": {  
       "first": "john",  
       "second": "smith"  
   }  
}

Why? Because the id property has been changed from a string to a number and the name property has been changed from a string to an object.

Objects are extendable

The only things that are mutable are objects because they are extendable. Properties can be added to an object at any time (however they are then bound by the above rule). For example, if this object was uploaded to define a collection’s schema:

{  
   "fullName": "John Bloggs"  
}

Then uploading this object would extend the collection's schema to include the two new properties, firstName and lastName:

{  
   "fullName": "John Bloggs"  
   "firstName": "John",  
   "lastName": "Bloggs"  
}

📘

Get the schema for an Insight data collection

If you are having trouble inserting new records into a collection due to schema mismatches you can:

  • Retrieve the schema using the get insight data collection schema call, and study the schema to see where your new record differs.
  • Empty the collection using the empty insight data collection call, which empties all records and resets the schema. You can then start again and define the schema using the first record you add to the collection.

Bulk import Insight data

We strongly encourage the use of our bulk calls to add Insight data to Dotdigital, as bulk calls can be many factors more efficient than making individual calls.

Import using the API

2060

Bulk data upload calling pattern

The most efficient pattern is to pass your data to us in the fewest calls to bulk import Insight data to add or update insight data to either an account or contacts. To achieve this it is important to batch your data within your system and then send it to us. You can send a request up to 50MB, so many records can be sent in a single call.

You can perform multiple bulk imports at a time for an account. Typically we'd suggest 5 as a maximum, for optimal speed and performance. You should then poll periodically (recommend every 5 seconds) to check on the status of your bulk import using the retrieve Insight data import status call passing the import ID the bulk import Insight data returned. When imports complete you can then add more data using the bulk import Insight data call.

🚧

Ensure contacts already exist to successfully bulk add transactional data to them

You need to ensure that the contacts you're bulk adding transactional data to already exist in the account. Contacts aren't created in this operation.

Import using the app

To import using the app, check out the article Import insight data.

🚧

Plan ahead

Given the nature of Insight data storage, you should spend some time thinking about how your Insight data needs to work for you. A little bit of planning goes a long way: what sort of data do you want to store? How might you want to extend this data as you move forward?

The point to remember is that you don’t want to become backed into a corner by initiating a schema that later proves constrictive. For example, you probably want to avoid entering your product ID as an integer. This won’t give you very easily identifiable information and you won’t be able to change it once it is uploaded. To change it, you would need to start again with a new schema and re-upload all of your data to conform to it. This is definitely something to be avoided!

Data schemas

Set out below are the standard order and product Insight data schemas as commonly used by our integrations. These schemas are still extendable, however. Certain attributes are mandatory for using our product recommendations feature. These are indicated in bold.

Order Insight data schema

The order collection is used to provide sales information on a per contact basis. This information is used to power our product recommendations and ecommerce analytics and persona rating.

📘

This is a contact scoped collection

Collection name and type

When creating the collection using the create Insight data collection call the order collection must be named orders and the type must be orders

Fields

🚧

You must provide the required fields and correct collection name

Bold fields indicate that they are required for our product recommendations feature and other commerce analytics.

FieldTypeRequiredNotes
idstringno
order_totalnumericalyes
paymentstringno
delivery_methodstringno
delivery_totalnumericalno
currencystringyesThe currency attribute must be one that is supported. You can find a list of supported currency codes and how to format them in the Currency conversion article.
order_statusstringno
emailstringno
quote_idstringno
purchase_datedateyes
billing_addressobjectno
billing_address_1stringno
billing_address_2stringno
billing_citystringno
billing_countrystringno
billing_postcodestringno
delivery_addressobjectno
delivery_address_1stringno
delivery_address_2stringno
delivery_citystringno
delivery_countrystringno
delivery_postcodestringno
productsarrayyes
namestringyes
pricenumericalyesThis price should correspond to the item's unit price
skustringyes
qtynumericalyes
order_subtotalnumericalyes
base_subtotal_incl_taxnumericalno
discount_amountnumericalnoThe discount amount that's deducted from the order total. This should be a positive value.
couponCodestringno

Cart Insight data schema

The cart collection is used to provide the information needed to support our abandoned cart features.

📘

This is a contact scoped collection

Collection name and type

When creating the collection using the create Insight data collection call the order collection must be named cartinsight and the type must be cartInsight

Fields

🚧

You must provide the required fields and correct collection name

Bold fields indicate that they are required for our abandoned cart features to operate correctly

FieldTypeRequiredNotes
cartidstringyes
cartphasestringyes
currencystringyesThe currency attribute must be one that is supported. You can find a list of supported currency codes and how to format them in the Currency conversion article.
subtotalnumericalyes
taxamountnumericalyesShould be 0 is taxes aren’t available or calculated yet.
grandtotalnumericalyesThe final total of the order, including tax, discounts, etc., if known
discountAmountnumericalnoThe amount discounted on the entire order. This should be a positive value.
carturlstringyes
createdatedate timeyesUTC in ISO8601 format
modifieddatedate timeyesUTC in ISO8601 format
lineitemsarrayyes
skustringyes
namestringyes
quantitynumericalyes
unitpricenumericalyesThis price should correspond to the item's unit price
totalpricenumericalyes
imageurlstringyes
producturlstringyes
programIDnumericalnoSystem fields used by Dotdigital native abandoned cart engine
cartDelaystringnoSystem fields used by Dotdigital native abandoned cart engine

Product Insight data schema

📘

This is an account scoped collection

The product catalog collection can be used to provide Dotdigital with one or more product catalogs that can be correlated with any order data and used to provide product recommendations.

Collection name and type

When creating the collection using the create Insight data collection call the product catalog collections must be named with a prefix of catalog_ (e.g. catalog_snowAndFun) and the type must be catalog

Fields

🚧

You must provide the required fields and correct collection name

Bold fields indicate that they are required for our product recommendations feature and other commerce analytics.

FieldTypeRequiredNotes
idstringnoA unique id for this product
parent_idstringnoThis field is used to indicate whether the product is linked to a parent (or configurable) product. The value will be the ID of that parent product. If the product is standalone or is itself a parent product, the value can be blank or remain as the parent ID respectively.
namestringyes
pricenumericalyes
specialPricenumericalno
price_incl_taxnumericalno
specialPrice_incl_taxnumericalno
urlstringyes
skustringyes
stocknumericalno
typestringnoThis field is used to define the kind of product (in terms of hierarchy). Some possible values include 'Configurable', 'Bundle', 'Variant' or 'Virtual'. The possible values will be dictated by your ecommerce platform.
statusstringno
image_pathstringyes
categoriesarraynoThe categories field provides a way to link a product to a category or a set of categories. Using the categories field requires to synchronize a separate categories__ insight collection, as described further below. The categories field must be an array of objects, including an id field as a string value. These id(s) provided need(s) to match category ids available in the categoriescollection.
idobjectyesExample:
[
{ “id” : ”123” },
{ “id”: “456” }
]

Categories Insight data schema

📘

This is an account scoped collection

Categories insight collections can be used to store the categories of your products, as defined in your ecommerce store or ERP system. It allows you to segment contacts based on the product categories they purchase, or apply filters to your catalog or product recommendations.

Collection name and type

When creating the collection using the create Insight data collection call the product categories collection must be named with a prefix of categories_ and have a suffix matching the product catalog it relates to; the type must be productCategories

For example:

For the catalog SnowYo_Retail you should create categories_SnowYo_Retail

Fields

🚧

You must provide the required fields and correct collection name

Bold fields indicate that they are required for our product recommendations feature and other commerce analytics.

FieldTypeRequiredNotes
idstringyes
namestringyes
parent_idstringnoparent_id can be used to create parent-child hierachry between categories. For example, if 'Clothing' has the 'id':'123', 'Dresses' ('id':'456') could have 'parent_id' set to '123'. If a category has not got any defined parent, you can use 'parent_id':'0'.