Bringing the work of your data science team into your Segment environment using SQL traits

This article will focus on how different business teams use different tools to create value and how, in the context of Segment, Segment Engage uses SQL traits to create synergies and compatibilities between your data warehouse and the teams that operate them.


Why would you like Segment to work together with your data warehouse?

Imagine a scenario where on one hand you have a Customer Data Platform - Segment - that operates as the beating heart of your customer data. The data that is collected fuels applications in the fields of product analytics (Mixpanel, Amplitude, etc), marketing automation (Iterable, Braze, etc), sales (Salesforce, Hubspot, etc) and brings behavioural data all the way into the warehouse (Google, AWS, Azure, etc). In this case Segment fuels your use cases for marketing teams, product teams and brings additional behavioural data to your customer data warehouse.


On the other hand you have a team of brilliant data scientists working on heaps of data stored in your data warehouse. This team builds models that predict churn, propensity to buy, next best offers or event customer lifetime value. For each of your customers they’re able to compute a set of metrics and values that would be of interest for marketing, product and sales teams to act upon. But that data lives in the data warehouse and not in the CDP as we need the computational power and flexibility that warehouses offer.


The worst that can happen, which is a reality for many businesses, is that the brilliant and hard work of the data science team isn’t contributing at scale to the core of your organisation (i.e: marketing, sales, support, etc) as the data is stuck.


Enter Engage SQL traits


If you’re a Segment Engage customer (previously known as persona’s), the next chapter is for you.


The blueprint


A “classical Segment blueprint” in which Segment Engage would be used and which aligns with the scenario above would resemble something like in the picture below:

There are two main interlinked perimeters:

  • The Segment perimeter In the blueprint above this is defined as being Segment as a platform. Including the elements listed below:

  • Sources - Segment sources are Software Development Kits (SDKs), APIs or Server libraries that are used to send data into Segment from website, servers, mobile applications or other platforms.

  • Destinations - The Segment sources mentioned above send data to Segment destinations. These destinations forward the data to 3rd party applications such as Braze, Facebook, etc. or even Data Warehouse solutions - to which we’ll come back later.

  • Engage - Segment Engage, previously known as Twilio Engage or Segment Personas, allows Segment operators to build computed traits and audiences. This is where SQL traits reside.

  • The data warehouse perimeter This is being defined as your customer data warehouse infrastructure. Depending on the context of the organisation you’re in, you’ll be calling this data lake, data warehouse, data mart or any other equivalent. In this case we do not need to go into details of this piece of infrastructure and we’re mostly concerned by the elements listed below:

  • Data landing zone - This is the part of the data warehouse perimeter infrastructure where data is written to. Data captured by Segment can be sent to a data warehouse through a data warehouse Segment destination. Segment will send the data into your data warehouse of choice according to some predefined schemas.

  • Output production table - This table includes a clean data set that is ready to be ingested by Segment.

for real-time personalisation.

Building Segment Engage SQL traits


Inside the Engage audience tab you’ll be able to find a section called computed traits. Computed traits are a Segment feature that allows you to add additional user traits to your user data ingested by Segment without the need for actually using an identify call. One of the possible custom traits is a SQL trait.

SQL traits allow for Segment to read data from a number of different data warehouses and add these traits to users inside Segment in order to enrich destinations, take them into account for building audiences or simply make them available through the profile API for real-time personalization.

During the configuration you’ll need to select a data warehouse. At the time of writing the following options are available. A full and up-to-date list can always be consulted here.

Then it’s time to write your SQL query. The builder tab allows you to write raw SQL to retrieve the values you require for each of your users.


Segment started to provide some SQL templates to calculate RFM modelling, read schemas linked to different tools, etc.

A simple example can be found below. We’ll be using lifetime value as an example.


The result can be found below. Each userID for which a value “ltv” was found in the bigquery table now has a computed proprietary liveftime_value.

This can now be used to build audiences and send these to marketing automation tools, enrich analytics platforms etc from a single place.

The refresh of this data depends on the Segment contract you have but can happen up to every hour ensuring you have near real-time data and update values for your users.


Conclusion


In the above example we saw how Segment Engage SQL traits is a Segment feature that can help you to leverage the work between the marketing, product teams and data science teams in an environment where Segment exists and additional metrics are being calculated within a data warehouse. Segment SQL traits allows you to leverage the work from your data science teams and apply it for personalisation and automation within your existing Segment customer data infrastructure.


Do you need help? Reach out!