Aggregating data between two steps lets you combine their outputs, either to save the result or to launch the following step with the new inputs.
This is what we call an Aggregation View.
The beauty of an aggregation view is that it automatically aggregates all of the data for you, based on your filter(s).
Let's say that you have:
- Output Step #1: company_url + company_name
- Output Step #3: company_url + number_employees + website
The aggregation is going to output: company_url + company_name + number_employees + website
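To make the merge concrete, here is a minimal Python sketch of the idea. It is not Captain Data's implementation: the `merge_outputs` helper and the sample values are made up for illustration, and it assumes the two records were already matched by your filter (here, the same company_url).

```python
def merge_outputs(first: dict, second: dict) -> dict:
    """Merge two step outputs; on shared fields the first step's value wins."""
    merged = dict(first)
    for field, value in second.items():
        merged.setdefault(field, value)
    return merged


# Made-up sample values shaped like the two outputs listed above.
step_1 = {"company_url": "https://example.com/acme", "company_name": "Acme"}
step_3 = {
    "company_url": "https://example.com/acme",
    "number_employees": 120,
    "website": "https://acme.example",
}

aggregated = merge_outputs(step_1, step_3)
print(list(aggregated))
# ['company_url', 'company_name', 'number_employees', 'website']
```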
Captain Data offers two types of aggregation: as input or as output.
Input Aggregation
Aggregating inputs means you're already on a specific step: the idea is to aggregate inputs inside that step.
In the following example, under the automation "Extraction LinkedIn..." we have a "Create an aggregation View" button 👇
Output Aggregation
Whenever you need to aggregate data and save the result, you'll need an independent task: a new step in your workflow.
To find the aggregation step, search for the Captain Data application, then select "Generic Aggregation".
How does it work?
To aggregate data, you'll need two previous steps, plus the filters that join them:
- FOR EACH - The first step you choose is your index: for each result of this first step, Captain Data applies your filters to COMBINE it with the second step's results
- COMBINE - The second step enriches the first step's data
- WHERE - Finally, you apply a set of filters against any data fields (see below for an example)
- ADD A FILTER - You can add an unlimited number of optional filters, joined with "AND" or "OR", to control the aggregation's behavior
Note that, at the moment, if you select "OR" for one filter, every other filter must also be "OR".
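If it helps to picture FOR EACH / COMBINE / WHERE as code, the logic can be sketched roughly as below. This is only an illustration under assumptions: the `aggregate` function, the `same_url` filter and the sample rows are hypothetical, a single operator is applied to all filters (mirroring the note above), and index rows without a match are simply dropped here, which may differ from what the product does.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]
Filter = Callable[[Record, Record], bool]


def aggregate(index_rows: List[Record], combine_rows: List[Record],
              filters: List[Filter], operator: str = "AND") -> List[Record]:
    """FOR EACH index row, COMBINE it with second-step rows WHERE the filters pass."""
    check = all if operator == "AND" else any
    merged_rows = []
    for index_row in index_rows:            # FOR EACH
        for other_row in combine_rows:      # COMBINE
            if check(f(index_row, other_row) for f in filters):  # WHERE
                # Index fields take priority when both rows share a field.
                merged_rows.append({**other_row, **index_row})
    return merged_rows


def same_url(index_row: Record, other_row: Record) -> bool:
    """Hypothetical filter: both rows point at the same company URL."""
    return index_row["company_url"] == other_row["company_url"]


step_1 = [
    {"company_url": "https://example.com/acme", "company_name": "Acme"},
    {"company_url": "https://example.com/globex", "company_name": "Globex"},
]
step_3 = [{"company_url": "https://example.com/acme", "number_employees": 120}]

rows = aggregate(step_1, step_3, [same_url])
print(rows)  # only the Acme row is enriched; Globex has no match in this sketch
```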
Example
Let's say you want to create a workflow to find specific leads on LinkedIn and that you need to enrich each profile with their company data and email address.
This is the workflow that could fit your use case:
The "Find Verified Email Dropcontact" step requires the following fields as input:
- full_name
- company_name
- website (optional)
For this last step to work, we need to assign each lead's data (step #2) to a specific company (step #3).
We can do this by creating an aggregation view that matches on the output field "company_name".
This is our aggregation view, View #4:
This example could work, but it's far too simple.
What you need to do is think about all the different cases you could encounter. In our example:
- The company ID from Step #3 could be contained in the company URL from Step #2
- The company public handle from Step #3 could be contained in the company URL from Step #2
- The URLs could match perfectly as-is
- and so on
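The cases above can be written down as small checks, which is a handy way to reason about them before building the filters in the interface. The sketch below is hypothetical: the field names `company_id` and `company_handle`, the `contains` helper and the sample URLs are assumptions made for the example, not actual output fields.

```python
def contains(needle: str, haystack: str) -> bool:
    """True when a non-empty needle appears inside haystack."""
    return bool(needle) and needle in haystack


# One check per case; "person" is a Step #2 row, "company" a Step #3 row.
CASES = {
    "id_in_url": lambda person, company: contains(
        company["company_id"], person["company_url"]),
    "handle_in_url": lambda person, company: contains(
        company["company_handle"], person["company_url"]),
    "urls_equal": lambda person, company: (
        person["company_url"] == company["company_url"]),
}

# Made-up sample rows.
person = {
    "full_name": "Ada Doe",
    "company_url": "https://example.com/company/acme-inc/",
}
company = {
    "company_id": "104729",
    "company_handle": "acme-inc",
    "company_url": "https://example.com/company/104729/",
}

matched = [name for name, check in CASES.items() if check(person, company)]
print(matched)  # ['handle_in_url']
```

Here only the handle check succeeds, which shows why a single exact-match filter would have missed this lead.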
Mapping: using a view as input
Before diving into more complex filtering, let's see how we can use the previous view in our mapping:
Almost everything is done for you automatically, based on the view you created.
Here, since we've merged everything into a single object, you can use all the fields that were merged.
Since we have a smart semantic layer, we can easily recommend the right input for you to use :)
Advanced Filtering
In the previous example, things were pretty basic. The next step is to use advanced filters.
You can add multiple WHERE clauses to refine how you aggregate data.
To give you an idea of how it works, think in terms of boolean logic:
- AND is very strict: every filter needs to match
- OR is "wider": any one of the filters will validate the condition
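In Python terms, AND behaves like `all()` and OR like `any()`. The short sketch below only illustrates that boolean logic; the `combine` helper and the list of filter results are made up for the example.

```python
def combine(results: list, operator: str) -> bool:
    """Combine individual filter results with a single AND/OR operator."""
    if operator == "AND":
        return all(results)
    if operator == "OR":
        return any(results)
    raise ValueError(f"unknown operator: {operator}")


# Three filters were evaluated on one pair of rows; only the second matched.
results = [False, True, False]

print(combine(results, "AND"))  # False
print(combine(results, "OR"))   # True
```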
In the previous case, we mentioned that it's always a good idea to think about all the different cases possible.
Here we only need ONE of the filters to work: if one condition is valid, we'll aggregate the People Profile with the Company Profile.
Then it's pretty much up to you to test and iterate on your filters to fine-tune the system 👌