Liquid Studio Documentation
Data Mapping / Using Aggregate Functions
In This Topic
    Using Aggregate Functions
    In This Topic

    Many of the Data Mapper functions use context inputs, including those in the Aggregate section. The output of these function varies depending on where you place the Context inputs. For example, you may wish to apply an Aggregate function such as Max, Average or Sum within the data source as a whole or within a parent node at a particular level. In this section we will work through examples using each of these three functions from the Aggregate section, outlining the impact of the Context input connection.

    We will be using data related to retail figures for a sales company. As is often the case, as well as mapping the existing data we need to use the input data to calculate new figures to be included in the output data, such as the sales for the company as a whole as well as for individual regions, the average sales within regions and so on. We will be using the retail_source.xsd, inferred from the following XML data:

    Source XML
    Copy Code
    <retail>
     <region>
      <region_ref>125</region_ref>
      <outlet>
       <manager>James Jones</manager>
       <sales>350690</sales>
      </outlet>
      <outlet>
       <manager>Mark Simpson</manager>
       <sales>450880</sales>
      </outlet>
      <outlet>
       <manager>Linda Parks</manager>
       <sales>22300</sales>
      </outlet>
     </region>
     <region>
      <region_ref>102</region_ref>
      <outlet>
       <manager>Patricia Clark</manager>
       <sales>122000</sales>
      </outlet>
      <outlet>
       <manager>Frank Graham</manager>
       <sales>321560</sales>
      </outlet>
      <outlet>
       <manager>James Yates</manager>
       <sales>22800</sales>
      </outlet>
      <outlet>
       <manager>Ken Livingston</manager>
       <sales>665430</sales>
      </outlet>
     </region>
     <region>
      <region_ref>112</region_ref>
      <outlet>
       <manager>Laura Dean</manager>
       <sales>210090</sales>
      </outlet>
      <outlet>
       <manager>Stephen Watson</manager>
       <sales>30800</sales>
      </outlet>
     </region>
    </retail>
    

    For our first example, we will use the retail_target.xsd which is inferred from the following structure:

    Target XML
    Copy Code
    <brand>
     <peak_sales>112300</peak_sales>
     <area ref="143">
      <store manager="Jill Gray">
      23400
      </store>
      <store manager="Peter Thomson">
      112300
      </store>
      <store manager="Derek Roberts">
      78900
      </store>
     </area>
     <area ref="110">
      <store manager="Tom Brown">
      33280
      </store>
      <store manager="Kay Mitchell">
      45000
      </store>
     </area>
    </brand>
    
    Max

    In the first instance, we want to map the maximum sales figure for the company as a whole, writing it to the "peak_sales" item you see above. Here is the Mapper with Source and Target imported:

    Reader and Writer

    We drag a Max function onto the Mapper, connecting its Context input to the "retail" output in the XML Reader and its Value input to the "sales" output. The Max function output connects to the "peak_sales" input in the XML Writer. In this case the function will output the highest sales figure for any outlet listed in the data source, as "retail" is the document root element.

    Max

    Max Function Connected

    We can now make the remaining connections, remembering to map "retail" to "brand" and "sales" to the Element Value in the XML Writer. Although we have used these inputs to calculate the maximum sales figure, we have not yet mapped them individually.

    Inputs and Outputs Connected

    We can now execute the transform by pressing Shift-F5 or the Execute button (Execute). The transform is applied and the file we selected as output opens in the editor:

    Output XML
    Copy Code
    <brand>
     <peak_sales>665430</peak_sales>
     <area ref="125">
      <store manager="James Jones">350690</store>
      <store manager="Mark Simpson">450880</store>
      <store manager="Linda Parks">22300</store>
     </area>
     <area ref="102">
      <store manager="Patricia Clark">122000</store>
      <store manager="Frank Graham">321560</store>
      <store manager="James Yates">22800</store>
      <store manager="Ken Livingston">665430</store>
     </area>
     <area ref="112">
      <store manager="Laura Dean">210090</store>
     <store manager="Stephen Watson">30800</store>
     </area>
    </brand>
    

    The output includes the highest sales figure included in the source data. If we wanted to map the highest sales figure for each region rather than the company as a whole, we would simply alter the example above to connect the Max Context input to the "region" output in the XML Reader instead, connecting the output to an appropriate input in the XML Writer - we will do this next with the Average function.

    Before we move on from the Max function, in the example above, the effect would actually be identical if we left the Max Context input unconnected. Connecting the Context input is actually optional. However, if you do not specify a Context, the Mapper will apply the function across the entire input data source. In this case we are using all values in the source anyway, but of course this depends on your required context.

    Max With No Context Input

    Average

    Let's now assume that the output needs to include an indicator of the average sales figure for each region, to match the retail_alt_target.xsd target Schema, inferred from the following XML:

    Target XML
    Copy Code
    <brand>
     <area ref="143">
      <average_sales>67850</average_sales>
      <store manager="Jill Gray">
       23400
      </store>
      <store manager="Peter Thomson">
      112300
      </store>
      <store manager="Derek Roberts">
      78900
      </store>
     </area>
     <area ref="110">
      <average_sales>18890</average_sales>
       <store manager="Tom Brown">
      33280
      </store>
      <store manager="Kay Mitchell">
      45000
      </store>
     </area>
    </brand>
    

    Here is our updated Source and Target in the Mapper:

    Reader and Writer

    We drag an Average component onto the Mapper, connecting its Context input to the "region" output in the XML Reader and its Value input to the "sales" output. The Average function output can be connected to the "average_sales" input in the XML Writer. This will calculate the average across each region, rather than across the data as a whole, as in the previous example.

    Average

    Average Function

    We can now make the remaining connections, remembering to map "region" to "area" and "sales" to the Element Value input.

    Inputs and Outputs Connected

    We can now execute the transform by pressing Shift-F5 or the Execute button (Execute). The transform is applied and the file we selected as output opens in the editor:

    Output XML
    Copy Code
    <brand>
     <area ref="125">
      <average_sales>274623.333333333</average_sales>
      <store manager="James Jones">350690</store>
      <store manager="Mark Simpson">450880</store>
      <store manager="Linda Parks">22300</store>
     </area>
     <area ref="102">
      <average_sales>282947.5</average_sales>
      <store manager="Patricia Clark">122000</store>
      <store manager="Frank Graham">321560</store>
      <store manager="James Yates">22800</store>
      <store manager="Ken Livingston">665430</store>
     </area>
     <area ref="112">
      <average_sales>120445</average_sales>
      <store manager="Laura Dean">210090</store>
      <store manager="Stephen Watson">30800</store>
     </area>
    </brand>
    

    The output contains an indicator of average sales for each area. Again, to output average sales for the whole company, we could have connected the Context input to the "retail" output or left it unconnected, as this makes the Mapper calculate the average across the entire data source.

    Sum

    Let's finally look at a slightly more complex example in which we refer to two different contexts within our Aggregate functions. We now need to observe the following target data, from which the retail_complete_target.xsd Schema has been inferred:

    Target XML
    Copy Code
    <brand>
     <total_sales>292880</total_sales>
     <area ref="143">
      <area_sales>214600</area_sales>
      <store manager="Jill Gray">
      23400
      </store>
      <store manager="Peter Thomson">
      112300
      </store>
      <store manager="Derek Roberts">
      78900
      </store>
     </area>
     <area ref="110">
      <area_sales>78280</area_sales>
      <store manager="Tom Brown">
      33280
      </store>
      <store manager="Kay Mitchell">
      45000
      </store>
     </area>
    </brand>
    

    We need to map not only the total sales for the company as a whole, but also for each area within it. Here is the Mapper with Source and Target:

    Reader and Writer

    Let's drag a Sum component onto the Mapper firstly to handle the total sales for the company. We connect its Context input to the "retail" output in the XML Reader, although in this case we could leave the Context unconnected. We connect the Value input to the "sales" output and the Sum function output to the "total_sales" input in the XML Writer.

    Sum

    Sum Function

    Now we can do the same for the regions, dragging another Sum function over. We connect the Context input this time to the "region" output, with the Value input again connecting to the "sales" output. This time the Sum output connects to the "area_sales" input in the XML Writer.

    Sum

    Sum Function

    Now we can make our remaining connections, again remembering to map those outputs we have already used in our Sum functions. The "sales" output is now connected to three different inputs: the two Sum functions and the XML Writer. This is a typical example of how one item of data in your source can be used to generate multiple values in your output.

    Inputs and Outputs Connected

    Let's go ahead and execute the transform by pressing Shift-F5 or the Execute button (Execute). The transform is applied and the file we selected as output opens in the editor:

    Output XML
    Copy Code
    <brand>
     <total_sales>2196550</total_sales>
     <area ref="125">
      <area_sales>823870</area_sales>
      <store manager="James Jones">350690</store>
      <store manager="Mark Simpson">450880</store>
      <store manager="Linda Parks">22300</store>
     </area>
     <area ref="102">
      <area_sales>1131790</area_sales>
      <store manager="Patricia Clark">122000</store>
      <store manager="Frank Graham">321560</store>
      <store manager="James Yates">22800</store>
      <store manager="Ken Livingston">665430</store>
     </area>
     <area ref="112">
      <area_sales>240890</area_sales>
      <store manager="Laura Dean">210090</store>
      <store manager="Stephen Watson">30800</store>
     </area>
    </brand>
    

    The output contains an indicator of the total sales figures for both regional and company-wide levels. The process is essentially the same for each of the Aggregate functions.

    A good way to familiarise yourself with the Aggregate Context inputs, or any function in the Mapper, is to experiment with some input data, observing the difference in output when you alter your input connections.
    See Also