External Data Operations on Salesforce Analytics Using MuleSoft Salesforce Analytics Connector, Part 2

This article is a continuation of the series dedicated to Salesforce Analytics integration using MuleSoft's Salesforce Analytics Connector.

By Rahul Kumar · Jun. 22, 18 · Tutorial

This post is a continuation of the series dedicated to Salesforce Analytics integration using MuleSoft's Salesforce Analytics Connector.

If you missed Part 1 of the series, make sure to read it first so that you are familiar with the terminology and basic concepts of Salesforce Analytics integration.

Let's walk through the various scenarios that come into the picture when loading data into the Salesforce Analytics Cloud.

Scenario 1: Creating a New Dataset/Adding to an Existing Dataset, and Adding Records to the Dataset

Here is a sample metadata file I have created for our example:

{
  "fileFormat": {
    "charsetName": "UTF-8",
    "fieldsDelimitedBy": ",",
    "fieldsEnclosedBy": "\"",
    "linesTerminatedBy": "\n",
    "numberOfLinesToIgnore": 1
  },
  "objects": [{
    "connector": "CSV",
    "fullyQualifiedName": "sampledataforWave_csv",
    "label": "Sample Data for Wave",
    "name": "sampledataforWave_csv",
    "fields": [{
      "fullyQualifiedName": "Field1",
      "name": "Field1",
      "type": "Text",
      "label": "Field1"
    }, {
      "fullyQualifiedName": "Field2",
      "name": "Field2",
      "type": "Text",
      "label": "Field2"
    }, {
      "fullyQualifiedName": "Field3",
      "name": "Field3",
      "type": "Date",
      "label": "Field3",
      "format": "MM/dd/yyyy",
      "firstDayOfWeek": -1,
      "fiscalMonthOffset": 0,
      "isYearEndFiscalYear": true
    }, {
      "fullyQualifiedName": "Field4",
      "name": "Field4",
      "type": "Text",
      "label": "Field4"
    }, {
      "fullyQualifiedName": "Field5",
      "name": "Field5",
      "type": "Text",
      "label": "Field5"
    }]
  }]
}
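
To make the format concrete, external data matching this metadata would look something like the CSV sketch below. The values are made up for illustration; the header row is the one line skipped per numberOfLinesToIgnore, and Field3 follows the MM/dd/yyyy format:

Field1,Field2,Field3,Field4,Field5
"Value A1","Value A2","06/22/2018","Value A4","Value A5"
"Value B1","Value B2","06/23/2018","Value B4","Value B5"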

a. Uploading Records in Multiple Batches

    <sub-flow name="salesforce-analytics-batchappend-Sub_Flow">
        <set-variable variableName="dataSetContainerName" value="${dataSetContainerName}" doc:name="Variable : DataSetContainerName" doc:description="DataSet Container Name - Salesforce ID or Developer Name of the App in which Dataset is to be created"/>

        <enricher source="#[payload]" target="#[flowVars.datasetname]" doc:name="Message Enricher" doc:description="Get the Salesforce ID of the Dataset Created in a variable.">
            <sfdc-analytics:create-data-set config-ref="Salesforce_Analytics_Cloud__Basic_authentication" operation="APPEND" description="Sample data Set" label="Data Set 2" dataSetName="demodataset2" edgemartContainer="#[flowVars.dataSetContainerName]" type="metadata\sampledataforWave.json:RELATIVE" doc:name="Salesforce Analytics Cloud : Create DataSet"/>
        </enricher>
        <dw:transform-message doc:name="Create Sample Data for DataSet">
            <dw:set-payload><![CDATA[%dw 1.0
%output application/java
%var sampleSize = 10000
---

(1 to sampleSize) map {
"Field1" : "Field1Value" ++ "$",
"Field2" : "Field2Value" ++ "$",
"Field3" : now as :date,
"Field4" : "Field4Value" ++ "$",
"Field5" : "Field5Value" ++ "$"
}]]></dw:set-payload>
        </dw:transform-message>
        <batch:execute name="salesforce-analytics-appBatch" doc:name="Batch Execute"/>
    </sub-flow>

    <batch:job name="salesforce-analytics-appBatch">
        <batch:process-records>
            <batch:step name="Batch_Step">
                <batch:commit size="1000" doc:name="Batch Commit">
                    <sfdc-analytics:upload-external-data config-ref="Salesforce_Analytics_Cloud__Basic_authentication" type="metadata\sampledataforWave.json:RELATIVE" dataSetId="#[flowVars.datasetname]" doc:name="Salesforce Analytics Cloud : Upload Data Part">
                        <sfdc-analytics:payload ref="#[payload]"/>
                    </sfdc-analytics:upload-external-data>
                </batch:commit>
            </batch:step>
        </batch:process-records>
        <batch:on-complete>
            <sfdc-analytics:start-data-processing config-ref="Salesforce_Analytics_Cloud__Basic_authentication" dataSetId="#[flowVars.datasetname]" doc:name="Salesforce Analytics Cloud : Trigger Data Processing" doc:description="Trigger the processing of the data that was uploaded in parts so far. Once data processing is triggered, the status can be monitored in Data Manager."/>
        </batch:on-complete>
    </batch:job>

The Edgemart Container used here is "SharedApp." The SharedApp's developer name is configured in a system property, which is captured in a flow variable. The "Create Data Set" operation is then invoked with the Edgemart Container obtained above, using the APPEND sub-operation.

When creating a new dataset: the dataset name provided in the "Create Data Set" operation must be unique across the organization.

When appending to an existing dataset: the existing dataset's name must be used; if a different name is provided, a new dataset will be created instead.
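
For reference, the system property could live in mule-app.properties; a minimal sketch, where the value is only a placeholder for SharedApp's actual developer name:

dataSetContainerName=SharedApp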

The "Create Data Set" operation is invoked inside a Message Enricher so that the Salesforce ID of the InsightsExternalData record is captured in a flow variable; it is needed by the later operations for uploading data and for data processing. A Transform Message component is used here to generate some random data, but in actual use cases we would be passing/transforming data from some other source(s). Make sure that the actual data being passed aligns with the metadata JSON. For example, notice that for Field3, which is of type Date in the metadata JSON, a DataWeave :date object (java.util.Date) is passed; similarly, for Text types, a String is passed.
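
As a rough illustration of such a transformation, the sketch below maps an incoming payload to the metadata fields; the source field names (accountName, region, closeDate, stage, ownerName) are purely hypothetical:

%dw 1.0
%output application/java
---
// Map each incoming record to the field names declared in the metadata JSON
payload map {
    "Field1" : $.accountName,
    "Field2" : $.region,
    // Field3 is declared as a Date (MM/dd/yyyy) in the metadata, so coerce it to :date
    "Field3" : $.closeDate as :date {format: "MM/dd/yyyy"},
    "Field4" : $.stage,
    "Field5" : $.ownerName
}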

The transformed data is passed on to the Batch Job, which has only one Batch Step with a Batch Commit containing the Salesforce Analytics connector's "Upload External Data" operation. This operation uses the transformed data and the Salesforce ID received from the previous operation to create the various data parts associated with the same parent record. The size of each data part is controlled by the Batch Commit size. Although it is configured to 1000 here, it can be set to another value as required, keeping in mind that the maximum size of an InsightsExternalDataPart is 10 MB. After the batch job completes, data processing is triggered in the on-complete phase using the "Start Data Processing" operation. This initiates creation of the Salesforce Analytics "Job," which takes care of adding the records from the data parts into the actual dataset.
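
If the part size needs to vary per environment, the commit size can be externalized to a property instead of being hard-coded; a minimal sketch reusing the same upload operation, where batchCommitSize is a hypothetical property:

    <batch:step name="Batch_Step">
        <!-- batchCommitSize is a hypothetical property controlling the data part size -->
        <batch:commit size="${batchCommitSize}" doc:name="Batch Commit">
            <sfdc-analytics:upload-external-data config-ref="Salesforce_Analytics_Cloud__Basic_authentication" type="metadata\sampledataforWave.json:RELATIVE" dataSetId="#[flowVars.datasetname]" doc:name="Salesforce Analytics Cloud : Upload Data Part">
                <sfdc-analytics:payload ref="#[payload]"/>
            </sfdc-analytics:upload-external-data>
        </batch:commit>
    </batch:step>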

b. Uploading Records in One Batch

<sub-flow name="salesforce-analytics-append-dataset-Sub_Flow">
        <set-variable variableName="dataSetContainerName" value="${dataSetContainerName}" doc:name="Variable : DataSetContainerName" doc:description="DataSet Container Name - Salesforce ID or Developer Name of the App in which Dataset is to be created"/>       
        <dw:transform-message doc:name="Create Sample Data for DataSet">
            <dw:set-payload><![CDATA[%dw 1.0
%output application/java
%var sampleSize = 1000
---

(1 to sampleSize) map {
"Field1" : "Field1Value" ++ "$",
"Field2" : "Field2Value" ++ "$",
"Field3" : now as :date,
"Field4" : "Field4Value" ++ "$",
"Field5" : "Field5Value" ++ "$"
}]]></dw:set-payload>
        </dw:transform-message>

        <sfdc-analytics:upload-external-data-into-new-data-set-and-start-processing config-ref="Salesforce_Analytics_Cloud__Basic_authentication" type="metadata\sampledataforWave.json" operation="APPEND" description="Sample Data Set 1" label="Data Set 1" dataSetName="demodataset1" edgemartContainer="#[flowVars.dataSetContainerName]" doc:name="Salesforce Analytics Cloud : Create,Upload and Start Processing">
            <sfdc-analytics:payload ref="#[payload]"/>
        </sfdc-analytics:upload-external-data-into-new-data-set-and-start-processing>  
    </sub-flow>

This approach creates just one data part on the InsightsExternalData record and is ideal for scenarios where the amount of data to be loaded is small. The "Upload External Data into new Dataset and Start Processing" operation is used for this. The Edgemart Container, Operation, and Type parameters are configured the same way as when uploading in batches. The payload is prepared and the connector is invoked; the rest is taken care of by the connector.

When creating a new dataset: the dataset name provided in the "Upload External Data into new Dataset and Start Processing" operation must be unique across the organization.

When appending to an existing dataset: the existing dataset's name must be used; if a different name is provided, a new dataset will be created instead.
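
Either sub-flow can be invoked from a caller flow using a flow-ref; a minimal sketch, where the HTTP listener configuration HTTP_Listener_Configuration and the /analytics/load path are hypothetical:

    <flow name="salesforce-analytics-main-Flow">
        <!-- HTTP_Listener_Configuration is a hypothetical listener config used only for illustration -->
        <http:listener config-ref="HTTP_Listener_Configuration" path="/analytics/load" doc:name="HTTP"/>
        <flow-ref name="salesforce-analytics-append-dataset-Sub_Flow" doc:name="Invoke Append Sub-Flow"/>
        <set-payload value="Data load triggered" doc:name="Set Payload"/>
    </flow>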

Scenario 2: Overwriting the Dataset With a New Set of Data

To overwrite a dataset, configure the dataset's name in the "Create Data Set" or "Upload External Data into new Dataset and Start Processing" operation (whichever is used) and select the OVERWRITE sub-operation. All other configurations are the same as in the APPEND scenario.

    <sub-flow name="salesforce-analytics-batch-overwrite-Sub_Flow">
        <set-variable variableName="dataSetContainerName" value="${dataSetContainerName}" doc:name="Variable : DataSetContainerName" doc:description="DataSet Container Name - Salesforce ID or Developer Name of the App in which Dataset is to be created"/>

        <enricher source="#[payload]" target="#[flowVars.datasetname]" doc:name="Message Enricher" doc:description="Get the Salesforce ID of the Dataset Created in a variable.">
            <sfdc-analytics:create-data-set config-ref="Salesforce_Analytics_Cloud__Basic_authentication" operation="OVERWRITE" description="Sample data Set" label="Data Set 2" dataSetName="demodataset2" edgemartContainer="#[flowVars.dataSetContainerName]" type="metadata\sampledataforWave.json" doc:name="Salesforce Analytics Cloud : Overwrite DataSet"/>
        </enricher>
        <dw:transform-message doc:name="Create Sample Data for DataSet">
            <dw:set-payload><![CDATA[%dw 1.0
%output application/java
%var sampleSize = 10000
---

(1 to sampleSize) map {
"Field1" : "Field1Value" ++ "$",
"Field2" : "Field2Value" ++ "$",
"Field3" : now as :date,
"Field4" : "Field4Value" ++ "$",
"Field5" : "Field5Value" ++ "$"
}]]></dw:set-payload>
        </dw:transform-message>
        <batch:execute name="salesforce-analytics-appBatch" doc:name="Batch Execute"/>
    </sub-flow>    

The source code of the above scenarios can be found here.

Please note that this post applies only to Mule 3. For Mule 4, there is a Salesforce Analytics module instead of a connector; I will cover it in a later post.


