Data Summaries & Snapshots (DSS)
Contents:
Introduction to DSS
Data Summaries & Snapshots (DSS) is an easy to configure and amazingly powerful data analysis engine that’s expands on the idea behind Salesforce’s Reporting Snapshots combined with the usability of Declarative Lookup Roll-up Summaries (DLRS).
Correlating information from many sources across your organization can be a headache, especially when your reporting needs collide with your data architecture. This is often true when you want to show trends over time. Salesforce’s Reporting Snapshots can help; but it’s hard to configure and has more limitations than capability.
DSS solves these problems by allowing you to create a robust single-source of information that combines data from your key objects (say Accounts or Contacts) with rollups of data from any number of child records, summary a single summary for a given moment in time.
For example, you can easily create a monthly snapshot of the contact’s age, income, along with an analysis of service-oriented data stored in countless objects, all on one record.
While reports easily allow you to do each of these tasks, DSS enables you to create cross correlations and evaluate how they evolved over time.
As an open-source package, DSS is free to use.
Key Features
- Summarize data from any number of sources to the same record giving a comprehensive view at a moment in time.
- Multiple snapshots push data to a shared snapshot record for each parent record and timeframe.
- There’s practically no limit to the number of child records you can summarize. (Reporting Summaries is limited to 2,000)
- Summaries can be set to run Daily, Monthly, Quarterly, Yearly, or at the end each Fiscal Year (based on your FY setting)
- Works with any standard or custom object
- Configuration screens were inspired by DLRS so are very easy to use; and partially auto-configuring based on your settings.
- Combines summarization and snapshotting so you can be certain that you’re summarizing the data before creating the snapshot.
- Summary records can have as many fields as Salesforce permits in a custom object.
- The functions that can be performed on records when being summarized include:
- Count, Sum, Min, Max
- Average, Median
- Standard Deviation (for a sample or a population)
- Oldest Value, Newest Value
- Count Unique Values, Count Unique Values (ignore case)
- List Unique Values
- Copy (including blanks), Copy (only non-blank)
- You can control the order in which summaries run, allowing you to build multiple levels of summaries based on previously summarized data.
- Some types of snapshots can be can revised and re-run without causing problems as needs evolve
- Eases migrating by allowing easy retrofitting of some types of historic data; as if the tool has been in use for ages.
Click here to install the DSS Package into your Salesforce org.
If you’re interested in modifying the source code, you’ll find it in this GitHub repo.
Demo Scenario
Given the abstract nature of DSS, it’s best to show how to use it by example.
Let’s say you have a collection of cartoon characters (Contacts) in your org who you want to create trend data for on a monthly basis. DSS allows you to create a single snapshot for each person record containing their:
- Favorite color during the month, as entered on the Contact record
- Information about countries they visited during the month from child object
- Statistics about parties they attended during the month from another child object
The structure of the operational data is straightforward
To this we’ll add a custom object we’ll call “Snapshots” where we’ll summarize taken from each person’s Contact record along with analyses of their Country and Party records.
Snapshot Object
Before configuring DSS, it’s best to start with the snapshots object.
- Create a new custom object as you would any other in Setup > Object Manager.
- Add a date field for the object since this is all about time. (required)
- Add a field that relates the snapshot object to the parent object.This can be done with either a look-up or master-detail relationship.
- Add fields to hold each summarized datapoints you intend to create.
- Note:While you can use Text fields, you can’t summarize to Text Area fields.
Data Summary Tasks
Important: Throughout the configuration, DSS always refers to fields by their API names.
Data Summary Tasks can be thought of as reports or the core parts of SOQL queries. Below shows how it might be configured to evaluate the Parties Attended described above.
Go to the “DSS – Data Summaries & Snapshots” App, select the Data Summary Tasks tab and create your first Data Summary Task. Below shows how we’d summarize the parties attended.
The key fields are:
- Task Status – A summary of any issues with the configuration of the task. If all is good, it’ll show “Ready”. You can’t edit this field but it’ll help you make sure your configured correctly.
- Description – A description that explains the intent of the task for you and your team.
- Parent Object – In our example this is Contact.
- Analysis Timeframe – May be monthly, quarterly, etc. In our example, we want to analyze for the previous month. Notice they come in pairs – Current and Previous periods.
- Active – This acts as an on/off switch allowing you to configure and test tasks before they go live. When you click Run Summary Now the task will run even if inactive.
- Child object – In this example Parties_Attended__c. Notice we’re entering the API name of the object as we will throughout the configuration. Since Country_Visited__c is another child object in our scenario, we’ll have tasks for that as well.
- Date Field on Child Record – The field on Parties_Attended__c which contains the date of each party. In our case it’s party_date__c.
- Snapshot object – The object where snapshots will be created. In our case it’s Snapshot__c.
- Snapshot Date Field (on snapshot object) – The field on Snapshot__c which contains the date of each snapshot. In our case it’s date__c.
- Where Clause – The filter to use when querying the Child Object. Notice that it refers to fields by their API names and is described as you would in a SOQL query. Behind the scenes, the system adds the appropriate timeframe to your Where Clause so you needn’t include it here. You can also filter the parent records here but you’d do it from the perspective of the Child Object. In our example, notice how the Where Clause includes Contact__r.Cartoon_Character__c = true filters the parent object (Contact) from the perspective of the Child Object. Also, if you’re tempted to add a timeframe to the where clause, don’t. DSS takes care of this for you.
Advanced Settings (all explained to me in the Complex Use Cases section below)
- Batch Size – When in doubt, stay with the default value of 25.
- Task Priority – Sets the priority of the task.
- Snapshot As Of Date – For analyzing historical data. Typically you’ll leave this blank.
- Clear Outliers When Done – Typically you’ll leave this blank.
While the most of the remaining fields are set automatically for you, it’s good to double check that they’re correct.
- Comment – A brief English language description of task the configuration that’s automatically created.
- Active Fields – Shows the number of fields configured for this task (see below) where the Field Status is “Ready”.
- Last Started/Last Completed – As their names imply, these show when the task last ran.These are populated by the system and are useful when verifying operations
- Child Relationship Name – The name used by the Parent Object to refer to the Child Object.(Confusing – yes, that’s why it’s automatic)
- Relationship Field (on Child) – The field on the Child Object that relates it to the Parent Object.
- Snapshot Relationship Name – The name used by the Parent Object to refer to the Snapshot Object.
- Relationship Field (on Snapshot) – The field on the Snapshot Object that relates it to the Parent Object.
- Note that any number of Summary Tasks can share a parent, child and snapshot objects. What makes each task unique is the combination of those plus the Analysis Timeframe and Where Clause.
Data Summary Fields
Once your Summary Task is configured it’s time to start adding the Data Summary Fields. A task can have as many Summary Fields as you need. Each Data Summary Task includes a number of fields which are effectively the columns in a report
The key fields are:
- Field Status – A summary of any issues with the configuration of the task. If all is good, it’ll show “Ready”. You can’t edit this field but it’ll help you make sure your configured correctly.
- Active – This acts as an on/off switch allowing you to control which field summarizations are to be used.When you click Run Summary Now, only the active fields will be processed.
- Source Field (on Child) – This is the field you’d like to analyze
- Summarize To (Target Field) – The field on the snapshot record where you’d like the summarization stored.
- How to Summarize – How you’d like the data summarized. The full list of options is below.
While the remaining fields aboveare all set automatically for you, it’s good to double check that they’re correct.
- Field Status – A summary of any issues with the configuration of the field. If all is good, it’ll show “Ready”
- Child Object – The Child Object, as set on the Summary Task
- Source Field Type – The datatype of the selected source field, or an error message as shown in the Target Field Type field
- Summarizing To – The object where the summary will be stored, also set on the Summary Task
- Target Field Type – The datatype of the selected target field or error message.Notice that this is showing an error in the example.
- Notes:
- While it shows you when a field is missing from either the source or target object; it does not verify that the source field is appropriate for the summarization method or if the target field is appropriate to store the result.
- You can summarize to Text fields, but not to Text Area fields.
- You can mix timeframes on the same Snapshot Object; you just need to careful to use separate fields and be smart when reporting. We recommend a good naming convention such as Num Parties vs Num Parties in FY.
The ways to summarize data are:
- Count
- Sum
- Max/Min
- Average
- Median
- Deviation (for a sample or population)
- Count Unique Values/Count Unique Values (ignore case)
- List Unique Values
- Copy (including blanks)/Copy (only non-blank)
- Oldest Value/Newest Value; returns value of the field on the first or last record in the time period. This option requires the Summary Task’s “Date Field on Child Record” be set accurately
Running DSS Nightly
DSS is most useful when set to run daily, ideally in the early hours of the morning. When you click the Run Summary Now button it checks if it’s currently scheduled. If not, it’ll show a screen like below prompting the time you’d like it to run every day. For eample if you’d like to set it for 12:47am, enter 0 as the hours and 47 as the minutes, click Next and you’re set.
Advanced Settings
Batch Size
This sets the number of records from the Parent Object to are processed at a time. Note that the system pulls all child records that are applicable for the selected analysis timeframe. Put another way, let’s say you’re summarizing Opportunities (child) to Accounts (parent). As it processes each Account it pulls all of it’s child Opportunities to analyze. This allows you to control the number of Accounts processed in each batch.
With the Summer 2022 release, we’ve noticed Salesforce has become a lot more rigid about the Governor Limits, thus we recommend you choose a small number to prevent any CPU timeout errors, etc.
When in doubt, stay with the default value of 25. If you have of child records to analyze within the timeframe (month, year, etc) set it super low, maybe even to 1.
Prioritizing Tasks
Every night DSS runs all tasks that are due based on their Analysis Timeframe and Process Data Settings. When it finds multiple tasks to run, it starts with the one that was completed the longest ago.
The Task Priority setting lets you control the order in which tasks run, where higher priority tasks run before lower ones. If multiple are found with the same priority, again the oldest starts first.
A key reason to set priorities is to do Secondary or Derivative Analysis. Imagine you have a large, complex org with lots of disparate types of data and many Summary Tasks already in place. You can create secondary analysis of data by creating Summary Tasks that analyze and combine data from across other snapshots. Accomplishing this is easy.
- Create your Snapshot Object for the secondary analysis as you would any other with the necessary date and relationship fields.
- Create your Summary Tasks with the existing snapshot objects as the child object, as they’re now a data source.
- Set the Task Priority to be low, to ensure the secondary tasks run after the other summarizations are complete.
Creating Historical Summaries (Retro’ing Data)
By setting the “Snapshot as of Date” field and adjusting the Where Clause accordingly you can build out historical data as if you’ve been using DSS since long before it was installed.
- This is about summarizing old child records.If you’re snapshotting values from the parent object, you sadly can’t go back in time.
- Set Snapshot as of Date to a date in the timeframe you’d like to summarize.
- Click the Run Summary Now button to start the task with the current settings.
- The system sets the date on the summary records (the date the information is recorded on) to the last day of the Analysis Timeframe, as it would have been had it actually run on that date.
- The system automatically adjusts the Where Clause in the background to match that timeframe for you based on the date and Analysis Timeframe.
Let’s say you set Snapshot as of Date to 5/15/2020…
Analysis Timeframe | Dates Analyzed | Data Recorded On | Commentary |
---|---|---|---|
Daily – Current | 5/15/2020 | 5/15/2020 | |
Daily – Previous | 5/14/2020 | 5/14/2020 | Analyzing the previous day |
Monthly – Current | 5/1/2020 – 5/31/2020 | 5/31/2020 | Recording on the last day of this month |
Monthly – Previous | 4/1/2020 – 4/30/2020 | 4/30/2020 | Analyzing the previous month, and recording on the last day of that month |
Quarterly – Current | 4/1/2020 – 6/30/2020 | 6/30/2020 | Recording on the last day of this quarter |
Quarterly – Previous | 1/1/2020 – 3/31/2020 | 3/31/2020 | Analyzing the previous quarter, and recording on the last day of that quarter |
Semi-Annually – Current | 1/1/2020 – 6/30/2020 | 6/30/2020 | Analyzing the current half year, and recording on the last day of the half-year |
Semi-Annually – Previous | 7/1/2019 – 12/31/2019 | 12/31/2019 | Analyzing the previous half-year, and recording on the last day of that half-year |
Yearly – Current | 1/1/2020 – 12/31/2020 | 12/31/2020 | Analyzing the current year, and recording on the last day of the year |
Yearly – Previous | 1/1/2019 – 12/31/2019 | 12/31/2019 | Analyzing the previous year, and recording on the last day of that year |
FY – Current | 7/1/2019 – 6/30/2020 | 6/30/2020 | Analyzing the current fiscal year, and recording on the last day of the fiscal year (assuming FY is set to end on 6/30) |
FY – Previous | 7/1/2018 – 6/30/2019 | 6/30/2019 | Analyzing the previous fiscal year, and recording on the last day of that fiscal year (assuming FY is set to end on 6/30) |
Last 12 Months | 5/1/2019 – 4/30/2020 | 4/30/2020 | Analyzing the 12 months ending just before the current month |
Clearing Outlier Records
This is best explained by example. Let’s say you’re tracking the number of opps created each month for an account, and storing it in the field Opps Created During Month on your snapshot object. If an account had just one created during the month and that opp was subsequently deleted, still during the month, the count on the snapshot would be artificially high. Checking the Clear Outliers When Done box clears these outlying records and ensures the data’s accurate.
In the real world; let’s say you have two tasks, each with their own filtering parameters that both populate Opps Created During Month you’ll be headed for trouble if either has Clear Outliers set. Why? Because unless the filtering is identical for both tasks, a given snapshot record may be an outlier for one task but not the other. This will cause one to clear the field where the other sets it. Thus the values summarized on the snapshot will depend on which task ran last.
Recommendations
- When only one task is populating a given field, it’s best to check the Clear Outliers box
- If you have multiple tasks populating a field, it’s best to leave Clear Outliers unchecked
Notes & Limitations
- While it’s simple to configure and somewhat smart about its configuration, it’s possible to configure tasks and fields in ways that cause errors.We advise testing each new configuration to ensure it works as intended, using the Run Summary Now button. To test we suggest that you:
- Watch the logs in the Developer Console as you do a test run
- Look for errors on the ErrLogs tab
- Verify that you’re getting the results you expect.
- DSS does not (yet) insure that source field, summarization method, and target field are all of compatible type.In other words, it won’t stop you from trying to finding the standard deviation of a text field or trying to put a list of unique values in a numeric field. Again, test before you release your work into the wild.
- DSS only runs on a schedule (typically overnight), but not in real time.
- Data Summary Tasks won’t run more than once each day.You can manually run any task at any time using the Run Summary Now button.
vs. Reporting Snapshots
Reporting Snapshots has long been part of Salesforce and is an attempt to address the same need. However, it’s use is hampered by severe limitations and usability issues. In comparison, DSS…
- Can save multiple analyses to a shared snapshot record based on time
- Has no record limits; Reporting Snapshots only allows 2,000 child records
- Consolidates trend data all in one place.
- Allows as many as you can have in a custom object (500 for Enterprise Edition); reporting snapshots are limited to 100 fields
- Snapshot records are related back to original parent record
- Many more ways to analyze data when creating snapshots.
- Is easy to configure and automates some of the settings for you.
vs. Declarative Lookup Roll-up Summaries (DLRS)
We are huge fans of DLRS and rely on it extensively. We also recognize that its sweet spot is summarizations that happen in real time.
If you need to do many scheduled analyses that all store data to the same record, DLRS can generate record lock errors. These errors happen when two things are trying to write to the same place at the same time. The result being one field has accurate data and does not, but you don’t know which. In comparison, DSS…
- Only runs on a schedule, ideally overnight
- Avoids all record lock errors
- Is just as easy to configure
- Offers more ways to analyze data
- Can store data to the parent or a snapshot object
- Designed around trending data over time
Uninstalling and Upgrading DSS
Deactivate the flow
- In Setup, navigate to Process Automation > Flows
- Find the flow called “Run Selected Data Summary”
- Click the down arrow at the far right end of the line and select View Details and Versions
- On the list of Flow Versions, click Deactivate
Uninstall the package
- Still in Setup, navigate to Apps > Packaging > Installed Packages
- Find Data Summaries & Snapshots and click Uninstall
- Make sure the “Save a copy of this package’s data…” is selected
- Complete the uninstall process
Important: Once the uninstall is complete, be sure to…
- Go to the bottom of the Installed Packages page
- Download your data
- Delete the Package
Downloading the data will save a zip file to your hard disk containing three CSV files
- Data_Summary__c – Your Summary Tasks
- Data_Summary_Field__c – The Fields used in each of your Summary Tasks
- ErrLog__c – not needed
Reinstall DSS
- Install DSS anew using the link above
- Once it’s complete reload the information in Data_Summary__c to the LeftProp_DSS__Data_Summary__c object.
- Note that you’ll need to reference the record IDs in the file and the new ones created by the insert process.
- Use the old/new ID pairs mentioned above to update the values in the LeftProp_DSS__Summary__c of the Data_Summary_Field__c file.
- Once the above step is complete load the contents of Data_Summary_Field__c into the object LeftProp_DSS__Data_Summary_Field__c
You’re done. Don’t worry – it’s a ton easier than it sounds