Administration of Archiving in PeopleStage
Archiving
PeopleStage offers the ability to archive data from the PeopleStage database into an Archive database. You can set up an automated maintenance routine, either globally or for individual Areas and Campaigns, to define a period after which records will be archived. The different types of history, for example outbound Communications versus inbound Responses, can each have their own settings to determine when the records become due for archiving. The transfer of records is managed centrally, so that it can be restricted to quiet periods.
Diagram Settings Editor
Controlling when the records are archived
The Archiving tab in the Diagram Settings Editor is used to control when archiving takes place. The top section sets the marking schedule, which typically runs on a recurring basis, for example half-hourly (recommended), daily between 18:00 and 23:00, to reduce the impact of the process. The lower section controls when the moving of marked records is permitted to take place (shown in green). Within these periods the system waits for a slack period (10 seconds with no jobs) and then archives a batch of records from one of the tables. The default batch size is 1 million records. The next time it runs, a new table is chosen, and this continues until the end of the time window or until all permitted records are exhausted.
- Select File > Administration > Diagram Settings Editor > Archiving tab
There are two stages to the archiving process:
- Marking and validating which records are due to be archived
- Moving the validated records to the archive database
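The idea of a permitted period for moving records can be sketched in a few lines of Python. This is only an illustration, not PeopleStage's own logic: the function name `moving_permitted` and the window times are hypothetical, and the sketch assumes a window may run past midnight.

```python
from datetime import time


def moving_permitted(now, windows):
    """Return True when `now` falls inside any (start, end) window.

    Hypothetical helper: a window whose end is earlier than its start
    is treated as crossing midnight.
    """
    for start, end in windows:
        if start <= end:
            if start <= now < end:
                return True
        elif now >= start or now < end:
            return True
    return False


# Assumed example: moving is only allowed overnight, 23:00 to 05:00.
overnight = [(time(23, 0), time(5, 0))]
print(moving_permitted(time(1, 30), overnight))   # inside the window
print(moving_permitted(time(12, 0), overnight))   # outside the window
```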
Controlling which records are archived
The Settings View on Areas and Campaigns is used to control which records are archived.
Settings at an Area level are inherited by all inner Areas and Campaigns. This allows default settings to be applied at the Marketing Processes level, and then exceptions to be made for specific areas/campaigns.
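As a rough illustration of this inheritance, the sketch below resolves the setting for a Campaign by walking up through its parent Areas until one of them defines a value. The class `Node`, the function `effective_setting`, the Area and Campaign names other than Marketing Processes, and the month values are all hypothetical; the product's internal representation is not described here.

```python
from dataclasses import dataclass, field
from typing import Optional

_MISSING = object()


@dataclass
class Node:
    """A hypothetical Area or Campaign in the hierarchy."""
    name: str
    parent: Optional["Node"] = None
    # History type -> months to keep before archiving (illustrative values).
    overrides: dict = field(default_factory=dict)


def effective_setting(node, history_type):
    """Walk up from the node until an Area or Campaign defines the setting."""
    current = node
    while current is not None:
        value = current.overrides.get(history_type, _MISSING)
        if value is not _MISSING:
            return value
        current = current.parent
    raise KeyError(f"no archiving setting for {history_type!r}")


root = Node("Marketing Processes", overrides={"Delivery": 12, "Response": 24})
area = Node("Retention", parent=root)                              # inherits everything
campaign = Node("Welcome", parent=area, overrides={"Delivery": 6})  # one exception

print(effective_setting(campaign, "Delivery"))  # the Campaign's own exception
print(effective_setting(campaign, "Response"))  # inherited from Marketing Processes
```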
Different settings can be applied for each type of history stored in the PeopleStage database. These are listed alongside the Step in the diagram that they are associated with.
Broadcast Interaction History
- Broadcast Interaction history can be controlled in detail, applying different thresholds for each type of interaction. For example, clicks can be kept for longer and bounces never archived.
Step | Dependent record set | Associated PS Database Table | Notes |
Pool | Delivery, Content, Response | State History, Deletion Control | This is the record of when each person enters and leaves each pool. This is required for the Journey History table in the standard FS design. |
Delivery | Content, Response | Communications | This holds one record per communication sent and is required for the Communications table in the standard FS design. |
Delivery | Content, Response | Communications Delivery | This holds one record per communication sent but is not required for the standard FS design. |
Content | No dependencies | Communications Content | This gives details of the content variation an individual received and so has multiple records per communication sent (one per Content Field that is being tracked). It is required for the Content table in the FS design. |
Content | No dependencies | Communications Tracking History | This holds the attributes which are output with each communication and has multiple records per communication. It is NOT currently required for the standard FS design, which does not by default include Attribute information. |
Response | No dependencies | Response Attribution | This holds each transactional record that has been attributed as a response to a communication. It is required for the Responses Attributed and Communications Responsible tables in the standard FS design. |
Broadcast Interaction | No dependencies | Email Response, Email Response Details | These tables are in the Email Response database and record clicks, opens, bounces, etc., as retrieved from the Email Service Provider using the FERGE tool. They are not currently included in the standard FS design. |
Audience | No dependencies | Journey History | This table gives details of the precise route an individual took through a Campaign. It is currently not used and so could be archived with no impact. It is NOT needed for the Journey History in the standard FS design (this comes from the summarized version in the State History). |
Considerations for Archiving
Care should be taken when deciding how soon to archive, especially when the Campaign is still Live, since the PeopleStage history tables are, in many cases, required for normal running.
Impact on Constraints
Communication constraints set on Campaigns and Areas rely on the Communication History.
Potential conflicts can arise if a constraint such as “1 of this Campaign per Year” is set within an Area where the archiving of Delivery History is set to be “after 6 months”.
In this situation, the system automatically resolves the conflict by postponing the archiving until the constraint period has elapsed. The number of records postponed in this way is identified in the standard “Archive” report.
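The effect of this postponement can be sketched as taking the later of two dates. The function below is a simplified assumption about the behaviour, not the product's actual calculation: `earliest_archive_date` is a hypothetical name, and periods are expressed in days (182 for roughly 6 months, 365 for a year) to keep the arithmetic simple.

```python
from datetime import date, timedelta


def earliest_archive_date(sent_on, archive_after_days, constraint_days=0):
    """Return the first date on which a communication record could be archived.

    Assumption: a record is held until both its archive period and any
    constraint period that relies on it have elapsed.
    """
    due = sent_on + timedelta(days=archive_after_days)
    constraint_end = sent_on + timedelta(days=constraint_days)
    return max(due, constraint_end)


sent = date(2024, 1, 1)
print(earliest_archive_date(sent, 182))        # no constraint: due after ~6 months
print(earliest_archive_date(sent, 182, 365))   # "1 per Year" constraint postpones it
```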
Impact on Pools
The setting for the archiving of Pool History is not applied when people are still in Pools. The history records when people entered and subsequently left each pool. Only records where people have both entered and left a pool are archived. The archiving of records where people are still in a Pool is postponed until they have moved on.
Impact on Journey and Interaction Selections
These selections rely on the PeopleStage history tables to identify which people to select. When marking records, the archiving process identifies the dates that these selections depend on and delays the archiving of the affected records until those dates have passed.
Reports
The PeopleStage reports rely on either the PeopleStage SQL database or the FastStats system and, therefore, will not be able to report on records that have been archived. If only some aspects of the history have been archived then the reports will show misleading results. For example, if clicks are held for longer than other interactions then a report could show a 100% click rate!
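The arithmetic behind such a misleading figure is easy to reproduce. The numbers below are invented purely for illustration, and the scenario assumes that only the communications with a retained click are still present after archiving.

```python
def click_rate(communications, clicks):
    """Click rate as a percentage of the communications still in the database."""
    return 100.0 * clicks / communications


# Invented figures: 1,000 communications sent, 50 of them clicked.
print(click_rate(1000, 50))  # before archiving: 5.0

# After archiving, only the 50 communications with retained clicks remain.
print(click_rate(50, 50))    # after archiving: 100.0
```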
PeopleStage History in FastStats
The standard PeopleStage design extracts data from views which are based on the history tables. Once records have been archived they will not be loaded into FastStats (unless extra work is carried out to the design to draw from both data sources). The table above summarises the relationship between the FastStats tables and the PS history.
Dependencies
A Communication record can only be archived if the associated Content, Response and Broadcast Interaction records have also been archived. This means that in cases where, for example, a “Bounce” interaction is being kept forever, the associated parent Communication and Pool records will also be kept forever.
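The dependency rule can be pictured as a check that runs down a tree of records. The sketch below is a simplification under stated assumptions: the `Record` class and `can_archive` function are hypothetical, and "the dependent records have been archived" is approximated as "the dependent records are themselves able to be archived".

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A hypothetical history record with its dependent records."""
    kind: str
    due: bool  # True when the record's own setting says it may be archived
    children: list = field(default_factory=list)


def can_archive(record):
    """A record can go only if it is due and every dependent record can go too."""
    return record.due and all(can_archive(child) for child in record.children)


bounce = Record("Broadcast Interaction (Bounce)", due=False)  # kept forever
content = Record("Content", due=True)
communication = Record("Communication", due=True, children=[content, bounce])
pool = Record("Pool", due=True, children=[communication])

print(can_archive(content))        # no dependencies, so it can be archived
print(can_archive(communication))  # held back by the retained bounce
print(can_archive(pool))           # held back because its communication is held back
```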
History of Archive activity
The History view displays a summary of the archiving activity that has taken place within a given Area or Campaign. Firstly, it shows how many records have been marked as ready for archiving based on the settings (for example, archive after 6 months), and how many records have been held back due to constraints or journey selections. Secondly, it shows how many records have been moved to the archive database.
Records are marked according to the schedule in the upper part of the Diagram Settings. This happens on a recurring schedule of hourly (recommended), daily or weekly.
Records are moved to the archive database in batches of up to 1 million records (configurable in the FastStats service). This can only occur within the valid time periods defined in the lower part of the Diagram Settings. Within these periods the system will wait for 10 seconds to elapse with no jobs running, and will then move up to a million records into the archive database.
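The batching behaviour can be sketched as a loop that moves one batch per run and then picks another table. This is a minimal simulation, not the service's implementation: the function `move_marked_records` is hypothetical, choosing tables in rotation is an assumption (the order in which tables are picked is not stated), the wait for a slack period is left out, and `max_runs` stands in for the time window closing.

```python
from collections import deque


def move_marked_records(marked, batch_size=1_000_000, max_runs=None):
    """Simulate moving marked records, one batch from one table per run.

    marked: dict of table name -> number of marked records.
    max_runs: hypothetical stand-in for the end of the permitted period.
    Returns a list of (table, records_moved) tuples in the order they ran.
    """
    queue = deque(name for name, count in marked.items() if count > 0)
    remaining = dict(marked)
    moves = []
    while queue and (max_runs is None or len(moves) < max_runs):
        table = queue.popleft()
        moved = min(batch_size, remaining[table])
        remaining[table] -= moved
        moves.append((table, moved))
        if remaining[table] > 0:
            queue.append(table)  # come back to it after the other tables
    return moves


for table, moved in move_marked_records(
    {"Communications": 2_500_000, "Response Attribution": 400_000}
):
    print(f"{table}: moved {moved:,} records")
```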
Administration of Archiving
Archiving is on by default from Q2 2017 onwards for newly created systems. Otherwise it must be turned on using the FastStats Configurator:
- Webservice > Client Configuration > Features > Allow archiving
An archive database must also be created using the FastStats Configurator.
You must assign the PeopleStage Archiving role to individual users using FastStats > Users Explorer > right-click on User > Roles.
This controls both the display of the Diagram Settings options and the settings on individual Areas / Campaigns.
FastStats Service settings
The service allows the administrator to set limits within the archiving process, controlling both the time taken to complete each transaction and the size of the database logs. This has been shown to be crucial in mature systems with years of data that needs to be archived.
The following settings can be changed for the two parts of the process (marking and moving). The process is as follows:
The marking process repeats step [1] (Marking Batch Size) until condition [2] (Marking Max time) or [3] (Marking Max Records) is met. A relatively expensive operation is then undertaken to mark the records that are affected by constraints or journeys. The result is written to the Archive Control table.
- [1] Marking Batch Size - This is the maximum number of records that we ask SQL to return from a table that are ready for archiving (based on the user’s time parameters, set in the ‘Diagram Editor’). SQL will return a set of records ranging from 0 to this maximum.
- [2] Marking Max time (minutes) - This is the cut-off time to stop the initial marking before we start on the next stage.
- [3] Marking Max Records - This is the cut-off maximum number of records before we start on the next stage.
- [4] Archiving Batch Size - The maximum number of records from one table that can be selected in the process to move records to the archive database. The process will continue to run, picking a new table each time, until all records are moved or the scheduled time slot, set in the ‘Diagram Editor’, is exceeded.
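The way the three marking settings interact can be sketched as a loop with two cut-offs. This is an illustrative outline only: `run_marking`, `fetch_batch` and `make_source` are hypothetical names, stopping when a batch comes back empty is an assumption, and the later validation against constraints and journeys is not modelled.

```python
import time


def run_marking(fetch_batch, batch_size, max_minutes, max_records,
                clock=time.monotonic):
    """Repeat the batch fetch until the time or record cut-off is reached.

    fetch_batch(n) is a hypothetical callable returning up to n record IDs
    that are due for archiving.
    """
    deadline = clock() + max_minutes * 60
    marked = []
    while True:
        batch = fetch_batch(batch_size)   # Marking Batch Size
        marked.extend(batch)
        if not batch:                     # assumption: nothing left to mark
            break
        if clock() >= deadline:           # Marking Max time (minutes)
            break
        if len(marked) >= max_records:    # Marking Max Records
            break
    return marked


def make_source(total):
    """Build a fake data source holding `total` records due for archiving."""
    state = {"next": 0}

    def fetch(n):
        start = state["next"]
        end = min(start + n, total)
        state["next"] = end
        return list(range(start, end))

    return fetch


marked = run_marking(make_source(10_000), batch_size=1_000,
                     max_minutes=5, max_records=3_000)
print(len(marked))  # stops once the record cut-off is reached
```

Passing a fake `clock` makes the time cut-off easy to exercise without waiting, which is why it is a parameter in this sketch.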
Reference
Key words in server log search
- Marked*records
- Info (-1) Marked * records:*
- Info (-1) Validated*
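If the server log is exported to a text file, the same key words can be applied with a short script. The sketch below assumes that `*` behaves as a "match anything" wildcard, which may differ from the log viewer's own search, and the sample log lines are synthetic, built only from the key words above rather than copied from a real log.

```python
from fnmatch import fnmatchcase

PATTERNS = [
    "Marked*records",
    "Info (-1) Marked * records:*",
    "Info (-1) Validated*",
]


def matching_lines(lines, patterns=PATTERNS):
    """Return the lines that contain text matching any wildcard pattern."""
    return [
        line for line in lines
        if any(fnmatchcase(line, f"*{pattern}*") for pattern in patterns)
    ]


# Synthetic lines for illustration only; real log entries will look different.
sample_log = [
    "Info (-1) Marked 1000 records: example table",
    "Info (-1) Validated 950 example entries",
    "Info (-1) Some unrelated message",
]

for line in matching_lines(sample_log):
    print(line)
```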