
Redundancy and Fault Tolerance for DPE

The following are our recommendations for making a DPE installation redundant and fault tolerant.

DPE Server

  • Use a load balancer in fail-over configuration in front of two DPE Server machines.

  • One DPE Server is active (it receives all requests); the other is a hot standby.

  • On failure, the load balancer switches to the standby transparently for clients.
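
The active/standby selection described above can be sketched as follows. This is only an illustration of the fail-over idea, not the configuration of any particular load balancer; the host names `dpe-a` and `dpe-b` and the `is_healthy` callback are hypothetical.

```python
from typing import Callable, Optional, Sequence


def pick_backend(
    backends: Sequence[str],
    is_healthy: Callable[[str], bool],
) -> Optional[str]:
    """Return the first healthy backend in priority order, or None.

    The first entry is the active DPE Server, the second the hot standby.
    All traffic goes to the first healthy entry, so the standby only
    receives requests while the active server is down.
    """
    for backend in backends:
        if is_healthy(backend):
            return backend
    return None


# Hypothetical host names, for illustration only.
servers = ["dpe-a", "dpe-b"]
print(pick_backend(servers, lambda host: host != "dpe-a"))
```

Because the selection is re-evaluated per request (or per health-check interval), traffic returns to the active server once it is healthy again.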

Databases

  • Content databases are running on a fail-over database cluster.

  • One central DpeCoreDb is running on a fail-over database cluster.

  • DpeCoreDb is shared between the two DPE Servers. This preserves as much workflow and job state as possible on fail-over.

  • Alternatively, you could have an independent DpeCoreDb for each instance of DPE Server, but fail-over would be less seamless.

PAR Files

  • DigaSystem allows you to configure a redundant set of PAR file locations (in the Windows registry). This mechanism is also supported by DPE Server.

  • Use one central, shared set of PAR files for both DPE Servers.

  • Configure a redundant location for PAR files.

  • Synchronize the master files to the redundant location periodically, e.g. once a night.

  • Alternatively, you could store PAR files on a fault-tolerant, distributed file system, e.g. MooseFS (https://moosefs.com/).
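
The periodic synchronization of the master files to the redundant location could look roughly like the sketch below. It is a minimal one-way copy under stated assumptions: the function name `sync_par_files` and both directory arguments are hypothetical, and the script would be run by whatever scheduler you already use (for example once a night).

```python
import filecmp
import shutil
from pathlib import Path
from typing import List


def sync_par_files(master: Path, replica: Path) -> List[str]:
    """Copy new or changed files from master to replica.

    Returns the relative paths (POSIX style) of the files that were copied.
    Files that were deleted from master are NOT removed from replica.
    """
    copied: List[str] = []
    for src in sorted(master.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(master)
        dst = replica / rel
        # Skip files whose content is already identical in the replica.
        if dst.exists() and filecmp.cmp(src, dst, shallow=False):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves timestamps
        copied.append(rel.as_posix())
    return copied
```

The sketch deliberately never deletes anything in the redundant location, so an accidental deletion in the master set is not propagated by the nightly run.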

WorkflowSystem

DPE Processors

  • Processors are running in a farm and are redundant by design.

  • Have more than one processor of each type running.

  • Distribute processors of one type over more than one machine.

  • Processors should connect through the load balancer (i.e. sit in front of it, like any other client) and not be assigned to one of the DPE Servers directly.

WorkflowWorker

  • Option 1 (Standby): Configure more than one WorkflowWorker executing the same workflow types on a fail-over cluster (only one is active at the same time).

  • Option 2 (Concurrent): Configure more than one WorkflowWorker executing the same workflow types (both are active at the same time).

WorkflowScheduler

  • Option 1 (Standby): Configure more than one WorkflowScheduler for the same task on a fail-over cluster (only one is active at the same time).

  • Option 2 (Concurrent): Configure more than one WorkflowScheduler for the same task (both are active at the same time). 

Example: a workflow that hard-deletes entries at night.

For Option 2 we recommend using slightly different schedules, e.g. one at 2:00 a.m. and the other at 2:30 a.m.

WorkflowTableWatcher

  • Option 1 (Standby): Configure more than one WorkflowTableWatcher for the same task on a fail-over cluster (only one is active at the same time).

  • Challenge with Option 1: WorkflowTableWatcher writes a local memory file that "mirrors" the states of entries (Created, Updated, Deleted). Because this file is local, the standby instance does not see it after fail-over.

  • Option 2 (Concurrent): Configure more than one WorkflowTableWatcher for the same task on a fail-over cluster (both are active at the same time).

  • Challenge with Option 2: a workflow could be executed twice; avoid this through workflow naming.

WorkflowFolderWatcher

  • Option 1 (Standby): Configure more than one WorkflowFolderWatcher for the same task on a fail-over cluster (only one is active at the same time).

  • Challenge with Option 1: WorkflowFolderWatcher writes local memory files that "mirror" the states of files (Created, Updated, Deleted). Because these files are local, the standby instance does not see them after fail-over.

  • Option 2 (Concurrent): Configure more than one WorkflowFolderWatcher for the same task on a fail-over cluster (both are active at the same time).

  • Challenge with Option 2: a workflow could be executed twice; avoid this through workflow naming.
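
One way to read "avoid this by workflow naming" for concurrent watchers (WorkflowTableWatcher as well as WorkflowFolderWatcher) is to derive the workflow name deterministically from the triggering event, so that both watchers produce the same name and the second start can be recognized as a duplicate. The sketch below illustrates only that idea. The name format, the `revision` argument and the in-memory `WorkflowRegistry` are hypothetical stand-ins; how DPE itself treats duplicate workflow names is not described here and should be checked in your installation.

```python
class WorkflowRegistry:
    """Hypothetical stand-in for a uniqueness check on workflow names."""

    def __init__(self) -> None:
        self._names = set()

    def start_once(self, name: str) -> bool:
        """Return True if the name is new, False if it was already started."""
        if name in self._names:
            return False
        self._names.add(name)
        return True


def workflow_name(task: str, item_id: str, event: str, revision: str) -> str:
    """Build a name that depends only on the event, not on the watcher."""
    return f"{task}:{item_id}:{event}:{revision}"


registry = WorkflowRegistry()
# Two concurrent watchers observe the same change and compute the same name.
name_a = workflow_name("archive", "4711", "Updated", "2024-05-01T02:00:00")
name_b = workflow_name("archive", "4711", "Updated", "2024-05-01T02:00:00")
print(registry.start_once(name_a), registry.start_once(name_b))
```

The important design point is that nothing watcher-specific (host name, process ID, local timestamp) goes into the name; otherwise the two instances would produce different names for the same event.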

Clients

  • All clients of DPE must be able to cope with short periods of unavailability of DPE Services.

  • A simple way to support this is client-side retries.

  • DPE components already support retries.
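
For custom clients, a client-side retry with exponential backoff could be sketched as below. This is a generic pattern, not the retry logic built into DPE components; the function name `call_with_retries`, the default values and the choice of exception types are assumptions made for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 8.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation(), retrying on connection errors and timeouts.

    The wait doubles after each failed attempt (0.5 s, 1 s, 2 s, ...),
    capped at max_delay. The last failure is re-raised to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (ConnectionError, TimeoutError):
            if attempt == attempts - 1:
                raise
            sleep(min(max_delay, base_delay * 2 ** attempt))
    raise RuntimeError("unreachable")
```

The total retry window should be longer than the time the load balancer needs to detect a failure and switch to the standby server; otherwise clients give up before fail-over completes.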
