General system configuration of Replicant#
You can optionally configure several Replicant system parameters. These parameters control various aspects of Replicant’s behavior during replication, such as tracing, logging, and the information dashboard.
This feature has been available since version 20.07.16.5.
To configure Replicant’s system parameters:
- Specify the general configuration parameters in a YAML configuration file.
- Run Replicant with the --general option and provide the full path to the general configuration file.
System configuration parameters#
liveness-monitor#
Controls liveness checks of Replicant. This allows you to configure how Replicant stops and resumes replication in different situations.
- enable {true|false}: Enables liveness monitoring.
- inactive-timeout-ms: Specifies the replication inactivity time in milliseconds. If the liveness monitor detects no replication activity in this time period, Replicant stops and resumes replication. Default: 900_000 (15 minutes).
- snapshot-extractor-inactive-timeout-ms: Specifies the inactivity timeout for snapshot extraction. If the liveness monitor detects no snapshot extraction activity in this time period, Replicant stops and resumes replication. Default: The value of inactive-timeout-ms.
- snapshot-applier-inactive-timeout-ms: Specifies the inactivity timeout for the snapshot Applier. If the liveness monitor detects no snapshot Applier activity in this time period, Replicant stops and resumes replication. Default: The value of inactive-timeout-ms.
- realtime-extractor-inactive-timeout-ms: Specifies the inactivity timeout for realtime extraction. If the liveness monitor detects no realtime extraction activity in this time period, Replicant stops and resumes replication. Default: The value of inactive-timeout-ms.
- realtime-applier-inactive-timeout-ms: Specifies the inactivity timeout for the realtime Applier. If the liveness monitor detects no realtime Applier activity in this time period, Replicant stops and resumes replication. Default: The value of inactive-timeout-ms.
- min-free-memory-threshold-percent: If free memory drops below this threshold, Replicant stops and resumes operation.
- liveness-check-interval-ms: Specifies the time interval between two successive liveness checks in milliseconds.
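As an illustration, the following sketch sets a general inactivity timeout and overrides it for the two realtime stages. It assumes that every key above nests under liveness-monitor, the way enable and inactive-timeout-ms do in the sample configuration. The values are placeholders rather than recommendations.

```yaml
liveness-monitor:
  enable: true
  # General timeout: 15 minutes without replication activity.
  inactive-timeout-ms: 900000
  # Assumed placement: stage-specific overrides. Stages that are not
  # listed fall back to inactive-timeout-ms.
  realtime-extractor-inactive-timeout-ms: 300000
  realtime-applier-inactive-timeout-ms: 300000
  # Values borrowed from the sample configuration.
  min-free-memory-threshold-percent: 5
  liveness-check-interval-ms: 60000
```

Leaving out the snapshot timeouts keeps snapshot stages on the general 15-minute limit while realtime stages use the shorter one.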
schema-validation supports the following parameters:
- enable {true|false}: Enables schema validation. Replicant validates the target schema against the source schema.
- error-types: Specifies the error types in an array. The following error types are supported: ALL, ERRORS, WARNINGS, COL_CNT_MISMATCH, COL_TYPE_MISMATCH. Default: [ALL].
- warning-as-error {true|false}: Whether to consider warnings as errors. Default: false.
- dump-schema-mapping {true|false}: Controls whether Replicant dumps the mapping between source and target schemas.
Databases for which permission-validation shows expected behavior:
- Microsoft SQL Server
- MySQL
- Oracle
- Snowflake
permission-validation supports the following parameter:
- enable {true|false}: Enables permission validation. Default: true for Databricks and Snowflake targets, false otherwise.

Fencing modes:
- In DDL fencing, Replicant embeds the validation token into the table name.
- In DML fencing, Replicant embeds the validation token into a row value and keeps it in the respective fencing table.
- Specifying NONE disables fencing for the respective storage.

fencing supports the following parameters:
- enable-metadata-fence: Enables and specifies metadata fencing. Supported values: DDL, DML, NONE. Default: DDL for JDBC metadata databases.
- enable-dst-fence: Enables and specifies fencing on the destination database. Supported values: DDL, DML, NONE. Default: DDL for JDBC databases.
- enable-dst-query-fence [v20.02.01.13]: Enables and specifies query fencing on the destination database. Supported values: DDL, DML, NONE. Default: DDL for JDBC databases.
- heartbeat-interval-ms: Specifies the time interval between successive heartbeat signals in milliseconds. Default: 30_000.

trace-level accepts the following values:
- DEBUG
- INFO
- ERROR
- WARNING

dashboard-dump-file supports the following parameters:
- enable {true|false}: Default: false.
- storage {FILE|SQLITE}: Default: FILE.
- location: Directory location for the dashboard dump file. The dump file is periodically updated.
- format {TEXT|JSON}: Specifies the file format for the dashboard dump file. Default: TEXT.
- interval-ms: Specifies the time interval for updating the dashboard dump file in milliseconds. Default: 1000.

metadata supports the following parameter:
- reuse-metadata-tables {true|false}: Whether to reuse metadata tables instead of creating new ones. Default: false.
schema-validation [v20.09.14.8]#
Enables and configures schema validation. Replicant displays schema validation errors in the information dashboard.
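A minimal sketch of a schema-validation block follows. It assumes that error-types, warning-as-error, and dump-schema-mapping nest under schema-validation alongside enable, and that the error-types array uses YAML flow syntax, matching the documented default of [ALL]. The chosen values are only illustrative.

```yaml
schema-validation:
  enable: true
  # Report only column count and column type mismatches
  # instead of the default [ALL].
  error-types: [COL_CNT_MISMATCH, COL_TYPE_MISMATCH]
  # Treat warnings as errors (the default is false).
  warning-as-error: true
  # Dump the mapping between source and target schemas.
  dump-schema-mapping: true
```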
permission-validation#
Validates whether the user possesses appropriate permissions to read table data in a particular database. This parameter works in snapshot and full mode replication.
permission-validation shows expected behavior for a specific set of supported databases.
fencing [v20.10.07.3]#
This parameter allows you to prevent multiple instances of Replicant from executing simultaneously. Consider the situation where the same replication gets resumed twice, leading to two replication processes trying to perform the same job. Fencing ensures that the older replication process terminates as soon as a new replication process starts.
Replicant achieves this functionality by using validation tokens. A validation token consists of a monotonically increasing counter. Each replication obtains this counter at the start. Before each action on the respective storage, the replication job performs validation against this counter. Thus a validation token acts as a fence around the metadata and destination storage.
How fencing works depends on the mode configured for each storage: DDL, DML, or NONE.
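The sketch below shows one way the fencing parameters might be combined. It assumes that the four fencing parameters nest under a top-level fencing key, which this page does not show explicitly. The mix of modes is chosen only to show all three values.

```yaml
# Assumed structure: fencing parameters nested under "fencing".
fencing:
  # Token embedded in a table name in the metadata storage.
  enable-metadata-fence: DDL
  # Token kept as a row value in a fencing table on the destination.
  enable-dst-fence: DML
  # No query fencing on the destination.
  enable-dst-query-fence: NONE
  # Heartbeat every 30 seconds (the documented default).
  heartbeat-interval-ms: 30000
```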
data-dir [v20.12.04.4]#
Specifies the directory to store temporary files related to bulk loading.
If you don’t specify trace-dir, data-dir also stores the trace.log file.
Default: data/.
trace-dir [v20.12.04.4]#
Specifies the directory location for the trace.log file.
If you set data-dir, Replicant creates trace-dir inside the data-dir directory.
Default: data-dir/default.
error-trace-dir#
Specifies the directory location for the error-trace.log file. If you set data-dir, Replicant creates error-trace-dir inside the data-dir directory.
Default: data-dir/default.
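As a rough sketch, the three directory parameters could be set together as shown below. The paths are hypothetical, and the top-level placement of these keys is an assumption. How an explicit trace-dir interacts with data-dir is worth verifying on your installation.

```yaml
# Hypothetical paths; replace them with directories on your host.
data-dir: /var/replicant/data
trace-dir: /var/replicant/trace
error-trace-dir: /var/replicant/error-trace
```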
trace-time-zone [v20.12.04.8]#
The trace.log file contains timestamps in a specific timezone. This parameter allows you to specify the timezone to use.
For example, with trace-time-zone: Asia/Kolkata, trace.log contains timestamps such as 2021-01-07 19:08:24.530 IST.
Default: UTC. For example, 2021-01-07 13:40:23.462.
trace-level [v20.12.04.12]#
Specifies the level of logback tracing, chosen from the supported trace levels.
Default: DEBUG.
archive-trace [v20.12.04.12]#
{true|false}.
Archives trace logs on a daily basis into timestamped files.
Default: true.
purge-trace-before-days [v20.12.04.12]#
Specifies the number of days to keep trace.log archives. Older trace logs are automatically deleted.
Default: 0.
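The trace settings described so far can be combined as in the sketch below. The top-level placement of archive-trace and purge-trace-before-days follows the sample configuration, and the trace-time-zone line follows the example above. Placing trace-level at the top level is an assumption.

```yaml
# Write trace.log timestamps in Indian Standard Time.
trace-time-zone: Asia/Kolkata
# Log at INFO instead of the default DEBUG.
trace-level: INFO
# Archive trace logs daily and keep the archives for 30 days.
archive-trace: true
purge-trace-before-days: 30
```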
sensitive-info-trace-dir [v20.12.04.16]#
{true|false}.
If true, Replicant logs sensitive trace messages into a separate file in the sensitive_trace_directory directory.
Default: true.
dashboard-dump-file [v21.04.06.1]#
Replicant can dump the contents of the information dashboard to a file. This parameter allows you to configure that behavior.
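A minimal sketch of a dashboard dump configuration follows. It assumes that the enable, storage, location, format, and interval-ms parameters nest under dashboard-dump-file. The location path is hypothetical.

```yaml
# Assumed structure: parameters nested under "dashboard-dump-file".
dashboard-dump-file:
  enable: true
  storage: FILE
  # Hypothetical directory for the dump file.
  location: /var/replicant/dashboard
  # JSON instead of the default TEXT.
  format: JSON
  # Refresh the dump file every second (the documented default).
  interval-ms: 1000
```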
license-path [v21.05.04.3]#
Specifies the location of the license file.
db-connection-tracing [v21.05.04.6]#
{true|false}.
Replicant can collect diagnostics on database connection usage. This parameter allows you to enable a stack trace dump during the diagnostics.
Default: false.
metadata#
This parameter allows you to reuse metadata tables.
report-dir#
Controls where Replicant stores reports, such as the permission validation report.
log-pattern#
Allows you to change the log format, for example "%d{HH:mm:ss.SSS} [%t] [replicant] %-5level %logger{35} - %msg %n". For more information, see the logback documentation.
enable-console-logger#
Enables logging in the console. Normally, these logs go to the trace.log file.
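As a sketch, the two logging parameters might appear in the general configuration file as follows. The pattern is the one quoted above. Treating enable-console-logger as a boolean and placing both keys at the top level are assumptions.

```yaml
# Double quotes keep the braces and percent signs as a plain string.
log-pattern: "%d{HH:mm:ss.SSS} [%t] [replicant] %-5level %logger{35} - %msg %n"
# Assumed boolean: also send log output to the console.
enable-console-logger: true
```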
cleanup-dst#
{true|false}.
Whether to clean up target metadata tables.
ntp-server#
Allows you to specify the time server for license validation.
Default: time.google.com.
Sample configuration#
You can find a sample Replicant system configuration file inside the conf/general directory of your Arcion self-hosted download.
The following shows a sample configuration:

    liveness-monitor:
      enable: true
      inactive-timeout-ms: 900000
      min-free-memory-threshold-percent: 5
      liveness-check-interval-ms: 60000

    schema-validation:
      enable: false

    permission-validation:
      enable: false

    archive-trace: true
    purge-trace-before-days: 30

    fetch-schema:
      skip-tables-on-failure: true

    metadata:
      reuse-metadata-tables: true

Run Replicant with your configuration#
After configuring the system parameters, run Replicant with the --general option and give it the full path to your configuration file. For example:

    ./bin/replicant delta-snapshot conf/conn/oracle.yaml conf/conn/singlestore.yaml \
      --general conf/general/general.yaml \
      --extractor conf/src/oracle.yaml \
      --applier conf/dst/singlestore.yaml \
      --replace-existing --overwrite --id repl1 --resume