Extractions

The extraction process consists of retrieving a consistent set of data from a source into an archive (H2 or database schema), while guaranteeing the referential integrity of the database. All of the extractions defined in a request are stored in the same archive and are based on the same data model. The referenced data to retrieve is defined in the extraction package.

The extraction, based on the source data model, retrieves all of the dependent data so that the archive holds a consistent set of data. Filters are applied during the extraction to restrict the rows selected, so that not every row of a table is retrieved.

The results of the extraction are kept for each extraction instance, unless the instance is deleted. The extraction results can also be injected as many times as necessary, as long as the target database is compatible, meaning that it contains the same table names and structures as the extraction tables stored in the archive.
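The compatibility rule can be illustrated with a short Python sketch. This is not DOT Extract code: the schema representation (a dictionary mapping each table name to a list of column name and type pairs) and the function name is_compatible are assumptions made for the example, as is the choice to ignore extra tables that exist only in the target.

```python
def is_compatible(archive_schema, target_schema):
    """Return True if every archived table exists in the target with the same structure.

    Both arguments use a hypothetical layout:
    {table_name: [(column_name, column_type), ...]}.
    """
    for table, columns in archive_schema.items():
        # A missing table yields None, which never equals a column list.
        if target_schema.get(table) != columns:
            return False
    return True


archive = {"ORDERS": [("ID", "INTEGER"), ("CUSTOMER_ID", "INTEGER")]}

same_structure = {
    "ORDERS": [("ID", "INTEGER"), ("CUSTOMER_ID", "INTEGER")],
    "AUDIT_LOG": [("ID", "INTEGER")],  # assumption: extra target tables are ignored
}
changed_structure = {"ORDERS": [("ID", "INTEGER"), ("CUSTOMER_ID", "VARCHAR")]}

print(is_compatible(archive, same_structure))     # True
print(is_compatible(archive, changed_structure))  # False
```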

Each extraction request contains extraction branches, each of which contains extractions. The extraction request is made of one or more extraction definitions. The initial extraction is the first request made to the database to get the initial data from one table; this data is at level 0. The referenced or dependent data linked to this first set of data constitutes the next level, and each subsequent set of fetched data adds a new level. Because each level of extraction is represented by a table, filters can be applied to levels.
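One plausible reading of the level numbering is a breadth-first walk over the links between tables. The Python sketch below illustrates that idea only: the table names, the LINKS dictionary and the assign_levels function are invented for the example, and the product may number levels differently (for instance, by giving the same table several levels when it is reached through several paths).

```python
from collections import deque

# Hypothetical link graph: each table maps to the tables linked to it
# (referenced or dependent). All names are invented for illustration.
LINKS = {
    "ORDERS": ["CUSTOMERS", "ORDER_LINES"],
    "CUSTOMERS": ["COUNTRIES"],
    "ORDER_LINES": ["PRODUCTS"],
    "PRODUCTS": [],
    "COUNTRIES": [],
}


def assign_levels(initial_table, links):
    """Return {table: level}, with the initial table at level 0."""
    levels = {initial_table: 0}
    queue = deque([initial_table])
    while queue:
        table = queue.popleft()
        for linked in links.get(table, []):
            if linked not in levels:  # the first visit fixes the level
                levels[linked] = levels[table] + 1
                queue.append(linked)
    return levels


levels = assign_levels("ORDERS", LINKS)
for table, level in levels.items():
    print(level, table)
```

In this sketch, ORDERS is at level 0, CUSTOMERS and ORDER_LINES are at level 1, and COUNTRIES and PRODUCTS are at level 2.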

Extractions

Extractions are accessed and managed in the Extractions view.

Extraction modes

The extraction process gets a consistent set of data from a source and stores that data in an archive. The data to retrieve is defined in the extraction package. Using the provided source data model, the extraction retrieves the data of all tables linked to the initial table, either directly or through intermediate tables. A filter can be defined in the extraction package to select the rows to extract, so that not every row of the initial table is retrieved.
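The effect of a filter on the initial table, and of following a link to keep the result consistent, can be sketched with Python and an in-memory SQLite database. The tables, columns and filter condition are invented for the example, and the queries only illustrate the principle; they are not the statements DOT Extract generates.

```python
import sqlite3

# Hypothetical source database with one referential integrity constraint:
# orders.customer_id references customers.id.
src = sqlite3.connect(":memory:")
src.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1, 50.0), (11, 2, 500.0), (12, 2, 700.0);
""")

# Filter on the initial table: only a subset of its rows is selected.
orders = src.execute(
    "SELECT id, customer_id, amount FROM orders WHERE amount > ? ORDER BY id",
    (100,),
).fetchall()

# Retrieve only the referenced rows that the selected orders point to,
# so the extracted set stays consistent without copying every customer.
customer_ids = sorted({row[1] for row in orders})
placeholders = ",".join("?" * len(customer_ids))
customers = src.execute(
    f"SELECT id, name FROM customers WHERE id IN ({placeholders}) ORDER BY id",
    customer_ids,
).fetchall()

print(orders)     # [(11, 2, 500.0), (12, 2, 700.0)]
print(customers)  # [(2, 'Bob')]
```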

Data linked to extracted data can be referenced or dependent. To handle both cases and validate the data model with respect to referential integrity constraints, DOT Extract offers three extraction modes: Referenced Data, All Limited, and All.

Note
Once the extraction mode is selected for an extraction request, it stays the same for all of the following extraction processes.
Referenced Data
With this mode of extraction, the parent data is extracted in accordance with the declared SQL referential integrity constraints. The extraction retrieves the data selected by the filter and the data that is referenced by it.
In this mode, only referenced tables are extracted.
All Limited
With this mode of extraction, both dependent and referenced tables are extracted, starting from the initial table.
From a referenced table, only further referenced tables are extracted, whereas from a dependent table, both referenced and dependent tables are extracted. This type of extraction makes it possible to exclude data dependencies during the extraction.
All

With this mode of extraction, the parent and the child data are extracted in accordance with the declared SQL referential integrity constraints. The extraction retrieves the data selected by the filter, the data that is referenced by it, and the data that is dependent on it.

In this mode, referenced and dependent tables are extracted.

Important!

Use this extraction mode with caution, as the extraction process tends to fetch all of the data in the database.
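The difference between the three modes can be sketched as a walk over foreign-key links. The Python example below is an interpretation of the descriptions in this section, not the product's algorithm: the table names, the mode identifiers ("referenced", "all_limited", "all") and the function tables_to_extract are assumptions made for the illustration.

```python
def tables_to_extract(initial, foreign_keys, mode):
    """Return the set of tables reached from `initial` under the given mode.

    `foreign_keys` is a list of hypothetical (child, parent) pairs, meaning
    that `child` references `parent`.
    """
    parents, children = {}, {}
    for child, parent in foreign_keys:
        parents.setdefault(child, set()).add(parent)
        children.setdefault(parent, set()).add(child)

    # Each state is (table, may_follow_dependents).
    start = (initial, mode in ("all", "all_limited"))
    seen, stack = set(), [start]
    while stack:
        table, follow_dependents = stack.pop()
        if (table, follow_dependents) in seen:
            continue
        seen.add((table, follow_dependents))
        for parent in parents.get(table, ()):
            # Only in "all" mode does a referenced table pull in its dependents.
            stack.append((parent, mode == "all"))
        if follow_dependents:
            for child in children.get(table, ()):
                stack.append((child, True))
    return {table for table, _ in seen}


FOREIGN_KEYS = [
    ("ORDERS", "CUSTOMERS"),
    ("ORDER_LINES", "ORDERS"),
    ("ORDER_LINES", "PRODUCTS"),
    ("INVOICES", "CUSTOMERS"),
]

for mode in ("referenced", "all_limited", "all"):
    print(mode, sorted(tables_to_extract("ORDERS", FOREIGN_KEYS, mode)))
```

With ORDERS as the initial table, the sketch reaches two tables in the referenced mode, four in the all limited mode (INVOICES is left out because it is only a dependent of the referenced table CUSTOMERS), and all five in the all mode, which illustrates why that mode tends to reach the whole database.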

Manage extractions

Create an extraction

Follow the steps below to create a new extraction.

Step 1   Click the Extractions tab to open the Extractions view, then click the Add button.

Step 2   Define the main properties required for the new extraction.

Important!

All fields are mandatory.

Name
Enter a unique Name for the new extraction.
Extraction Type
Select the extraction mode to use from the drop-down list.
Reference
For more information about extraction modes, refer to Extraction modes.
Archive password
Define a password to access the H2 database where the extraction is stored.
Data Source
Select a data source for the extraction in the drop-down list.
Reference
For more information about data sources, refer to Manage data source connections.
Description
Add a textual description of the extraction project.
Archiving Data Source
Select the data source to use as the archive that stores the extraction results. The drop-down list is populated with the previously created data sources that have been set as archive.

Click Next.

Step 3   Define the entry tables required for the new extraction. Tick the checkbox corresponding to the table(s) that need to be included in the extraction.

Note
If the extraction project is based on a data source that has several schemas, the list of available tables also displays the schema each table comes from.

Click Next.

Step 4   Define the extraction branches required for the new extraction. Click the Add button to add a new line.

Table
Select a Table in the drop-down list. The list of tables is related to the chosen data source.
Filter
Select a Filter in the drop-down list. The list of filters is related to the chosen table in the data source. Selecting a filter is optional: a table can be extracted in an extraction project without a filter.

Step 5   Click DONE to confirm and close the extraction creation pages.

Result   The new extraction is created and appears in the list of extractions.

Note

It is also possible to create an extraction project by duplicating an existing one and then making changes to the copy. To do so, click the Duplicate button. Edit the name of the copied extraction to avoid duplicate names in the list, then click Done.

Edit an extraction

To edit an extraction, click the View icon to view the details of the extraction project, then click the Edit icon. The Edit page opens and enables you to edit the Name, Extraction Type, Data Source and Archive password fields.

To refresh statistics before an extraction is executed, and so obtain consistent data in the extraction results, select the Refresh statistics option.

You can also add one or several tables from different schemas to the extraction project by clicking the New Branch icon. Select the schema(s) and table(s) in the Table Selection window, then click the Add button. Once the new table is added, it is displayed with its corresponding schema.

Click the Edit icon on a table to add a filter. Click the Exclude icon to prevent dependencies from being extracted from the master table with this filter.

For each table, you can display the list of its dependent tables or the list of its referenced tables by clicking the corresponding icon.

It is also possible to enable the child extraction mode with the Enable icon, or to disable it with the Disable icon.

Note

Make sure you click the Save icon to save the changes.

Execute an extraction

To execute an extraction, click the Extract button. A dialog opens to confirm that the extraction is launched. You can track the status and the results of the extraction in the Extraction Results view.

If you created an extraction project with variable filter values, an Execution Parameters dialog opens when you click Extract, so you can define the value of these filter parameters. Once the parameters are set for the extraction, click Done to launch the extraction.
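A filter with variable values behaves much like a parameterized query whose values are supplied at execution time. The Python sketch below shows the idea with an in-memory SQLite database. The table, the filter text and the parameter names (country, min_amount) are hypothetical, and the way DOT Extract stores and binds filter parameters is not described here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, country TEXT, amount REAL);
    INSERT INTO orders VALUES (1, 'FR', 20.0), (2, 'FR', 250.0), (3, 'DE', 300.0);
""")

# Hypothetical filter definition with two named parameters. The values are
# not part of the definition; they are supplied each time the filter is run.
FILTER_SQL = (
    "SELECT id FROM orders "
    "WHERE country = :country AND amount >= :min_amount ORDER BY id"
)


def run_filter(connection, parameters):
    """Execute the filter with the values chosen for this execution."""
    return [row[0] for row in connection.execute(FILTER_SQL, parameters)]


fr_large = run_filter(conn, {"country": "FR", "min_amount": 100})
de_all = run_filter(conn, {"country": "DE", "min_amount": 0})
print(fr_large)  # [2]
print(de_all)    # [3]
```

The same filter definition yields different row sets depending on the values entered, which is why the values are requested each time the extraction is launched.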

Use the Estimate option before you execute an extraction to get a realistic estimate of its results. The estimation works the same way as the actual extraction: it is based on the data source content and the filters, you can refresh the statistics beforehand, and you can supply values if the extraction uses parameters. The estimation is designed to be as accurate as possible and displays the expected extraction result. The only difference is that the results are not saved in an H2 file or in a data source set as archive.
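The principle of an estimate (apply the same filters, but write nothing to an archive) can be sketched as a row count. The Python example below uses an in-memory SQLite database. The function estimate_rows, the table and the filter are assumptions made for the illustration, and the way the product actually computes its estimate is not documented here.

```python
import sqlite3


def estimate_rows(connection, table, where_clause="", parameters=()):
    """Count the rows a filter would select, without copying them anywhere.

    `table` and `where_clause` are interpolated into the SQL text, so they
    must come from trusted configuration, never from end-user input.
    """
    sql = f"SELECT COUNT(*) FROM {table}"
    if where_clause:
        sql += f" WHERE {where_clause}"
    return connection.execute(sql, parameters).fetchone()[0]


source = sqlite3.connect(":memory:")
source.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO orders VALUES (1, 50.0), (2, 500.0), (3, 700.0);
""")

estimated = estimate_rows(source, "orders", "amount > ?", (100,))
total = estimate_rows(source, "orders")
print(estimated, "of", total, "rows would be extracted")
```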

Reference

For more information about extraction results, refer to Extraction results.

Delete an extraction

To delete an extraction, click the View icon to open the Edit page of the extraction, then click the Delete icon.

A confirmation dialog opens. Click Delete to confirm, or Cancel to keep the extraction.

Warning!

Deleted extractions cannot be accessed or recovered.