...
This document is intended for:
SPRNGY Administrators
SPRNGY Developers
SPRNGY Architects
This document walks through creating and editing an Import Model through the Sprngy UI.
...
This two-step form collects information such as the entity description and the business owner, data owner, and IT owner of the entity. Once done, click Submit.
Field Name | Description | Data Type | Is Required? | Validation |
---|---|---|---|---|
Module Name | Choose from the list of modules | String | Yes | One of the seven on the website |
Entity Name | Choose from the list of entities | String | Yes | Only appears after the module name is selected |
Entity Description | The description of the respective entity | String | Yes | |
Field Name | Description | Data Type | Is Required? | Validation |
---|---|---|---|---|
Entity Business Owner Name | First name + last name of the Entity Business Owner | String | Yes | |
Entity Business Owner Email | The Business Owner email ID | String | Yes | Should have @ |
Entity Data Owner Name | The Data Owner name of the entity | String | Yes | |
Entity Data Owner Email | The Data Owner email of the entity | String | Yes | Should have @ |
Entity It Owner Name | The IT Owner name of the entity | String | Yes | |
Entity It Owner Email | The IT Owner email of the entity | String | Yes | Should have @ |
Vendor Support Name | The vendor support name of the entity. Defaults to support@sprngy.com, non-editable | String | Auto-Fill | |
Vendor Support Email | The vendor support email of the entity. Defaults to support@sprngy.com, non-editable | String | Auto-Fill | |
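The validation rules in the table above (required fields, emails that should contain "@") can be sketched in a few lines of Python. This is only an illustration: the field keys and the `validate_owner_details` function are hypothetical names, not part of the Sprngy UI or any Sprngy API.

```python
# Hypothetical field keys; the real form may name them differently.
REQUIRED_FIELDS = [
    "entity_business_owner_name",
    "entity_business_owner_email",
    "entity_data_owner_name",
    "entity_data_owner_email",
    "entity_it_owner_name",
    "entity_it_owner_email",
]


def validate_owner_details(form: dict) -> list:
    """Return a list of validation errors (empty when the form is valid)."""
    errors = []
    for field in REQUIRED_FIELDS:
        value = str(form.get(field) or "").strip()
        if not value:
            # Every owner field is marked "Yes" under Is Required?
            errors.append(f"{field} is required")
        elif field.endswith("_email") and "@" not in value:
            # The only email rule stated in the table is "should have @".
            errors.append(f"{field} should have @")
    return errors


example = {
    "entity_business_owner_name": "Jane Doe",
    "entity_business_owner_email": "jane.doe@example.com",
    "entity_data_owner_name": "John Roe",
    "entity_data_owner_email": "john.roe.example.com",  # missing @ on purpose
    "entity_it_owner_name": "",  # empty on purpose
    "entity_it_owner_email": "it.owner@example.com",
}
print(validate_owner_details(example))
```

The vendor support fields are left out of the check because the table describes them as auto-filled and non-editable.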
After filling out these details, click “Back” to return to “Step 1”, or click “Submit” if all details are complete.
...
Clicking “Submit” takes you to the form page, where you fill out all of the Query Details.
Step 1:
...
# | Import Model Column | Sample Data | What it means | Import model description |
---|---|---|---|---|
1 | Rule Name | QUERY_IMPORT_PROCESSOR/ IN_MEMORY_IMPORT_PROCESSOR_NAME | Processor/rule name to run | Import Model description - |
2 | Process Rule Use Ind | ENDOFDAY/INTRADAY | Process Rule Use indicator for the given entity and rule name. For example, ENDOFDAY runs the import pipeline in end-of-day mode and INTRADAY runs it in intraday mode. | Import Model description - |
3 | Connection URL | jdbc:mysql://localhost/schema_name?serverTimezone=UTC | JDBC connection URL of the source from which data needs to be imported, e.g. MySQL, Oracle, MySQL Server, Azure MySQL, etc. | Import Model description - |
4 | SQL Query | (select t1.* from (select t.*, ROW_NUMBER() OVER() row_id from (select * from schema_name.table_name where 1=1) t) t1) t2 | SQL query to run on the given JDBC connection to import data | Import Model description - |
5 | Split by Column Name | row_id | The unique integer column on which the data is partitioned in order to optimise the query | Import Model description - |
6 | Num Mappers | 1/4/8/11 etc. | The number of partitions required for the given driver table to make the import more efficient | Import Model description - |
7 | Target Directory | /SprngyPlatform/modulename/Entityname/SDL/Land | The HDFS directory where imported data should be overwritten | Import Model description - |
8 | Driver Table | select count(*) as COUNT from Databasename.Tablename | The main driver table of the JDBC connection, used to fetch the count of rows and columns | Import Model description - |
9 | Memory Table Name | Tablename | After the data is read from the JDBC connection into Spark, this is the name of the Spark in-memory table | Import Model description - |
10 | Coalesce Ind | YES/NO | Whether or not to coalesce data when writing it to HDFS | Import Model description - |
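To illustrate how Split by Column Name and Num Mappers could work together, the sketch below splits a `row_id` range into one contiguous range per mapper. It assumes `row_id` runs from 1 to the COUNT returned by the Driver Table query, which matches the `ROW_NUMBER()` sample above. The functions `partition_ranges` and `partition_predicates` are hypothetical helpers written for this example. Spark's JDBC reader exposes comparable `partitionColumn`, `lowerBound`, `upperBound`, and `numPartitions` options, but whether Sprngy maps these fields onto them, and how it computes the ranges, is an assumption here.

```python
def partition_ranges(lower: int, upper: int, num_mappers: int) -> list:
    """Split the inclusive range [lower, upper] into at most num_mappers contiguous ranges."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if upper < lower:
        return []  # nothing to import
    total = upper - lower + 1
    parts = min(num_mappers, total)  # never create empty partitions
    base, extra = divmod(total, parts)
    ranges = []
    start = lower
    for i in range(parts):
        # The first `extra` partitions take one additional row each.
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def partition_predicates(split_column: str, lower: int, upper: int, num_mappers: int) -> list:
    """Build one WHERE-clause fragment per partition (illustrative only)."""
    return [
        f"{split_column} BETWEEN {lo} AND {hi}"
        for lo, hi in partition_ranges(lower, upper, num_mappers)
    ]


row_count = 10  # stands in for the COUNT returned by the Driver Table query
for predicate in partition_predicates("row_id", 1, row_count, 4):
    print(predicate)
```

With 10 rows and 4 mappers this yields the ranges 1 to 3, 4 to 6, 7 to 8, and 9 to 10, so each mapper reads a similar share of the table.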
Step 2:
...
# | Import Model Column | Sample Data | What it means | Import model description |
---|---|---|---|---|
1 | Step Number | 1(2,3,4) | The step number to assign based on the other information provided. | Import Model description - |
2 | Step Sequence | 1/2/3/4 | The step sequence / sub-steps. | Import Model description - Step Sequence |
3 | Fetch Size | 100000/250000/500000 | The number of rows to fetch at a time from the JDBC connection. | Import Model description - |
4 | On Failure | STOP/CONTINUE | Whether to stop the pipeline or continue if the current step fails. | Import Model description - |
5 | Data Write Mode | overwrite/append | Whether to append to or overwrite the data in the provided target directory when writing. | Import Model description - |
6 | Retry Count | Any number (count) | The number of times to retry if a failure occurs while reading data from the RDBMS using Spark. (For QUERY_IMPORT_PROCESSOR) | Import Model Description - Load In Memory |
7 | Retry Interval | Any number (seconds) | The interval to wait between retries, up to the retry count, if a failure occurs while reading data from the RDBMS using Spark. (For QUERY_IMPORT_PROCESSOR) | Import Model Description - Load In Memory |
8 | Load In Memory | YES/NO | Whether to load/cache data in memory when reading it. | Import Model Description - Load In Memory |
9 | Bound Query | select 1 as LOWERBOUND, count(*) as UPPERBOUND from <schema_name>.<table_name> | The query that returns the lower and upper bounds used to partition the read from the RDBMS and optimize the read operation in Spark. | Import Model Description - Load In Memory |
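The interaction between Retry Count, Retry Interval, and On Failure can be pictured with a small retry loop. This is a minimal sketch under stated assumptions, not Sprngy's implementation: `run_with_retry` and `flaky_read` are hypothetical names, and it assumes that Retry Count means additional attempts after the first one and that CONTINUE lets the pipeline carry on without a result.

```python
import time


def run_with_retry(read_step, retry_count, retry_interval, on_failure="STOP", sleep=time.sleep):
    """Call read_step(); on error, retry up to retry_count times.

    Waits retry_interval seconds between attempts. If every attempt fails,
    raises when on_failure is STOP and returns None when it is CONTINUE.
    """
    if retry_count < 0:
        raise ValueError("retry_count must be zero or greater")
    attempts = retry_count + 1  # assumption: first attempt plus retries
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return read_step()
        except Exception as exc:  # broad on purpose for this sketch
            last_error = exc
            if attempt < attempts:
                sleep(retry_interval)
    if on_failure == "STOP":
        raise RuntimeError(f"step failed after {attempts} attempts") from last_error
    return None  # CONTINUE: skip this step and let the pipeline go on


calls = {"n": 0}


def flaky_read():
    """Simulated read that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("simulated JDBC failure")
    return "rows"


print(run_with_retry(flaky_read, retry_count=3, retry_interval=0))
```

Passing `sleep` as a parameter keeps the sketch easy to test without real waiting; in practice the interval would be the number of seconds entered in the form.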
After filling out the details for a single query, click the “Submit” button. You are redirected to the grid page, which shows the query that was just added. From that page, you can add new queries or edit, view, or delete the existing queries.
...