Sprngy Platform Import Model Documentation

1 Disclaimer
2 Prerequisites
3 Import Model
- 3.1 Create/Edit an Import Model
  - 3.1.1 Step 1
  - 3.1.2 Step 2
  - 3.1.3 Step 3

Disclaimer

Sprngy Platform Documentation Guide

Release 2.0.0

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

This document is intended for:

SPRNGY Administrators
SPRNGY Developers
SPRNGY Architects

This document walks through creating and editing an import model in the Sprngy UI.

Prerequisites

An application has been created in the Admin UI.

Import Model

An import model is used to import a dataset from a source location into the Sprngy datalake.

Create/Edit an Import Model

The image below shows the Home screen of the UI. From the side menu, open the 'Meta Data Configuration' option, then select the 'Import Model' option.

Step 1

After you click the Import Model menu option, the screen below is displayed.

To create a new import model, select the required Module name and Entity name from the drop-down lists and click 'Submit'.

Alternatively, use the Upload box to upload an import model in CSV format for the selected entity.

Submitting checks whether an import model already exists for the selected entity. If one was previously created, or even only started, its data is loaded into the form. If you are creating the import model for the first time, blank fields are provided for input.

Step 2

After submission, the screen below is rendered. Here you can provide information about the import model for the given entity.

This two-step form collects the entity description and the details of the business owner, data owner, and IT owner responsible for the entity. Once done, click 'Submit'.

Field Name	Description	Data Type	is Required?	Validation
Module Name	Choose from the list of modules	String	Yes	One of the seven on the website
Entity Name	Choose from the list of entities	String	Yes	Only appears after the module name is selected
Entity Description	The description of the respective entity	String	Yes

Field Name	Description	Data Type	is Required?	Validation
Entity Business Owner Name	First name and last name of the entity business owner	String	Yes
Entity Business Owner Email	Email address of the entity business owner	String	Yes	Must contain @
Entity Data Owner Name	Name of the entity data owner	String	Yes
Entity Data Owner Email	Email address of the entity data owner	String	Yes	Must contain @
Entity It Owner Name	Name of the entity IT owner	String	Yes
Entity It Owner Email	Email address of the entity IT owner	String	Yes	Must contain @
Vendor Support Name	The vendor support name of the entity. Defaults to support@sprngy.com; non-editable	String	Auto-Fill
Vendor Support Email	The vendor support email of the entity. Defaults to support@sprngy.com; non-editable	String	Auto-Fill
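
The validation rules in the two tables above can be illustrated with a minimal Python sketch. It is not the platform's own validation code: the function name validate_owner_form and the idea of passing the form as a dictionary keyed by field name are assumptions made for illustration. Only the rules themselves (required fields, and an @ in each email) come from the tables.

```python
# Fields the tables mark as required (Module Name and Entity Name are
# chosen on the previous screen and are left out of this sketch).
REQUIRED_FIELDS = [
    "Entity Description",
    "Entity Business Owner Name",
    "Entity Business Owner Email",
    "Entity Data Owner Name",
    "Entity Data Owner Email",
    "Entity It Owner Name",
    "Entity It Owner Email",
]


def validate_owner_form(form):
    """Hypothetical helper: return a list of problems; an empty list means the form passes."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = str(form.get(field, "")).strip()
        if not value:
            problems.append(field + " is required")
        elif field.endswith("Email") and "@" not in value:
            # The only email rule stated in the table is the presence of "@".
            problems.append(field + " must contain @")
    return problems
```

Calling validate_owner_form with a dictionary that has a blank owner name or an email without @ returns one message per failing field, which mirrors what a user would need to correct before clicking 'Submit'.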

After these details have been filled out, click “Back” to return to “Step 1”, or click “Submit” if you have finished entering all details.

Step 3

When you click “Submit”, you are taken to the form page where you fill out the Query Details.

Step 1:

	Import Model Column	Sample Data	What it means	Import model description
1	Rule Name	QUERY_IMPORT_PROCESSOR/ IN_MEMORY_IMPORT_PROCESSOR_NAME	Processor/Rule name to run	Import Model description - Rule Name
2	Process Rule Use Ind	ENDOFDAY/INTRADAY	Process rule use indicator for the given entity and rule name. For example, ENDOFDAY runs the import pipeline in end-of-day mode, and INTRADAY runs it in intraday mode.	Import Model description - Process Rule Use Ind
3	Connection URL	jdbc:mysql://localhost/schema_name?serverTimezone=UTC	JDBC connection URL of the source from which data is imported, e.g. MySQL, Oracle, MySQL Server, Azure MySQL etc.	Import Model description - Connection URL
4	SQL Query	(select t1.* from (select t.*, ROW_NUMBER() OVER() row_id from (select * from schema_name.table_name where 1=1) t) t1) t2	SQL query to run on the given JDBC connection to import data	Import Model description - SQL Query
5	Split by Column Name	row_id	The name of the unique integer column on which the read is partitioned in order to optimize the query	Import Model description - Split by Column Name
6	Num Mappers	1/4/8/11 etc.	The number of partitions to use for the given driver table to make the import more efficient	Import Model description - Num Mappers
7	Target Directory	/SprngyPlatform/modulename/Entityname/SDL/Land	The HDFS directory to which the imported data is written	Import Model description - Target Directory
8	Driver Table	select count(*) as COUNT from Databasename.Tablename	The main driver table of the JDBC connection, from which the count of rows and columns is fetched	Import Model description - Driver Table
9	Memory Table Name	Tablename	The name of the Spark in-memory table that holds the data after it is read from the JDBC connection	Import Model description - Memory Table Name
10	Coalesce Ind	YES/NO	Whether to coalesce data or not when writing data to HDFS	Import Model description - Coalesce Ind
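
To show how the SQL Query, Split by Column Name, and Num Mappers columns relate, here is a minimal Python sketch. It does not describe how the Sprngy platform or Spark computes partitions internally. The helper names (wrap_with_row_id, partition_ranges, partition_predicates) and the even split of the row range are assumptions for illustration only. The lower and upper bounds are passed in as plain integers.

```python
def wrap_with_row_id(schema, table):
    """Build a query in the row_id pattern of the SQL Query sample (hypothetical helper)."""
    return (
        "(select t1.* from (select t.*, ROW_NUMBER() OVER() row_id "
        "from (select * from " + schema + "." + table + " where 1=1) t) t1) t2"
    )


def partition_ranges(lower, upper, num_mappers):
    """Split the inclusive range [lower, upper] into at most num_mappers contiguous ranges."""
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if upper < lower:
        raise ValueError("upper must not be smaller than lower")
    total = upper - lower + 1
    parts = min(num_mappers, total)      # never create empty partitions
    base, extra = divmod(total, parts)   # the first `extra` ranges get one more row
    ranges = []
    start = lower
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def partition_predicates(column, ranges):
    """Turn each range into a WHERE-clause fragment on the split column."""
    return [
        column + " >= " + str(lo) + " and " + column + " <= " + str(hi)
        for lo, hi in ranges
    ]
```

For example, partition_ranges(1, 10, 4) yields four ranges that cover row_id values 1 through 10 without overlap, and partition_predicates("row_id", ...) turns them into one filter per mapper. This is why the split column needs to be a unique integer: each row then falls into exactly one range.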

Step 2:

	Import Model Column	Sample Data	What it means	Import model description
1	Step Number	1(2,3,4)	The step number to assign based on the other information provided.	Import Model description - Step Number
2	Step Sequence	1/2/3/4	The step sequence / sub steps.	Import Model description - Step Sequence
3	Fetch Size	100000/250000/500000	The number of rows to fetch at a time from the JDBC connection	Import Model description - Fetch Size
4	On Failure	STOP/CONTINUE	If the current step fails then whether to stop the pipeline or continue	Import Model description - On Failure
5	Data Write Mode	overwrite/append	When writing data, whether to append or overwrite the data to the provided target directory.	Import Model description - Data Write Mode
6	Retry Count	Any number (count)	The number of times to retry if reading data from the RDBMS using Spark fails. (For QUERY_IMPORT_PROCESSOR)	Import model description - Retry Count
7	Retry Interval	Any number (seconds)	The time to wait between retries if reading data from the RDBMS using Spark fails; used together with Retry Count. (For QUERY_IMPORT_PROCESSOR)	Import model description - Retry Interval
8	Load In Memory	YES/NO	When reading data, whether to load/cache data in memory.	Import model description - Load In Memory
9	Bound Query	select 1 as LOWERBOUND, count(*) as UPPERBOUND from <schema_name>.<table_name>	The query that returns the lower and upper bounds used to partition the read from the RDBMS and optimize the read operation in Spark.	Import model description - Bound Query
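
The interaction between Step Number, Step Sequence, Retry Count, Retry Interval, and On Failure can be sketched in Python as follows. This is an illustration under stated assumptions, not the platform's pipeline code. It assumes that steps run in order of step number and then step sequence, and that Retry Count means additional attempts after the first failure. The function names, the dictionary keys, and the read callable are hypothetical.

```python
import time


def run_with_retry(read, retry_count, retry_interval, sleep=time.sleep):
    """Call read(); on failure, retry up to retry_count more times, waiting retry_interval seconds."""
    attempts = 0
    while True:
        try:
            return read()
        except Exception:
            if attempts >= retry_count:
                raise                      # retries exhausted: surface the error
            attempts += 1
            sleep(retry_interval)


def run_steps(steps, sleep=time.sleep):
    """Run steps in (step_number, step_sequence) order, honoring each step's on_failure value."""
    results = []
    ordered = sorted(steps, key=lambda s: (s["step_number"], s["step_sequence"]))
    for step in ordered:
        key = (step["step_number"], step["step_sequence"])
        try:
            run_with_retry(step["read"], step["retry_count"], step["retry_interval"], sleep)
            results.append(key + ("OK",))
        except Exception:
            results.append(key + ("FAILED",))
            if step["on_failure"] == "STOP":
                break                      # STOP halts the remaining steps
            # CONTINUE moves on to the next step
    return results
```

The sleep parameter is injectable only so that the sketch can be exercised without real waiting. With On Failure set to STOP, a step that still fails after its retries prevents later steps from running, while CONTINUE records the failure and proceeds.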

After you have filled out the details for a single query, click the “Submit” button. You are redirected to the grid page, which shows the query that was just added. From that page, you can add new queries or edit, view, or delete existing queries.

After you have created all your queries, click “Final Submit”. This creates your import model in the Sprngy datalake and returns you to the Home screen.
