
Disclaimer

Big Analytixs Platform Azure Engineering Guide

Release 1.0.0.1

Copyright © Big Analytixs. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy, reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited. The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please report them to us in writing.

This document is intended for:

  • BAPCORE Administrators

  • BAPCORE Developers

  • BAPCORE Architects

This document is a walkthrough for creating, editing, deleting, and viewing Import Models through the Big Analytixs UI.

Pre-requisites:

The tools required to run or develop the UI are as follows:

  1. Visual Studio Code (Developers only)

  2. Python 3.8.10

  3. Flask 1.1.4

  4. Node 14.17.6

  5. npm 7.24.1

  6. R 3.6.3

  7. Big Analytixs Libraries

Import Model

An Import model is used to push data from a relational database to HDFS.

There are three aspects to this model:

  1. Create/Edit Import Model - This flow is for creating an Import Model manually (if it does not exist) or editing it (if it does).

  2. Bulk Upload Import Model - Big Analytixs also lets you upload a whole model from a CSV file instead of creating it manually; see Big Analytixs Platform Bulk Uploads UI Documentation.

  3. Run Import Workload - After creating an Import Model, you can run the import workload to have the data processed for the respective Algorithm.

Creating an Import Model:

Home Screen:

The image below is the Home screen of the UI.

Click the three-line icon to open the menu, then click “Create Import Model”.

Screen0:

Clicking the option above routes you to Screen 0, where you select the Module Name and Entity Name for which you want to create the Import Model.

For example, here we choose ‘BAPRAM’ as the module name and ‘Customer’ as the entity name.

Form 1:

Clicking “Create Import Model” navigates you to the Form1 page. Form1 collects the preEntity details in two steps. Step 1 contains the following fields:

| Field Name | Description | Data Type | Is Required? | Validation |
| --- | --- | --- | --- | --- |
| Module Name | Choose from the list of modules | String | Yes | One of the seven modules on the website |
| Entity Name | Choose from the list of entities | String | Yes | Appears only after the Module Name is selected |
| Entity Description | The description of the respective entity | String | Yes | |

Once you click “Next”, you will be taken to Step 2 to fill out the remaining preEntity details:

| Field Name | Description | Data Type | Is Required? | Validation |
| --- | --- | --- | --- | --- |
| Entity Business Owner Name | First name + last name of the Entity Business Owner | String | Yes | |
| Entity Business Owner Email | The Business Owner email ID | String | Yes | Should contain @ |
| Entity Data Owner Name | The Data Owner name of the entity | String | Yes | |
| Entity Data Owner Email | The Data Owner email of the entity | String | Yes | Should contain @ |
| Entity IT Owner Name | The IT Owner name of the entity | String | Yes | |
| Entity IT Owner Email | The IT Owner email of the entity | String | Yes | Should contain @ |
| Vendor Support Name | The vendor support name of the entity. Defaults (non-editable) to support@biganalytixs.com | String | Auto-fill | |
| Vendor Support Email | The vendor support email of the entity. Defaults (non-editable) to support@biganalytixs.com | String | Auto-fill | |

After these details have been filled out, you can click “Back” to return to Step 1, or, if you are done filling out your preEntity details, click “Submit”.
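The “should contain @” rule on the owner-email fields above is a very loose check; a minimal sketch of such a validator is shown below (`is_valid_owner_email` is an illustrative helper, not a BAPCORE API, and is slightly stricter than the UI rule in that it also rejects whitespace and multiple @ signs):

```python
import re

def is_valid_owner_email(value: str) -> bool:
    """Illustrative check for the owner-email fields: exactly one '@'
    with non-empty, whitespace-free text on both sides."""
    return bool(re.fullmatch(r"[^@\s]+@[^@\s]+", value))
```

A real deployment would likely delegate to a proper email-validation library rather than a regex like this.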

Form2:

If you hit “Submit”, you will be taken to the Form2 page to fill out the Query Details.

Step 1:

| # | Import Model Column | Sample Data | What it means | Import Model description |
| --- | --- | --- | --- | --- |
| 1 | Rule Name | QUERY_IMPORT_PROCESSOR / IN_MEMORY_IMPORT_PROCESSOR_NAME | Processor/rule name to run | Import Model description - Rule Name |
| 2 | Process Rule Use Ind | ENDOFDAY/INTRADAY | Process Rule Use indicator for the given entity and rule name: ENDOFDAY runs the import pipeline in end-of-day mode; INTRADAY runs it in intraday mode | Import Model description - Process Rule Use Ind |
| 3 | Connection URL | jdbc:mysql://localhost/schema_name?serverTimezone=UTC | The JDBC connection URL from which data is imported (e.g. MySQL, Oracle, SQL Server, Azure MySQL) | Import Model description - Connection URL |
| 4 | SQL Query | (select t1.* from (select t.*, ROW_NUMBER() OVER() row_id from (select * from schema_name.table_name where 1=1) t) t1) t2 | The SQL query to run on the given JDBC connection to import data | Import Model description - SQL Query |
| 5 | Split by Column Name | row_id | The unique integer column on which partitioning takes place, to optimize the query | Import Model description - Split by Column Name |
| 6 | Num Mappers | 1/4/8/11 etc. | The number of partitions for the given driver table, to make the import more efficient | Import Model description - Num Mappers |
| 7 | Target Directory | /BigAnalytixsPlatform/modulename/Entityname/SDL/Land | The HDFS directory to which imported data is written | Import Model description - Target Directory |
| 8 | Driver Table | select count(*) as COUNT from Databasename.Tablename | The main driver table of the JDBC connection, from which the row and column counts are fetched | Import Model description - Driver Table |
| 9 | Memory Table Name | Tablename | The name of the Spark in-memory table after data is read from the JDBC connection into Spark | Import Model description - Memory Table Name |
| 10 | Coalesce Ind | YES/NO | Whether to coalesce the data when writing it to HDFS | Import Model description - Coalesce Ind |
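The Step 1 fields above map naturally onto the options of a Spark JDBC read. The sketch below shows that mapping under stated assumptions: `build_jdbc_options` is an illustrative helper (not a BAPCORE function), the field values are the sample data from the table, and no live Spark session is used:

```python
# Illustrative mapping from Import Model Step 1 fields to Spark JDBC
# read options. The option names (url, dbtable, partitionColumn,
# numPartitions) are standard Spark JDBC options; build_jdbc_options
# itself is a hypothetical helper for this sketch.

def build_jdbc_options(model: dict) -> dict:
    """Translate Import Model fields into spark.read JDBC options."""
    return {
        "url": model["connection_url"],
        # A parenthesized query with an alias can stand in for a table.
        "dbtable": model["sql_query"],
        "partitionColumn": model["split_by_column_name"],
        "numPartitions": str(model["num_mappers"]),
    }

model = {
    "connection_url": "jdbc:mysql://localhost/schema_name?serverTimezone=UTC",
    "sql_query": "(select * from schema_name.table_name where 1=1) t",
    "split_by_column_name": "row_id",
    "num_mappers": 4,
}

options = build_jdbc_options(model)
```

With a real SparkSession this would become something like `spark.read.format("jdbc").options(**options).load()`, followed by a write to the Target Directory in the chosen Data Write Mode; Spark's column-based partitioning additionally needs lower and upper bounds, which is where the Bound Query field in Step 2 comes in.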

Step 2:

| # | Import Model Column | Sample Data | What it means | Import Model description |
| --- | --- | --- | --- | --- |
| 1 | Step Number | 1 (2, 3, 4) | The step number to assign, based on the other information provided | Import Model description - Step Number |
| 2 | Step Sequence | 1/2/3/4 | The step sequence / sub-steps | Import Model description - Step Sequence |
| 3 | Fetch Size | 100000/250000/500000 | The number of rows to fetch at a time from the JDBC connection | Import Model description - Fetch Size |
| 4 | On Failure | STOP/CONTINUE | Whether to stop the pipeline or continue if the current step fails | Import Model description - On Failure |
| 5 | Data Write Mode | overwrite/append | Whether to append to or overwrite the data in the given target directory when writing | Import Model description - Data Write Mode |
| 6 | Retry Count | Any number (count) | The number of times to retry if reading from the RDBMS with Spark fails (for QUERY_IMPORT_PROCESSOR) | Import Model description - Retry Count |
| 7 | Retry Interval | Any number (seconds) | The interval, in seconds, between retries, used together with Retry Count (for QUERY_IMPORT_PROCESSOR) | Import Model description - Retry Interval |
| 8 | Load In Memory | YES/NO | Whether to load/cache the data in memory when reading | Import Model description - Load In Memory |
| 9 | Bound Query | select 1 as LOWERBOUND, count(*) as UPPERBOUND from <schema_name>.<table_name> | A query that returns the lower and upper bounds used to partition and optimize the Spark read | Import Model description - Bound Query |
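The Retry Count / Retry Interval semantics described above can be sketched as a simple retry loop. This is illustrative only, assuming `read_fn` stands in for whatever callable performs the actual Spark JDBC read:

```python
import time

def read_with_retry(read_fn, retry_count: int, retry_interval: float):
    """Attempt the RDBMS read, retrying up to retry_count extra times
    and sleeping retry_interval seconds between attempts, mirroring
    the QUERY_IMPORT_PROCESSOR behaviour described in the table."""
    attempts = retry_count + 1  # the initial try plus the retries
    for attempt in range(attempts):
        try:
            return read_fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # retries exhausted; surface the failure
            time.sleep(retry_interval)
```

Whether the surrounding pipeline stops or continues after the retries are exhausted is governed by the On Failure field.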

After you are done filling out the details for a single query, click the “Submit” button; this redirects you to the grid page showing the query you just added. From that page, you can add new queries or view, edit, or delete the existing ones.

After you have created all your queries, click “Final Submit”. This creates your import model in your local HDFS and routes you back to the Home screen.

Editing Import Model

Home Screen:

Click the three-line icon to open the menu, then click “Edit Import Model”.

Edit Import Selection Screen:

In the screen shown below, fill out the Module Name and Entity Name.

Clicking the Submit button fetches the import model from HDFS and navigates to Form1.

Form1:

If an import model exists, you will be navigated to Form1 with the import model preEntity details pre-populated and editable as shown below:

Query Grid Screen:

Clicking the “Submit” button on the Form1 page saves your preEntity details locally and routes you to the Query Grid screen, which lists the SQL queries that exist in the import model.

There are multiple functionalities that are available on this screen:

  1. Add New Query - Navigates you to the Form2 page (just as when adding another query in the create flow), where you can create another query.

  2. View Icon - Navigates you to a screen that displays the details of the selected query.

  3. Edit Icon - Navigates you to the add-query page with pre-filled values; change the required fields and save the updated query.

  4. Delete Icon - Deletes the selected query.

After making all changes to the import model, click the “Final Submit” button to update the existing import model in HDFS. It is IMPORTANT to remember that your changes are not permanently saved until you click “FINAL SUBMIT”.

Note that only one import model can exist for a given Module and Entity; to add another, you must first delete or edit the existing import model.
