Page Comparison

This document details steps to setup new applications using Sprngy Admin UI, defining models and running workloads.

Table of Contents

minLevel	1
maxLevel	3

Demo Case

outline	false
style	none
type	list
printable	false

Use Case Category: Finance

Example

Demo Application 1: Assess the Risk of a Loan

Objective - Determine whether a loan will be approved or not by the issuing bank

Overview

From source:

“The bank wants to improve their services. For instance, the bank managers have only vague idea, who is a good client (whom to offer some additional services) and who is a bad client (whom to watch carefully to minimize the bank loses). Fortunately, the bank stores data about their clients, the accounts (transactions within several months), the loans already granted, the credit cards issued The bank managers hope to improve their understanding of customers and seek specific actions to improve services. A mere application of a discovery tool will not be convincing for them.

The data about the clients and their accounts consist of following relations:

relation account - each record describes static characteristics of an account,
relation client - each record describes characteristics of a client,
relation disposition - each record relates together a client with an account i.e. this relation describes the rights of clients to operate accounts,
relation permanent order - each record describes characteristics of a payment order,
relation transaction - each record describes one transaction on an account,
relation loan - each record describes a loan granted for a given account,
relation credit card - each record describes a credit card issued to an account,
relation demographic data - each record describes demographic characteristics of a district.

Each account has both static characteristics (e.g. date of creation, address of the branch) given in relation "account" and dynamic characteristics (e.g. payments debited or credited, balances) given in relations "permanent order" and "transaction". Relation "client" describes characteristics of persons who can manipulate with the accounts. One client can have more accounts, more clients can manipulate with single account; clients and accounts are related together in relation "disposition". Relations "loan" and "credit card" describe some services which the bank offers to its clients; more credit cards can be issued to an account, at most one loan can be granted for an account. Relation "demographic data" gives some publicly available information about the districts (e.g. the unemployment rate); additional information about the clients can be deduced from this.”

Classification:

The file structure is going to look like this:

Download the files:

Districtdistrict

Accountaccount

Clientclient

Loanloan

Orderorder

Transactiontransaction

Classifying the Application:

Based on the data technology used and the business use, the application is classified as below:

Quality of Data

Data Lake

Info
Data Lake is a centralized repository to store large amount of raw data

Data Lakehouse / Data Warehouse

Info
Data Lakehouse combines concepts of Data Lake and Data Warehouse providing large storage capacity combined with data management features

Curated

Info
Curating data involves creating or preparing data to make it usable for business analysis.

✔

Correlated

Info
Correlated data means running Algorithms to discover patterns and relationships within the data.

✔

Normalized

Info
Normalizing data involves structuring the data to enable rapid access.

✔

Analyze

Info
Data Analysis involves identifying useful information to support decision-making, often using visualizations

✔

Modelling

Info
This involves building statistical models and testing those.

Step 1: Setting up the Application

Now that the business use and classification of the application is established, the application can be created using the UI. In AdminUI, set up the application by going to the Set-up Application Configuration tab, select Create New, and filling out the file structure. Make sure to indicate District as the parent of Account and Client, Account as the parent of Loan, Order, Transaction, Display, and Display as the parent of Card.

We will begin first with the District Dataset:

Step 2: Meta Model, Ingest Model, and Workloads

We are going to use the Bulk Upload Meta feature to create our meta model for the District dataset. In the AdminUI menu , under the Meta Model dropdown, select the Bulk Upload Meta option. Select the Loan Application module and District entity, and import the meta model CSV file into the bulk upload. Once you hit submit, this will automatically generate the meta model for the District dataset.

Download the csv files for meta and ingest model:

District Ingest Model

Loan Meta Model

Loan Ingest Model

Image Removed

Next, similar to meta model you can bulk upload ingest model in AdminUI. Or create ingest model by selecting processors. You can refer to this page to understand what each of the processors do.

Image Removed

Once we submit the Ingest Model, we can Image Added

Next, similar to meta model you can bulk upload ingest model in AdminUI. Or create ingest model by selecting options.

Image Added

Once we submit the Ingest Model, we can run the workloads under the Workload Management/Run Workloads page.

Once you confirm in HDFS that the SDL-FDL workload ran correctly, run the FDL-BDL workload next. This will apply the processors we selected in the Ingest Model for the FDL to BDL layer. You can see if the workload ran correctly by going to the LOANAPPLICATION/District/BDL/Fact directory in HDFS

Data Upload

Upload data using 'Data Upload' option. Select the Loan Application module and District entity, and CSV file into the data upload. Once you hit submit, this will automatically upload the district dataset to SDL/Land folder in datalake.

SpringyBi

Visualizations

After adding the datasets, you can select any one of them to bring up the Charts page. Here, you can pick which type of visualization you would like to create and the parameters for that visualization. Below are three examples of charts you can make in Superset SprngyBI that show the correlations between the various districts, regional statistics, loan amounts and durations, and the status of the loans.

Demo

Use Case Category: Sports

Example