...

Dataset file: Inflation_dataset.csv

NOTE: Before creating the application, make sure you have created a Google Drive folder on your local machine through the Google Drive connector. More information on how to create it can be found on this page: https://biganalytixs.atlassian.net/l/cp/8P16hNvz . That way, when the data file is uploaded to Google Drive, it appears directly in the Google Drive folder on your local machine. This dataset can be hosted in Google Drive. To set up the Google Drive connector, please contact Sprngy Support (support@sprngy.com).

Classifying the Application:

...

Now that the business use and classification of the application are established, the application can be created using the UI. In AdminUI, set up the application by going to the Configuration screen, selecting Create New, and filling out the file structure. Since our data has just a one-layer file structure, we will have just one entity in it.

...

Running the analytical workload for the INFLATION application loads the data directly from your local machine to the BDL fact datalake.

...

Step 6: Importing Database and Dataset into SprngyBI and creating a dashboard for the charts

Once you are in SprngyBI, select the Datasets option from the Data dropdown in the top menu. From there, select the add Dataset option. Set the Database to Apache Hive, select your database from Schema, and select the table you would like to add. SprngyBI only allows you to add one table at a time, but you can add as many tables as you want, one by one.

(see this page for further reference)

 

...

A Datetime column must be added to the CSV dataset file to enable time series visualization in SprngyBI. For the Inflation application, we created a "year_new" column, which is a copy of the "year1" column with each value written in the form "yyyy-01-01" so that it can serve as a datetime column. After adding it, run the analytical workloads again.
Initially, the datatype of that column can be String. Later, in SprngyBI, click the edit symbol beside your dataset name and, under CALCULATED COLUMNS, enter the SQL query "from_unixtime(unix_timestamp(year_new, 'yyyy-MM-dd'))", select DATETIME as the datatype, and click Save. A Datetime column is required to plot a time series graph; the full procedure is described in the document at /wiki/spaces/BIGANALYTI/pages/1147181.
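The "year_new" column can be added by hand in a spreadsheet, but a short script may be easier for larger files. The following is a minimal sketch using only the Python standard library. It assumes the CSV has a "year1" column holding four-digit years; the "value" column and the sample rows are hypothetical and only serve to illustrate the transformation.

```python
import csv
import io


def add_year_new(csv_text: str) -> str:
    """Return CSV text with a "year_new" column derived from "year1"."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or []) + ["year_new"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Copy the year and pin it to January 1 so it matches 'yyyy-MM-dd'.
        row["year_new"] = f"{int(row['year1']):04d}-01-01"
        writer.writerow(row)
    return out.getvalue()


# Hypothetical two-row sample; the real dataset has its own columns and rows.
sample = "year1,value\n1990,1.5\n1991,2.5\n"
print(add_year_new(sample))
```

To apply this to a real file, read its contents into a string, pass it to add_year_new, and write the result back out before uploading the file and rerunning the workloads.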

...

 

Demo Application: Predicting Wealth Inequalities between Black, White and Hispanic groups.

Objective

  • Predict wealth inequalities between Black, White and Hispanic groups for the upcoming years. Using the analytical model, we train a machine learning model on the given dataset, whose target column is mean_net_worth.

...

Now that the business use and classification of the application are established, the application can be created using the UI. In AdminUI, set up the application by going to the Configuration tab, selecting Create New, and filling out the file structure. Since our data has just a one-layer file structure, we will have just one entity in it.

...

Next, create the ingest model in AdminUI. The first part is defining which processors to use from SDL to FDL. These are the SDL-FDL processors we select for our Teams data. You can refer to this page to understand what each of the processors does.

...

Step 4: Running Workloads

...

First, run the SDL-FDL workload. This will apply the processors we selected in the Ingest Model for the SDL to FDL layer. You can see if the workload ran correctly by checking the FDL/Stage directory in HDFS or the ‘Workload Management’ page.

...

Once you confirm that the SDL-FDL workload ran correctly, run the FDL-BDL workload next. This will apply the transformations selected in the Ingest Model for the FDL to BDL layer. You can see if the workload ran correctly by checking the BDL/Fact directory in HDFS or the ‘Workload Management’ page.

Step 5: Creating Analytical Model

The data is taken from the BDL fact datalake and used to train the ML model, which then makes predictions on the target column using the analytical model.

...

Running the analytical workload for the INFLATION application loads the data directly from your local machine to the BDL fact datalake.

...

Step 6: Importing Database and Dataset into SprngyBI and creating a dashboard for the charts

Once you are in SprngyBI, select the Datasets option from the Data dropdown in the top menu. From there, select the add Dataset option. Set the Database to Apache Hive, select your database from Schema, and select the table you would like to add. SprngyBI only allows you to add one table at a time, but you can add as many tables as you want, one by one.

(see this page for further reference)

 

...

A Datetime column must be added to the CSV dataset file to enable time series visualization in SprngyBI. For the Wealth Inequalities application, we created a "year_new" column, which is a copy of the "year1" column with each value written in the form "yyyy-01-01" so that it can serve as a datetime column. After adding it, run the analytical workloads again.
Initially, the datatype of that column can be String. Later, in SprngyBI, click the edit symbol beside your dataset name and, under CALCULATED COLUMNS, enter the SQL query "from_unixtime(unix_timestamp(year_new, 'yyyy-MM-dd'))", select DATETIME as the datatype, and click Save. A Datetime column is required to plot a time series graph; the full procedure is described in the FAQs document.
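Because the calculated column parses "year_new" with the pattern 'yyyy-MM-dd', it can help to check the CSV values before uploading. The sketch below is only an approximate local check written in Python: it uses datetime.strptime with "%Y-%m-%d" as a stand-in for the Hive pattern, and the function name and sample values are hypothetical.

```python
from datetime import datetime


def invalid_dates(values):
    """Return the values that do not parse as year-month-day dates."""
    bad = []
    for value in values:
        try:
            datetime.strptime(value, "%Y-%m-%d")
        except ValueError:
            # The value does not follow the year-month-day layout.
            bad.append(value)
    return bad


# Hypothetical sample: the last value is a bare year and is reported.
print(invalid_dates(["1990-01-01", "1991-01-01", "1992"]))
```

An empty result suggests every value follows the expected layout. Python's parser and Hive's may differ on edge cases, so treat this as a quick sanity check rather than a guarantee.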

...

 

...

Copyright © Springy Corporation. All rights reserved. Not to be reproduced or distributed without express written consent.