What is the Meta Model?

The meta model is the format used to capture detailed information about the attributes of the data sets to be processed.

How do I build the Meta Model?

Use the admin UI to create it; please refer to its documentation. Alternatively, it can be built using the Ingest Model UI or in CSV format; please consult the product documentation for the sequencing.

What type of information can be captured in the Meta Model?

The following types of information can be captured in the meta model on a per-attribute basis for a given entity (an illustrative sketch follows the list):

  1. Data set Description - The module, entity, and description of the data set.

  2. Attribute Description - Name, Description, Format, and Length. This can be captured at both the data model and BI model levels.

  3. Attribute Security - Masking requirement and masking information.

  4. Attribute Transforms - The transform rule to be applied on inbound and outbound processing. By default, trim is applied to all categorical attributes.

  5. Attribute Filtering - Whether the attribute should be part of the data filtering process and, if so, which rule applies.

  6. Attribute IQM - Whether the attribute should be utilized in IQM match. For more information on IQM, please refer to IQM FAQs and BAPCore documentation.

  7. Attribute EDA - Whether the attribute should be utilized in EDA analysis.

  8. Attribute Calendar - Whether the attribute should be utilized in calendar join operations.

  9. Attribute Validation - Whether the attribute should be part of the data validation process and, if so, which rule applies.

  10. Attribute Quality - Whether the attribute should be part of the data quality process and, if so, which rule applies.
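
As a rough illustration of the categories above, here is a minimal sketch of one attribute's metadata expressed as a Python dictionary. The field names and values are purely illustrative assumptions and do not reflect the actual meta model schema; please refer to the product documentation for the real column names.

```python
# Purely illustrative sketch of per-attribute meta model information.
# All field names and values here are assumptions, not the real schema.
customer_name_meta = {
    # 1-2. Data set and attribute description
    "module": "BAPRAM",                 # module name taken from examples in this FAQ
    "entity": "ServicePoint",           # entity name taken from examples in this FAQ
    "attribute_name": "customer_name",  # hypothetical attribute
    "format": "character",
    "length": 100,
    # 3. Security
    "masking_required": True,
    # 4. Transforms (trim is applied to all categorical attributes by default)
    "transform_rule": "trim(upper(customer_name)) as customer_name",
    # 5, 9, 10. Filtering, validation, and quality participation
    "filter_rule": None,
    "validation_rule": "customer_name is not null",
    # 6-8. IQM, EDA, and calendar participation flags
    "use_in_iqm": False,
    "use_in_eda": True,
    "use_in_calendar_join": False,
}
```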

What data types are supported in the meta model?

Currently, character, integer, and numeric are supported.

Should I specify date columns as the character data type as well?

Yes.

How do I validate the meta model against the data set?

Developers can use dataops.applymodeldatacompatibility to check the compatibility of the meta model with the data.
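
A minimal sketch of invoking this check is shown below. Only the name dataops.applymodeldatacompatibility comes from this FAQ; the import path, parameter names, and return value are assumptions, so please check the dataops/BAPCore reference for the actual signature.

```python
# Sketch only: the exact import path, parameters, and return value of
# applymodeldatacompatibility are assumptions; only the function name
# is documented in this FAQ.
import dataops

compatibility_report = dataops.applymodeldatacompatibility(
    meta_model,  # the meta model definition to check (assumed parameter)
    data_set,    # the data set to check it against (assumed parameter)
)
print(compatibility_report)
```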

Where are the Meta Models stored?

Meta models are stored within the Meta folder of the core data lake (BAPCORE/Meta/BDL/Fact).

How are Meta Models stored and retrieved?

Meta models are stored within the Meta folder of the core data lake using the BAPCore API. They can be retrieved using the BAPCore API as well.

How is the life cycle of the Meta Model managed?

The most current version of the Meta Model is stored in the Current folder (BAPCORE/Meta/BDL/Fact/Current), and archived versions are stored in the Archive folder (BAPCORE/Meta/BDL/Fact/Archive). When a new version is pushed, the current version is moved to the Archive folder.

Can I know what Meta Model was used for a particular batch process?

Every time a meta model is used for any data movement, its definition is stored in the Track folder (BAPCORE/Meta/BDL/Fact/Track) along with the batch_id of the data movement process.

Can I define more than one parent entity?

Currently, that is not supported; a child entity can only belong to one parent entity. If you specify more than one parent, workloads will fail in the orphan and nested records processors.

How does nesting work if I can define only one parent entity?

When processing a child entity, you can pass a coalesced meta model that contains all of that child's parent entities, based on the strict parent-child relationships. For example, if X is the child of Y and Z is the child of X, you pass the coalesced meta model (both X and Y) along with the entity model for Z when processing the data set for Z.
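
The following is a conceptual sketch of the X/Y/Z example, using plain Python lists to stand in for meta models; the real coalescing is performed by the platform, and the structures shown are not the actual meta model format.

```python
# Conceptual illustration of the X/Y/Z example above. Plain dictionaries
# stand in for real meta model definitions; the platform performs the
# actual coalescing.
meta_y = [{"entity": "Y", "attribute": "y_key"}]    # top-level parent
meta_x = [{"entity": "X", "attribute": "x_key"}]    # X is the child of Y
meta_z = [{"entity": "Z", "attribute": "z_value"}]  # Z is the child of X

# When processing the data set for Z, the coalesced meta model contains
# both of Z's ancestors (X and Y), unchanged, and is passed along with
# Z's own entity meta model.
coalesced_parents = meta_x + meta_y
entity_model_for_z = meta_z
```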

How are references to the parent meta model stored in the entity meta model for a particular batch process?

The parent meta model is coalesced with the entity meta model, and the coalesced model (entity meta model plus parent entity meta models) is stored in the Track folder (BAPCORE/Meta/BDL/Fact/Track) along with the batch_id of the data movement process. None of the attributes of the parent entity model are changed when it is coalesced with the child entity meta model. This helps in troubleshooting whether all parent entity models were appropriately applied to the child entity model for nesting processes.

What should the format of entity_effective_date_time be?

The entity effective date-time should be a string; otherwise, metamodel.read will fail.
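
For illustration, a string value like the one below is acceptable, whereas a non-string value would cause metamodel.read to fail; the exact date-time format shown is an assumption.

```python
# Illustrative only: the field name comes from this FAQ, but the exact
# date-time format is an assumption; check the product documentation.
ok_record = {"entity_effective_date_time": "2024-01-01 00:00:00"}  # string: accepted
bad_record = {"entity_effective_date_time": 20240101000000}        # not a string: metamodel.read fails
```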

What are the best practices for the meta model?

The following questions cover common best practices.

What if some module's entities have a dependency on another module's entities (e.g., BAPOIM (FieldActivity) depends on BAPRAM (ServicePoint))?

In this case, when you upload the meta model for the parent entity (ServicePoint in this example), you need to change the parent module name to the current module name (for ServicePoint, change the module name from 'BAPRAM' to 'BAPOIM').
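
As a sketch of this change, the parent entity's meta model entry might look like the following before and after the update; the "module" field name is an assumption, while the module and entity names come from the example above.

```python
# Sketch of the module-name change described above. The "module" field name
# is an assumption; the module and entity names come from this FAQ's example.
service_point_before = {"module": "BAPRAM", "entity": "ServicePoint"}

# When uploading ServicePoint's meta model for use by BAPOIM (FieldActivity),
# the module name is changed to the current (consuming) module:
service_point_after = {"module": "BAPOIM", "entity": "ServicePoint"}
```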

I have a large number of small files in the Meta folder that may impede read performance. What can I do?

You can consolidate multiple Meta files into one using the consolidate functionality, assuming that the schema of all files in the meta model folder is the same.

What if there is a parent-child relationship in my data module?

In the meta model, you need to update the field of your child entity through which you plan to join to the parent entity: set "Entity Attribute Compare Key Role" to "Yes", set "Entity Attribute Parent Lookup Key Role" to "Yes", and in "Entity Attribute Parent Lookup Location" provide the FDL/Stage location of your parent entity. For example, "movies" is the parent entity and "roles" is the child entity; a sketch of these settings follows below.
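
A sketch of these settings for the movies/roles example is shown below; the quoted field names come from the answer above, while the join column name (movie_id) and the FDL/Stage path are hypothetical placeholders.

```python
# Sketch of the meta model settings on the child entity's join field for the
# movies (parent) / roles (child) example. The three quoted keys come from the
# answer above; the column name "movie_id" and the FDL/Stage path are
# hypothetical placeholders.
roles_join_field = "movie_id"
roles_movie_id_settings = {
    "Entity Attribute Compare Key Role": "Yes",
    "Entity Attribute Parent Lookup Key Role": "Yes",
    "Entity Attribute Parent Lookup Location": "FDL/Stage/<movies entity location>",
}
```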

How can I create custom transform rules in the Meta Model?

Transform rules are SQL queries that you can write to filter data by a variable, change certain values in a column, or apply other functions from the Spark SQL library. You can write transform rules in the meta model by clicking the edit icon next to the column name and going to the Rules page.

Here are a couple of things to keep in mind when writing your transform rules:

  1. Do not put 'SELECT' at the beginning of the query; it is automatically prepended when the backend processor runs.

  2. The SQL query should end in "as <column_name>".

The transform rules that you define are executed in the SDL-FDL workload when you define the TRANSFORM_RECORDS_PROCESSOR in the Ingest Model.
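
For example, a rule written without a leading SELECT and ending in "as <column_name>" might look like the sketch below. The column name, sample data, and the use of selectExpr to emulate the backend processor are illustrative assumptions, not the product's actual execution path.

```python
from pyspark.sql import SparkSession

# Illustrative only: the rule text, column name, and sample data are made up.
# In the product, the backend processor prepends SELECT and runs the rule
# during the SDL-FDL workload; selectExpr is used here only to emulate that.
spark = SparkSession.builder.getOrCreate()
df = spark.createDataFrame([("  alice  ",), ("BOB",)], ["customer_name"])

# A transform rule as it would be written in the Meta Model:
# no leading SELECT, and it ends with "as <column_name>".
rule = "trim(upper(customer_name)) as customer_name"

df.selectExpr(rule).show()
```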