Common Data Integration Interview Preparation Guide
Download PDF

Data Integration frequently Asked Questions by expert members with experience in Data Integration. These questions and answers will help you strengthen your technical skills, prepare for the new job test and quickly revise the concepts

24 Data Integration Questions and Answers:

Table of Contents:

Common  Data Integration Job Interview Questions and Answers
Common Data Integration Job Interview Questions and Answers

1 :: Explain how Full Outer Join is implemented BODI?

- Full Outer Join is implemented by using SQL Transformation and writing custom query.

- Following example describes SQL Transformation to implement Full Outer Join:
select emp.*, dept.deptname, dept.deptno dno, dept.location from scott.employee emp
FULL OUTER JOIN
scott.department dept on (emp.deptno = dept.deptno) ;

- Following example illustrates custom query to implement Full Outer Join:

1. Drag EMPLOYEE, DEPARTMENT tables as src.
2. Place the query transform for performing the Left Outer Join.
3. Place one more query transform for performing the Right Outer Join.
4. Merge and load them into the target.

2 :: Tell me how SNMP Agent is associated with Data Integrator?

- Error events are communicated using applications are best supported by SNMP Agent.

- Errors are monitored better using SNMP.

- DI SNMP Agent need to be installed on any Job Server.

- Job servers information is recorded by DI SNMP Agent while running jobs.

- Network Management Software need to be configured for applications to communicate with DI SNMP Agent.

- The status of NMS applications will monitor the Data Integrator jobs.

3 :: What is the difference between Cascade and Drill Through?

Cascade:

- Cascade process involves taking values from various other prompts.

- The result is a single report.

- The result is used when a criteria is to be implemented.

Drill Through:

- Drill Through process is implemented when navigation from summary to detailed information.

- Drill Through has a parent and a child report.

- Data of another report can be seen based on the current details of data.

4 :: Explain what are the prime responsibilities of Data Integration Administrator?

- Scheduling and executing the batch jobs.

- Configuring, starting and stopping the real-time services

- Adapters configuration and managing them.

- Repository usage, Job Server configuration.

- Access Server configuration.

- Batch job publishing.

- Real-time services publishing through web services.

5 :: Can you explain Pivot – Columns to Rows?

- Data Integrator produces a row in the output data set for every value in the designated pivot column.

- More than one pivot column can be set as per the need of application’s data integration.

- Pivot Sequence Column – Data Integrator increments a sequence number for every row created from a pivot column.

- Non-Pivot column – The columns that need to appear in the target.

- Pivot Set – A group of pivot columns, unique data field and header column.

- Data Field Column – It contains the pivot data along with pivot columns values.

- Header Column – Lists the name of the columns.

6 :: Explain History Preserving?

- History Preserving is for providing new row in the target instead of updating the existing row.

- The columns are indicated for transforming the changes that are to be preserved.

- New rows are created when the value of certain column changes.

- Each of these rows is flagged as UPDATE.

- The UPDATE flag is applied for the input data set.

7 :: Explain is Data integration And ETL programming is same?

- No, Data Integration and ETL programming are different.

- Passing of data to different systems from other systems is known as data integration.

- It may integrate data within the same application.

- ETL, on the other hand, is to extract the data from different sources.

- The primary ETL tool job is to transform the data and loads into other objects or tables.

8 :: What is Uniform Data Access Integration?

- UDAI places the data in the source systems.

- A set of views are defined for providing access the unified view to the clients / customers.

- Zero latency of data can be propagated from the source system.

- The generated consolidated data need not require separate storage space.

- Data history and version management is limited and applied only to the similar type of data.

- Accessing to the user data overloads on the source systems.

9 :: How to measure progress in Data Integration?

Look for the existence of the following items:-

- Generic Data Models

- An Enterprise Data Platform

- Identify the Data Sources

- Selection of a MDM Product

- Implementation of a Customer Master Index or appropriate alternative

10 :: Explain what are the factors that are addressed to integrate data?

Following are the data integration factors:

- Sub set of the available data should be optimal.

- Noise/distortion estimation levels because of sensory/processing conditions at the
time of data collection.

- Accuracy, spatial and spectral resolution of data.

- Data formats, storage and retrieval mechanisms.

- Efficiency of computation for integrating data sets to reach the goals.

11 :: What is Data Integration?

- The process of combining data from different resources.

- The combined data is provided to the users with unified view.

- Information from different enterprise domains are integrated – known as Enterprise Information Integration.

- Useful for merging information from different technologies among enterprises.

- The sub areas of data integration are

1. Data Warehousing.
2. Data Migration.
3. Master Data Management.

12 :: Explain Manual Integration and Application Based Integration?

Manual Integration:

- Also known as Common User Interface.

- All the relevant information to access form the source system or web page interface is operated by the users.

- Unified view of the data does not exist.

Application Based Integration:

- ABI requires specific applications for implementing all the integration efforts.

- When the number of applications is limited, this approach is well manageable.

13 :: What is Drill Through?

- Drill Through process is implemented when navigation from summary to detailed information.

- Drill Through has a parent and a child report.

- Data of another report can be seen based on the current details of data.

14 :: Explain Cascade?

- Cascade process involves taking values from various other prompts.

- The result is a single report.

- The result is used when a criteria is to be implemented.

15 :: Explain various caches available in Data Integrator?

- NO_CACHE – It is used for not caching values.

- PRE_LOAD_CACHE – Result column preloads and compares the column into the memory, prior to executing the lookup.

- PRE_LOAD_CACHE is used when the table can exactly fit in the memory space.

- DEMAND_LOAD_CACHE – Result column loads and compares the column into the memory when a function performs the execution.

- DEMAND_LOAD_CACHE is suitable while looking up the highly repetitive values with small subset of data.

16 :: What are Data Integrator Metadata Reports?

- Browser-based analysis and reporting capabilities are provided by Metadata reports.

- The DI Metadata Reports are generated on metadata that associates with
1. Data Integration jobs.
2. Other BO applications those are associated with Data Integration.

- Three modules are provided by Metadata Reports. They are
1. Operational Dashboards.
2. Auto Documentation.
3. Impact and Lineage analysis.

17 :: Explain what is Hierarchy Flattening?

- Construction of parent/child relationships hierarchy is known as Hierarchy Flattening.

- A description of hierarchy in the vertical or horizontal format is produced.

- The hierarchy pattern includes Parent column, Child Column, Parent Attributes and Child Attributes.

- Hierarchy Flattening allows to understand the basic hierarchy of BI in a lucid manner.

- As the flattening is done in horizontal or vertical format, the sub elements are easily identified.

18 :: What is Data Integration hierarchy?

The DI hierarchy is as follows:

- Project->JOB->WorkFlow->DataFlow.

- WorkFlow also has scripts.

- Source, Query, Target are under Data Flow and known as Transformations.

- Workflow, Dataflow, data, files or tables usage for certain number of times, is specified by usage count.

- Objects can be used more than once in Data Integration. These objects are known as reusable objects.

19 :: What is Physical Data Integration?

- Physical Data Integration is all about creating new system that replicates data from the source systems.

- This process is done to manage the data independent of the original system.

- Data Warehouse is the example of Physical Data Integration.

- The benefits of PDI include data version management, combination of data from various sources, like mainframes, flat files, databases.

- A separate system is needed for handling vast data volumes.

20 :: What is Application Based Integration?

- ABI requires specific applications for implementing all the integration efforts.

- When the number of applications is limited, this approach is well manageable.

21 :: What is Manual Integration?

- Also known as Common User Interface.

- All the relevant information to access form the source system or web page interface is operated by the users.

- Unified view of the data does not exist.

22 :: Can you explain what are the three major types of data integration jobs?

Following are the major data integration jobs:

- Transformation jobs – for preparing data

- Should be used when data must not be changed unless job completion of transforming data of a particular subject of interest

- Provisioning jobs - for transmission of data

- Should be used when data must not be changed unless job transformation when the data provisioning is large.

- Hybrid jobs – to perform both transformation and provisioning jobs.

- Data must be changed irrespective of success / failure.

- Should be implemented neither the transformation nor the provisioning requirements are large

23 :: Explain what are the benefits of data integration?

Following are the benefits of data integration:

- Makes reporting, monitoring, placing customer information across the enterprise flexible and
convenient.

- Data usage is efficient.

- Cost Effective.

- Risk adjusted profitability management as it allows accurate data extraction.

- Allows timely and reliable reporting, as data quality is the prime technology for business challenges.

24 :: Explain how to adjust the performance of Data Integrator?

Following are the ways to perform this:

- Using array fetch size.

- Ordering the joins.

- Extracted data minimizing.

- Locale conversion minimization.

- Setting target-based options to optimize the performance.

- Improving throughput.

- Data type conversion minimization.