Data Warehouse Developer Interview Preparation Guide

Frequently asked Data Warehouse Developer questions, answered by expert members with professional careers as Data Warehouse Developers. This list of interview questions and answers will help you strengthen your technical skills, prepare for a new job interview and quickly revise your concepts.

55 Data Warehouse Developer Questions and Answers:

Table of Contents:

Operational Data Warehouse Developer Job Interview Questions and Answers

1 :: Can you define data warehouse?

A data warehouse is a subject-oriented, integrated, time-variant, and nonvolatile collection of data that supports management's decision-making process.

2 :: What are the phases involved in the data warehouse delivery process?

The stages are IT Strategy, Education, Business Case Analysis, Technical Blueprint, Build the Version, History Load, Ad Hoc Query, Requirement Evolution, Automation, and Extending Scope.

3 :: Can you list any five applications of data warehouse?

Some applications include financial services, banking services, consumer goods, retail sectors, and controlled manufacturing.

4 :: Do you know what a load manager is?

A load manager performs the operations required for the extract and load process. Its size and complexity vary from one data warehouse solution to another.

5 :: Tell us how you load the time dimension?

Time dimensions are usually loaded by a program that loops through all possible dates appearing in the data. It is not unusual for 100 years to be represented in a time dimension, with one row per day.
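The loop described above can be sketched in a few lines of Python. This is only an illustration: the function name build_time_dimension and the column names (date_key, quarter, is_weekend and so on) are assumptions, not a standard schema.

```python
from datetime import date, timedelta


def build_time_dimension(start, end):
    """Return one row (a dict) per calendar day from start to end, inclusive."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),  # surrogate key such as 20240131
            "full_date": current.isoformat(),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
            "day_of_week": current.isoweekday(),  # 1 = Monday ... 7 = Sunday
            "is_weekend": current.isoweekday() >= 6,
        })
        current += timedelta(days=1)
    return rows


# One year of rows; a real load would typically cover many decades.
rows = build_time_dimension(date(2024, 1, 1), date(2024, 12, 31))
print(len(rows))  # 366, because 2024 is a leap year
```

In practice the generated rows would be bulk-inserted into the dimension table once, since calendar dates do not change.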

6 :: Explain the functions of data warehouse tools and utilities?

The functions performed by data warehouse tools and utilities are Data Extraction, Data Cleaning, Data Transformation, Data Loading and Refreshing.

7 :: Explain the functions of a warehouse manager?

The warehouse manager performs consistency and referential integrity checks; creates indexes, business views and partition views against the base data; transforms and merges the source data from the temporary store into the published data warehouse; backs up the data in the data warehouse; and archives data that has reached the end of its captured life.

8 :: Do you know what a Metadata Repository contains?

A metadata repository contains the definition of the data warehouse, business metadata, operational metadata, data for mapping from the operational environment to the data warehouse, and the algorithms for summarization.

9 :: Explain the difference between a view and a materialized view?

View:
☛ A view provides a tailored representation of the data in its underlying tables.
☛ It has a logical structure only and does not occupy storage space.
☛ Changes made through the view are reflected in the corresponding tables.
Materialized view:
☛ A materialized view persists pre-calculated data.
☛ It physically occupies storage space.
☛ Changes made to it are not reflected in the corresponding tables.
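The storage difference can be seen in a small sketch using Python's built-in sqlite3 module. SQLite has no materialized views, so a table created with CREATE TABLE ... AS SELECT stands in for one here; the table and view names (sales, v_totals, mv_totals) and the refresh_snapshot helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (region TEXT, amount INTEGER);
    INSERT INTO sales VALUES ('east', 100), ('west', 50);

    -- A view stores only the query; it is re-evaluated on every read.
    CREATE VIEW v_totals AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;

    -- Stand-in for a materialized view: the query result is stored physically.
    CREATE TABLE mv_totals AS
        SELECT region, SUM(amount) AS total FROM sales GROUP BY region;
""")

# Change the base table after both objects exist.
conn.execute("INSERT INTO sales VALUES ('east', 25)")

view_rows = conn.execute("SELECT * FROM v_totals ORDER BY region").fetchall()
snapshot_rows = conn.execute("SELECT * FROM mv_totals ORDER BY region").fetchall()
print(view_rows)      # the view sees the new row
print(snapshot_rows)  # the stored snapshot is stale until refreshed


def refresh_snapshot(connection):
    """Recompute the stored result, as a materialized view refresh would."""
    connection.executescript("""
        DELETE FROM mv_totals;
        INSERT INTO mv_totals
            SELECT region, SUM(amount) FROM sales GROUP BY region;
    """)
```

Databases with native materialized views provide their own refresh mechanism; the manual refresh above only mimics the idea.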

10 :: Tell us what a subject-oriented data warehouse signifies?

Subject-oriented signifies that the data warehouse stores information around a particular subject such as product, customer, sales, etc.

11 :: Do you know what XMLA is?

☛ XMLA is XML for Analysis, which can be considered a standard for accessing data in OLAP, data mining or data sources on the internet. It is based on the Simple Object Access Protocol (SOAP). XMLA uses two methods, Discover and Execute. Discover retrieves information and metadata from a data source, while Execute allows applications to execute commands against the data source.
☛ XMLA is an industry standard for accessing data in analytical systems, such as OLAP. It is based on XML, SOAP and HTTP.
☛ XMLA specifies MDXML as the query language. In the XMLA 1.1 version, the only construct in MDXML is an MDX statement enclosed in the tag

12 :: Can you define metadata?

Metadata is simply defined as data about data. In other words, we can say that metadata is the summarized data that leads us to the detailed data.

13 :: What are the functions of a load manager?

A load manager extracts data from the source system, fast-loads the extracted data into a temporary data store, and performs simple transformations into a structure similar to the one in the data warehouse.
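As a rough illustration of these three steps, the sketch below extracts rows from a CSV export, copies them unchanged into an in-memory staging list, and then applies simple clean-up transformations. The sample data, column names and function names are all hypothetical; a real load manager would work against actual source systems and a database staging area.

```python
import csv
import io

# Hypothetical export from a source system.
SOURCE_EXTRACT = """order_id,customer,amount,order_date
1, alice ,19.99,2024-01-05
2,BOB,5.00,2024-01-06
"""


def extract(source_text):
    """Extract: read raw rows from the source export."""
    return list(csv.DictReader(io.StringIO(source_text)))


def fast_load(rows, staging):
    """Fast load: copy rows into the temporary store without any checks."""
    staging.extend(rows)


def simple_transform(staging):
    """Simple transformations: trim text, normalise case, cast types."""
    return [
        {
            "order_id": int(r["order_id"]),
            "customer": r["customer"].strip().lower(),
            "amount": float(r["amount"]),
            "order_date": r["order_date"],
        }
        for r in staging
    ]


staging_area = []
fast_load(extract(SOURCE_EXTRACT), staging_area)
warehouse_ready = simple_transform(staging_area)
print(warehouse_ready)
```

Keeping the raw rows in staging untouched means the transformation step can be rerun without going back to the source system.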

14 :: Tell us the basic difference between a data warehouse and operational databases?

A data warehouse contains historical information that is made available for analysis of the business whereas an operational database contains current information that is required to run the business.

15 :: Explain why we override the execute method in Struts?

In the Struts framework, we develop Action classes (classes that extend the Action class) and ActionForm classes (classes that extend the ActionForm class), alongside the ActionServlet controller that the framework provides. We override execute() in an Action class to supply the request-handling logic.
In an ActionForm class, we can override the validate() method, which holds the validation code and returns an ActionErrors object. If validate() returns null or an ActionErrors object with size = 0, the framework calls the execute() method of the Action class.

If validate() returns an ActionErrors object with size > 0, execute() is not called. Instead, the framework forwards to the JSP, servlet or HTML file named by the input attribute of the action mapping in the struts-config.xml file.

16 :: Explain what active data warehousing is?

☛ An active data warehouse represents a single state of the business. Active data warehousing considers the analytic perspectives of customers and suppliers. It helps to deliver updated data through reports.
☛ An active data warehouse is a repository of captured transactional data. Using it, trends and patterns are found and used for future decision making. An active data warehouse can integrate data changes while its scheduled refresh cycles run. Enterprises use an active data warehouse to build a statistical picture of the company.

17 :: Tell us the difference between agglomerative and divisive hierarchical clustering?

☛ Agglomerative hierarchical clustering works bottom-up: it starts from the individual sub-components and moves toward the parent. Divisive hierarchical clustering uses a top-down approach in which the parent is visited first and then the children.
☛ In the agglomerative method, each object starts as its own cluster, and these clusters are grouped together to create larger clusters. Merging continues until all the single clusters are combined into one big cluster that contains all the objects of its child clusters. In divisive clustering, by contrast, the parent cluster is divided into smaller clusters, and the division continues until each cluster holds a single object.
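The bottom-up merging can be sketched in plain Python. The example below covers only the agglomerative direction, on one-dimensional points, and uses single linkage (distance between the closest members) as the merge criterion. That choice and the function names are assumptions made for illustration.

```python
def single_linkage_distance(a, b):
    """Distance between two clusters = distance between their closest members."""
    return min(abs(x - y) for x in a for y in b)


def agglomerative(points, target_clusters=1):
    """Merge the two closest clusters repeatedly until target_clusters remain."""
    clusters = [[p] for p in points]  # every object starts as its own cluster
    merges = []                       # record of each merge, in order
    while len(clusters) > target_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = single_linkage_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        merged = clusters[i] + clusters[j]
        merges.append(sorted(merged))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters, merges


clusters, merges = agglomerative([1.0, 1.5, 5.0, 5.2, 9.0], target_clusters=2)
print(merges)    # closest pairs are merged first
print(clusters)  # the two clusters left at the end
```

A divisive method would run the other way, starting from one cluster holding all five points and splitting it repeatedly.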

18 :: Explain what an ODS is?

☛ An operational data store (“ODS”) is a database designed to integrate data from multiple sources for additional operations on the data. Unlike a master data store, the data is not sent back to operational systems. It may be passed for further operations and to the data warehouse for reporting.
☛ In an ODS, data can be scrubbed, resolved for redundancy and checked for compliance with the corresponding business rules. This data store can be used for integrating disparate data from multiple sources so that analysis and reporting can be carried out while business operations occur. This is where most of the data used in current operations is housed before it is transferred to the data warehouse for longer-term storage or archiving.
☛ An ODS is designed for relatively simple queries on small amounts of data (such as finding the status of a customer order), rather than the complex queries on large amounts of data typical of the data warehouse.
☛ An ODS is similar to your short term memory where it only stores very recent information. On the contrary, the data warehouse is more like long term memory storing relatively permanent information.

19 :: Explain which one is faster, Multidimensional OLAP or Relational OLAP?

☛ Multidimensional OLAP (MOLAP) is generally faster than Relational OLAP (ROLAP).
☛ MOLAP: Data is stored in a multidimensional cube. The storage is not in the relational database, but in proprietary formats (one example is PowerOLAP’s .olp file). MOLAP products can be compatible with Excel, which can make data interactions easy to learn.
☛ ROLAP: ROLAP products access a relational database by using SQL (Structured Query Language), which is the standard language used to define and manipulate data in an RDBMS. Subsequent processing may occur in the RDBMS or within a mid-tier server, which accepts requests from clients, translates them into SQL statements, and passes them on to the RDBMS.

20 :: Do you know what Data Warehousing is?

Data Warehousing is the process of constructing and using the data warehouse.

21 :: Tell us what you mean by Data Extraction?

Data extraction means gathering data from multiple heterogeneous sources.

22 :: Explain how many dimensions are selected in a dice operation?

The dice operation selects two or more dimensions of a given cube.
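A minimal sketch of a dice in Python, assuming the cube is held as a list of cells and each cell is a dict of dimension values plus a sales measure. The sample data and the dice function are hypothetical.

```python
# Each cell of the hypothetical cube: three dimensions and one measure.
cube = [
    {"product": "laptop", "region": "east", "quarter": "Q1", "sales": 120},
    {"product": "laptop", "region": "west", "quarter": "Q1", "sales": 80},
    {"product": "phone",  "region": "east", "quarter": "Q1", "sales": 200},
    {"product": "phone",  "region": "east", "quarter": "Q2", "sales": 150},
    {"product": "tablet", "region": "west", "quarter": "Q2", "sales": 60},
    {"product": "laptop", "region": "east", "quarter": "Q3", "sales": 90},
]


def dice(cells, **selections):
    """Keep the cells whose value on every selected dimension is allowed.

    A dice selects on two or more dimensions, so fewer than two is rejected.
    """
    if len(selections) < 2:
        raise ValueError("a dice operation selects two or more dimensions")
    return [
        cell for cell in cells
        if all(cell[dim] in allowed for dim, allowed in selections.items())
    ]


sub_cube = dice(
    cube,
    product={"laptop", "phone"},
    region={"east"},
    quarter={"Q1", "Q2"},
)
print(len(sub_cube))                      # 3 cells remain
print(sum(c["sales"] for c in sub_cube))  # 470
```

The result is a smaller sub-cube that still has all of the original dimensions, only with fewer values on each.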

23 :: Can you explain how a Data Cube helps?

A data cube helps us to represent data in multiple dimensions. The data cube is defined by dimensions and facts.

24 :: Tell us the benefit of normalization?

Normalization helps in reducing data redundancy.

25 :: Can you tell me which one is faster, Multidimensional OLAP or Relational OLAP?

Multidimensional OLAP is faster than Relational OLAP.