Data Warehousing Interview Questions & Answers


131 Data Warehousing Questions and Answers:


1 :: How can we run the graph? What is the procedure for that? How can we schedule the graph in UNIX?

To run the graph from the GDE, save it and press F5; the graph then runs automatically. To run it through a shell script, fire the command at your UNIX box. For scheduling, that same script can be invoked by a UNIX scheduler such as cron.

2 :: What is a real-time data warehouse? How is it different from a near real-time data warehouse?

As the term suggests, a real-time data warehouse is a system that reflects all changes to its sources in real time. As simple as it sounds, this is still an area of active research. In a traditional DWH, the operational systems are kept separate from the DWH for a good reason. Operational systems are designed to accept inputs or changes to data regularly, and so have a good chance of being regularly queried. A DWH, on the other hand, is supposed to do just the opposite: it is used to query data for reports only. No changes to data through user actions are expected (or designed for). The only inputs come from the ETL feed at stipulated times, and the ETL sources its data from the operational systems described above.
To create a real-time DWH we would have to merge both systems (several ways are being explored), a concept that goes against the reason for creating a DWH. Bigger challenges arise in updating aggregated data in facts in real time while still maintaining the surrogate keys. Besides, we would need lightning-fast hardware to attempt this.
A near real-time DWH is a trade-off between the conventional design and the dream of all clients today. The frequency of ETL updates is higher in this case, e.g. once every 2 hours. We can also analyze and use selective refreshes at shorter time intervals, while complete refreshes may still be kept further apart. Selective refreshes look only at those tables that get updated regularly.
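The selective refresh idea can be sketched in a few lines of Python. This is only an illustration: the table names, timestamps, and the `tables_to_refresh` helper are hypothetical, and a real ETL tool would track changes in its own way.

```python
from datetime import datetime

# Hypothetical source tables with the time each one was last modified.
source_last_modified = {
    "orders": datetime(2024, 1, 1, 10, 30),
    "customers": datetime(2024, 1, 1, 6, 0),
    "products": datetime(2023, 12, 30, 9, 0),
}

def tables_to_refresh(last_modified, last_refresh):
    """Return the tables changed since the previous refresh, sorted by name."""
    return sorted(name for name, ts in last_modified.items() if ts > last_refresh)

# Selective refresh: only tables touched since the 08:00 run.
selective = tables_to_refresh(source_last_modified, datetime(2024, 1, 1, 8, 0))

# Complete refresh: every table, regardless of timestamps.
complete = sorted(source_last_modified)

print(selective)
print(complete)
```

The selective run touches far fewer tables, which is what makes a shorter refresh interval affordable.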

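3 :: What is the difference between drill and scope of analysis?

Drilling is the navigation itself and can be done as drill down, drill up, drill through, and drill across. Scope of analysis is the overall view of the drill exercise, i.e. the range of levels the drilling can cover.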

5 :: For a faster process, what should we do with the Universe?

For a faster process, create aggregate tables and write more efficient SQL so that queries run quickly.
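As a rough illustration of an aggregate table, the sketch below uses Python's built-in `sqlite3` module. The `sales` and `sales_by_product` tables and their columns are made up for the example; in a real Universe the aggregate would live in the warehouse database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (sale_date TEXT, product TEXT, amount REAL);
INSERT INTO sales VALUES
  ('2024-01-01', 'widget', 10.0),
  ('2024-01-01', 'widget', 15.0),
  ('2024-01-02', 'gadget', 7.5);

-- Aggregate table: totals are computed once, ahead of time.
CREATE TABLE sales_by_product AS
  SELECT product, SUM(amount) AS total_amount, COUNT(*) AS sale_count
  FROM sales
  GROUP BY product;
""")

# Reports read the small aggregate table instead of scanning every sale.
rows = conn.execute(
    "SELECT product, total_amount, sale_count "
    "FROM sales_by_product ORDER BY product"
).fetchall()
print(rows)
```

The trade-off is that the aggregate table has to be rebuilt or updated whenever the detail table is loaded.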

6 :: What is a Type 2 version dimension?

A version dimension is a Slowly Changing Dimension (SCD) Type 2. It is widely used in real projects because it maintains both the current data and the full historical data.
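A minimal sketch of how a Type 2 change keeps history, assuming a dimension held as a list of Python dictionaries. The column names (`sk`, `start_date`, `end_date`, `is_current`), the customer "C100", and the `apply_scd2_change` helper are all hypothetical choices for illustration.

```python
from datetime import date

def apply_scd2_change(rows, customer_id, new_city, change_date):
    """Close the current version for customer_id and append a new version."""
    next_key = max((r["sk"] for r in rows), default=0) + 1
    for r in rows:
        if r["customer_id"] == customer_id and r["is_current"]:
            if r["city"] == new_city:
                return rows  # attribute unchanged, no new version needed
            r["end_date"] = change_date
            r["is_current"] = False
    rows.append({
        "sk": next_key, "customer_id": customer_id, "city": new_city,
        "start_date": change_date, "end_date": None, "is_current": True,
    })
    return rows

dim_customer = [{
    "sk": 1, "customer_id": "C100", "city": "Pune",
    "start_date": date(2019, 1, 1), "end_date": None, "is_current": True,
}]

apply_scd2_change(dim_customer, "C100", "Mumbai", date(2024, 1, 1))
for row in dim_customer:
    print(row["sk"], row["city"], row["start_date"], row["end_date"], row["is_current"])
```

The old row is kept and merely closed off, so both the current and the historical value remain available.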

7 :: What is unit testing?

In unit testing, the developer who created the mapping tests it independently, in isolation from the rest of the system.

8 :: What is Informatica Architecture?

Informatica architecture consists of the Repository, the Repository Server, the Repository Server Administration Console, the sources, and the data warehouse, together with the client tools: Designer, Workflow Manager, and Workflow Monitor. The combination of all these is called the Informatica architecture.

9 :: What is data warehouse architecture?

A data warehouse is a repository of integrated information, with data extracted from heterogeneous sources. The architecture starts with the different sources, such as Oracle, flat files, and ERP systems. After these come the staging area and the data warehouse, followed by the different data marts and then the reports. It also includes the ODS (Operational Data Store). This complete arrangement is called the data warehousing architecture.

10 :: What is data analysis? Where is it used?

Data analysis: suppose you run a business and store its data in some form, say in a register or on a computer. If at year end you want to know the profit or loss, working that out is data analysis. Uses of data analysis: finding out which product sold the most, or, if the business is running at a loss, finding where it went wrong.
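The year-end questions in this answer can be worked through on a toy ledger. The products, quantities, and prices below are invented purely for illustration.

```python
# Hypothetical year-end ledger: (product, units sold, unit price, unit cost).
ledger = [
    ("tea", 40, 10, 6),
    ("coffee", 25, 12, 9),
    ("sugar", 60, 3, 2),
]

revenue = sum(units * price for _, units, price, _ in ledger)
cost = sum(units * unit_cost for _, units, _, unit_cost in ledger)
profit = revenue - cost  # a negative value would mean a loss

# Which product sold the highest number of units?
best_seller = max(ledger, key=lambda row: row[1])[0]

print(revenue, cost, profit, best_seller)
```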

11 :: What are data modeling and data mining? Where are they used?

Data modeling is the process of designing a database model. In this data model, data is stored in two types of table: fact tables and dimension tables. A fact table contains the transaction data and a dimension table contains the master data.

Data mining is the process of finding hidden trends in the data.
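To make the split between fact and dimension tables concrete, here is a small sketch using Python's built-in `sqlite3` module. The `dim_product` and `fact_sales` tables, their columns, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension table: master data describing each product.
CREATE TABLE dim_product (
  product_key INTEGER PRIMARY KEY,
  name TEXT,
  category TEXT
);
-- Fact table: one row per transaction, pointing at the dimension by key.
CREATE TABLE fact_sales (
  product_key INTEGER REFERENCES dim_product(product_key),
  quantity INTEGER,
  amount INTEGER
);
INSERT INTO dim_product VALUES
  (1, 'widget', 'hardware'), (2, 'gadget', 'hardware'), (3, 'manual', 'books');
INSERT INTO fact_sales VALUES
  (1, 2, 20), (2, 1, 15), (3, 4, 40), (1, 1, 10);
""")

# Join the facts to the dimension to report sales by product category.
category_totals = conn.execute("""
SELECT d.category, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_product AS d ON d.product_key = f.product_key
GROUP BY d.category
ORDER BY d.category
""").fetchall()
print(category_totals)
```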

12 :: What is "method/1"?

Method/1 is a system development lifecycle methodology created by Arthur Andersen a while back.

13 :: After a report is generated, to whom do we deploy it, or what do we do once the report is complete?

The generated report is sent to the concerned business users over the web or LAN.

14 :: After a report has been fully generated, who will test it and who will analyze it?

After reporting is complete, the reports are sent to business analysts. They analyze the data from different points of view so that they can make proper business decisions.

15 :: Can you pass SQL queries in a Filter transformation?

No, we cannot use SQL queries in a Filter transformation. Unlike other transformations (Source Qualifier, Lookup), it does not allow you to override a default SQL query.

16 :: Where is data cube technology used?

The data cube is a multi-dimensional structure, a data abstraction that allows one to view aggregated data from a number of perspectives. Conceptually, the cube consists of a core or base cuboid, surrounded by a collection of sub-cubes/cuboids that represent the aggregation of the base cuboid along one or more dimensions. We refer to the dimension to be aggregated as the measure attribute, while the remaining dimensions are known as the feature attributes.
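The relationship between the base cuboid and its sub-cuboids can be sketched in plain Python. The sample facts, the choice of `sales` as the measure attribute, and the `build_cube` helper are hypothetical; real OLAP engines compute and store cuboids far more efficiently.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fact rows: three feature attributes and one measure attribute.
facts = [
    {"region": "East", "product": "widget", "year": 2023, "sales": 10},
    {"region": "East", "product": "gadget", "year": 2023, "sales": 5},
    {"region": "West", "product": "widget", "year": 2024, "sales": 7},
]
dimensions = ("region", "product", "year")

def build_cube(rows, dims, measure):
    """Aggregate the measure for every subset of dims (one cuboid per subset)."""
    cube = {}
    for k in range(len(dims), -1, -1):
        for group in combinations(dims, k):
            totals = defaultdict(int)
            for row in rows:
                totals[tuple(row[d] for d in group)] += row[measure]
            cube[group] = dict(totals)
    return cube

cube = build_cube(facts, dimensions, "sales")

print(len(cube))           # number of cuboids: one per subset of dimensions
print(cube[("region",)])   # sales aggregated over product and year
print(cube[()])            # apex cuboid: the grand total
```

With three feature attributes there are 2^3 = 8 cuboids, from the base cuboid grouped by all three down to the grand total.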

17 :: How can you implement many-to-many relations in a star schema model?

Many-to-many relations can be implemented by using a snowflake schema, with a maximum of n dimensions.

18 :: What is a critical column?

Let us take an example. Suppose 'XYZ' is a customer in Bangalore who has lived in the city for the last 5 years and in that period has made purchases worth 3 lakhs. Now he moves to 'HYD'. When you update the city of 'XYZ' to 'HYD' in your warehouse, all of his purchases will show under city 'HYD' only. This makes the warehouse inconsistent. Here CITY is the critical column. The solution is to use a surrogate key, so that each version of the customer row has its own key and earlier purchases stay linked to the earlier city.
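The effect described here can be reproduced with a short Python sketch. The 3 lakh figure comes from the example; the 50,000 purchase made after the move, the key values, and the `purchases_by_city` helper are assumptions added for illustration.

```python
from collections import defaultdict

def purchases_by_city(dimension, facts):
    """Sum purchase amounts per city by looking up each fact's key in the dimension."""
    totals = defaultdict(int)
    for key, amount in facts:
        totals[dimension[key]["city"]] += amount
    return dict(totals)

# Overwrite approach: facts carry the natural key, and the single row for
# 'XYZ' has already been updated from Bangalore to HYD.
overwritten_dim = {"XYZ": {"city": "HYD"}}
facts_by_natural_key = [("XYZ", 300000), ("XYZ", 50000)]

# Surrogate-key approach: each version of 'XYZ' has its own key, and every
# fact row points at the version that was current at purchase time.
versioned_dim = {
    1: {"customer": "XYZ", "city": "Bangalore"},
    2: {"customer": "XYZ", "city": "HYD"},
}
facts_by_surrogate_key = [(1, 300000), (2, 50000)]

overwritten_totals = purchases_by_city(overwritten_dim, facts_by_natural_key)
versioned_totals = purchases_by_city(versioned_dim, facts_by_surrogate_key)

print(overwritten_totals)  # {'HYD': 350000}
print(versioned_totals)    # {'Bangalore': 300000, 'HYD': 50000}
```

With the overwrite, the Bangalore purchases are wrongly reported under HYD; with surrogate keys they stay where they were made.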

19 :: What is the main difference between star schema and snowflake schema? Which one is better and why?

In a star schema the dimension tables are denormalized and join directly to the fact table, while a snowflake schema normalizes dimensions into further related tables. Choose the snowflake schema only if the data has one-to-many relationships; for performance, everyone goes for the star schema. Moreover, if the ETL is concerned with reporting, choose the snowflake schema, because it provides more browsing capability than the star schema.

20 :: What is the difference between dependent data warehouse and independent data warehouse?

Dependent ones are those that depend on a data warehouse for their data. Independent ones are those that get their data directly from the operational data sources in the organization.

22 :: What is Virtual Data Warehousing?

A virtual or point-to-point data warehousing strategy means that end users are allowed to get at operational databases directly, using whatever tools are enabled on the "data access network".

23 :: What is the difference between metadata and data dictionary?

Metadata is information about data. It contains the information about the graphs, their related files, Ab Initio commands, server information, and so on, i.e. all kinds of project-related information. A data dictionary is narrower: it typically describes the structure of database objects such as tables, columns, and data types.

24 :: What is the difference between a mapping parameter and a mapping variable in data warehousing?

A mapping parameter defines a constant value, which cannot change throughout the session. A mapping variable defines a value that can change during the session.

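25 :: Explain the advantages of RAID 1, 1/0, and 5. What type of RAID setup would you put your TX logs on?

The basic advantage of RAID is to speed up reading data from permanent storage (hard disks) and to protect it against disk failure. RAID 1 mirrors the data on two disks, giving redundancy and fast reads. RAID 1/0 stripes data across mirrored pairs, combining the redundancy of mirroring with the speed of striping. RAID 5 stripes data with distributed parity, so it survives one disk failure at a lower storage cost, but writes are slower because parity must be updated. Transaction logs are write-heavy and sequential, so they are commonly placed on RAID 1 or RAID 1/0 rather than RAID 5.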
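RAID 5, named in the question, protects data with parity. The sketch below shows the underlying XOR idea on three tiny in-memory "disks". The block contents and the `xor_blocks` helper are made up for illustration, and a real RAID 5 array also rotates the parity blocks across all disks, which is not modeled here.

```python
def xor_blocks(*blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)

# Three hypothetical data blocks, one per disk.
disk_a = b"DATA"
disk_b = b"WARE"
disk_c = b"HOUS"

# The parity block is the XOR of all data blocks.
parity = xor_blocks(disk_a, disk_b, disk_c)

# Simulate losing disk_b: XOR the survivors with the parity to rebuild it.
rebuilt_b = xor_blocks(disk_a, disk_c, parity)
print(rebuilt_b == disk_b)
```

Every write has to update the parity block as well as the data block, which is the extra cost RAID 5 pays on writes.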