Q. What
is a data warehouse?
Ans:
A data warehouse is a huge store of data
accumulated from a broad range of sources within an organization and used to
guide business decisions.
Q. What is Data Mining?
Ans:
Data mining is the phase of analyzing data
from several perspectives and summarizing it into useful data.
Q. What
is Business Intelligence?
Ans:
Business Intelligence defines the
technologies, functions, and systems for the collection, integration, analysis,
and demonstration of business data and sometimes to the data itself. The
purpose of business intelligence is to provide better business decision making.
Thus, BI is also defined as a decision support system (DSS).
Q. What is Dimension Table?
Ans:
Dimension table is a table which contain attributes of
measurements stored in fact tables. This table consists of hierarchies,
categories and logic that can be used to traverse in nodes.
Q. What are the stages of Data
warehousing?
Ans:
There are four stages of Data warehousing:
- Offline
Operational Database
- Offline
Data Warehouse
- Real Time
Datawarehouse
- Integrated
Datawarehouse
Q. What is OLTP?
Ans:
OLTP is abbreviated as On-Line Transaction Processing, and it is
an application that modifies the data whenever it received and has large number
of simultaneous users.
Q. What is OLAP?
Ans:
OLAP is abbreviated as Online Analytical Processing, and it is
set to be a system which collects, manages, processes multi-dimensional data
for analysis and management purposes.
Q. What is ODS?
Ans:
ODS is abbreviated as Operational Data Store and it is a
repository of real time operational data rather than long term trend data.
Q. What is the difference between
View and Materialized View?
Ans:
Ø
A view is nothing but a virtual
table which takes the output of the query and it can be used in place of
tables.
Ø
A materialized view is nothing
but an indirect access to the table data by storing the results of a query in a
separate schema.
Q. What is ETL?
Ans:
ETL is abbreviated as Extract, Transform and Load. ETL is a
software which is used to reads the data from the specified data source and
extracts a desired subset of data. Next, it transform the data using rules and
lookup tables and convert it to a desired state.
Then, load function is used to load the resulting data to the
target database.
Q. What is Fact Table?
Ans:
Ø
Fact table contains the
measurement of business processes, and it contains foreign keys for the
dimension tables.
Ø
Example – If the business process
is manufacturing of bricks
Ø
Average number of bricks produced
by one person/machine – measure of the business process
Q. What is VLDB?
Ans:
VLDB is abbreviated as Very Large Database and its size is set
to be more than one terabyte database. These are decision support systems which
is used to server large number of users.
Q. What is real-time data
warehousing?
Ans:
Real-time data warehousing captures the business data whenever
it occurs. When there is business activity gets completed, that data will be
available in the flow and become available for use instantly.
Q. What are the different types
of Datawarehouse?
Ans:
Following are the different types of Data warehousing:
- Enterprise
Data warehousing
- Operational
Data Store
- Data Mart
Q. What needs to be done while starting
the database?
Ans:
Following need to be done to start the database:
1.
Start an Instance
2.
Mount the database
3.
Open the database
Q. What needs to be done when the
database is shutdown?
Ans:
Following needs to be done when the database is shutdown:
1.
Close the database
2.
Dismount the database
3.
Shutdown the Instance
Q. What is the difference between
metadata and data dictionary?
Ans:
Metadata is defined as data about the data. But, Data dictionary
contain the information about the project information, graphs, abinito commands
and server information.
Q. What
is OLTP?
Ans:
OLTP stands for online transaction
processing. This system is a function that modifies data the instance it
receives and has a huge number of concurrent users.
Q. What
is OLAP?
Ans:
OLAP stands for online analytical processing.
This system is a function that collects, manages, processes, and presents
multidimensional data for analysis and management process.
Q. What
is the difference between OLTP and OLAP?
Ans:
Basic |
OLTP |
OLAP |
|
Meaning |
OLTP stands for online transaction
processing. |
OLAP stands for online analytical
processing. |
|
Data Source |
Operational data is initial data source of
data. |
Consolidation data is from different
sources. |
|
Process Goal |
Snapshot of business processes which does
essential business tasks |
Multidimensional view of business events of
planning and decision making |
|
Queries and Process scripts |
Simple quick running queries ran by
customers. |
Complex long-running queries by a scheme to
update the aggregate data. |
|
Database Design |
Normalize small databases. Speed will not
be an issue due to the smaller database, and normalization will not degrade
performance. This adopts an entity-relationship (ER) model and
function-oriented database design. |
De-normalized large databases. Speed is an
issue due to the larger databases, and de-normalizing will improve
performance as there will be lesser tables to scan while performing tasks.
This adopts star, snowflake, or fact constellation mode of subject-oriented
database design. |
|
Back-up and system administration |
Regular database back-up and system
administration can do the job. |
Reloading the OLTP data is well treated as
a good backup option. |
Q. What
is a dimensional table?
Ans:
Dimensional tables include textual attributes
of measurement saved in the fact tables. A dimensional table is a group of
hierarchies, categories, and logic which can be used for the customer to
traverse in hierarchy nodes.
Q. What are Aggregate tables?
Ans:
Aggregate tables are the tables which contain the existing
warehouse data which has been grouped to certain level of dimensions. It is
easy to retrieve data from the aggregated tables than the original table which
has more number of records.
This table reduces the load in the database server and increases
the performance of the query.
Q. What is fact less fact tables?
Ans:
A fact less fact tables are the fact table which doesn’t contain
numeric fact column in the fact table.
Q. How can we load the time
dimension?
Ans:
Time dimensions are usually loaded through all possible dates in
a year and it can be done through a program. Here, 100 years can be represented
with one row per day.
Q. What are non-additive facts?
Ans:
Non-Addictive facts are said to be facts that cannot be summed
up for any of the dimensions present in the fact table. If there are changes in
the dimensions, same facts can be useful.
Q. What is conformed fact?
Ans:
Conformed fact is a table which can be used across multiple data
marts in combined with the multiple fact tables.
Q. What is Datamart?
Ans:
A Datamart is a specialized version of Data warehousing and it
contains a snapshot of operational data that helps the business people to
decide with the analysis of past trends and experiences. A data mart helps to
emphasizes on easy access to relevant information.
Q. What is Active Data
warehousing?
Ans:
An active Datawarehouse is a Datawarehouse that enables decision
makers within a company or organization to manage customer relationships
effectively and efficiently.
Q. What are the key columns in
Fact and dimension tables?
Ans:
Foreign keys of dimension tables are primary keys of entity
tables. Foreign keys of fact tables are the primary keys of the dimension
tables.
Q. What is SCD?
Ans:
SCD is defined as slowly changing dimensions, and it applies to
the cases where record changes over time.
Q. What is BUS Schema?
Ans:
BUS schema consists of suite of confirmed dimension and
standardized definition if there is a fact tables.
Q. What is Star Schema?
Ans:
Star schema is nothing but a type of organizing the tables in
such a way that result can be retrieved from the database quickly in the data
warehouse environment.
Q. What is Snowflake Schema?
Ans:
Snowflake schema which has primary dimension table to which one
or more dimensions can be joined. The primary dimension table is the only table
that can be joined with the fact table.
Q. What is a core dimension?
Ans:
Core dimension is nothing but a Dimension table which is used as
dedicated for single fact table or DataMart.
Q. What is called data cleaning?
Ans:
Name itself implies that it is a self-explanatory term. Cleaning
of Orphan records, Data breaching business rules, Inconsistent data and missing
information in a database.
Q. What is Metadata?
Ans:
Metadata is defined as data about the data. The metadata
contains information like number of columns used, fix width and limited width,
ordering of fields and data types of the fields.
Q. What are loops in Data
warehousing?
Ans:
In data warehousing, loops are existing between the tables. If
there is a loop between the tables, then the query generation will take more
time and it creates ambiguity. It is advised to avoid loop between the tables.
Q. What are the types of
Dimensional Modeling?
Ans:
There are three types of Dimensional Modeling and they are as
follows:
Ø Conceptual
Modeling
Ø Logical
Modeling
Ø Physical
Modeling
Q. What is surrogate key?
Ans:
Surrogate key is nothing but a substitute for the natural
primary key. It is set to be a unique identifier for each row that can be used
for the primary key to a table.
Q. What is the difference between
ER Modeling and Dimensional Modeling?
Ans:
Ø
ER modeling will have logical and
physical model but Dimensional modeling will have only Physical model.
Ø
ER Modeling is used for
normalizing the OLTP database design whereas Dimensional Modeling is used for
de-normalizing the ROLAP and MOLAP design.
Q. What are the steps to build
the Datawarehouse?
Ans:
Following are the steps to be followed to build the
Datawarehouse:
Ø Gathering
business requirements
Ø Identifying
the necessary sources
Ø Identifying
the facts
Ø Defining
the dimensions
Ø Defining
the attributes
Ø Redefine
the dimensions and attributes if required
Ø Organize
the Attribute hierarchy
Ø Define
Relationships
Ø Assign
unique Identifiers
Q. What is a fact table?
Ans:
The fact table includes the measurement of
the business process. Fact table includes the foreign keys for the dimension
tables.
Example:
If we are business phase is "paper
production," "normal production of paper by one device, "or
"weekly production of paper" will be treated as a measurement of the
business process.
Q. What
are the different methods of loading dimension tables?
Ans:
There are two different methods to load data
in dimension tables:
- Conventional (slow): All
the constraints and keys are validated against the information before, it
is loaded, and this method data integrity is maintained.
- Direct (fast): All
the constraints and keys are disabled before the information is loaded.
Once the information is loaded, it is validated against all the constraints
and keys. If the data is found invalid, it is not contained in the index,
and all the future processes are skipped in this data.
Q. Describe
the foreign key columns in fact tables and dimension tables?
Ans:
Foreign keys of dimension tables are the
primary key of entity tables.
Foreign keys of fact tables are the primary
key of dimension tables.
Q. What
is ODS?
Ans:
ODS stands for Operational data store. A
database architecture that is a repository for near real-time operational
records rather than long term trend data. The ODS may further become the
enterprise shared an operational database, allowing operational functions that
are being re-engineered to use the ODS as there operational databases.
Q. What
is ETL?
Ans:
ETL stands for extraction, transformation,
and loading process. ETL is software that allows the business to develop their
disparate records while moving it from place to place, and it doesn't really
matter that data is in several forms or formats. The data can come from any source.
ETL is powerful enough to manage such data disparities.
First, the extract function reads data from a
particular source database and extracts a desired subset of data.
Second, the transform function works with the
acquired record using rules or lookup tables, or creating a combination with
other records to convert it to the desired state.
Finally, the load function is used to write
the resulting information to a target database.
Q. What
is VLDB?
Ans:
VLDB stands for a Very large database. A one terabyte
database would generally be considered to be a VLDB. Typically, there are
decision support applications or transaction processing applications serving a
huge number of users.
Q. What
is real-time data warehousing?
Ans:
Data warehousing capture business event data.
Real-time data warehousing capture business event data as it occurs. As soon as
the business event is complete, and there is data about it, the completed event
data flows into the data warehouse and becomes feasible instantly.
Q. What
are conformed dimensions?
Ans:
Conformed dimension defines the exact same
thing with every possible fact table to which they are joined. They are simple
to the cubes.
Q. What
are non-additive facts?
Ans:
Non-additive facts are the facts that cannot
be examined for any of the dimensions present in the fact table. They are not
treated as useless. If there is a transformation in dimensions, the same facts
can be useful.
Q. What
is Star Schema?
Ans:
Star schema is a type of organizing the
tables such that we can fetch the result from the database instantly in the
warehouse environment.
Q. What
is a Snowflake Schema?
Ans:
Any dimension with extended dimensions is
called snowflake schema, the dimension may be interlinked or may have one too
many relationships with other tables. This schema is normalized and outcome in
complex join and very complex queries as well as slower results.
Q. What
is a surrogate key?
Ans:
A surrogate key is a substitution for the
essential primary key. It is just a unique identifier or statistic for each row
that can be used for the primary key to the table. The only requirement for a
surrogate primary key is that it is unique for each row in the table. It is
useful because the essential primary key can change, and this makes updates
more difficult. Surrogate keys are always integer or numeric.
Q. What
is a junk dimension?
Ans:
A number of very small dimension may be
lumped together to form a single dimension, a junk dimension is the attributes
are not closely related. Grouping of random flags and text attributes in
dimensions and changing them to a separate subdimension is called the junk
dimension.
Q. What
is dimensional modeling?
Ans:
Dimensional data model concept contains two
types of tables, and it is different from the third normal form. This concept
uses a fact table, which includes the measurement of the business and
Dimensional tables, which includes the context (dimension of the calculations)
of the dimensions.
Q. What
is BUS Schema?
Ans:
BUS schema is collected from a master suite
of confirmed size and a standardized description of facts.
Q. What
is active data warehousing?
Ans:
An active data warehouses provide data that
allow decision-makers within an organization to handle customer relationships
efficiently and proactively.
Q. What
is the difference between data warehousing and Business Intelligence?
Ans:
Data warehousing handle with all methods of managing the
development, implementation and applications of a data warehouse or data mart
containing metadata management, data acquisition, data cleansing, data
transformation, storage management, data distribution, data archiving,
operational documenting, analytical documenting, security management,
backup/recovery planning, etc.
Business Intelligence is a set of software tools that allows
an organization to analyse measurable methods of their business, such as sales
performance, profitability, operational efficiency, effectiveness, of marketing
campaigns, market penetration among certain user groups, cost trends, anomalies
and exceptions, etc. Business Intelligence is used to encompass OLAP, data
visualization, data mining, and query document tools.
Q. Which
one is faster, Multidimensional OLAP, or Relational OLAP?
Ans:
Multidimensional OLAP (MOLAP) is faster than
Relational OLAP (ROLAP).
- MOLAP: Here,
data is saved in a multidimensional cube. The storage is not in the
relational database but in a proprietary plan (one example is PowerOLAP's
.olp file). MOLAP products are compatible with Excel, which can make
record interactions easy to learn.
- ROLAP: ROLAP
products approach a relational database by using SQL (structured query
language), which is the standard language that is used to describe and manipulate
data in an RDBMS. Subsequent processing may occur in the RDBMS or within a
middle-tier server, which accepts requests from users, translates them
into SQL statements, and passes them on to the RDBMS.
0 Comments