Data Warehousing and Database: June 2018

Friday, June 29, 2018

Indexing in Database

To retrieve instructor details given a department name, the database system looks up an index to find the disk block on which the corresponding record resides, and then fetches that disk block to get the instructor details.

Two kinds of indices

1) Ordered indices: based on a sorted ordering of the values
2) Hash indices: based on a uniform distribution of values across a range of buckets, as determined by a hash function
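
The two kinds can be sketched in a few lines of Python. This is only an in-memory illustration, not how a database lays out an index on disk; the department names, block numbers, bucket count and the toy hash function are all made up for the example.

```python
import bisect

# Hypothetical index entries: (search-key value, disk block number).
entries = [("Biology", 3), ("Comp. Sci.", 1), ("Finance", 4), ("Physics", 2)]

# Ordered index: entries are kept sorted by search key and binary-searched.
ordered_keys = [key for key, _ in entries]        # already in sorted order
ordered_blocks = [block for _, block in entries]

def ordered_lookup(key):
    """Return the block number for key, or None if the key is absent."""
    i = bisect.bisect_left(ordered_keys, key)
    if i < len(ordered_keys) and ordered_keys[i] == key:
        return ordered_blocks[i]
    return None

def ordered_range(low, high):
    """Return block numbers for all keys in [low, high]."""
    lo = bisect.bisect_left(ordered_keys, low)
    hi = bisect.bisect_right(ordered_keys, high)
    return ordered_blocks[lo:hi]

# Hash index: a hash function spreads the keys across a fixed set of buckets.
NUM_BUCKETS = 4
buckets = [[] for _ in range(NUM_BUCKETS)]

def bucket_of(key):
    # Toy deterministic hash for illustration: sum of character codes.
    return sum(ord(c) for c in key) % NUM_BUCKETS

for key, block in entries:
    buckets[bucket_of(key)].append((key, block))

def hash_lookup(key):
    """Return the block number for key by scanning only its bucket."""
    for k, block in buckets[bucket_of(key)]:
        if k == key:
            return block
    return None
```

The ordered index can answer range queries such as ordered_range("C", "G") directly, while the hash index only supports lookups on an exact key value.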

Each technique is evaluated on the access types it supports, access time, insertion time, deletion time, and space overhead.
An attribute or set of attributes used to look up records in a file is called a search key. If there are several indices on a file, there are several search keys.

Ordered Indices

An index structure is used to gain fast random access to records in a file.
Primary index: in a sequentially ordered file, the index whose search key specifies the sequential order of the file.

A primary index is also called a clustering index. A clustered index is very useful when you want to retrieve many rows of data or a range of data, for example when the BETWEEN operator is used in the WHERE clause.
The search key of a primary index is usually but not necessarily the primary key. Files with a clustering index on the search key are called index-sequential files.

Types of Ordered Indices

Dense index: an index entry appears for every search-key value in the file. In a dense clustering index, the index record contains the search-key value and a pointer to the first data record with that search-key value. In a dense non-clustering index, the index must store a list of pointers to all records with the same search-key value.

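Sparse index: contains index records for only some search-key values. It is applicable only when the records are sequentially ordered on the search key. To locate a record with search-key value K, find the index record with the largest search-key value less than or equal to K, then search the file sequentially starting at the record to which that index record points. Compared with a dense index, a sparse index is generally slower for locating records, but it takes less space and has less maintenance overhead for insertions and deletions.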
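
The difference between the two lookups can be sketched as follows. This is a minimal in-memory sketch, assuming the file is a Python list sorted on dept_name and that list positions stand in for record pointers; the sample records are hypothetical.

```python
import bisect

# Hypothetical sequential file, sorted on the search key (dept_name).
records = [
    ("Biology", "Crick"),
    ("Comp. Sci.", "Brandt"),
    ("Comp. Sci.", "Katz"),
    ("Finance", "Wu"),
    ("History", "Califieri"),
    ("Physics", "Einstein"),
]

# Dense index: one entry per distinct search-key value, pointing at the
# first record with that value.
dense = {}
for pos, (key, _) in enumerate(records):
    dense.setdefault(key, pos)

# Sparse index: entries for only some search-key values
# (here every other distinct value, chosen arbitrarily).
sparse = sorted(dense.items())[::2]
sparse_keys = [key for key, _ in sparse]

def dense_lookup(key):
    """Jump straight to the first matching record, then read the run."""
    pos = dense.get(key)
    result = []
    while pos is not None and pos < len(records) and records[pos][0] == key:
        result.append(records[pos][1])
        pos += 1
    return result

def sparse_lookup(key):
    """Find the largest indexed key <= key, then scan the file from there."""
    i = bisect.bisect_right(sparse_keys, key) - 1
    if i < 0:
        return []
    pos = sparse[i][1]
    result = []
    while pos < len(records) and records[pos][0] <= key:
        if records[pos][0] == key:
            result.append(records[pos][1])
        pos += 1
    return result
```

Both functions return the same names; the sparse version simply scans a few extra records before reaching the first match, in exchange for a smaller index.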

Creating an index:

It is good practice to create indices only on the columns that queries actually search on, because every index on a table must also be updated whenever the table is updated.

In Oracle, to create an index in your own schema, at least one of the following must be true:

1) The table to be indexed is in your own schema.

2) You have the INDEX object privilege on the table to be indexed.

3) You have the CREATE ANY INDEX system privilege.

CREATE INDEX the_index_name ON the_table_name (the_column_name);

create index index_name on relation_name (attribute_list);

create index dept_index on instructor(dept_name);

create unique index dept_index on instructor(dept_name);

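The unique keyword declares dept_name to be a candidate key for instructor: no two instructors could then share a department name, which is probably not what is actually wanted for this relation.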

To drop an index: drop index index_name;
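
The statements above can be tried out with a small script. This is a sketch using SQLite through Python's sqlite3 module, chosen only because it runs without a server; the instructor rows are sample data, and other database systems report query plans differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor "
    "(ID text primary key, name text, dept_name text, salary real)"
)
# Sample rows for illustration only.
conn.executemany(
    "insert into instructor values (?, ?, ?, ?)",
    [
        ("10101", "Srinivasan", "Comp. Sci.", 65000),
        ("22222", "Einstein", "Physics", 95000),
        ("12121", "Wu", "Finance", 90000),
    ],
)

# Same statement as in the text.
conn.execute("create index dept_index on instructor(dept_name)")

# Ask SQLite how it would run a lookup by department name.
query = "select name from instructor where dept_name = 'Physics'"
plan = conn.execute("explain query plan " + query).fetchall()
plan_text = " ".join(row[-1] for row in plan)   # last column holds the description
uses_index = "dept_index" in plan_text

physics_names = [row[0] for row in conn.execute(query)]

# Drop the index again and confirm that it is gone.
conn.execute("drop index dept_index")
remaining = conn.execute(
    "select name from sqlite_master where type = 'index' and name = 'dept_index'"
).fetchall()
```

Printing plan_text shows the plan step that names dept_index; after the drop, the same query still works but has to scan the table.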

Multilevel indices: indices with two or more levels. Index entries are always kept in sorted order. For a file with a very large number of tuples (e.g., 100,000,000), the primary index may not fit in memory, and binary search over an index kept on disk is expensive and time-consuming because each probe can require a separate block read. The solution is to treat the primary index kept on disk as a sequential file and construct a sparse outer index on it (the original index becomes the inner index). If the outer index is still too large to fit in memory, yet another level can be added. Multilevel indices are closely related to tree structures, such as the binary trees used for in-memory indexing. On insertion and deletion, the indices at all levels must be updated.

Outer index: a sparse index on the primary index
Inner index: the primary index file
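
A two-level lookup can be sketched like this. It is an in-memory illustration only: ENTRIES_PER_BLOCK, the integer keys and the "rec..." pointers are made-up values, and a Python list stands in for each disk block of the inner index.

```python
import bisect

ENTRIES_PER_BLOCK = 4   # hypothetical number of index entries per block

# Inner (primary) index: sorted (key, record pointer) entries, split into blocks.
inner_entries = [(k, f"rec{k}") for k in range(0, 100, 5)]   # keys 0, 5, ..., 95
inner_blocks = [
    inner_entries[i:i + ENTRIES_PER_BLOCK]
    for i in range(0, len(inner_entries), ENTRIES_PER_BLOCK)
]

# Outer index: sparse, one entry per inner block (that block's smallest key).
outer_keys = [block[0][0] for block in inner_blocks]

def lookup(key):
    """Return the record pointer for key, or None if the key is not indexed."""
    # Step 1: binary-search the small outer index (assumed to fit in memory).
    b = bisect.bisect_right(outer_keys, key) - 1
    if b < 0:
        return None
    # Step 2: read a single inner-index block and search inside it.
    for k, pointer in inner_blocks[b]:
        if k == key:
            return pointer
    return None
```

With 20 inner entries the outer index holds only 5 keys, and each lookup touches exactly one inner block instead of probing several.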

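Secondary index: an index whose search key specifies an order different from the sequential order of the file. Also called a non-clustering index. A secondary index must be dense: it must contain pointers to all records, because the file is not sorted on the secondary search key and records with the same key value can be anywhere in the file. Secondary indices help when a query needs all the records whose values in some field other than the ordering field satisfy a condition. For example, if the Employee relation is stored sequentially by EName, a secondary index on the department finds all employees in the CS department without scanning the whole file.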

A sequential scan in clustering index order is efficient because records in the file are stored physically in the same order as the index order.

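A sequential scan using a primary index is efficient, but a sequential scan using a secondary index is expensive: consecutive index entries can point to records in different blocks, so each record access may require reading a new block from disk.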
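
A minimal sketch of the Employee example, assuming the file is a Python list sorted by EName and list positions stand in for record pointers; the employee names and departments are invented for illustration.

```python
# Hypothetical Employee file, stored sequentially by EName.
employees = [
    ("Adams", "CS"),
    ("Baker", "EE"),
    ("Clark", "CS"),
    ("Davis", "ME"),
    ("Evans", "CS"),
]

# Dense secondary index on department: each key maps to the list of
# positions of ALL records with that value.
secondary = {}
for pos, (_, dept) in enumerate(employees):
    secondary.setdefault(dept, []).append(pos)

def find_by_dept(dept):
    """Follow every pointer stored for dept and fetch the employee names."""
    return [employees[pos][0] for pos in secondary.get(dept, [])]

cs_positions = secondary["CS"]
cs_names = find_by_dept("CS")
```

The positions for CS are scattered through the file rather than adjacent, which is why scanning in secondary-index order costs more than scanning in file order.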

SQL Server unique indexes vs. unique constraints:

A unique index ensures that the values in the index key columns are unique.
A unique constraint also guarantees that no duplicate values can be inserted into the column(s) on which the constraint is created. When a unique constraint is created, a corresponding unique index is automatically created on the column(s).

Generally there is no functional difference between a unique index and a unique constraint. The latter is also listed as a constraint, but this only serves to emphasize the purpose of the index. The query optimizer treats the index the same way whether it was created as a unique index or as a unique constraint, so there is no performance difference. There are some differences at creation time, however: some index creation options are not available for unique constraints.

Heap: a table without a clustered index. Its data rows are stored in no particular order, on pages spread across multiple extents.
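
The unique index versus unique constraint comparison can be checked with a short experiment. The sketch below uses SQLite rather than SQL Server, only because it runs from Python without a server, so it illustrates the general behaviour and not SQL Server internals; the table and index names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table 1: uniqueness enforced by an explicitly created unique index.
conn.execute("create table tbl_index (code text)")
conn.execute("create unique index tbl_index_code on tbl_index(code)")

# Table 2: uniqueness enforced by a unique constraint.
conn.execute(
    "create table tbl_constraint "
    "(code text, constraint tbl_constraint_uq unique (code))"
)

def rejects_duplicate(table):
    """Insert the same value twice; report whether the second insert fails."""
    conn.execute(f"insert into {table} values ('X1')")
    try:
        conn.execute(f"insert into {table} values ('X1')")
    except sqlite3.IntegrityError:
        return True
    return False

results = {t: rejects_duplicate(t) for t in ("tbl_index", "tbl_constraint")}

# Count the indexes on each table: the constraint created one automatically.
index_counts = {
    t: len(conn.execute(f"pragma index_list({t})").fetchall())
    for t in ("tbl_index", "tbl_constraint")
}
```

Both tables reject the duplicate, and both end up with exactly one index, which matches the description of a constraint being backed by an automatically created unique index.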

Clustered index: the data rows are stored in order based on the clustered index key. It is implemented as a B-tree index structure. It has one row in sys.partitions, with index_id = 1, for each partition used by the index. Data pages at the leaf level are linked in a doubly linked list.

Non-clustered index: it is also implemented as a B-tree index structure, but it does not affect the order of the data rows. Each index row contains the non-clustered key value, a row locator (a row ID for a heap, or the clustered index key for a table with a clustered index), and any included (non-key) columns.

Indexing tips

https://www.mssqltips.com/sqlservertip/4270/difference-between-sql-server-unique-indexes-and-unique-constraints/

https://go.sqlauthority.com

Tip 1: Unused indexes: drop unused indexes

Tip 2: Missing indexes: create the indexes that frequent queries need

Tip 3: Duplicate indexes: drop duplicate indexes
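
As an illustration of Tip 3, duplicate indexes can be found by grouping indexes on their column lists. The sketch below does this for SQLite using its index_list and index_info pragmas; it is not a SQL Server script, and the helper name duplicate_indexes and the sample indexes are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID text, name text, dept_name text);
create index dept_index on instructor(dept_name);
create index dept_index_copy on instructor(dept_name);
create index name_index on instructor(name);
""")

def duplicate_indexes(table):
    """Return groups of index names that cover the same columns in the same order."""
    by_columns = defaultdict(list)
    for row in conn.execute(f"pragma index_list({table})").fetchall():
        index_name = row[1]                      # second column is the index name
        info = conn.execute(f"pragma index_info({index_name})").fetchall()
        columns = tuple(r[2] for r in info)      # third column is the column name
        by_columns[columns].append(index_name)
    return [sorted(names) for names in by_columns.values() if len(names) > 1]

duplicates = duplicate_indexes("instructor")
```

Here dept_index and dept_index_copy are reported as one group, so one of them can be dropped without losing any lookup capability.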

Friday, June 22, 2018

Terms used in everyone's life

Perception (what’s happening now?)
Notification (what do I need to know now?)
Suggestion (what do you recommend?)
Automation (what should I always do?)
Prediction (what can I expect to happen?)
Prevention (what can I avoid?)
Situational Awareness (what do I need to do right now?)

Thursday, June 21, 2018

Transaction Management

Oh.. I transferred funds from my savings account to my checking account, but the transaction does not seem to have completed. What happened?

All-or-none - ATOMICITY
The funds transfer must preserve the consistency of the database - CONSISTENCY
After the funds transfer executes successfully, the new balances must persist in my savings and checking accounts despite system failures - DURABILITY

A collection of operations that performs a single logical function in a database application - TRANSACTION
Ensures atomicity and durability - RECOVERY MANAGER
Detects system failures and restores the database to the state it was in before the failure - FAILURE RECOVERY
Controls the interaction among concurrent transactions - CONCURRENCY CONTROL MANAGER
CONCURRENCY CONTROL MANAGER + RECOVERY MANAGER = TRANSACTION MANAGER
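
The all-or-none behaviour of the funds transfer can be sketched with SQLite through Python's sqlite3 module. The account table, the starting balances and the rule that a balance may not go negative are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table account "
    "(name text primary key, balance integer check (balance >= 0))"
)
conn.executemany(
    "insert into account values (?, ?)",
    [("savings", 500), ("checking", 100)],
)
conn.commit()

def transfer(amount):
    """Move amount from savings to checking as one transaction."""
    try:
        with conn:   # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "update account set balance = balance + ? where name = 'checking'",
                (amount,),
            )
            conn.execute(
                "update account set balance = balance - ? where name = 'savings'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("select name, balance from account"))

ok_small = transfer(200)      # both updates succeed and are committed
after_small = balances()
ok_large = transfer(1000)     # the debit violates the check; the credit is undone
after_large = balances()
```

In the failing transfer the credit to checking has already been applied when the debit fails, yet the rollback leaves both balances exactly as they were before, which is the atomicity property in action.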

Views and Materialized Views

Views

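For security reasons, it is not desirable for all users to see the entire logical model of the database. It is also useful to give each user a personalized collection of relations. One could compute and store the results of queries and make the stored relations available to users, but the stored results would go out of date whenever the underlying data changed. A view avoids this: it is a virtual relation defined by a query, and its contents are computed whenever the view is used.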

create view v as query_expression;

Here v is the name of the view and query_expression is any legal query.

Eg: create view faculty as select ID, name, dept_name from instructor;
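
The faculty view can be tried out with SQLite through Python's sqlite3 module. This is a small sketch; the instructor rows are sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table instructor "
    "(ID text primary key, name text, dept_name text, salary real)"
)
conn.execute("insert into instructor values ('22222', 'Einstein', 'Physics', 95000)")

# Same view as in the text: it hides the salary column.
conn.execute("create view faculty as select ID, name, dept_name from instructor")

cursor = conn.execute("select * from faculty")
view_columns = [d[0] for d in cursor.description]

# The view is not stored: a row added to instructor afterwards shows up
# the next time the view is queried.
conn.execute("insert into instructor values ('10101', 'Srinivasan', 'Comp. Sci.', 65000)")
names = [row[0] for row in conn.execute("select name from faculty order by name")]
```

Users who are given access only to faculty can see who teaches where, but not what anyone earns.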

Materialized Views

Certain database systems allow view relations to be stored, but they make sure that if the actual relations used in the view definition change, the view is kept up to date. For example, if a tuple is added to or deleted from a table in the database, the contents of the materialized view must be updated. This process is called materialized-view maintenance.
A materialized view in Oracle is a database object that contains the results of a query. It stores the data physically and is refreshed periodically. A query against a materialized view reads the data directly from the materialized view, not from the underlying table.
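
Materialized-view maintenance can be imitated in a system without materialized views. The sketch below uses SQLite, which has no materialized views of its own, and keeps an ordinary table dept_total up to date with triggers; it illustrates the idea only and is not how Oracle implements it. The table names, trigger names and salaries are made up, and updates to existing rows are deliberately not handled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table instructor (ID text primary key, name text, dept_name text, salary integer);

-- Stored result of:
--   select dept_name, sum(salary) from instructor group by dept_name
create table dept_total (dept_name text primary key, total integer);

-- Maintenance on insert: add the new salary to the department's total.
create trigger instructor_ins after insert on instructor begin
    insert or ignore into dept_total values (new.dept_name, 0);
    update dept_total set total = total + new.salary
        where dept_name = new.dept_name;
end;

-- Maintenance on delete: subtract the salary, drop empty departments.
create trigger instructor_del after delete on instructor begin
    update dept_total set total = total - old.salary
        where dept_name = old.dept_name;
    delete from dept_total
        where dept_name = old.dept_name
          and not exists (select 1 from instructor where dept_name = old.dept_name);
end;
""")

conn.execute("insert into instructor values ('22222', 'Einstein', 'Physics', 95000)")
conn.execute("insert into instructor values ('33456', 'Gold', 'Physics', 87000)")
conn.execute("insert into instructor values ('12121', 'Wu', 'Finance', 90000)")
conn.execute("delete from instructor where ID = '12121'")

stored = dict(conn.execute("select dept_name, total from dept_total"))
recomputed = dict(
    conn.execute("select dept_name, sum(salary) from instructor group by dept_name")
)
```

Reading dept_total never touches instructor, yet after every insert and delete it agrees with the result of running the defining query again.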

Monday, June 11, 2018

Customer Relationship Management

CRM involves managing all aspects of a customer's relationship with an organization to increase customer loyalty and retention and the organization's profitability.
CRM systems, sometimes called e-CRM systems, use technology to help an e-business manage its customer base. CRM allows an e-business to match customer needs with product plans and offerings, remind customers of service requirements, and determine what products a customer has purchased.

CRM software:

Salesforce
SAP CRM: provides a central marketing platform that enables organizations to analyze, plan, develop, and execute all marketing activities through all customer interaction points
Oracle
Microsoft Dynamics CRM
IBM

Operational CRM – Supports traditional transactional processing for day-to-day front-office operations or systems that deal directly with the customers. Examples of such a process include mobile phones and cameras.

Analytical CRM – Supports back-office operations and strategic analysis and includes all systems that do not deal directly with the customers. It is concerned with exploiting customer data to enhance both customer and company value. Analytical CRM builds on the foundation of customer information. Customer data may be found in enterprise-wide repositories: sales data (purchase history), financial data (payment history, credit score), marketing data (campaign response, loyalty scheme data), service data. To these internal data can be added data from external sources: geo-demographic and lifestyle data from business intelligence organizations, for example. With the application of data mining tools, the company can then interrogate these data. Intelligent interrogation provides answers to questions such as: Who are our most valuable customers? Which customers have the highest propensity to switch to competitors? Which customers would be most likely to respond to a particular offer?

Analytical CRM has become an essential part of effective CRM implementation. Operational CRM struggles to reach full effectiveness without analytical information on the value of customers. Customer value drives many operational CRM decisions, such as:

a) Which customers shall we target with this offer?
b) What is the relative priority of customers waiting on the line, and what level of service should be offered?
c) Where should I focus my sales effort?

From the customer's point of view, analytical CRM can deliver better, more timely, even personally customized, solutions to the customer's problems, thereby enhancing customer satisfaction. From the company's point of view, analytical CRM offers the prospect of more powerful cross-selling and up-selling programs, and more effective customer retention and customer acquisition programs. Retailer Wal-Mart uses analytical CRM. It collects data from its 1200 stores to identify which segments are shopping, what they are buying and which promotions are most effective. It constantly tries to improve average basket value through effective analytical CRM.

Sales Process
