Data Mining Note

Week 1

Reading: Han, Chapters 1-3

Overview

Data mining: Automatic knowledge discovery from data (KDD).

Data warehousing: Efficient data analysis

Data warehouse: a repository of multiple heterogeneous data sources organized under a unified schema at a single site to facilitate management decision making.

Know your data

attribute

Q: the dissimilarity between objects

similarity(i, j) = 1 - dissimilarity(i, j)
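As a small illustration of the relation above, here is a sketch that assumes the objects are described only by nominal attributes and uses the simple-matching measure d(i, j) = (p - m) / p, where p is the number of attributes and m the number of matches. The function names and the sample objects are hypothetical, chosen only for this example.

```python
def nominal_dissimilarity(obj_i, obj_j):
    """Simple-matching dissimilarity d(i, j) = (p - m) / p for nominal attributes."""
    if len(obj_i) != len(obj_j) or not obj_i:
        raise ValueError("objects must have the same non-zero number of attributes")
    p = len(obj_i)                                  # total number of attributes
    m = sum(a == b for a, b in zip(obj_i, obj_j))   # number of matching attributes
    return (p - m) / p


def similarity(obj_i, obj_j):
    """similarity(i, j) = 1 - dissimilarity(i, j)."""
    return 1 - nominal_dissimilarity(obj_i, obj_j)


# Hypothetical objects with three nominal attributes each.
a = ("red", "small", "round")
b = ("red", "large", "round")

print(nominal_dissimilarity(a, b))  # one mismatch out of three attributes
print(similarity(a, b))
```

Other attribute types (numeric, ordinal, binary) need their own dissimilarity measure; only the final similarity = 1 - dissimilarity step carries over unchanged.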

Normalization Methods

① Min-max normalization
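For reference, the usual textbook form of min-max normalization maps a value v of attribute A linearly into a new range. The symbols below (min_A, max_A, new_min_A, new_max_A) follow common notation and are an assumption here, since the note itself does not spell out the formula.

```latex
v' = \frac{v - \min_A}{\max_A - \min_A}\,\bigl(\mathit{new\_max}_A - \mathit{new\_min}_A\bigr) + \mathit{new\_min}_A
```

With new_min_A = 0 and new_max_A = 1, this reduces to v' = (v - min_A) / (max_A - min_A).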

Advantage:

Min-max normalization preserves the relationships among the original data values.

Disadvantage:

Min-max normalization will encounter an “out-of-bounds” error if a future input case falls outside the original data range for attribute A.
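The advantage and the disadvantage can both be seen in a minimal sketch. The function name, the default target range [0, 1], and the income values are assumptions made for illustration only.

```python
def min_max_normalize(v, min_a, max_a, new_min=0.0, new_max=1.0):
    """Linearly map v from [min_a, max_a] to [new_min, new_max]."""
    if max_a == min_a:
        raise ValueError("max_a must differ from min_a")
    return (v - min_a) / (max_a - min_a) * (new_max - new_min) + new_min


# Hypothetical values of an attribute A (e.g. income) seen when fitting.
incomes = [12000, 73600, 98000]
lo, hi = min(incomes), max(incomes)

# The ordering and relative spacing of the original values are preserved.
scaled = [min_max_normalize(v, lo, hi) for v in incomes]
print(scaled)

# A future value above the original maximum maps outside [0, 1].
future = min_max_normalize(120000, lo, hi)
print(future)  # greater than 1.0, i.e. "out of bounds"
```

In practice such values are usually either clipped to the target range or rejected explicitly; which one is appropriate depends on the application.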


Reposted from www.cnblogs.com/weixia14/p/11370192.html