Data Objects and Attribute Types

Uploaded by

ravishankar55
Data Objects and Attribute Types:

In data mining, a data object is an entity described by a set of attributes. For instance, a
customer in a retail store can be a data object, described by attributes like age, gender,
income, and purchase history.
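As a minimal sketch, the customer example above could be represented in Python as a record whose fields are its attributes. The class name `Customer`, the field names, and the sample values are illustrative assumptions and do not come from any particular dataset.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Customer:
    """One data object: a customer described by a fixed set of attributes."""
    age: int                       # in years
    gender: str                    # category label
    income: float                  # annual income (hypothetical currency units)
    purchase_history: List[str] = field(default_factory=list)  # item names


# A hypothetical data object; all values are invented for illustration.
customer = Customer(age=34, gender="female", income=52000.0,
                    purchase_history=["bread", "milk"])

# The attributes that describe this data object.
attribute_names = [f.name for f in fields(customer)]
print(attribute_names)
```

Each instance of the class is one data object, and a collection of such instances would form the data set that a mining algorithm works on.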
Attributes are the characteristics that describe a data object, and an attribute's type is determined by the kind of values it can take. Attribute types can be categorized into:
1. Nominal: Categorical data without an inherent order, such as color, gender, or
country.
2. Ordinal: Categorical data with a specific order, like low, medium, and high, or
educational levels (elementary, high school, college).
3. Interval: Numerical data with meaningful differences but no true zero point, such as
temperature in Celsius or Fahrenheit.
4. Ratio: Numerical data with a true zero point, allowing for ratios and proportions, such
as weight, height, or income.
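The practical difference between the four types is which operations are meaningful on their values. The following Python sketch illustrates this under stated assumptions: the rank numbers assigned to the education levels, the helper names, and the sample values are hypothetical and chosen only for illustration.

```python
# Nominal: only equality is meaningful.
def same_category(a: str, b: str) -> bool:
    """Return True if two nominal values are the same category."""
    return a == b


# Ordinal: categories have an order, so they can be ranked and compared.
# The level names come from the text; the rank numbers are an assumption.
EDUCATION_RANK = {"elementary": 0, "high school": 1, "college": 2}


def ranks_higher(a: str, b: str) -> bool:
    """Return True if education level a ranks above level b."""
    return EDUCATION_RANK[a] > EDUCATION_RANK[b]


# Interval: differences are meaningful, but ratios depend on the unit
# because the zero point is arbitrary.
def celsius_to_fahrenheit(c: float) -> float:
    return c * 9.0 / 5.0 + 32.0


ratio_celsius = 20.0 / 10.0
ratio_fahrenheit = celsius_to_fahrenheit(20.0) / celsius_to_fahrenheit(10.0)

# Ratio: a true zero makes ratios independent of the unit.
KG_TO_LB = 2.20462  # approximate conversion factor
ratio_kg = 80.0 / 40.0
ratio_lb = (80.0 * KG_TO_LB) / (40.0 * KG_TO_LB)

print(ratio_celsius, ratio_fahrenheit)  # the two temperature ratios differ
print(ratio_kg, ratio_lb)               # the two weight ratios agree
```

The temperature ratio changes when the unit changes, which is why "20 degrees is twice as hot as 10 degrees" is not a valid statement for an interval attribute, whereas "80 kg is twice as heavy as 40 kg" holds in any unit.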
Measuring Data Similarity and Dissimilarity
In data mining, understanding the relationships between data points is essential for various
tasks like clustering, classification, and anomaly detection. To quantify these relationships,
we use similarity and dissimilarity measures.
Common Similarity and Distance Measures
Similarity measures quantify how alike two data points are, while distance (dissimilarity)
measures quantify how far apart they are: the smaller the distance, the more similar the
points. Common measures include:
• Euclidean Distance: A dissimilarity measure giving the straight-line distance between two
points in Euclidean space. It is commonly used for numerical data.
• Manhattan Distance: A dissimilarity measure that sums the absolute differences of the two
points' Cartesian coordinates. It applies to numerical data and, because the differences are
not squared, it is less dominated by a single large coordinate difference than Euclidean distance.
• Cosine Similarity: The cosine of the angle between two vectors, which depends on their
direction and not on their length. It is particularly useful for text data and other
high-dimensional data.
• Jaccard Similarity: The size of the intersection of two sets divided by the size of their
union. It is often used for binary or set-valued data, such as the sets of words in text documents.
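The four measures listed above can be sketched in plain Python as follows. The function names, the input checks, and the convention that two empty sets have a Jaccard similarity of 1.0 are assumptions made for this illustration, not definitions taken from the text.

```python
import math
from typing import Sequence, Set


def _check_same_length(x: Sequence[float], y: Sequence[float]) -> None:
    if len(x) != len(y):
        raise ValueError("vectors must have the same number of attributes")


def euclidean_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Straight-line distance: square root of the summed squared differences."""
    _check_same_length(x, y)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Sum of the absolute coordinate differences."""
    _check_same_length(x, y)
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between x and y (undefined for zero vectors)."""
    _check_same_length(x, y)
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_x * norm_y)


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by the size of the union."""
    if not a and not b:
        return 1.0  # assumed convention: two empty sets count as identical
    return len(a & b) / len(a | b)


# Hypothetical sample points and word sets.
print(euclidean_distance((1, 2), (4, 6)))   # 5.0
print(manhattan_distance((1, 2), (4, 6)))   # 7
print(cosine_similarity((1, 2, 3), (2, 4, 6)))
print(jaccard_similarity({"data", "mining"}, {"data", "science"}))
```

Note that the first two functions return distances, where 0 means identical points, while the last two return similarities, where larger values mean the objects are more alike.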
