Experiment – 01
Aim: Define data mining and describe its trends, tools, and applications.
Data Mining: Data mining is the process of extracting knowledge or insights from large
amounts of data using statistical and computational techniques. The data can be
structured, semi-structured, or unstructured, and can be stored in various forms such as
databases, data warehouses, and data lakes.
Its primary goal is to discover hidden patterns and relationships in the data that can be used to
make informed decisions or predictions. This involves exploring the data using various
techniques such as clustering, classification, regression analysis, association rule mining, and
anomaly detection.
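As a small illustration of one of these techniques, association rule mining, the sketch below computes the support and confidence of a single rule. The transactions and the function names (support, confidence) are hypothetical and made up for this example; they are not taken from any dataset or library used in this report.

```python
# Hypothetical market-basket transactions (made-up data for illustration).
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "milk", "eggs"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    joint = support(set(antecedent) | set(consequent), transactions)
    base = support(antecedent, transactions)
    return joint / base if base else 0.0


rule_support = support({"bread", "milk"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
print(f"support(bread, milk) = {rule_support:.2f}")
print(f"confidence(bread -> milk) = {rule_confidence:.2f}")
```

A rule is usually kept only when both values exceed thresholds chosen by the analyst; algorithms such as Apriori automate that search over many candidate itemsets.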
The Trends in Data Mining are as follows:
1. Application exploration
Data mining is increasingly used to explore applications in other areas, such as financial
analysis, telecommunications, biomedicine, wireless security, and science.
2. Multimedia Data Mining
This is a relatively recent method that is gaining ground because of the growing ability to capture
useful data accurately. It involves extracting data from different kinds of multimedia sources
such as audio, text, hypertext, video, and images. The data is converted into a numerical
representation in different formats. This method can be used for clustering and classification,
performing similarity checks, and identifying associations.
3. Ubiquitous Data Mining
This method involves mining data from mobile devices to get information about individuals.
Despite challenges such as complexity, privacy, and cost, this method has considerable
potential across industries, especially in studying human-computer interaction.
4. Distributed Data Mining
This type of data mining is gaining popularity as it involves mining a huge amount of
information stored in different company locations or at different organizations. Highly
SWASTIKA BERA 1/22/FET/BCS/441
sophisticated algorithms are used to extract data from different locations and provide proper
insights and reports based on them.
5. Embedded Data Mining
Data mining features are increasingly finding their way into many enterprise software use cases,
from sales forecasting in CRM SaaS platforms to cyber threat detection in intrusion
detection/prevention systems. The embedding of data mining into vertical market software
applications enables prediction capabilities for any number of industries and opens up new
realms of possibilities for unique value creation.
6. Spatial and Geographic Data Mining
This emerging type of data mining extracts information from environmental,
astronomical, and geographical data, including images taken from outer space. It can
reveal aspects such as distance and topology, which are mainly used
in geographic information systems and other navigation applications.
7. Time Series and Sequence Data Mining
The primary application of this type of data mining is the study of cyclical and seasonal trends.
It is also helpful for analyzing random events that fall outside the normal
series of events. Retail companies mainly use this method to assess customers' buying patterns
and behaviors.
8. Data Mining Dominance in the Pharmaceutical and Health Care Industries
Both the pharmaceutical and health care industries have long been innovators in the category
of data mining. The rapid development of coronavirus vaccines has been attributed in part to
advances in data mining techniques for pharmaceutical testing, specifically signal detection during
the clinical trial process for new drugs. In health care, specialized data mining techniques are
being used to analyze DNA sequences for creating custom therapies, to make better-informed
diagnoses, and more.
9. Increasing Automation In Data Mining
Today's data mining solutions typically integrate ML and big data stores to provide advanced
data management functionality alongside sophisticated data analysis techniques. Earlier
incarnations of data mining involved manual coding by specialists with a deep background in
statistics and programming. Modern techniques are highly automated, with AI/ML replacing
most of these previously manual processes for developing pattern-discovering algorithms.
10. Data Mining Vendor Consolidation
If history is any indication, significant product consolidation in the data mining space is
imminent as larger database vendors acquire data mining tooling startups to augment their
offerings with new features. The current fragmented market and a broad range of data mining
players resemble the adjacent big data vendor landscape that continues to undergo
consolidation.
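To make trend 7 above (time series and sequence data mining) more concrete, the following sketch separates a smoothed trend from a seasonal profile using only the Python standard library. The monthly sales figures and the helper names (moving_average, seasonal_profile) are hypothetical and chosen purely for illustration.

```python
from statistics import mean

# Hypothetical monthly sales for two years (made-up numbers).
monthly_sales = [
    120, 115, 130, 140, 150, 170, 180, 175, 160, 145, 190, 240,  # year 1
    125, 118, 136, 148, 155, 178, 186, 181, 166, 150, 200, 255,  # year 2
]


def moving_average(values, window):
    """Trailing moving average; smooths short-term noise to expose the trend."""
    return [
        mean(values[i - window + 1 : i + 1])
        for i in range(window - 1, len(values))
    ]


def seasonal_profile(values, period):
    """Average of each position in the cycle (here, each calendar month)."""
    return [mean(values[i::period]) for i in range(period)]


trend = moving_average(monthly_sales, window=12)
profile = seasonal_profile(monthly_sales, period=12)
peak_month = profile.index(max(profile)) + 1  # 1 = January

print("12-month moving average:", [round(v, 1) for v in trend])
print("Month with the highest average sales:", peak_month)
```

A 12-month window is used so that each average covers one full yearly cycle; a retailer would replace the made-up list with its own sales history.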
The Tools Used in Data Mining are as follows:
Various tools help analyze and extract valuable information from data, including:
1. RapidMiner – A user-friendly data science platform supporting predictive analytics and
ML.
2. WEKA – A collection of ML algorithms for data mining tasks.
3. KNIME – A platform for data analytics and workflow automation.
4. Apache Mahout – A scalable machine learning library for big data mining.
5. Orange – A visual programming tool for data mining and analysis.
6. Python & R – Popular programming languages for data mining; Python offers libraries such as
pandas, scikit-learn, and TensorFlow.
7. Microsoft SQL Server Analysis Services (SSAS) – Supports OLAP and data mining in
SQL databases.
Applications of Data Mining:
1. Healthcare – Data mining helps improve the quality of healthcare systems. Predictive
analysis helps recommend medicines and evaluate treatment progress. Similarly,
identifying unusual patterns in medical claims, medicine purchases, or inconsistent prescriptions
helps track fraudulent practices. Typical uses include disease prediction, drug discovery,
patient monitoring, and hospital management.
2. Finance & Banking – Banks hold detailed information about their customers,
transactions, and loans. Analysing this data allows banks to classify customers and
customise services such as loans, credit card spending limits, rewards, and discounts on
purchases. Identifying unusual activity in a transaction helps track fraud and
security breaches. Typical uses include fraud detection, credit risk assessment, algorithmic
trading, and customer segmentation.
3. Retail & E-commerce – The retail and e-commerce sector collects and tracks customer
details, transactions and product sales. It helps them identify customer purchase behaviour,
product preferences and seasonal product sales. Organisations use this data to forecast
sales and customise their offerings. Using past data efficiently to make business decisions helps
retailers and e-commerce owners reduce risk and increase profitability. Typical uses include
customer behavior analysis, recommendation systems, and sales forecasting.
4. Social Media & Marketing – Media channels like radio, television and over-the-top (OTT)
platforms keep track of their audience to understand consumption patterns. Using this
information, media providers make content recommendations, change programme schedules
and produce content of the preferred genre. Data mining helps media providers improve the
viewer experience. Typical uses include sentiment analysis, targeted advertising, and customer segmentation.
5. Manufacturing & Supply Chain – Companies that move goods and materials between
numerous warehouses use data mining to optimise their processes. It helps them analyse
demand patterns and plan supply accordingly. Companies also use this data to make their
distribution channels more efficient and improve coordination with other warehouses and
distributors. Typical uses include predictive maintenance, demand forecasting, and quality control.
6. Education – Data mining is used in education to study student productivity and development.
It helps understand how a student is performing, predict their future scores, identify relevant
placement opportunities, and track teacher performance. Data mining may help derive
associations between teaching methodologies and student performance and identify areas
for improvement. Typical uses include student performance prediction, personalized learning,
and curriculum design.
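Several of the applications above, such as fraud detection in banking, rest on anomaly detection. The sketch below flags unusual transaction amounts with a modified z-score based on the median and the median absolute deviation (MAD). The amounts are hypothetical, the function name flag_outliers is made up for this example, and the 3.5 cutoff is a commonly used convention rather than a rule; real fraud systems combine many more signals.

```python
from statistics import median

# Hypothetical card transaction amounts (made-up values for illustration).
amounts = [42.0, 38.5, 45.2, 40.1, 39.9, 41.3, 980.0, 43.7, 37.8, 44.0]


def flag_outliers(values, threshold=3.5):
    """Return (index, value) pairs whose modified z-score exceeds threshold."""
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        # At least half the values equal the median; the score is undefined.
        return []
    return [
        (i, v)
        for i, v in enumerate(values)
        if abs(0.6745 * (v - med) / mad) > threshold
    ]


flagged = flag_outliers(amounts)
print("Suspicious transactions (index, amount):", flagged)
```

The median and MAD are used instead of the mean and standard deviation because a single extreme value barely shifts them, so the outlier cannot hide itself by inflating the spread.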
Experiment – 02
Aim: Install Weka data mining tool. Explore the “Explorer Tab” by importing any two
datasets.
1. Installation of Weka
STEP 1: Go to the website https://sourceforge.net/projects/weka/ and download the Weka tool.
STEP 2: Open the downloaded file and start installing Weka in your local system.
STEP 3: Begin the installation process by selecting next.
STEP 4: Provide your consent and agreement to move ahead.
STEP 5: Select the components you require and click on next.
STEP 6: Select the destination folder to install Weka.
STEP 7: Click on next to finish the installation setup.
STEP 8: Click on finish to close the setup and start Weka.
2. Explore the “Explorer Tab” by importing any two datasets.
Step 1: Open the Weka tool and go to the Explorer Tab.
Step 2: The following screen opens up. Click the “Open file...” button.
Step 3: Navigate to the Weka folder in your local system by searching in C drive.
Step 4: Open the Data folder inside the Weka folder. A variety of datasets will be available on the screen.
Step 5: Select the “weather.numeric.arff” dataset. The following graph will be displayed on the screen.
Step 6: Apply different filters to see the changes in the data and its graph.
Step 7: Select the “weather.numeric.arff” dataset. The following graph will be displayed on the screen.
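The datasets opened in the steps above are stored in Weka's ARFF text format: a header of @attribute declarations followed by comma-separated rows under @data. As a rough sketch of what the Explorer reads and summarises, the code below parses a small ARFF text and prints per-attribute statistics. The parser (parse_arff, summarize) is hypothetical and handles only simple dense files without quoting or missing values. The five sample rows are written in the layout of the weather data and should be checked against the file shipped with Weka.

```python
from collections import Counter
from statistics import mean

# Small sample in ARFF layout (assumption: illustrative rows, not the full file).
ARFF_TEXT = """\
% illustrative sample in the layout of the weather data
@relation weather

@attribute outlook {sunny, overcast, rainy}
@attribute temperature numeric
@attribute humidity numeric
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}

@data
sunny,85,85,FALSE,no
sunny,80,90,TRUE,no
overcast,83,86,FALSE,yes
rainy,70,96,FALSE,yes
rainy,68,80,FALSE,yes
"""


def parse_arff(text):
    """Parse a simple dense ARFF text (no quoting, sparse rows, or missing values)."""
    attributes = []  # (name, kind); kind is "numeric", "string", or a label list
    rows = []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            values = [v.strip() for v in line.split(",")]
            rows.append([
                float(v) if kind == "numeric" else v
                for v, (_, kind) in zip(values, attributes)
            ])
        elif lower.startswith("@attribute"):
            _, name, spec = line.split(None, 2)
            if spec.startswith("{"):
                kind = [label.strip() for label in spec.strip("{}").split(",")]
            elif spec.lower() in ("numeric", "real", "integer"):
                kind = "numeric"
            else:
                kind = "string"  # simplification: other types kept as text
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return attributes, rows


def summarize(attributes, rows):
    """Min/max/mean for numeric attributes, label counts for the others."""
    summary = {}
    for i, (name, kind) in enumerate(attributes):
        column = [row[i] for row in rows]
        if kind == "numeric":
            summary[name] = {"min": min(column), "max": max(column), "mean": mean(column)}
        else:
            summary[name] = dict(Counter(column))
    return summary


attributes, rows = parse_arff(ARFF_TEXT)
summary = summarize(attributes, rows)
for name, stats in summary.items():
    print(name, stats)
```

To try it on a real file, ARFF_TEXT can be replaced with the contents of a dataset from the Weka data folder, provided the file stays within the simple subset this sketch supports.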