KEMBAR78
TI2134 PracticalAssignment 2 | PDF | Data Management | Algorithms
0% found this document useful (0 votes)
32 views4 pages

TI2134 PracticalAssignment 2

The document contains a Python implementation of the Apriori algorithm for association rule mining using the mlxtend library. It includes two datasets: one for grocery items and another for movie preferences, with the results showing frequent itemsets and association rules derived from the data. The output includes support, confidence, and other metrics for the generated rules.

Uploaded by

Rocky
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
32 views4 pages

TI2134 PracticalAssignment 2

The document contains a Python implementation of the Apriori algorithm for association rule mining using the mlxtend library. It includes two datasets: one for grocery items and another for movie preferences, with the results showing frequent itemsets and association rules derived from the data. The output includes support, confidence, and other metrics for the generated rules.

Uploaded by

Rocky
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 4

2/18/25, 10:33 PM Untitled

Name: Sahil Govind chaudhari

Rol no: TI2134


In [37]: import pandas as pd

In [38]: from mlxtend.frequent_patterns import apriori,association_rules

Q1.
In [6]: dataset=[['Bread','Milk'],['Bread','Diaper','Beer','Eggs'],['Milk','Diaper','Bee
['Bread','Milk','Diaper','Beer'],['Bread','Milk','Diaper','Coke']]

In [7]: dataset

Out[7]: [['Bread', 'Milk'],


['Bread', 'Diaper', 'Beer', 'Eggs'],
['Milk', 'Diaper', 'Beer', 'Coke'],
['Bread', 'Milk', 'Diaper', 'Beer'],
['Bread', 'Milk', 'Diaper', 'Coke']]

In [14]: from mlxtend.preprocessing import TransactionEncoder


te=TransactionEncoder()
te_array=te.fit(dataset).transform(dataset)
df=pd.DataFrame(te_array,columns=te.columns_)
df

Out[14]: Beer Bread Coke Diaper Eggs Milk

0 False True False False False True

1 True True False True True False

2 True False True True False True

3 True True False True False True

4 False True True True False True

In [15]: freq_items=apriori(df,min_support=0.5,use_colnames=True)
print(freq_items)

support itemsets
0 0.6 (Beer)
1 0.8 (Bread)
2 0.8 (Diaper)
3 0.8 (Milk)
4 0.6 (Diaper, Beer)
5 0.6 (Bread, Diaper)
6 0.6 (Bread, Milk)
7 0.6 (Diaper, Milk)

In [18]: rule=association_rules(freq_items,metric='confidence',min_threshold=0.5)
print(rule)

file:///C:/Users/91982/Downloads/Untitled (1).html 1/4


2/18/25, 10:33 PM Untitled

antecedents consequents antecedent support consequent support support \


0 (Diaper) (Beer) 0.8 0.6 0.6
1 (Beer) (Diaper) 0.6 0.8 0.6
2 (Bread) (Diaper) 0.8 0.8 0.6
3 (Diaper) (Bread) 0.8 0.8 0.6
4 (Bread) (Milk) 0.8 0.8 0.6
5 (Milk) (Bread) 0.8 0.8 0.6
6 (Diaper) (Milk) 0.8 0.8 0.6
7 (Milk) (Diaper) 0.8 0.8 0.6

confidence lift representativity leverage conviction zhangs_metric \


0 0.75 1.2500 1.0 0.12 1.6 1.00
1 1.00 1.2500 1.0 0.12 inf 0.50
2 0.75 0.9375 1.0 -0.04 0.8 -0.25
3 0.75 0.9375 1.0 -0.04 0.8 -0.25
4 0.75 0.9375 1.0 -0.04 0.8 -0.25
5 0.75 0.9375 1.0 -0.04 0.8 -0.25
6 0.75 0.9375 1.0 -0.04 0.8 -0.25
7 0.75 0.9375 1.0 -0.04 0.8 -0.25

jaccard certainty kulczynski


0 0.75 0.375 0.875
1 0.75 1.000 0.875
2 0.60 -0.250 0.750
3 0.60 -0.250 0.750
4 0.60 -0.250 0.750
5 0.60 -0.250 0.750
6 0.60 -0.250 0.750
7 0.60 -0.250 0.750

Q2.
In [21]: dataset=[['User1','KGF','Salaar','Pushpa'],['User2','Toxic','Salaar','Bahubali']
['User4','KGF','Salaar','Bahubali','Pushpa'],['User5','Salaar','Bahubal
dataset

Out[21]: [['User1', 'KGF', 'Salaar', 'Pushpa'],


['User2', 'Toxic', 'Salaar', 'Bahubali'],
['User3', 'KGF', 'Toxic', 'Salaar'],
['User4', 'KGF', 'Salaar', 'Bahubali', 'Pushpa'],
['User5', 'Salaar', 'Bahubali', 'Pushpa']]

In [22]: te=TransactionEncoder()
te_array=te.fit(dataset).transform(dataset)
df=pd.DataFrame(te_array,columns=te.columns_)
df

Out[22]: Bahubali KGF Pushpa Salaar Toxic User1 User2 User3 User4 User5

0 False True True True False True False False False False

1 True False False True True False True False False False

2 False True False True True False False True False False

3 True True True True False False False False True False

4 True False True True False False False False False True

file:///C:/Users/91982/Downloads/Untitled (1).html 2/4


2/18/25, 10:33 PM Untitled

In [35]: freq_items=apriori(df,min_support=0.5,use_colnames=True)
print(freq_items)

support itemsets
0 0.6 (Bahubali)
1 0.6 (KGF)
2 0.6 (Pushpa)
3 1.0 (Salaar)
4 0.6 (Salaar, Bahubali)
5 0.6 (Salaar, KGF)
6 0.6 (Salaar, Pushpa)

In [36]: rules=association_rules(freq_items,metric='confidence',min_threshold=0.5)
print(rule)

antecedents consequents antecedent support consequent support \


4 (Toxic) (Bahubali) 0.4 0.6
12 (Toxic) (KGF) 0.4 0.6
27 (Toxic) (User2) 0.4 0.2
30 (Toxic) (User3) 0.4 0.2
32 (Bahubali, Pushpa) (KGF) 0.4 0.6
.. ... ... ... ...
10 (Salaar) (KGF) 1.0 0.6
16 (Salaar) (Pushpa) 1.0 0.6
3 (Bahubali) (Salaar) 0.6 1.0
11 (KGF) (Salaar) 0.6 1.0
17 (Pushpa) (Salaar) 0.6 1.0

support confidence lift representativity leverage conviction \


4 0.2 0.5 0.833333 1.0 -0.04 0.8
12 0.2 0.5 0.833333 1.0 -0.04 0.8
27 0.2 0.5 2.500000 1.0 0.12 1.6
30 0.2 0.5 2.500000 1.0 0.12 1.6
32 0.2 0.5 0.833333 1.0 -0.04 0.8
.. ... ... ... ... ... ...
10 0.6 0.6 1.000000 1.0 0.00 1.0
16 0.6 0.6 1.000000 1.0 0.00 1.0
3 0.6 1.0 1.000000 1.0 0.00 inf
11 0.6 1.0 1.000000 1.0 0.00 inf
17 0.6 1.0 1.000000 1.0 0.00 inf

zhangs_metric jaccard certainty kulczynski


4 -0.25 0.25 -0.250 0.416667
12 -0.25 0.25 -0.250 0.416667
27 1.00 0.50 0.375 0.750000
30 1.00 0.50 0.375 0.750000
32 -0.25 0.25 -0.250 0.416667
.. ... ... ... ...
10 0.00 0.60 0.000 0.800000
16 0.00 0.60 0.000 0.800000
3 0.00 0.60 0.000 0.800000
11 0.00 0.60 0.000 0.800000
17 0.00 0.60 0.000 0.800000

[226 rows x 14 columns]


C:\Users\91982\anaconda3\Lib\site-packages\mlxtend\frequent_patterns\association_
rules.py:186: RuntimeWarning: invalid value encountered in divide
cert_metric = np.where(certainty_denom == 0, 0, certainty_num / certainty_deno
m)

file:///C:/Users/91982/Downloads/Untitled (1).html 3/4


2/18/25, 10:33 PM Untitled

In [ ]:

file:///C:/Users/91982/Downloads/Untitled (1).html 4/4

You might also like