Data Mining
Code No: 1578 R18
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, February/March - 2022
DATA MINING
(Common to CSE, IT)
Time: 3 Hours    Max. Marks: 75

Answer any Five Questions
All Questions Carry Equal Marks

1. Explain the need of data preprocessing and various forms of preprocessing. [15]
2. What is data warehouse? Demonstrate integrating data mining system with a data warehouse with a neat diagram. [15]
3. Apply FP-Growth algorithm to the following data for finding frequent item sets, consider support threshold as 30%. [15]
[Transaction table illegible in source]
4.a) Hot denify sb gaps ina pi?
b) Give an overview of correlation analysis.
5.a) Explain classification as a two step process.
b) State Bayes theorem. How this concept is used in classification?
6. What is a decision tree? Explain decision tree induction algorithm. [15]
7.a) Contrast k-means clustering with k-medoids clustering approach.
b) Discuss the merits and demerits of hierarchical approaches for clustering. [8+7]
8. How to apply mining techniques to unstructured text database? Explain with example. [15]

Code No: 1578C R18
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, January/February - 2023
DATA MINING
(Common to CSE, IT, ITE)
Time: 3 Hours    Max. Marks: 75

Note: i) Question paper consists of Part A, Part B.
ii) Part A is compulsory, which carries 25 marks. In Part A, answer all questions.
iii) In Part B, answer any one question from each unit. Each question carries 10 marks and may have a, b as sub questions.

PART-A (25 Marks)
1.a) What is data warehouse?
b) List the applications of data mining.
c) What is meant by association rule mining?
d) Write a short note on SPM algorithm.
e) Why are decision trees useful?
f) List the advantages of using decision trees.
PART-B (50 Marks)
2.a) Explain how to integrate data mining system with a data warehouse.
b) "Data preprocessing is necessary before data mining process." Justify your answer. [5+5]
OR
3.a) Differentiate between data mining and data warehouse.
b) Discuss the major issues in data mining. [5+5]
4.a) Write short notes on constraint based association mining.
b) Describe various types of association rules. [5+5]
OR
5. Explain frequent pattern mining in data mining in detail. [10]
6. Describe Bayesian Belief Network with an example. [10]
OR
7.a) Briefly explain classification problems and general approaches to solve them.
b) Explain the merits and demerits of the lazy learning method. [5+5]
8. Explain the following:
a) Cluster analysis.
b) Grid-based methods. [5+5]
OR
9.a) How density based method is used for clustering?
b) Illustrate K-means algorithm with an example.
10. Explain the following:
a) Spatial data mining.
b) Text mining.
OR
11. Discuss various kinds of patterns to be mined from web server logs in web usage mining. [10]

Code No: 157 R18
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, July/August - 2022
DATA MINING
(Common to CSE, IT)
Time: 3 Hours    Max. Marks: 75

Answer any five questions
All questions carry equal marks

1.a) Write short notes on data mining task primitives.
b) Discuss in detail about data preprocessing.
2. Explain the following:
a) Integration of data mining system with a data warehouse.
b) Classification of data mining systems.
3.a) How do you find frequent patterns in data mining? Explain.
b) Explain constraint based association mining.
4.a) What are the measures of association rule mining? Explain.
b) Write short notes on SPM.
5.a) Compare the methods of classification and prediction.
b) How to evaluate performance of classification model? Explain.
6. Discuss in detail about rule-based classification. [15]
7.a) Explain K-means algorithm with an example.
b) What are the key issues in hierarchical clustering?
8.
Explain the following:
a) Spatial data mining.
b) Mining sequence patterns in transactional databases.
---ooOoo---

Code No: 1378Q R16
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, December - 2019
DATA MINING
(Common to CSE, IT)
Time: 3 Hours    Max. Marks: 75

Note: This question paper contains two parts A and B.
Part A is compulsory which carries 25 marks. Answer all questions in Part A. Part B consists of 5 Units. Answer any one full question from each unit. Each question carries 10 marks and may have a, b as sub questions.

PART-A (25 Marks)
1.a) Define data mining.
b) List the methods of filling missing values.
c) Define closed frequent itemset.
d) What is the role of confidence measure in association rule mining?
e) List the measures for selecting best split in decision tree construction.
f) Quote an example for Bayesian belief network.
g) What are the limitations of single linkage algorithm?
h) List the typical requirements of clustering in data mining.
i) What is meant by stop words?
j) Give the taxonomy of web mining.

PART-B (50 Marks)
2. Discuss data mining as a step in knowledge discovery process and various challenges associated. [10]
OR
3. Use a flowchart to summarize the following procedures for attribute subset selection:
a) Stepwise forward selection
b) Stepwise backward elimination [10]
4. Classify frequent pattern mining methods and explain the criteria followed for classification. [10]
OR
5. Apply Apriori algorithm to find frequent itemsets from the following transactional database. min sup = 30%. [10]

TID   Items bought
1     Pen, notebook, ruler
2     Pencil, eraser, sharpener
3     Pen, ruler, chart, sharpener
4     Pencil, clip, eraser
5     Rade, pin, story book, pen
6     Marker, chart, sketch pens

6. State classification problem and briefly explain general approaches to solve it. [10]
OR
7. Apply Naive-Bayesian classifier to identify class label (campus_placement) to the new sample student <7 to 8, "Fair", "Excellent", "No">. [10]
[Training data table illegible in source]
8.
Suppose thatthe data mining tsk is to cuser the following eight students into three clusters, the distance function is Manhattan Assign record 1.2.3 as the centroid of each | closer respectively. Use the k-means algorithm to show the final three clusters. (10) ny € Data Mining 2 Discuss data mining as 2 step in knowledge discovery process and various challenges sssocuted 0) on 3. Use alow to summarize the following recedes frau ute lection a) Sepwibe fore seletian ' | byStepse ska tmintion 00) 4 Glasty trogen prem miniog mods and explain the cttea followed for classifeation 0) on 5. Anny sri grin oid eget om he following ansactonal databose jt sup = 30, oo | TID) | tems tought | ? 1) Pen mlcbook. er 2 Peni erase sharpener 3 em rer chart taper 4 Pencil ip, ser 5 Rae, iy sry took, pen 6 Matter, sketchpens State asian robiem and biel explain ever approaches tose {10 ‘Apply Natve-Bayesian classifier to identify clas labe(campas_placement) 10 the new samplistodent <7 t 8, “Fai, Excellent™No>. [SD [COPA] Coxe Fisckaine | Campos placemcat ‘Sei Paricipation T_| Tow | Facets Yer Ye 2 [ies | Fair Yen Ye = [si0i0 | Poor No Yer ST] St06 | Poor No No S| 7w 8 |Bxcllent No. No. [ef eeo [foie Yer Yer ‘910 | Poor No] No | 0 to) Suppose that the dat mining ask is 10 cluster the following eight students into dee clusters, the distance function is Manhattan Assign record 1.2.3 as the centroid of cach luster respectively Use the kerma algorithm to show the inal tree chasers. - Recon ‘Weighing —] ~ 5 3s 30 0 50 as “a0 15S 6s oR to) Appraise the imporance of outlier detection and its application. Explain any one approach for outlier detection, to Discus various kinds of pterns to he mined from webvserer logs in webuisage mining. 
OR
11. Compare and contrast text mining with web content mining with examples. [10]
---ooOoo---

Code No: 13780 R16
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, February/March - 2022
DATA MINING
(Common to CSE, IT)
Time: 3 Hours    Max. Marks: 75

Answer any five questions
All questions carry equal marks

1.a) Explain various Data Mining Functionalities with an example.
b) Write about Data Mining Task Primitives.
2.a) What is Data Cleaning? Describe various methods of Data Cleaning.
b) Discuss about the issues to be considered during Data Integration.
3. Write a note on Maximal Frequent Item Set and Closed Frequent Item Set. [15]
4. Explain about the Apriori algorithm for finding frequent item sets with an example. [15]
5. Discuss about Decision tree induction algorithm with an example. [15]
6. Discuss about Naive Bayes classification algorithm with an example. [15]
7. Write partitioning around medoids algorithm. [15]
8. Explain about hierarchy of categories in text mining. [15]
---ooOoo---

Note: i) Question paper consists of Part A, Part B.
ii) Part A is compulsory, which carries 25 marks. In Part A, answer all questions.
iii) In Part B, answer any one question from each unit. Each question carries 10 marks and may have a, b as sub questions.

PART-A (25 Marks)
1.a) Define data mining.
b) What is meant by outlier analysis?
c) Define maximal frequent item set.
d) How to compute confidence of an association rule? Give example.
e) What is meant by test data?
f) What is the significance of information gain?
g) What is cluster analysis?
h) What are the drawbacks of single linkage clustering?
i) Give examples for unstructured text.
j) List the applications of web usage mining.

PART-B (50 Marks)
2. Discuss the steps in knowledge discovery process and compare it with data access and information retrieval. [10]
OR
3.a) Appraise usage of smoothing in data transformation.
b) Evaluate distance measures for dissimilarity computation.
4.
Apply FP-Growth algorithm to the following transactional database to find frequent item sets. [10]
[Transaction table illegible in source]
OR
5.a) Appraise the limitations of Apriori and suggest mechanisms to improve it.
b) Explain item merging concept for mining closed frequent item sets. [5+5]
6. State classification problem and discuss a general approach to solve classification problem. [10]
OR
7.a) Discuss decision tree overfitting and pruning techniques.
b) Justify the selection of k value for KNN classifier.
8. Discuss hierarchical methods for clustering and contrast agglomerative and divisive approaches. [10]
OR
9.a) With suitable data explain statistical based outlier detection.
b) Criticize the evaluation metrics used for clustering.
10. Discuss the data mining tasks applicable to text databases. [10]
OR
11.a) Discover episode rules for the text given in this question paper.
b) Give a brief note on PageRank algorithm used in web structure mining.

Code No: 1378Q R16
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, July/August - 2022
DATA MINING
(Common to CSE, IT)
Time: 3 Hours    Max. Marks: 75

Answer any five questions
All questions carry equal marks

1.a) Discuss about challenging issues in Data Mining.
b) What is preprocessing? Explain about Data Transformation techniques.
2.a) Explain about the Data Cleaning techniques in detail.
b) Write about Data Mining Tasks with examples.
3. How to find all the frequent item sets using Apriori algorithm for the given data? [15]
[Minimum support value and transactional data table for an AllElectronics branch illegible in source]
4.a) List out different kinds of Association Rules with an example for each.
b) Explain about maximal frequent item set and closed frequent item set.
5. Describe Naive Bayesian Classification method with an example. [15]
6.a) How to solve a classification problem using K-nearest neighbor algorithm?
b) Explain about the measure for selecting the best split.
7.a) List out various clustering methods.
b) How to cluster the datasets using K-means clustering algorithm? [5+10]
8.a) Explain about unstructured text mining.
b) What is web content mining? Discuss in detail.
---ooOoo---

Code No: 1378Q R16
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, March - 2021
DATA MINING
(Common to CSE, IT)
Time: 3 Hours    Max. Marks: 75

Answer any Five Questions
All Questions Carry Equal Marks

1.a) How to handle redundancy in data integration?
b) Explain principal component analysis as a method of dimensionality reduction.
2. How can we mine closed frequent item sets? Explain. [15]
3. Explain market basket analysis and its relevance to association rule. Explain the Apriori algorithm using the following transaction data assuming that the support count is 22%. Illustrate with an example. [15]

TID   List of items
001   Milk, dal, sugar, bread
002   Dal, sugar, wheat, jam
003   Milk, bread, curd, paneer
004   Wheat, paneer, dal, sugar
005   Milk, paneer, bread
006   Wheat, dal, paneer, bread

4. Discuss K-Nearest neighbor classification: Algorithm and Characteristics. [15]
5. How Neural Networks can be used for Data classification? Which algorithm is suitable? Explain them with example. [15]
6. Explain various issues and challenges in data mining. [15]
7.a) Describe web usage mining.
b) Explain about Text Clustering with an illustrative example.
8.a) Write and explain about the k-medoids algorithm.
b) Describe distance based outlier detection.
---ooOoo---

[One question paper page illegible in source]

Code No: 1378Q
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, September - 2021
DATA MINING
(Common to CSE, IT)
Time: 3 Hours    Max. Marks: 75

Answer any Five Questions
All Questions Carry Equal Marks

1.a) What is data mining? Discuss the challenges associated with data mining.
b) anteste any thee canes for dsl of numeric dats
2.a) How to handle missing values in data mining process?
b) Explain the steps in principal component analysis for data reduction.
3. Generate the strong association rules for the following transactions using Apriori algorithm, minsup = 30% and minconf = 65%. [15]
[Transaction table illegible in source]
4. Consider the following training data set to construct naive Bayesian classifier and classify the test case: AMI=M, AL2=Q, AlL=? Explain the process. [15]
[Training data table illegible in source]
5.a) Discuss the significance of information gain in decision tree induction.
b) Explain k-nearest neighbor algorithm with an example.
6.a) How to evaluate clustering algorithms? Provide illustrations.
b) Explain the key issues, strengths and weaknesses of hierarchical clustering algorithms. [7+8]
7.a) Discuss the applications of web usage mining.
b) Explain web structure mining with suitable algorithm.
8.a) How to convert unstructured text into features in text mining?
b) Demonstrate clustering of text documents using appropriate similarity measures. [7+8]
---ooOoo---

Code No: 1760 R13
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, July - 2021
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours    Max. Marks: 75

Answer any Five Questions
All Questions Carry Equal Marks

1.a) Explain the Data warehouse architecture and its components.
b) Explain star schema with an example.
2.a) Explain the OLAP operations in the multidimensional model.
b) Briefly outline how to compute the dissimilarity between objects described by the following:
i) Nominal attributes
ii) Asymmetric binary attributes
iii) Numeric attributes
iv) Term-frequency vectors.
3.a) State the need of Data Cleaning. What are the issues that have to be addressed during Data integration? Why Data transformation is essential in the process of Knowledge Discovery?
b) Describe the issues and challenges of Data Mining Systems.
4. Define Association Rule mining. Explain FP-Growth algorithm with suitable illustration. Explain the pros and cons of Apriori algorithm. [15]
5. Apply a Naive Bayes classification for the data given below. [15]
[Training data table with Car type, Shirt size and Class columns illegible in source]
6.a) Write the basic algorithm for inducing a decision tree from training tuples. And also explain.
b) What is "Over FIT"? Explain about the disadvantages of an over fitted DECISION TREE. With an example explain the use of Tree Pruning.
7.a) Explain PAM algorithms.
b) Explain K-means algorithm.
8. Discuss various Hierarchical clustering methods (Agglomerative, divisive, Chameleon and BIRCH). [15]

PART-A (25 Marks)
1.a) Define Data warehousing.
b) Differentiate OLAP, ROLAP and HOLAP.
c) Discuss about subset selection.
d) Mention any three measures of Similarity.
e) Define Association rule mining two step process.
f) Write short note on support and confidence measures.
g) Mention types of classification techniques.
h) Define pre pruning and post pruning.
i) Discuss on Agglomerative and Divisive clustering techniques.
j) Mention the various types of clustering methods.

PART-B (50 Marks)
2. Explain data mining as a step process of knowledge discovery. Mention the Functionalities of Data mining. [10]
OR
3.
Differentiate Operational data systems and data warehousing. Explain the star schema and fact constellation schema. [10]
4. Explain the various Data preprocessing techniques. How data reduction helps in data preprocessing? [10]
OR
5. How can the data cube be efficiently constructed for discovery driven Exploration? Explain various operations of Data Cube. [10]
6. How can we mine multilevel Association rules efficiently using concept hierarchies? Explain. Illustrate with an Apriori algorithm for the given data below. [10]
[Transaction table illegible in source]
OR
7. Can we design a method that finds the complete set of frequent item sets without Candidate generation? If yes, explain with example table mentioned above. [10]
8. Describe the data classification process with a neat diagram. How does the Naive Bayesian classifier works? Explain. [10]
OR
9. What is prediction? Explain the various prediction techniques. Explain about Decision tree Induction classification technique. [10]
10. What are outliers? Discuss the methods adopted for outlier detection. [10]
OR
11. State K-means algorithm. Apply K-means algorithm with two iterations to form two clusters.
[Remainder of question and data table illegible in source]
---ooOoo---

PART-A (25 Marks)
1.a) Define data warehouse.
b) List the Data warehouse Characteristics.
c) How can you go about filling in the missing values for this attribute?
d) Why is the word data mining a misnomer?
e) Give a note on Closed Frequent Item Set.
f) Write the FP-growth algorithm.
g) How prediction is different from classification?
h) What is rule classification?
i) Give a note on k-means algorithm.
j) List the Key Issues in Hierarchical Clustering.

PART-B (50 Marks)
2.a) Make comparisons between the MOLAP and HOLAP.
b) Discuss the star and snowflake schema in detail with suitable example. [5+5]
OR
3.a) Write the difference between designing a data warehouse and an OLAP cube.
b) Give a brief note on ROLAP.
4.
Explain concept hierarchy generation for the nominal data. [10]
OR
5.a) Describe the Feature Subset Selection.
b) Illustrate the Data Transformation by Normalization. [5+5]
6. Make a comparison of Apriori and ECLAT algorithms for frequent item set mining in transactional databases. Apply these algorithms to the following data: [10]

LIST OF ITEMS
Bread, Milk, Sugar, TeaPowder, Cheese, Tomato
Chillies, Potato, Milk, Cake, Sugar, Bread
Bread, Jam, Milk, Butter, Chillies
Butter, Cheese, Paneer, Curd, Milk, Biscuits
Onion, Paneer, Chillies, Garlic, Milk
Bread, Jam, Cake, Biscuits, Tomato

OR
7. Briefly explain the Partition Algorithms.
8. Discuss K-Nearest neighbor classification: Algorithm and Characteristics. [10]
OR
9. How does the Naive Bayesian classification works? Explain in detail. [10]
10.a) Give a brief note on PAM Algorithm.
b) What is the drawback of k-means algorithm? How can we modify the algorithm to diminish that problem? [5+5]
OR
11. What are the different clustering methods? Explain in detail. [10]

PART-A (25 Marks)
1.a) Differentiate Data warehouse and DBMS.
b) Write two differences between OLAP and OLTP.
c) What is meant by KDD?
d) Dee bow data eons tr
e) Dilratae up a mubevel anciaion ule mining
f) Discuss about the concept hierarchy generation.
g) Explain about "Entropy" and Information gain.
h) Define two step classification process.
i) Mention the Key issues in hierarchical clustering.
j) Differentiate OPTICS and DBSCAN.

PART-B (50 Marks)
2. What is a Data Warehouse? Explain three types of schemas that are used for modeling data warehouse with examples. State the applications of Data mining. What is its need in Business? [10]
OR
3. Define Data Cube computation. Explain the various methods for Data Cube Computation. Discuss Construction of Multidimensional model and its operations. [10]
4. What is the need for Data preprocessing? Discuss briefly various forms of Data preprocessing. [10]
OR
5.a) How to handle missing values in datasets?
b) Discuss attribute subset selection for dimensionality reduction.
6. Construct the FP-Tree from the given Transactional Database. Explain the procedure in detail with minimum support = 3. [10]

TID   Items
100   F, A, C, D, G, I, M, P
200   A, B, C, F, L, M, O
[Remaining rows illegible in source]

OR
7. Explain market basket analysis and its relevance to association rule. Explain the Apriori algorithm using the following transactional data assuming that the support count is 22%. Illustrate with an example. [10]

TID   LIST OF ITEMS
001   Milk, dal, sugar, bread
002   Dal, sugar, wheat, jam
003   Milk, bread, curd, paneer
004   Wheat, paneer, dal, sugar
005   Milk, paneer, bread
006   Wheat, dal, paneer, bread

8. How Neural Networks can be used for Data classification? Which algorithm is suitable? Explain them with example. Discuss the role of Information Gain in classification. [10]
OR
9. What are the various methods of evaluating accuracy of a classifier or predictor? Explain Bagging and Boosting techniques. [10]
10. Explain the partitioning methods. Solve the following problem using Partition methods (K-means, K-medoids) for (2, 4, 10, 12, 8, 20, 30, 11, 23) whee tol
OR
11. How Density based clustering algorithms are different from partitioning based cluster Algorithms? Compare both. Explain DBSCAN algorithm with suitable example. [10]

Code No: 76D R13
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, October/November - 2020
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 2 Hours    Max. Marks: 75

Answer any Five Questions
All Questions Carry Equal Marks

1.a) What are the three major areas in the data warehouse? Is this a logical division? If so, why do you think so? Relate the architectural components to the three major areas.
b) Describe the composition of primary keys for the dimension and fact table. [7+8]
2.a) List out five reasons why you think data quality is critical in a Data Warehouse.
b) Describe the operations roll-up, drill-down, slice and dice, and pivot.
3.
Explain the following with examples:
a) Discretization and binarization
b) Dimensionality reduction.
4.a) What defines a Data Mining Task? Explain it with at least five basic primitives.
b) Briefly describe the four stages of knowledge discovery (KDD).
5. State and explain Apriori Algorithm with an illustration. [15]
6. [Question illegible in source]
7.a) Give a note on Classification techniques.
b) Discuss the algorithm for K-Nearest neighbor classification.
8.a) Briefly explain the Evaluation of Clustering Algorithms.
b) List and explain the Key Issues in Hierarchical Clustering.
---ooOoo---

Code No: 176 R13
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, September - 2021
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours    Max. Marks: 75

Answer any Five Questions
All Questions Carry Equal Marks

1.a) Define Data warehouse. Discuss Construction of Multidimensional model and its operations.
b) Explain the fact constellation schema with an example.
2. Explain the various stages of Data preprocessing. Explain Binning techniques and Data normalization. Let the values for the attribute age be (13, 15, 16, 16, 19, 20, 20, 21, 22, 22, 25, 25, 25, 25, 30, 33, 33, 35, 35, 35, 35, 36, 40, 45, 46, 52, 70). Smooth this data by bin means using a bin depth of 3. [15]
3.a) Explain the Data mining architecture.
b) Given two objects represented by the tuples (22, 1, 42, 10) and (20, 0, 36, 8):
i) Compute the Euclidean distance between the two objects.
ii) Compute the Manhattan distance between the two objects.
iii) Compute the Minkowski distance between the two objects using q = 3.
iv) Compute Jaccard distance between the two objects.
4. Explain FP-Growth algorithm and generate effective association rules with minimum support 3. [15]

Transaction   LIST OF ITEMS
001           MILK, DAL, SUGAR, BREAD
002           DAL, SUGAR, WHEAT, JAM
003           MILK, BREAD, CURD, PANEER
004           WHEAT, PANEER, DAL, SUGAR
005           MILK, PANEER, BREAD
006           WHEAT, DAL, PANEER, BREAD

5.a) Compare and contrast Eager classification with Lazy classification. What are the various methods of evaluating accuracy of a classifier or predictor? Explain Bagging and Boosting techniques.
b) Explain Naive Bayes classifier.
6. Perform KNN classification for the data given below for X = (P1 = 3, P2 = 7) where k = 3. [15]

P1   P2   CLASS
7    7    FALSE
7    4    FALSE
3    4    TRUE
1    4    TRUE

7. Define Outlier Analysis. Explain the various algorithms for identifying the outliers in a given cluster. Illustrate. [15]
8.a) Categorize major clustering methods.
b) Explain Hierarchical Clustering.

Each question carries 10 marks and may have a, b, c as sub questions.

PART-A (25 Marks)
1.a) List and define the characteristics of Data warehouse.
b) Give brief note on Fact Less Facts.
c) What do you mean by Data Cleaning?
d) What are the limitations of data mining?
e) Mention the importance of Association Rule Mining.
f) Define frequent sets, confidence and support.
g) What is classification?
h) Write the need for tree pruning in decision tree induction.
i) Differentiate between clustering and classification.
j) How are outliers detected using data mining?

PART-B (50 Marks)
2.a) Draw the Data warehouse Architecture and explain its Components.
b) Explain Star and Snow-Flake Schemas. [5+5]
OR
3.a) Give a note on OLAP Operations.
b) What are the differences between the MOLAP and ROLAP models? Also list their similarities.
4. Explain the following with examples:
a) Aggregation
b) Dimensionality reduction
c) Feature subset selection. [10]
OR
5.a) What steps you will follow to identify a fraud for a credit card company.
b) List and define the measures of Similarity and Dissimilarity.
6. A database has four transactions. Let min_sup = 60% and min_conf = 80%.

TID   date       items_bought
100   10/15/99   {K, A, B, D}
200   10/15/99   {D, A, C, E, B}
300   10/19/99   {C, A, B, E}
400   10/22/99   {B, A, D}

a) Find all frequent items using Apriori and FP-growth, respectively.
Compare the efficiency of the two mining processes.
b) List all of the strong association rules (with support s and confidence c) matching the following metarule, where X is a variable representing customers, and item_i denotes variables representing items (e.g., "A", "B", etc.):
∀X ∈ transaction, buys(X, item1) ∧ buys(X, item2) ⇒ buys(X, item3) [s, c] [10]
OR
7.a) What is more efficient method for Generalizing association rule? Explain.
b) Describe a data set for which sampling would actually increase the amount of work. In other words it would be faster to work on full data set. [5+5]
8. Construct a decision tree with root node Type from the data in the table below. The first row contains attribute names. Each row after the first represents the values for one data instance. The output attribute is Class. [10]

Scale   Type   Shade   Texture   Class
One     One    Light   Thin      A
Two     One    Light   Thin      A
Two     Two    Light   Thin      B
Two     Two    Dark    Thin      B
Two     One    Dark    Thin      C
One     One    Dark    Thin      C
One     Two    Light   Thin      C

OR
9.a) Explain in detail the Naive-Bayes Classifier.
b) List the characteristics of K-Nearest neighbor classification. [5+5]
10.a) Differentiate Agglomerative and divisive Hierarchical Clustering.

PART-A (25 Marks)
1.a) List out the operations of OLAP.
b) What is fact table? Write its uses.
c) Define discretization.
d) What is predictive mining? Explain it briefly.
e) Write the purpose of Apriori algorithm.
f) Define support and confidence measure.
g) What is boosting?
h) Define decision tree.
i) Write the strengths of hierarchical clustering.
j) Compare agglomerative and divisive

PART-B (50 Marks)
2.a) With a neat sketch, explain three tier architecture of data warehousing.
b) Explain various data warehouse models. [5+5]
OR
3. Write a note on
a) Relational OLAP.
b) Multi dimensional OLAP. [5+5]
4.a) Discuss in detail about the steps of knowledge discovery.
b) Write a note on subset selection in attributes for data reduction.
OR
5.a) Explain various data mining tasks.
b) Discuss briefly about data cleaning techniques.
6.a) Write FP-growth algorithm.
b) Explain how association rules are generated from frequent item sets.
OR
7.a) Explain the procedure to mining closed frequent data item sets.
b) Explain how you can improve the performance of Apriori algorithm. [5+5]
8.a) What is Bayesian belief network? Explain in detail.
b) Write note on attribute selection measures. [5+5]
OR
9.a) Write k-nearest neighbor classification algorithm and its characteristics.
b) Write decision tree induction algorithm.
10.a) What is outlier detection? Explain distance based outlier detection.
b) Write partitioning around medoids algorithm.
OR
11.a) Write K-means clustering algorithm.
b) Write the Key issue in hierarchical clustering algorithm.
---ooOoo---

Code No: 117CD R13
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, February/March - 2022
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours    Max. Marks: 75

Answer any five questions
All questions carry equal marks

1.a) How do you optimize the Backup process for a Data Warehouse?
b) List and explain briefly the techniques used to implement a multidimensional view in data warehouse.
2.a) Describe the Data Warehouse architecture.
b) Distinguish between OLTP and OLAP systems.
3.a) Illustrate the various data reduction techniques for data preprocessing.
b) List the steps of the Knowledge Discovery in Databases (KDD) and describe each of them.
4.a) Discuss briefly about Data Mining Issues.
b) Discuss about discretization and concept hierarchy generation for numerical data. [7+8]
5.a) How are association rules generated from frequent itemsets? Illustrate.
b) Explain the procedure to mining closed frequent data item sets.
6.a) Apply FP-growth algorithm on the following database to find all of the strong association rules with min_sup = 60% and min_conf = 50%.
TID  | Items
T100 | [illegible]
T200 | [illegible]
T300 | [illegible]
T400 | I1, [illegible], I6, I8, [illegible]
T500 | I2, [illegible], I8, I9, I10

b) Can we design a method that mines the complete set of frequent item sets without candidate generation? Explain.
7.a) Discuss the measures for selecting the best split for the various types of attributes.
b) Explain the KNN algorithm for data classification with an example.
8.a) How to cluster the data sets using the k-medoid clustering algorithm?
b) Discuss the classification of outlier detection techniques.

Code No: 117CD R13
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, March - 2017
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 75
Note: This question paper contains two parts, A and B. Part A is compulsory and carries 25 marks. Answer all questions in Part A. Part B consists of 5 Units. Answer any one full question from each unit. Each question carries 10 marks and may have a, b, c as sub questions.

Part A (25 Marks)
1.a) What is a data mart? [2]
b) What is a fact table?
c) What is data mining?
d) List similarity measures.
e) What is a maximal frequent itemset?
f) How to compute the confidence measure for an association rule?
g) What is classification?
h) Define information gain.
i) What is an outlier?
j) List the demerits of the k-means algorithm.

Part B (50 Marks)
2. What are the various components of a data warehouse? Explain their functionality in detail. [10]
OR
3. What is the significance of OLAP in a data warehouse? Describe OLAP operations with necessary diagram/example. [10]
4. Explain different data mining tasks for knowledge discovery. [10]
OR
5. What is the need for dimensionality reduction? Explain any two techniques for dimensionality reduction. [10]
6. A database has six transactions. Let min-sup = 50% and min-conf = 75%.
TID | List of items
001 | Pencil, sharpener, eraser, color papers
002 | Color papers, charts, glue sticks
003 | Pencil, glue stick, eraser, pen
004 | Oil pastels, poster colors, correction tape
005 | Whitener, pen, pencil, charts, glue stick
006 | Colour pencils, crayons, eraser, pen

Find all frequent item sets using the Apriori algorithm. List all the strong association rules. [10]
OR
7.a) What are the advantages of the FP-Growth algorithm?
b) Discuss the applications of association analysis. [5+5]
8. Explain the decision tree induction algorithm for classifying data tuples and discuss a suitable example. [10]
OR
9.a) What are the characteristics of the K-nearest neighbor algorithm?
b) How to evaluate the classifier accuracy? [5+5]
10. What is the goal of clustering? How does the partitioning around medoids algorithm achieve this goal? [10]
OR
11.a) Differentiate between the AGNES and DIANA algorithms.
b) How to assess the cluster quality? [5+5]

Code No: 57048 R09
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, April/May - 2018
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks

1.a) What is data mining? Explain the architecture of a typical data mining system.
b) Is every pattern interesting? Can a mining system find only interesting patterns? Justify your answer.
c) Explain in detail Online Analytical Processing operations with illustrations. [15]
2.a) What is the need for preprocessing? Briefly explain various forms of preprocessing.
b) Discuss the techniques to detect noise and its removal.
3.a) How to find frequent item sets without candidate generation? Explain with an example.
b) Quote examples for multidimensional association rules and quantitative association rules.
4.a) What is tree overfitting? How to handle tree overfitting in decision trees?
b) Why is the information gain measure necessary for decision tree induction? Explain with illustrations.
5.a) Discuss the categorization of major clustering methods.
b) Explain the CHAMELEON algorithm for clustering.
6.a) What is a data stream? Explain the data mining functionalities applicable to data streams.
b) What is graph mining? [illegible] the applications of graph mining.
7.a) Differentiate between spatial and non-spatial data in spatial databases. How is it handled?
b) [illegible]
8.a) How to use data mining for detecting intruders for online systems? Explain the process.
b) Is data mining a threat to data privacy and data security? Justify your answer with a discussion of suitable scenarios.

Code No: 57048 R09
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, December - 2014
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks

1.a) What is data mining? Explain the architecture of a typical data mining system.
b) Is every pattern interesting? Can a mining system find only interesting patterns? Justify your answer.
2.a) Explain in detail Online Analytical Processing operations with illustrations.
b) Describe star, snowflake and fact constellation schemas.
3. What is the need for preprocessing? Briefly explain various forms of preprocessing.
4. How to find frequent item sets without candidate generation? Explain with an example.
5. [illegible]
6.a) Discuss the categorization of major clustering methods.
b) Explain the CHAMELEON algorithm.
7. What is a data stream? Explain the data mining functionalities applicable to data streams.
8.a) Discuss data mining for intrusion detection.
b) Write about data mining for financial data analysis.

Code No: 09470803 R09
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, June/July - 2014
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max.
Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks

1.a) Describe data mining task primitives.
b) Explain any three numerosity reduction techniques.
2.a) What is a data warehouse? How does it differ from an operational database system?
b) Explain the attribute-oriented algorithm with an example.
3.a) Explain the Apriori algorithm and compare it with the vertical data format approach.
b) How to mine quantitative multidimensional association rules from a relational database? Give an illustration.
4.a) Describe the steps in the classification process.
b) Explain the rough set approach for classification.
c) Discuss the measures for classification accuracy.
5.a) Differentiate between AGNES and DIANA.
b) Discuss the merits and demerits of partition-based clustering approaches.
c) Explain the OPTICS algorithm.
6.a) What is sequential pattern mining? Explain any one algorithm.
b) List the characteristics exhibited by a social network. Describe the forest fire model.
7. What kind of data is stored in a spatial database? How is it stored? What data mining functionalities are applicable to a spatial database? Explain.
8. Write technical notes on the following:
a) Ubiquitous data mining
b) Data visualization
c) Audio data mining

Data Warehousing and Data Mining
1.a) Discuss task-relevant data as one of the data mining primitives.
b) What is the need for data preprocessing? Discuss various forms of preprocessing.
2.a) Explain the three-tier architecture of a data warehouse.
b) Discuss the bottom-up approach used for data cube implementation.
4.a) How to evaluate the accuracy of a classifier?
b) Discuss the back propagation algorithm used in neural networks. [7+8]
5. [illegible] [15]
6.a) Describe multi-relational mining.
b) Discuss sequence pattern mining of biological data.
7. Explain the features of the world wide web and the taxonomy of web mining. [15]
8.a) Discuss mining audio and video data.
b) Write a note on the social impacts of data mining.

Code No: 09470503 R09
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.
Tech IV Year I Semester Examinations, November - 2013
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks

1.a) Present an example where data mining is crucial to the success of a business. What data mining functions does this business need? Can they be performed alternatively by data query processing or simple statistical analysis?
b) Data quality can be assessed in terms of accuracy, completeness, and consistency. Propose two other dimensions of data quality. [8+7]
2.a) What are the characteristics of OLAP and the basic data warehouse environment as they relate to information delivery needs?
b) Discuss why relevance analysis is beneficial and how it can be performed and integrated into the characterization process. Compare the results of two induction methods: (1) with relevance analysis and (2) without relevance analysis. [8+7]
3. How can the efficiency of Apriori-based mining be improved? Describe briefly any five variations of the Apriori algorithm. [15]
4.a) Briefly outline the major steps of decision tree classification.
b) What is associative classification? Why is associative classification able to achieve higher classification accuracy than a classical decision tree method? Explain how associative classification can be used for text document classification. [6+9]
5.a) Briefly outline how to compute the dissimilarity between objects described by asymmetric binary variables.
b) Give an example of how specific clustering methods may be integrated, for example, where one clustering algorithm is used as a preprocessing step for another. In addition, provide reasoning on why the integration of two methods may sometimes lead to improved clustering quality and efficiency.
6. The concept of microclustering has been popular for on-line maintenance of clustering information for data streams. By exploring the power of microclustering, design an effective density-based clustering method for clustering evolving data streams. [15]
7.
Outline an implementation technique that applies a similarity-based search method to enhance the quality of clustering in multimedia data. [15]
8. Why is the establishment of theoretical foundations important for data mining? Name and describe the main theoretical foundations that have been proposed for data mining. Comment on how they each satisfy (or fail to satisfy) the requirements of an ideal theoretical framework for data mining. [15]

Code No: 57048 R09
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, November - 2015
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks

1.a) Discuss the different issues of data mining.
b) Explain in detail data mining task primitives.
2.a) Discuss data integration.
b) Discuss data transformation.
3. Write the algorithm for attribute-oriented induction. Explain it with an example. [15]
4. Explain mining frequent patterns using FP... [15]
5. Explain Classification by Back... [15]
6. Explain the following:
a) BIRCH.
b) [illegible] [8+7]
7. Explain Mining Sequence P... tional Databases. [15]
8.a) How is web usage mining different from web structure mining and web content mining?
b) Explain how data mining is used for intrusion detection.

Code No: 57048 R09
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, November/December - 2018
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 75
Answer any Five Questions
All Questions Carry Equal Marks

1.a) What are the functionalities of data mining? Explain.
b) Discuss the different kinds of data reduction methods.
2.a) Discuss the multidimensional data model with relevant diagrams.
b) Briefly describe various data cube computation methods.
3.a) Explain mining using the vertical data format and highlight its advantages and disadvantages.
b) Write about constraint-based association rule mining.
4.a) Discuss issues in classification and explain various attribute selection measures.
b) What is regression? Discuss various regression techniques. [8+7]
5.a) Explain the K-means clustering algorithm.
b) What is an outlier? Discuss outlier detection methods.
6.a) Explain the Hoeffding Tree algorithm.
b) Explain the sequential pattern mining technique in biological data.
7.a) What is text mining? Briefly explain text mining methods.
b) Briefly explain the features of audio and video mining. [8+7]
8.a) What are the applications of data mining? Explain.
b) Write short notes on additional themes on data mining.

Code No: M0822
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, December - 2014
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 80
Answer any Five Questions
All Questions Carry Equal Marks

1.a) What is data mining? Describe the steps involved in the knowledge discovery process.
b) Discuss various methods for data integration and transformation.
2.a) Explain the classification of data mining systems.
b) Explain concept hierarchy generation for categorical data.
3.a) How is OLAP different from OLTP?
b) Explain any three important OLAP operations.
4. What is DMQL? Explain in detail with examples.
5. Explain attribute-oriented induction in detail with an example.
6. Explain the relevance of market basket analysis to association rule mining. Find all frequent item sets for the following data using the Apriori algorithm with support count [illegible].
[Transaction table: illegible]
7. What is meant by classification and prediction?
Discuss classification by the decision tree induction method with an example. State why tree pruning is useful in decision tree induction.
8.a) Discuss various density-based clustering methods.
b) List and explain various measures to assess the quality of text retrieved. Explain various text retrieval methods.

Code No: [illegible]
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, March - 2017
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 80
Answer any Five Questions
All Questions Carry Equal Marks

1.a) Explain the process of knowledge discovery in databases with the necessary diagram.
b) Discuss principal component analysis as a preprocessing activity.
c) How to handle missing values?
2.a) Define data warehouse. Compare and contrast a data warehouse with an operational database system.
b) Write and explain the BUC algorithm for data cube computation. [8+8]
3.a) Explain designing a graphical user interface based on a data mining query language.
b) Discuss the DMQL syntax for task-relevant data and visualization. [8+8]
4.a) Differentiate between characterization and analytical characterization. Quote suitable examples.
b) How to use a box plot summary for measuring the data dispersion?
c) What is a scatter plot? What is meant by a loess curve? [16]
5.a) What is an FP-tree? What is its significance in association rule mining? Explain with an example data set.
b) Discuss the association rule clustering system for mining quantitative association rules from a relational database. [8+8]
6.a) How is Bayes' theorem used for classification? Explain in detail.
b) What are the measures for a classifier's accuracy?
c) Discuss the classification process. Compare it with prediction.
7.a) Discuss the similarity measures used in clustering.
b) Explain the merits and demerits of partitioning methods. [8+8]
8.a) Explain the utilization of latent semantic indexing in text mining.
b) Give the algorithm for web structure mining.
c) What will be the content of a spatial data cube?

Code No: M0522
JAWAHARLAL NEHRU TECHNOLOGICAL UNIVERSITY HYDERABAD
B.Tech IV Year I Semester Examinations, May/June - 2015
DATA WAREHOUSING AND DATA MINING
(Computer Science and Engineering)
Time: 3 Hours Max. Marks: 80
Answer any Five Questions
All Questions Carry Equal Marks

1.a) Is data mining needed even though an efficient data management system exists? Justify your answer.
b) Compare data warehousing and data mining. Write about the data preprocessing forms: data reduction, discretization and hierarchy generation. [8+8]
2.a) What are the various methods to handle redundant data during data integration?
b) Explain how a numeric concept hierarchy can be generated. [8+8]
3.a) What is a data warehouse? Explain three types of schemas that are used for modeling a data warehouse, with examples.
b) Discuss OLAP server architectures: ROLAP vs. MOLAP vs. HOLAP. [8+8]
4.a) Explain the various primitives for specifying a data mining task.
b) Describe in detail the interestingness of patterns. [8+8]
5.a) Explain analytical characterization in detail.
b) [illegible]
6. Explain the Apriori algorithm for frequent item sets. Also suggest how we can improve the efficiency of the Apriori algorithm. [16]
7.a) Explain k-nearest neighbour classifiers.
b) Discuss Bayesian classification. [8+8]
8.a) What are grid-based clustering techniques? Explain.
b) Explain sequential pattern mining in transactional databases. [8+8]
