DSP - Syllabus - Subj - 3 - NLP and Big Data

Uploaded by

Lalitha Abhigna

CERTIFICATE COURSE WITH DATA SCIENCE WITH

3. Natural Language Processing and Big Data


UNIT-I
Introduction to NLP.
What is NLP, Various levels of NLP (Morphological, Lexical Analysis, Syntactic Analysis,
Semantic Analysis, Discourse Level, Pragmatic), Applications of NLP
Introduction to Text Processing: Working with Text Files, HTML files, XML files, JSON
files and PDF files, Working with Regular Expressions
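As a small illustration of the Unit-I text-processing topics, the sketch below parses a JSON string and pulls out patterns with regular expressions. It uses only the Python standard library; the sample record, its field names, and the two patterns are assumptions made up for this example and are deliberately loose, not production-grade validators.

```python
import json
import re

# Hypothetical sample record; the field names are invented for illustration.
raw = '{"id": 1, "text": "Contact us at info@example.com or visit on 2024-01-15."}'

# Parse the JSON string into a Python dict and take the free-text field.
record = json.loads(raw)
text = record["text"]

# Loose patterns for an e-mail address and an ISO-style date.
email_pattern = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# findall returns every non-overlapping match as a list of strings.
emails = email_pattern.findall(text)
dates = date_pattern.findall(text)

print(emails)
print(dates)
```

The same approach carries over to text read from plain-text, HTML, or XML files once the content has been loaded into a string.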
UNIT-II
Text Processing using NLTK, TextBlob, spaCy
Text Processing: Tokenization, Stemming, Lemmatization, Removal of Stop Words, POS
tagging and Named Entity Recognition, Text Preprocessing, Phrase Matching
Text Feature Extraction using scikit-learn: Vector Space Model representation, Term
Frequency, Document Frequency, TF-IDF, Count Vectorizer, TF-IDF
Transformer, TF-IDF Vectorizer, Text Similarity
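The feature-extraction topics above (term frequency, document frequency, TF-IDF, text similarity) can be sketched in plain Python. This is a minimal textbook variant, assuming whitespace tokenization and the unsmoothed weight tf * log(N / df); scikit-learn's vectorizers apply their own smoothing and normalization, so their numbers will differ. The three sample documents are invented for illustration.

```python
import math
from collections import Counter

# Toy corpus, invented for this example.
docs = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "cats and dogs are pets",
]

# Whitespace tokenization (an assumption; real pipelines use a proper tokenizer).
tokenized = [d.split() for d in docs]
n_docs = len(tokenized)

# Document frequency: number of documents that contain each term.
df = Counter(term for tokens in tokenized for term in set(tokens))

def tfidf(tokens):
    """Map each term to (relative term frequency) * log(N / document frequency)."""
    tf = Counter(tokens)
    return {t: (c / len(tokens)) * math.log(n_docs / df[t]) for t, c in tf.items()}

def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = [tfidf(tokens) for tokens in tokenized]
sim_01 = cosine(vectors[0], vectors[1])  # documents sharing several terms
sim_02 = cosine(vectors[0], vectors[2])  # documents sharing no terms

print(sim_01, sim_02)
```

Because the first and third documents have no term in common, their similarity is zero, which also shows why stemming or lemmatization ("cat" vs "cats") matters before vectorizing.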
UNIT-III
Application Development with Text using ML
Text Classification, Text Clustering and Text Summarization
Case Studies and Application development
Topic Modelling using NLP: Introduction to Topic Modelling, Latent Dirichlet Allocation
with Python - Part Two, Case studies and Applications
Sentiment Analysis: Introduction to Sentiment Analysis
Creating NLP Pipeline for Text Mining (Social Media data/Web data), Word2Vec and
Doc2Vec, Transformers, Recommendation Systems - Collaborative filtering, Overview of
Language Modelling
UNIT-IV
Introduction to Big Data, Evolution of Big Data, Types of Digital Data, Characteristics &
Challenges of Data, Overview of Predictive Analytics, NoSQL databases

UNIT-V
Key Technologies and Drivers for Big Data
Knowledge Discovery Tools, Stream Analytics, In-memory Data Fabric, Distributed Storage
and Computing, Data Integration and Visualization, Data Pre-processing

UNIT-VI
Hadoop Eco System
Hadoop for Big Data, Overview of Apache Hadoop software, Installation of Hadoop,
Architecture of Hadoop, Understanding Hadoop eco-system: HDFS, MapReduce, Working
with Hadoop eco-system components: Hive, Pig, Data Ingestion with Flume & Sqoop, HBase
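The MapReduce model named above can be sketched as a word count in plain Python. This is only a single-process simulation of the map, shuffle, and reduce phases, not Hadoop code; the function names and the two input lines are assumptions chosen for illustration.

```python
from collections import defaultdict

def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)

# Toy input, invented for this example; on a cluster each line could
# be processed by a different mapper.
lines = ["big data big ideas", "data beats ideas"]

mapped = [pair for line in lines for pair in map_phase(line)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())

print(counts)
```

In a real Hadoop job the framework performs the shuffle across machines and reads the input from HDFS; only the map and reduce logic is supplied by the programmer.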

UNIT-VII
Big Data & In-memory computing
Understanding In-memory computing, Resilient Distributed Datasets (RDDs), Introduction
to Big Data Analytics with Spark, Understanding Spark eco-system components, Overview of
client mode & cluster mode computing, Working with basic Spark scripts, Data Analytics
using Spark eco-system
Case Studies & Applications of ML in Spark

UNIT-VIII
Real-time Streaming platforms for Big Data

Overview of Apache Kafka & Storm
