Hadoop Developer


JOHN

Email: hpjohn111@gmail.com
Mobile: 703-463-9491
PROFESSIONAL SUMMARY
6+ years of IT software experience in the field of Big Data and related Hadoop framework technologies.
Expertise in processing large sets of structured, semi-structured, and unstructured data and in supporting systems application architecture in the Hadoop framework.
Configured and worked with a 5-node Hadoop cluster. Installed and configured the Hadoop ecosystem, with hands-on experience in Hadoop components such as MapReduce, HDFS, HBase, Oozie, ZooKeeper, Hive, Sqoop, Pig, and Flume.
Excellent knowledge of HDFS, JobTracker, TaskTracker, NameNode, DataNode, and MapReduce programming.
Excellent understanding of Hadoop architecture and the underlying framework, including storage management.
Proficient in working with MapReduce programs using Apache Hadoop for working with Big Data.
Extensive experience in configuring, supporting, and monitoring Hadoop clusters using the Apache and Cloudera distributions.
Experience in setting up Hadoop clusters, performance benchmarking, and monitoring of Hadoop clusters.
Expertise in writing Hive queries, Pig scripts, and MapReduce programs, and in loading large volumes of data from the local file system and HDFS into Hive.
Handled importing of data from various data sources, performed transformations using Hive and MapReduce, loaded data into HDFS, and extracted data from relational databases such as Oracle, MySQL, and Teradata into HDFS and Hive using Sqoop.
Utilized Sqoop to export data insights from HDFS.
Expertise in database design using stored procedures, functions, and triggers, and strong experience in writing complex queries for SQL Server.
Expertise in HDFS data storage and in supporting the running of MapReduce jobs.
Expertise in providing business intelligence solutions using Teradata in data warehousing systems.
Experience with hands-on data analysis and performing under pressure.
Excellent knowledge of moving data from the S3 simple storage system to HDFS, taking advantage of the cluster's scalability to increase or decrease the number of systems according to storage requirements.
Quick learner with the ability to work in a fast-paced environment.
Well-versed in all stages of the Software Development Life Cycle (SDLC), with sound knowledge of project implementation methodologies including the Waterfall, Spiral, Agile, and Scrum models.

TECHNICAL SKILLS

Big Data Ecosystem: Hadoop, MapReduce, HDFS, HBase, Hive, Oozie, Pig, Sqoop, YARN, ZooKeeper, Flume, Scala
Hadoop Distribution: Cloudera, Hortonworks, MapR
Databases: Oracle DB, SQL Server Management Studio 2014, MySQL, PL/SQL, Toad, NoSQL
Languages: Java, C, HTML, CSS, XML, Pig Latin
Methodologies: SDLC / Waterfall, Spiral, Agile, and Scrum
Spreadsheet: Microsoft Excel
Scripting: Unix Shell Scripting (ksh, csh, bash, sh)

PROFESSIONAL EXPERIENCE
Capgemini, New York, NY
March 2016 to Current
Role: Hadoop Developer
Project Description: The purpose of the project is to store terabytes of information from different departments. The solution is based on the open-source big data software Hadoop: data is stored in the Hadoop file system and processed using MapReduce jobs. The project mainly involves re-platforming the current system, which runs on a MySQL database, onto Hadoop, a cloud solution technology that can process large data sets.
Responsibilities:

Actively participated in all phases of the Software Development Life Cycle (SDLC), from implementation to deployment.
Responsible for building scalable distributed data solutions using
Hadoop.
Responsible for Cluster maintenance, adding and removing cluster
nodes, Cluster Monitoring and Troubleshooting, Managing and
reviewing data backups & log files.
Responsible for managing the test data coming from different sources.
Analyzed data using Hadoop components Hive and Pig.
Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
Involved in importing and exporting the data from RDBMS to HDFS
and vice versa using Sqoop.
Involved in loading data from UNIX file system to HDFS.
Responsible for creating Hive tables, loading data, and writing Hive queries.
Created Hive external tables, loaded data into them, and queried the data using HQL (a minimal sketch follows this responsibilities list).
Handled importing data from various data sources, performed transformations using Hive and MapReduce, and loaded data into HDFS.
Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Extracted data from Teradata into HDFS using Sqoop, and exported the analyzed patterns back to Teradata using Sqoop.
Monitored system metrics and logs for problems when adding, removing, or updating Hadoop cluster nodes.
Involved in scheduling the Oozie workflow engine to run multiple Hive and Pig jobs, and used Oozie operational services for batch processing and for scheduling workflows dynamically (see the submission sketch below).
Environment: Hadoop, Spark 1.5.2, Scala, MapReduce, HDFS, Hive, Java, SQL, Cloudera Manager, Pig, Sqoop, Oozie, ZooKeeper, Storm, PL/SQL, MySQL, NoSQL, Elasticsearch, HBase, UNIX, SDLC
NetCracker, Cincinnati, OH
Oct 2015 to Feb 2016
Role: Hadoop Developer
Project Description: Kaybus engages in building solutions using big data, analytics, mobility, social networking, and machine learning. The project focuses on extracting data from Oracle databases, processing the data in HBase, and displaying the results on the dashboard. The role involved managing existing data extraction jobs while also playing a vital role in building new data pipelines from various structured and unstructured sources into Hadoop.
Responsibilities:

Worked on analyzing the Hadoop cluster and different big data analytic tools, including Pig, HBase, SQL databases, and Sqoop.
Responsible for Cluster maintenance, commissioning and
decommissioning Data nodes, Cluster Monitoring, Troubleshooting,
Managing and reviewing data backups & Hadoop log files.
Used Sqoop to import data from MySQL and Oracle into HDFS on a regular basis (a scheduled-import sketch follows this responsibilities list).
Experienced in managing and reviewing Hadoop log files.
Executed queries using Hive and developed Map-Reduce jobs to
analyze data.
Supported MapReduce programs running on the cluster; ran reports with Pig and Hive queries and analyzed data with Hive and Pig.
Involved in creating Hive tables, loading them with data from the local file system and HDFS, and writing Hive queries that run internally as MapReduce jobs.
Wrote Hive queries for data analysis to meet the business requirements. Loaded and transformed large sets of structured, semi-structured, and unstructured data.
Responsible to manage data coming from different sources.
Responsible for running workflow jobs with actions that execute Hadoop jobs such as MapReduce, Pig, Hive, and Sqoop, as well as sub-workflows, using Oozie.
Documented day-to-day tasks.
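A minimal sketch of how such a recurring Sqoop import might be wired up, using a saved Sqoop job driven from cron; the connection string, table, check column, credentials file, and schedule are hypothetical and not taken from the original project.

#!/usr/bin/env bash
# Sketch only: define a saved Sqoop job that incrementally appends new rows
# from Oracle into HDFS, then run it on a schedule. Names are illustrative.
set -euo pipefail

# One-time setup: a saved job remembers --last-value between runs.
sqoop job --create daily_customers -- import \
  --connect jdbc:oracle:thin:@//orahost:1521/ORCL \
  --username etl_user --password-file /user/etl/.ora_pass \
  --table CUSTOMERS \
  --target-dir /data/raw/customers \
  --incremental append \
  --check-column CUSTOMER_ID \
  --last-value 0 \
  --num-mappers 4

# Each scheduled run pulls only rows added since the previous run, e.g. from cron:
#   0 2 * * * sqoop job --exec daily_customers
sqoop job --exec daily_customers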

Environment: Apache Hadoop, HDFS, Hive, MapReduce, Java, Flume, Cloudera, Oozie, MySQL, Oracle, UNIX, Core Java.
Hewlett Packard Enterprise Pvt. Ltd., Bangalore, India
Jan 2014 to July 2015
Role: Hadoop Developer
Project Description: Hewlett Packard Enterprise specializes in developing and manufacturing computing, data storage, and networking hardware, designing software, and delivering services. The project was to create a Reference Architecture (RA) for a big data solution deploying the Cloudera Data Platform on the Hewlett Packard Enterprise Apollo 4500 architecture, gather information, run strategies on the collected data, and load the results into Hive tables. We collected data from all the customers, and our team performed the analysis using various traits and loaded that data into an RDBMS and into Hive tables using Sqoop.
Responsibilities:
Involved in the design and development phases of the Software Development Life Cycle (SDLC) using the Scrum methodology.
Experienced in installing, configuring, and administering Hadoop clusters of major Hadoop distributions.
Supported the team in completing the architecture.

Responsible for Hadoop development and implementation, including loading from disparate data sets and pre-processing using Hive and Pig, with a strong understanding of MapReduce.
Created Hive tables to import large data sets from various relational databases using Sqoop, and exported the analyzed data back for visualization and report generation by the BI team.
Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
Involved in ingesting data received from various relational database providers, including MySQL, onto HDFS for analysis and other big data operations.
Used Sqoop extensively to import data from RDBMS sources into
HDFS and Hive
Solved performance issues in Hive and Pig scripts with an understanding of joins, grouping, and aggregation and how they translate to MapReduce.
Very good understanding of partition and combiner concepts in Hive; designed external tables in Hive to optimize performance.
Developed Pig scripts for source data validation and transformation and used Pig extensively for data cleansing.
Developed Pig Latin scripts to extract data from the web server output files and load it into HDFS (a minimal sketch follows this responsibilities list).
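A rough sketch of what such a log-cleansing Pig run could look like from the shell; the log layout, field names, and paths are assumptions for illustration, not the project's actual schema.

#!/usr/bin/env bash
# Sketch only: write a small Pig Latin script that filters web-server log
# records and aggregates hits per URL, then run it. Paths and fields are illustrative.
set -euo pipefail

cat > clean_logs.pig <<'PIG'
-- Load tab-delimited web server output files from HDFS.
raw = LOAD '/data/raw/access_logs' USING PigStorage('\t')
      AS (host:chararray, ts:chararray, url:chararray, status:int, bytes:long);

-- Drop malformed or server-error records.
valid = FILTER raw BY status IS NOT NULL AND status < 500;

-- Count hits per URL and store the cleansed result back to HDFS.
by_url = GROUP valid BY url;
hits   = FOREACH by_url GENERATE group AS url, COUNT(valid) AS hit_count;
STORE hits INTO '/data/clean/url_hits' USING PigStorage('\t');
PIG

pig -x mapreduce -f clean_logs.pig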

Environment: Hadoop 2.x, Sqoop, Hive 0.13.1, HBase, Pig Scripts, Oozie,
DB2 10.x, Linux Scripts, UNIX, SDLC, Scrum.
Axis Bank, Hyderabad, India
Jan 2011 to Dec 2013
Role: Hadoop Developer
Project Description: Axis is a banking and financial services company
that provides customers with access to products, insights and experiences
that enrich lives and build business success. The CORE mortgage
origination system will drive needed improvements in our end-to-end
approach to real estate-secured lending, improve the overall customer
experience, and achieve our vision to satisfy all of our customers' financial
needs.
Responsibilities:
Designed and developed the application using the Agile methodology.
Worked on implementation and maintenance of Cloudera Hadoop
cluster.
Loaded and transformed large sets of structured, semi-structured, and unstructured data using Hadoop/Big Data concepts.
Analyzed data using Hadoop components Hive and Pig.
Assisted in the upgrade, configuration, and maintenance of various Hadoop infrastructure components such as Pig, Hive, and HBase.
Extensively used Pig for data cleaning and optimization.
Wrote SQL queries, stored procedures, and modifications to the existing database structure as required for the addition of new features, using a MySQL database.
Extracted data from Teradata into HDFS using Sqoop.
Responsible for designing and managing the Sqoop jobs that
uploaded the data from Oracle to HDFS and Hive.
Developed Hive queries to analyze data and generate results.
Creating Hive tables to import large data sets from various relational
databases using Sqoop and export the analyzed data back for
visualization and report generation by the BI team.
Assisted the Oracle DB development team in developing stored procedures and designing the database; used Toad for running SQL queries and developing stored procedures.
Monitored workload, job performance, and capacity planning using Cloudera Manager.
Used Hadoop FS scripts for HDFS (Hadoop Distributed File System) data loading and manipulation (a minimal sketch follows this responsibilities list).
Created and maintained technical documentation for launching Hadoop clusters and for executing Hive queries and Pig scripts.
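As a hedged illustration of the kind of HDFS loading script referenced above, a minimal sketch follows; the directory names, the dataset, and the partition layout are hypothetical placeholders.

#!/usr/bin/env bash
# Sketch only: push the day's extract files into a dated HDFS directory and
# verify the load. All paths and the dataset name are illustrative.
set -euo pipefail

RUN_DATE=$(date +%F)
SRC_DIR=/staging/mortgage/${RUN_DATE}
DEST_DIR=/data/raw/mortgage/dt=${RUN_DATE}

# Create the target partition directory and copy the files in.
hdfs dfs -mkdir -p "${DEST_DIR}"
hdfs dfs -put -f "${SRC_DIR}"/*.csv "${DEST_DIR}/"

# Tighten permissions and confirm what landed.
hdfs dfs -chmod -R 750 "${DEST_DIR}"
hdfs dfs -ls "${DEST_DIR}"
hdfs dfs -du -s -h "${DEST_DIR}"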

Environment: MySQL, Redhat Linux, HDFS, Hive, Pig, SQL Server, Sqoop,
Oracle, Linux, Agile.
Tavant Technologies, Bangalore, India
Sep 2010 to Dec 2010
Role: Database Developer
Project Description: This information system helps the bank collect money from defaulters. It receives defaulter data in text-file format from the Vision Plus system and loads it into the branch system based on the area code. All the text files are placed into a particular directory at the end of every day.
Responsibilities:
Developed PL/SQL procedures, customizing existing programs according to the needs of the client.
Implemented PL/SQL packages consisting of procedures and
functions.
Created database objects such as tables, views, procedures, and packages using Oracle PL/SQL and SQL Server.
Involved in data transfer and in creating tables from various source tables, coding PL/SQL packages, stored procedures, and triggers (a daily-load sketch follows this responsibilities list).
Performed review of the business process, involved in Requirements
Analysis, System Design Documents, Flows, Test Plan preparations
and Development of Business Process / Work Flow charts.
Involved in the development and testing of Oracle back-end objects such as database triggers, stored procedures, sequences, and synonyms.
Involved in design, development and Modification of PL/SQL stored
procedures, functions, packages and triggers to implement business
rules into the application.

Involved in tuning SQL queries, creating database objects, and optimally setting storage parameters for tables and indexes.
Designed the reports using Reports as per user requirements.
Created References and Master/Detail tables to store information.
Tested forms and reports using test data
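A minimal, assumption-laden sketch of the daily text-file load described above: the control file, connection details, table, and package/procedure names are all hypothetical placeholders, not the project's actual objects.

#!/usr/bin/env bash
# Sketch only: load the day's defaulter text files into Oracle with SQL*Loader,
# then call a PL/SQL procedure to distribute rows by area code. Names are illustrative.
# BRANCH_PW is expected in the environment (placeholder credential).
set -euo pipefail

LOAD_DIR=/data/defaulters/incoming

# Load each text file dropped into the directory at end of day.
for f in "${LOAD_DIR}"/*.txt; do
  sqlldr userid="branch_app/${BRANCH_PW}@branchdb" \
         control=defaulters.ctl data="${f}" log="${f%.txt}.log"
done

# Invoke a (hypothetical) packaged procedure that refreshes branch-level data.
sqlplus -s "branch_app/${BRANCH_PW}@branchdb" <<'SQL'
BEGIN
  pkg_defaulters.load_branches_by_area_code;
END;
/
EXIT
SQL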

Environment: Oracle, SQL Server, PL/SQL
