Multicore Architectures
Subject Code:
L T P J C: 2 0 0 4 3
Preamble: This course provides knowledge of multicore architectures and lays the foundation for developing High Performance Applications on the OpenMP and CUDA parallel programming platforms. It also enables students to analyse the performance of HPC applications using various profiling tools.
Objectives of the course
To provide knowledge of the basics of multicore architectures
To understand the concepts of parallel computers and their programming models
To design and develop parallel programs
To practice parallel programming on the OpenMP and CUDA parallel programming platforms
To apply program optimizations to parallel programs
To analyse performance using profiling tools
To explore various contemporary tools and recent trends in the field of multicore architectures
Expected Outcome: After successfully completing the course, the student should be able to:
1) Describe various parallel programming models
2) Design and develop High Performance Applications using
contemporary tools
3) Improve performance of applications through program
optimizations
4) Analyse performance of parallel applications
Student Learning Outcomes
2. Having a clear understanding of the subject related concepts and of contemporary issues
11. Having interest in lifelong learning
14. Having an ability to design and conduct experiments, as well as to analyze and interpret data
Module 1: Introduction to Multi-Core Architectures (2 hrs, SLO 2)
Evolution of multicores through Moore's Law, Comparisons of single core, multi-core, multi-processing and hyper threading

Module 2: Parallel Computers and Programming (5 hrs, SLO 2)
Threading Concepts, Communication Architectures and Communication Costs, Thread Level Parallelism (TLP), Instruction Level Parallelism (ILP), Comparisons, Cache Hierarchy and Memory-level Parallelism, Cache Coherence, Parallel programming models, Shared Memory and Message Passing, Vectorization

Module 3: OpenMP Programming (Open Multi-Processing) (5 hrs, SLO 2)
Introduction to OpenMP, Parallel constructs, Runtime Library routines, Work-sharing constructs, Scheduling clauses, Data environment clauses, atomic, master, Nowait Clause, Barrier Construct

Module 4: CUDA Programming (Compute Unified Device Architecture) (6 hrs, SLO 2)
Introduction to GPU Computing, CUDA Programming Model, CUDA API, Simple Matrix Multiplication in CUDA, CUDA Memory Model, Shared Memory Matrix Multiplication, Additional CUDA API Features

Module 5: Performance Analysers (4 hrs, SLO 14)
Trace Analyzer and Collector (ITAC), VTune Amplifier XE, Energy Efficient Performance, Integrated Performance Primitives (IPP)

Module 6: Contemporary Tools (3 hrs, SLO 14)
MKL (Math Kernel Library), Threading Building Blocks, CUDA Tools

Module 7: HTC and MTC (3 hrs, SLO 14)
HTC (High Throughput Computing), MTC (Many Task Computing), Top 500 supercomputers in the world, Top 10 supercomputer architectural details, Exploring Linpack

Module 8: Recent Trends (2 hrs, SLO 11)

Total lecture hours: 30
Project: 60 hrs [Non-Contact] (SLO 17)
Projects may be given as group projects.
Design and development of high-performance applications through parallel programming platforms in the following areas:
Network Security
Data Compression
Image Processing
Bio-Medical
Information retrieval
Natural Language Processing
Health care Applications
Reference Books
1. Rob Farber, “CUDA Application Design and Development”, Morgan Kaufmann
Publishers, 2013
2. Shameem Akhter and Jason Roberts, “Multi-Core Programming”, 1st edition, Intel Press,
2012
3. Cameron Hughes, Tracey Hughes, “Professional Multicore Programming Design and
Implementation for C++ Developers”, Wiley, 2008
4. Robert Oshana, “Multicore Software Development Techniques: Applications, Tips, and
Tricks”, Newnes, 1st edition, 2015
5. David B. Kirk, Wen-mei W. Hwu, “Programming Massively Parallel Processors: A
Hands-on Approach (Applications of GPU Computing Series)”, 1st edition, Morgan
Kaufmann, 2010.
Multicore Architectures
Knowledge areas that contain topics and learning outcomes covered in the course
Knowledge Area | Total Hours of Coverage
Systems Fundamentals (SF) | 30
Body of Knowledge coverage
[List the Knowledge Units covered in whole or in part in the course. If in part, please indicate
which topics and/or learning outcomes are covered. For those not covered, you might want to
indicate whether they are covered in another course or not covered in your curriculum at all.
This section will likely be the most time-consuming to complete, but is the most valuable for
educators planning to adopt the CS2013 guidelines.]
KA | Knowledge Unit | Topics Covered | Hours
SF | Computer Organization | Evolution of multicores through Moore's Law, Comparisons of single core, multi-core, multi-processing and hyper threading | 2
SF | Parallelism | Threading Concepts, Communication Architectures and Communication Costs, TLP, ILP, Comparisons, Cache Hierarchy and Memory-level Parallelism, Cache Coherence, Parallel programming models, Shared Memory and Message Passing, Vectorization | 5
SF | Parallel Programming Language | Introduction to OpenMP, Parallel constructs, Runtime Library routines, Work-sharing constructs, Scheduling clauses, Data environment clauses, atomic, master, Nowait Clause, Barrier Construct | 5
SF | Heterogeneous architecture and its programming | Introduction to GPU Computing, CUDA Programming Model, CUDA API, Simple Matrix Multiplication in CUDA, CUDA Memory Model, Shared Memory Matrix Multiplication, Additional CUDA API Features | 6
SF | Program Analyzer | Trace Analyzer and Collector (ITAC), VTune Amplifier XE, Energy Efficient Performance, Integrated Performance Primitives (IPP) | 4
SF | Contemporary Tools | MKL (Math Kernel Library), Threading Building Blocks, CUDA Tools | 3
SF | HTC and MTC | HTC (High Throughput Computing), MTC (Many Task Computing), Top 500 supercomputers in the world, Top 10 supercomputer architectural details, Exploring Linpack | 3
SF | Recent Trends | Recent trends in the field of multicore architectures | 2
Total hours 30
What is covered in the course?
[A short description, and/or a concise list of topics - possibly from your course syllabus. (This is
likely to be your longest answer)]
Module 1: Introduction to Multi-Core Architecture
Evolution of multi-cores through Moore's Law, Comparisons of single core, multi-core, multi-processing and hyper threading.
Module 2: Parallel Computers and their Programming
Threading Concepts, Communication Architectures and Communication Costs, TLP, ILP,
Comparisons, Cache Hierarchy and Memory-level Parallelism, Cache Coherence, Parallel
programming models, Shared Memory and Message Passing, Vectorization
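As a concrete illustration of the shared-memory programming model and threading concepts listed in Module 2, the following sketch (illustrative only, not part of the prescribed syllabus; the file name and variable names are assumptions) runs two C++ threads that increment one counter in the same address space. The atomic update keeps the increment race-free; with a plain variable the same update would be a data race, which is the motivation for the cache-coherence and synchronization material in this module.

// shared_counter.cpp -- minimal sketch of the shared-memory model (hypothetical example)
// Build with, e.g., g++ -std=c++11 -pthread shared_counter.cpp
#include <atomic>
#include <cstdio>
#include <thread>

int main() {
    std::atomic<long> counter{0};              // one variable shared by both threads

    auto work = [&counter]() {
        for (int i = 0; i < 1000000; ++i)
            counter.fetch_add(1, std::memory_order_relaxed);  // race-free increment
    };

    std::thread t1(work), t2(work);            // thread-level parallelism (TLP)
    t1.join();
    t2.join();

    // With a plain 'long' and no synchronization the final value would be unpredictable.
    std::printf("counter = %ld (expected 2000000)\n", counter.load());
    return 0;
}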
Module 3: OpenMP Programming
Introduction to OpenMP, Parallel constructs, Runtime Library routines, Work-sharing constructs,
Scheduling clauses, Data environment clauses, atomic, master, Nowait Clause, Barrier Construct
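The constructs named in Module 3 can all appear in a few lines of code. The sketch below is a minimal, illustrative OpenMP program (not taken from the syllabus or the reference books; the array size, variable names and file name are assumptions) that touches the parallel construct, a work-sharing loop with a scheduling clause, data-environment clauses, atomic, master, nowait, an explicit barrier and one runtime library routine.

/* openmp_demo.c -- minimal sketch of the Module 3 constructs (hypothetical example)
   Build with, e.g., gcc -fopenmp openmp_demo.c -o openmp_demo */
#include <omp.h>
#include <stdio.h>

#define N 1000

int main(void) {
    double a[N];
    double sum = 0.0;
    int chunks_done = 0;

    /* parallel construct with data-environment clauses */
    #pragma omp parallel shared(a, chunks_done) reduction(+:sum)
    {
        int tid = omp_get_thread_num();        /* runtime library routine */

        /* work-sharing construct with a scheduling clause; nowait drops the
           implicit barrier at the end of this loop */
        #pragma omp for schedule(static) nowait
        for (int i = 0; i < N; i++)
            a[i] = 0.5 * i;

        /* atomic construct: safe update of a shared counter */
        #pragma omp atomic
        chunks_done++;

        /* explicit barrier construct: all writes to a[] complete before reading */
        #pragma omp barrier

        #pragma omp for schedule(dynamic, 64)
        for (int i = 0; i < N; i++)
            sum += a[i];                        /* combined by the reduction clause */

        /* master construct: executed by thread 0 only */
        #pragma omp master
        printf("thread %d reports %d threads participated\n",
               tid, omp_get_num_threads());
    }

    printf("sum = %f, chunks_done = %d\n", sum, chunks_done);
    return 0;
}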
Module 4: CUDA Programming
Introduction to GPU Computing, CUDA Programming Model, CUDA API, Simple Matrix Multiplication in CUDA, CUDA Memory Model, Shared Memory Matrix Multiplication, Additional CUDA API Features
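Module 4's two matrix-multiplication case studies can be summarised in one CUDA source file. The sketch below is illustrative only (the kernel names, tile size and file name are assumptions, and error checking is omitted for brevity): the simple kernel assigns one thread per output element and reads operands from global memory, while the shared-memory version stages TILE x TILE tiles on chip to cut global-memory traffic, which is the point of the CUDA memory model discussion.

// matmul_demo.cu -- minimal sketch of the Module 4 kernels (hypothetical example)
// Build with, e.g., nvcc matmul_demo.cu -o matmul_demo
#include <cstdio>
#include <cuda_runtime.h>

#define TILE 16

// Simple matrix multiplication: one thread per element of C, global memory only.
__global__ void matmul_naive(const float *A, const float *B, float *C, int n) {
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {
        float acc = 0.0f;
        for (int k = 0; k < n; ++k)
            acc += A[row * n + k] * B[k * n + col];
        C[row * n + col] = acc;
    }
}

// Shared-memory matrix multiplication: each block stages tiles of A and B on chip.
__global__ void matmul_tiled(const float *A, const float *B, float *C, int n) {
    __shared__ float As[TILE][TILE];
    __shared__ float Bs[TILE][TILE];

    int row = blockIdx.y * TILE + threadIdx.y;
    int col = blockIdx.x * TILE + threadIdx.x;
    float acc = 0.0f;

    for (int t = 0; t < (n + TILE - 1) / TILE; ++t) {
        int aCol = t * TILE + threadIdx.x;
        int bRow = t * TILE + threadIdx.y;
        As[threadIdx.y][threadIdx.x] = (row < n && aCol < n) ? A[row * n + aCol] : 0.0f;
        Bs[threadIdx.y][threadIdx.x] = (bRow < n && col < n) ? B[bRow * n + col] : 0.0f;
        __syncthreads();                       // tile fully loaded before use

        for (int k = 0; k < TILE; ++k)
            acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
        __syncthreads();                       // done with this tile before overwriting
    }
    if (row < n && col < n)
        C[row * n + col] = acc;
}

int main() {
    const int n = 512;
    size_t bytes = n * n * sizeof(float);
    float *A, *B, *C;                          // unified memory keeps the host code short
    cudaMallocManaged(&A, bytes);
    cudaMallocManaged(&B, bytes);
    cudaMallocManaged(&C, bytes);
    for (int i = 0; i < n * n; ++i) { A[i] = 1.0f; B[i] = 2.0f; }

    dim3 block(TILE, TILE);
    dim3 grid((n + TILE - 1) / TILE, (n + TILE - 1) / TILE);
    matmul_tiled<<<grid, block>>>(A, B, C, n); // swap in matmul_naive to compare
    cudaDeviceSynchronize();

    printf("C[0] = %f (expected %f)\n", C[0], 2.0f * n);
    cudaFree(A); cudaFree(B); cudaFree(C);
    return 0;
}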
Module 5: Performance Analysers
Trace analyzer and collector (ITAC), VTune Amplifier XE, Energy Efficient Performance,
Integrated Performance Primitives (IPP)
Module 6: Contemporary Tools
MKL (Math Kernel Library), Threading Building Blocks, CUDA Tools
Module 7: HTC and MTC
HTC (High Throughput Computing), MTC (Many Task Computing), Top 500 supercomputers in the world, Top 10 supercomputer architectural details, Exploring Linpack.
Module 8: Recent trends
What is the format of the course?
[Is it face to face, online or blended? How many contact hours? Does it have lectures, lab
sessions, discussion classes?]
This course is designed with 100 minutes of in-classroom sessions per week as well as 200 minutes of non-contact time spent on implementing a course-related project. Generally, this course should combine lectures, in-class discussion, case studies, guest lectures, mandatory off-class reading material, assignments and quizzes.
How are students assessed?
[What type, and number, of assignments are students expected to do? (papers, problem sets,
programming projects, etc.). How long do you expect students to spend on completing assessed
work?]
Students are assessed based on group activities, classroom discussion, assignments, quizzes, projects, continuous assessment tests (CAT), and a final assessment test.
Additional topics
[List notable topics covered in the course that you do not find in the CS2013 Body of
Knowledge]
CUDA Programming, Top 10 supercomputers in the world and benchmarks
Other comments
Nil
Session wise plan
Student Outcomes Covered: 2, 5, 9, 17
Sl. No. | Class Hours | Topic Covered | Level of Mastery | Reference Book | Remarks
1 | 2 | Evolution of multi-cores through Moore's Law, Comparisons of single core, multi-core, multi-processing and hyper threading. | Usage | 2 |
2 | 5 | Threading Concepts, Communication Architectures and Communication Costs, TLP, ILP, Comparisons, Cache Hierarchy and Memory-level Parallelism, Cache Coherence, Parallel programming models, Shared Memory and Message Passing, Vectorization | Usage | 2 |
3 | 5 | Introduction to OpenMP, Parallel constructs, Runtime Library routines, Work-sharing constructs, Scheduling clauses, Data environment clauses, atomic, master, Nowait Clause, Barrier Construct | Usage | 2 | Assignments
4 | 6 | Introduction to GPU Computing, CUDA Programming Model, CUDA API, Simple Matrix Multiplication in CUDA, CUDA Memory Model, Shared Memory Matrix Multiplication, Additional CUDA API Features | Usage | 1 | Assignments
5 | 4 | Trace Analyzer and Collector (ITAC), VTune Amplifier XE, Energy Efficient Performance, Integrated Performance Primitives (IPP) | Usage | 1 |
6 | 3 | MKL (Math Kernel Library), Threading Building Blocks, CUDA Tools | Usage | 2 |
7 | 3 | HTC (High Throughput Computing), MTC (Many Task Computing), Top 500 supercomputers in the world, Top 10 supercomputer architectural details, Exploring Linpack. | Familiarity |
8 | 2 | Recent Trends
Total: 30 hours