CUDA for Engineers: An Introduction to High-Performance Parallel Computing

Duane Storti / Mete Yurtoglu  
November 2015
Product detail

Price: CHF 50.30


Ideal for students with at least introductory programming experience, this tutorial presents examples and reusable C code to jumpstart a wide variety of applications. Students will walk through moving from serial to parallel computation, computing values of a function in parallel, understanding 2D parallelism, simulating dynamics in the phase plane, simulating heat conduction, interacting with 3D data, running a basic N-body simulation, and more.


  • Working examples show how to bring low-cost, high-performance parallel computing to engineering and scientific applications
  • Includes easy-to-understand, fully tested code for all examples
  • For students with at least introductory programming experience
  • Provides CUDA training that can significantly improve an engineer's job market readiness

Table of Contents

Acknowledgments

About the Authors


Introduction

What Is CUDA?

What Does “Need-to-Know” Mean for Learning CUDA?

What Is Meant by “for Engineers”?

What Do You Need to Get Started with CUDA?

How Is This Book Structured?

Conventions Used in This Book

Code Used in This Book

User’s Guide

Historical Context

References


Chapter 1: First Steps

Running CUDA Samples

Running Our Own Serial Apps

Summary

Suggested Projects


Chapter 2: CUDA Essentials

CUDA’s Model for Parallelism

Need-to-Know CUDA API and C Language Extensions

Summary

Suggested Projects

References


Chapter 3: From Loops to Grids

Parallelizing dist_v1

Parallelizing dist_v2

Standard Workflow

Simplified Workflow

Summary

Suggested Projects

References


Chapter 4: 2D Grids and Interactive Graphics

Launching 2D Computational Grids

Live Display via Graphics Interop

Application: Stability

Summary

Suggested Projects

References


Chapter 5: Stencils and Shared Memory

Thread Interdependence

Computing Derivatives on a 1D Grid

Summary

Suggested Projects

References


Chapter 6: Reduction and Atomic Functions

Threads Interacting Globally

Implementing parallel_dot

Computing Integral Properties: centroid_2d

Summary

Suggested Projects

References


Chapter 7: Interacting with 3D Data

Launching 3D Computational Grids: dist_3d

Viewing and Interacting with 3D Data: vis_3d

Summary

Suggested Projects

References


Chapter 8: Using CUDA Libraries

Custom versus Off-the-Shelf

Thrust

cuRAND

NPP

Linear Algebra Using cuSOLVER and cuBLAS

cuDNN

ArrayFire

Summary

Suggested Projects

References


Chapter 9: Exploring the CUDA Ecosystem

The Go-To List of Primary Sources

Further Sources

Summary

Suggested Projects


Appendix A: Hardware Setup

Checking for an NVIDIA GPU: Windows

Checking for an NVIDIA GPU: OS X

Checking for an NVIDIA GPU: Linux

Determining Compute Capability

Upgrading Compute Capability


Appendix B: Software Setup

Windows Setup

OS X Setup

Linux Setup


Appendix C: Need-to-Know C Programming

Characterization of C

C Language Basics

Data Types, Declarations, and Assignments

Defining Functions

Building Apps: Create, Compile, Run, Debug

Arrays, Memory Allocation, and Pointers

Control Statements: for, if

Sample C Programs

References


Appendix D: CUDA Practicalities: Timing, Profiling, Error Handling, and Debugging

Execution Timing and Profiling

Error Handling

Debugging in Windows

Debugging in Linux

Using Visual Studio Property Pages

References


Index



Duane Storti is a professor of mechanical engineering at the University of Washington in Seattle. He has thirty-five years of experience in teaching and research in the areas of engineering mathematics, dynamics and vibrations, computer-aided design, 3D printing, and applied GPU computing.


Mete Yurtoglu is currently pursuing an M.S. in applied mathematics and a Ph.D. in mechanical engineering at the University of Washington in Seattle. His research interests include GPU-based methods for computer vision and machine learning.