CUDA for Engineers: An Introduction to High-Performance Parallel Computing

Duane Storti / Mete Yurtoglu  
November 2015
Product detail

Product: CUDA for Engineers: An Introduction to High-Performance Parallel Computing
Price (CHF): 50.30
Available: approx. 7-9 days


Ideal for students with at least introductory programming experience, this tutorial presents examples and reusable C code to jump-start a wide variety of applications. Students will walk through moving from serial to parallel computation, computing values of a function in parallel, understanding 2D parallelism, simulating dynamics in the phase plane, simulating heat conduction, interacting with 3D data, running a basic N-body simulation, and more.


  • Working examples show how to bring low-cost, high-performance parallel computing to engineering and scientific applications
  • Includes easy-to-understand, fully tested code for all examples
  • For students with at least introductory programming experience
  • Provides CUDA training that can significantly improve an engineer's job market readiness

Table of Contents

Acknowledgments

About the Authors


Introduction

What Is CUDA?

What Does “Need-to-Know” Mean for Learning CUDA?

What Is Meant by “for Engineers”?

What Do You Need to Get Started with CUDA?

How Is This Book Structured?

Conventions Used in This Book

Code Used in This Book

User’s Guide

Historical Context

References


Chapter 1: First Steps

Running CUDA Samples

Running Our Own Serial Apps

Summary

Suggested Projects


Chapter 2: CUDA Essentials

CUDA’s Model for Parallelism

Need-to-Know CUDA API and C Language Extensions

Summary

Suggested Projects

References


Chapter 3: From Loops to Grids

Parallelizing dist_v1

Parallelizing dist_v2

Standard Workflow

Simplified Workflow

Summary

Suggested Projects

References


Chapter 4: 2D Grids and Interactive Graphics

Launching 2D Computational Grids

Live Display via Graphics Interop

Application: Stability

Summary

Suggested Projects

References


Chapter 5: Stencils and Shared Memory

Thread Interdependence

Computing Derivatives on a 1D Grid

Summary

Suggested Projects

References


Chapter 6: Reduction and Atomic Functions

Threads Interacting Globally

Implementing parallel_dot

Computing Integral Properties: centroid_2d

Summary

Suggested Projects

References


Chapter 7: Interacting with 3D Data

Launching 3D Computational Grids: dist_3d

Viewing and Interacting with 3D Data: vis_3d

Summary

Suggested Projects

References


Chapter 8: Using CUDA Libraries

Custom versus Off-the-Shelf

Thrust

cuRAND

NPP

Linear Algebra Using cuSOLVER and cuBLAS

cuDNN

ArrayFire

Summary

Suggested Projects

References


Chapter 9: Exploring the CUDA Ecosystem

The Go-To List of Primary Sources

Further Sources

Summary

Suggested Projects


Appendix A: Hardware Setup

Checking for an NVIDIA GPU: Windows

Checking for an NVIDIA GPU: OS X

Checking for an NVIDIA GPU: Linux

Determining Compute Capability

Upgrading Compute Capability


Appendix B: Software Setup

Windows Setup

OS X Setup

Linux Setup


Appendix C: Need-to-Know C Programming

Characterization of C

C Language Basics

Data Types, Declarations, and Assignments

Defining Functions

Building Apps: Create, Compile, Run, Debug

Arrays, Memory Allocation, and Pointers

Control Statements: for, if

Sample C Programs

References


Appendix D: CUDA Practicalities: Timing, Profiling, Error Handling, and Debugging

Execution Timing and Profiling

Error Handling

Debugging in Windows

Debugging in Linux


Using Visual Studio Property Pages

References


Index



Duane Storti is a professor of mechanical engineering at the University of Washington in Seattle. He has thirty-five years of experience in teaching and research in the areas of engineering mathematics, dynamics and vibrations, computer-aided design, 3D printing, and applied GPU computing.


Mete Yurtoglu is currently pursuing an M.S. in applied mathematics and a Ph.D. in mechanical engineering at the University of Washington in Seattle. His research interests include GPU-based methods for computer vision and machine learning.