General-purpose computing on GPUs
Historically, GPUs were designed and used to render high-resolution graphics, for example in video games. To render millions of pixels every second, GPUs use a highly parallel architecture specialized for the computations that graphics rendering requires. At a high level, a GPU resembles a CPU: it has its own multi-core processor and its own memory. However, because GPUs were not designed for general computation, their individual cores are much simpler than CPU cores, with slower clock speeds and limited support for complex instructions. A GPU also typically has less memory than the main memory (RAM) available to a CPU. To achieve real-time rendering, a GPU performs most computations in parallel across many more cores than a CPU has; a modern GPU might have more than 2,000 cores. Because one core can run multiple threads, a GPU can run tens of thousands of threads in parallel.
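The step from "more than 2,000 cores" to "tens of thousands of threads" is simple multiplication, and a minimal sketch makes it concrete. The core count is the lower bound quoted above. The number of threads per core is a hypothetical value chosen only for illustration, since it varies by device and the text gives no figure.

```python
# Back-of-the-envelope estimate of how many threads a GPU can run at once.
# Both inputs are illustrative assumptions, not the specs of a real device.

def concurrent_threads(cores: int, threads_per_core: int) -> int:
    """Return the number of threads that can run if every core is fully used."""
    return cores * threads_per_core

cores = 2_000          # lower bound taken from the text ("more than 2,000 cores")
threads_per_core = 16  # hypothetical value; the real figure depends on the hardware

total = concurrent_threads(cores, threads_per_core)
print(f"{total:,} threads")  # prints "32,000 threads"
```

Under these assumed numbers the estimate is 32,000 threads, which matches the order of magnitude given in the text.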
In the 1990s, programmers began to realize that certain...