
December 10 2013

Four short links: 10 December 2013

  1. ArangoDB — open-source database with a flexible data model for documents, graphs, and key-values. Build high-performance applications using a convenient SQL-like query language or JavaScript extensions.
  2. Google’s Seven Robotics Companies (IEEE) — The seven companies are capable of creating technologies needed to build a mobile, dexterous robot. Mr. Rubin said he was pursuing additional acquisitions. Rundown of those seven companies.
  3. Hebel (Github) — GPU-Accelerated Deep Learning Library in Python.
  4. What We Learned Open Sourcing — my eye was caught by the way they offered APIs to closed source code, found and solved performance problems, then open sourced the fixed code.

July 01 2013

Four short links: 1 July 2013

  1. Web Traffic Visualization — Dots enter when transactions start and exit when completed. Their speed is proportional to the client’s response time while their size reflects the server’s contribution to total time. Color comes from the specific request. (via Nelson Minar)
  2. Complete Guide to Being Interviewed on TV (Quartz) — good preparation for everyone who runs the risk of being quoted for 15 seconds.
  3. Harlan (GitHub) — new language for GPU programming. Simple examples in the announcement. (via Michael Bernstein)
  4. Open Fit — open source software that investigates several approaches to generating custom-tailored pants patterns. Open Fit Lab is an attempt to use this software for on-the-spot generation and creation of custom clothes. (via Kaitlin Thaney)

December 13 2010

Strata Gems: Use GPUs to speed up calculation

We're publishing a new Strata Gem each day all the way through to December 24. Yesterday's Gem: The emerging marketplace for social data. Early-bird pricing on Strata closes December 14: don't forget to register!

The release in November of Amazon Web Services' Cluster GPU instances highlights the move to the mainstream of Graphics Processing Units (GPUs) for general-purpose calculation. Graphical applications require very fast matrix transformations, for which GPUs are optimized. Boards such as the NVIDIA Tesla offer hundreds of processor cores, all able to work in parallel.
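The data-parallel workload described above can be sketched on the CPU with NumPy. This is a plain illustration of why matrix transformations map well to many cores, not GPU code; the array sizes and the scaling matrix are made up for the example:

```python
import numpy as np

# Transform many points (as 4-vectors) with one 4x4 matrix.
# Each output row depends only on its own input row, so every
# point could be handled by a separate GPU core in parallel.
transform = 2.0 * np.eye(4, dtype=np.float32)   # uniform scale by 2
points = np.random.rand(100000, 4).astype(np.float32)

transformed = points.dot(transform.T)           # one transform per point

print(transformed.shape)
```

Because no row's result depends on any other row, the work divides cleanly across however many cores are available.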

While debate continues about the exact performance boost GPUs deliver, reports indicate that speedups over CPUs of 2.5x to 15x can be obtained for calculation-heavy applications.

NVIDIA has led the trend for general-purpose computing on GPUs with the Compute Unified Device Architecture (CUDA). By using extensions to the C programming language, developers can write code that executes on the GPU, mixed in with code running on the CPU.

NVIDIA's Tesla M2050 GPU Computing Module

While CUDA is NVIDIA-only, OpenCL (Open Computing Language) is a standard for cross-platform general parallel programming. Originated by Apple and AMD, it is now developed with cross-industry participation. ATI and NVIDIA are among those offering OpenCL support for their products.

Now, with Amazon's support for GPU clusters, it's easier than ever to start accessing the power of GPUs for data analysis. OpenCL and CUDA bindings exist for many popular programming languages, including Java, Python and C++, and the R+GPU project gives GPU access to the R statistical package.

To get a quick impression of what GPU code looks like, check out this example from the Python OpenCL bindings. The code that executes on the GPU is the kernel source string passed to cl.Program.

import pyopencl as cl
import numpy
import numpy.linalg as la

a = numpy.random.rand(50000).astype(numpy.float32)
b = numpy.random.rand(50000).astype(numpy.float32)

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b.nbytes)

# The kernel: each work-item adds one pair of elements.
prg = cl.Program(ctx, """
__kernel void sum(__global const float *a,
                  __global const float *b,
                  __global float *c)
{
    int gid = get_global_id(0);
    c[gid] = a[gid] + b[gid];
}
""").build()

prg.sum(queue, a.shape, None, a_buf, b_buf, dest_buf)

a_plus_b = numpy.empty_like(a)
cl.enqueue_read_buffer(queue, dest_buf, a_plus_b).wait()

print(la.norm(a_plus_b - (a + b)))
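
To see what the kernel's index arithmetic is doing, here is the same element-wise sum written as an explicit CPU loop in plain NumPy, a sketch for comparison only, where gid plays the role of OpenCL's get_global_id:

```python
import numpy as np

a = np.random.rand(50000).astype(np.float32)
b = np.random.rand(50000).astype(np.float32)

# Each iteration mirrors one OpenCL work-item: work-item gid
# computes the single element c[gid] = a[gid] + b[gid]. On the
# GPU, all 50,000 of these run concurrently instead of in sequence.
c = np.empty_like(a)
for gid in range(a.shape[0]):
    c[gid] = a[gid] + b[gid]

print(np.linalg.norm(c - (a + b)))
```

The GPU wins not by doing each addition faster, but by doing them all at once.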

Amazon's Werner Vogels will be among the keynote speakers at Strata.
