Many minions! Think about it. If you have a ton of work to do and you want to get it done as quickly as possible, you sucker a bunch of friends into helping you. Welcome to the world of parallelization. Sounds simple, right? Well, yes and no.
Let's start with a simple problem. Remember when your teacher had you swap quizzes with a classmate and grade each other's? The teacher had n tasks to do and was able to divide them among n workers. This is the ideal case. It rarely happens, but it provides a good baseline for this discussion.
Embarrassingly Parallel
So let's say we want to calculate the curve of a polynomial function:
y=x^2
With this problem, there are no data dependencies: no value of y depends on any other value. So we can hand each calculation off to a separate worker.
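Here's a minimal Python sketch of how that division of labor might look (the range of x values and the default worker pool size are arbitrary choices of mine):

```python
from concurrent.futures import ProcessPoolExecutor

def f(x):
    # y = x^2: each result depends only on its own x, never on a neighbor
    return x * x

if __name__ == "__main__":
    xs = range(100)  # an arbitrary sample of x values
    # hand each x to whichever worker is free; results come back in order
    with ProcessPoolExecutor() as pool:
        ys = list(pool.map(f, xs))
    print(ys[:5])  # [0, 1, 4, 9, 16]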
A computer architect by the name of Gene Amdahl came up with a measure of how much speedup you can get by parallelizing a problem. It's called Amdahl's Law.
Given:
- P, the proportion of the problem that can run in parallel
- N, the number of workers
The speedup is calculated as follows:
S(N)=1/((1-P)+P/N)
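In code, the law is a one-liner. Here's a small Python helper (the function name is my own):

```python
def amdahl_speedup(p, n):
    """Speedup from Amdahl's Law: p = parallel fraction, n = workers."""
    return 1 / ((1 - p) + p / n)
```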
So, assuming we have an infinite number of workers, the P/N term vanishes and the speedup is capped at 1/(1-P). As we parallelize more and more of the problem (P approaches 1), that cap approaches infinity.
For our problem above, let's say dividing up the work carries a 1% serial overhead, so P = 0.99. Our speedup with infinite workers would then be:
S(∞)=1/((1-0.99)+0.99/∞)=1/0.01=100
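Plugging our 1% overhead into the helper above shows the speedup climbing toward, but never past, that ceiling of 100:

```python
print(amdahl_speedup(0.99, 10))     # ~9.17
print(amdahl_speedup(0.99, 1000))   # ~90.99
print(amdahl_speedup(0.99, 10**9))  # ~100.0, the 1/(1 - p) ceiling
```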
The problem above is what's termed embarrassingly parallel.
Not So Embarrassing
So what problems are harder? Let's look at the Mandelbrot Set.
Don't worry about understanding the math. The problem with calculating the set is that each value in a row depends on the value before it in the row, so we have a data dependency. Given a problem space of m columns and n rows, we can only divide the problem into n pieces, one per row. How to handle that is something we will cover in a future post.
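In the meantime, to make the dependency idea concrete, here's a Python sketch of the standard escape-time iteration for a single point c (max_iter is an arbitrary cutoff I've chosen). Each iterate is computed from the one before it, so the chain of steps can't be handed out one per worker:

```python
def escape_time(c, max_iter=100):
    # iterate z_{n+1} = z_n^2 + c; each step needs the previous z
    z = 0
    for i in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return i    # c escaped: it's outside the set
    return max_iter     # c never escaped: treat it as inside the set

# one point per call, e.g.:
print(escape_time(complex(-0.5, 0.5)))
```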