I’m having a lot of trouble understanding the second paragraph in section 2.1.2, especially the sentence “Amdahl’s law assumes that the size of the problem stays constant as the number of processors increases, but Gustafson (1988) notes that in practice the problem size scales with the number of processors.” Can you expand on what you mean here?
Edit: Also, there’s a typo in 4.1: “practicioners”.
I think the point is that when you increase the size of the data set, you expose more work for the parallelism to handle.
If I have a 1 KB dataset and a partially parallel algorithm to run on it, I will very quickly ‘run out of parallelism’ and find that 1,000 processors are no better than 2 or 3. Whereas if I have a 1 PB dataset, the same kind of data and the same algorithm, I will be able to keep adding processors for a long time before I finally run out of parallelism.
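To make the contrast concrete, here’s a quick Python sketch of the two formulas (my own illustration, not from the paper; the 5% serial fraction and the processor counts are arbitrary assumptions):

```python
# Contrast Amdahl's law (fixed problem size) with Gustafson's law
# (problem size scales with processor count). The serial fraction `s`
# and processor counts below are illustrative values, not from the paper.

def amdahl_speedup(s, n):
    """Fixed problem size: speedup = 1 / (s + (1 - s) / n)."""
    return 1.0 / (s + (1.0 - s) / n)

def gustafson_speedup(s, n):
    """Problem scales with n: scaled speedup = s + (1 - s) * n."""
    return s + (1.0 - s) * n

s = 0.05  # assume 5% of the work is inherently serial
for n in (2, 10, 100, 1000):
    print(f"n={n:5d}  Amdahl={amdahl_speedup(s, n):8.2f}  "
          f"Gustafson={gustafson_speedup(s, n):8.2f}")
```

With a 5% serial fraction, Amdahl’s speedup can never exceed 1/0.05 = 20 no matter how many processors you add, whereas Gustafson’s scaled speedup keeps growing roughly linearly, because the parallel portion of the work grows along with the machine.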
gwern’s explanation is right: it’s Gustafson’s law. I’ll clarify that.