Many interesting analytic problems can be scaled to greater data volumes by using parallel processing approaches, such as Hadoop clusters.
Some problems, however, cannot be run in parallel, or gain only modest speed when they are. Many interesting questions, with valuable answers, fall into this space.
In a few years' time, many current analytic systems will become uneconomic to run: although the value they produce grows as the data grows, the cost of running them grows faster. By studying the “running time” of algorithms (how long it takes to solve a problem versus the amount of data), it is possible to predict when growing data sizes will reach that tipping point, the tipping point of big data.
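
To make the idea concrete, here is a minimal sketch of how such a tipping point might be estimated. Everything in it is an illustrative assumption rather than a measured figure: the value function is taken to be linear in the number of records, the cost function is taken to be quadratic (standing in for an algorithm with quadratic running time), and the names `value`, `cost` and `tipping_year`, along with all the constants, are hypothetical.

```python
def value(n, value_per_record=0.01):
    """Assumed value model: value grows linearly with the number of records."""
    return value_per_record * n


def cost(n, cost_per_step=1e-9):
    """Assumed cost model: a quadratic-time algorithm taking n**2 steps."""
    return cost_per_step * n ** 2


def tipping_year(n_start, annual_growth, max_years=50):
    """Return the first year in which cost exceeds value, or None if it
    does not happen within max_years.

    n_start is the number of records today; annual_growth is the factor
    by which the data grows each year (2.0 means it doubles).
    """
    n = n_start
    for year in range(max_years + 1):
        if cost(n) > value(n):
            return year
        n *= annual_growth
    return None


# With these made-up constants, cost overtakes value once n exceeds 10 million.
print(tipping_year(1_000_000, 2.0))  # → 4
```

Under these assumptions, a system holding one million records today, with data doubling each year, would cross the tipping point in year 4. Swapping in a different cost function (for example, one proportional to n log n) moves the tipping point later or removes it altogether, which is why the running time of the underlying algorithm matters so much.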