We are living in an age of Big Data. In 2012 the Harvard Business Review reported that “… about 2.5 exabytes of data are created each day, and that number is doubling every 40 months or so. More data cross the internet every second than were stored in the entire internet just 20 years ago.” During Fortune’s 2016 annual Brainstorm Tech conference, Shivon Zilis (Bloomberg Beta) said: “Data is the new oil.” Machines can now offer valuable services such as Google Translate, Apple’s Siri, facial recognition, and self-driving cars, but to do this they must learn from vast data sets. Zilis argued that this demand is turning data into a commodity like oil.
In some specific areas, and with enough data, deep learning machines can outperform human experts. This performance is a tremendous achievement and an example of technical progress, but it is not a universal panacea or silver bullet. Machine learning as a whole has many aspects in common with the drivers of progress I described in my earlier article “The formula for progress.”
In that article I outline the formula:
At the core of much of machine learning are mathematical algorithms for updating belief as new information is presented (learning). In these algorithms, belief is expressed as a probability. The original example of this pattern is Bayes' theorem.
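As a rough illustration of what "updating belief" means in practice, Bayes' theorem can be written as P(H | D) = P(D | H) × P(H) / P(D), where H is a hypothesis and D is newly observed data. The short Python sketch below applies that rule three times in a row. The function name bayes_update and all the numbers are hypothetical, chosen only to show the shape of the calculation; they do not come from any real data set or library.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(H | D) given P(H), P(D | H) and P(D | not H)."""
    # P(D), by the law of total probability over "H true" and "H false".
    evidence = likelihood_if_true * prior + likelihood_if_false * (1.0 - prior)
    if evidence == 0:
        raise ValueError("the observed data has zero probability under both hypotheses")
    return likelihood_if_true * prior / evidence


# Hypothetical numbers, for illustration only.
belief = 0.5  # initial belief that the hypothesis is true
for _ in range(3):
    # Assume each observation is four times as likely if the hypothesis is true.
    belief = bayes_update(belief, likelihood_if_true=0.8, likelihood_if_false=0.2)
    print(round(belief, 3))
# prints 0.8, then 0.941, then 0.985
```

Each pass uses the previous result as the new prior, so belief in the hypothesis strengthens as supporting observations accumulate. Observations that were more likely under the alternative would push it back down.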
Although the language is different and it uses mathematical notation, Bayes' theorem and the symbol for progress have the same shape or form:
Testing is the gold standard. Executing a well-designed and rigorous experiment provides the most informative data. Bayesian statistics complements this by providing a mathematical method that uses the data to assess truth and update belief. Machine learning and Bayesian statistics are powerful techniques for handling data, and when they are harnessed to the right goals they can drive progress.