Facebook has just short of 2.4 billion active users and sees 350 million photo uploads a day, plus more than 500,000 comments posted every minute. How does it track, monitor and gain value from this amount of information?
copyright by www.computerweekly.com
“There are billions of users and no way for humans to scale to do the analytics,” says Chirag Dekate, a research director at Gartner.
So, Facebook uses machine learning and AI systems to scan posts. “No one can analyse every video or image for banned or inflammatory material, or tag them for ad revenue generation,” says Dekate.
Social media sites are just one example of a growing number of applications of AI, which has moved from academic research into areas as diverse as medicine, law enforcement, insurance and retailing.
Its growth has far-reaching implications for enterprise IT systems, including data storage.
AI is a broad term that covers a wide range of use cases and applications, as well as different ways of processing data. Machine learning, deep learning and neural networks all have their own hardware and software requirements and use data in different ways.
“Machine learning is a subset of AI, and deep learning is a subset of machine learning,” says Mike Leone, senior analyst at ESG.
Deep learning, for example, will carry out several passes of a data set to make a decision and learn from its predictions based on the data it reads.
Machine learning is simpler and relies on human-written algorithms and training with known data to develop the ability to make predictions. If the results are incorrect, data scientists will change the algorithms and retrain the model.
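That train, predict and retrain loop can be sketched in a few lines of Python. The data and the simple nearest-neighbour rule below are purely illustrative, not any system described in this article:

```python
# Minimal sketch of the machine learning loop described above: train on
# known (labelled) data, make predictions, and retrain when the results
# are wrong. "Retraining" here is just extending the labelled data set.

def predict(model, x):
    """Return the label of the training example closest to x."""
    nearest = min(model, key=lambda pair: abs(pair[0] - x))
    return nearest[1]

# Training data: (feature, label) pairs a data scientist has verified.
model = [(1.0, "low"), (2.0, "low"), (8.0, "high")]

print(predict(model, 1.5))   # near the "low" examples -> "low"
print(predict(model, 5.5))   # nothing similar seen yet -> "high" (wrong)

# The result was incorrect, so add a corrected example and retrain.
model.append((5.5, "mid"))
print(predict(model, 5.5))   # now -> "mid"
```

A real pipeline would adjust the algorithm or model parameters rather than just append data, but the feedback cycle is the same: evaluate predictions, correct the training set, train again.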
A machine learning application could draw on thousands of data points. A deep learning application's data set will be an order of magnitude larger, easily running to millions of data points.
“Deep learning acts similarly to a human brain in that it consists of multiple interconnected layers similar to neurons in a brain,” says Leone. “Based on the accuracy or inaccuracy of predictions, it can automatically re-learn or self-adjust how it learns from data.”
Data storage requirements for AI vary widely according to the application and the source material. “Depending on the use case, the data set varies quite dramatically,” says Dekate. “In imaging, it grows almost exponentially as files tend to be really, really huge.
“Any time you do deep learning or neural systems, you are going to need new architecture and new capabilities. But in a use case like fraud detection, you can use an existing infrastructure stack without new hardware for incredible results.”[…]