Tap the predictive power of machine learning with these diverse, easy-to-implement libraries and frameworks
Spam filtering, face recognition, recommendation engines — when you have a large data set on which you’d like to perform predictive analysis or pattern recognition, machine learning is the way to go.
Apache Mahout
Apache Mahout provides a way to build environments for hosting machine learning applications that can be scaled quickly and efficiently to meet demand.
Mahout was originally devised to work with Hadoop for running distributed applications. It now works mainly with another well-known Apache project, Spark, and has been extended to work with other distributed back ends such as Flink and H2O.
Mahout uses a domain-specific language in Scala. Version 0.14 is a major internal refactor of the project, with Apache Spark 2.4.3 as its default.
Compose
Compose, by Innovation Labs, targets a common issue with machine learning models: labelling raw data. Labelling can be a slow and tedious process, but without it a machine learning model can’t deliver useful results.
Compose lets you write a set of labelling functions for your data in Python, so labelling can be done as programmatically as possible. Various transformations and thresholds can be set on your data to make the labelling process easier, such as placing data in bins based on discrete values or quantiles.
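To make the idea concrete, here is a minimal sketch of a labelling function followed by quantile binning, written with the Python standard library only. It does not use Compose's own API; the function names, the customer data, and the choice of two bins are all hypothetical and exist only for illustration.

```python
from bisect import bisect_right
from statistics import quantiles


def total_spent(purchases):
    """Labelling function: reduce one customer's raw purchases to a single number."""
    return sum(amount for _, amount in purchases)


def bin_by_quantiles(values, n_bins):
    """Replace each value with the index of the quantile bin it falls into."""
    # quantiles() returns the n_bins - 1 cut points between equal-sized bins.
    cuts = quantiles(values, n=n_bins)
    return [bisect_right(cuts, v) for v in values]


# Hypothetical raw data: customer -> list of (item, amount) purchases.
raw = {
    "ann": [("book", 12.0), ("pen", 3.0)],
    "bob": [("laptop", 900.0)],
    "cat": [("mug", 8.0), ("tea", 6.0), ("kettle", 40.0)],
    "dan": [("phone", 450.0), ("case", 20.0)],
}

# Step 1: apply the labelling function to every customer.
labels = {customer: total_spent(p) for customer, p in raw.items()}

# Step 2: turn the continuous labels into two classes (low and high spenders).
bins = dict(zip(labels, bin_by_quantiles(list(labels.values()), n_bins=2)))
```

The labelling function holds the domain logic, while the binning step is a generic transformation applied afterwards, which is the separation the paragraph above describes.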
Core ML Tools
Apple’s Core ML framework lets you integrate machine learning models into apps, but uses its own distinct model format. The good news is you don’t have to pre-train models in the Core ML format to use them; you can convert models from just about every commonly used machine learning framework into Core ML with Core ML Tools.
Core ML Tools runs as a Python package, so it integrates with the wealth of Python machine learning libraries and tools. Models from TensorFlow, PyTorch, Keras, Caffe, ONNX, Scikit-learn, LibSVM, and XGBoost can all be converted. Neural network models can also be optimised for size by using post-training quantisation (e.g., to a small bit depth that’s still accurate).
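The principle behind post-training quantisation can be sketched in a few lines of plain Python. This is a generic linear 8-bit scheme under simple assumptions, not the routine Core ML Tools itself applies; the function names and the sample weights are hypothetical.

```python
def quantise(weights, bits=8):
    """Map floats onto 2**bits evenly spaced integer levels (linear quantisation)."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** bits - 1
    # Fall back to a scale of 1.0 when all weights are identical.
    scale = (hi - lo) / levels or 1.0
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantise(codes, scale, lo):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


weights = [-0.82, -0.11, 0.0, 0.37, 0.95]  # hypothetical layer weights
codes, scale, lo = quantise(weights, bits=8)
restored = dequantise(codes, scale, lo)

# The rounding step bounds the error by half of one quantisation step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

Each weight is stored as a one-byte code instead of a 32-bit float, and the reconstruction error stays within half a quantisation step, which is why a small bit depth can remain accurate.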
Cortex
Cortex provides a convenient way to serve predictions from machine learning models using Python with TensorFlow, PyTorch, Scikit-learn, and other models. Most Cortex packages consist of only a few files: your core Python logic, a cortex.yaml file that describes what models to use and what kinds of compute resources to allocate, and a requirements.txt file to install any needed Python requirements.
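The "core Python logic" file might look roughly like the sketch below. The class name, the constructor argument, and the predict method are assumptions modelled loosely on a typical predictor interface, so check them against the Cortex documentation for your version. A keyword count stands in for a real trained model, and the spam_words and threshold settings are hypothetical.

```python
class PythonPredictor:
    """Hypothetical predictor: the class and method names are assumptions."""

    def __init__(self, config):
        # A real package would load a trained model here, for example from a
        # path given in config. A keyword threshold stands in for the model.
        self.spam_words = set(config.get("spam_words", []))
        self.threshold = config.get("threshold", 2)

    def predict(self, payload):
        # payload is assumed to be the parsed JSON body of a request.
        words = payload["text"].lower().split()
        hits = sum(word in self.spam_words for word in words)
        return {"spam": hits >= self.threshold, "hits": hits}


# Local smoke test, independent of any serving infrastructure.
predictor = PythonPredictor({"spam_words": ["free", "winner", "prize"], "threshold": 2})
result = predictor.predict({"text": "You are a WINNER claim your free prize"})
```

Keeping the model loading in the constructor and the per-request work in a single method means the same file can be exercised locally before it is packaged and deployed.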
The whole package is deployed as a Docker container to AWS or another Docker-compatible hosting system. Compute resources are allocated in a way that echoes how Kubernetes defines them, and you can use GPUs or Amazon Inferentia ASICs to speed serving. […]