Artificial intelligence does not exist on its own. It is a technology that fits into a larger solution to address a business issue. No market stays relevant by standing still. As these systems are used in more and more places, how and when to update them becomes an important question.
copyright by www.forbes.com
In the continuing theme of higher-level tools that make it easier to develop useful applications, today we’ll visit feature engineering in a changing environment. Artificial intelligence (AI) is increasingly used to analyze data, and machine learning (ML) is one of the more complex aspects of AI. In multiple forums, I’ve discussed the need to move past heavy reliance not just on pure coding, but even on the basic frameworks discussed by programmers. One of the keys to the complexity is figuring out the right data attributes, or features, that matter to any system. It’s even more important in ML, both because of larger data sets and because the inference engine is less transparent than procedural code. As tricky as that is the first time, it needs to be a repeatable process: environments change, and systems must change with them.
Defining the initial feature set is important, but it’s not the end of the game. While many people focus on ML’s ability to change results based on more data, that still means the use of the same features. For instance, the features are fairly well known in radiology. What matters there is gaining more examples for training, to see the variation in how those features appear. However, what if there’s a new kind of tumor? There might be a new feature that needs to be added to the mix. With supervised systems, that’s relatively easy to handle, because you can provide images labeled with the new feature and retrain the system.
However, what about consumer taste? Features are defined, then the system looks for relationships between the defined features and provides analysis. But fashion changes over time. Imagine, for instance, a system defined when all pants had pleats. Whether or not pants should have pleats wasn’t a question at the time, so the designers did not train the system to analyze the existence of pleats. While the attribute might be present in the full data set, for performance reasons it was not engineered into the engine as a feature.
Suddenly, there’s a change. People start buying pants without pleats. That becomes something that consumers want. While that might be in the full dataset, the inference engine is not evaluating that variable because it is not a defined feature. The environment has changed. How can that be recognized, and the system changed?
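To make the question concrete, here is a minimal sketch of one way such a change could be noticed: periodically scan the columns that are in the full dataset but outside the model’s feature set, and flag any that have become informative about the outcome. Everything in it is an assumption made for illustration: the column names, the tiny `recent_sales` sample, the `find_emerging_features` helper, and the 0.1 threshold are hypothetical, and information gain is only one of several measures that could be used.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, column, target):
    """How much knowing `column` reduces uncertainty about `target`."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[column], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

def find_emerging_features(rows, model_features, target, threshold=0.1):
    """Return columns outside the model's feature set whose gain exceeds `threshold`."""
    candidates = [c for c in rows[0] if c != target and c not in model_features]
    gains = {c: information_gain(rows, c, target) for c in candidates}
    return {c: g for c, g in gains.items() if g > threshold}

# Hypothetical recent sales. The deployed model only uses "color" as a feature.
recent_sales = [
    {"color": "navy",  "fabric": "cotton", "pleats": False, "bought": True},
    {"color": "navy",  "fabric": "cotton", "pleats": True,  "bought": False},
    {"color": "khaki", "fabric": "wool",   "pleats": False, "bought": True},
    {"color": "khaki", "fabric": "wool",   "pleats": True,  "bought": False},
    {"color": "navy",  "fabric": "cotton", "pleats": False, "bought": True},
    {"color": "khaki", "fabric": "wool",   "pleats": True,  "bought": True},
]

emerging = find_emerging_features(recent_sales, model_features={"color"}, target="bought")
print(emerging)  # only "pleats" should be flagged in this sample
```

In this toy sample, fabric tells us nothing about purchases while pleats tells us a good deal, so only pleats is flagged as a candidate for re-engineering into the model. A real check would run over far more data and would need a threshold tuned to the business.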
SparkBeyond is a company working to address the problem. While the product works with initial feature engineering, the key advantage is that it fits into DevOps and other processes to help keep ML-driven applications current in changing environments.
What the company’s platform does is analyze the base data being used by the ML systems. The platform is not itself an ML inference engine; instead, it leverages random forests (RF). This technique builds many decision trees, each on a different random sample of the data and the candidate features, and then combines their results. It is helped by the advances of cloud technologies and the ability to scale out to multiple servers. Large numbers of decision trees can be analyzed, with new patterns being seen. RF is one of the ways that AI has moved past a narrow definition of ML, as it can surface insight quickly, identifying new classifications and relationships in large data sets.
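The idea behind a random forest can be sketched in a few lines. The example below is not SparkBeyond’s implementation, which the article does not describe. It is a deliberately simplified illustration in which every tree is a one-level "stump", trained on a bootstrap sample of the rows and a random subset of the features, and we count which feature wins the split most often. The data, the column names, and the `stump_forest_votes` helper are all hypothetical; real random forests grow deeper trees and use more refined importance measures.

```python
import random
from collections import Counter
from itertools import product

def gini(labels):
    """Gini impurity of a list of labels (0.0 means perfectly pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())

def best_split(rows, features, target):
    """Pick the feature whose value-based split gives the lowest weighted impurity."""
    best_feature, best_score = None, float("inf")
    for f in features:
        groups = {}
        for r in rows:
            groups.setdefault(r[f], []).append(r[target])
        score = sum(len(g) / len(rows) * gini(g) for g in groups.values())
        if score < best_score:
            best_feature, best_score = f, score
    return best_feature

def stump_forest_votes(rows, features, target, n_trees=200, n_sub=2, seed=0):
    """Count how often each feature wins the root split across randomized stumps."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]   # bootstrap sample of the rows
        subset = rng.sample(features, n_sub)        # random subset of the features
        votes[best_split(sample, subset, target)] += 1
    return votes

# Hypothetical data: shoppers now buy only the pants without pleats.
rows = [
    {"color": c, "fabric": f, "pleats": p, "bought": not p}
    for c, f, p in product(["navy", "khaki"], ["cotton", "wool"], [True, False])
]

votes = stump_forest_votes(rows, ["color", "fabric", "pleats"], "bought")
top_feature = votes.most_common(1)[0][0]
print(votes.most_common())
print("Most informative feature:", top_feature)
```

Because each stump sees only a random slice of the rows and features, no single tree is trustworthy, but the tally across hundreds of them points clearly at pleats. That independence is also why the approach scales out well: each tree can be built on a different server.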