Not all artificial intelligence (AI) is made equal. A wide range of different techniques and applications fall under the term “AI.”
Some of these techniques and applications work really well. But other applications, especially those focused on prediction, have fallen short of expectations.
Fielding AI before it is ready could lead to complacency amongst soldiers, inaccurate predictions, and strategic manipulation by users to game the system. Military procurement officers need to learn basic AI literacy to become smart buyers and ensure that AI is not fielded prematurely.
Manage Expectations of Artificial Intelligence
AI has improved significantly over the past 20 years, but there are still clear limits to what algorithms can do. AI is much better equipped for tasks related to categorization and classification than for judgment or prediction. Princeton Associate Professor Arvind Narayanan refers to programs sold in the latter category as “AI snake oil.” AI is especially strong at tasks like translation and generating new content (e.g., deepfakes or artificially generated text such as that produced by Generative Pretrained Transformer-2 (GPT-2)). These are tasks that are narrow, with a clearly defined purpose and little ambiguity about the correctness of the outcome. AI is best suited for such applications.
AI struggles more with tasks related to automated judgment, such as spam filtering and hate speech detection. These tasks involve a subjective verdict, and there will always be disagreement about the correctness of the decision. Most difficult is the task of prediction, especially for social outcomes. Narayanan claims that here AI currently performs little better than linear regression. For example, Julia Dressel and Hany Farid have shown that software used to predict recidivism does not perform better than people without criminal justice expertise. Additionally, AI will perform worse if there is limited data, or if the data it is trained on and data from the real world are not similar. Moreover, it is still very difficult to integrate prior knowledge into models.
So far, AI has not lived up to the hype. A 2019 MIT survey of over 2,500 executives showed that 65 percent of the companies that made investments in AI have not yet seen any value gained from it in the past three years. Prof. Erik Brynjolfsson calls this the “productivity paradox of AI.” A similar dynamic is playing out with fully autonomous cars, which were predicted to be driving on the streets years ago. That does not mean that no AI works, or that AI will never work. AI is great for many applications, but it’s just not a tool that can fix every problem.
Policymakers, engineers, and analysts should develop more realistic expectations of AI. If not, the result could be another “AI winter.” AI winters previously occurred in the 1970s and 1990s, as research funding dried up when high expectations were not met. The technologies of concern are not the ones that clearly don’t work, but those that give an illusion of being capable. This is hard to judge. It is much less obvious whether a predictive program works than whether a tank does, and the outcome of an assessment of whether software works is much more ambiguous. Moreover, there are no formal methods yet to verify and validate autonomous systems, making it even more difficult to assess whether programs function as prescribed. This will make procurement and deployment especially challenging.[…]