Artificial intelligence (AI) and machine learning (ML) capabilities are growing at an unprecedented rate. Countless AI applications are already being developed, and many more can be expected over the long term. In hindsight, progress has clearly taken place: a range of tasks that AI and ML systems can now solve autonomously (according to the benchmarks), from machine translation to medical image analysis and self-driving vehicles, were not solvable a few years ago. Moreover, progress in AI is widely believed to bring substantial social and economic benefits, and possibly to create unprecedented challenges. To prepare policy initiatives for the arrival of such technologies, policy-makers and other stakeholders need accurate forecasts and timelines that enable timely action.
However, there is still much uncertainty over how to assess and monitor the state, development, uptake and impact of AI as a whole, including its future evolution, progress and benchmarking capabilities. Measuring the performance of state-of-the-art AI systems on narrow tasks is useful and fairly easy to do. The assessment becomes really difficult when trying to map these narrow-task performances onto more general AI, and onto its impact on society in terms of benefits, risks, interactions, values, ethics, oversight of these systems, etc.
This workshop will welcome formalisations, methodologies and testbenches for the evaluation of AI systems, with the further goal of measuring the field's progress. More specifically, we are interested in theoretical or experimental research focused on the development of concepts, tools, and clear metrics and indicators to characterise and measure AI/ML systems, and on how these relate to, among others, metrics of intelligence (and other cognitive abilities) and rates of development, progress and impact.