In the AI science boom, beware: your results are only as good as your data
Machine-learning systems are voracious data consumers - but trustworthy results require more vetting both before and after publication.
February 1, 2024
Nature
Hunter Moseley
"We are in the middle of a data-driven science boom. Huge, complex data sets, often with large numbers of individually measured and annotated ‘features’, are fodder for voracious artificial intelligence (AI) and machine-learning systems, with details of new applications being published almost daily.
But publication in itself is not synonymous with factuality. Just because a paper, method or data set is published does not mean that it is correct and free from mistakes. Without checking for accuracy and validity before using these resources, scientists will surely encounter errors. In fact, they already have."