A. M. Jones, G. Sahin, Z. W. Murdock, Y. Ge, A. Xu, Y. Li, D. Wu, S. Ni, P. H. Huang, K. Lekkala, L. Itti, USC-DCT: A Collection of Diverse Classification Tasks, Data, Vol. 8, No. 10, p. 153, MDPI, Oct 2023. [2022 impact factor: 2.6] (Cited by 1)
Abstract: Machine learning is a crucial tool for both academic and real-world applications. Classification problems are often used as the preferred showcase in this space, which has led to a wide variety of datasets being collected and utilized for a myriad of applications. Unfortunately, there is very little standardization in how these datasets are collected, processed, and disseminated. As new learning paradigms like lifelong or meta-learning become more popular, the demand for merging tasks for at-scale evaluation of algorithms has also increased. This paper provides a methodology for processing and cleaning datasets that can be applied to existing or new classification tasks as well as implements these practices in a collection of diverse classification tasks called USC-DCT. Constructed using 107 classification tasks collected from the internet, this collection provides a transparent and standardized pipeline that can be useful for many different applications and frameworks. While there are currently 107 tasks, USC-DCT is designed to enable future growth. Additional discussion provides explanations of applications in machine learning paradigms such as transfer, lifelong, or meta-learning, how revisions to the collection will be handled, and further tips for curating and using classification tasks at this scale.
Themes: Machine Learning, Computer Vision
Copyright © 2000-2007 by the University of Southern California, iLab and Prof. Laurent Itti.