Assessing the stewardship maturity of individual datasets is an essential part of ensuring and improving the way datasets are documented, preserved, and disseminated to users. It is a critical step towards meeting U.S. federal regulations, organizational requirements, and user needs. However, it is challenging to do so consistently and quantifiably. The Data Stewardship Maturity Matrix (DSMM), developed jointly by NOAA’s National Centers for Environmental Information (NCEI) and the Cooperative Institute for Climate and Satellites–North Carolina (CICS-NC), provides a uniform framework for consistently rating the stewardship maturity of individual datasets in nine key components: preservability, accessibility, usability, production sustainability, data quality assurance, data quality control/monitoring, data quality assessment, transparency/traceability, and data integrity. So far, the DSMM has been applied to over 800 individual datasets that are archived and/or managed by NCEI, in support of NOAA’s OneStop Data Discovery and Access Framework Project. As part of the OneStop-ready process, tools, implementation guidance, workflows, and best practices have been developed to assist in applying the DSMM; they are described in this paper. The DSMM ratings are also consistently captured in ISO standard-based dataset-level quality metadata and in citable quality descriptive information documents, which serve as interoperable quality information for both machine and human end-users. These DSMM implementation and integration workflows and best practices could be adopted by other data management and stewardship projects or adapted for applications of other maturity assessment models.
Published on 2019-08-23 10:15:58
Tag: Data Analytics
Leveraging resource management for efficient performance of Apache Spark
Apache Spark is one of the most widely used open source processing frameworks for big data; it allows large datasets to be processed in parallel across a large number of nodes. Often, applications of this framework u...
Positive and negative association rule mining in Hadoop’s MapReduce environment
In this paper, we present a Hadoop implementation of the Apriori algorithm. Using Hadoop’s distributed and parallel MapReduce environment, we present an architecture to mine positive as well as negative associ...
Memetic particle swarm optimisation for missing value imputation
International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 3, Page 273-289, January 2019.
Feature selection methods for document clustering: a comparative study and a hybrid solution
International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 3, Page 246-272, January 2019.
Stellar mass black hole optimisation for utility mining
International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 3, Page 222-245, January 2019.
A comparative study of unsupervised image clustering systems
International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 3, Page 197-221, January 2019.
Review on recent developments in frequent itemset based document clustering, its research trends and applications
International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 2, Page 176-195, January 2019.
Enhancing the involvement of decision makers in data mart design
International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 2, Page 148-175, January 2019.
A new feature subset selection model based on migrating birds optimisation
International Journal of Data Analysis Techniques and Strategies, Volume 11, Issue 2, Page 133-147, January 2019.