Data audit framework - CNI 2008


I attended a session at CNI 2008 that described the Data Audit Framework (DAF), a JISC-funded initiative. Four UK universities are participating in the pilot. If you don't want to read the rest of these notes, I'd say it all comes down to one bullet for me: taking on something like a DAF "enables the development of a data strategy." Sounds painfully simple, but this is something I spend a fair amount of time worrying about. As faculty develop and create data associated with their funded research, how can we establish a culture of concern (thanks Candy Yekel) regarding the stewardship and curation of that data? In my travels around campus, I've come to believe that many people think these issues are simply being "handled." In some cases they are; in many others I expect they aren't. I liked the concept and execution of this framework because, as developed, it is non-threatening but can yield positive outcomes for better managing (protecting, sharing, curating, archiving) research data. The raw notes follow.

"Enables the development of a data strategy"
Two modes:
1. improve awareness of assets

  • capacity planning
  • facilitate sharing and reuse
  • avoid leaks

2. recognition of practices

  • more efficient use and improved research workflows
  • enables development of a strategy
  • enables risk management, which matters as more agencies start to require a curation plan

Plan the audit:

  • Identifying and classifying assets
  • Assessing mgt of data assets
  • Reporting and recommendations
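
One way to picture the three audit steps above is a tiny inventory script. This is only an illustrative sketch, not part of the DAF itself: the `DataAsset` fields, the category labels, the sample inventory, and the two checks in `assess` are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class DataAsset:
    # Hypothetical fields for illustration; a real audit defines its own forms.
    name: str
    owner: str
    category: str           # assumed labels, e.g. "vital", "important", "minor"
    has_backup: bool
    has_deposit_plan: bool


def classify(assets):
    """Step 1: group the identified assets by category."""
    groups = {}
    for asset in assets:
        groups.setdefault(asset.category, []).append(asset)
    return groups


def assess(asset):
    """Step 2: list the management gaps found for one asset."""
    gaps = []
    if not asset.has_backup:
        gaps.append("no backup")
    if not asset.has_deposit_plan:
        gaps.append("no place of deposit")
    return gaps


def report(assets):
    """Step 3: produce one report line per asset that has at least one gap."""
    lines = []
    for asset in assets:
        gaps = assess(asset)
        if gaps:
            lines.append(
                f"{asset.name} ({asset.owner}, {asset.category}): " + ", ".join(gaps)
            )
    return lines


# Made-up sample inventory, purely for demonstration.
inventory = [
    DataAsset("survey-2007", "Lab A", "vital", has_backup=True, has_deposit_plan=False),
    DataAsset("field-photos", "Lab B", "minor", has_backup=False, has_deposit_plan=False),
    DataAsset("core-samples", "Lab A", "important", has_backup=True, has_deposit_plan=True),
]

for line in report(inventory):
    print(line)
```

Even a flat list like this makes the point of the notes: until the assets are written down somewhere, nobody can say which ones lack a backup or a place of deposit.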

Three of four pilot schools (didn't get the fourth one):

  • Geosciences at Edinburgh
  • Innovative Design and Manufacturing Research Centre at Bath
  • Glasgow University Archaeological Research Division

Main findings:

  • lots of data being created
  • few policies for creation, storage and mgt
  • researchers unsure where to begin
  • unaware of available campus or federal support
  • often no place of deposit or funds for preservation

Implementation tips:

  • scope audit carefully
  • timing needs to be appropriate
  • find an advocate in the department
  • collect information at once where possible
  • general discussion to build rapport, communicate purpose of audit and understand organization

Conclusions:

  • Premise of project is demonstrated: lack of knowledge about what one has and poor understanding of how to curate it are prevalent
  • Key areas for future work:
      • support on policies, stds, best practice
      • skills are lacking - basic training & professional workforce needed
      • develop robust, sustainable federal infrastructure for curation
      • recognition of data and active push for sharing and reuse critical

2 Comments

MARK CHARLES SAUSSURE said:

Wow, this looks like it was a great presentation.

I think collaborating with work units or departments that are keenly interested in "really" finding out what they have could be a win for everyone. I'd welcome the chance to talk with these folks.

Over here we're preparing for a deluge of data from all parts of the University, but we have no way to know how much there will be or when we'll be asked to preserve it and make it discoverable. I want to add that we can propose the tools to make data discoverable, but the ultimate decision about its value and how it becomes discoverable (the metadata) is up to owners and curators (librarians).

Ok, now my soapbox (short)
Please think about how we are going to plan and manage this data over time (maybe forever?). The last thing IMHO we want to create is yet another "repository" that becomes a silo that is difficult or impossible to federate. We need a system that's standards-based, where the repositories and storage hardware are decoupled and the only time we are required to move data around is to transform it to a readable version based on current technology, not because of vendor or platform end-of-life issues. And it's easily done because the proper metadata is associated with the data objects. XAM Initiative (OK, I have to stop)

Thanks!

I couldn't agree with your soapbox more. I do think that whatever is done at the institutional level, however, is likely to always be outstripped by the pace of capability expansion by individuals. Given that, the bigger institutional risk is not having an awareness of these issues. Indeed, anything done institutionally has to be scalable, affordable, extensible and, most importantly, interoperable, which of course means *open* standards-compliant.


About this Entry

This page contains a single entry by KEVIN M MOROONEY published on December 10, 2008 9:12 PM.

