effective data management strategies

Looking back: September 3 – 22, 2013

Most of my time and brain power this month has gone to thinking about data management methods and instructional design for effective training. Some of this was in preparation for the Data Information Literacy (DIL) Symposium at Purdue, the rest in developing the workshop/lab curriculum and materials for our December pilot. Since I’m finishing this post after the DIL Symposium, I can say that there was some fantastic discussion about data information literacy competencies and how to teach them effectively. More on that soon…

Data management workshops

Curriculum – I’ve been reviewing what is out there for some time, but only recently began to compare the various structures. Most training curricula cover the same topics, but there isn’t yet consensus on how these topics are organized and framed. One of the more challenging issues in comparing curricula is mapping the various modules onto a common framework (mine). Several groups are organizing their content around the NSF Data Management Plan requirement. I find this a bit arbitrary and not particularly useful for researchers who are not subject to it, though it does address an existing requirement. I prefer to focus data management training on skills, methods, and strategies that help researchers in their day-to-day work while also improving their efficiency. In my mind, it makes sense to organize this information around the research process as researchers understand and discuss it. That is why I’ve chosen to structure my curriculum around the DataONE data life cycle: it represents data management activities from the perspective of the researcher, and I think its diagram is currently the best fit for a general research process.

Identifying effective strategies/best practices – A more extensive lit review is in progress, but I wanted to proceed with the pilot workshops as soon as possible. So the strategies are based on an initial, though still fairly extensive, lit review across several social science domains plus computer science and statistics. The fuller review will factor into future revisions, I’m sure. In any case, there is good consensus around electronic resource management (file management, organization, versioning, etc.) but less around the more context-specific aspects of research data management.

Instructional design – This has been relatively easy because I’m retreading familiar territory. However, I’m finding that the best practices I’ve used in past trainings are more difficult to map onto data management issues. In part, this is due to the variety across domains and our lack of standard terminology when we talk about data management. The bench scientist has a different vocabulary than a sociologist, who has a different vocabulary than a social worker, etc.

Sample datasets & activities – We are currently identifying and reviewing sample datasets. We will do our best to use a representative sample of data, including numeric, textual, and image data, gathered in a variety of ways – surveys, interviews, observation, or lab work.

DIL Symposium Poster

The poster is available in our repository, IUPUIScholarWorks. Don’t forget to check out the other great resources from the Symposium as well.

By Heather L. Coates
