Building your own data life cycle

During the Data-driven ecological synthesis class, there is a cool little exercise I do when we start discussing data management. After reading through @Mich15, I ask learners to think about a specific project they are working on, and to draw the data life cycle that best describes it, as honestly as possible. In many cases, the diagram is little more than a series of jumps from “collect” to “analyze”. This is a good starting point.

The second part of the exercise is to draw the data life cycle they would like to end up with. In most cases, this is not the usual eight-step data life cycle, but something that reflects the reality of a specific project. We usually follow this with a group discussion.

There are two important points in this exercise.

First, unless you know that there is a “formal” way of representing the movement of data throughout a project, decomposing the process into steps is not really intuitive. Second, there is no single right way to manage data that would work for everyone and for every project. Instead, each project should have its own data life cycle, tailored to the steps it actually involves.

One thing I will add to this exercise this year is a sort of annotation of the data life cycle diagram. In a sense, moving from one step to the next requires a series of tools. So knowing the type of data life cycle you want means that you can prioritize the tools you need to learn in order to get there. This will most likely help with understanding how all of these things fit together.
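
To make the annotation idea concrete, here is a minimal sketch of what an annotated life cycle could look like if written down as data rather than drawn. Everything in it is an assumption for illustration: the step names other than “collect” and “analyze”, the tool names, and the `prioritize_tools` helper are all hypothetical, and a real project would substitute its own.

```python
from collections import Counter

# Hypothetical annotated life cycle: each transition (from_step, to_step)
# lists the tools needed to make that move. Names are examples only.
desired_cycle = {
    ("collect", "clean"): ["spreadsheet", "R"],
    ("clean", "describe"): ["R", "metadata editor"],
    ("describe", "analyze"): ["R", "git"],
    ("analyze", "archive"): ["git", "data repository"],
}


def prioritize_tools(cycle, known=()):
    """Rank tools not yet known by the number of transitions that need them."""
    counts = Counter(
        tool
        for tools in cycle.values()
        for tool in tools
        if tool not in known
    )
    # Most-used tools first; ties are broken alphabetically.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Assume the learner already knows how to use a spreadsheet.
priorities = prioritize_tools(desired_cycle, known={"spreadsheet"})
for tool, n_transitions in priorities:
    print(f"{tool}: needed for {n_transitions} transition(s)")
```

Counting how many transitions depend on each tool is only one possible way to prioritize; it ignores how hard each tool is to learn, which may matter just as much in practice.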