Posted on 07 September 2020
Interview with Brent Wood
Brent Wood is a Principal Technician - GIS and Spatial Data Management at NIWA, working with data from across disciplines: climate, fisheries, atmosphere, coasts and oceans. He led the development of our Metadata Catalogue, where anyone can discover what datasets our research has produced. In this interview, Brent introduces us to the fascinating yet underrated world of metadata and the benefits of telling your data's story.
What is metadata and why is it important?
Most people know metadata as ‘data about data’, which is true but makes it sound boring – and that’s coming from a data scientist! A better explanation is that metadata is information that allows you to quickly determine whether the data is fit for your purpose. The best part is that you don’t have to spend hours accessing and analysing the dataset itself, which can be very time-consuming.
So, how does metadata help you discover data?
Metadata tells the story around the data. And it’s the story – made up of keywords, titles, descriptions, file formats, locations, dates etc – that can make discovering data and research much easier. If your catalogue has 300,000 entries, there’s no way anyone is ever going to manually look through them all. Search engines match your search terms against keywords in the metadata and rank the matching entries accordingly. Without metadata, it'll be very hard to find what you're looking for.
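To make the matching-and-ranking idea concrete, here is a minimal Python sketch. It is not how our catalogue or any real search engine is implemented: the `MetadataEntry` fields, the field weights and the sample entries are all assumptions made up for illustration. It simply counts how many search terms appear in each entry's keywords, title and description, and lists the best matches first.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataEntry:
    """A hypothetical, simplified catalogue entry (not a real catalogue schema)."""
    title: str
    description: str
    keywords: list[str] = field(default_factory=list)


def tokenize(text: str) -> set[str]:
    """Lower-case the text and split it into a set of alphanumeric words."""
    cleaned = "".join(ch.lower() if ch.isalnum() else " " for ch in text)
    return set(cleaned.split())


# Assumed weights: a keyword match counts more than a title or description match.
FIELD_WEIGHTS = {"keywords": 3, "title": 2, "description": 1}


def score(entry: MetadataEntry, query: str) -> int:
    """Sum the field weight for every query term found in each metadata field."""
    terms = tokenize(query)
    fields = {
        "keywords": tokenize(" ".join(entry.keywords)),
        "title": tokenize(entry.title),
        "description": tokenize(entry.description),
    }
    return sum(
        FIELD_WEIGHTS[name] * len(terms & words) for name, words in fields.items()
    )


def search(catalogue: list[MetadataEntry], query: str) -> list[MetadataEntry]:
    """Return entries that match at least one term, best match first."""
    scored = [(score(entry, query), entry) for entry in catalogue]
    matches = [(s, e) for s, e in scored if s > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in matches]


# Made-up entries for illustration only.
CATALOGUE = [
    MetadataEntry(
        title="Coastal water temperature records",
        description="Hourly sea temperature measurements from moored sensors.",
        keywords=["temperature", "coast", "ocean"],
    ),
    MetadataEntry(
        title="Fisheries catch survey",
        description="Catch counts by species; includes water temperature at each station.",
        keywords=["fisheries", "catch", "survey"],
    ),
    MetadataEntry(
        title="Rainfall observations",
        description="Daily rainfall totals from land stations.",
        keywords=["rainfall", "climate"],
    ),
]

if __name__ == "__main__":
    for entry in search(CATALOGUE, "ocean temperature"):
        print(score(entry, "ocean temperature"), entry.title)
```

With these sample entries, a search for "ocean temperature" ranks the coastal temperature dataset first, because both terms appear in its keywords. The fisheries survey follows, since it only mentions temperature in its description, and the rainfall dataset is left out entirely. An entry with no metadata would never score above zero, which is the point made above: without metadata, the data cannot be found.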
Could you say a metadata catalogue is kind of like a Google search engine?
Yep. Google is the ultimate metadata catalogue.
How did our metadata catalogue come about?
I’ve been building databases at NIWA since 2006, and Julie Hall [Challenge Director], who also has an interest in this space, gave me a call one day. She saw a need for Sustainable Seas to have a catalogue of the datasets being produced. So, we built a generic data management system for Sustainable Seas, together with two other National Science Challenges – Deep South and Our Land and Water.
We have many different people interested in our work, from researchers, policy makers, Māori partners and stakeholders to local community groups. Can anyone use it?
Anyone with an interest can use the catalogue. It is designed so that anyone should be able to look at a metadata entry and understand what the data is about. In some cases, they can access the actual dataset via a link, or at least find out who to contact for more information. Some of the datasets may be embargoed or have intellectual property rights associated with them, so the information in the metadata entry can help facilitate a connection with the dataset holder.
If I’m a researcher, why should I care about metadata?
Most scientists are keen to have their work built on by others. Metadata helps other researchers find your data. Conversely, it helps you find other datasets that can inform your own research. There’s an increasing trend for not just scientific publications to be published, but also for data to be published. The research world is now analysing, comparing, and collating emerging datasets, and this can provide us with richer insights and better information [See Editor’s note].
In terms of measuring academic success, it’s not just the number of people citing your paper; there is additional kudos from data reuse citations. Essentially, the 20th century was about publishing papers; the 21st century is about publishing data.
Editor’s note: Last year StatsNZ combined data from the 2018 General Social Survey with data from Environment Aotearoa 2019 to compare what New Zealanders thought about the state of our environment with empirical monitoring data. 74% said we have an issue with the state of oceans and sea life, and most cited household waste or sewage/stormwater discharge as the cause. However, the environmental reporting data shows that the significant pressures on our marine environment are fishing, pollution, climate change and invasive species.