Accelerating Data Analysis
Oceanographic data are collected in vast quantities worldwide, but they are often held exclusively by large institutions and require advanced data analysis and computer science skills to process. Ocean Discovery League’s overarching goal is to find ways to simplify, accelerate, and consolidate ocean data so that analysis is more accessible and the results are available to every researcher worldwide who can benefit from them.
In more than five decades of deep-ocean exploration, the oceanographic community has captured tens of thousands of hours of video and still imagery at considerable effort and expense. However, only a fraction of that video has been viewed and analyzed in its entirety, and even less has been shared with the global scientific community. Consolidation of underwater video data will catalyze a new era of discovery. It will allow us to harness the power of artificial intelligence and machine learning to gain new insights, make discoveries from existing data at scale, and create data-driven real-time tools for ocean exploration. Recent advances in cloud computing and advanced analytics have only now made these ambitions possible.
Our primary objectives in making ocean data analysis faster and more accessible are: (1) aggregating and analyzing ocean video using machine learning; (2) preserving and digitizing ocean video; (3) making ocean data more accessible to less technical audiences and to researchers without deep expertise in data processing and machine learning; and (4) providing training on data analysis as part of both our capacity building and sensors & systems programs.
We are currently partnering with Monterey Bay Aquarium Research Institute (MBARI) on FathomNet (Katija et al., 2022), a labeled image set for ocean species; on Ocean Vision AI (OVAI), a comprehensive tool to automatically identify organisms in deep-sea video; and on additional projects to accelerate the development of labeled data sets and begin gathering archived video from its original sources. Ocean Vision AI was recently awarded $5 million in NSF Convergence Accelerator funding. As the product is trained and informed at scale, it will grow into a real-time platform that can be introduced into the ocean video collection workflow, dramatically increasing the value of ocean observations for scientists and policymakers alike.
To complement the consolidation and analysis of ocean video, we are investigating initiatives to preserve as much original ocean video as possible. What little ocean video has been captured over the past 50 years is now trapped on physical tape, thousands of hours of it, in facilities and offices worldwide. This video degrades a little each day, and we may lose our only known observations of thousands of species. Because we failed to foresee what this video could contain, or how well we would one day be able to analyze it automatically, much of this rare footage was abandoned to collect dust on shelves. Now is the time to save as much of this original footage as possible.
Data capacity training is also a key component of our overall capacity and community programs. Aggregating data does not by itself make it accessible. Many large institutions have data management and IT teams that can assist with more complex data processing and analysis. Researchers without access to these resources need training and tools to support their data analysis efforts.
The impact of dramatically accelerating deep-ocean exploration will be transformative. Ultimately, an unprecedented amount of new ocean data will be gathered, helping to characterize the ocean for science-based decision-making worldwide. With data analysis platforms developed alongside the information-gathering solutions, we could almost immediately expand our understanding of the ocean's biodiversity.