Accelerating Data Analysis
In more than three decades of deep-ocean exploration, tens of thousands of hours of video and still imagery have been captured by the oceanographic community at considerable effort and expense. However, only a fraction of that video has been viewed and analyzed in its entirety, and even less shared with the global scientific community.
Consolidating underwater video data will catalyze a new era of discovery. It will allow us to harness artificial intelligence and machine learning to draw new insights from existing data at scale and to create data-driven, real-time tools for ocean exploration. Only with recent advances in cloud computing and analytics have those ambitions become possible.
Ocean expeditions that record visual observations of ocean biodiversity are rare. Because exploration tools and ships are costly and access to them is limited, only ~3% of our world’s oceans have been visually observed, and only a tiny fraction of that video has been viewed and analyzed.
Currently, after an oceanographic expedition that collects underwater video observations, the video is usually transferred to a hard drive, given to the lead scientist, and physically carried back to their lab to be manually viewed and processed. A dive recording can last anywhere from a couple of hours to 72 hours or longer. While the dive's primary target may be a specific area of the seafloor, species are encountered and recorded throughout the dive: as the vehicle enters the water, transits to the bottom, and returns to the surface for recovery. The vehicles often also record geographic location, temperature, depth, salinity, and other critical environmental data that can be cross-referenced to every video frame.
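As an illustration of how environmental data might be cross-referenced to video frames, the minimal sketch below matches each frame to the nearest sensor reading by time. It is not a description of any existing pipeline: the `SensorSample` record layout, the function names, and the constant frame rate are all assumptions made for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class SensorSample:
    # Hypothetical record layout; real vehicle logs will differ.
    t: float              # seconds since the start of the recording
    depth_m: float
    temperature_c: float
    salinity_psu: float


def nearest_sample(samples, t):
    """Return the sample whose timestamp is closest to t.

    `samples` must be non-empty and sorted by timestamp.
    """
    times = [s.t for s in samples]
    i = bisect_left(times, t)
    if i == 0:
        return samples[0]
    if i == len(samples):
        return samples[-1]
    before, after = samples[i - 1], samples[i]
    # On a tie, prefer the earlier reading.
    return before if t - before.t <= after.t - t else after


def sample_for_frame(samples, frame_index, fps):
    """Map a frame index to its nearest sensor sample (assumes constant fps)."""
    return nearest_sample(samples, frame_index / fps)


# Example: three readings taken 10 seconds apart, video at 30 frames per second.
log = [
    SensorSample(0.0, 0.0, 18.0, 33.5),
    SensorSample(10.0, 12.0, 15.5, 33.7),
    SensorSample(20.0, 25.0, 12.0, 34.0),
]
print(sample_for_frame(log, 360, 30.0))  # frame 360 is 12 s into the recording
```

For long dives, the list of timestamps would be built once rather than on every lookup; it is rebuilt here only to keep the sketch short.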
In a single dive, thousands of species could have been recorded in new locations, at new depths, and in new environmental conditions. A wealth of untapped biodiversity information is trapped in these recordings, and new solutions are required to reveal it.
Expedition video currently resides with many different organizations, each of which must be contacted individually to gain access, and it often sits on hard drives collecting dust on office shelves. If that video is ever examined, scientists must use manual annotation tools to hand-tag every organism, painstakingly identifying only those they know and leaving the rest uncatalogued. Most tools currently available to researchers require an engineer to set up and use.
The primary technology challenge facing artificial intelligence solutions for ocean video analysis is automating those manual processes. This includes automatically localizing, or detecting, objects within frames of video (e.g., drawing bounding boxes) and automating the initial classification of those objects. Both detection and classification require an extensive labeled training data set of ocean imagery.
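To make the idea of a bounding box concrete, the sketch below represents a detection as a labeled rectangle and scores how well an automated box agrees with a hand-drawn one using intersection over union (IoU). IoU is a widely used overlap measure, but the text above does not say which metric this project uses, so treat the choice, along with the `Box` class and its field names, as assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    # Pixel coordinates: top-left (x1, y1) and bottom-right (x2, y2).
    x1: float
    y1: float
    x2: float
    y2: float
    label: str = "unknown"  # initial classification, if one has been assigned

    def area(self):
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def iou(a, b):
    """Intersection over union of two boxes, in the range 0.0 to 1.0."""
    ix1, iy1 = max(a.x1, b.x1), max(a.y1, b.y1)
    ix2, iy2 = min(a.x2, b.x2), min(a.y2, b.y2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


# Example: a hand-drawn annotation compared with a hypothetical detector output.
annotated = Box(0, 0, 10, 10, label="jellyfish")
detected = Box(5, 5, 15, 15)
print(round(iou(annotated, detected), 3))
```

A score of 1.0 means the two boxes coincide exactly, and 0.0 means they do not overlap at all.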
Our goal is to build a globally accessible platform where thousands of hours of deep-ocean video can be automatically transcoded, scanned for objects, and given an initial classification with minimal human intervention. As the product is trained and informed at scale, it will grow into a platform that can be introduced into the ocean video collection workflow in real time, dramatically increasing the value of ocean observations for the entire scientific community. This is the vision we are working to make a reality.
We plan to use machine learning to rapidly identify and analyze all visual observations (and associated environmental metadata) made of species in the ocean, dramatically expanding our understanding of global ocean biodiversity. This tool will be critical in gathering, viewing, analyzing, and classifying vast underwater video and environmental data archives, including the thousands of hours that we and our collaborators collect, making discoveries available to the entire scientific community and the public.
In 2020, we began initial research on creating a master visual reference database of every known marine organism on Earth by scraping existing online databases, and on developing generic object detection algorithms to automatically identify objects in underwater video. In 2022 and beyond, we plan to (1) dramatically expand the master reference database by scraping additional databases of both ocean imagery and environmental data and verifying images for use in classification algorithms; (2) continue developing generic object detection algorithms and begin developing classification algorithms based on the master reference database; and (3) begin aggregating underwater video and environmental data with participating partners.
The discoveries resulting from this analysis will help countless scientists and ocean aficionados worldwide develop a new understanding of species distribution and characteristics. They could also contribute to the cases being made for various marine protected areas by global conservation efforts.
We are currently working with the Monterey Bay Aquarium Research Institute (MBARI) on FathomNet, a labeled image set for ocean species; on Ocean Vision AI (OVAI), a comprehensive tool to automatically identify organisms in deep-sea video; and on additional projects to accelerate the creation of labeled data sets and begin gathering archived video from its original sources.