Scientists in China have developed an advanced machine learning (ML) system that can accurately predict outbreaks of cryptocaryoniasis, one of aquaculture’s most devastating parasitic diseases.
The disease, commonly known as white spot disease, is caused by the ciliate parasite Cryptocaryon irritans.
The new tool, validated in both commercial open-sea cages and recirculating aquaculture systems (RAS), is now accessible via a free, open-source web platform, offering a real-time early warning to the global industry.
The study, conducted by researchers from Ningbo University, the University of Copenhagen and Nord University, emphasised the urgent need for such a tool. Disease outbreaks, especially those caused by parasites like Cryptocaryon irritans, are a major barrier to sustainable marine fish farming. They erode productivity and profitability, yet until now there has been no effective early warning system built on modern technology.
The research team published their findings after integrating seven years of disease surveillance data with 17 high-resolution oceanographic factors that influence the parasite’s life cycle along the Chinese coast.
A major threat to marine fish
Cryptocaryon irritans is an obligate ectoparasite, meaning it must live on a host. It rapidly invades the gills, skin, and fins of fish, leading to severe necrosis, respiratory failure, and death. Its life cycle is explosive: a single resting stage, called a tomont, can release hundreds of infective theronts within days, especially in warm water.
Once clinical symptoms are visible, the tissue damage is often irreversible and treatment options are limited. The disease’s severe economic impact has long made it a primary concern for high-value marine species like large yellow croaker and groupers, which are intensively cultivated in Asian coastal zones.
The scientists noted that traditional methods of prediction are challenged by the non-linear, complex relationship between the parasite and its environment. Outbreaks show predictable seasonal patterns, typically peaking between May and July, and again between September and November, but are also strongly affected by variables like water temperature, salinity and pH.
Machine learning provides a breakthrough
To overcome the limitations of traditional statistical methods, the team turned to machine learning. They trained and tested five different ML models — logistic regression (LR), support vector machine (SVM), random forest (RF), XGBoost (XGB), and artificial neural network (ANN) — using 429 outbreak records collected from 2016 to 2023.
The most successful model proved to be the RF algorithm, which achieved a sensitivity of 98.61%. This meant it was excellent at correctly identifying true outbreaks. Its overall prediction accuracy was also consistently high — a key factor for a reliable early warning system.
In the final model performance assessment, the RF model achieved an F1 score of 0.9281 on the historical data test set. While the XGBoost model achieved a slightly higher F1 score of 0.9388, the RF model was ultimately chosen for deployment due to its superior sensitivity and robust performance across both field and recirculating aquaculture system (RAS) validation trials.
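The trade-off between sensitivity and F1 score described above can be illustrated with a minimal sketch. The confusion counts below are hypothetical, chosen only to show how one model can have the higher F1 score while another catches more true outbreaks; they are not the study's data, and the names `ConfusionCounts`, `model_a` and `model_b` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts from a binary outbreak / no-outbreak test set."""
    tp: int  # outbreaks correctly predicted as outbreaks
    fp: int  # non-outbreaks wrongly predicted as outbreaks (false alarms)
    fn: int  # outbreaks the model missed
    tn: int  # non-outbreaks correctly predicted as non-outbreaks


def sensitivity(c: ConfusionCounts) -> float:
    """Share of true outbreaks that the model caught (also called recall)."""
    return c.tp / (c.tp + c.fn)


def precision(c: ConfusionCounts) -> float:
    """Share of predicted outbreaks that were real outbreaks."""
    return c.tp / (c.tp + c.fp)


def f1_score(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and sensitivity."""
    p, r = precision(c), sensitivity(c)
    return 2 * p * r / (p + r)


# Hypothetical counts, NOT taken from the study.
model_a = ConfusionCounts(tp=39, fp=7, fn=1, tn=53)  # few misses, more false alarms
model_b = ConfusionCounts(tp=36, fp=1, fn=4, tn=59)  # more misses, few false alarms

for name, counts in (("model_a", model_a), ("model_b", model_b)):
    print(f"{name}: sensitivity={sensitivity(counts):.3f}, F1={f1_score(counts):.3f}")
```

In this illustration `model_b` has the higher F1 score but misses more outbreaks than `model_a`. For an early warning system, where a missed outbreak is usually costlier than a false alarm, that is a plausible reason to prefer the more sensitive model.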
The system’s true test came from validation in real-world settings. In commercial open-sea cage farms, the RF model correctly predicted disease occurrence with an accuracy of 91.67%. It was also highly effective in a controlled RAS trial, achieving 93.75% accuracy.
The field trials confirmed the model’s reliability, demonstrating its ability to deliver accurate forecasts under operational aquaculture conditions.
Identifying key risk factors
The ML process also helped rank the most influential factors driving outbreaks. The analysis highlighted stocking density (the amount of fish in a given area), water temperature (a primary driver of the parasite’s life cycle), salinity and pH (which affect the parasite’s ability to encyst and reproduce), and three novel predictors: silicate, nitrate, and dissolved oxygen.
The findings aligned with existing biological knowledge: for instance, high water temperature accelerates the parasite’s life cycle, with peak virulence between 24°C and 27°C. Critically, the model also flagged new, less-understood factors such as the concentrations of silicate and nitrate, which may influence the formation of the parasite’s protective cyst wall.
The model validated what was already known about temperature and density, and also prompted new research into how nutrient dynamics, such as silicate and nitrate levels, are linked to parasite development.
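The article does not say how the researchers computed the ranking of risk factors. One common, model-agnostic approach is permutation importance: shuffle one input column at a time and measure how much the model's accuracy drops. The sketch below assumes a toy rule-based classifier (`toy_model`) and synthetic data as stand-ins; the feature names and thresholds are illustrative only and do not reproduce the study's method.

```python
import random


def toy_model(row):
    """Hypothetical stand-in classifier: uses temperature and density only."""
    warm = 24.0 <= row["temperature_c"] <= 27.0
    crowded = row["stocking_density"] > 0.5
    return 1 if (warm and crowded) else 0


def accuracy(model, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    hits = sum(model(r) == y for r, y in zip(rows, labels))
    return hits / len(rows)


def permutation_importance(model, rows, labels, feature, rng, repeats=20):
    """Mean drop in accuracy when the values of one feature are shuffled."""
    baseline = accuracy(model, rows, labels)
    drops = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        permuted = [{**r, feature: v} for r, v in zip(rows, shuffled)]
        drops.append(baseline - accuracy(model, permuted, labels))
    return sum(drops) / repeats


# Synthetic data: values and ranges are assumptions for illustration.
rng = random.Random(0)
rows = [
    {
        "temperature_c": rng.uniform(15.0, 32.0),
        "stocking_density": rng.random(),
        "unused_feature": rng.random(),  # the toy model never reads this
    }
    for _ in range(500)
]
labels = [toy_model(r) for r in rows]  # labels generated by the toy rule itself

ranking = sorted(
    ((f, permutation_importance(toy_model, rows, labels, f, rng)) for f in rows[0]),
    key=lambda item: item[1],
    reverse=True,
)
for feature, drop in ranking:
    print(f"{feature}: mean accuracy drop {drop:.3f}")
```

A feature the model never uses shows no accuracy drop when shuffled, while features the model depends on show a clear drop. This is the intuition behind ranking predictors by importance.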
Open-source platform lowers entry barrier
A major goal of the project was accessibility. The team deployed the selected RF model on a free, user-friendly, web-based platform that eliminates the need for expensive, on-site water quality sensors, which makes it particularly valuable for small-scale farms and developing regions.
Most small-scale farms cannot afford real-time sensor technologies. The researchers addressed this by integrating their system with the Copernicus Marine Service, which provides freely accessible, high-resolution oceanographic and atmospheric data through open APIs.
Users can simply input the farm’s location coordinates and stocking density. The platform then automatically fetches the relevant environmental data from the Copernicus services to generate a real-time risk forecast.
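The workflow described above (user inputs plus automatically fetched environmental data, passed to a model) might be sketched as follows. Everything here is an assumption made for illustration: `FarmQuery`, `forecast_risk`, `fake_fetch` and `fake_model` are hypothetical names, the fetch function is a stub rather than a real Copernicus API call, and the scoring rule is a placeholder, not the published random forest.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class FarmQuery:
    """The two inputs the article says a user provides."""
    latitude: float
    longitude: float
    stocking_density: float  # units are an assumption; the article does not state them


def forecast_risk(
    query: FarmQuery,
    fetch_environment: Callable[[float, float], Dict[str, float]],
    predict_probability: Callable[[Dict[str, float]], float],
) -> float:
    """Merge user inputs with fetched environmental data and score them."""
    features = dict(fetch_environment(query.latitude, query.longitude))
    features["stocking_density"] = query.stocking_density
    probability = predict_probability(features)
    if not 0.0 <= probability <= 1.0:
        raise ValueError("model must return a probability in [0, 1]")
    return probability


def fake_fetch(latitude: float, longitude: float) -> Dict[str, float]:
    """Stub standing in for a call to an oceanographic data service."""
    return {"temperature_c": 25.5, "salinity_psu": 30.0, "ph": 8.1}


def fake_model(features: Dict[str, float]) -> float:
    """Placeholder rule with invented probabilities, NOT the trained model."""
    in_peak_range = 24.0 <= features["temperature_c"] <= 27.0
    return 0.8 if in_peak_range else 0.2


# Hypothetical farm location and stocking density.
risk = forecast_risk(FarmQuery(29.9, 122.0, 10.0), fake_fetch, fake_model)
print(f"Estimated outbreak risk: {risk:.0%}")
```

Passing the data source and the model in as functions keeps the sketch independent of any particular service or algorithm, so either stub could be swapped for a real implementation.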
The platform includes four core functions: data management (a history of outbreak records), model information (performance metrics for all tested models), real-time disease prediction, and an epidemic map (a dynamic, colour-coded map showing weekly outbreak probabilities along the coastline).
A flexible foundation for global aquaculture
The system is currently focused on Cryptocaryon irritans in China, but the scientists designed its architecture to be modular and scalable.
The framework is robust and flexible: it can be adapted to forecast other significant aquatic parasitic diseases (such as infection with Ichthyophthirius multifiliis) and could support precision health management across diverse global aquaculture systems.
The study marks a significant step toward embedding systematic disease prevention into aquaculture governance, supporting a more sustainable and economically resilient “blue food” system worldwide.
According to the researchers, the next phase of research should focus on integrating real-time host and pathogen data — such as fish immune status and parasite prevalence — to further enhance the model’s precision.
Source: National Center for Biotechnology Information “A machine learning-driven early warning system for cryptocaryoniasis in marine aquaculture” https://doi.org/10.1186/s13071-025-07124-z Authors: Xiao Xie, et al.



