Voices: How has the AI boom impacted algorithmic biology?

Mona Singh, Cenk Sahinalp, Jianyang Zeng, Wei Vivian Li, Carl Kingsford, Qiangfeng Zhang, Teresa Przytycka, Joshua Welch, Jian Ma, and Bonnie Berger (2024) Voices: How has the AI boom impacted algorithmic biology? Cell Systems 15(6): P483-487.

The AI boom has affected algorithmic computational biology by bringing traditional algorithmic thinking closer to ML and AI techniques. This is particularly true in automated algorithm design, where AI is used to inform or predict aspects of an algorithm's design.

Many traditional algorithmic tasks, such as genomic sequence alignment or transcript assembly, are implemented as highly parameterized algorithms, where the parameter settings can significantly affect the accuracy of the output. These parameters can be hard to set by hand: doing so requires expertise, time, and a way to assess accuracy.

This work can be avoided, while simultaneously increasing reproducibility and accuracy, through new AI approaches that use large datasets to train AI models to predict input-specific parameters for traditional, hand-designed algorithms. Such systems are especially useful when analyzing large, heterogeneous collections of samples where hand selection of optimal parameters is not feasible.
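As a rough illustration of input-specific parameter prediction, the sketch below uses a nearest-neighbor lookup in place of a trained AI model. Everything in it is an assumption made for illustration: the feature names (mean read length, error rate), the parameter names (`gap_open`, `gap_extend`), the training values, and the function `predict_parameters` are hypothetical and do not come from any particular tool or from the work described here.

```python
import math

# Hypothetical training set: each entry pairs summary features of an input
# sample with the parameter setting that scored best for it in an offline
# search. All values are invented for illustration.
# Features: (mean read length in bases, estimated per-base error rate)
TRAINING = [
    ((100.0, 0.01), {"gap_open": 5, "gap_extend": 2}),
    ((150.0, 0.02), {"gap_open": 6, "gap_extend": 2}),
    ((10000.0, 0.10), {"gap_open": 4, "gap_extend": 1}),
]


def _scale(features):
    """Put both features on comparable scales before measuring distance."""
    length, error_rate = features
    return (math.log10(length), error_rate * 100.0)


def predict_parameters(features, training=TRAINING):
    """Return the parameter setting of the most similar training input.

    A 1-nearest-neighbor stand-in for a learned predictor: the new input
    inherits the setting that worked best on the closest known input.
    """
    query = _scale(features)
    _, best_params = min(
        training,
        key=lambda item: math.dist(query, _scale(item[0])),
    )
    return dict(best_params)  # copy so callers cannot mutate the training set


if __name__ == "__main__":
    # A long, noisy sample is matched to the long-read training example.
    print(predict_parameters((9000.0, 0.12)))
```

The predicted setting would then be passed to the unchanged, hand-designed algorithm. A real system would replace the lookup with a model trained on a large dataset, but the interface stays the same: features of the input go in, a parameter setting comes out.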

Future work in this area involves deeper co-design of parameterized algorithms and AI systems to enable AI-driven optimization, possibly with explicit support for selecting among various large-scale algorithmic changes. An additional challenge is to codify the definition of the desired output so that parameters can be optimized for biological insight and utility. This is related to a third challenge, which is avoiding overfitting: when selecting from a large parameter space, the optimizer can arrive at trivial solutions that technically optimize the quality of the output but are not useful (for example, if the optimization metric is the number of fragments aligned, selecting parameters that simply align all fragments poorly would satisfy the AI but not the biologist).
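The fragment-alignment example can be made concrete with a toy calculation. In the sketch below, the setting names, the per-fragment identity values, and the 0.9 identity threshold are all invented for illustration. It only shows how the choice of metric changes which parameter setting looks best.

```python
# Hypothetical per-fragment alignment identities under three parameter
# settings. None means the fragment was left unaligned. All numbers are
# invented for illustration.
RESULTS = {
    "strict":     [0.98, 0.97, None, 0.99, None],
    "moderate":   [0.98, 0.95, 0.91, 0.99, None],
    "permissive": [0.60, 0.55, 0.52, 0.58, 0.50],
}


def num_aligned(identities):
    """Naive metric: count every aligned fragment, however poor."""
    return sum(1 for x in identities if x is not None)


def num_aligned_well(identities, min_identity=0.9):
    """Stricter metric: count only alignments at or above min_identity."""
    return sum(1 for x in identities if x is not None and x >= min_identity)


def best_setting(results, score):
    """Pick the setting whose output maximizes the given score function."""
    return max(results, key=lambda name: score(results[name]))


if __name__ == "__main__":
    # The naive metric rewards aligning everything, even badly.
    print(best_setting(RESULTS, num_aligned))
    # The stricter metric prefers fewer but higher-quality alignments.
    print(best_setting(RESULTS, num_aligned_well))
```

Under the naive count the permissive setting wins because it aligns all five fragments, while the quality-aware count selects the moderate setting. The stricter metric is only one possible way to encode what a biologist wants, so the codification challenge described above remains.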
