Ariful Azad, an assistant professor of intelligent systems engineering at the School of Informatics, Computing, and Engineering, has been awarded $150,000 by the Lawrence Berkeley National Laboratory to develop High-performance Markov Clustering (HipMCL), a communication-efficient parallel algorithm for analyzing large-scale microbial data containing billions of proteins from a mixture of thousands of microbial organisms. The work is part of the ExaBiome project under the Department of Energy’s Exascale Computing Project (ECP).
Metagenomes are made up of thousands or millions of microbial genomes, many of which have never been identified or analyzed but may contain genes useful for applications in energy, the environment, and healthcare. Using HipMCL, researchers can identify and characterize novel aspects of microbial communities and discover patterns that will increase our understanding of critical proteins.
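For readers unfamiliar with Markov clustering, the general idea behind the method can be sketched in a few lines of Python. This is a minimal serial illustration under stated assumptions, not HipMCL's parallel implementation: the function names, the inflation value of 2.0, the thresholds, and the six-protein toy network are all hypothetical choices made for the example. The sketch simulates random walks on a similarity network by alternating "expansion" (squaring the transition matrix) with "inflation" (raising entries to a power and renormalizing), which strengthens flow inside tightly connected groups.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (a column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        total = sum(out[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                out[i][j] /= total
    return out


def matmul(a, b):
    """Dense matrix product of two n x n matrices stored as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def markov_cluster(adjacency, inflation=2.0, max_iters=100, tol=1e-9):
    """Toy serial Markov clustering; parameter defaults are assumptions."""
    n = len(adjacency)
    # Add self-loops, then turn similarities into transition probabilities.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iters):
        expanded = matmul(m, m)  # expansion: take two random-walk steps
        inflated = normalize_columns(  # inflation: favor strong connections
            [[x ** inflation for x in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows with weight on the diagonal act as attractors; the columns they
    # reach form one cluster. A set removes duplicate clusters.
    clusters = set()
    for i in range(n):
        if m[i][i] > 1e-6:
            clusters.add(frozenset(j for j in range(n) if m[i][j] > 1e-6))
    return sorted(sorted(c) for c in clusters)


# Hypothetical network: two separate triangles of mutually similar proteins.
toy_network = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(markov_cluster(toy_network))  # prints [[0, 1, 2], [3, 4, 5]]
```

The toy network is deliberately trivial so the result is easy to check by hand. Real protein similarity networks are far larger and mostly connected, and the dense matrix product used here would not be practical at that scale; making that step work on sparse matrices spread across many processors is the kind of problem the project addresses.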
“I’m thrilled to receive this grant from the Lawrence Berkeley National Laboratory,” Azad said. “Clustering microbial proteins and understanding their functions have huge impacts in bioenergy and environmental research. The availability of huge volume of metagenomic data made this an exascale problem. We will develop efficient clustering algorithms that scale to hundreds of thousands of processors.”
The grant will enable Azad and his team to reduce communication among processors by using communication-avoiding algorithms, improve in-node performance through faster algorithms, and scale the HipMCL software to more than 2,000 nodes so that it can cluster larger protein networks.
“It is well known that communication among processors is the main performance bottleneck in most extreme-scale applications,” Azad said. “We will develop algorithms that avoid unnecessary communication among processors to scale HipMCL to upcoming exascale systems.”
Azad joined the faculty at SICE in 2018 after serving as a research scientist in the Computational Research Division of the Lawrence Berkeley National Laboratory. His research interests include parallel graph algorithms, high-performance computing, data-intensive computing, and bioinformatics.
“Developing methods to analyze genomic data more efficiently will be a big step forward in our understanding of how proteins impact one another,” said Kay Connelly, the associate dean for research at SICE. “This grant complements Dr. Azad’s work with IU’s Precision Health Initiative, a major initiative within the University and SICE to improve healthcare.”