A central goal of artificial intelligence (AI) is to develop algorithms and software that can solve complex problems as a human would. AI is poised to have a significant impact on making discoveries in biomedical big data, given the availability of powerful algorithms, visualization methods, and high-performance computing. We introduce here the Exploratory Modeling for Extracting Relationships using Genetic and Evolutionary Navigation Techniques (EMERGENT) algorithm as an AI approach for the large-scale genetic analysis of common human diseases. EMERGENT builds models of genetic variation from lists of mathematical functions using computational evolution. A key feature of the system is its use of pre-processed expert knowledge, which lets it explore model space much as a human would. We applied EMERGENT to the genetic analysis of glaucoma in 1,272 subjects with the disease and 1,057 healthy controls. A total of 657,366 single-nucleotide polymorphisms (SNPs), or features, from across the human genome were measured in these subjects and available for analysis. EMERGENT revealed a best model consisting of six SNPs that map to at least six different genes. Two of these genes have previously been associated with glaucoma. The others represent new hypotheses. All of the SNPs are involved in non-additive gene-gene interactions. Further, the six genes are all directly or indirectly related through biological interactions to the vascular endothelial growth factor (VEGF) gene, which is an actively investigated drug target. This study demonstrates the routine application of AI to biomedical big data.
Jason Moore is the Edward Rose Professor of Informatics and Director of the Penn Institute for Biomedical Informatics. He also serves as Senior Associate Dean for Informatics and Director of the Division of Informatics in the Department of Biostatistics and Epidemiology. He came to Penn in 2015 from Dartmouth, where he was the Director of the Institute for Biomedical Informatics. Prior to Dartmouth, he served as Director of the Advanced Computing Center for Research and Education at Vanderbilt University. He has a Ph.D. in Human Genetics and an M.S. in Applied Statistics from the University of Michigan. He leads an active NIH-funded research program focused on the development of artificial intelligence and machine learning algorithms for the analysis of complex biomedical data, with a focus on genetics and genomics. He is an elected fellow of the American Association for the Advancement of Science (AAAS), an elected fellow of the American College of Medical Informatics (ACMI), and was selected as a Kavli fellow of the National Academy of Sciences.