Genes + artificial intelligence: where is precision medicine headed?

The genetics industry has been running hot lately. On the 17th of last month, BGI (Huada Gene) announced a new business unit built around artificial intelligence, prompting speculation across the industry. Then, on July 29, CCTV turned its attention to precision medicine and devoted extensive coverage to genetic testing. Both events put genes in the spotlight. During the same period, a Canadian company called Deep Genomics was quietly founded and quickly made headlines in major foreign media, although it has received little coverage in China.

So what does this company do, and what makes it stand out? Let us first look at how foreign media have described it. Canada's Globe and Mail said that "this Toronto startup intends to shake up the genetic sequencing market"; The Washington Post described "Deep Genomics, a startup that brings the power of deep learning to genomics"; Gizmag said that "Deep Genomics intends to use deep learning to transform genetic medicine"; Wired had earlier reported that "machine intelligence deciphers genetic controls"; and Scientific American put it succinctly: "Disease clues are hidden in some corners of our DNA: the light of deep learning illuminates the little-known corners of genetic mutation."

To sum up, Deep Genomics is the product of a marriage between artificial intelligence and genomics, namely "Deep Learning + Genomics." It has opened the first window onto an era in which deep learning is used to study genomics.

You may well be wondering: genetic testing has been around for a long time and can already detect many diseases, so why does genomics need deep learning? Here is an example. Suppose a city suddenly loses power. There are two ways to find the cause: the first is to inspect every wire until the damaged spot is found; the second is to inspect only the locations that are most easily damaged. If we statistically analyze the causes of power outages in 100 different cities, it is not hard to see that some causes occur frequently and others only rarely.

The same is true for humans. The total number of single-nucleotide variants (SNVs) in the human population is in the hundreds of millions. Variants with a population frequency above 1% are called SNPs, and there are about 3 million of them. Studying the relationship between a disease and SNPs requires a large sample of patients, in whom SNP frequencies are compared with those of the healthy population. SNVs with a frequency below 1% are numerous in aggregate, but each one individually fails to reach statistical significance, so they are automatically filtered out of the disease analysis. Judging by the numbers alone, it is clear that if genetic testing lacks in-depth analysis of SNVs with a frequency below 1%, precision medicine will remain confined to a narrow range.
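The statistical point above can be illustrated with a small sketch. It is not taken from the article's sources: the cohort sizes and carrier counts are hypothetical, and a one-sided Fisher exact test is assumed as the association test. Both variants have the same 1.5-fold excess of carriers among patients, yet only the common one stands out.

```python
from math import comb


def fisher_one_sided(case_carriers, n_cases, control_carriers, n_controls):
    """One-sided Fisher exact test.

    Returns the probability of seeing at least `case_carriers` carriers among
    the cases if carrier status were unrelated to disease.
    """
    carriers = case_carriers + control_carriers
    total = n_cases + n_controls
    upper = min(carriers, n_cases)
    # Count the ways to pick the case group so that it contains k carriers,
    # for every k at least as extreme as the observed count.
    favourable = sum(
        comb(carriers, k) * comb(total - carriers, n_cases - k)
        for k in range(case_carriers, upper + 1)
    )
    return favourable / comb(total, n_cases)


# Hypothetical cohort: 1,000 patients and 1,000 healthy controls.
# Common variant: 30% of patients vs 20% of controls carry it.
p_common = fisher_one_sided(300, 1000, 200, 1000)
# Rare variant: 0.3% of patients vs 0.2% of controls carry it.
p_rare = fisher_one_sided(3, 1000, 2, 1000)

print(f"common variant p-value: {p_common:.2e}")
print(f"rare variant p-value:   {p_rare:.2f}")
```

With the same cohort, the rare variant would need a far larger sample before the same relative excess became detectable, which is why such variants drop out of purely statistical analyses.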

At present, the clinical testing applications approved by the National Health and Family Planning Commission are: diagnosis of genetic diseases, prenatal screening and diagnosis, preimplantation genetic diagnosis of embryos, and tumor diagnosis and treatment. What these four categories have in common is that each condition is associated with only one or a few susceptibility genes. In fact, apart from monogenic genetic diseases, which susceptibility genes are known for a disease depends on how thoroughly that disease has been studied. For example, genetic testing for breast cancer currently focuses mainly on the BRCA1 and BRCA2 genes. A large number of mutations have been found in these two genes, but we lack a deep understanding of how they affect breast cancer. What is more, as breast cancer research has progressed, 40 genes related to breast cancer have been discovered (and each gene may of course contain multiple SNVs). From the perspective of genetic testing, therefore, precision medicine is still some way off.

The founder of Deep Genomics, Professor Frey of the University of Toronto in Canada, has long focused on research in this area. His academic team has published work in this field in top journals including Science, Nature Biotechnology and Bioinformatics, with the aim of using deep machine learning to transform precision medicine, genetic testing, diagnosis and treatment.

Next, let's look at how Deep Genomics analyzes the relationship between disease and SNVs with a frequency below 1%. To make the Deep Genomics approach clear, we first need a little more background science. Readers without a biology background, who know only a little about genetics, tend to think of genes as soon as disease is mentioned, but there are actually several steps between genes and disease. If a cast pot comes out badly, the fault may lie in the design drawing, or it may lie in the mold.

Suppose we want to build a robot. We first have to produce the design drawing and the material cutting diagram (DNA), then make the molds (RNA) according to the drawing and cutting diagram, and then make the various components (proteins) from the molds. Finally, these components are assembled into a functional robot. Our life activities are carried out through the same kind of hierarchy. Life information is passed from the DNA that carries the genes to RNA, then to biologically active proteins, and in the end all life activities are carried out by proteins.
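The drawing-to-mold-to-component flow can be sketched as a toy program. This is a deliberate simplification and not a model of real biology: the sequences are invented, the codon table is only a small subset, and the "cutting" step uses only the rule that an intron typically begins with GU and ends with AG in the RNA. It shows how a single-base error in the cutting instructions changes the final product even though the protein-coding parts of the drawing are untouched.

```python
# Partial codon table (a real one has 64 entries); "*" marks a stop codon.
CODON_TABLE = {
    "AUG": "M", "GCC": "A", "GAA": "E", "AUA": "I",
    "AGU": "S", "UUU": "F", "CAG": "Q",
    "UAA": "*", "UAG": "*", "UGA": "*",
}


def transcribe(dna):
    """DNA (coding strand) -> pre-mRNA: replace T with U."""
    return dna.replace("T", "U")


def splice(pre_mrna, intron_start, intron_end):
    """Cut out the intron only if its boundary signals (GU...AG) are intact."""
    intron = pre_mrna[intron_start:intron_end]
    if intron.startswith("GU") and intron.endswith("AG"):
        return pre_mrna[:intron_start] + pre_mrna[intron_end:]
    return pre_mrna  # signal broken: the intron is retained


def translate(mrna):
    """mRNA -> protein, reading three bases at a time until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE.get(mrna[i:i + 3], "X")  # "X" = not in toy table
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def express(dna, intron_start, intron_end):
    return translate(splice(transcribe(dna), intron_start, intron_end))


# Invented toy gene: exon 1 + intron + exon 2.
EXON1, INTRON, EXON2 = "ATGGCC", "GTAAGTTTTCAG", "GAATGA"
normal_gene = EXON1 + INTRON + EXON2
# Single-base change (G -> A) at the first base of the intron.
mutant_gene = EXON1 + "A" + INTRON[1:] + EXON2

start, end = len(EXON1), len(EXON1) + len(INTRON)
normal_protein = express(normal_gene, start, end)
mutant_protein = express(mutant_gene, start, end)
print(normal_protein, mutant_protein)
```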

When building a robot, errors may appear in the design drawing (the gene) or in the material cutting diagram. Either kind of error can cause the robot to malfunction. Current genetic testing analyzes the effect of high-frequency mutations on disease and largely ignores the impact of mutations that alter RNA splicing. The reason is simply that mutations controlling splicing occur at low frequency and are not statistically significant. But their number is huge: hundreds of millions. Deep Genomics currently provides predictions of the effect on RNA splicing (the cutting of the material used to make the molds) for 328 million SNVs. How did Deep Genomics do it?

With the current approach to genetic testing, these SNVs are difficult to analyze. Deep Genomics therefore brought in deep learning, an artificial intelligence technique. First, the Frey team built a mathematical model, then fed it whole-genome sequences and RNA sequences from healthy people and trained it to learn the RNA splicing patterns of healthy people. Next, the trained model was validated and corrected using other molecular biology methods. Finally, data from several known disease cases were used to test the accuracy of the model's judgments. Following this approach, Deep Genomics launched its first product, SPIDEX. Given only the sequence and the cell type, SPIDEX can analyze the effect of a mutation on RNA splicing and estimate the relationship between the mutation and disease.
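The train, validate, then score-a-mutation workflow described above can be sketched in miniature. This is emphatically not the Deep Genomics model or SPIDEX: a tiny logistic regression stands in for the deep network, the training data are synthetic, and the rule that "GT" in the middle of a six-base window marks a used splice site is an assumption made purely for illustration. All function names are hypothetical.

```python
import math
import random

BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a flat list of 0/1 features (4 per base)."""
    return [1.0 if base == b else 0.0 for base in seq for b in BASES]


def predict(weights, bias, seq):
    """Probability that the toy splice site in `seq` is used."""
    z = bias + sum(w * x for w, x in zip(weights, one_hot(seq)))
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=50, lr=0.1):
    """Fit the logistic model with plain stochastic gradient descent."""
    weights, bias = [0.0] * (4 * len(examples[0][0])), 0.0
    for _ in range(epochs):
        for seq, label in examples:
            error = predict(weights, bias, seq) - label
            for i, x in enumerate(one_hot(seq)):
                weights[i] -= lr * error * x
            bias -= lr * error
    return weights, bias


def make_example(rng, spliced):
    """Synthetic 6-base window; 'GT' at positions 2-3 means the site is used."""
    core = "GT"
    while not spliced and core == "GT":
        core = rng.choice(BASES) + rng.choice(BASES)
    flank = [rng.choice(BASES) for _ in range(4)]
    seq = "".join(flank[:2]) + core + "".join(flank[2:])
    return seq, 1.0 if spliced else 0.0


rng = random.Random(0)
data = [make_example(rng, i % 2 == 0) for i in range(300)]
train_set, held_out = data[:200], data[200:]

# Step 1: learn the "healthy" splicing pattern from sequence data.
weights, bias = train(train_set)

# Step 2: validate on examples the model has not seen.
correct = sum((predict(weights, bias, s) > 0.5) == (y == 1.0) for s, y in held_out)
accuracy = correct / len(held_out)

# Step 3: score a variant as the change in predicted splicing, mutant vs reference.
reference, mutant = "AAGTAA", "AAATAA"
delta = predict(weights, bias, mutant) - predict(weights, bias, reference)
print(f"held-out accuracy: {accuracy:.2f}, predicted change for variant: {delta:+.2f}")
```

The key design idea is that the model never needs population statistics for the variant itself: it scores any mutation, however rare, by comparing its prediction for the mutated sequence with its prediction for the reference sequence.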

If Deep Genomics' deep learning analysis becomes accurate enough, the contribution of this technology is obvious: it would allow direct analysis of the relationship between low-frequency mutations and disease, and it would accelerate genomics research and drug development. At the same time, we must be clear-eyed: SPIDEX can currently analyze only the relationship between disease and RNA splicing changes caused by SNVs, and it can do nothing about other causes. Even so, the application of artificial intelligence to genetic analysis is well worth looking forward to. Perhaps it will become a golden key to unlocking the mysteries of genes and disease.
