Data science meets biology: bioinformatics breakthroughs
The CRISPR/Cas9 system, often described as the "magic scissors" of gene editing, can be tamed by data science and machine learning algorithms. Recently, PhD student Hui Peng from Professor Jinyan Li’s group achieved a breakthrough in showing how regression learning algorithms can accurately predict the right sgRNAs, the guides for the editing scissors, for specific disease genes. The paper, entitled "CRISPR/Cas9 cleavage efficiency regression through boosting algorithms and Markov sequence profiling", is co-authored by Hui Peng, Yi Zheng, Michael Blumenstein, Dacheng Tao, and Jinyan Li^. This week, the paper was accepted by the prestigious journal Bioinformatics, a top-3 journal in the SCI computational biology category, which ranges over 100 journals.
Within the last six months, a total of four papers have been published by Bioinformatics.
The team's breakthrough research is not limited to gene editing. Within the last six months, another three of its papers have been published in Bioinformatics. One is by PhD student Yuansheng Liu, on lossless referential genome compression; the other two are by Dr Liang Zhao, on sequence data error correction and epitope structural biology. The details of the papers are:
- Yuansheng Liu, Hui Peng, Limsoon Wong and Jinyan Li^. High-speed and high-ratio referential genome compression. Bioinformatics. 33(21), 3364–3372, 2017.
- Liang Zhao, Qingfeng Chen, Limsoon Wong^, and Jinyan Li^. MapReduce for accurate error correction of next-generation sequencing data. Bioinformatics. 33(23), 3844–3851, 2017.
- Liang Zhao^, Shaogui Wu, Jiawen Jiang, Wencui Li, Jie Luo^, and Jinyan Li^. Novel overlapping subgraph clustering for the detection of antigen epitopes. Bioinformatics. Advance online publication, 2 February 2018.
The team continues to work on these challenging areas and expects to publish more exciting results soon.