Identifying the biological functions and molecular networks represented in a gene list, and determining how those genes relate to various topics, is of considerable value to biomedical researchers. Compared with previous versions and other similar text-mining tools, the distinguishing features of GenCLiP 3 are: (i) integration of CoreNLP with Semgrex patterns to accurately recognize molecular interactions, along with their polarity and directionality, across the entire PubMed database; (ii) integration of Sphinx with MySQL to support Boolean search and to quickly retrieve function-related genes from more literature sources; (iii) identification of gene-related keywords by a new scoring method that considers the co-occurrence of a gene and a keyword both within a sentence and within an abstract; and (iv) daily updates that follow the PubMed FTP release cycle.
Further details are available in the paper and its supplementary data.
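
To make the idea in feature (iii) concrete, the following is a minimal illustrative sketch of a co-occurrence score that distinguishes sentence-level from abstract-level co-occurrence. It is not the GenCLiP 3 scoring method, which is described in the paper; the function names, the naive sentence splitter, and the weights `sentence_weight` and `abstract_weight` are all assumptions made for illustration.

```python
import re
from typing import List


def split_sentences(abstract: str) -> List[str]:
    # Naive splitter on ., ! or ? followed by whitespace.
    # A real pipeline would use an NLP tokenizer instead.
    return [s for s in re.split(r"(?<=[.!?])\s+", abstract.strip()) if s]


def mentions(text: str, term: str) -> bool:
    # Case-insensitive whole-word match of a term in a text.
    pattern = r"\b" + re.escape(term) + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def cooccurrence_score(
    abstracts: List[str],
    gene: str,
    keyword: str,
    sentence_weight: float = 2.0,  # hypothetical weight, not from the paper
    abstract_weight: float = 1.0,  # hypothetical weight, not from the paper
) -> float:
    """Sum a per-abstract score: co-occurrence in one sentence counts
    more than co-occurrence that is only at the abstract level."""
    score = 0.0
    for abstract in abstracts:
        if not (mentions(abstract, gene) and mentions(abstract, keyword)):
            continue  # gene and keyword do not co-occur in this abstract
        sentences = split_sentences(abstract)
        if any(mentions(s, gene) and mentions(s, keyword) for s in sentences):
            score += sentence_weight
        else:
            score += abstract_weight
    return score


# Made-up example texts, for illustration only.
example_abstracts = [
    "TP53 induces apoptosis in tumor cells.",
    "TP53 is frequently mutated. Apoptosis was reduced in these samples.",
    "BRCA1 repairs DNA.",
]
```

With these toy inputs, `cooccurrence_score(example_abstracts, "TP53", "apoptosis")` counts the first abstract at the sentence level and the second only at the abstract level, while the third is ignored. The assumed weighting simply reflects the intuition that a gene and a keyword appearing in the same sentence is stronger evidence of a relationship than their appearing in different sentences of the same abstract.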