Version 2.0 
Chemical Text Mining
The ever-increasing rate of publication of scientific literature and patents makes it difficult for researchers in the pharmaceutical industry to stay current with the latest developments and trends. NextMove Software's LeadMine product is a text mining tool that identifies and annotates chemicals, protein targets, genes, diseases, species, named reactions, company names, cell lines and other entities in the text of documents. Although it was initially developed to identify molecules of interest to medicinal chemists in patent applications, its functionality has been extended to handle arbitrary entity types specified by dictionaries, ontologies, regular expressions or formal grammars.
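To make the idea of entity types defined by dictionaries and regular expressions concrete, the following Python sketch annotates text using a few toy resources. It is an illustration only and does not use LeadMine or reflect its API: the `annotate` function, the `Annotation` type, the example dictionaries and the CAS-number-style pattern are all assumptions made up for this example.

```python
import re
from typing import List, NamedTuple


class Annotation(NamedTuple):
    start: int          # character offset where the entity begins
    end: int            # character offset just past the entity
    text: str           # the matched text as it appears in the document
    entity_type: str    # label of the resource that produced the match


# Hypothetical toy resources; real dictionaries are far larger.
DICTIONARIES = {
    "chemical": ["aspirin", "caffeine"],
    "disease": ["asthma"],
}
# Hypothetical pattern for identifiers shaped like CAS registry numbers.
PATTERNS = {
    "cas_number": r"\b\d{2,7}-\d{2}-\d\b",
}


def annotate(text: str) -> List[Annotation]:
    """Return all dictionary and pattern matches, ordered by position."""
    annotations = []
    for entity_type, terms in DICTIONARIES.items():
        for term in terms:
            # Whole-word, case-insensitive match of each dictionary term.
            pattern = r"\b" + re.escape(term) + r"\b"
            for m in re.finditer(pattern, text, re.IGNORECASE):
                annotations.append(
                    Annotation(m.start(), m.end(), m.group(), entity_type))
    for entity_type, pattern in PATTERNS.items():
        for m in re.finditer(pattern, text):
            annotations.append(
                Annotation(m.start(), m.end(), m.group(), entity_type))
    return sorted(annotations)


if __name__ == "__main__":
    for a in annotate("Caffeine (CAS 58-08-2) may worsen asthma."):
        print(a)
```

Scanning the text once per term, as done here, is only practical for tiny dictionaries; it is used to keep the sketch short.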
A significant competitive advantage of LeadMine over similar tools is its use of NextMove Software's CaffeineFix automatic spelling correction technology, which allows it to identify (and correct) misspelt terms and entities, including errors introduced through optical character recognition (OCR), hyphenation, line-breaking or human error. This ability to handle noisy real-world text has been shown to significantly improve recall rates over non-correcting approaches and over methods using simplistic rule-based OCR correction heuristics.
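The general idea of tolerating misspelt terms can be sketched as a dictionary lookup that accepts a bounded edit distance. This is not CaffeineFix's algorithm, which is not described here; it is a minimal Python sketch using the standard Levenshtein distance, and the function names, the `max_distance` parameter and the example terms are assumptions chosen for illustration.

```python
from typing import Iterable, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def correct(token: str, dictionary: Iterable[str],
            max_distance: int = 1) -> Optional[str]:
    """Return the closest dictionary term within max_distance, else None."""
    # Rejoin words split by end-of-line hyphenation before comparing.
    cleaned = token.replace("-\n", "").lower()
    best, best_distance = None, max_distance + 1
    for term in dictionary:
        d = edit_distance(cleaned, term)
        if d < best_distance:
            best, best_distance = term, d
    return best


if __name__ == "__main__":
    terms = ["ibuprofen", "paracetamol"]
    print(correct("ibuprofcn", terms))       # OCR-style substitution
    print(correct("para-\ncetamol", terms))  # hyphenated across a line break
    print(correct("benzene", terms))         # no term within the threshold
```

The linear scan over the dictionary keeps the sketch readable but would not scale to dictionaries of millions of terms, where an indexed structure such as a trie or automaton would normally be needed.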
Another key feature of LeadMine and CaffeineFix is their ability to handle very large dictionaries efficiently, often containing tens of millions of terms and synonyms. Such large synonym dictionaries are not uncommon in chemical and biological text mining, and are often problematic for text mining tools that were not designed for processing scientific and technical documents.
Another unique feature of LeadMine is its ability to perform chemical named entity recognition on Chinese (both simplified and traditional) and Japanese documents.
- A presentation describing LeadMine v1.0, given at the German Cheminformatics Conference (GCC) in Goslar, November 2010
- A presentation describing LeadMine v2.0, given at the American Chemical Society (ACS) National Meeting in Philadelphia, August 2012
- A presentation describing LeadMine v2.0's ability to process Chinese and Japanese documents, given at the same Philadelphia ACS meeting