The recent boom in ‘machine learning’ or ‘artificial intelligence’-driven approaches to materials discovery has been largely underpinned by structured property databases, often encompassing vast amounts of numerical data. But there is more to the literature than datapoints and numbers. Now, Vahe Tshitoyan, Anubhav Jain, Gerbrand Ceder and colleagues argue that implicit connections and relationships between words in the literature could also be harnessed to discover new materials. They take 3 million abstracts of published materials science articles and apply natural language processing algorithms to uncover relationships between words, material compositions and properties, some obvious, some less so. By projecting material compositions onto the word ‘thermoelectric’, they predict potential new thermoelectric materials, and also show that the literature contained enough information to predict current ‘top performers’ several years before they were actually discovered.
- Text mining facilitates materials discovery (News & Views p42, doi: 10.1038/d41586-019-01978-x)
- Unsupervised word embeddings capture latent knowledge from materials science literature (Letter p95, doi: 10.1038/s41586-019-1335-8)
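
The idea of projecting material compositions onto a word such as ‘thermoelectric’ can be illustrated with a minimal sketch. It assumes that each word and each composition has already been mapped to a vector (a word embedding) and that closeness is measured by cosine similarity, which is a normalised projection. The three-dimensional vectors, the candidate list and the function names below are hypothetical, chosen only for illustration. They are not taken from the Letter, where the embeddings are learned from the abstracts themselves.

```python
import math

# Hypothetical toy embeddings (3 dimensions). Real word embeddings are learned
# from text and typically have hundreds of dimensions.
EMBEDDINGS = {
    "thermoelectric": [0.9, 0.1, 0.2],
    "Bi2Te3": [0.8, 0.2, 0.1],
    "PbTe": [0.7, 0.1, 0.4],
    "NaCl": [0.1, 0.9, 0.2],
    "SiO2": [0.0, 0.3, 0.9],
}


def cosine_similarity(a, b):
    """Return the cosine of the angle between vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_materials(target_word, candidates, embeddings=EMBEDDINGS):
    """Rank candidate compositions by similarity to the target word's vector."""
    target_vector = embeddings[target_word]
    scored = [
        (name, cosine_similarity(embeddings[name], target_vector))
        for name in candidates
    ]
    # Highest similarity first: these are the strongest candidates.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_materials("thermoelectric", ["NaCl", "PbTe", "SiO2", "Bi2Te3"])
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

With these made-up vectors, the two compositions placed near ‘thermoelectric’ rank above the two placed far from it. In practice the ranking is only as informative as the embeddings, so the text corpus and training procedure matter far more than the similarity step shown here.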