At Iris.ai, we have spent the last five years researching and developing an engine for scientific text understanding. With the engine already successfully deployed in a generalized suite of tools for academic literature reviews, we believed it was time to see how it could be reinforced on one specific domain, and how it could be used to find more precise, spot-on answers for industry researchers. Chemistry was an interesting place to start, both for the reasons outlined below and because it is an industry ripe for digital innovation and essential for the sustainable future of our planet.
The interesting thing about chemistry
In 1776, chemist and mechanical engineer James Watt invented the Watt steam engine, which was fundamental to the changes brought by the Industrial Revolution. Ever since – and potentially even before – an understanding of chemistry has been the foundation for our technological development, and there is no reason to believe that this holds any less true for the future. Whether we need more sustainable materials or biodegradable fuel to reduce our carbon emissions, new materials allowing us to travel to space or terraform Mars, novel ways of ensuring that every person on this planet is properly fed or understanding how we can handle an ocean filled with plastic particles, chemistry is going to be absolutely foundational.
What has enabled such a thorough understanding of chemistry is the field’s formalism – the same as for maths and physics. Formalism means a structured, unified language, so that any chemist anywhere can talk about anything from the basic elements, via molecular formulas, to complex synthesis procedures in the same way. This structured way of communicating has allowed rapid progress in the field.
However, formalism has its downsides: when you simplify a process or a line of thought into a unified language, information is inevitably lost along the way. A compressed image is easier to share and still shows the same motif, but it is pixelated. In the same way, formalized research results are easier to convey and transmit the general idea of the approach, but they miss the finer details. Ideas are compressed to formulas, long research papers to abstracts, novel ideas to a 140-character tweet, detailed lab notes to summaries.
In chemical research, this ‘compression’ has been required because of human limitations – but today, it is no longer necessary. Computers have already allowed a much broader and larger volume of shared knowledge, which in itself makes absolute formalism tricky. And thanks to advances in AI, we are rapidly approaching a new frontier in chemical research (and beyond).
With these advances, machines can help researchers find what other researchers have done, ‘translate’ it into their own current context, and get much greater clarity on how and why the solutions or conclusions were reached, without the information loss built into the current process. The machine holds all the necessary information, but communicates or ‘translates’ only the relevant pieces between researchers. This will truly be a new paradigm of chemical research, and we intend to be part of it.
Iris.ai’s first steps into chemistry
We have taken our core engine and reinforced it on chemistry. The interesting thing about this approach is that, because the general engine is a strong starting point, we only need a small collection of research papers in the specialized field, or for some use cases a seed ontology already created by a human, to specialize the tool. This makes it very flexible and re-deployable in many different research fields with similar user needs.
We are now building this engine out into a first set of tools that will help chemistry researchers on three different levels:
- Discover. When dealing with unknown unknowns, the Discover tool will allow interdisciplinary discovery, beyond today’s limiting keyword queries. It fingerprints the description of the researcher’s problem and maps out all relevant papers and patents they should be reading to get a full overview of the field. The Discover tool is especially helpful in the early phase of a new and interdisciplinary project, where it has proven to give researchers a better overview, find more spot-on papers and draw better conclusions.
- Identify. When the researcher knows the answer is ‘out there somewhere’, but finding it is like looking for a needle in a haystack. Known unknowns can be found through this conversational AI, which guides the researcher through the information found in millions of documents, asking the right questions to narrow down to exactly the bits of knowledge they need. That could mean finding new application areas for an existing compound, identifying better synthesis procedures or simply identifying the right material for a use case.
- Extract. In spite of chemistry being such a formalist field, every researcher writes in their own way. That means that extracting key data from a document – for example experiment data before going to the lab to recreate the experiment – takes a lot of valuable researcher time. Our automatic extraction can achieve 90% accuracy and perform two months’ worth of manual labor in less than 15 minutes.
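To make the idea of fingerprinting a problem description and matching it against documents more concrete, here is a minimal sketch. It is not Iris.ai's engine: it assumes a simple bag-of-words fingerprint and cosine similarity, and the names `fingerprint`, `cosine` and `rank_documents` are hypothetical, chosen only for illustration.

```python
import math
import re
from collections import Counter


def fingerprint(text):
    """Illustrative fingerprint: lowercase word counts (an assumption, not the real engine)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two word-count fingerprints."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(problem, documents):
    """Return document titles ordered from most to least similar to the problem text."""
    query = fingerprint(problem)
    scored = [(cosine(query, fingerprint(text)), title)
              for title, text in documents.items()]
    return [title for _, title in sorted(scored, key=lambda pair: -pair[0])]


if __name__ == "__main__":
    docs = {
        "polymer paper": "synthesis of a biodegradable polymer from starch",
        "engine paper": "efficiency of the steam engine",
    }
    print(rank_documents("biodegradable polymer synthesis", docs))
```

A word-count fingerprint like this only matches shared vocabulary, so it falls short of the interdisciplinary, beyond-keyword discovery described above; it only shows the general shape of ranking documents against a free-text problem description.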
At Iris.ai, we are very excited to bring our skills together with some very talented chemical researchers, and to see what just might be possible when you bypass the limitations of human formalist language and let an AI understand the context of your words to help advance your research.
The CognitiveNations series by SwissCognitive brings a handful of countries together to discuss the status of research, development, and the operating environment of AI, including technological, ethical, legal and socio-economic aspects. AI experts and leaders provide a high-level overview of the development of AI in their countries, followed by a more detailed exchange among the selected country representatives. The discussion continues with practical examples from experts, revealing hands-on developments, processes, challenges, and achievements.