Artificial Intelligence (AI) for patents, or for other data types such as research literature and news articles, is huge. Or at least its promise (some might say hype) is huge: greater efficiency and enhanced insights, gained by steering sophisticated algorithms into a heap of patent or non-patent data to identify clear spaces, hear signals in the noise, or automatically generate insightful patent landscapes.
These would be great results from software innovation, with promising opportunities for development, for sure. However, many organizations focus on the solution too early, before they have clearly defined the problem it is intended to solve. Let me illustrate with an example.
“Let’s use AI to automate our patent repository for more efficient access to insights!”
Our team was recently approached by the IP head of an FMCG company. After reading an article on the plane, he came up with an exciting idea: to use AI in the company’s patent repository management processes and to feed newly identified relevant patents of other companies into this repository. Any new patent added to this database must be tagged with one or more relevant technology categories to ensure the repository is searchable. Our client wanted to train an algorithm to automate this process.
“Great idea, we have the right AI technology which can help you with that use case,” we replied. “What is the volume of new patents that needs to be automatically classified per month?” His answer left us speechless: “Four patents.”
We have great technology for this …
Treparel, a company acquired by Evalueserve in 2015, developed one of the world’s first machine-learning-based patent analysis tools, KMX, in close cooperation with Royal Dutch Philips Electronics. The application was optimized to rapidly classify large sets of patents or scientific articles on the basis of a limited set of training examples, and since then we have applied and improved it based on a wide variety of client projects.
This approach works by having the analyst visually select positive and negative examples, from which the application creates a ‘classifier’, or ‘terminology fingerprint’, of the content in these examples. With the analyst’s consent, KMX then quickly goes through the full patent set and ranks it based on this classifier. Patents with a high score contain similar terminology and so are likely to be relevant. You can also use classifiers to add technology tags to documents by creating a unique fingerprint for each tag.
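To make the idea of a terminology fingerprint concrete, here is a minimal sketch in Python. It is not how KMX works internally; it only illustrates the general principle under simple assumptions: every term gets a weight based on how much more often it appears in the positive examples than in the negative ones (a smoothed log-odds, chosen here purely for illustration), and a document's score is the sum of the weights of its terms. All function names and sample texts are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic terms."""
    return re.findall(r"[a-z]+", text.lower())


def build_fingerprint(positives, negatives):
    """Weight each term by how much more often it occurs in positive examples.

    Assumption: smoothed log-odds of document frequency. This is an
    illustrative choice, not the method of any particular tool.
    """
    pos_df = Counter(t for doc in positives for t in set(tokenize(doc)))
    neg_df = Counter(t for doc in negatives for t in set(tokenize(doc)))
    return {
        term: math.log((pos_df[term] + 1) / (len(positives) + 2))
        - math.log((neg_df[term] + 1) / (len(negatives) + 2))
        for term in set(pos_df) | set(neg_df)
    }


def score(fingerprint, text):
    """Sum the weights of the distinct terms in a text (unknown terms count 0)."""
    return sum(fingerprint.get(t, 0.0) for t in set(tokenize(text)))


def rank(fingerprint, documents):
    """Return the documents sorted from most to least similar to the positives."""
    return sorted(documents, key=lambda d: score(fingerprint, d), reverse=True)


# Hypothetical examples selected by an analyst.
positives = ["biodegradable polymer packaging film",
             "compostable packaging film for food"]
negatives = ["electric toothbrush motor housing",
             "battery charger for toothbrush"]

fingerprint = build_fingerprint(positives, negatives)
ranked = rank(fingerprint, ["toothbrush with rechargeable battery",
                            "polymer film for food packaging"])
print(ranked[0])
```

Tagging with several technology categories would, in this sketch, simply mean building one fingerprint per tag and scoring each new document against all of them.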
… but with some caveats
Firstly, training a classifier requires a significant investment. You need to provide your AI tool with positive and negative examples to learn from. Furthermore, training typically requires a few iterations of evaluating results to build confidence in its accuracy, plus providing additional examples to steer the algorithm in the right direction. So, when using AI you must always be aware of the return on investment: for example, it can make a lot of sense if the document set that you need to analyze is prohibitively large, or if you are pressed for time. But in other cases, like the one above, it’s just not worth the effort.
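The return-on-investment argument can be put into a back-of-the-envelope calculation. The sketch below is illustrative only: apart from the four patents per month from the example above, every number (minutes per document, training hours, review time) is an assumption made up for the sake of the arithmetic, and the function name is hypothetical.

```python
def months_to_break_even(docs_per_month, manual_minutes_per_doc,
                         training_hours, review_minutes_per_doc=0.0):
    """Months until the time spent on training is repaid by time saved.

    Returns None if the tool never saves time, i.e. reviewing its output
    takes at least as long as tagging by hand.
    """
    saved_minutes_per_month = docs_per_month * (
        manual_minutes_per_doc - review_minutes_per_doc
    )
    if saved_minutes_per_month <= 0:
        return None
    return training_hours * 60 / saved_minutes_per_month


# Assumed figures: 10 minutes to tag a patent by hand, 2 minutes to review
# an automatic tag, 40 hours to train and validate the classifier.
small_repository = months_to_break_even(4, 10, 40, review_minutes_per_doc=2)
large_repository = months_to_break_even(4000, 10, 40, review_minutes_per_doc=2)
print(small_repository, large_repository)
```

Under these assumed figures, four patents a month would take about 75 months to pay back the training effort, while 4,000 documents a month would pay it back within the first few days.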
Even if your document set is large, there is a second factor that can prevent successful application of AI: accuracy requirements in IP or R&D use cases are often stringent. Despite major technological gains, AI tools rarely hit the high level of accuracy required, and so deliver a result set whose recall and precision are too low to be useful. Consequently, analysts need to check the AI’s results, thereby undoing most of the efficiency gains of using AI in the first place. You can of course set up hybrid schemes in which AI tools take care of the obvious documents, and human analysts focus on the difficult examples. However, the obvious documents are typically also very easy for humans to tag, so again the return on investment is limited.
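For readers less familiar with the terms: precision is the share of retrieved documents that are actually relevant, and recall is the share of relevant documents that were actually retrieved. The sketch below computes both and shows one possible way to set up a hybrid scheme, assuming the tool returns a relevance score between 0 and 1 per document. The thresholds, document IDs, and function names are hypothetical.

```python
def precision_recall(predicted, relevant):
    """Return (precision, recall) for a set of predicted vs. truly relevant IDs."""
    predicted, relevant = set(predicted), set(relevant)
    hits = len(predicted & relevant)
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def triage(scores, accept_above, reject_below):
    """Split documents into auto-accepted, auto-rejected and human-review piles."""
    accepted, rejected, review = [], [], []
    for doc_id, doc_score in scores.items():
        if doc_score >= accept_above:
            accepted.append(doc_id)      # obvious hit, handled by the tool
        elif doc_score <= reject_below:
            rejected.append(doc_id)      # obvious miss, handled by the tool
        else:
            review.append(doc_id)        # difficult case, goes to the analyst
    return accepted, rejected, review


# The tool flags four patents, two of which are among the three relevant ones.
precision, recall = precision_recall({"A", "B", "C", "D"}, {"A", "B", "E"})
print(precision, recall)

accepted, rejected, review = triage({"A": 0.9, "B": 0.5, "C": 0.1},
                                    accept_above=0.8, reject_below=0.2)
print(accepted, rejected, review)
```

In this made-up example, half of the flagged patents are noise and a third of the relevant ones are missed, which is exactly the kind of result an analyst would have to re-check by hand.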
Success factors for using AI in patent search
In this example, and equally in all other use cases, several interdependent factors define the success or failure of AI when used in information science:
- The larger the data set, the higher the chances of success and, more importantly, of value
- The expected accuracy (expected recall and expected precision) should be moderate to low; machine-based solutions remain unable to deliver on tasks that demand high recall and precision
- The more exploratory the use case, the higher the likelihood of a successful output
- AI that supports the analyst during the analysis process can potentially show great value.
Considering these factors, it becomes clear that the use case above did not qualify for successful application of AI, as it would have been not only more efficient but also more reliable to do it manually. However, there are clear cases where AI will deliver.
In the following two posts, we will aim to demystify the application of AI to patent analysis and innovation intelligence. We’ll share the story of where we ran into disappointing results (spoiler alert: efficiency) and where our implementation of AI in our search or analytics processes has been far more successful (quality!). Based on our Search Quality Index (SQI) theoretical framework, we’ll also discuss which use cases offer high potential for AI-driven solutions and which use cases will potentially remain with human experts, supported by AI.
In addition, we’ll publish a more technical blog post, in which we will outline the basics of the most important text analytics approaches.
You and AI?
Have you successfully implemented AI in your innovation intelligence or patent search operations? Has your AI robot already replaced part of the work of the expert IP searcher or R&D information professional? Are there any other lessons to be learned from such technologies? Let us know in the comments!