Is disintermediation of patent search on the horizon?

By Ian Maxwell

Recently I had the opportunity to talk to Richard Jefferson, the CEO of Cambia, a not-for-profit (limited by guarantee) company operating out of the Queensland University of Technology. Over the last 22 years Cambia has received funding from various government and private organisations including the Gates Foundation, Moore Foundation and the USPTO. [1] Cambia hopes to substantially expand its operation with a possible large upcoming funding round. The company has around 10 full-time equivalent staff. Most of the employees are software engineers and algorithm developers, with the CEO also doing market development.

According to Richard Jefferson, the mission of the organization is to provide free or low-cost services that ‘de-risk’ the process of innovation so that:

  1. Innovation is cheaper
  2. The prospective return on investment (ROI) of innovation is increased
  3. The number and type of innovation projects that are undertaken are increased
  4. Hence there will be an overall increase in innovation.
  5. A Utilitarian ‘greater good for the masses’ will thus be encouraged through innovation

An example that Richard gives is the development of treatments for malaria. This disease is a major social issue in many relatively poor countries in the tropics; however, it is apparently not a very financially lucrative problem for drug and biotech companies to focus on. Cambia hopes to reduce the costs of R&D into this problem by various means related to open and free access to innovation document systems. Cambia’s thesis is that by making research into malaria treatments cheaper, the chances of the world getting a working technology actually increase, since more groups are able to secure funding to do the R&D. This in turn is easy to understand: reduced costs and risks of R&D increase the return on investment (ROI), which in turn increases the amount of capital available for this type of research.
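
To make the arithmetic behind that last point concrete, here is a minimal sketch. The figures and the function name `roi` are purely illustrative assumptions of mine, not Cambia data; the only point is that, for the same expected payoff, a lower R&D cost gives a higher ROI.

```python
def roi(expected_payoff: float, cost: float) -> float:
    """Return on investment, defined here as (payoff - cost) / cost."""
    if cost <= 0:
        raise ValueError("cost must be positive")
    return (expected_payoff - cost) / cost


# Hypothetical figures in arbitrary units, chosen only for illustration.
payoff = 120.0         # expected payoff of a research project
baseline_cost = 100.0  # R&D cost when patent information must be paid for
reduced_cost = 80.0    # R&D cost when patent information is free

baseline = roi(payoff, baseline_cost)
reduced = roi(payoff, reduced_cost)

print(f"ROI at baseline cost: {baseline:.0%}")
print(f"ROI at reduced cost:  {reduced:.0%}")
```

With these made-up numbers, a 20% cut in cost lifts the ROI from 20% to 50%, which is the kind of shift that could move a marginal project above an investor's threshold.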

The primary gateway for Cambia at the moment is the Lens website. The focus of this site is to provide information services primarily around patent documents. One way to view Lens is as an attempt to disintermediate the (paid) patent information systems and services. It is in fact quite ironic that Cambia is trying to co-opt a contrived patent system which was pretty much designed to artificially create monopolies and to be usable only by people with deep pockets. Other similar services are ‘Google Patents’ and ‘Baidu Patents’, both of which, on the surface, are less sophisticated than Lens (with no substantial advanced search capability). But who knows what Google, in particular, is preparing in the background?

In my mind the usability and quality of a patent database search system come down to three key factors:

  • The quality of the content
  • The search engine and its query syntax; the more flexible the better.
  • The ease and flexibility in extracting or presenting the patent information for further review and analysis.

Patent offices around the world are quite variable in the quality of information (content) that they publish. Into this breach have stepped many private service providers (such as Thomson Reuters with the Derwent World Patents Index (DWPI)); these are expensive-to-use patent information gateways. Almost all of the world’s patent offices use the Derwent data in their own searching during examination; that is, they effectively don’t own their own data.

The focus of these gateways has been to create a unified patent information database which has been very much needed by the world’s patent attorneys, particularly in the search for relevant prior art for various customer-focused services that they provide. More recently we have seen the emergence of more sophisticated patent analytics, by large companies and even start-up technology providers. The purpose of these companies is to provide the flexible search capability services that are required by professionals in the technology and patent spaces. These services are primarily based on keyword and citation approaches, to answer specific questions such as:

  • Who is infringing this patent?
  • Which patents in this portfolio have high asset value?
  • Who can I sell this patent to?
  • What is the technology landscape and how might I best direct my R&D spend?
  • What is the competitor landscape?

The Lens system currently aggregates 80 million patent documents from 90 jurisdictions around the world. Whilst the Lens system from Cambia shares some similarities with the commercially available systems, it has some key differentiators:

  • It is a free service
  • It has some degree of integration with non-patent documents, particularly scholarly and scientific literature, that is linked to patents. I note that some but not all other systems have such links also. Some have many more external database links than Lens does.
  • The incoming patent information comes directly from patent offices and is not passed through a human indexer, as happens with the commercial databases Derwent World Patents Index and Chemical Abstracts. This avoids some operating costs but also substantially reduces the quality of data in the system. Patent offices are notoriously variable and universally poor at publishing quality data, which is why the commercial databases exist.
  • Users are able to add annotations to all key fields and these annotations then become part of the searchable and permanent record of the database. This is an attempt to bring the Wikipedia-like benefits of the internet to the patent system.
  • The Lens system offers a patent ranking via keywords and location, which is powered by Lucene, an open-source text search engine library written entirely in Java. I would note that most experts in the field of patent search agree that keyword searching alone is nowhere near as useful as citation searching, or a combination of the two, primarily because of the variability in the use of keywords in patents. Derwent addresses this problem by rewriting all patent abstracts as Derwent Abstracts, which are prepared by analysts trained to write in a consistent way so that the abstracts are more standardized for searching. Even so, citation searching remains far more powerful for many use-cases.
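
To illustrate the keyword-versus-citation point in the last item, here is a minimal sketch on a toy corpus. Everything in it (the patent ids, the abstracts, the citation links and the function names `keyword_hits` and `expand_by_citation`) is a hypothetical example of mine and says nothing about how Lens or Derwent actually work. It shows how a relevant patent that uses different vocabulary is missed by keywords alone but recovered by following citations.

```python
# Hypothetical toy corpus: patent id -> (abstract text, ids of patents it cites).
PATENTS = {
    "P1": ("mosquito net treated with insecticide", []),
    "P2": ("insecticide coating for bed net fabric", ["P1"]),
    "P3": ("textile barrier against flying insects", ["P1", "P2"]),
    "P4": ("insecticide spray for crops", []),
}


def keyword_hits(query, patents):
    """Return ids of patents whose abstract contains every query term."""
    terms = set(query.lower().split())
    return {
        pid
        for pid, (text, _cited) in patents.items()
        if terms <= set(text.lower().split())
    }


def expand_by_citation(hits, patents):
    """Add patents that cite, or are cited by, any of the keyword hits."""
    expanded = set(hits)
    for pid, (_text, cited) in patents.items():
        if pid in hits:
            expanded.update(cited)   # backward citations of a hit
        if hits.intersection(cited):
            expanded.add(pid)        # forward citations: pid cites a hit
    return expanded


hits = keyword_hits("insecticide net", PATENTS)
combined = expand_by_citation(hits, PATENTS)

print(sorted(hits))
print(sorted(combined))
```

In this toy data P3 describes the same idea as P1 and P2 without using either query word, so only the citation step finds it, while P4 matches one keyword but is correctly left out.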

Richard Jefferson acknowledges these shortcomings but stresses that it is very early days for Lens. For example, Lens is working with the patent offices to increase the quality of information that they publish. They are developing more advanced search capabilities and hopefully, with the injection of growth funds, progress will accelerate. Richard also acknowledges that there is a missing link between products and patents and notes that Cambia’s solution is to promote association via patent ownership. I would note that the UK ‘Patent Box’ system, which offers tax relief for the successful product use of patents, does in principle also create such a link. Another issue that all patent search engines ignore is that over 90% of all patents are essentially worthless since they don’t actively provide for any product monopoly. The trouble for search engines is that they don’t know, in the absence of any product-to-patent link, which patents are the valuable ones that should be heavily weighted in search results.

Richard enthuses that the Cambia system is focused on ‘Innovation Cartography’, i.e. creating a set of tools which enable users to map and navigate the IP landscape. He has written on the subject. To be honest I had to read this a couple of times in order to parse it into something that I understood. To save you the bother, let me offer an example which illustrates the cartography metaphor; let’s start with a little company in Brazil. They want to solve a specific product problem but have little idea of what people have done in the area before. The Lens website will hopefully in the future enable them to quickly assess all the key efforts (past and current), and they can then extract the good and the bad from this summary. This will hopefully guide them to a new solution, whereupon they face yet another problem: would they be infringing any other company’s patent rights if they, for example, offered the new product for sale in the USA? Again the Lens system will hopefully be able to give them a definitive answer. Today these services are available but they are inconclusive and generally quite expensive. Cambia wants to make them much more powerful and also free of charge; if they can do this, as far as I am concerned they can call this service anything they want, including ‘cartography’, and I will live with it.

Cambia has developed a DNA sequence patent database called PatSeq which correlates specific DNA sequences to references in the patent literature. They are now working on a chemistry equivalent, which can also be codified with near 100% accuracy since there is the IUPAC universal nomenclature for molecular structures. I would note however that most areas of technology are far less conducive to this type of automatic database structuring since they do not have such defined terminology.

If I were to offer the Cambia folks some advice it would be this:

  • Enable third-party patent analytics providers to plug into the Lens ‘front end’ and to publish their analytics services on the Lens website. This would make Lens a central patent information service. Unfortunately some of these services might require customer payment, but that can be solved by having the usual mix of free and advanced paid services, typical of many internet services.
  • The ‘search’ engine as seen by users should really start with the customer-focused ‘use-cases’. Patent search differs greatly depending on what the user is after. For example, a search for prior art when drafting a patent is very different to a search for infringement risk, which is different again to a search for ‘white space’ in an emerging field of technology. Search options really need to start with, and be tailored to, the general set of use-cases. This is also one of the key limitations of Google Patents.
  • The Lens folk really need to clean up the incoming data to Derwent standards or they will forever be seen as a second-class service by the patent service providers. If they can win over the patent service providers then they will know they have won this war.
  • The initial value proposition of Lens might be to allow innovators to get some idea of what is out there before incurring further costs with patent agents. That is, it might be a valuable tool to guide R&D efforts prior to incurring the often high costs of engaging with patent agents.

Some specific use-cases for the Lens system as noted by Cambia are:

  • The disintermediation of global corporate interests by providing individuals and companies in the developing and third worlds with access to latest technology advancements via easy to access patent information. This can, for example, educate researchers on the latest state-of-the-art information and also alert them to potential issues of infringement or the opposite where certain patents are not granted in certain jurisdictions.
  • To help researchers identify and focus their efforts on key areas where innovation is required and, conversely, to avoid re-inventing wheels. Today many researchers rely almost entirely on the scientific literature for this purpose.
  • To give researchers the ability to define both their potential benefits to, and constraints on, commercialisation, so they can respond more successfully to applied research requests where such requests demand some insight into the degree of innovation and freedom to operate for the outcomes.
  • To give small to medium size companies a set of cheaper tools to sustain themselves and grow, independently of the very common trade sale to larger companies. This would be achieved by more ready access to patent and innovation information.

Overall my sense is that the Lens model is driven by a strong vision to:

  • Create an online toolset that will help disintermediate global corporations, and their information service providers, that operate on gross information asymmetries. I suspect that, ironically, the first and most ardent users of the Lens system will in fact be the large corporates looking to reduce costs.
  • Create a free, Wikipedia-like resource that socialises technology information, is inclusive of all participants and reduces the barriers to innovation for all.

This is a grand endeavour of Manhattan Project proportions. The scale and scope of this project are much bigger than those of Wikipedia, for example. The funding to date for Cambia has really only paid for a feasibility study, which is what the Lens system is today. Maybe, at some point in the future, the Lens system will be ‘good enough’ and, being free, it will start grabbing market share from the incumbents; thereafter it will maybe roll over the top of them. In order to achieve this goal the Cambia folks need significant funding over the next ten years and access to the highest quality people in the patent search environment; these people are now employed in various private enterprises all over the world and it’s going to take some doing to attract them.

A final note: companies such as Thomson Reuters have for many years endeavoured to put together a complete patent analysis system and yet these systems are still somewhat limited. It is impossible, for example, to get any indemnity for a new product release in the context of Freedom-to-Operate. That is, the patent search systems are nowhere near good enough for anybody to (sanely) insure a party against accidental infringement of an unknown third party’s patent rights. And why are our patent search systems so limited? Well, it’s because this is a very big and horrendously complex problem; one that is probably too large for any one company to tackle properly. In fact it might be that a global open-source approach is the only one that can finally rein in and constrain the patent systems and take them back to their original intent, namely the open-source promotion of technology advancement.

[1] I was told that Cambia has received around $45m over 22 years and currently has an annual budget of around $3m.

 [This post originally appeared at and Scribd.]

