Activeloop helps enterprises organize complex unstructured data and retrieve knowledge with AI, and many of its customers work in heavily regulated industries.
For a subset of these customers, Activeloop needed to provide highly accurate AI search across all U.S. patents and to build a patent generation engine powered by a custom language model.
The U.S. Patent and Trademark Office (USPTO) website is a portal to an incredible amount of knowledge: the USPTO dataset consists of over 8 million patents, and its corpus of text contains some 40 billion words.
But – as anyone who has visited the USPTO website can attest – the site is notoriously difficult to navigate, with a slow and rigidly structured search engine (we suspect it’s running on Cobalt servers, without any neural network execution).
When Activeloop approached us to co-develop a GenAI approach to U.S. patent data, we were thrilled by the opportunity. Together, we saw it as a challenge to make the incredibly rich dataset of U.S. patents more easily accessible to a broader audience.
The goal was to build a retrieval engine with powerful search and generation capabilities, including:
- Autocomplete
- Patent search on Abstracts
- Patent search on Claims
- Ability to generate Abstracts
- Ability to generate Claims
- General chat