The scientific ecosystem generates a massive amount of content every day. As a consequence, researchers must constantly keep up with a moving target, and capturing the insights they need in any depth is both overwhelming and complex.
Aside from keeping up with the literature, scientists engage in many additional tasks that require specific insights from the literature and the documents associated with the scientific record. They might focus on ideation and hypothesis generation, scanning for gaps in the literature to motivate a new grant proposal. Or they might want to add new capabilities, whether establishing a new protocol in the lab or optimizing and troubleshooting an existing one. Each of these use cases requires a different lens, and the tools we use today, although growing in power and capability, mostly focus on information aggregation and summarization.
Let's take the example of a postdoc trying to establish a new technique in the lab. They might skim through 70 articles to find a good fit for their model, chase down 20 of them (hoping they even have access) to dig into the approach in finer detail, and at the end of it all still not get the answer they needed. Such tasks require an integrated way to aggregate and synthesize information that actually supports these scenarios. And for researchers to act on any of the information fed back to them, they need confidence in its quality and provenance: funding is hard to come by, and a failed experiment carries the cost of a missed opportunity.
Each researcher brings their own unique perspective to the ecosystem. Their expertise, and how they frame their world view, helps form a web of knowledge that, taken together, generates a collective understanding of a field. Such diversity of thought and cultivated expertise can lead to critical breakthroughs in science, but this specialization also comes at a cost: no researcher has equal command of all the relevant and adjacent domains that impact their area of interest. Consider a neuroscientist with deep knowledge of cellular imaging techniques who struggles with the statistical complexity of designing a behavioral testing framework. Or a biochemist with expertise in X-ray crystallography, used to reveal the architecture of proteins, who struggles to use the computational models that could significantly accelerate their work. These examples highlight that deep knowledge is often paired with a narrow, granular scope. But what if we could help bridge those gaps by surfacing alternative perspectives and boundaries that a researcher might not otherwise consider? Such an approach would expand our ability to explore and reason with the literature in powerful ways.
How we capture insights as scientists is also challenged by how accessible the content relevant to our areas of study is. On the one hand, we search, skim, and review literature using abstract and citation databases, which offer breadth of coverage and help ensure we aren't missing critical source documents. On the other, the real insights emerge in the deep reading of full-text articles and reviews. That slow, methodical reading is where most insights are found: it takes time to understand experimental design and methodology, map claims to their evidence, and capture the boundary conditions that contextualize and limit findings. This work is cognitively taxing and nearly impossible to scale.
Every decision in science is a bet on where to allocate resources. When the stakes are high, the demand for trust, transparency, and traceability grows. Bad bets waste experimental budget, risk lost grant funding, and increase the chances of getting scooped by competing labs. So in any tool built to support the research community, trust and evidence trails are paramount to its usability and reliability.