Making all of the RCT evidence accessible: Trialstreamer

August 13, 2020

We are excited to unveil Trialstreamer, a living (continuously updated) database of all published randomized controlled trial reports, automatically annotated with data we extract via the machine learning models in RobotReviewer. To do this, we monitor PubMed and other sources daily for new literature, and use our previously validated machine learning model to automatically identify the subset of articles that constitute new RCT reports in humans.

Next, we automatically extract from all identified trials: sample sizes; snippets of text that describe trial Populations, Interventions, Comparators and Outcomes (PICO elements), and normalized terms for these; an estimated proxy for study quality, as a (simplified) overall “risk of bias” score; and the “punchlines” that seem to convey the main trial findings, along with an inferred directionality of each finding.

We make this database (updated daily) available directly. As of this writing, it comprises 697,217 trials. We have also implemented a simple faceted search interface that facilitates browsing the aggregated evidence. Try it out here: https://trialstreamer.robotreviewer.net/.

This is just one potential use of the data, however. The resource might also permit novel views of the evidence base. For example, in an ACL 2020 Demo paper led by Ben Nye, we used the underlying data to automatically construct “evidence maps” for queries on demand.

Elsewhere, Trialstreamer data has been incorporated into the Neural Covidex search engine to complement the body of non-RCT literature pertaining to COVID-19.

Aside from more efficient navigation of the evidence base, we are optimistic that the semi-structured data automatically extracted from the underlying trial reports might afford novel analyses of the trials literature. For example, this readily allows for an analysis of the interventions studied over time, or of the (estimated) risks of bias of trials in particular sub-areas. We’d be keen to hear about other potential uses and extensions, so please reach out to us if you have questions or thoughts.
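To make the kind of analysis mentioned above concrete, here is a minimal sketch in Python. It assumes the extracted annotations have already been loaded as a list of dictionaries; the field names (`year`, `interventions`, `risk_of_bias`) and the sample values are hypothetical and are not the actual Trialstreamer schema.

```python
from collections import Counter, defaultdict

# Hypothetical records: field names and values are illustrative only,
# not the real Trialstreamer schema or real data.
trials = [
    {"year": 2018, "interventions": ["aspirin"], "risk_of_bias": "low"},
    {"year": 2019, "interventions": ["aspirin", "placebo"], "risk_of_bias": "high/unclear"},
    {"year": 2019, "interventions": ["statins"], "risk_of_bias": "low"},
]


def interventions_by_year(records):
    """Count how often each intervention term appears per publication year."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record["year"]].update(record["interventions"])
    return counts


def low_bias_share(records):
    """Fraction of records whose (estimated) risk of bias is 'low'."""
    if not records:
        return 0.0
    low = sum(1 for record in records if record["risk_of_bias"] == "low")
    return low / len(records)


by_year = interventions_by_year(trials)
share = low_bias_share(trials)
```

Restricting `records` to a particular sub-area before calling `low_bias_share` would give the per-area view of estimated bias described above.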

More details are available in the JAMIA paper describing Trialstreamer, here: https://academic.oup.com/jamia/advance-article/doi/10.1093/jamia/ocaa163/5907063.

RobotReviewer at ICASR 2018

November 8, 2018

The RobotReviewer team (Iain Marshall and Frank Soboczenski, together with many of our collaborators such as James Thomas) attended the Fourth International Collaboration for the Automation of Systematic Reviews (ICASR) at ZonMw, The Hague, Netherlands.

Iain Marshall gave a talk on: "Reflections on evaluations of RobotReviewer".

The team thanks the organisers and enjoyed fruitful discussions on the current state of the community and latest developments.

We’re looking forward to including feedback and new ideas in future research.

RobotSearch is online! Apply our classifier to your search results!

October 2, 2018

Our newest tool RobotSearch is available!

This is a front-end for our machine learning model that identifies reports of randomized controlled trials (RCTs). With RobotSearch you are able to filter your PubMed search results for RCTs in just a few seconds!

We have extensively validated the recall and precision of this model, as described in our Research Synthesis Methods publication, Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide.

The RobotSearch website is available at https://robotsearch.vortext.systems/

RobotReviewer User Study

March 27, 2018

The RobotReviewer team invites all systematic reviewers to take part in our study!

It is all done online and takes about 25 minutes!

If you are interested, please DM us on Twitter (our handles are @byron_c_wallace, @ijmarshall, and @h21k) or get in touch via email.

Here is a short instructional clip for our study:

NICE Joint Information Day 2018

February 28, 2018

We (Iain Marshall and Anna Noel-Storr) presented on how to use machine learning to find RCTs at the NICE Joint Information Day 2018. It was great to meet so many information experts on a beautiful snowy day.

Here are some useful links to go along with the talk.

Our new paper, published in the journal Research Synthesis Methods.

The RobotSearch open source software, available to download and use on GitHub.

Our slides, and some photos from the day below.

#NICE #JID_2018 Machine learning finding RCT’s pic.twitter.com/IxTMv6XIyZ
— Sue Jennings (@libsue28) February 28, 2018

#JID_2018 put through the crowd process .. pic.twitter.com/hksvWgKnkr
— Sue Jennings (@libsue28) February 28, 2018

Machine Learning for Identifying Randomized Controlled Trials: an evaluation and practitioner’s guide

February 22, 2018

Our new paper, just published in the journal Research Synthesis Methods, has evaluated machine learning (ML) for identification of RCTs, and has shown that ML works better than traditional database search filters.

To help get this into practice, we have released open source software (available here) that implements our algorithm. Our software takes a standard database search result (in RIS format) and filters it down to the RCTs with very high accuracy.
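As a rough illustration of that workflow (parse an RIS export, score each record, keep the likely RCTs), here is a minimal Python sketch. It is not the released software: `toy_rct_score` is a hypothetical keyword-based stand-in for the trained classifier, the `threshold` value is arbitrary, and the parser only handles simple one-line RIS fields.

```python
import re

# An RIS line is a two-character tag, two spaces, a hyphen, then the value.
RIS_LINE = re.compile(r"^([A-Z][A-Z0-9])  - ?(.*)$")


def parse_ris(text):
    """Parse RIS text into a list of {tag: [values]} dictionaries."""
    records, current = [], {}
    for line in text.splitlines():
        match = RIS_LINE.match(line)
        if not match:
            continue  # skip blank lines and continuation lines in this sketch
        tag, value = match.group(1), match.group(2).strip()
        if tag == "ER":  # end-of-record marker
            if current:
                records.append(current)
            current = {}
        else:
            current.setdefault(tag, []).append(value)
    return records


def toy_rct_score(record):
    """Hypothetical stand-in for a trained RCT classifier (keyword share)."""
    text = " ".join(record.get("TI", []) + record.get("AB", [])).lower()
    cues = ("randomized", "randomised", "placebo", "double-blind")
    return sum(cue in text for cue in cues) / len(cues)


def filter_rcts(records, threshold=0.25):
    """Keep only records whose score meets the (arbitrary) threshold."""
    return [r for r in records if toy_rct_score(r) >= threshold]


sample = "\n".join([
    "TY  - JOUR",
    "TI  - A randomized placebo-controlled trial of drug X",
    "ER  - ",
    "TY  - JOUR",
    "TI  - A case report of a rare condition",
    "ER  - ",
])

parsed = parse_ris(sample)
likely_rcts = filter_rcts(parsed)
```

In practice the scoring step would be replaced by the validated model; the surrounding parse-and-filter structure is the part this sketch is meant to show.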

Probably not all our users are keen to run Python code from scratch. We are therefore keen to get our RobotReviewer RCT classifier in as many databases as possible, so you could use the algorithm at the click of a button. Already, our RCT classifications are in the TRIP database, and hopefully more will follow.

We will keep this page updated with new ways of using our RCT classifier as they come live, and we're working on some new posts about why and how to use machine learning in practice — more soon!

We're grateful to the Cochrane Crowd volunteers and to the McMaster HERU team for sharing their data with us, which allowed us to build and validate the tool.

RobotReviewer at ICASR

October 20, 2017

The RobotReviewer team (Iain Marshall, Byron Wallace, and Frank Soboczenski, together with many of our collaborators) attended the Third International Collaboration for the Automation of Systematic Reviews (ICASR) at Friends House, London.

Iain Marshall gave a talk on: "State of the science for data extraction: how should different systems be used and evaluated".

RobotReviewer at the Global Evidence Summit #GESummit17

September 13, 2017

RobotReviewer (Iain and Joël) is at the Global Evidence Summit in Cape Town!

We are looking forward to our workshop this afternoon (which is fully booked).

It would also be great to see you at our talk, or poster session, or around the summit :)

Thursday 12:30-14:00: Poster presentation (number 2040)

Saturday 14:00-15:30: Presentation in Long oral session 25: Tools for evidence production and synthesis (Room 1.44)

RobotReviewer demo — links for ACL 2017 paper

February 24, 2017

Welcome to RobotReviewer! We have collected some links here associated with our demo at ACL 2017. RobotReviewer takes clinical trial reports (journal articles in PDF format), and automatically produces evidence summaries.

Try out the demonstration site

You may try out the system here if you have some clinical trial PDFs.

Example reports

Alternately, we have saved some example reports (using open access clinical trial PDFs) on a few clinical topics, which can be accessed from the following links:

Source code

Our full system is open source, released under the GPL v3.0, and is available on our GitHub page.

Demonstration video

Finally, we give a demonstration of the use of RobotReviewer, and its key features in the short video below.

Come work with us on RobotReviewer

January 11, 2017

We're delighted to announce that we're recruiting a Research Associate to work with us on RobotReviewer.

This is a two-year post based at King's College London for someone with a background in health sciences or computer science.

Full details and the application form can be found on the King's College London jobs site.

Introducing the new version of RobotReviewer, first steps to auto-synthesis!

October 21, 2016

The new version of RobotReviewer has been live on our website for a couple of months now, but we've just released all the code as open source, which is a good opportunity to write a post!

The goal of the RobotReviewer project is to speed up evidence-based medicine by automating, or semi-automating, the extraction of data from clinical trial reports.

To date we can automatically extract information about bias (using the Cochrane Risk of Bias tool), and additionally can extract information about the participants, interventions, and outcomes studied in a trial. We are actively researching and adding new machine learning models to the system, and aim to extract all the pieces of information from a clinical trial needed to produce a systematic review.

The first version of RobotReviewer processed one paper at a time. The most important change is in how RobotReviewer can process multiple trial PDFs at once. This will allow us in future to move beyond summarising individual trials, and towards synthesis.

The first screen of the new RobotReviewer interface. Simply drag a bunch of PDFs of clinical trial reports to the dashed box to start the process.

Then comes our new Report View, which is the main user interface. We've taken small steps towards synthesising the results from individual studies, with the Risk of Bias table being one example.

Automatic Risk of Bias tables! Note that in the top right we now have download buttons; you can download the report in HTML, MS Word, or JSON formats.

In the background, RobotReviewer now automatically recognises the identity of an uploaded trial, and is able to retrieve relevant related information from PubMed and ICTRP. For the moment, we simply use this to make nicely formatted citations, but clearly lots more is possible.

Further down the report, characteristics of the study population, interventions, and outcomes are extracted, plus the justifications for the judgements in the Risk of Bias table.

One key feature is that we retain links to the source PDF. One important problem in research synthesis is that the provenance of data is often not clear. By contrast, by clicking any of the PDF icons next to the extracted data, RobotReviewer will take you to the exact place in the PDF where the data came from.

The source PDF with the text describing the trial population highlighted in green.

We hope you enjoyed this tour of our new RobotReviewer! We're excited by the possibilities our new framework gives us, and will post updates as they happen.

Our code is available here on GitHub.

RobotReviewer + Trip

July 29, 2016

Automatic bias assessment for articles in the Trip Database

We are excited to announce that the Trip Database now uses RobotReviewer to automatically identify trials that are likely to have the lowest biases in conduct and reporting. Bias, when present, leads to over- or under-estimates of true intervention effects, in turn complicating treatment choice. Bias can be introduced in several ways in the context of clinical trials (e.g., improper blinding or poor randomization); these domains have been codified in the Cochrane risk of bias tool. RobotReviewer predicts the risk of bias for each of these.

RobotReviewer's risk of bias estimates are based on the words and short phrases used in the titles and abstracts of papers. RobotReviewer has ingested thousands of such articles and identified statistical correlations between word use and manually assessed risks of bias. This general approach is known as “machine learning”.
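To give a feel for what "correlations between word use and manually assessed risks of bias" can look like, here is a toy Python sketch. It is an illustration only, not the RobotReviewer model: it assumes a tiny invented training set, learns a smoothed log-odds weight per word, and sums those weights to score new text (positive scores lean towards low risk of bias). The function names and example phrases are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately simplistic)."""
    return text.lower().split()


def fit_word_weights(examples):
    """Learn per-word log-odds of 'low' vs 'high' risk with add-one smoothing.

    examples: list of (text, label) pairs, where label is 'low' or 'high'.
    """
    counts = {"low": Counter(), "high": Counter()}
    for text, label in examples:
        counts[label].update(tokenize(text))
    vocab = set(counts["low"]) | set(counts["high"])
    totals = {label: sum(c.values()) + len(vocab) for label, c in counts.items()}
    return {
        word: math.log((counts["low"][word] + 1) / totals["low"])
        - math.log((counts["high"][word] + 1) / totals["high"])
        for word in vocab
    }


def score(text, weights):
    """Sum the weights of known words; unknown words contribute nothing."""
    return sum(weights.get(token, 0.0) for token in tokenize(text))


# Invented training snippets, for illustration only.
train = [
    ("computer generated randomization double blind", "low"),
    ("double blind placebo controlled", "low"),
    ("open label study", "high"),
    ("patients allocated by alternation open label", "high"),
]

weights = fit_word_weights(train)
blinded = score("double blind trial", weights)
unblinded = score("open label trial", weights)
```

Here words such as "double" and "blind" end up with positive weights and "open" and "label" with negative ones, which is the kind of word-level signal the paragraph above describes.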

The full RobotReviewer system (shown above) performs a risk of bias assessment using the Cochrane Risk of Bias tools. In general, RobotReviewer accepts as input the full text of clinical trial reports (as PDFs). For the version deployed within Trip, RobotReviewer performs a more limited assessment based on the title and abstract only. Moreover, in this case bias predictions are limited to the *Random sequence generation, Allocation concealment, and Blinding* domains.

How does RobotReviewer work?

RobotReviewer has ‘learned’ how to assess bias by examining the titles and abstracts of tens of thousands of articles describing clinical trials that are also included in the Cochrane Database of Systematic Reviews. These trials have all been manually assessed for bias by systematic review authors, using the Cochrane Risk of Bias tool. As mentioned above, this tool is used to assess various common biases in Randomized Controlled Trials, including whether a trial used robust randomization, and whether participants were adequately blinded to which intervention they received. Specifically, the version of RobotReviewer used in Trip focuses on biases in random sequence generation, allocation concealment, and blinding.

While the predictions are reasonably accurate, we caution that they are not perfect, especially because they rely only on the titles and abstracts of articles (rather than full-texts). Nonetheless, in the context of Trip searches, the predictions can help quickly identify what is likely to be the highest quality research for a given search topic. If you require very high accuracy bias assessments (for example if you are conducting a systematic review) you will of course need to consult the full text of the paper. You may wish to use the full RobotReviewer tool, which provides detailed information on all the domains of the Cochrane Risk of Bias tool from full text papers and can aid in your assessments.

For more information on the algorithm used in RobotReviewer, please see the references below.

[1] Iain J. Marshall, Joël Kuiper, and Byron C. Wallace. "RobotReviewer: evaluation of a system for automatically assessing bias in clinical trials." Journal of the American Medical Informatics Association (2015): ocv044.

[2] Iain J. Marshall, Joël Kuiper, and Byron C. Wallace. "Automating risk of bias assessment for clinical trials." IEEE Journal of Biomedical and Health Informatics 19.4 (2015): 1406-1412.