Hello fans of evidence-based medicine and open data! We’d like to officially announce that the technical work for OpenTrials is underway. Read on for an overview of what we are doing, how we are doing it, and the current roadmap for the first half of 2016. If you have any questions or comments, do not hesitate to ask on Twitter via @opentrials or by email at [email protected].

Technical team

Technical work on OpenTrials is conducted by Open Knowledge International. While different people may come into the project at different times, the current technical team is:

Feel free to reach out to any of us at OpenTrials on GitHub.

Contributing

As with all Open Knowledge International projects, we welcome and encourage contribution. For the technical work on OpenTrials, contributions can mean any or all of code, documentation, testing, etc. See the OpenTrials issue tracker for interesting tasks to take up.

Overview

OpenTrials aims to provide a comprehensive picture of the data and documents on all clinical trials conducted on medicines and other treatments. The platform will present data aggregated from a wide variety of existing sources, starting with clinical trial registers and moving on to academic journals, systematic reviews and other data sources. See Ben Goldacre’s video for greater context. Here, we are focused on the technical aspects of the OpenTrials platform.

Architecture

Let’s start with a look at the general architecture of the OpenTrials platform. This is a high-level overview, describing the general data flow, and the relation between different components.

[Diagram: OpenTrials architecture]

OpenTrials will be implemented as a set of loosely coupled components, from data acquisition through to user-facing applications:

Of course, there are many details inside each component as described in the above architecture diagram. We plan to blog regularly as we develop the platform, and give deeper insight into each component and our strategies.

Data model

We are not setting out to design a perfectly formed vocabulary around trial data. Rather, we accept that the data itself is messy, inconsistent, divergent and non-standard, and we set out to increase the value of this data by threading it together based on a range of matching techniques, and by extracting a set of relations between various entities that are manifest in the data itself.
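To make the idea of threading records together more concrete, here is a minimal sketch of one possible matching technique: grouping records that share at least one identifier, using a union-find structure. This is an illustration only, not a description of how OpenTrials actually matches records. The record shape, the `thread_records` function, the source names and the identifiers are all hypothetical.

```python
from collections import defaultdict


def thread_records(records):
    """Group records that share at least one identifier (union-find)."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent links to the root, shortening the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            # Attach the larger root index to the smaller one.
            parent[max(root_i, root_j)] = min(root_i, root_j)

    # Link each record to the first record seen with the same identifier.
    first_seen = {}
    for index, record in enumerate(records):
        for identifier in record["identifiers"]:
            key = identifier.strip().upper()  # tolerate case and whitespace noise
            if key in first_seen:
                union(index, first_seen[key])
            else:
                first_seen[key] = index

    # Collect the source names of every record in each group.
    groups = defaultdict(list)
    for index, record in enumerate(records):
        groups[find(index)].append(record["source"])
    return sorted(groups.values())


# Hypothetical records: sources and identifiers are made up for illustration.
records = [
    {"source": "register_a", "identifiers": ["AAA-0001"]},
    {"source": "register_b", "identifiers": ["aaa-0001", "BBB-0002"]},
    {"source": "journal_c", "identifiers": ["BBB-0002"]},
    {"source": "register_d", "identifiers": ["AAA-0009"]},
]

print(thread_records(records))
```

In this sketch the first three records end up in one group even though the first and third share no identifier directly, because the second record links them. Real data would likely need fuzzier matching (titles, dates, sponsors) on top of exact identifier matches.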

[Diagram: OpenTrials data model]

The above diagram centers around our “ideal” representation of a trial, which is derived from various sources of data on a trial, starting with the trial records published on clinical trial registers. This “ideal” representation has a minimal set of core, pre-defined fields based on the WHO Data Set, and a less structured set of associated data making up the graph of everything the OpenTrials platform knows on a given trial.
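As a rough sketch of that shape, the following models a trial with a small set of pre-defined core fields plus a loosely structured collection of source records. The class names, field names and the "first value wins" rule are assumptions made for illustration; the field names are only loosely modeled on items in the WHO Data Set and are not the actual OpenTrials schema.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class SourceRecord:
    """A raw record about a trial, kept as-is from its source."""
    source: str
    data: Dict[str, Any]


@dataclass
class Trial:
    # Core, pre-defined fields (hypothetical names).
    primary_id: str
    public_title: str
    recruitment_status: Optional[str] = None
    target_sample_size: Optional[int] = None
    # Less structured associated data: every record known about this trial.
    records: List[SourceRecord] = field(default_factory=list)

    def attach(self, record: SourceRecord) -> None:
        """Keep the raw record and fill any core field that is still empty."""
        self.records.append(record)
        for name in ("recruitment_status", "target_sample_size"):
            if getattr(self, name) is None and name in record.data:
                setattr(self, name, record.data[name])


# Hypothetical usage with made-up values.
trial = Trial(primary_id="EXAMPLE-0001", public_title="An example trial")
trial.attach(SourceRecord("register_a", {"recruitment_status": "recruiting",
                                         "sponsor": "Example Sponsor"}))
trial.attach(SourceRecord("register_b", {"recruitment_status": "completed",
                                         "target_sample_size": 120}))

print(trial.recruitment_status, trial.target_sample_size, len(trial.records))
```

Here the core fields stay small and predictable, while conflicting or extra values (such as the second register's status, or the sponsor) remain available in the attached records rather than being discarded.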

Technology stack

OpenTrials will be written in JavaScript and Python, which are the core languages used at Open Knowledge International and the most common languages used in the open data sector.

The majority of web-facing code will be in JavaScript, using Node.js for servers, and either React or Angular for clients. The OpenTrials API will be a Node.js server implementing an OpenAPI-compatible HTTP API exposing data in JSON.
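To give a feel for what an OpenAPI-compatible description looks like, here is a minimal sketch of a machine-readable description for a single endpoint, built as a plain Python dictionary and serialized to JSON. The `/trials/{id}` path, the titles and the helper `list_operations` are hypothetical, and the use of the 2.0 version of the specification format is an assumption; none of this is the actual OpenTrials API.

```python
import json

# Hypothetical API description in the Swagger/OpenAPI 2.0 format (assumed).
spec = {
    "swagger": "2.0",
    "info": {"title": "Example trials API (hypothetical)", "version": "0.1.0"},
    "produces": ["application/json"],
    "paths": {
        "/trials/{id}": {
            "get": {
                "summary": "Fetch one trial by identifier",
                "parameters": [
                    {"name": "id", "in": "path", "required": True, "type": "string"}
                ],
                "responses": {
                    "200": {"description": "The trial as a JSON object"},
                    "404": {"description": "No trial with that identifier"},
                },
            }
        }
    },
}


def list_operations(description):
    """Return (METHOD, path) pairs for every operation in the description."""
    return [
        (method.upper(), path)
        for path, operations in description["paths"].items()
        for method in operations
    ]


print(json.dumps(spec, indent=2))
print(list_operations(spec))
```

Because the description is plain data, clients in any language can read it to discover endpoints, parameters and response types without consulting the server code.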

Significant portions of the platform are not web facing, and are concerned with data acquisition and processing. The majority of this code will be in Python, leveraging the extensive ecosystem of data processing tooling it offers.

For databases, OpenTrials will use Elasticsearch and PostgreSQL. Both of these solutions have been chosen based on previous experience, and the flexibility that each offering brings to data storage (Elasticsearch is much more than “just” a search backend, and PostgreSQL is much more than “just” an SQL backend).

3rd party integrations

We are working towards a number of 3rd party integrations for OpenTrials, and we expect these types of integrations to increase over 2016. Some of the first integrations include:

Glossary

We use several terms in the roadmap to describe the various components of the OpenTrials platform. For ease of understanding, here’s a short glossary explaining these terms in this context.

Roadmap

Here we’ll present a high-level view of the shape and development of the OpenTrials platform. We practice agile development at Open Knowledge International, so do not think of this roadmap as a strict plan of action, but rather as a document that reflects our current thinking and estimates, and is subject to change.

February – March 2016

Goals

Contact [email protected] if you have a particular interest in any of the efforts highlighted above, and would like to contribute!

Outcomes

April – May 2016

Goals

Contact [email protected] if you have a particular interest in any of the efforts highlighted above, and would like to contribute!

Outcomes

June – July 2016

Goals

Contact [email protected] if you have a particular interest in any of the efforts highlighted above, and would like to contribute!

Outcomes

Onwards

Development of OpenTrials will continue throughout 2016 with the broad goals of expanding the database and exposing interfaces for crowdsourcing mechanisms to contribute new data and clean existing data. As the year progresses, we can solidify the roadmap for the second half of 2016 based on actual development status, our improved understanding of user needs, and new opportunities around data.

Goals

Contact [email protected] if you have a particular interest in any of the efforts highlighted above, and would like to contribute!

Outcomes
