Clinical evaluation of digital therapeutics

Alice Ravizza¹, Oriana Ciani², Antonio Ferrari³
Sabrina Grigolo⁴, Francesco Perrone⁵, Elisabetta Ravot⁶
Eugenio Santoro⁷, Federico Sternini¹
¹Use-Me-D, Torino
²CERGAS, SDA Bocconi School of Management, Milano
³IQVIA Ltd, Durham, NC, United States
⁴European Patients’ Academy / EUPATI Non-Profit Organization
⁵Clinical Trials Unit, IRCCS National Cancer Institute “Fondazione Pascale”, Napoli
⁶Healthware Group, Milano
⁷Laboratory of Medical Informatics, Mario Negri Institute for Pharmacological Research, Milano
Tendenze nuove, Numero Speciale 4 2021; 71-104: DOI: 10.32032/TENDENZENUOVENS04202105.PDF

1. Evaluation of clinical benefit

Since software delivering digital therapeutics (DTx) is classified as a medical device, in Europe it must comply with Regulation (EU) 2017/745.

An important difference between the obsolescent Medical Device Directive (MDD) and the Medical Device Regulation, which came fully into effect in May 2021, is the detailed requirement for evaluation of performance and clinical benefit data. Evaluation of the software’s efficacy must be based on clinical data and clinical evidence. This means that it is no longer enough to demonstrate that the software works: there must also be documentation of statistically significant efficacy in providing the clinical benefit for which it was designed. In many cases, it will not be possible to collect sufficient clinical evidence merely on the basis of the literature, and adequate clinical investigation will therefore be necessary.

Developers can refer to ISO 14155 (“Clinical investigation of medical devices for human subjects – Good Clinical Practice”), which details methods for planning and monitoring of clinical investigation.

Manufacturers and regulatory authorities can also consult MDCG 2020-1 “Guidance on Clinical Evaluation (MDR)/Performance Evaluation (IVDR) of Medical Device Software”, dated March 2020, a source of detailed indications regarding how to plan the various aspects of collecting clinical evidence and how to draft reports for regulatory purposes.

The guidance is particularly appropriate when dealing with diagnosis and monitoring software, whose expected benefit is the availability of accurate information. Regarding clinical evaluation of DTx, we find that the document provides some useful information but should go into greater detail regarding specific modalities.

First and foremost, the definition of clinical benefit is confirmed (Article 2.53 of MDR) as follows: “[means] the positive impact of a device on the health of an individual, expressed in terms of a meaningful, measurable, patient-relevant clinical outcome(s), including outcome(s) related to diagnosis, or a positive impact on patient management or public health.”

The MDR (Article 2.51) also clarifies the definition of clinical evidence (“clinical data and clinical evaluation results pertaining to a device of a sufficient amount and quality to allow a qualified assessment of whether the device is safe and achieves the intended clinical benefit(s), when used as intended by the manufacturer”).

Another key concept reiterated in the MDCG 2020-1 guidance is that of “valid clinical association”, meaning “the extent to which the (software) output is associated with the targeted physiological state or clinical condition”. This association must demonstrate a robust connection between the scientific principles underlying the software’s use and the expected benefit.

Also important in MDCG 2020-1 is the definition of generalizability (“the ability of medical device software – MDSW to extend the intended performance tested on a specified set of data to the broader intended population”). The concept of generalizability should be taken into account whether the intention is to create a database for the software’s technical and preclinical validation, or to define inclusion and exclusion criteria for clinical investigation – and, more generally, when discussing implementation of a device outside its clinical investigation setting. In terms of planning, in most cases the clinical development of DTx includes an exploratory initial phase (pilot studies), followed by an investigation phase to produce evidence of efficacy and safety as required for approval in the specific therapeutic indication (pivotal studies for CE certification).

The characteristics of pilot studies and pivotal studies for DTx tend to reflect the pattern typically adopted in investigation of traditional therapeutic interventions (e.g., pharmacological therapy or behavioural therapy); however, the very nature of a DTx intervention may require revision or adaptation of the standard investigation models, in particular with regard to choice of endpoints, indicators of efficacy and control procedures, not ruling out the possibility of creating/validating outcome indicators (clinical, health technology assessment – HTA, etc.) specifically suited to these types of product. One of the main characteristics of DTx is the active participation of the patient/caregiver in the treatment pathway. Based inter alia on indications which have given evidence of benefit, the planning of studies to produce evidence in relation to DTx should therefore give close attention to selection and motivation of patient/caregiver, with representatives of both these groups ideally involved from the very outset in definition of the study project⁽¹,²⁾.

2. Overview of existing clinical trials

The aim of this section is to provide an overview of studies designed for development of some medical devices identified as DTx.

To provide a consistent analysis, various aspects have been extrapolated, such as the presence of gold standard comparators, the population from which subjects are recruited, the type of intervention and control, and the definition of primary outcomes.

2.1 Advanced phase products

This section analyses studies that have provided clinical evidence of efficacy for some products in an advanced phase of development and/or already on the market.

The following is a list of studies whose results have been used by manufacturers to demonstrate the efficacy of the solution under investigation, as reported by scientific publications in peer-reviewed journals or registration dossiers on the site clinicaltrials.gov (Table 1).

The table presents the state of the art as of 11 April 2021. RCT = Randomized controlled trial; TAU = Therapy as usual (Standard therapy); CRA = Community reinforcement approach; CM = Contingency management; ADHD = Attention deficit hyperactivity disorder; PRO = Patient reported outcome; COPD = Chronic obstructive pulmonary disease; IBS = Irritable bowel syndrome; SPD = Sensory processing dysfunction.

Many of the products in an advanced phase of development have been tested in randomized controlled trials (RCTs), with at least 200 patients evaluated. Primary outcomes include physiological parameters, patient behaviour, or results of validated questionnaires. For the control group, there is a preference for therapy as usual or for administration of a digital placebo, to all intents and purposes analogous to the concept of placebo as usually understood in the RCT setting. A digital placebo is a digital solution comparable with the DTx being tested, either in terms of content (the placebo carries all the information provided by the DTx, but using static rather than interactive interfaces) or in terms of graphic presentation (it presents the same interfaces and introduces the same user routine, but without the elements responsible for clinical benefit).

2.2 Development phase products

There are many examples of DTx currently in the development phase. The following is a list of these, in both completed and ongoing studies, and as registered on the site clinicaltrials.gov (Table 2).

3. Characteristics of pilot studies

A pilot study or feasibility study is generally a small-scale study, useful in testing whether a project is adequate, assessing its feasibility or obtaining information on the basis of which to determine sample size for more extensive (pivotal) investigation⁽⁷⁵⁾.

In the case of medical software, a pilot study can be relevant for:

•demonstration of usability, for the interface or for associated hardware devices (e.g., wearable sensors);

•creation of a training database for artificial intelligence;

•definition, for subsequent studies, of the population to be evaluated (including sample determination), of key study design features such as recruitment, randomization and blinding modalities, as well as primary aim and secondary and/or exploratory endpoints;

•demonstration and quantification of the valid clinical association between outputs delivered by the software and clinical benefit.

Almost all pilot studies are non-randomized and have no control group. In addition, they do not always include efficacy endpoints, while they often focus on safety endpoints.

The following is a brief presentation of two illustrative cases, differing in size and study characteristics.

3.1. Treatment for smoking cessation (Clickotine)

This is an illustrative case of an initial study, conducted on a broad population of more than 400 subjects, to evaluate usability, efficacy and safety outcomes⁽⁷⁾.

Clickotine is an app that can be used on a smartphone, designed and developed to deliver the essential elements of the United States programme for smoking cessation (US Clinical Practice Guidelines) – e.g., advice and encouragement regarding smoking cessation, assessment of the will to stop smoking, motivational stimuli, support for planning, links with other interventions (counselling, pharmacological treatment, social support, dedicated phone line), follow-up.

Results of a single-arm 8-week initial study were published in 2017. The outcomes measured were the subjects’ engagement (number of times the app was opened, number of interactions with the programme, active weeks in the programme), smoking cessation efficacy (7 and 30 non-smoking days reported by the patient) and safety (adverse events spontaneously reported and actively investigated by questionnaire, after completion of the 8-week programme). Data were statistically analysed; in addition, post hoc analyses were carried out, given the non-normal distribution of the most predictive variables or dichotomous outcome variables. Overall, the study produced encouraging engagement data: on average, each participant opened the app more than 100 times during the 8 weeks of the study (mean = 110.6, median = 69); participants had numerous interactions with the programme (mean = 214.4, median = 178) and remained engaged for a mean of 5.3 weeks (median = 5 weeks). In addition, at the end of 8 weeks, 45.2% of the intention-to-treat sample reported 7 non-smoking days and 26.2% reported 30 smoke-free days. The few adverse events reported were consistent with nicotine withdrawal symptoms, while no safety events occurred that were specifically attributable to the product.

Figure 1 shows the study flowchart (see clinicaltrials.gov: NCT02656745).

3.2. Intervention for anxiety management (Unwinding Anxiety)

This is an illustrative case of an RCT in comparison with habitual therapy, carried out on a sample of about 60 subjects (a common number in pilot studies), with evaluation of outcomes both for engagement/acceptability and for efficacy.

Unwinding Anxiety is a behavioural intervention programme, usable on a smartphone or tablet, to help the patient manage anxiety, including generalized anxiety disorder, panic attacks and social anxiety. The programme includes video lessons, exercises, weekly contact with experts, diary entries (journaling), participation in a community moderated by experts, and various daily assessments.

Pilot testing of Unwinding Anxiety comprised an RCT against habitual therapy alone, in subjects with generalized anxiety disorder, to evaluate their engagement and the programme’s acceptability, measure its effects in association with habitual treatment versus habitual treatment alone, and carry out preliminary evaluation of the mechanism of action. The study involved 65 adult patients (of whom 57 proved eligible for evaluation). Inclusion criteria included possession of a smartphone and a GAD-7 (Generalized Anxiety Disorder-7) score of 10 or higher.

Primary outcome measures were the number of modules completed (to evaluate engagement) and changes in Penn State Worry Questionnaire score (PSWQ). Other outcome measures included use of validated questionnaires for evaluation of mindfulness, interoception, anxiety level (GAD-7) and acceptability of the programme for users (Net Promoter Score – NPS). The investigators intend to use the results of this phase to design a broader phase 2 study. (See clinicaltrials.gov: NCT03683472).

It is interesting to note that, in an independent study by Australian investigators, Unwinding Anxiety proved to be one of the only two apps (out of 348 health and wellness apps assessed) to obtain a score of 4/5 (the highest score obtained). The study looked at multiple parameters in terms of the programme’s operation and its potential to modify behaviour⁽⁷⁶⁾.

In addition, a recent pilot study investigated the effects of Unwinding Anxiety in a sample of United States doctors with anxiety (n = 34), demonstrating 48% reduction of anxiety (GAD-7) after 28 modules (1 month), and 57% reduction after 3 months (p <0.001 in both cases)⁽⁷⁷⁾.

4. Characteristics of pivotal studies

A pivotal study is a study specifically designed to enhance data collected during development, and thus provide adequate clinical evidence of efficacy and safety for evaluation by a regulatory agency. In all cases in which evidence in the literature is insufficient for purposes of CE certification, the manufacturer must carry out a pivotal study.

Generally, proof of efficacy for DTx should come from high-quality clinical trials that are robust and compliant with state-of-the-art methodological standards for evidence-based medicine. A useful reference is the extension of Consolidated Standards of Reporting Trials (CONSORT) to include web-based and m-health interventions⁽⁷⁸⁾. In addition, it should be demonstrated that the study tested the software for all foreseen uses and user groups, respecting the target population, conditions of use and usage environment.

Study design must enable collection of statistically significant evidence, involving sufficiently large samples, whose size must be defined beforehand on the basis of a statistical hypothesis. Study design must also enable documentation of the device’s possible place in therapy, defining the appropriate control group accordingly. Since the characteristics of medical software often make blinded study design very complicated, choice of endpoints and metrics is essential to a study’s reliability and methodological consistency.
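As a minimal sketch of what defining sample size beforehand on the basis of a statistical hypothesis can look like, the following Python function applies the standard normal-approximation formula for comparing two proportions with equal allocation (for example, responder rates in the DTx and control arms). The function name, the response rates, the significance level and the power are illustrative assumptions, not values taken from any study discussed in this paper.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_proportions(p_control, p_dtx, alpha=0.05, power=0.80):
    """Patients per arm needed to detect p_control vs p_dtx (two-sided test).

    Normal-approximation formula with equal allocation; every input is a
    planning assumption chosen by the study team.
    """
    if p_control == p_dtx:
        raise ValueError("The two response rates must differ.")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for target power
    variance = p_control * (1 - p_control) + p_dtx * (1 - p_dtx)
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_dtx) ** 2
    return ceil(n)


# Hypothetical planning values: 30% response on control, 50% on DTx.
n_per_arm = sample_size_two_proportions(p_control=0.30, p_dtx=0.50)
print(f"Patients per arm: {n_per_arm}")
```

A smaller expected difference between arms, a stricter significance level or a higher target power all increase the required sample, which is why the hypothesis has to be fixed in the protocol before recruitment starts.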

Multicentre studies are recommended, particularly for pivotal studies. In addition to enabling collection of data from a number of hospitals/geographical areas, and thus offering evidence that is more representative of the potential target for DTx, this approach should also offer the advantage of reducing the lead time for recruitment – a particularly important consideration with an eye to the risk of the proposed technology becoming outdated.

In Europe, valuable information for investigators can be found in the NICE “Evidence standards for digital health technologies”⁽⁷⁹⁾, issued in March 2019 by the National Institute for Health and Care Excellence, UK.

Clinical evidence requirements set out by NICE vary according to the type of intervention the device carries out, according to whether its aim is preventive or therapeutic. In the latter case, requirements are understandably more stringent.

Software is classified along the lines presented in the NICE guidance, which places DTx evidence requirements within tier 3, subdivided in turn into tiers 3a and 3b on the following basis:

•If the software is intended to modify the subject’s behaviour, being used for preventive behaviour change or to allow self-management of a diagnosed condition, tier 3a evidence standards apply;

•If the software is used for treatment, being designed to provide or guide treatment, active monitoring and clinical calculations, or to provide or guide a diagnosis, tier 3b evidence standards apply.

The standards are cumulative. This means that, while each tier comprises specific evidence of effectiveness requirements (see Tables 3 and 4), it also requires full compliance with the evidence standards of lower tiers. Thus, tier 3 DTx must also meet the tier 1 and 2 standards.

Hence the importance of analysing requirements for prevention devices. The NICE evidence standards include minimum and recommended levels of scientific evidence, the latter being defined as “best practices”.

It can be seen that studies on products in advanced phase development (Table 1) all fall within the tier 3b evidence for effectiveness standards, while pilot studies can have greater flexibility in design. Three illustrative cases of pivotal studies on DTx in advanced phase development are presented below. All three studies showed statistically significant benefit for DTx (versus the control treatment) in terms of primary endpoints, in many cases related to clinical efficacy or to prevention of the disease concerned. It should be borne in mind that both therapeutic and preventive aims are consistent with common definitions of DTx – e.g., that of the Digital Medicine Society/Digital Therapeutics Alliance.

4.1. DTx for paediatric attention deficit hyperactivity disorder (ADHD) (Akili, AKL-T01)

This is an illustrative case of a pivotal study for DTx used in monotherapy, controlled versus active digital intervention.

AKL-T01 is a digital treatment based on a video game interface, developed for treatment of paediatric ADHD. The software delivers motor and sensory stimuli designed to reach and activate the prefrontal cortex (the brain area involved in cognitive functions). Clinical development comprised a number of studies, up to completion of a multicentre pivotal study (20 centres) that was randomized, double-blind and controlled versus active intervention. The study involved 348 children aged 8-12 years with a confirmed diagnosis of ADHD. DTx was used in monotherapy (the children were not taking pharmacological treatment, or were able to suspend any prior pharmacological treatment), the control intervention being a different video game (AKL-T09); the primary endpoint was improvement in the Test of Variables of Attention (TOVA) Attention Performance Index, measured at baseline and then at the end of 4 weeks of therapy (about 25 minutes per day, 5 days per week)⁽⁵⁾.

This study is important, as one of the first and most rigorous investigations of DTx in a large prospective RCT, with an approach similar to that usually used for pharmacological treatment. Figure 2 shows the study design flowchart (see the site clinicaltrials.gov: NCT02674633).

4.2. DTx for treatment of substance abuse (reSET)

This is an illustrative case of a pivotal study for DTx used in association with a modified habitual therapy, the control being the habitual therapy alone.

reSET is a digital intervention for treatment of substance abuse. It is a 90-day cognitive behavioural therapy (CBT), to be administered as a partial replacement of habitual therapy (part of which is substituted with reSET), and is complementary to normal contingency management. This solution enables integration of a dashboard for doctors and other healthcare operators, with information on the patients’ use of reSET, substance use (both as reported by the patient and based on laboratory data), and other information provided by patients.

The pivotal, randomized, controlled multicentre study (10 centres) initially involved 1,781 adult subjects, 507 of whom were randomized to receive only habitual therapy or reSET in combination with a programme of modified (reduced) habitual therapy. The reSET component could be run by the participants on computers at the study centre, or managed autonomously by them outside the centre. Preset primary outcomes were abstinence from heavy drugs or alcohol in the last 4 weeks (self-reported or based on urine analysis)⁽³⁾.

The study showed that likelihood of abstinence increased in a statistically significant manner for subjects using stimulants. In particular, patients treated with reSET showed a lower dropout rate from treatment than in the case of habitual treatment (Hazard Ratio = 0.72 [95% CI: 0.57-0.92], p = 0.010), as well as an increase in the period of abstinence (Odds Ratio = 1.62 [95% CI: 1.12-2.35], p = 0.010); the effect was more marked in patients who, at the time of pre-recruitment screening, tested positive for substances in a urine sample and/or for alcohol in a breathalyser (n = 228) (Odds Ratio = 2.18 [95% CI: 1.30-3.68], p = 0.003).
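For readers who wish to check how a reported ratio, its confidence interval and its p-value relate to each other, the following Python sketch back-calculates an approximate two-sided p-value from a ratio and its 95% confidence bounds. It assumes a Wald-type interval that is symmetric on the log scale; this is an assumption made for illustration, since the statistical model actually used in the study is not described here, and the function name is hypothetical.

```python
from math import log
from statistics import NormalDist


def p_value_from_ratio_ci(ratio, ci_low, ci_high, level=0.95):
    """Approximate two-sided p-value for a hazard or odds ratio.

    Assumes the interval is symmetric on the log scale (Wald interval),
    so the standard error can be recovered from the width of the CI.
    """
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    se_log = (log(ci_high) - log(ci_low)) / (2 * z_crit)  # SE of log(ratio)
    z = log(ratio) / se_log                               # Wald statistic
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Figures quoted in the text for the reSET pivotal study.
for label, ratio, low, high in [
    ("Dropout (hazard ratio)", 0.72, 0.57, 0.92),
    ("Abstinence (odds ratio)", 1.62, 1.12, 2.35),
    ("Abstinence, positive at screening (odds ratio)", 2.18, 1.30, 3.68),
]:
    print(f"{label}: p is approximately {p_value_from_ratio_ci(ratio, low, high):.3f}")
```

Small differences from the published p-values are to be expected, because the published bounds are rounded to two decimals and the original analysis may not have used a Wald interval.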

Figure 3 shows the flow chart for study enrolment and design (see clinicaltrials.gov: NCT01104805).

4.3. Digital intervention for diabetes prevention (Noom Coach app)

This is an illustrative case of an RCT for a digital coaching programme with a preventive aim.

Noom Coach app is a virtual coaching intervention delivering a diabetes prevention programme (DPP) through an interactive interface. The app includes messaging functions between coach and participants, group messages, support for behavioural change through daily challenges, educational articles on the DPP, logging of food intake and automatic feedback on dietary choices. After an extensive pilot phase, an RCT was recently completed with 202 adult participants, the primary aim of which was to evaluate the efficacy of the mobile platform Noom Coach DPP by comparison with a habitual medical intervention (participants were free to seek assistance or medical support during the study, and were also given a hard copy version of the DPP)⁽⁶⁾. The primary endpoint was change in body weight as measured at baseline, at 6 months and at the end of the study (total duration, 52 weeks); further objectives included variations in glycosylated haemoglobin levels, in productivity at work (by validated questionnaire), and in quality of life (by the CDC HRQOL-4 questionnaire).

5. Evaluation of the safety profile

Alongside evaluation of benefit, pivotal studies (or RCTs) generally provide information about possible side effects associated with the (new) treatment under evaluation. This measure, together with evaluation of benefits, serves to determine tolerability and thus to evaluate the risk-benefit ratio of the new treatment.

DTx interventions can have undesired effects. While these are generally less serious than those caused by traditional drugs, and often more readily managed, they may be more frequent than in the control arms of the studies concerned and must, in any case, not be underestimated.

For example, a study that demonstrated the efficacy of DTx for smoking cessation showed side effects in two patients (mood swings, nightmares), probably associated with the treatment, and, in a few other cases, fatigue (not associated with the treatment)⁽⁷⁾.

A recent example is Endeavor (previously known by the experimental code AKL-T01), a video game-based DTx for treatment of paediatric ADHD that received FDA market approval in June 2020. The treatment was not associated with any severe adverse events. However, 9.3% of treated patients showed non-severe adverse events associated with the treatment, including a sense of frustration, headache, vertigo, emotional reactions, nausea or aggressiveness⁽⁸⁰⁾.

Adverse events that can be associated with DTx must in any case be measured and evaluated. Not every event, however, is attributable to the DTx intervention. For example, in an RCT to measure the efficacy of a DTx intervention in management of diabetes, there was a low percentage of hypoglycaemic events, hospital admissions and visits to A&E (the rates being wholly comparable to those in the control group), and these adverse events were not related to the study treatment⁽⁸⁾.

As in the case of traditional therapies, careful evaluation of undesired effects that can be associated with DTx is possible thanks to post-marketing studies (phase IV studies). This makes it possible to obtain information on broader and more heterogeneous populations, as well as on long-term safety.

An equally interesting, and perhaps more important, topic is that of dropouts – in other words, those who discontinue the DTx before the end of the trial. In a study to evaluate the efficacy of DTx for treatment of chronic back pain, there was progressive dropout: after 8 weeks of DTx, only 32% of the initial study sample were still following the treatment, while at the end of the study (12 weeks) only 18% were left.

Close attention to DTx design, greater involvement of patients in the development phase and closer attention to user feedback, together with greater user engagement through reminders, emails, SMS and in-app messages, are aspects that should be focused on with a view to reducing the dropout rate as far as possible.
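To illustrate why dropout matters at the planning stage, the short Python sketch below inflates an enrolment target so that a required number of participants are still on treatment at a given time point. The required number of completers (100) is a hypothetical planning value; the retention figures of 32% and 18% are those quoted above for the chronic back pain study, reused here purely as an illustration and under the assumption that retention in a future study would be similar.

```python
from math import ceil


def enrolment_target(completers_needed, expected_retention):
    """Participants to enrol so that `completers_needed` remain on treatment.

    `expected_retention` is the assumed fraction of enrolled participants
    still following the DTx at the time point of interest (0 < value <= 1).
    """
    if not 0 < expected_retention <= 1:
        raise ValueError("expected_retention must be in the interval (0, 1].")
    return ceil(completers_needed / expected_retention)


completers_needed = 100  # hypothetical planning value
for weeks, retention in [(8, 0.32), (12, 0.18)]:
    target = enrolment_target(completers_needed, retention)
    print(f"Retention {retention:.0%} at {weeks} weeks: enrol at least {target}")
```

The calculation shows how quickly the recruitment burden grows as retention falls, which is one practical argument for investing in engagement features before the pivotal study begins.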

6. The role of Institutional Review Boards

To favour a system enabling a significant European contribution to the development of DTx, IRBs must have adequate know-how and specific guidelines, bearing in mind that investigation of DTx can differ significantly from that of a drug or of a traditional medical device (e.g., a joint prosthesis), with which boards will be more familiar.

DTx study design must guarantee protection of sensitive data and allow management of the rapid technological evolution to which these systems are subject.

In this respect, the main critical issues are controlling changes to the product during the study, and data backups for the study. Given the rapid evolution of stand-alone medical device software, there is every likelihood of frequent version updates for the device during the study.

The manufacturer must therefore factor into the study protocol and the product itself a change control procedure not requiring the approval of an IRB or other authority. This will cover updates such as bugfixes, modifications related to usability criteria rather than to clinical aspects, or other changes of a technical nature that the DTx manufacturer is bound to accept as a matter of course (e.g., updates of the Android or iOS platform or of a health app).

7. Post-marketing surveillance

From the very first issue of a CE certificate, the DTx manufacturer must plan collection of post-marketing clinical data. By definition, the studies concerned must be carried out in compliance with indications for use, in a patient population for which the device is already certified: this means that the studies cannot be used to test new clinical applications of the software, the aim being to confirm the risk-benefit ratio and collect information on large-scale use. In this perspective, post-marketing surveillance studies provide an excellent opportunity to collect important information in terms of treatment compliance and user experience.

The MDR encourages use of follow-up study results to provide new information with a view to enhancing and updating software. Hence the relevance of, for example, adequately controlled post-marketing studies to increase the training database for an artificial intelligence system used in healthcare.

The very nature of DTx enables massive post-marketing data collection. Manufacturers should therefore set up systems for real-world clinical data collection, while obviously respecting patients’ data protection.

An excellent example of this practice is Natural Cycles, a DTx application for birth control. Use of this device has enabled data collection in a number of observational and retrospective clinical studies to evaluate its efficacy, both in experimental use and in real-world settings⁽²⁴,⁸¹⁻⁸³⁾. In addition, the data thus obtained have enabled the manufacturer to identify further indications not strictly linked to demonstration of the device’s efficacy, but related to its use, thus making it possible to share important scientific information⁽⁸⁴,⁸⁵⁾.

The manufacturer’s data collection system should also take into account effects on the patient’s health once the treatment has stopped.

8. Maximum acceptable modifications to DTx not requiring a completely new start to clinical validation

As already pointed out, compared with drugs or other medical devices, a peculiar feature of products like DTx is the possibility of rapid technological evolution taking place during the clinical validation phase. This is a particularly sensitive topic, for which clear and well-defined guidelines would be appropriate.

In this respect, attention should be drawn to the interesting proposals drawn up by Torous et al. in their paper “Towards a consensus around standards for smartphone apps and digital mental health”⁽⁸⁶⁾. Stating that they “represent leaders in mHealth research, industry and health care systems from around the globe”, the authors state:

“Changes in technology may mean that app updates need to be re-evaluated for their efficacy. Small cosmetic changes, platform changes and aspect changes do not likely require a retest of an intervention, as long as the therapeutic principle that has been evaluated remains intact. […] However, significant changes, such as adding a new therapeutic principle or substantial changes to that principle, must demonstrate efficacy through the same evaluation pathways as novel therapeutics.

Our recommendations are:

a. newly adapted therapeutic principles, which should be identified and defined, must undergo controlled clinical trials to determine their efficacy and effectiveness;

b. small changes to an app with an evidence base need not undergo another clinical trial, but any major change requires a re-evaluation of app effectiveness.”

On the basis of these general principles, it is fundamental to understand how to distinguish the “significant” or “small” modifications on the basis of which the product’s effects may or may not have to be re-evaluated.

The manufacturer must first define policies to include a definition of how any modifications can impact the device, on the basis of a risk analysis. The following is an example readily applicable to DTx:

a. Major modification

• any variation that can potentially have an impact in terms of safety, efficacy and risk characteristics of the software system – e.g., creation of a new training database with different clinical indicators;

• any variation in intended use, for example:

• use of the system for a different patient population

• integration of the system with a module to enable flagging of a clinical emergency

• integration of the system with a module to manage interaction with wearable devices

• change to the method of interaction with the patient, introducing vocal interaction

• integration of the system with a new, additional parameter that can be measured to provide input for the algorithm

• any change in the main architectural structure, meaning a change to the organization of modules and software elements that will in turn bring changes to the data flow for the intended clinical use, or to segmentation (e.g., transition from a system based on a locally used downloadable app to a mainly cloud-based system)

• any change that could cause incompatibility between the modules or with the interface systems (database or operating system);

b. Minor modification

• any modification not included in the definition of “major”, such as:

• improvement of performance, without modification to intended use or new input types

• update of training or validation database, with data that are comparable in terms of quality and information content to those used in the previous version;

• modifications of access, logistic or management input, without changing intended use, such as:

• new means of user identification (e.g., biometrics rather than password)

• data input by professional user’s voice command;

• update of user interface, new icons

• translations

• update of user manual

• modifications or integrations to non-critical software of unknown provenance (SOUP)

• bots for technical assistance to patients/professional users – e.g., for password change assistance.
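The example policy above can be read as a simple decision rule: a change is major if it meets any one of the major criteria, and minor otherwise. The following Python sketch is purely illustrative. The names `ProposedChange`, `ModificationClass` and `classify`, and the four boolean criteria, are hypothetical and introduced here only to mirror the list above; a real policy would be derived from the manufacturer's own risk analysis.

```python
from dataclasses import dataclass
from enum import Enum


class ModificationClass(Enum):
    MAJOR = "major"
    MINOR = "minor"


@dataclass(frozen=True)
class ProposedChange:
    """Hypothetical record of a proposed software change (illustrative only)."""
    description: str
    # Possible impact on safety, efficacy or risk characteristics,
    # e.g. a new training database with different clinical indicators.
    affects_safety_or_efficacy: bool = False
    # Variation in intended use, e.g. a different patient population
    # or a new input parameter for the algorithm.
    changes_intended_use: bool = False
    # Change in the main architecture, e.g. local app to mainly cloud-based.
    changes_architecture: bool = False
    # Possible incompatibility between modules or with interface systems.
    risks_incompatibility: bool = False


def classify(change: ProposedChange) -> ModificationClass:
    """Return MAJOR if any major criterion applies, otherwise MINOR."""
    is_major = any((
        change.affects_safety_or_efficacy,
        change.changes_intended_use,
        change.changes_architecture,
        change.risks_incompatibility,
    ))
    return ModificationClass.MAJOR if is_major else ModificationClass.MINOR


# Usage: two changes taken from the example lists above.
cloud = ProposedChange("Transition to a mainly cloud-based system",
                       changes_architecture=True)
icons = ProposedChange("Update of user interface, new icons")
print(classify(cloud).value)
print(classify(icons).value)
```

Treating "minor" as the residual category follows the wording of the example policy, in which a minor modification is any modification not included in the definition of "major".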

Each modification classified in this way has to be interpreted and analysed when proposed within a clinical study:

• a major modification should be evaluated in detail, to establish whether it is completely unacceptable or whether the information collected with the previous version of the product is still partly meaningful;

• minor modifications, which should be acceptable and should not require a completely new validation process, can be communicated to investigators and formalized by updating the Investigator’s Brochure; where appropriate, they can also be communicated to users (e.g., in the event of minor modifications to the interface, in which case training activities may be indicated to ensure that users are able to continue using the product);

• bugfixes are to be considered acceptable, and should be communicated to investigators and users with automatic notifications (e.g., in-app notifications).
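The handling rules above can be summarized as a lookup from the type of change to the actions expected during an ongoing study. This is a minimal sketch under stated assumptions: `ChangeType`, `study_actions` and the `affects_user_interface` flag are hypothetical names, and the action strings simply paraphrase the three bullet points above rather than any regulatory requirement.

```python
from enum import Enum
from typing import List


class ChangeType(Enum):
    MAJOR = "major"
    MINOR = "minor"
    BUGFIX = "bugfix"


def study_actions(change_type: ChangeType,
                  affects_user_interface: bool = False) -> List[str]:
    """Return the actions suggested for a change proposed during a clinical study."""
    if change_type is ChangeType.MAJOR:
        return [
            "evaluate the modification in detail",
            "decide whether the modification is unacceptable during the study",
            "assess whether data from the previous version are still partly meaningful",
        ]
    if change_type is ChangeType.MINOR:
        actions = [
            "communicate the modification to investigators",
            "update the Investigator's Brochure",
        ]
        if affects_user_interface:
            # Users are informed only where appropriate, e.g. interface changes.
            actions += ["communicate the modification to users",
                        "consider user training"]
        return actions
    # Bugfixes: acceptable, communicated automatically.
    return [
        "communicate the fix to investigators",
        "send automatic notification to users (e.g. in-app)",
    ]


# Usage: a minor interface update proposed while the study is running.
for action in study_actions(ChangeType.MINOR, affects_user_interface=True):
    print(action)
```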

What is known

• Software delivering digital therapeutics (DTx) is classified as a medical device and, in Europe, must therefore comply with Regulation (EU) 2017/745

• The Regulation states that it is necessary to demonstrate clinical benefit, by means of clinical investigation, but it does not provide details regarding the characteristics of such investigation

• Today, there are examples of approved DTx, while many others are in the development phase. The level of evidence available, or that is being generated, varies

• Evaluation of clinical benefit for DTx must be based on demonstrable and statistically significant clinical evidence regarding improvement of a specific, measurable health parameter. The novelty and innovativeness of these technologies must not be considered ends in themselves, because not everything that is technologically advanced is automatically useful, efficacious and cost-effective, to the same extent as (if not more than) traditional methods

• Traditional evidence-based research methods are not necessarily ideal for, or automatically applicable to, DTx, but the clear aim must be to achieve an exhaustive description of the risk-benefit ratio and a good understanding of what can reasonably be foreseen in terms of clinical use.

What is uncertain

• In the absence of precise recommendations in the Regulation, it is not clear what characteristics clinical trials must have in order to support DTx certification in Europe

• DTx present distinctive characteristics that raise specific needs to be addressed during study design, with a view to obtaining exhaustive characterization of the risk-benefit profile: choice of control, as well as metrics for outcome measurement in relation to efficacy and safety, must often be customized on a case-by-case basis.

What we recommend

• More detailed regulations are needed in Europe regarding clinical investigation of DTx, in order to ensure adequate and uniform standards of efficacy and safety for DTx medical devices

• To enable developers to collect clear evidence of benefit, and to enable its correct evaluation by regulatory authorities, the following recommendations are made:

• possibility of expressing clinical benefits through metrics appropriate not only to the intended use, but also to the technology involved

• in a controlled study, not only a clinical but also a technical rationale for choice of control (e.g., in the case of DTx for cognitive behavioural therapy, the control could be a drug or CBT administered in person, in a telemedicine setting or by means of a normal videogame)

• in the clinical protocol, possibility of defining the pipeline of possible foreseen changes to technical, interface and software architecture, defining for each of them an impact policy in relation to the conduct of the study

• adequate patient enrolment numbers, with a view to demonstrating the possible advantages of DTx by comparison with the control arm

• activation of patient engagement strategies to optimize planning and conduct of the study (e.g., limiting the possible dropout rate), inter alia with a view to enhancing alignment of study aims with fulfilment of patients’ needs

• clinical validation plans, taking into account long-term side-effects

• It is also recommended that clinical trials of DTx be evaluated with the involvement of different regulatory authorities – e.g., the European Medicines Agency for studies comparing DTx with a drug, or in cases where its intended use is similar or identical to that of a drug.
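As an illustration of the recommendation on adequate patient enrolment numbers, the sketch below estimates the number of patients per arm needed to compare a binary outcome (e.g., responder rate) between a DTx arm and a control arm. It assumes the standard normal-approximation formula for two independent proportions with a two-sided test, n = (z₁₋α/₂ + z₁₋β)² [p₁(1 − p₁) + p₂(1 − p₂)] / (p₁ − p₂)², followed by an inflation for expected dropout. The function names and the example response rates are hypothetical; the appropriate design and formula for a given study must be chosen case by case with a statistician.

```python
import math
from statistics import NormalDist


def patients_per_arm(p_control: float, p_dtx: float,
                     alpha: float = 0.05, power: float = 0.80) -> int:
    """Patients per arm for a two-sided comparison of two proportions
    (normal approximation, equal allocation, unpooled variance)."""
    if not (0 < p_control < 1 and 0 < p_dtx < 1):
        raise ValueError("response rates must lie strictly between 0 and 1")
    if p_control == p_dtx:
        raise ValueError("response rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    variance = p_control * (1 - p_control) + p_dtx * (1 - p_dtx)
    n = (z_alpha + z_beta) ** 2 * variance / (p_control - p_dtx) ** 2
    return math.ceil(n)


def inflate_for_dropout(n: int, dropout_rate: float) -> int:
    """Enrolment target so that about n patients per arm complete the study."""
    if not 0 <= dropout_rate < 1:
        raise ValueError("dropout rate must be in [0, 1)")
    return math.ceil(n / (1 - dropout_rate))


# Usage with hypothetical response rates: 30% in control, 50% with DTx.
n = patients_per_arm(p_control=0.30, p_dtx=0.50)
print(n, inflate_for_dropout(n, dropout_rate=0.20))
```

The dropout adjustment shows why patient engagement strategies matter for study planning: a lower expected dropout rate directly reduces the number of patients that must be enrolled.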

Bibliography

1.O’Connor S, Hanlon P, O’Donnell CA, et al. Barriers and facilitators to patient and public engagement and recruitment to digital health interventions: protocol of a systematic review of qualitative studies. BMJ Open 2016; 6: e010895.

2.Putrino D, Zanders H, Hamilton T, et al. Patient engagement is related to impairment reduction during digital game-based therapy in stroke. Games Health J 2017; 6: 295-302.

3.Campbell AN, Nunes EV, Matthews AG, et al. Internet-delivered treatment for substance abuse: a multi-site randomized controlled clinical trial. Am J Psychiatry 2014; 171: 683-90.

4.Christensen DR, Landes RD, Jackson L, et al. Adding an Internet-delivered treatment to an efficacious treatment package for opioid dependence. J Consult Clin Psychol 2014; 82: 964-72.

5.Kollins SH, Bower J, Findling RL, et al. 2.40 A multicenter, randomized, active-control registration trial of software treatment for actively reducing severity of ADHD (STARS-ADHD) to assess the efficacy and safety of a novel, home-based, digital treatment for pediatric ADHD. J Am Acad Child Adolesc Psychiatry 2018; 57: S172.

6.Akili Interactive Labs, Inc. Software Treatment for Actively Reducing Severity of ADHD as Adjunctive Treatment to Stimulant (STARS-ADHD Adjunctive). https://clinicaltrials.gov/ct2/show/NCT03649074.

7.Davis NO, Bower J, Kollins SH. Proof-of-concept study of an at-home, engaging, digital intervention for pediatric ADHD. PLoS One 2018; 13: e0189749.

8.Anguera JA, Brandes-Aitken AN, Antovich AD, et al. A pilot study to determine the feasibility of enhancing cognitive abilities in children with sensory processing dysfunction. PLoS One 2017; 12: e0172616.

9.Yerys BE, Bertollo JR, Kenworthy L, et al. Brief Report: Pilot Study of a Novel Interactive Digital Treatment to Improve Cognitive Control in Children with Autism Spectrum Disorder and Co-occurring ADHD Symptoms. J Autism Dev Disord 2019; 49: 1727-37.

10.Prevention of Diabetes Using Mobile-enabled, Virtual Delivery of the National Diabetes Prevention Program. https://clinicaltrials.gov/ct2/show/NCT03865342. The results of a previous pilot study are published as: Michaelides A, Major J, Pienkosz E Jr, Wood M, Kim Y, Toro-Ramos T. Usefulness of a novel mobile diabetes prevention program delivery platform with human coaching: 65-week observational follow-up. JMIR Mhealth Uhealth 2018; 6(5): e93.

11.Iacoviello BM, Steinerman JR, Klein DB, et al. Clickotine, a personalized smartphone App for smoking cessation: initial evaluation. JMIR Mhealth Uhealth 2017; 5: e56.

12.Quinn CC, Shardell MD, Terrin ML, et al. Cluster-randomized trial of a mobile phone personalized behavioral intervention for blood glucose control. Diabetes Care 2011; 34: 1934-42.

13.Jeandidier N, Chaillous L, Franc S, et al. DIABEO App software and telemedicine versus usual follow-up in the treatment of diabetic patients: protocol for the TELESAGE randomized controlled trial. JMIR Res Protoc 2018; 7: e66.

14.Espie CA, Kyle SD, Williams C, et al. A randomized, placebo-controlled trial of online cognitive behavioral therapy for chronic insomnia disorder delivered via an automated media-rich web application. Sleep 2012; 35: 769-81.

15.Rassouli F, Boutellier D, Duss J, et al. Digitalizing multidisciplinary pulmonary rehabilitation in COPD with a smartphone application: an international observational pilot study. Int J Chron Obstruct Pulmon Dis 2018; 13: 3831-6.

16.Huber S, Priebe JA, Baumann KM, et al. Treatment of low back pain with a digital multidisciplinary pain treatment App: short-term results. JMIR Rehabil Assist Technol 2017; 4: e11.

17.Merchant RK, Inamdar R, Quade RC. Effectiveness of population health management using the Propeller Health Asthma Platform: a randomized clinical trial. J Allergy Clin Immunol Pract 2016; 4: 455-63.

18.A 12-week treatment study to evaluate the effectiveness of albuterol multidose dry powder inhaler with integrated electronic Module Digital System (eMDPI DS) in patients 13 years or older with asthma – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03890666.

19.Clinical efficacy and cost effectiveness of MYCOPD in patients with mild and moderate newly diagnosed COPD – Full Text View – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03620630.

20.Preventing diabetes with digital health and coaching – Full Text View – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03312764.

21.Parks AC, Williams AL, Tugade MM, et al. Testing a scalable web and smartphone based intervention to improve depression, anxiety, and resilience: A randomized controlled trial. Int J Wellbeing 2018; 2: 8.

22.McDermott AF, Rose M, Norris T, Gordon EA. Novel feed-forward modeling system leads to sustained improvements in attention and academic performance. J Atten Disord 2016; 1087054715623044 doi: 10.1177/1087054715623044.

23.Trial of an innovative smartphone intervention for smoking cessation – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT02724462.

24.Bull J, Rowland S, Lundberg O, et al. Typical use effectiveness of Natural Cycles: postmarket surveillance study investigating the impact of previous contraceptive choice on the risk of unintended pregnancy. BMJ Open 2019; 9: e026474.

25.Ritterband LM, Thorndike FP, Ingersoll KS, et al. Effect of a web-based cognitive behavior therapy for insomnia intervention with 1-year follow-up: a randomized clinical trial. JAMA Psychiatry 2017; 74: 68-75.

26.Masaki K, Tateno H, Nomura A, et al. A randomized controlled trial of a smoking cessation smartphone application with a carbon monoxide checker. NPJ Digit Med 2020; 3: 35.

27.Bove R, Rowles W, Zhao C, et al. A novel in-home digital treatment to improve processing speed in people with multiple sclerosis: A pilot study. Mult Scler 2020; 1352458520930371. doi: 10.1177/1352458520930371.

28.Bricker JB, Watson NL, Mull KE, et al. Efficacy of smartphone applications for smoking cessation: a randomized clinical trial. JAMA Intern Med 2020; e204055. doi: 10.1001/jamainternmed.2020.4055.

29.Brendryen H, Lund IO, Johansen AB, et al. Balance–a pragmatic randomized controlled trial of an online intensive self-help alcohol intervention. Addiction 2014; 109: 218-26.

30.Osborn CY, Hirsch A, Sears LE, et al. One Drop App With an Activity Tracker for adults with type 1 diabetes: randomized controlled trial. JMIR Mhealth Uhealth 2020; 8: e16745.

31.Contingency management using smartphone App in patients with SUD – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT04162132.

32.Braley M. A parallel, virtual, randomized trial of PCT for speech, language, and cognitive intervention in stroke patients. https://clinicaltrials.gov/ct2/show/study/NCT04488029.

33.Developing a novel digital therapeutic for the treatment of generalized anxiety disorder – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03683472.

34.Traumatic nightmares treated by NightWare (to arouse not awaken) – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT04040387.

35.Everitt H, Landau S, Little P, et al. Therapist telephone-delivered CBT and web-based CBT compared with treatment as usual in refractory irritable bowel syndrome: the ACTIB three-arm RCT. Health Technol Assess 2019; 23: 1-154.

36.Webb J, Peerbux S, Smittenaar P, et al. Preliminary outcomes of a digital therapeutic intervention for smoking cessation in adult smokers: randomized controlled trial. JMIR Ment Health 2020; 7: e22833.

37.Wiecek E, Torres-Robles A, Cutler RL, et al. Impact of a multicomponent digital therapeutic mobile App on medication adherence in patients with chronic conditions: retrospective analysis. J Med Internet Res 2020; 22: e17834.

38.Thorgeirsson T, Torfadottir JE, Egilsson E, et al. Randomized Trial for Weight Loss Using a Digital Therapeutic Application. J Diabetes Sci Technol 2021: 19322968211000815. doi: 10.1177/19322968211000815.

39.Krishnakumar A, Verma R, Chawla R, et al. Evaluating glycemic control in patients of South Asian origin with type 2 diabetes using a digital therapeutic platform: analysis of real-world data. J Med Internet Res 2021; 23: e17908.

40.A digital therapy for diabetes prevention among overweight adults in Terengganu, Malaysia – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03997656.

41.Fit-One: A trial evaluating the effect of One Drop and Fitbit on diabetes and pre-diabetes outcomes – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03459573.

42.FLARE Lupus Research Study Systemic Lupus Erythematosus – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03426384.

43.Impact of Ibis on patients with advanced COPD – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03131622.

44.Web-based addiction treatment: cultural adaptation with American Indians – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03363256.

45.Innovative digital therapeutic for smoking cessation – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03694327.

46.Effectiveness of a digital therapeutic on type 2 diabetes – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03369626.

47.Study of efficacy of PEAR-004 in schizophrenia – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03751280.

48.Remote study of NightWare for PTSD with nightmares – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03934658.

49.NightWare Open Enrollment Study – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03828656.

50.Traumatic nightmares treated by NightWare (to arouse not awaken) – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03795987.

51.Luminopia VR Device Pilot Study – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT02782117.

52.Bove RM, Rush G, Zhao C, et al. A videogame-based digital therapeutic to improve processing speed in people with multiple sclerosis: a feasibility study. Neurol Ther 2019; 8: 135-45.

53.Hsu WY, Rowles W, Anguera JA, et al. Application of an adaptive, digital, game-based approach for cognitive assessment in multiple sclerosis: observational study. J Med Internet Res 2021; 23: e24356.

54.Limbix Health, Inc. Limbix Spark: A CBT-based mobile intervention as first line treatment for adolescent depression during COVID-19. https://clinicaltrials.gov/ct2/show/study/NCT04524598.

55.Continuing Care App for probationers and parolees – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT04574193.

56.mHealth titration and management – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT03908762.

57.Apricity Health, LLC. A study of ApricityRx™ digital therapeutic for management of immune-related adverse events in patients on immuno-oncology therapy. https://clinicaltrials.gov/ct2/show/study/NCT04571398.

58.Pelvic floor muscle training with a digital therapeutic device to standard exercises for stress urinary incontinence – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT04508153.

59.Impact of digital therapeutic on metabolic parameters – Full Text View – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT04573296.

60.Cognoa, Inc. Cognoa ASD digital therapeutic engagement and usability study. https://clinicaltrials.gov/ct2/show/study/NCT04326231.

61.A remote, decentralized opioid use disorder study to evaluate patient engagement with a game-based digital therapeutic – Full text view – ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/NCT04542642.

62.Orion Corporation, Orion Pharma. DTx for pain: behavioural revalidation in augmented and virtual reality for chronic pain; and exploratory pilot study of a virtual reality software. https://clinicaltrials.gov/ct2/show/NCT04225884.

63.University College, London. Digital Interventions in Neuro-Rehabilitation: A Small Scale Randomised Clinical Trial to Test a Web-based Therapy Application for People With Naming Difficulties With Mild-Moderate Dementia (Gotcha!). https://clinicaltrials.gov/ct2/show/NCT04570215.

64.University College, London. Digital Interventions in Neuro-Rehabilitation (DINR): A Digital Neuro Intervention (DNI) for Word Retrieval (Anomia). https://clinicaltrials.gov/ct2/show/NCT04566081.

65.Brewer J. Mechanisms of Mindfulness for Smoking Cessation: Optimizing Quantity and Quality. https://clinicaltrials.gov/ct2/show/NCT02943499.

66.Ponzo S, Morelli D, Kawadler JM, et al. Efficacy of the Digital Therapeutic Mobile App BioBase to Reduce Stress and Improve Mental Well-Being Among University Students: Randomized Controlled Trial. JMIR Mhealth Uhealth 2020; 8: e17767.

67.Kario K, Nomura A, Kato A, et al. Digital therapeutics for essential hypertension using a smartphone application: A randomized, open-label, multicenter pilot study. J Clin Hypertens (Greenwich) 2021; 23: 923-34.

68.Fundación Pública Andaluza para la gestión de la Investigación en Sevilla. A Randomized Open-label Parallel-group Trial is to Analyze the Efficacy and the Efficiency of the Social-Local-Mobile (So-Lo-Mo) Intervention Applied to the Smoking Cessation Process. https://clinicaltrials.gov/ct2/show/NCT03553173.

69.Fraser Health. Use of a Novel Digital Therapeutic Intervention for the Management of Delirium in the Critical Care Environment. https://clinicaltrials.gov/ct2/show/NCT04652622.

70.Little J. Supporting and Enabling Physicians to Deliver Personalized Therapeutic Nutrition Using an Online Technology Platform. https://clinicaltrials.gov/ct2/show/NCT04616482.

71.Otsuka Pharmaceutical Development & Commercialization, Inc. A Multi-center, Randomized, Controlled Trial to Evaluate the Effectiveness of Digital Therapeutics as Adjunctive Therapy in Adult Subjects Diagnosed With Major Depressive Disorder. https://clinicaltrials.gov/ct2/show/NCT04770285.

72.Nguyen L. Zemedy – Evaluation of Zemedy, a Cognitive Behavioral Therapy-based Digital Therapeutic Application for the Treatment of Irritable Bowel Syndrome. https://clinicaltrials.gov/ct2/show/NCT04723056.

73.Alberts J. Dual-task Augmented Reality Treatment for Parkinson’s Disease (DART). https://clinicaltrials.gov/ct2/show/NCT04634331.

74.Lazaridou A. Biofeedback EMG Alternative Therapy for Chronic Low Back Pain (BEAT-Pain): A Pilot Efficacy Study. https://clinicaltrials.gov/ct2/show/NCT04607460.

75.Eldridge SM, Lancaster GA, Campbell MJ, et al. Defining feasibility and pilot studies in preparation for randomised controlled trials: development of a conceptual framework. PLoS ONE 2016; 11: e0150205.

76.https://www.prnewswire.com/news-releases/new-study-ranks-mindsciences-unwinding-anxiety-among-top-1-of-health-wellness-apps-300789441.html.

77.Roy A, Druker S, Hoge EA, Brewer JA. Physician anxiety and burnout: symptom correlates and a prospective pilot study of App-delivered mindfulness training. JMIR Mhealth Uhealth 2020; 8: e15608.

78.Eysenbach G, CONSORT-EHEALTH Group. CONSORT-EHEALTH: improving and standardizing evaluation reports of web-based and mobile health interventions. J Med Internet Res 2011; 13: e126.

79.NICE. Evidence standards framework for digital health technologies. https://www.nice.org.uk/about/what-we-do/our-programmes/evidence-standards-framework-for-digital-health-technologies.

80.Endeavour TM. Instruction for use. Akili Interactive Labs. 2020.

81.Berglund Scherwitzl E, Gemzell Danielsson K, et al. Fertility awareness-based mobile application for contraception. Eur J Contracept Reprod Health Care 2016; 21: 234-41.

82.Berglund Scherwitzl E, Lundberg O, Kopp Kallner H, et al. Perfect-use and typical-use Pearl Index of a contraceptive mobile app. Contraception 2017; 96: 420-25.

83. Kleinschmidt TK, Bull JR, Lavorini V, et al. Advantages of determining the fertile window with the individualised Natural Cycles algorithm over calendar-based methods. Eur J Contracept Reprod Health Care 2019; 24: 457-63.

84.Bull JR, Rowland SP, Scherwitzl EB, et al. Real-world menstrual cycle characteristics of more than 600,000 menstrual cycles. NPJ Digital Medicine 2019; 2: 1-8.

85.Berglund Scherwitzl E, Lundberg O, et al. Short- and long-term effect of contraceptive methods on fecundity. Eur J Contracept Reprod Health Care 2019; 24: 260-65.

86. Torous J, Andersson G, Bertagnoli A, et al. Towards a consensus around standards for smartphone apps and digital mental health. World Psychiatry 2019; 18: 97-8.
