
Deep brain stimulation for dystonia


Abstract

Background

Dystonia is a painful and disabling disorder, characterised by involuntary posturing of the affected body region(s). Deep brain stimulation is an intervention typically reserved for severe and drug‐refractory cases, although uncertainty exists regarding its efficacy, safety, and tolerability.

Objectives

To compare the efficacy, safety, and tolerability of deep brain stimulation (DBS) versus placebo, sham intervention, or best medical care, including botulinum toxin and resective or lesional surgery, in adults with dystonia.

Search methods

We identified studies by searching CENTRAL, MEDLINE, Embase, three other databases, four clinical trial registries, four grey literature databases, and the reference lists of included articles. We ran the last search of all elements of the search strategy, with no language restrictions, on 29 May 2018.

Selection criteria

Double‐blind, parallel, randomised, controlled trials (RCTs) comparing DBS with sham stimulation, best medical care, or placebo in adults with dystonia.

Data collection and analysis

Two independent review authors assessed records, selected included studies, extracted data onto a standardised (or prespecified) data extraction form, and evaluated the risk of bias. We resolved disagreements by consensus or by consulting a third review author. We conducted meta‐analyses using a random‐effects model, to estimate pooled effects and corresponding 95% confidence intervals (95% CI). We assessed the quality of the evidence with GRADE methods. The primary efficacy outcome was symptom improvement on any validated symptomatic rating scale, and the primary safety outcome was adverse events.
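As an illustration of the random‐effects pooling mentioned above, the following Python sketch combines log risk ratios from two trials using the DerSimonian‐Laird estimator. The choice of estimator is an assumption (the text specifies only a random‐effects model), and the event counts are hypothetical placeholders, not data from the included trials.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval


def log_risk_ratio(events_t, n_t, events_c, n_c):
    """Return the log risk ratio and its approximate variance for one trial."""
    log_rr = math.log((events_t / n_t) / (events_c / n_c))
    variance = 1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c
    return log_rr, variance


def dersimonian_laird(effects, variances):
    """Pool study effects; returns (pooled effect, standard error, tau^2)."""
    weights = [1 / v for v in variances]
    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum(weights) - sum(w ** 2 for w in weights) / sum(weights)
    df = len(effects) - 1
    # Between-study variance, truncated at zero
    tau2 = max(0.0, (q - df) / c) if df > 0 and c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    return pooled, se, tau2


# HYPOTHETICAL counts: (events DBS, n DBS, events sham, n sham)
hypothetical_trials = [(12, 30, 8, 30), (10, 22, 6, 20)]
effects, variances = zip(*(log_risk_ratio(*t) for t in hypothetical_trials))
pooled, se, tau2 = dersimonian_laird(effects, variances)

# Back-transform from the log scale to a risk ratio with a 95% CI
pooled_rr = math.exp(pooled)
ci_low = math.exp(pooled - Z_95 * se)
ci_high = math.exp(pooled + Z_95 * se)
print(f"RR {pooled_rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}), tau^2 = {tau2:.3f}")
```

Pooling is done on the log scale because the sampling distribution of a log risk ratio is closer to normal than that of the risk ratio itself.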

Main results

We included two RCTs, enrolling a total of 102 participants. Both trials evaluated the effect of DBS on the internal globus pallidus nucleus, and assessed outcomes after three and six months of stimulation. One of the studies included participants with generalised and segmental dystonia; the other included participants with focal (cervical) dystonia. We assessed both studies at high risk for performance and for‐profit bias. One study was retrospectively registered with a clinical trial register; we judged the second at high risk of detection bias.

Low‐quality evidence suggests that DBS of the internal globus pallidus nucleus may improve overall cervical dystonia‐related symptoms (mean difference (MD) 9.8 units, 95% CI 3.52 to 16.08 units; 1 RCT, 59 participants), cervical dystonia‐related functional capacity (MD 3.8 units, 95% CI 1.41 to 6.19; 1 RCT, 61 participants), and mood at three months (MD 3.1 units, 95% CI 0.73 to 5.47; 1 RCT, 61 participants).

Low‐quality evidence suggests that in people with cervical dystonia, DBS may slightly improve the overall clinical status (MD 2.3 units, 95% CI 1.15 to 3.45; 1 RCT, 61 participants). We are uncertain whether DBS improves quality of life in cervical dystonia (MD 3 units, 95% CI ‐7.71 to 13.71; 1 RCT, 57 participants; very low‐quality evidence), or emotional state (MD 2.4 units, 95% CI ‐6.2 to 11.00; 1 RCT, 56 participants; very low‐quality evidence).

Low‐quality evidence suggests that DBS of the internal globus pallidus nucleus may improve generalised or segmental dystonia‐related symptoms (MD 14.4 units, 95% CI 8.0 to 20.8; 1 RCT, 40 participants), overall clinical status (MD 3.5 units, 95% CI 2.33 to 4.67; 1 RCT, 37 participants), physical functioning‐related quality of life (MD 6.3 units, 95% CI 1.06 to 11.54; 1 RCT, 33 participants), and overall dystonia‐related functional capacity at three months (MD 3.1 units, 95% CI 1.71 to 4.48; 1 RCT, 39 participants). We are uncertain whether DBS improves mental health‐related quality of life (MD 5.0 units, 95% CI ‐2.14 to 12.14; 1 RCT, 33 participants; very low‐quality evidence), or emotional state (MD ‐4.6 units, 95% CI ‐11.26 to 2.06; 1 RCT, 30 participants; very low‐quality evidence) in generalised or segmental dystonia.

We pooled outcomes related to safety and tolerability, since both trials used the same intervention and comparison. We found very low‐quality evidence of inconclusive results for risk of adverse events (relative risk (RR) 1.58, 95% CI 0.98 to 2.54; 2 RCTs, 102 participants), and tolerability (RR 1.86, 95% CI 0.16 to 21.57; 2 RCTs, 102 participants).

Authors' conclusions

DBS of the internal globus pallidus nucleus may reduce symptom severity and improve functional capacity in adults with cervical, segmental or generalised moderate to severe dystonia (low‐quality evidence), and may improve quality of life in adults with generalised or segmental dystonia (low‐quality evidence). We are uncertain whether the procedure improves quality of life in cervical dystonia (very low‐quality evidence). We are also uncertain about the safety and tolerability of the procedure in adults with cervical, generalised, or segmental dystonia (very low‐quality evidence).

We could draw no conclusions for other populations with dystonia (i.e. children and adolescents, and adults with other types of dystonia), or for other DBS protocols (i.e. other target nuclei or stimulation paradigms). Further research is needed to establish the long‐term efficacy and safety of DBS of the internal globus pallidus nucleus.


Deep brain stimulation for people with involuntary posturing, or dystonia

The review question

We reviewed the evidence about the effect of deep brain stimulation (DBS) for adults with dystonia. We assessed the efficacy, safety, and tolerability of this procedure.

Background

Dystonia is a disease that causes undesired, uncontrollable, often painful, abnormal movement of an affected limb or body region. It is a relatively uncommon condition, which can be very disabling and negatively affect a person's quality of life. In most cases, the cause is unknown; no cure exists. Dystonia is normally a long‐term disease that requires long‐term treatment.

Deep brain stimulation (DBS) involves a surgical procedure to place electrical stimulators in the brain. Afterwards, the stimulators are connected to a battery, and deliver electrical impulses to the brain over time. For people with dystonia, DBS is usually considered to be a therapeutic option for severe cases only, once other treatments have failed.

Study characteristics

We conducted a literature search on 29 May 2018 for studies that compared DBS with sham stimulation (same surgical procedure, but no electrical impulses are delivered through the electrodes placed in the brain), best medical therapy, or placebo (a pretend medicine). We found two studies that compared DBS with sham stimulation, and included a total of 102 participants. One study included participants with dystonia of the limbs and trunk, and the other with dystonia of the neck. Participants received active DBS for a total of six months. The average age of people in the studies was 50 years; the average duration of the disease was 16 years. Both studies were funded by a DBS device manufacturer with possible interests in the results of the studies.

Key results

For limb and trunk dystonia, DBS may improve symptoms, self‐assessed clinical status, and functioning. The results showed that for neck dystonia, DBS may improve symptoms, clinical status, functioning, and mood. For either type of dystonia, we are uncertain about the impact that DBS has on harmful or undesired events, or treatment tolerability.

Quality of the evidence

The overall quality of the evidence for neck, limb, and trunk dystonia was low to very low. Further research is needed to draw conclusions about the clinical efficacy, safety, and tolerability of DBS in people with dystonia, especially beyond the three‐ to six‐month duration of the included studies.

Authors' conclusions

Implications for practice

We found that constant deep brain stimulation of the internal globus pallidus may reduce symptom severity and improve functional capacity for adults with cervical, segmental, or generalised moderate to severe dystonia. We are uncertain whether the procedure is associated with adverse effects or issues of tolerability in this population, owing to the short follow‐up duration, the fact that the comparator in the trials was sham stimulation, and the presence of study limitations. Due to lack of evidence, we could not draw any conclusions about long‐term efficacy or safety, nor could our results be generalised to other populations with dystonia (namely, children and adolescents, and adults with other types of dystonia), or to other deep brain stimulation target nuclei or stimulation paradigms.

Implications for research

We only found published research data from trials of deep brain stimulation (DBS) versus sham stimulation. This area of research represents an unmet need in movement disorders.

We believe the programming sessions used by both trials may have introduced performance bias, as a different protocol was applied to each of the treatment arms, and this may have compromised randomisation.

Both trials included data on quality of life, and other patient‐related outcomes. Future trials should reinforce this aspect.

Further studies are needed to establish the clinical effectiveness of DBS, assessing efficacy, safety, duration of effect, and quality of life in different populations with dystonia. Because DBS typically requires that neurostimulation be optimised for each patient, this line of research would be important to support physicians' management of the stimulation, and inform a more solid and safe individualisation of a patient's treatment. Further studies are also needed to establish if there is significant difference in outcomes between DBS in different target nuclei, between different DBS devices, and with different stimulation protocols.

Future research on DBS should establish clinical effectiveness based on changes from baseline, and validated measures of minimal clinically important differences for outcome measures, such as the Burke‐Fahn‐Marsden Dystonia Rating Scale (BFMDRS) and the Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS; Brożek 2006). We are aware of efforts to create a new clinical scale in dystonia, the Comprehensive Cervical Dystonia Rating Scale, which will include a revision of the TWSTRS (to be named TWSTRS‐2), and testing to validate a minimal clinically important change (Comella 2015). We are also aware of clinimetric testing completed on the Comprehensive Cervical Dystonia Rating Scale, which will be of considerable importance (Comella 2016).

Additional research is needed to establish the long‐term clinical efficacy and safety profiles of DBS.

Summary of findings

Summary of findings 1. Deep brain stimulation compared to sham stimulation in generalised or segmental dystonia


Patient or population: adults with generalised or segmental dystonia
Setting: tertiary hospitals in Germany, Norway, and Austria
Intervention: deep brain stimulation (DBS)
Comparison: sham stimulation

Anticipated absolute effects* (95% CI) are shown in the "Without DBS", "With DBS", and "Difference" columns.

| Outcomes | Without DBS | With DBS | Difference | No of participants (studies) | Certainty of the evidence (GRADE) | What happens |
| --- | --- | --- | --- | --- | --- | --- |
| Dystonia‐specific improvement, assessed with BFMDRS movement score (follow‐up: 3 months) | The mean dystonia‐specific improvement without DBS was 1.4 fewer units | The mean dystonia‐specific improvement with DBS was 15.8 fewer units | 14.4 units fewer (8.0 to 20.8 fewer) | 40 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall generalised or segmental dystonia severity |
| Subjective evaluation of clinical status, assessed with Visual Analogue Scale (follow‐up: 3 months) | The mean subjective evaluation of clinical status without DBS was 0.1 higher units | The mean subjective evaluation of clinical status with DBS was 3.4 higher units | 3.5 units fewer (2.33 to 4.67 fewer) | 37 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall subjective improvement of clinical status |
| Quality of life assessment, assessed with SF‐36: physical function (follow‐up: 3 months) | The mean quality of life assessment without DBS was 3.8 higher units | The mean quality of life assessment with DBS was 10.1 higher units | 6.3 units higher (1.06 to 11.54 higher) | 33 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall physical functioning quality of life |
| Quality of life assessment, assessed with SF‐36: mental health (follow‐up: 3 months) | The mean quality of life assessment without DBS was 0.2 higher units | The mean quality of life assessment with DBS was 5.2 higher units | 5.0 units higher (2.14 lower to 12.14 higher) | 33 (1 RCT) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes overall mental health quality of life |
| Functional capacity, assessed with BFMDRS disability score (follow‐up: 3 months) | The mean functional capacity without DBS was 0.8 fewer units | The mean functional capacity with DBS was 3.9 fewer units | 3.1 units fewer (1.71 to 4.48 fewer) | 39 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall dystonia‐related functional capacity |
| Emotional assessment, assessed with Beck Depression Inventory (follow‐up: 3 months) | The mean emotional assessment without DBS was 0.5 fewer units | The mean emotional assessment with DBS was 5.1 fewer units | 4.6 units fewer (11.26 fewer to 2.06 more) | 30 (1 RCT) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes overall emotional assessment |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; RR: Risk ratio

GRADE Working Group grades of evidence
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect
Moderate certainty: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low certainty: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect
Very low certainty: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect

1 Serious study limitations: moderate risk of bias (three domains with high risk of bias)

2 Serious indirectness: short‐term follow‐up (3 to 6 months) precludes firm conclusions

3 Serious imprecision: gathered information size criteria was met but the 95% CI failed to exclude important benefit or important harm
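Footnote 3 turns on whether a 95% CI excludes the value of no effect. As a rough illustration, the Python sketch below takes the mean differences and 95% CIs reported for generalised or segmental dystonia and checks which intervals include zero. It also back‐calculates an approximate standard error from each interval, which assumes the intervals were built with a normal approximation (an assumption, not something the review states). The outcome labels and signs follow the way the values are quoted in the abstract.

```python
Z_95 = 1.96  # normal quantile for a two-sided 95% confidence interval

# (outcome, mean difference, CI lower, CI upper), as reported in the review
outcomes = [
    ("BFMDRS movement score", 14.4, 8.0, 20.8),
    ("Clinical status (VAS)", 3.5, 2.33, 4.67),
    ("SF-36 physical function", 6.3, 1.06, 11.54),
    ("SF-36 mental health", 5.0, -2.14, 12.14),
    ("BFMDRS disability score", 3.1, 1.71, 4.48),
    ("Beck Depression Inventory", -4.6, -11.26, 2.06),
]


def standard_error(lower, upper):
    """Approximate SE implied by a symmetric, normal-theory 95% CI."""
    return (upper - lower) / (2 * Z_95)


def excludes_no_effect(lower, upper):
    """True when the interval for a mean difference does not contain zero."""
    return lower > 0 or upper < 0


# Outcomes whose interval is compatible with no effect
uncertain = [name for name, _, lo, hi in outcomes if not excludes_no_effect(lo, hi)]

for name, md, lo, hi in outcomes:
    flag = "excludes 0" if excludes_no_effect(lo, hi) else "includes 0"
    print(f"{name}: MD {md}, SE ~ {standard_error(lo, hi):.2f}, CI {flag}")
```

The two outcomes whose intervals include zero are the same two rated very low certainty with footnote 3 in the table.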

Summary of findings 2. Deep brain stimulation compared to sham stimulation in cervical dystonia


Patient or population: adults with cervical dystonia
Setting: tertiary hospitals in Germany, Norway, and Austria
Intervention: deep brain stimulation (DBS)
Comparison: sham stimulation

Anticipated absolute effects* (95% CI) are shown in the "Without DBS", "With DBS", and "Difference" columns.

| Outcomes | Without DBS | With DBS | Difference | No of participants (studies) | Certainty of the evidence (GRADE) | What happens |
| --- | --- | --- | --- | --- | --- | --- |
| Dystonia‐specific symptoms (assessed with TWSTRS; score range 0 to 85; higher = worse; follow‐up: 3 months) | The mean dystonia‐specific improvement without DBS was 8.5 fewer units | The mean dystonia‐specific improvement with DBS was 18.3 fewer units | 9.8 units fewer (3.52 to 16.08 fewer) | 62 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall cervical dystonia severity |
| Clinical status (assessed with Clinical Global Impression Scale; follow‐up: 3 months) | The mean subjective evaluation of clinical status without DBS was 1.2 fewer units | The mean subjective evaluation of clinical status with DBS was 3.5 fewer units | 2.3 units fewer (1.15 to 3.45 fewer) | 62 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may slightly improve overall subjective improvement of clinical status |
| Quality of life, using SF‐36: physical functioning (follow‐up: 3 months) | The mean quality of life assessment without DBS was 3.6 higher units | The mean quality of life assessment with DBS was 6.6 higher units | 3 units higher (7.71 lower to 13.71 higher) | 62 (1 RCT) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes overall physical functioning quality of life |
| Quality of life assessment, using SF‐36: mental health (follow‐up: 3 months) | The mean quality of life assessment without DBS was 8.9 higher units | The mean quality of life assessment with DBS was 11.3 higher units | 2.4 units higher (6.2 lower to 11 higher) | 62 (1 RCT) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes overall mental health quality of life |
| Functional capacity, assessed with TWSTRS disability sub‐scale (follow‐up: 3 months) | The mean functional capacity without DBS was 1.8 fewer units | The mean functional capacity with DBS was 5.6 fewer units | 3.8 units fewer (1.41 to 6.19 fewer) | 62 (1 RCT) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS improves overall functional capacity |
| Emotional assessment, assessed with Beck Depression Inventory (follow‐up: 3 months) | The mean emotional assessment without DBS was 0.4 fewer units | The mean emotional assessment with DBS was 3.5 fewer units | 3.1 units fewer (0.73 to 5.47 fewer) | 62 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall emotional assessment |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; RR: Risk ratio; TWSTRS: Toronto Western Spasmodic Torticollis Rating Scale

GRADE Working Group grades of evidence
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect
Moderate certainty: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low certainty: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect
Very low certainty: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect

1 Serious study limitations: moderate risk of bias (three domains with high risk of bias)

2 Serious indirectness: short‐term follow‐up (3 to 6 months) precludes firm conclusions

3 Serious imprecision: gathered information size criteria was met, but the 95% CI failed to exclude important benefit or important harm

Summary of findings 3. Deep brain stimulation compared to sham stimulation in dystonia


Patient or population: adults with dystonia (generalised, segmental, and cervical)
Setting: tertiary hospitals in Germany, Norway, and Austria
Intervention: deep brain stimulation (DBS)
Comparison: sham stimulation

Anticipated absolute effects* (95% CI) for the study population are shown in the "Without DBS", "With DBS", and "Difference" columns.

| Outcomes | Relative effect (95% CI) | Without DBS | With DBS | Difference | No of participants (studies) | Quality of the evidence (GRADE) | What happens |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Adverse events (follow‐up: 3 months) | RR 1.58 (0.98 to 2.54) | 30.0% | 47.4% (29.4 to 76.2) | 17.4% more (0.6 fewer to 46.2 more) | 102 (2 RCTs) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes the risk of developing adverse events. |
| Tolerability (follow‐up: 3 months) | RR 1.86 (0.16 to 21.57) | 0.0% | 0.0% (0.0 to 0.0) | 0.0% fewer (0 fewer to 0 fewer) | 102 (2 RCTs) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes the risk of tolerability. |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; RR: Risk ratio

GRADE Working Group grades of evidence
High quality: We are very confident that the true effect lies close to that of the estimate of the effect
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect

1 Serious Study limitations: Moderate risk of bias across all included studies (three domains with high risk of bias in each study)

2 Serious Indirectness: Short‐term follow‐up (3‐6 months) precludes firm conclusions

3 Serious Imprecision: Minimal information size criteria was less than the number generated by a conventional sample size and alpha‐spending sample size calculations
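The footnote marked * says that the risk in the intervention group is derived from the assumed comparison‐group risk and the relative effect. A minimal Python sketch of that arithmetic, assuming the risk with DBS is simply the assumed risk multiplied by the risk ratio (the function and key names are illustrative), reproduces the adverse events figures reported above:

```python
def absolute_effects(assumed_risk_pct, rr, rr_low, rr_high):
    """Convert a risk ratio and its 95% CI into absolute risks (in percent).

    assumed_risk_pct is the comparison-group (sham) risk in percent.
    """
    with_dbs = assumed_risk_pct * rr
    with_low = assumed_risk_pct * rr_low
    with_high = assumed_risk_pct * rr_high
    return {
        "with_dbs": round(with_dbs, 1),
        "with_dbs_ci": (round(with_low, 1), round(with_high, 1)),
        "difference": round(with_dbs - assumed_risk_pct, 1),
        "difference_ci": (
            round(with_low - assumed_risk_pct, 1),
            round(with_high - assumed_risk_pct, 1),
        ),
    }


# Adverse events: assumed risk 30.0%, RR 1.58 (95% CI 0.98 to 2.54)
adverse = absolute_effects(30.0, 1.58, 0.98, 2.54)
print(adverse)

# Tolerability: assumed risk 0.0%, RR 1.86 (95% CI 0.16 to 21.57)
tolerability = absolute_effects(0.0, 1.86, 0.16, 21.57)
print(tolerability)
```

With an assumed risk of 0.0%, as reported for tolerability, the same multiplication yields zero for every absolute figure whatever the risk ratio, which is why that outcome shows only zeros in the absolute effect columns.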

Background

Description of the condition

See Additional Table 1 for glossary of terms.

Table 1. Glossary of terms

| Term | Definition |
| --- | --- |
| Deep brain stimulation | Neurosurgical procedure whereby an electric current is delivered by electrodes placed deep in the brain to stimulate target nuclei |
| Target nucleus or nuclei | Groups of neuronal cell bodies, located in the deep areas of the brain, selected for deep brain stimulation |
| Dystonia | Common movement disorder in which people have abnormal torsion movements, or postures of one or more body segments, such as the neck or a limb, that they cannot control. It is frequently accompanied by social embarrassment and pain. |
| Primary dystonia | Dystonic disorder caused by an intrinsic basal ganglia problem unrelated to any other disease. It is sometimes caused by a mutation; dystonia is the main clinical manifestation in the majority of primary dystonias |
| Secondary dystonia | Dystonic disorder caused by another disease (e.g. stroke) |
| Generalised dystonia | Dystonia affecting all body segments (i.e. trunk, upper and lower limbs) |
| Cervical dystonia | Dystonia affecting the neck |
| Blepharospasm | Dystonia affecting the eyelids |

Dystonia is the third most common movement disorder, after Parkinson's disease and essential tremor, with an estimated overall prevalence of 164 per million (Steeves 2012). Dystonia syndromes are a group of disabling, painful disorders characterised by involuntary sustained or intermittent muscle contractions causing abnormal, often repetitive, movements or postures of the face, neck, trunk or limbs, among other muscles (Albanese 2013). Dystonic movements are typically patterned or twisting, and are often initiated or worsened by voluntary action (Albanese 2013).

These neurological disorders are classified according to two different axes. Axis I is based on clinical manifestations of dystonia, and divided into four separate dimensions: age at onset, body distribution, temporal pattern, and associated features. Age at onset classifies the dystonia under the standard age groups used for other neurological disorders (Jinnah 2014). Body distribution includes focal dystonia, segmental dystonia, multifocal dystonia, hemidystonia, and generalised dystonia (Albanese 2013; Tarsy 2006). Temporal pattern classifies dystonia according to its course and type of short‐term variation (Jinnah 2014). The absence of other associated features defines isolated dystonia, formerly known as primary dystonia (Albanese 2013). Combined dystonia is defined in the presence of other neurological or systemic features and includes the previous terms of secondary dystonia, dystonia‐plus syndromes and heredodegenerative dystonia (Jinnah 2014).

Axis II is based on the aetiology of dystonia and divided into three dimensions: heritability, nervous system pathology, and idiopathic. In heritability, dystonia can be defined by association with hereditary neurological conditions (e.g. sex‐linked, autosomal or mitochondrial), or by an acquired cause (Albanese 2013; Jinnah 2014; Tarsy 2006). Among the most common known causes are drug‐induced dystonia (caused by agents such as levodopa or antidopaminergics), and acquired lesions to the central nervous system (CNS), such as brain injury, infections, toxins, vascular or neoplastic disorders (Calne 1988). Dystonia can have a psychogenic origin (i.e. functional (Albanese 2013)). The term idiopathic dystonia is used when there is no acquired cause, and the dystonia remains genetically unclassified; it can be further classified into sporadic or familial idiopathic dystonia (Jinnah 2014).

The aetiology of most forms of dystonia is still not fully understood; early‐onset dystonia is one of the exceptions for which a hereditary aetiology is common (Balint 2015). In most cases of focal adult‐onset dystonia, such as cervical dystonia (the most common form of focal dystonia), the pathophysiology is generally considered to result from impaired inhibition of the CNS at multiple levels, resulting in abnormal sensorimotor integration (Hallett 1998).

The generalised increase in cortical and basal ganglia excitability leads to diminished inhibition of motor function, a decrease in spatial and temporal somatosensory discrimination, and loss of surround inhibition, i.e. the incapacity to suppress regions adjacent to activated neural circuits (Phukan 2011; Tarsy 2006).

Previous systematic reviews have demonstrated that botulinum toxin is effective in the treatment of cervical dystonia (Castelão 2017; Duarte 2016; Marques 2016), and blepharospasm (Costa 2004), the two most common forms of focal dystonia. Without exception, all guidelines recommend botulinum toxin as first‐line treatment for focal dystonias (Simpson 2016). However, even in moderate‐severity dystonia, there is evidence that people have a considerable expectation of harm from botulinum toxin (Duarte 2018). The pharmacological treatment of generalised dystonia is more challenging, with poor results (Pirio Richardson 2017). Some people with dystonia have severe impairment, and are refractory to pharmacological treatments, including botulinum toxin.

Description of the intervention

Deep brain stimulation (DBS) is a method of intracerebral stimulation that uses a controlled, direct application of an electrical current to specific subcortical nuclei. It is important to note that it is not a curative treatment. Parkinson's disease is the most common neurological disease for which DBS is used, and the most common target nucleus in this condition is the subthalamic nucleus (Fasano 2012). In selected patients with Parkinson's disease, DBS improved the time without dyskinesia at six months by an average of 4.6 hours a day versus 0 hours in participants randomised to best medical therapy, while also reporting a higher rate of clinically meaningful motor improvement – 71% with DBS versus 32% with best medical therapy (Weaver 2009). DBS also appeared to provide a higher rate of quality of life for patients with Parkinson's disease – 64% improvement for DBS versus 36% for best medical therapy (Weaver 2009).

Electrical stimulation of CNS targets is delivered through electrodes that are surgically implanted, then connected to an implantable pulse generator, which is most often placed subcutaneously in the pectoral region (Fasano 2012).

Different target nuclei for DBS have been studied in people with dystonia, including the internal globus pallidus, the thalamus ventrointermediate nucleus, and the subthalamic nucleus, with the purpose of modulating cortical excitability (Limousin‐Dowsey 1999). In routine practice, the internal globus pallidus is typically the primary target for people with dystonia (Kupsch 2006; Vidailhet 2005).

Different techniques may be used, among them, high‐ or low‐frequency stimulation, with varying degrees of intensity and effect duration (Fasano 2012; Limousin‐Dowsey 1999). The stimulation can be produced with constant voltage, or more recently, constant current, which some have suggested improves the tolerance and effectiveness of DBS (Gross 2013). In recent years, novel advances in DBS technology, not specifically for dystonia treatment, have emerged on the basis of electrode engineering (allowing new stimulation paradigms, such as interleaving stimulation), and on‐demand stimulation systems (Toda 2016).

In routine clinical practice, adjustments are made to the stimulation parameters (voltage, frequency, and pulse width) in ambulatory follow‐up examinations, to ensure optimal therapeutic effects (Montuno 2013). Implantable pulse generators have a limited battery life, at the end of which surgery is required to replace the battery. Rechargeable implantable pulse generators have been developed to reduce the number of surgeries needed to replace the batteries (Waln 2014).

How the intervention might work

There are different hypotheses on how DBS might work. The inhibitory hypothesis suggests that the therapeutic efficacy of DBS is a result of reducing the activity of neurons adjacent to the stimulation lead (Filali 2004), most likely due to activation of GABAergic afferent pathways (Chiken 2014). The excitatory hypothesis claims that the excitation of efferent pathways, and antidromic excitation of afferent pathways, result in the suppression of abnormal activity (Hashimoto 2003). Finally, the disruption hypothesis suggests that blocking aberrant neural stimuli in the cortico‐basal ganglia loop creates a dissociation between neural afferent and efferent signals (Chiken 2015). The most plausible mechanism is probably a combination of different effects.

Why it is important to do this review

Recent studies reported the beneficial effects that DBS has in people with certain movement disorders, including selected cases of Parkinson's disease and essential tremor (Flora 2010; Weaver 2009). However, no systematic review has yet examined the available literature on the outcomes of DBS in people with dystonia. There are reports of serious events, such as mood changes, cognitive deficit, and an increase in suicide rates among patients treated with DBS for dystonia (Fasano 2012; Foncke 2006), and pulmonary embolism, myocardial infarction, stroke, intracerebral haemorrhage, and infection in patients with Parkinson's disease (Fasano 2012; Weaver 2009). Therefore, uncertainty exists regarding the overall risk‐benefit of this intervention in dystonia.

Objectives

To compare the efficacy, safety, and tolerability of deep brain stimulation versus placebo, sham intervention, or best medical care, including botulinum toxin and resective/lesional surgery, in adults with dystonia.

Methods

Criteria for considering studies for this review

Types of studies

Randomised controlled trials (RCTs) with a parallel design, of any duration, assessing the efficacy, safety, or tolerability of deep brain stimulation (DBS) versus placebo, a sham intervention, or best medical treatment in people with dystonia were eligible for this review. We considered both open and blinded trials. We excluded trials in which participants were their own controls (before‐and‐after trial design, or on‐and‐off stimulation studies) because of the possibility of selection bias, carry‐over effect, and the impossibility of isolating the lesional effect of the intervention in the outcome estimate.

Types of participants

Adults (i.e. ≥ 18 years of age), in any setting, with a clinical diagnosis of any type of dystonia (primary or secondary; focal, segmental, or generalised). We adopted a pragmatic approach to the definition of dystonia. Namely, we considered patients included in randomised trials with the diagnosis of dystonia, who were evaluated on a validated and fit‐for‐purpose dystonia‐specific severity scale.

Studies that included only a subset of the relevant participants were eligible for inclusion.

We imposed no restrictions on the number of participants recruited to trials, or the number of recruitment centres.

Types of interventions

We accepted any type of DBS, independent of the target‐nucleus, the device used or the stimulation parameters and modality. We planned to compare DBS with either: 1) the best available pharmacological treatment, including botulinum toxin, 2) sham stimulation, or 3) resective/lesional surgery. Sham stimulation had to be considered fit for purpose in order to be included.

Types of outcome measures

Any included study had to explicitly report at least one of the outcomes below.

Critical outcomes
Dystonia‐specific symptoms

Symptoms were measured as the mean change from baseline on any well‐characterised dystonia‐specific symptomatic rating scale, measured at least one month after DBS surgery.

Adverse events

Adverse events were measured as the proportion of participants with any adverse event, at any point during study follow‐up. We planned to study surgery‐related adverse events of special interest, such as device infection, electrode dislocation, central nervous system haemorrhage, stroke, and death, measured at any point during study follow‐up. We also planned to look specifically for stimulation‐related adverse events of special interest, such as dysarthria, dyskinesia, loss of desired effect, and suicide attempts, measured at any point during study follow‐up. Finally, we aimed to study the proportions of participants with specific adverse events, measured at any point during study follow‐up.

Important outcomes
Clinical status

This outcome could be evaluated by both patients and clinicians, with well‐characterised assessment tools, such as the Patient Subjective Assessment of Change, Patient Global Assessment of Improvement, Patient Evaluation of Global Response (PEGR), Patient and Physician Global Assessment of Change, Investigator Global Assessment of Efficacy (IGAE), Physician Global Assessment of Change (PGAC), and a visual analogue scale (VAS) for symptom severity, measured at least one month after DBS surgery. We had planned to dichotomise patients into those who reported improvement or were classified by clinicians as having improved, and those without improvement.

Quality of life

Changes in quality‐of‐life assessments, measured with well‐characterised assessment tools, such as the 36‐item Short Form Health Survey (SF‐36), measured at any point during study follow‐up.

Functional capacity

Ability assessed using a well‐characterised assessment tool measured at any point during study follow‐up. We had also planned to study the proportions of participants who were able to perform selected activities of daily living, such as working capabilities and the ability to drive a car, measured at any point during study follow‐up.

Emotional state

Frame of mind (mood) assessed by well‐characterised scales, such as the Beck Depression Inventory (BDI) or the Brief Psychiatric Rating Scale (BPRS), measured at any point during study follow‐up.

Tolerability

Participant's ability to manage the effects of the procedure, assessed by the proportion of participants who withdrew from the study, or interrupted DBS due to adverse events, measured at any point during study follow‐up.

Search methods for identification of studies

Electronic searches

We searched the following databases from 1993, the first year DBS was reported in any condition, until 29 May 2018.

  1. Cochrane Central Register of Controlled Trials (CENTRAL; 2018, Issue 5) in the Cochrane Library.

  2. MEDLINE Ovid.

  3. Embase Ovid.

  4. Web of Science.

  5. SciELO (Scientific Electronic Library Online).

  6. LILACS (Latin American and Caribbean Health Science Information database).

We developed detailed search strategies for each database searched. Please see Appendix 1 for the CENTRAL search strategy, Appendix 2 for the MEDLINE search strategy, and Appendix 3 for the Embase search strategy.

We assessed non‐English language papers equally, translating them as necessary, and evaluating them for inclusion.

Searching other resources

We searched the following clinical trial registries on 29 May 2018.

  1. US National Institutes of Health Ongoing Trials Register ClinicalTrials.gov (www.clinicaltrials.gov).

  2. EU Clinical Trials Register (www.clinicaltrialsregister.eu; from 1995).

  3. World Health Organization International Clinical Trials Registry Platform (apps.who.int/trialsearch).

  4. ISRCTN Registry (www.isrctn.com; from 2000).

We searched the grey literature via the following databases on 29 May 2018.

  1. OpenSIGLE (from 1993).

  2. Database of Abstracts of Reviews of Effects (DARE).

  3. British Library Thesis Service.

  4. National Technical Information Service (NTIS).

We had planned to handsearch abstracts from the following international congresses of movement disorders:

  1. American Academy of Neurology (from 1993);

  2. European Academy of Neurology;

  3. European Neurological Society (up till 2013);

  4. European Federation of Neurological Science (up till 2013);

  5. Movement Disorders Society;

  6. International Association of Parkinsonism and Related Disorders.

However, because all of the conference proceedings had been published in indexed journals since at least 1993, we opted against handsearching, as we did not expect to find further citations.

We cross‐checked the reference lists of both selected and potentially eligible studies for additional studies. We had no need to translate non‐English reports. We had no need to contact study authors or DBS device companies for further access to data.

Data collection and analysis

Selection of studies

Two review authors, independently and in duplicate, screened all titles and abstracts identified from searches to determine which ones met the inclusion criteria. We retrieved the full text of any papers identified as potentially relevant by at least one author, or those without an available abstract. Two review authors independently screened full‐text articles; they resolved discrepancies by discussion, and by consulting a third author, when necessary to reach consensus. We collated duplicate publications and presented these by individual study. We outlined the screening and selection process in a PRISMA flow chart (Liberati 2009).

Data extraction and management

Two review authors independently extracted study data onto pre‐piloted, standardised forms, after which we cross‐checked the forms for accuracy. We used the Covidence platform for this purpose (Covidence). We resolved disagreements by discussion, or if necessary, sought arbitration by a third review author. We extracted the following data from each study.

  1. Participants: method for referral, inclusion and exclusion criteria, demographics and clinical baseline characteristics, number and reasons for withdrawals, exclusions and loss to follow‐up, if any.

  2. Interventions: full description of intervention, duration of treatment period and follow‐up, providers, and co‐interventions, if any.

  3. Comparisons: number of participants randomised to each arm, compliance and dropouts, reasons for dropouts, and ability to perform an intention‐to‐treat analysis.

  4. Outcomes: definition of outcomes, use of validated measurement tools, time point of measurements, change from baseline or post‐interventional measures, and missing outcomes, if any.

  5. Study design: interventional, randomised, controlled, double‐blind.

Assessment of risk of bias in included studies

We assessed the risk of bias of included studies according to the domains described in the Cochrane tool for assessing risk of bias, and classified the risk of bias for each domain as high, unclear, or low (Higgins 2011a). We assessed two further domains, as set out in our review protocol, which are described below: 'for‐profit bias' and 'prospective clinical trial registration'. We used the following definitions for each domain in the risk of bias assessment.

Random sequence generation

  • Low risk of bias: the study performed sequence generation using computer random number generation or a random number table. Drawing lots, tossing a coin, shuffling cards, and throwing dice were adequate if an independent person, not otherwise involved in the study, performed them.

  • Unclear risk of bias: the study authors did not report the sequence generation method.

  • High risk of bias: the sequence generation method was not random.

Allocation concealment

  • Low risk of bias: participants and investigators enrolling participants could not foresee assignment because one of the following, or an equivalent method, was used to conceal allocation: central allocation, sequentially numbered drug containers of identical appearance; sequentially numbered, opaque, sealed envelopes.

  • Unclear risk of bias: insufficient information to permit judgement of low risk or high risk.

  • High risk of bias: participants or investigators enrolling participants could possibly foresee assignments, and thus introduce selection bias.

In addition to these criteria, we considered the implications of baseline imbalances in prognostic factors affecting the trial outcomes, as these may lead to selection bias (Corbett 2014).

Blinding of participants and personnel

  • Low risk of bias: any of the following: no blinding or incomplete blinding, but the review authors judged that the outcome was not likely to be influenced by lack of blinding; or blinding of participants and key study personnel ensured, and it was unlikely that the blinding could have been broken.

  • Unclear risk of bias: any of the following: insufficient information to permit judgement of low risk or high risk; or the trial did not address this outcome.

  • High risk of bias: any of the following: no blinding or incomplete blinding, and the outcome was likely to be influenced by lack of blinding; or blinding of key study participants and personnel attempted, but likely that the blinding could have been broken, and the outcome was likely to be influenced by lack of blinding.

Blinded outcome assessment

We considered blinding separately for different outcomes, as appropriate.

  • Low risk of bias: any of the following: no blinding of outcome assessment, but the review authors judged that the outcome measurement was not likely to be influenced by lack of blinding; or blinding of outcome assessment ensured, and unlikely that the blinding could have been broken.

  • Unclear risk of bias: any of the following: insufficient information to permit judgement of low risk or high risk; or the trial did not address this outcome.

  • High risk of bias: any of the following: no blinding of outcome assessment, and the outcome measurement was likely to be influenced by lack of blinding; or blinding of outcome assessment, but likely that the blinding could have been broken, and the outcome measurement was likely to be influenced by lack of blinding.

Incomplete outcome data

We considered the last data available.

  • Low risk of bias: missing data were unlikely to make treatment effects depart from plausible values. The study used sufficient methods, such as multiple imputation, to handle missing data.

  • Unclear risk of bias: there was insufficient information to assess whether missing data, in combination with the method used to handle missing data, were likely to induce bias on the results.

  • High risk of bias: the results were likely to be biased due to missing data.

Selective outcome reporting

  • Low risk: the trial reported the following predefined outcomes. If the original trial protocol was available, the outcomes were called for in that protocol. If the trial protocol was obtained from a trial registry, the outcomes sought should have been those enumerated in the original protocol, if the trial protocol was registered before, or when the trial began. If the trial protocol was registered after the trial had begun, we did not consider the outcomes to be reliable.

  • Unclear risk: the study authors did not report all predefined outcomes fully, or it was unclear whether the study authors recorded data on these outcomes or not.

  • High risk: the study authors did not report one or more predefined outcomes.

For‐profit bias

In order to assess the study's source of funding, we added this domain in place of the 'other bias' domain.

  • Low risk of bias: the trial appeared to be free of industry sponsorship or other types of for‐profit support that may influence the trial design, conduct, or trial results.

  • Unclear risk of bias: the trial may or may not be free of for‐profit bias, as the trial did not provide any information on clinical trial support or sponsorship.

  • High risk of bias: the trial was sponsored by industry, or received other types of for‐profit support.

Prospective clinical trial registration

This domain is different from selective outcome reporting, as it refers to the publication of a study protocol after the initiation of a clinical study, and therefore, is an indirect indicator of a risk of publication bias.

  • Low risk of bias: a trial protocol was available, and was published before the start of the trial.

  • Unclear risk of bias: insufficient information to permit judgement of low risk or high risk.

  • High risk of bias: no trial protocol was available, or the trial was registered after it had already begun.

Measures of treatment effect

Whenever possible, we extracted continuous outcomes. We pooled these data, when adequate, and used them for comparison.

Continuous data

We analysed these data based on the mean, standard deviation, and number of people assessed for both the intervention and comparison groups to calculate mean difference and 95% confidence interval (CI). Since the included trials reported the mean difference without individual group data, we used this to report the study results. If more than one study measured the same outcome using different validated tools, we had intended to calculate a standardised mean difference, namely, Hedges' (adjusted) g (Hedges 1985), and 95% CI, though this need did not arise. If necessary for comparison, we would have dichotomised rating scales using each study author's own criteria for improvement or no improvement. If these criteria were not described, we had planned to define 'improvement' as any beneficial change from baseline, and 'no improvement' as lack of improvement, or any deterioration from baseline.
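As an illustration of the calculations described above, the following minimal sketch computes a mean difference with a normal-approximation 95% CI and Hedges' (adjusted) g from group summary statistics. The function names and the example numbers are hypothetical and are not taken from the included trials; the correction factor used for g is the usual approximation.

```python
import math

def mean_difference(m1, sd1, n1, m2, sd2, n2, z=1.96):
    """Mean difference (group 1 minus group 2) with a normal-approximation CI."""
    md = m1 - m2
    se = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    return md, md - z * se, md + z * se

def hedges_g(m1, sd1, n1, m2, sd2, n2):
    """Standardised mean difference with Hedges' small-sample correction."""
    df = n1 + n2 - 2
    sd_pooled = math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df)
    d = (m1 - m2) / sd_pooled          # Cohen's d
    j = 1 - 3 / (4 * df - 1)           # approximate correction factor (assumption)
    return j * d

# Hypothetical summary data, for illustration only
md, lower, upper = mean_difference(10, 2, 50, 8, 2, 50)
g = hedges_g(10, 2, 50, 8, 2, 50)
```

The standardised form is only needed when studies measure the same outcome on different scales; when all studies share a scale, the unstandardised mean difference keeps the result in the scale's own units.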

Dichotomous data

We analysed these data based on the number of events and the number of people assessed in the intervention and comparison groups. We used them to calculate the risk ratio and 95% CI.
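A minimal sketch of the risk ratio calculation, assuming the standard log-scale standard error, is shown below. The function name and the example counts are hypothetical; the sketch does not handle arms with zero events, which would need a continuity correction or another method.

```python
import math

def risk_ratio(events_t, n_t, events_c, n_c, z=1.96):
    """Risk ratio (treatment vs control) with a CI computed on the log scale."""
    rr = (events_t / n_t) / (events_c / n_c)
    # Standard error of log(RR); undefined if either arm has zero events
    se_log = math.sqrt(1 / events_t - 1 / n_t + 1 / events_c - 1 / n_c)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# Hypothetical counts, for illustration only
rr, lower, upper = risk_ratio(10, 50, 5, 50)
```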

Unit of analysis issues

The primary unit of analysis in the included studies was individual trial participants.

We examined data from parallel‐group RCTs, and preferentially used data from intention‐to‐treat analyses.

When data were presented at different periods of follow‐up, we reported the same outcome separately each time it was presented, based on the different periods of follow‐up being reported. If the number of studies could not adequately populate such subgroups, we opted to select the longest period of follow‐up for each study.

In cases where studies included multiple active DBS arms, we had planned to combine all arms into a single pair‐wise comparison with the Review Manager 5 calculator, following the methods suggested by Cochrane (Higgins 2011c; Review Manager 2014).

Given that individual participants are liable to experience an adverse event more than once, and adverse events may be reported as such, we had planned to preferentially request data from study authors concerning the number of participants with adverse events. If this approach was not successful, we planned to treat adverse events as count data rather than as categorical data (did or did not experience the event), thus considering not only whether an event was reported, but how many times it was reported. In such cases, we planned to treat the adverse events as Poisson data, and to preferentially summarise them as rate ratios, standardised to a given time period, to be defined post‐hoc.
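The planned rate ratio approach for count data can be sketched as follows, assuming events follow a Poisson distribution and that exposure is expressed in a common unit (for example, participant‐months). The function name and the example values are hypothetical.

```python
import math

def rate_ratio(events_t, time_t, events_c, time_c, z=1.96):
    """Rate ratio for event counts per unit of follow-up time, with a log-scale CI."""
    ratio = (events_t / time_t) / (events_c / time_c)
    # Under a Poisson assumption, the variance of log(rate ratio) is 1/a + 1/b
    se_log = math.sqrt(1 / events_t + 1 / events_c)
    lower = math.exp(math.log(ratio) - z * se_log)
    upper = math.exp(math.log(ratio) + z * se_log)
    return ratio, lower, upper

# Hypothetical counts and exposure, for illustration only
ratio, lower, upper = rate_ratio(12, 30.0, 6, 30.0)
```

Unlike the risk ratio, this summary counts repeated events in the same participant, so it should not be read as the proportion of participants affected.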

Dealing with missing data

For missing outcome or summary data, we planned to use imputation methods to derive the missing data (where possible), and report any assumptions in the review. We had planned to conduct sensitivity analyses in all such cases, to investigate the effects of any imputed data on pooled effect estimates.

As a first option, we planned to use the available information (e.g. standard error, 95% CI, or exact P value) to algebraically recover the missing data (Higgins 2011b; Higgins 2011c; Wiebe 2006). When the change from baseline standard deviation was not reported, or was not possible to extract, we planned to calculate a correlation coefficient from another study in this review, and then use this correlation coefficient to impute a change from baseline standard deviation (Abrams 2005; Follmann 1992; Higgins 2011c).
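The algebraic steps behind this first option can be sketched as below. This is a minimal illustration using large-sample (normal) approximations; for small samples a t-distribution multiplier would be more appropriate. All function names and example values are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of a mean recovered from its 95% CI (normal approximation)."""
    return (upper - lower) / (2 * z)

def sd_from_se(se, n):
    """Standard deviation recovered from the standard error of a single group mean."""
    return se * math.sqrt(n)

def corr_from_sds(sd_baseline, sd_final, sd_change):
    """Baseline-to-final correlation implied by a study that reports all three SDs."""
    return (sd_baseline**2 + sd_final**2 - sd_change**2) / (2 * sd_baseline * sd_final)

def change_sd(sd_baseline, sd_final, corr):
    """Change-from-baseline SD imputed from a borrowed correlation coefficient."""
    return math.sqrt(sd_baseline**2 + sd_final**2 - 2 * corr * sd_baseline * sd_final)

# Hypothetical values: borrow the correlation from one study, apply it to another
r = corr_from_sds(10.0, 10.0, 10.0)
imputed = change_sd(8.0, 9.0, r)
```

Because the imputed SD depends directly on the borrowed correlation, the sensitivity analyses planned for imputed data would normally repeat the calculation over a range of plausible correlations.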

If this failed, and if at least one sufficiently large and similar study existed, we planned to use a method of single imputation (Furukawa 2006; Higgins 2011c).

Lastly, if a sufficient number of included studies with complete information existed, we planned to use multiple imputation methods to derive missing data (Carpenter 2013; Rubin 1991).

If none of these methods were successful, we planned to conduct a narrative synthesis for the data in question.

In case relevant data were only reported through figures or graphs, two authors would have independently extracted the relevant information. We planned to only use the data if the two extractions gave the same result.

We had no need to apply these methods, though we may have to do so in future updates.

Assessment of heterogeneity

For those outcomes where we pooled data in a meta‐analysis, we assessed the degree of heterogeneity by visual inspection of forest plots, and by examining the Chi² test for heterogeneity. We quantified statistical heterogeneity using the I² statistic. We considered an I² value of 50% or more to represent substantial heterogeneity, but interpreted this value in light of the size and direction of effects, and the strength of the evidence for heterogeneity, based on the P value from the Chi² test (Higgins 2003). When we found heterogeneity in the pooled effect estimates, we planned to explore possible reasons for variability by conducting subgroup and sensitivity analyses, when possible.
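For readers unfamiliar with these statistics, the following sketch shows how Cochran's Q (the Chi² statistic for heterogeneity) and I² are derived from study-level effect estimates and their variances. The function name and the two-study example are hypothetical.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, its degrees of freedom, and I² (as a percentage)."""
    weights = [1 / v for v in variances]                 # inverse-variance weights
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0  # truncated at zero
    return q, df, i2

# Hypothetical effect estimates and variances, for illustration only
q, df, i2 = heterogeneity([0.0, 1.0], [0.25, 0.25])
```

With only two studies, as in this review, both Q and I² are estimated very imprecisely, which is one reason the I² value is interpreted alongside the size and direction of effects.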

Assessment of reporting biases

We had intended to assess publication bias through visual inspection of funnel plot asymmetry (Sterne 2001), and Peters' regression tests (Peters 2006), provided that 10 or more studies per outcome were available (Sterne 2011).

Data synthesis

We performed statistical analysis using Review Manager 5 (Review Manager 2014), Stata version 14 (Stata 2015), and Trial Sequential Analysis (TSA) software (Thorlund 2011; TSA 2011).

Meta‐analysis

We pooled effect measures by applying the Mantel‐Haenszel method for dichotomous outcomes, and the inverse‐variance method for continuous and rate ratio syntheses, if required. We conducted data synthesis using a random‐effects model by default, independently of the presence or lack of considerable statistical heterogeneity, owing to the variety of disease subtypes that we intended to analyse. We presented all results with 95% CI.
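A minimal sketch of inverse‐variance random‐effects pooling is given below. It assumes the DerSimonian‐Laird estimator of between‐study variance, which is a common choice; the text above does not name the estimator, so this should be read as an illustration rather than a description of the software's internals. Names and example data are hypothetical.

```python
import math

def random_effects_pool(effects, variances, z=1.96):
    """Inverse-variance random-effects pooling (DerSimonian-Laird tau², assumed)."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0      # between-study variance
    w_star = [1 / (v + tau2) for v in variances]         # random-effects weights
    pooled = sum(ws * yi for ws, yi in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, pooled - z * se, pooled + z * se, tau2

# Hypothetical effect estimates (e.g. log risk ratios) and their variances
pooled, lower, upper, tau2 = random_effects_pool([0.0, 1.0], [0.25, 0.25])
```

For ratio measures the inputs would be on the log scale, and the pooled estimate and its limits would be exponentiated before reporting.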

We calculated the number of participants needed to treat for an additional beneficial outcome (NNTB) and for an additional harmful outcome (NNTH) from meta‐analysis estimates, rather than treating data as if they came from a single trial, as the latter approach is more prone to bias, especially when there are significant imbalances between groups within one or more trials in the meta‐analysis (Altman 2002). However, readers should be cautious when interpreting these findings, since they may be misleading because of variation in the event rates in each trial, differences in the outcomes considered, effects of secular trends on disease risk, and differences in clinical setting (Smeeth 1999).
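One common way to derive such a number from a meta‐analysis estimate is to apply the pooled risk ratio to an assumed control group risk. The sketch below illustrates this under that assumption; the function name, the assumed control risk, and the example values are hypothetical, and the review does not state which formula was used.

```python
def number_needed_to_treat(pooled_rr, assumed_control_risk):
    """Number needed to treat implied by a pooled risk ratio and an assumed control risk."""
    risk_difference = assumed_control_risk * (1 - pooled_rr)
    if risk_difference == 0:
        raise ValueError("no difference in risk: the number needed to treat is undefined")
    # The result is conventionally rounded up to the next whole participant
    return 1 / abs(risk_difference)

# Hypothetical inputs: for an undesirable event, RR < 1 gives an NNTB, RR > 1 an NNTH
nntb = number_needed_to_treat(0.5, 0.25)
nnth = number_needed_to_treat(1.5, 0.25)
```

The result changes with the assumed control risk, which is one source of the variation between settings that the caution above refers to.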

When we could not combine data from the study reports in a meta‐analysis, we presented a qualitative summary of the study results in the review text.

Trial sequential analysis

In order to explore whether the cumulative data were adequately powered to evaluate the critical outcomes of this review, we performed a trial sequential analysis (Wetterslev 2008), and calculated a required information size (also known as the 'heterogeneity‐adjusted required information size' (Wetterslev 2009)) for the critical outcomes. Trial sequential analysis aims to evaluate whether statistically significant results of meta‐analysis are reliable, by accounting for the required information size (i.e. the number of participants in the meta‐analysis required to accept or reject an intervention effect). The technique is analogous to sequential monitoring boundaries in single trials. Trial sequential analysis adjusts the threshold of statistical significance, and has been shown to reduce the risk of random errors due to repetitive testing of accumulating data (Imberger 2016).

We calculated the required information size and computed the trial sequential monitoring boundaries using the O’Brien‐Fleming approach (O'Brien 1979). The required information size was based on the event proportion or standard deviation in the control group; assumption of a plausible relative risk reduction of 20%; a 5% risk of type I error; a 20% risk of type II error (power = 80%); and the observed heterogeneity of the meta‐analysis (Jakobsen 2014; Wetterslev 2009). In cases where a single trial is present, conducting a TSA is analogous to conducting a post‐hoc power calculation.
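For a dichotomous outcome, a commonly cited form of the required information size can be sketched as follows. The formula, the function name, and the example control group proportion are assumptions made for illustration; the defaults mirror the parameters stated above (20% relative risk reduction, 5% type I error, 80% power), and `diversity` stands for the heterogeneity adjustment expressed as a fraction between 0 and 1.

```python
from statistics import NormalDist

def required_information_size(control_risk, rrr=0.20, alpha=0.05,
                              power=0.80, diversity=0.0):
    """Heterogeneity-adjusted required information size for a dichotomous outcome."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided type I error
    z_beta = NormalDist().inv_cdf(power)            # power = 1 - type II error
    experimental_risk = control_risk * (1 - rrr)
    delta = control_risk - experimental_risk        # anticipated risk difference
    p_avg = (control_risk + experimental_risk) / 2
    unadjusted = 4 * (z_alpha + z_beta) ** 2 * p_avg * (1 - p_avg) / delta**2
    return unadjusted / (1 - diversity)             # inflate for heterogeneity

# Hypothetical control group event proportion of 50%, for illustration only
n_required = required_information_size(0.5)
```

Comparing the number of participants actually accrued with this figure indicates how far the cumulative evidence is from being adequately powered.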

Assessment of confidence in cumulative evidence

As recommended by the GRADE approach, two review authors independently assessed all of the outcomes in the following domains: risk of bias, inconsistency, indirectness, imprecision, and publication bias (Atkins 2004). In cases of disagreement, the authors met to reach consensus, consulting an independent third review author if necessary. We used GRADEpro GDT software to develop a 'Summary of findings' table, which we included in the review (GRADEpro GDT).

To ensure the consistency and reproducibility of GRADE judgements, we applied the following criteria to each domain for all key comparisons of the critical outcomes.

  • Study limitations: downgraded once if more than 30% of participants were from studies classified as being at a high risk of bias across any domain.

  • Inconsistency: downgraded once if heterogeneity was statistically significant, or if the I² value was more than 40%. When we did not perform a meta‐analysis, we planned to downgrade once if trials did not show effects in the same direction.

  • Indirectness: downgraded once if more than 50% of the participants were outside the target group.

  • Imprecision: downgraded once if the optimal information size criterion was not met, or if it was met, but the 95% CI failed to exclude important benefit or important harm (Guyatt 2011).

  • Publication bias: downgraded once when there was direct evidence of publication bias, or if estimates of effect were based on small scale, industry‐sponsored studies, raising a high index of suspicion of publication bias.
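The rules above can be expressed as a simple decision function, sketched below. It assumes the usual GRADE convention that evidence from randomised trials starts at high quality, and it omits the alternative inconsistency rule for outcomes without a meta‐analysis. All parameter names are hypothetical, and the inputs shown are illustrative rather than judgements from this review.

```python
LEVELS = ["high", "moderate", "low", "very low"]

def grade_quality(high_rob_fraction, heterogeneity_significant, i2_percent,
                  outside_target_fraction, ois_met, ci_excludes_important_effects,
                  publication_bias_suspected):
    """Apply one downgrade per triggered domain, starting from 'high' (assumed)."""
    downgrades = {
        "study limitations": high_rob_fraction > 0.30,
        "inconsistency": heterogeneity_significant or i2_percent > 40,
        "indirectness": outside_target_fraction > 0.50,
        "imprecision": (not ois_met) or (not ci_excludes_important_effects),
        "publication bias": publication_bias_suspected,
    }
    total = sum(downgrades.values())
    # Quality cannot fall below 'very low', however many domains are triggered
    return LEVELS[min(total, len(LEVELS) - 1)], downgrades

# Hypothetical inputs: all participants from high-risk studies, information size not met
level, reasons = grade_quality(1.0, False, 0, 0.0, False, True, False)
```

Writing the thresholds down explicitly in this way is what allows two review authors to reach the same judgement independently.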

We applied the following definitions of the quality of evidence (Balshem 2011).

  • High quality: we are very confident that the true effect lies close to that of the estimate of the effect.

  • Moderate quality: we are moderately confident in the effect estimate; the true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different.

  • Low quality: our confidence in the effect estimate is limited; the true effect may be substantially different from the estimate of the effect.

  • Very low quality: we have very little confidence in the effect estimate; the true effect is likely to be substantially different from the estimate of effect.

We assessed the following outcomes with the GRADE method.

  • Dystonia‐specific symptoms.

  • Proportion of participants with adverse events.

  • Clinical status.

  • Quality‐of‐life.

  • Functional capacity.

  • Emotional state.

  • Tolerability.

'Summary of findings' table

We included a 'Summary of findings' table to present the main findings of this review in a simple tabular format. We included key information concerning the quality of evidence, the magnitude of effect of the interventions examined, and the sum of available data on the available outcomes. When possible, we included both physical functioning and mental health measures of quality of life, as they are thought to be similarly relevant to people with cervical dystonia.

Subgroup analysis and investigation of heterogeneity

We had planned the following subgroup analyses.

  1. Disease subtypes (i.e. generalised and non‐generalised dystonia; primary and secondary dystonia).

  2. Target‐nucleus (i.e. internal globus pallidus, thalamus ventrointermediate nucleus, and subthalamic nucleus).

  3. Stimulation parameters (i.e. constant current and constant voltage).

  4. Risk of bias (i.e. overall low versus overall high).

  5. Control intervention used (i.e. botulinum treatment and lesional surgery; placebo and sham intervention).

We were unable to conduct these analyses due to the lack of available data, though we may be able to do so in future updates.

Sensitivity analysis

We had planned to conduct sensitivity analyses by excluding (i) studies in which imputation methods were applied, and (ii) studies assessed as being at an overall high risk of bias, in order to evaluate the robustness of the results.

We were unable to conduct these analyses due to the lack of available data, though we may be able to do so in future updates.

Results

Description of studies

We included two randomised, double‐blind, parallel‐designed studies comparing deep brain stimulation (DBS) to sham stimulation, with a total of 102 participants with generalised, segmental, or focal dystonia (Kupsch 2006; Volkmann 2014).

Results of the search

See Figure 1 for the Study Flow diagram.


Study flow diagram

The search returned 379 records (80 from CENTRAL, 96 through MEDLINE, 203 through Embase, and none from other databases), resulting in 328 records after all duplicates were removed. After title and abstract screening, we retrieved 34 full‐text articles. Of these, we excluded 32 citations: five due to duplication (Dinkelbach 2015; Morgan 2008; Mueller 2008; Schjerling 2011; Volkmann 2012), 20 due to ineligible study design (Ellis 2011; Foncke 2005; Gale 2011; Grabli 2009; Houeto 2007; Kefalopoulou 2009; Kiss 2007; Koch 2014; Kovacs 2013; Levin 2014; Mills 2011; Moro 2009; Moro 2012; Ostrem 2011; Pauls 2011; Pretto 2008; Schupbach 2012; Skogseid 2009; Slotty 2015; Vidailhet 2005), four due to ineligible comparator (Odekerken 2013; Schjerling 2013; Simms 2011; Wojtecki 2015), and three due to ineligible patient population (Odekerken 2012; Teixeira 2015; Weaver 2009).

Included studies

We listed details of the included studies in the 'Characteristics of included studies' table.

The two included studies were parallel‐group, randomised, double‐blind clinical trials comparing DBS with sham stimulation in adults (i.e. 18 years of age or over) with generalised and segmental (Kupsch 2006), or focal (cervical) dystonia (Volkmann 2014). The studies enrolled a total of 102 participants, 54 of whom were male (52.9%). Kupsch 2006 included 40 participants; Volkmann 2014 included 62. Both were multicentre studies, and both were conducted in the same 10 academic centres in Germany, Norway, and Austria. Neither trial described the method of participant referral and recruitment prior to study enrolment. A total of 52 (51.0%) participants were assigned to the neurostimulation arm of their respective studies; 50 (49.0%) participants were assigned to the sham stimulation arm. Both studies had similar inclusion criteria. Kupsch 2006 required a disease duration of at least five years, while Volkmann 2014 required a disease duration of at least three years.

Both studies excluded participants with cognitive impairment (< 120 points on the Mattis Dementia Rating Scale), moderate to severe depression (> 25 points on the BDI), marked brain atrophy (detected by Magnetic Resonance Imaging (MRI) or Computerised Tomography (CT)), or medical or psychiatric coexisting disorders that could increase the surgical risk, or interfere with completion of the trial. Volkmann 2014 also excluded participants with hemidystonia or generalised dystonia, increased bleeding risk, immune deficiency, previous brain surgery, and pregnant women.

Overall, within studies, participants were well matched between neurostimulation and sham stimulation arms, both in terms of allocation and baseline characteristics. The mean duration of disease was 16.7 years across both studies. The mean age across both studies was 50 years. In Kupsch 2006, the mean Burke‐Fahn‐Marsden Dystonia Rating Scale (BFMDRS) total movement score at baseline was 36.4, and the mean BFMDRS total disability score at baseline was 10 in both study arms, which can be interpreted as severe motor impairment and moderate disease‐related disability (Burke 1985). In Volkmann 2014, the mean Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS) total score at baseline was 48.7 in both study arms, which can be interpreted as severe disease‐related impairment (Consky 1990).

Electrode implantation and neurostimulation did not vary considerably between the two trials. In both studies, electrodes were implanted bilaterally in the posteroventrolateral portion of the internal globus pallidus (GPi) in one session, while the participants were under general anaesthesia. One week after surgery, trial participants attended a programming session or consultation. Both trials assessed the effect of neurostimulation at 0.5 volt below the threshold of inducing acute adverse effects in each participant, while participants assigned to sham stimulation were programmed to a 0 volt stimulation. Neither study allowed adjustments to measures of stimulation during the first three months of the study, unless intolerable adverse events occurred. A follow‐up assessment was scheduled at three months. After this assessment, neurostimulation was activated in the sham stimulation group, and was adjusted in the neurostimulation group if needed. Patients were reassessed after six months of active neurostimulation (i.e. six months after randomisation for the neurostimulation group, and nine months for the sham stimulation group).

In both studies, the primary outcomes of dystonia‐specific symptoms (measured with BFMDRS in Kupsch 2006, and TWSTRS in Volkmann 2014), and adverse events were assessed using an intention‐to‐treat (ITT) approach, with adequate imputation methods.

Excluded studies

We listed all the excluded studies, together with reasons for their exclusion, in the 'Characteristics of excluded studies' table. We included all reports that entered the full‐text screening phase.

Risk of bias in included studies

See Characteristics of included studies: 'Risk of bias' table.

See Figure 2 and Figure 3 for the 'Risk of bias' summary graphs. These assessments were based on the information available in the primary report data. We did not consider either of the studies to be at low risk of bias across all domains. We attributed high risk of bias to the 'blinding of personnel and participants' and 'for‐profit bias' domains in both studies, and to the 'prospective clinical trial registration' domain in Kupsch 2006.


Risk of bias graph: review authors' judgements about each source of risk of bias presented as percentages across all included studies


Risk of bias summary: review authors' judgements about each source of risk of bias for each included study

Allocation

Both studies described the process of random sequence generation (permuted block allocation scheme), and an adequate allocation concealment process, and we rated them as being at a low risk of bias. In addition, we considered baseline characteristics to be balanced between intervention groups.

Blinding

We considered the blinding of participants and personnel to be at high risk of bias for both included studies. In Volkmann 2014, while participants were adequately blinded and assessed on the success of the blinding, treating physicians were not blinded. We considered that the programming session was not adequately blinded in either study.

We divided the detection bias domain into two sub‐domains, one for primary, the other for secondary outcomes. Kupsch 2006 adequately blinded investigators assessing the primary and secondary outcomes. Volkmann 2014 adequately blinded investigators assessing the primary study outcomes, though the secondary trial outcomes were assessed unblinded, which represented a high risk of detection bias.

Incomplete outcome data

Both studies adequately reported the number and reasons for participant exclusions or missing data in both treatment arms, and these were evenly distributed across both treatment arms, so we rated them as having a low risk of bias. In both studies, the primary outcome (measured with the BFMDRS in Kupsch 2006, and the TWSTRS in Volkmann 2014) was reported with adequate imputation methods. However, all remaining outcome data were reported per protocol.

Selective reporting

We considered that the more clinically relevant outcomes that are usually evaluated in intervention trials for this condition were reported in both Kupsch 2006 and Volkmann 2014, so we considered them at low risk of bias for reporting data. Both studies had a protocol available at clinicaltrials.gov. The Kupsch 2006 protocol was registered under number NCT00142259, with an "unknown" status, meaning that the study had passed its completion date, and its status had not been verified in more than two years. The Volkmann 2014 protocol was registered under number NCT00148889, and had a "completed" status.

Other potential sources of bias

For‐profit bias

Both studies were supported and funded by Medtronic; we rated them as high risk of bias in this domain.

Prospective clinical trial registration

We rated Kupsch 2006 as high risk of bias because the trial was registered after the trial had begun. Volkmann 2014 had a prospective clinical trial registration; therefore, we rated it as low risk of bias in this domain.

Publication bias

We had intended to use funnel plots to explore publication bias. However, due to the small number of included studies, the power of this analysis was considered to be inadequate (Sterne 2011).

Effects of interventions

See: Summary of findings 1 Deep brain stimulation compared to sham stimulation in generalised or segmental dystonia; Summary of findings 2 Deep brain stimulation compared to sham stimulation in cervical dystonia; Summary of findings 3 Deep brain stimulation compared to sham stimulation in dystonia

The key results of this review can be found in summary of findings Table 1, summary of findings Table 2, and summary of findings Table 3.

The two studies included in this review evaluated two populations that were not clinically comparable. Kupsch 2006 included people with generalised and segmental dystonia, and Volkmann 2014 included people with focal (cervical) dystonia. Therefore, for all efficacy outcomes, we presented the results separately for each population subgroup, since pooling the data would not be justifiable or useful on clinical grounds. For safety outcomes, we opted to pool the proportion of participants with adverse events from both studies in a meta‐analysis, since both used the same intervention and comparison, applied to the same region of the brain (Chen 2014). It is important to note that this comparison isolates only the effect of neurostimulation on overall safety, and not the risk of adverse events with DBS compared to placebo or no intervention.

Critical Outcomes

Dystonia‐specific symptoms

The primary outcome in Kupsch 2006 was measured as change from baseline with the BFMDRS (total score range 0 to 150), which is composed of a movement sub‐scale (maximum score 120), based on clinical patient examination, that assesses dystonia severity and provoking factors in different body areas, and a disability sub‐scale (maximum score 30), that evaluates the patient's report of disability in activities of daily living. The higher the score, the greater the level of morbidity. In the absence of an established minimum important difference in the BFMDRS total score, we considered a 20% change from baseline to represent a clinically meaningful change.

The primary outcome in Volkmann 2014 was measured as change from baseline with the TWSTRS, which is currently the clinically validated tool most commonly used to assess and document the status of people with cervical dystonia. The TWSTRS (total score range 0 to 85) is a composite of three sub‐scales that evaluate different features of cervical dystonia: severity (range 0 to 35), disability (range 0 to 30), and pain (range 0 to 20). The higher the score, the greater the level of morbidity. In the absence of an established minimum important difference in the TWSTRS total score, we considered a 20% change from baseline as representing a clinically meaningful change.

Kupsch 2006 reported data for the mean change from baseline in the BFMDRS movement sub‐scale at three months. Treatment with DBS was associated with a greater improvement than sham stimulation for adults with generalised and segmental dystonia (mean difference (MD) 14.40 BFMDRS units, 95% confidence interval (CI) 8.0 to 20.80; N = 40). Treatment with DBS was also associated with a greater improvement in the BFMDRS disability sub‐scale at three months (MD 3.10 units, 95% CI 1.72 to 4.48; N = 39).

Volkmann 2014 reported data for the mean change from baseline in all three TWSTRS sub‐scores and the total score at three months. Treatment with DBS was associated with a greater improvement than sham stimulation on each sub‐scale, with the exception of the TWSTRS pain sub‐scale, which showed inconclusive results between interventions for adults with cervical dystonia (TWSTRS total: MD 9.80 units, 95% CI 3.52 to 16.08; N = 59; TWSTRS severity: MD 3.80, 95% CI 1.84 to 5.76; N = 62; TWSTRS disability: MD 3.80 units, 95% CI 1.41 to 6.19; N = 61; TWSTRS pain: MD 0.70, 95% CI ‐3.06 to 1.66; N = 61).

Volkmann 2014 also reported the mean change from baseline with the Tsui score (total score range 0 to 25; the higher the score, the greater the level of morbidity), and the Bain and Findlay Clinical Tremor Rating Scale (BFCTRS; total score range 0 to 10; the higher the score, the greater the level of morbidity) at three months, as secondary outcomes. Treatment with DBS was associated with improvements on both scales (Tsui: MD 4.20 units, 95% CI 2.08 to 6.32; N = 56; BFCTRS: MD 1.60 units, 95% CI 0.48 to 2.72; N = 59).

In the trial sequential analysis, the cumulative evidence exceeded the required sample size generated by a superiority sample size calculation (considering a 20% change from control group baseline status). Therefore, we considered that the cumulative evidence was adequately powered in both studies.

Adverse events

Both studies reported data on the proportion of participants with adverse events. Neither study found a conclusive difference in the risk of adverse events between neurostimulation and sham stimulation. Kupsch 2006 reported a risk ratio (RR) of 1.67 (95% CI 0.46 to 6.06; N = 40), and Volkmann 2014 reported an RR of 1.56 (95% CI 0.93 to 2.61; N = 62).

Kupsch 2006 considered infection at the stimulator site and lead dislodgement to be serious adverse events. Infection occurred in one participant (5%) in the neurostimulation group and two participants (10%) in the sham stimulation group. Lead dislodgement occurred once (5%) in the neurostimulation group.

In Volkmann 2014, five participants (16%) in the neurostimulation group suffered serious adverse events (device infection, implantable pulse generator dislocation, electrode misplacement, hemiparesis or stroke, depression), while six (20%) suffered serious adverse events in the sham stimulation group. However, these data were presented together with the adverse events recorded during the unblinded phase of the trial.

The most frequently reported adverse events were device infection at the stimulation site (7.5% of all participants in Kupsch 2006, and 3% of all participants in Volkmann 2014, or one participant in each intervention arm), and surgical exchange of device components (3% of all participants in Volkmann 2014, occurring in two participants in the sham stimulation group). Kupsch 2006 also reported postoperative confusion (one participant, or 5% of the neurostimulation group), seizures (one participant, or 5% of the neurostimulation group), seroma (one participant, or 5% of the neurostimulation group), dysarthria (one participant, or 5% of the neurostimulation group), and facial weakness (one participant, or 5% of the sham stimulation group).

The risk of adverse events between the stimulation and non‐stimulation groups was inconclusive (RR 1.58, 95% CI 0.98 to 2.54; I² = 0%; N = 102 participants; Analysis 1.1; Figure 4).
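As a rough cross‐check of this pooled estimate, the following Python sketch re‐derives a pooled risk ratio from the two study‐level RRs and 95% CIs quoted above, using generic inverse‐variance weighting on the log scale together with Cochran's Q and I². This is an illustrative assumption, not the review's documented procedure: the weighting scheme applied in Review Manager is not stated in this section, and pooling from rounded summary statistics rather than raw event counts can only approximate the published figures. The function names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def log_rr_and_se(rr, ci_low, ci_high):
    """Recover log(RR) and its standard error from a reported RR and 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
    return math.log(rr), se


def pool_inverse_variance(studies):
    """Pool (RR, CI low, CI high) tuples by inverse variance on the log scale.

    Returns the pooled RR, its 95% CI limits, and I² as a percentage.
    """
    effects, weights = [], []
    for rr, ci_low, ci_high in studies:
        y, se = log_rr_and_se(rr, ci_low, ci_high)
        effects.append(y)
        weights.append(1.0 / se ** 2)

    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    se_pooled = math.sqrt(1.0 / total_w)

    # Cochran's Q and I² (truncated at zero)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    return (
        math.exp(pooled),
        math.exp(pooled - Z95 * se_pooled),
        math.exp(pooled + Z95 * se_pooled),
        i2,
    )


# Study-level values as quoted in the text: Kupsch 2006, Volkmann 2014
STUDIES = [(1.67, 0.46, 6.06), (1.56, 0.93, 2.61)]

pooled_rr, ci_low, ci_high, i2 = pool_inverse_variance(STUDIES)
print(f"Pooled RR {pooled_rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f}); I2 = {i2:.0f}%")
```

With these inputs the sketch lands within rounding distance of the reported pooled RR and confidence interval. Because Q is smaller than its degrees of freedom, I² is 0% and the between‐study variance estimate is zero, so a random‐effects model reduces to this fixed‐effect calculation.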


Forest plot of comparison: Neurostimulation vs sham stimulation for outcome – adverse events

In the trial sequential analysis, the cumulative evidence did not reach the sample size generated by a superiority sample size calculation. Therefore, the cumulative evidence was not adequately powered for the purpose of safety evaluation.

Important Outcomes

Clinical status

Both studies reported data on clinical status by both clinicians and patients. The instruments used to measure this outcome were the visual analogue scale (VAS) in Kupsch 2006 and the Clinical Global Impression Scale (CGIS) in Volkmann 2014. This VAS was a composite of three sub‐scores of dystonia severity rated by the patient, dystonia severity rated by the physician, and pain severity rated by the patient, each ranging from 0 to 10, with higher scores indicating higher severity. The CGIS is a composite of two sub‐scores on dystonia severity rated by the patient and by the physician, also ranging from 0 to 10, with higher scores indicating higher severity.

Kupsch 2006 reported data for mean change in dystonia severity at three months. Overall, DBS was associated with improved clinical status, reported by both patients and clinicians (patient assessment: MD 3.50, 95% CI 2.33 to 4.67; N = 37; clinician assessment: MD 3.00, 95% CI 2.32 to 3.68; N = 38).

Volkmann 2014 reported data for mean change in dystonia severity at three months. Overall, DBS was associated with improved clinical status, reported by both patients and clinicians (patient assessment: MD 2.30, 95% CI 1.15 to 3.45; N = 61; clinician assessment: MD 2.20, 95% CI 1.30 to 3.10; N = 61).

Quality of life

The principal instrument used to assess mean change in quality of life was the 36‐item Short Form Health Survey (SF‐36), used by both studies. The SF‐36 is a clinically well‐characterised quality of life rating scale that evaluates eight domains of functioning, each with a 0 to 100 range, with a higher score indicating higher level of functioning. The following domains were assessed: physical functioning, bodily pain, role limitations due to physical health problems, role limitations due to personal or emotional problems, mental health, social functioning, vitality, and general health perceptions. Volkmann 2014 also reported data using the Craniocervical Dystonia Questionnaire (CDQ‐24), which is a patient‐rated quality of life questionnaire used to measure craniocervical dystonia (mainly cervical dystonia and blepharospasm). It consists of 24 items in 5 domains: stigma, emotional well‐being, pain, activities of daily living, and social life. Each item is rated on a 5‐point scale. The higher the score, the higher the level of morbidity.

Kupsch 2006 reported data as mean change from baseline in the physical functioning and mental health domains of the SF‐36 at three months. Overall, DBS was associated with an improvement in the physical functioning domain, and inconclusive results in the mental health domain (physical functioning: MD 6.30, 95% CI 1.06 to 11.54; mental health: MD 5.00, 95% CI ‐2.14 to 12.14; N = 33).

Volkmann 2014 reported data as mean change from baseline for each of the eight domains of the SF‐36 at three months. Overall, there were no conclusive results between the DBS and sham stimulation groups on any domain (physical functioning: MD 3.00, 95% CI ‐7.71 to 13.71; N = 57; role limitation due to physical problems: MD 6.20, 95% CI ‐16.05 to 28.45; N = 56; bodily pain: MD 5.50, 95% CI ‐5.57 to 16.57; N = 58; general health perception: MD 6.70, 95% CI ‐1.21 to 14.61; N = 56; vitality: MD 3.80, 95% CI ‐4.45 to 12.05; N = 57; social functioning: MD 3.90, 95% CI ‐12.17 to 19.97; N = 58; role limitation due to emotional problems: MD 19.50, 95% CI ‐3.64 to 42.64; N = 57; mental health: MD 2.40, 95% CI ‐6.20 to 11.00; N = 56). The results measured with the CDQ‐24 were also inconclusive between the DBS and sham stimulation groups (MD 6.00, 95% CI ‐0.87 to 12.87; N = 59).

Functional capacity

Kupsch 2006 reported mean change from baseline in the BFMDRS disability sub‐scale at three months. Overall, DBS was associated with improved functional capacity (MD 3.10, 95% CI 1.72 to 4.48; N = 39).

Volkmann 2014 reported mean change from baseline in the TWSTRS disability score at three months. Overall, DBS was associated with improved functional capacity (MD 3.80, 95% CI 1.41 to 6.19; N = 61).

Emotional state

Different tools were used to assess change from baseline for emotional functioning. The Beck Depression Inventory (BDI) is an instrument with 21 participant‐rated items that measure attitudes and symptoms typical of depression; each item can be rated from 0 to 3, for a total score range of 0 to 63, where higher scores indicate more severity. The Beck Anxiety Inventory is a clinically‐validated tool with 21 participant‐rated items that measure attitudes and symptoms of anxiety; each item can be rated from 0 to 3, for a total score range of 0 to 63, where higher scores indicate more severity. The Brief Psychiatric Rating Scale (BPRS) is a clinically‐validated psychiatric tool, with 18 symptom domains, for which the rater evaluates the participant on a range of 1 to 7 for each domain, for a total score range of 18 to 126; higher scores indicate more severity.

Kupsch 2006 reported mean change from baseline in emotional state with the BDI, the Beck Anxiety Inventory, and the BPRS at three months. Overall, the results were inconclusive between the DBS and sham stimulation groups with any of these instruments (BDI: MD 4.60, 95% CI ‐2.06 to 11.26; N = 30; Beck Anxiety Inventory: MD 4.50, 95% CI ‐2.66 to 11.66; N = 35; BPRS: MD 2.90, 95% CI ‐1.87 to 7.67; N = 37).

Volkmann 2014 reported mean change from baseline in emotional state with the BDI and the BPRS at three months. Overall, DBS was associated with improved scores on the BDI, though not on the BPRS (BDI: MD 3.10, 95% CI 0.73 to 5.47; N = 61; BPRS: MD ‐0.30, 95% CI ‐3.80 to 3.20; N = 60).

Tolerability

We assessed tolerability as the proportion of participants who withdrew from the study, or interrupted DBS due to adverse events, measured at any point during study follow‐up. Kupsch 2006 did not report any withdrawals. Volkmann 2014 reported a single withdrawal, due to withdrawal of consent after failure of electrode implantation in the neurostimulation group.

Since Review Manager 5 does not allow combination of zero‐event data, we used R to combine these data (R 2017). We handled the zero‐events by applying a constant correction of 0.5. Overall, the results between the neurostimulation and sham stimulation groups were inconclusive for tolerability (RR 1.86, 95% CI 0.16 to 21.57; I² = 0%; N = 102 participants).
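The zero‐event handling can be sketched as follows, in Python for illustration rather than the R code the review authors used. Two points are assumptions and are not stated in the review: the arm sizes, which are inferred from the percentages reported under 'Adverse events' (20 per arm in Kupsch 2006; 32 neurostimulation and 30 sham stimulation participants in Volkmann 2014), and the details of the correction, taken here as adding 0.5 to every cell of each study's 2×2 table before inverse‐variance pooling on the log scale. The function names are hypothetical.

```python
import math

Z95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def corrected_log_rr(events_t, n_t, events_c, n_c, k=0.5):
    """log(RR) and its variance after adding k to every cell of the 2x2 table."""
    a, total_t = events_t + k, n_t + 2 * k  # events and arm total, neurostimulation
    c, total_c = events_c + k, n_c + 2 * k  # events and arm total, sham stimulation
    log_rr = math.log((a / total_t) / (c / total_c))
    variance = 1 / a - 1 / total_t + 1 / c - 1 / total_c
    return log_rr, variance


def pool_withdrawals(trials):
    """Inverse-variance pooled RR with 95% CI from continuity-corrected trials."""
    effects, weights = [], []
    for events_t, n_t, events_c, n_c in trials.values():
        y, var = corrected_log_rr(events_t, n_t, events_c, n_c)
        effects.append(y)
        weights.append(1.0 / var)
    total_w = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total_w
    half_width = Z95 * math.sqrt(1.0 / total_w)
    return (
        math.exp(pooled),
        math.exp(pooled - half_width),
        math.exp(pooled + half_width),
    )


# (withdrawals DBS, n DBS, withdrawals sham, n sham); arm sizes are ASSUMED
TRIALS = {
    "Kupsch 2006": (0, 20, 0, 20),
    "Volkmann 2014": (1, 32, 0, 30),
}

rr, ci_low, ci_high = pool_withdrawals(TRIALS)
print(f"Pooled RR {rr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the sketch gives a result in line with the reported tolerability estimate. The very wide interval reflects that a single withdrawal across 102 participants carries almost no information, so the correction constant largely determines the result.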

Discussion

Summary of main results

This review included two parallel‐group, randomised, double‐blind clinical trials comparing deep brain stimulation (DBS) to sham stimulation in adults with generalised or segmental (Kupsch 2006), and focal (cervical) dystonia (Volkmann 2014), with a combined total of 102 participants.

Due to the difference in the body distribution of dystonia between patient populations, we analysed outcomes related to DBS efficacy separately for each study, even though the other inclusion criteria were similar.

As can be seen in summary of findings Table 2, due to low‐quality evidence and an important effect size, we concluded that DBS may improve cervical dystonia‐related impairment, overall functional capacity, and overall mood. Due to low‐quality evidence and a small effect size, we concluded that DBS may slightly improve overall subjective evaluation of clinical status. Due to very low‐quality evidence and an inconclusive effect, we were uncertain whether DBS improved overall physical functioning‐related quality of life and overall mental health‐quality of life (Volkmann 2014).

As can be seen in summary of findings Table 1, due to low‐quality evidence and an important effect size, we concluded that DBS may improve generalised or segmental dystonia‐related impairment, overall subjective evaluation of clinical status, overall physical functioning‐related quality of life, and overall dystonia‐related functional capacity. Due to very low‐quality evidence and an inconclusive effect, we were uncertain whether DBS improved overall mental health‐quality of life and mood (Kupsch 2006).

We pooled outcomes related to safety and tolerability, since both trials used the same intervention and comparison. As can be seen in summary of findings Table 3, due to very low‐quality evidence and an inconclusive effect, we were uncertain whether DBS affects the risk of adverse events or tolerability. The difference in the risk of adverse events between groups was inconclusive, though this may be due to the small sample that was analysed for this outcome. The short duration of the trials, and the small sample size, precluded strong conclusions about differences between DBS and sham stimulation. Volkmann 2014 reported a higher proportion of adverse events than Kupsch 2006, but the difference between groups within that study was inconclusive. Serious adverse events of special interest to those contemplating treatment were device infection at the stimulation site, lead dislodgement, surgical exchange of device components, hemiparesis or stroke, and depression. Of these, the most common were device infection and lead dislodgement.

Overall completeness and applicability of evidence

Both trials answered the primary research question directly, using different assessment tools – the Toronto Western Spasmodic Torticollis Rating Scale (TWSTRS) and the Burke‐Fahn‐Marsden Dystonia Rating Scale (BFMDRS). Data were reported fully for all the outcomes; however, in most cases, results could not be pooled and compared among the studies, due to the difference in the body distribution of dystonia between patient populations. This limited the amount of data available, and consequently, our confidence in the overall conclusions.

The participants included in the studies were not fully representative of the overall population of people with dystonia, as they represented only three body distributions typical of the condition (generalised, segmental, and cervical dystonia). The effects of population enrichment, and the moderate to severe disease impairment at baseline, assessed by the TWSTRS and BFMDRS, precluded definite conclusions concerning all people with this condition. Since Kupsch 2006 studied two different body distributions of dystonia (generalised and segmental dystonia), we considered access to subgroup data to be important, given possible differences between efficacy, risk, safety, and benefit profiles.

Both trials evaluated the same DBS device (Kinetra model from Medtronic, Inc) and lead models 3387 and 3389 (Medtronic, Inc). However, different DBS devices are manufactured, including neurostimulators, leads, extensions, programmers, and DBS surgery kits. It would be important to evaluate if there was a significantly different efficacy, safety, or tolerability profile depending on the devices, stimulation protocols used, or both.

Both trials assessed the risk of adverse events. However, the trials were primarily designed to evaluate efficacy, so the investigators chose sham stimulation as the comparator. This type of control reduces the ability to detect differences in safety outcomes, because both groups will experience the cause of most short‐term complications from implantable pulse generator DBS, namely the surgery itself. In addition, the limited trial duration meant that adverse events known to occur later, such as lead and battery complications, would not be detected. Therefore, readers should interpret the safety results from included trials with caution.

Costs for DBS range from USD43,232 to USD610,609, with an average cost over five years amounting to roughly USD186,244 in patients with Parkinson's disease, according to a qualitative systematic review (Becerra 2016). Another study presented the Medtronic UK price listing, with total DBS costs reaching GBP11,000 (for DBS extensions, leads, patient programmer, and implantation procedure). Implantable pulse generator replacement has to be taken into account after two to five years, depending on the device, and managing adverse events adds to the direct costs (Eggington 2014). Besides these direct costs, expenses for follow‐up visits (which can occur at intervals of two weeks to three months) for stimulation reprogramming or adverse event management have to be taken into account to fully assess the cost and applicability of DBS. Availability and direct and indirect costs vary from country to country.

Since both trials studied the same target nuclei, data cannot be applied to other functional neurosurgery approaches.

Quality of the evidence

See 'Risk of bias' tables, and 'Risk of bias' summary tables (Figure 2; Figure 3).

Overall, most outcomes were supported by low‐quality or very low‐quality evidence. We considered both studies to be at a high risk of bias for 'blinding of personnel and participants' and 'for‐profit bias' domains. In Volkmann 2014, treating physicians were not blinded. In both studies, we judged that the programming sessions were not adequately blinded. We considered Volkmann 2014 at high risk of bias for 'blinding of outcome assessment'. While there was adequate blinding of investigators who assessed the primary study outcomes, the secondary trial outcomes were assessed unblinded, representing a high risk of detection bias. We considered Kupsch 2006 at high risk of bias for 'prospective clinical trial registration'. Both studies were supported and funded by Medtronic, so we rated both at high risk of bias in this domain. Thus, we downgraded both studies for study limitations.

Most people with focal and segmental dystonia are controlled with botulinum toxin therapy, and only those with the most severe forms of dystonia tend to opt for potentially dangerous surgery. The included trials followed participants for only three to six months, which raised concerns about the generalisability of the findings. Therefore, we downgraded our confidence in the evidence for all outcomes due to indirectness.

We were unable to compare outcomes across studies, with the exception of adverse events and withdrawals, due to different participant populations. The included trials enrolled 40 and 62 participants, respectively; each enrolled more participants than the total number required for a single adequately powered trial. In Volkmann 2014, results were inconclusive for physical functioning‐related and mental health‐related quality of life. In Kupsch 2006, results were inconclusive for mental health‐related quality of life and mood. Therefore, we downgraded these outcomes for imprecision.

The analysis of adverse events was under‐powered to evaluate the proportion of participants with adverse events, since the cumulative evidence reached neither the required information size generated by a conventional sample size calculation nor the sample size generated by a superiority sample size calculation. Therefore, we also downgraded the evidence for the critical safety outcome for imprecision.

Potential biases in the review process

Although we followed the methods recommended by Cochrane in order to minimise bias in the review process, it should be noted that, since trial authors did not describe the referral method for participants, and these studies were done in the same centres, there may be a form of selection bias that was not adequately explored in the current review. Since Kupsch 2006 studied two different body distributions of dystonia (generalised and segmental dystonia), we considered that access to subgroup data would have been important, as well as access to individual data for each population, which was lacking.

Agreements and disagreements with other studies or reviews

The current review is, to our knowledge, the first systematic review that compares DBS with sham stimulation in randomised controlled trials. We included all randomised controlled trials that address this question in the current review.

Recently, a systematic review and meta‐analysis on the efficacy of DBS targeting the internal globus pallidus in isolated inherited or idiopathic dystonia was published, which included 24 studies with a total of 523 patients (Moro 2017). All included studies had a prospective, uncontrolled, and observational design. They only included studies reporting results of the BFMDRS, since most studies using the TWSTRS had poor reporting or incomplete data. They did not include safety outcomes in the meta‐analysis due to poor reporting. Moro 2017 reported outcomes as absolute improvement and percentage improvement at 6 months, 12 months, and last follow‐up. The mean absolute change in BFMDRS movement scores at last follow‐up was 26.6 points (95% CI 22.4 to 30.8); the percentage improvement was 65% (95% CI 59.6 to 70.7). The corresponding change in the BFMDRS disability scores at the last follow‐up was 6.4 points (95% CI 5.0 to 7.8); the percentage improvement was 58.6% (95% CI 50.3 to 66.9). The study authors indicated that they used both fixed‐effect and random‐effects models for statistical analysis, and chose the most appropriate model for each meta‐analysis based on the presence of heterogeneity. They preferred a random‐effects model in the presence of significant heterogeneity. The authors stated that they assessed heterogeneity with Cochran's Q test and the I² statistic, but they presented no data on the results.

There is an ongoing randomised, double‐blinded, sham‐controlled trial assessing the efficacy and safety of pallidal deep brain stimulation versus botulinum toxin A therapy in cervical dystonia (Drechsel 2017). Recruitment was to start in 2017, with first results expected in 2018. Planned primary outcome was change in TWSTRS total score between baseline and six months of therapy. Planned secondary outcomes were changes in TWSTRS motor score, Tsui score, CDQ‐24, and SF‐36. Safety outcomes will be assessed by spontaneously reported adverse effects.

Figure 1. Study flow diagram

Figure 2. Risk of bias graph: review authors' judgements about each source of risk of bias presented as percentages across all included studies

Figure 3. Risk of bias summary: review authors' judgements about each source of risk of bias for each included study

Figure 4. Forest plot of comparison: Neurostimulation vs sham stimulation for outcome – adverse events

Analysis 1.1. Comparison 1: Neurostimulation vs sham stimulation, Outcome 1: Adverse events

Summary of findings 1. Deep brain stimulation compared to sham stimulation in generalised or segmental dystonia

Deep brain stimulation compared to sham stimulation in generalised or segmental dystonia

Patient or population: adults with generalised or segmental dystonia
Setting: tertiary hospitals in Germany, Norway, and Austria
Intervention: deep brain stimulation (DBS)
Comparison: sham stimulation

| Outcomes | Anticipated absolute effects* (95% CI): without DBS | Anticipated absolute effects* (95% CI): with DBS | Anticipated absolute effects* (95% CI): difference | No of participants (studies) | Certainty of the evidence (GRADE) | What happens |
| --- | --- | --- | --- | --- | --- | --- |
| Dystonia‐specific improvement, assessed with BFMDRS movement score (follow‐up: 3 months) | The mean dystonia‐specific improvement without DBS was 1.4 fewer units | The mean dystonia‐specific improvement with DBS was 15.8 fewer units | 14.4 units fewer (8.0 to 20.8 fewer) | 40 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall generalised or segmental dystonia severity |
| Subjective evaluation of clinical status, assessed with Visual Analogue Scale (follow‐up: 3 months) | The mean subjective evaluation of clinical status without DBS was 0.1 higher units | The mean subjective evaluation of clinical status with DBS was 3.4 higher units | 3.5 units fewer (2.33 to 4.67 fewer) | 37 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall subjective improvement of clinical status |
| Quality of life assessment, assessed with SF‐36: physical function (follow‐up: 3 months) | The mean quality of life assessment without DBS was 3.8 higher units | The mean quality of life assessment with DBS was 10.1 higher units | 6.3 units higher (1.06 to 11.54 higher) | 33 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall physical functioning quality of life |
| Quality of life assessment, assessed with SF‐36: mental health (follow‐up: 3 months) | The mean quality of life assessment without DBS was 0.2 higher units | The mean quality of life assessment with DBS was 5.2 higher units | 5.0 units higher (2.14 lower to 12.14 higher) | 33 (1 RCT) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes overall mental health quality of life |
| Functional capacity, assessed with BFMDRS disability score (follow‐up: 3 months) | The mean functional capacity without DBS was 0.8 fewer units | The mean functional capacity with DBS was 3.9 fewer units | 3.1 units fewer (1.72 to 4.48 fewer) | 39 (1 RCT) | ⊕⊕⊝⊝ LOW 1,2 | DBS may improve overall dystonia‐related functional capacity |
| Emotional assessment, assessed with Beck Depression Inventory (follow‐up: 3 months) | The mean emotional assessment without DBS was 0.5 fewer units | The mean emotional assessment with DBS was 5.1 fewer units | 4.6 units fewer (11.26 fewer to 2.06 more) | 30 (1 RCT) | ⊕⊝⊝⊝ VERY LOW 1,2,3 | We are uncertain whether DBS changes overall emotional assessment |

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; RR: Risk ratio

GRADE Working Group grades of evidence
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect
Moderate certainty: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low certainty: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect
Very low certainty: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect

1 Serious study limitations: moderate risk of bias (three domains with high risk of bias)

2 Serious indirectness: short‐term follow‐up (3 to 6 months) precludes firm conclusions

3 Serious imprecision: the information size criteria were met, but the 95% CI failed to exclude important benefit or important harm
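As a rough reading aid for footnote 3, the sketch below checks whether a 95% CI for a mean difference includes zero (no effect). This is a simplified stand-in, not the review authors' criterion: the imprecision judgement refers to important benefit or harm, and those thresholds are not stated in this table. The function name `ci_includes_null` and the sign convention (negative values for "fewer" or "lower") are assumptions made for illustration; the intervals are copied from the rows above.

```python
def ci_includes_null(lower: float, upper: float, null_value: float = 0.0) -> bool:
    """Return True when the interval [lower, upper] contains the null value."""
    return lower <= null_value <= upper


# Mean differences (DBS minus sham) with 95% CIs from the rows above.
# Assumed sign convention: "fewer"/"lower" is negative, "more"/"higher" is positive.
rows = {
    "SF-36 physical function": (1.06, 11.54),
    "SF-36 mental health": (-2.14, 12.14),
    "BFMDRS disability score": (-4.48, -1.71),
    "Beck Depression Inventory": (-11.26, 2.06),
}

crosses_null = {name: ci_includes_null(lo, hi) for name, (lo, hi) in rows.items()}

for name, flag in crosses_null.items():
    print(f"{name}: CI includes no effect = {flag}")
```

Among the four rows checked here, the two whose interval includes zero (SF‐36 mental health and Beck Depression Inventory) are the two rated VERY LOW with footnote 3.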

Summary of findings 2. Deep brain stimulation compared to sham stimulation in cervical dystonia

Deep brain stimulation compared to sham stimulation in cervical dystonia

Patient or population: adults with cervical dystonia
Setting: tertiary hospitals in Germany, Norway, and Austria
Intervention: deep brain stimulation (DBS)
Comparison: sham stimulation

Outcomes

Anticipated absolute effects* (95% CI)

No of Participants
(studies)

Certainty of the evidence
(GRADE)

What happens

Without DBS

With DBS

Difference

Dystonia‐specific symptoms

(assessed with TWSTRS; score range 0 to 85; higher = worse; follow‐up 3 months)

The mean dystonia‐specific Improvement without DBS was 8.5 fewer units

The mean dystonia‐specific Improvement with DBS was 18.3 fewer units

9.8 units fewer
(3.52 to 16.08 fewer)

62

(1 RCT)

⊕⊕⊝⊝
LOW 1,2

DBS may improve overall cervical dystonia severity

Clinical status
assessed with Clinical Global Impression Scale
(follow‐up: 3 months)

The mean subjective Evaluation of Clinical Status without DBS was 1.2 fewer units

The mean subjective Evaluation of Clinical Status with DBS was 3.5 fewer units

2.3 units fewer
(1.15 to 3.45 fewer)

62

(1 RCT)

⊕⊕⊝⊝
LOW 1,2

DBS may slightly improve overall subjective improvement of clinical status

Quality of Life
using SF‐36: physical functioning
(follow‐up: 3 months)

The mean quality of Life Assessment without DBS was 3.6 higher units

The mean quality of Life Assessment with DBS was 6.6 higher units

3 units higher
(7.71 lower to 13.71 higher)

62

(1 RCT)

⊕⊝⊝⊝
VERY LOW 1,2,3

We are uncertain whether DBS changes overall physical functioning quality of life

Quality of Life Assessment
using SF‐36: mental health
(follow‐up: 3 months)

The mean quality of Life Assessment without DBS was 8.9 higher units

The mean quality of Life Assessment with DBS was 11.3 higher units

2.4 units higher
(6.2 lower to 11 higher)

62

(1 RCT)

⊕⊝⊝⊝
VERY LOW 1,2,3

We are uncertain whether DBS changes overall mental health quality of life

Functional capacity
assessed with TWSTRS disability sub‐scale
(follow‐up: 3 months)

The mean functional capacity without DBS was 1.8 fewer units

The mean functional capacity with DBS was 5.6 fewer units

3.8 units fewer
(1.41 to 6.19 fewer)

62

(1 RCT)

⊕⊝⊝⊝
VERY LOW 1,2,3

We are uncertain whether DBS improves overall functional capacity

Emotional assessment
assessed with Beck Depression Inventory
(follow‐up: 3 months)

The mean emotional assessment without DBS was 0.4 fewer units

The mean emotional assessment with DBS was 3.5 fewer units

3.1 units fewer
(0.73 to 5.47 fewer)

62

(1 RCT)

⊕⊕⊝⊝
LOW 1,2

DBS may improve overall emotional assessment

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; RR: Risk ratio; TWSTRS: Toronto Western Spasmodic Torticollis Rating Scale

GRADE Working Group grades of evidence
High certainty: We are very confident that the true effect lies close to that of the estimate of the effect
Moderate certainty: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low certainty: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect
Very low certainty: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect

1 Serious study limitations: moderate risk of bias (three domains with high risk of bias)

2 Serious indirectness: short‐term follow‐up (3 to 6 months) precludes firm conclusions

3 Serious imprecision: the information size criteria were met, but the 95% CI failed to exclude important benefit or important harm

Summary of findings 3. Deep brain stimulation compared to sham stimulation in dystonia

Deep brain stimulation compared to sham stimulation in dystonia

Patient or population: adults with dystonia (generalised, segmental, and cervical)
Setting: tertiary hospitals in Germany, Norway, and Austria
Intervention: deep brain stimulation (DBS)
Comparison: sham stimulation

Outcomes

Relative effect
(95% CI)

Anticipated absolute effects* (95% CI)

No of Participants
(studies)

Quality of the evidence
(GRADE)

What happens

Without DBS

With DBS

Difference

Adverse Events
follow up: 3 months

RR 1.58
(0.98 to 2.54)

Study population

102
(2 RCTs)

⊕⊝⊝⊝
VERY LOW 1,2,3

We are uncertain whether DBS changes the risk of developing adverse events.

30.0%

47.4%
(29.4 to 76.2)

17.4% more
(0.6 fewer to 46.2 more)

Tolerability
follow up: 3 months

RR 1.86
(0.16 to 21.57)

Study population

102
(2 RCTs)

⊕⊝⊝⊝
VERY LOW 1,2,3

We are uncertain whether DBS changes the risk of tolerability.

0.0%

0.0%
(0.0 to 0.0)

0.0% fewer
(0 fewer to 0 fewer)

*The risk in the intervention group (and its 95% confidence interval) is based on the assumed risk in the comparison group and the relative effect of the intervention (and its 95% CI).

CI: Confidence interval; RR: Risk ratio

GRADE Working Group grades of evidence
High quality: We are very confident that the true effect lies close to that of the estimate of the effect
Moderate quality: We are moderately confident in the effect estimate: The true effect is likely to be close to the estimate of the effect, but there is a possibility that it is substantially different
Low quality: Our confidence in the effect estimate is limited: The true effect may be substantially different from the estimate of the effect
Very low quality: We have very little confidence in the effect estimate: The true effect is likely to be substantially different from the estimate of effect

1 Serious Study limitations: Moderate risk of bias across all included studies (three domains with high risk of bias in each study)

2 Serious Indirectness: Short‐term follow‐up (3‐6 months) precludes firm conclusions

3 Serious Imprecision: Minimal information size criteria was less than the number generated by a conventional sample size and alpha‐spending sample size calculations
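The asterisked note above says the risk with DBS is derived from the assumed risk in the comparison group and the relative effect. A minimal sketch of that arithmetic follows, using the adverse events row (assumed risk 30.0%, RR 1.58, 95% CI 0.98 to 2.54). The function name `absolute_effect` and the rounding to one decimal place are assumptions for illustration, not the review's own software.

```python
def absolute_effect(assumed_risk_pct: float, rr: float, rr_low: float, rr_high: float) -> dict:
    """Convert a risk ratio and its CI into absolute risks (in percent).

    The risk with the intervention is the assumed comparison-group risk
    multiplied by the risk ratio; the difference is that risk minus the
    assumed risk. Negative differences mean "fewer".
    """
    with_risk = assumed_risk_pct * rr
    with_low = assumed_risk_pct * rr_low
    with_high = assumed_risk_pct * rr_high
    return {
        "with": round(with_risk, 1),
        "with_ci": (round(with_low, 1), round(with_high, 1)),
        "difference": round(with_risk - assumed_risk_pct, 1),
        "difference_ci": (
            round(with_low - assumed_risk_pct, 1),
            round(with_high - assumed_risk_pct, 1),
        ),
    }


# Adverse events row: values taken from the table above.
adverse_events = absolute_effect(30.0, 1.58, 0.98, 2.54)
print(adverse_events)

# Tolerability row: the assumed risk without DBS is 0.0%.
tolerability = absolute_effect(0.0, 1.86, 0.16, 21.57)
print(tolerability)
```

For the adverse events row this reproduces the tabulated 47.4% (29.4 to 76.2) and the difference of 17.4% more (0.6 fewer to 46.2 more). When the assumed risk is 0.0%, multiplying by any risk ratio still gives zero, which is consistent with the 0.0% entries in the tolerability row.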

Table 1. Glossary of terms

Term

Definition

Deep brain stimulation

Neurosurgical procedure whereby an electric current is delivered by electrodes placed in the deep brain to stimulate target nuclei

Target nucleus or nuclei

Groups of neuronal cell bodies, located in the deep areas of the brain, selected for deep brain stimulation

Dystonia

Common movement disorder in which people have abnormal torsion movements, or postures of one or more body segments, such as the neck or a limb, that they cannot control. It is frequently accompanied by social embarrassment and pain.

Primary dystonia

Dystonic disorder caused by an intrinsic basal ganglia problem unrelated to any other disease. It is sometimes caused by a mutation; dystonia is the main clinical manifestation in the majority of primary dystonias

Secondary dystonia

Dystonic disorder caused by another disease (e.g. stroke)

Generalised dystonia

Dystonia affecting all body segments (i.e. trunk, upper and lower limbs)

Cervical dystonia

Dystonia affecting the neck

Blepharospasm

Dystonia affecting the eyelids

Comparison 1. Neurostimulation vs sham stimulation

Outcome or subgroup title

No. of studies

No. of participants

Statistical method

Effect size

1.1 Adverse events

2

102

Risk Ratio (M‐H, Random, 95% CI)

1.58 [0.98, 2.54]
