Research Report

The Responsiveness and Interpretability of the Shoulder Pain and Disability Index

Journal of Orthopaedic & Sports Physical Therapy
Volume 47, Issue 4, Pages 278–286

Abstract

Study Design

Clinical measurement study, prospective cohort design.

Background

Shoulder pain is a common disorder, and treatment is most often focused on a reduction of pain and functional disabilities. Several reviews have encouraged the use of the Shoulder Pain and Disability Index (SPADI) to objectify functional disability. It is important to assess the responsiveness and interpretability of the SPADI in patients seeking physical therapy treatment for their shoulder pain in a primary care setting.

Objective

To assess the responsiveness and interpretability of the SPADI in patients with shoulder pain visiting a physical therapist in primary care.

Methods

The target population consisted of patients who consulted a physical therapist for their shoulder pain. The patients received physical therapy treatment and completed the Dutch-language version of the SPADI at baseline and at 26-week follow-up. For interpretability, floor and ceiling effects were assessed, and the minimal important change (MIC) was determined using the receiver operating characteristic method; a visual anchor-based MIC distribution method was used to assess several Global Perceived Effect scale (GPE)-based anchors. The measurement error was calculated using the smallest detectable change. For the responsiveness, the area under the receiver operating characteristic curve was used, and correlations with the GPE and the change score of the Shoulder Disability Questionnaire (as this questionnaire measures the same construct) were assessed.

Results

A total of 356 patients participated at baseline and 237 (67%) returned the SPADI after 26 weeks. The mean score on the SPADI at baseline was 46.7 points (on a 0–100 scale). The SPADI showed no signs of floor and ceiling effects. The smallest detectable change was 19.7 points. The MIC was 20 (43% of baseline value), and therefore a change of 43% or more in an individual patient was considered to be clinically relevant. The area under the receiver operating characteristic curve (AUC) was 0.81, the Spearman correlation between the SPADI change score and the GPE was 0.53, and the Pearson correlation between the Shoulder Disability Questionnaire and the SPADI change score was 0.71.

Conclusion

The results of this study confirm the responsiveness of the SPADI, making it a useful instrument to assess functional disability in longitudinal studies; however, the measurement error should be taken into account when making decisions in individual patients. J Orthop Sports Phys Ther 2017;47(4):278–286. Epub 3 Feb 2017. doi:10.2519/jospt.2017.7079

Shoulder pain is a common disorder in Western society,18 with a prevalence ranging from 7% to 27%,29 making it the second most reported musculoskeletal complaint in general practice.35 Apart from pain, one of the main complaints of patients with shoulder pain is functional disability. Thus, treatment of shoulder disorders is usually aimed at reducing pain and functional disability.50

Self-administered shoulder pain and disability questionnaires are designed to measure functional disability. These patient-reported outcome measures are often used in both clinical and research environments to assess patients' perceived levels of disability and the impact of the disease on daily activities,30 and to evaluate functional status.50

Several reviews have encouraged the use of the Shoulder Pain and Disability Index (SPADI).3,4,37,46 The SPADI is a disease-specific instrument and is frequently used in primary care. The SPADI is easy to complete, convenient to use, and not time consuming to fill out.34 It has been translated and validated (using hypothesis testing) into Danish, Norwegian, Tamil, German, Turkish, and Slovene.1,2,6,15,19,21 The Dutch SPADI (SPADI-D) has been recently validated using hypothesis testing for known-group validity (high initial pain and work absence), divergent validity (depression item on the EuroQol-5 dimensions questionnaire [EQ-5D]), and convergent validity (Shoulder Disability Questionnaire [SDQ]), has been shown to be reliable,45 and has been recommended in an evidence-based statement on shoulder pain by the Royal Dutch Physical Therapy Association.20 The responsiveness and interpretability of the SPADI-D have not been assessed before.46

A systematic review found moderate positive evidence for responsiveness of the English and Norwegian versions of the SPADI.46 The 3 primary studies in this review that assessed the responsiveness of the SPADI varied in both setting and patient population. Only 1 study was performed in a physical therapy setting, with participating patients diagnosed with adhesive capsulitis.40 The other 2 studies were performed in different settings: a general practitioner setting (patients with rotator cuff disease)14 and a shoulder pain clinic setting (patients with mixed “shoulder pain”).34

It is important to assess the responsiveness and interpretability of the SPADI-D in patients who seek treatment by a physical therapist for their shoulder pain in a primary care setting. In the literature, interpretability is defined as the degree to which one can assign qualitative meaning to an instrument's quantitative scores or changes in scores.32 Therefore, information about floor and ceiling effects and the minimal important change (MIC) should be provided.31 The MIC is the smallest change in the score of an instrument that patients perceive as important.11 The measurement error is the systematic and random error of a patient's score that is not attributed to true changes in the construct to be measured.32 Preferably, the measurement error should be smaller than the MIC.31,44 However, this is often not the case, which can be a consequence of the use of different mediators and calculations. Responsiveness is defined as the ability of an instrument to detect changes over time in the construct to be measured.32

Therefore, the aim of this study was to evaluate the measurement error, interpretability, and responsiveness of the SPADI-D in patients seeking treatment by a physical therapist for shoulder pain in a primary care setting. We used the Global Perceived Effect (GPE) scale as the external criterion for improvement. To assess responsiveness, we hypothesized that the change score of the SPADI-D would be highly correlated with a shoulder-specific instrument (the SDQ) and with the GPE scale. A lower correlation was expected with a questionnaire with a different focus (3-level version of the EQ-5D [EQ-5D-3L]).

Methods

Design

This study is part of a prospective cohort study, including patients with shoulder complaints in a primary care physical therapy setting. Details of the design are presented elsewhere.25 The Medical Ethics Committee of the Erasmus Medical Center in Rotterdam approved the study protocol (MEC-2011-414). All participants signed informed consent.

Study Population

Patients were recruited from primary care physical therapy clinics between November 2011 and December 2012. Patients with shoulder pain were eligible for inclusion if they were 18 years of age or older and adequately understood the Dutch language. Patients were excluded if they had serious pathology (infection, cancer, or fracture), previous surgery, or diagnostic imaging techniques of the shoulder in the previous 3 months.

Therapists

Physical therapy sessions were not standardized. Physical therapists collected data at baseline and after 12 weeks on what kind of diagnostic label was used on patients, what type of treatment was used, and how many treatment sessions were given within the time frame.

Baseline Measurement

Patients received a baseline assessment followed by usual physical therapy care. Participating patients received an online questionnaire that included the SDQ, SPADI, and EQ-5D-3L, all in Dutch. All 3 questionnaires have been reported to take approximately 3 minutes to complete.5,13,34,51

The SPADI-D

The SPADI is a self-administered questionnaire designed to measure pain and disability associated with shoulder pain. It consists of 13 items (5 pain-related items and 8 disability-related items).36 However, factor analysis of the SPADI-D did not confirm the original factor structure; the SPADI-D is based on 1 factor only.45 Each question refers to the past week. Items can be scored on a visual analog scale, ranging from 0 to 10, where 0 represents “no pain/no difficulty” and 10 “worst pain imaginable/so difficult it requires help.”34,36 The total score varies between 0 and 100, with a higher score indicating greater pain-related disability.36

The SDQ

The SDQ is a pain-related disability questionnaire that consists of 16 items. All items refer to pain-related disability in the preceding 24 hours. Response options are “yes,” “no,” or “not applicable.” The option “not applicable” indicates that the situation at issue has not occurred in the past 24 hours. The SDQ score ranges from 0 to 100, with a higher score indicating more severe disability.19,50 The SDQ was originally designed and validated in Dutch.12,48 The SDQ shows acceptable content and divergent and construct validity,12 and is a responsive instrument.34,48,49

The EQ-5D-3L

The EQ-5D-3L is a quality-of-life questionnaire covering 5 dimensions of health: mobility, self-care, usual activities, pain/discomfort, and anxiety/depression.16,51 Each dimension has 3 levels (response categories): no problems, some problems, extreme problems. Besides these 5 items, perceived health state is measured using a scale from 0 to 100, with higher scores indicating better health status. The EQ-5D-3L has been used frequently, most often as part of cost-effectiveness studies.22,43,51 The Dutch EQ-5D-3L is an official version and has been validated.27

Test-Retest Measurement

A randomly selected group of patients received a second SPADI-D after 1 week. The time interval was chosen to minimize recall bias as well as progression bias and is often considered appropriate.42 A sample size of approximately 80 is considered acceptable.31 The data collected from this test-retest measurement were used in a previously published study as well, in order to assess the reliability.45

Follow-up Measurements

All patients received the SPADI-D, SDQ, and GPE scale 26 weeks after initial presentation. Within this period, the patient received physical therapy treatment for 1 or more sessions.

GPE Scale

The GPE scale is a 7-point Likert scale scoring the degree to which the patient's condition has improved or deteriorated since the start of physical therapy treatment (“Could you please state the amount of change concerning your recovery compared to when you first started treatment?”). The GPE scale ranges from “worse than ever” to “completely recovered” (completely recovered, much improved, slightly improved, no change, slightly worse, much worse, and worse than ever). The GPE scale has good test-retest reliability and correlates well with changes in pain and disability.24 Despite controversy about the role of global rating items, the GPE scale has frequently been used as an anchor in responsiveness studies.7,23,28,39,52

All forms were available online, using LimeSurvey software (https://www.limesurvey.org/).

Analysis

All statistical analyses were performed with SPSS Version 23 (IBM Corporation, Armonk, NY). Regarding missing items, as described by the original authors,12,36 patients were excluded from the analysis if more than 2 items were missing from a SPADI subscale36 or from the SDQ.12 The total score for the included patients was calculated by adding up the item scores and dividing them only by the items that were deemed applicable to the subject.12,36
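The missing-item rule and the total-score calculation described above can be illustrated with a minimal sketch. It is not the authors' SPSS syntax: the function name `spadi_total`, the use of `None` for a missing or not-applicable item, and the rescaling to 0–100 by dividing by the maximum attainable on the answered items are assumptions made for illustration.

```python
from typing import Optional, Sequence

MAX_ITEM_SCORE = 10           # each item is scored from 0 to 10
MAX_MISSING_PER_SUBSCALE = 2  # exclusion rule: more than 2 missing items


def spadi_total(pain: Sequence[Optional[float]],
                disability: Sequence[Optional[float]]) -> Optional[float]:
    """Return a 0-100 total score, or None if the questionnaire is excluded.

    None marks a missing or not-applicable item (an assumption of this sketch).
    """
    for subscale in (pain, disability):
        if sum(item is None for item in subscale) > MAX_MISSING_PER_SUBSCALE:
            return None  # too many missing items in this subscale
    answered = [item for item in list(pain) + list(disability)
                if item is not None]
    if not answered:
        return None
    # Divide by the maximum attainable on the answered items only.
    return 100.0 * sum(answered) / (MAX_ITEM_SCORE * len(answered))
```

For example, a patient who scores 5 on all 13 items receives a total of 50, whereas a patient with 3 unanswered pain items is excluded.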

All data were checked for normality using a stem-and-leaf plot, Q-Q plot, and box-and-whisker plot. Nonparametric tests were used if data were not normally distributed. Descriptive statistics were used to calculate frequencies.

Interpretability

The distribution of scores in the patient population, floor and ceiling effects, and interpretation of change scores are part of interpretability. Frequencies were presented as mean and SD for data that were normally distributed and as median and interquartile range for data that were not normally distributed.

If at baseline or at the 26-week follow-up more than 15% of the respondents achieved the highest or lowest possible scores, then we concluded that there were signs of floor or ceiling effects.11
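The 15% criterion can be expressed as a short check. The sketch below is illustrative only; the function name `floor_ceiling` and the returned dictionary keys are hypothetical, and the scale limits of 0 and 100 follow the SPADI-D score range described earlier.

```python
def floor_ceiling(scores, minimum=0.0, maximum=100.0, threshold=0.15):
    """Share of respondents at the scale extremes and the resulting flags."""
    n = len(scores)
    floor_share = sum(s == minimum for s in scores) / n
    ceiling_share = sum(s == maximum for s in scores) / n
    return {
        "floor_share": floor_share,
        "ceiling_share": ceiling_share,
        # An effect is flagged only when MORE than 15% sit at an extreme.
        "floor_effect": floor_share > threshold,
        "ceiling_effect": ceiling_share > threshold,
    }
```

With this rule, a sample in which 32 of 237 respondents (13.5%) score zero would not be flagged as showing a floor effect.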

We calculated the amount of change between the baseline score and the SPADI-D score after 26 weeks, using the mean change and the SD per category of the GPE scale and of all anchors. This provides information on how a change score on the SPADI-D corresponds to the magnitude of change, as perceived by patients.

The interpretation of change scores included calculating the MIC, which is the smallest change in score in the construct to be measured that patients perceive to be important. We used the receiver operating characteristic (ROC) method, with the SPADI-D as the diagnostic test and the anchor (GPE scale) as the gold standard for calculating the MIC. The anchor distinguishes patients who are considered to be “recovered” from those considered to be “not importantly changed.” The instrument's sensitivity is the proportion of recovered patients according to the anchor who are correctly identified as such by the SPADI-D. Specificity is the proportion of patients with “no important change” who are correctly identified as such by the SPADI-D. The MIC is defined as the optimal ROC cutoff point, which is the point on the ROC curve nearest to the upper left-hand corner.11

On the GPE scale, a frequently used anchor, we considered patients to be recovered when they answered that they were “completely recovered” or “much improved,” and to be not importantly improved when they answered “slightly improved,” “no change,” or “slightly worse.”17,28,52
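The ROC-based cutoff described above (the point nearest to the upper left-hand corner) can be sketched as follows. This is a minimal illustration, not the SPSS procedure used in the study. As assumptions of the sketch, improvement is expressed as baseline minus follow-up (so positive values mean improvement), `recovered` holds the dichotomized anchor, and the name `roc_mic` is hypothetical.

```python
def roc_mic(improvement, recovered):
    """Return (cutoff, sensitivity, specificity) for the ROC point
    closest to the upper left-hand corner.

    improvement: baseline minus follow-up score per patient (assumed sign).
    recovered:   True if the anchor classifies the patient as recovered.
    """
    n_pos = sum(1 for r in recovered if r)
    n_neg = len(recovered) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both anchor groups must contain patients")
    best = None
    for cutoff in sorted(set(improvement)):
        # Patients at or above the cutoff are classified as improved.
        tp = sum(1 for x, r in zip(improvement, recovered) if r and x >= cutoff)
        tn = sum(1 for x, r in zip(improvement, recovered) if not r and x < cutoff)
        sens = tp / n_pos
        spec = tn / n_neg
        # Squared distance to the corner (sensitivity = 1, specificity = 1).
        dist = (1 - sens) ** 2 + (1 - spec) ** 2
        if best is None or dist < best[0]:
            best = (dist, cutoff, sens, spec)
    return best[1], best[2], best[3]
```

When two cutoffs are equally close to the corner, this sketch keeps the lower one; that tie-breaking rule is a design choice of the example, not something stated in the study.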

We also created a visual anchor-based MIC distribution, which shows how well an instrument is able to distinguish between patients who are importantly improved and those who are not importantly changed.9 The MIC can be influenced by the baseline score of patients (low or high); a percentage of the baseline score is more stable.8 Therefore, we performed a subgroup analysis to assess the difference in MIC values with high and low baseline SPADI values (mean split).

As some researchers categorized patients who were “slightly improved” as importantly changed, making the MIC lower by definition, we included the MIC based on this anchor in the analysis.17,41

Measurement Error

Measurement error can be adequately expressed as the standard error of measurement (SEM). For this analysis, we used the test-retest data set. The group of patients has been described in an earlier published study,45 and we therefore were aware of 2 extreme values in the test-retest data.45 We excluded these 2 extreme values to calculate the measurement error, but presented the results based on data including these extreme values as well, to assess their influence.

We used the test-retest data to test whether there were systematic errors, using an analysis of variance. When there were no systematic errors, the intraclass correlation coefficient (ICC) was used to calculate the SEM, and in all other cases the ICC agreement was used. The SEM was calculated as SD × √(1 − ICC),44 and the smallest detectable change (SDC) was calculated as 1.96 × √2 × SEM44 to assess the change beyond measurement error. We presented a Bland-Altman plot to visually illustrate systematic errors. Ideally, the MIC should be higher than the SDC.10
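As a worked illustration of these two quantities, the sketch below assumes the conventional formula SEM = SD × √(1 − ICC) together with the SDC formula given above. The function names are hypothetical, and the SD and ICC values in the usage note are made up for the example.

```python
import math


def sem_from_icc(sd: float, icc: float) -> float:
    """Standard error of measurement from the SD and the ICC
    (conventional formula, assumed here)."""
    if not 0.0 <= icc <= 1.0:
        raise ValueError("ICC must lie between 0 and 1")
    return sd * math.sqrt(1.0 - icc)


def sdc_individual(sem: float, z: float = 1.96) -> float:
    """Smallest detectable change for an individual: z * sqrt(2) * SEM."""
    return z * math.sqrt(2.0) * sem
```

For instance, an illustrative SD of 10 with an ICC of 0.75 gives an SEM of 5.0, and an SEM of 7.1 points corresponds to an SDC of about 19.7 points.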

Responsiveness

Responsiveness was assessed using the area under the ROC curve (AUC) and hypothesis testing. As the GPE scale has a high level of face validity and is considered to be a suitable criterion to measure change, we were able to use the AUC method.11 However, doubt has been expressed about the reliability and validity of such measures of change,33 and we therefore chose to test specific hypotheses as well.

We calculated the AUC to assess the ability of the SPADI-D to discriminate between patients who are considered improved and not importantly changed according to the GPE scale, using an anchor similar to that described in the Interpretability section.11 A benchmark that has been previously used to establish that outcome measures are useful in discriminating improved and unimproved patients has been set at an AUC of 0.70.44
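The AUC can be read as the probability that a randomly chosen improved patient shows a larger improvement than a randomly chosen patient who is not importantly changed. The sketch below computes it from all pairwise comparisons, which is equivalent to the area under the empirical ROC curve. The function name `auc` and the sign convention (larger values mean more improvement) are assumptions of this example.

```python
def auc(improved_scores, unchanged_scores):
    """Area under the ROC curve from pairwise comparisons.

    Each pair (improved patient, unchanged patient) counts 1 if the improved
    patient has the larger improvement score and 0.5 if the scores are tied.
    """
    wins = 0.0
    for a in improved_scores:
        for b in unchanged_scores:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(improved_scores) * len(unchanged_scores))
```

An AUC of 0.5 indicates no discrimination, and the value can then be compared against the 0.70 benchmark mentioned above.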

Hypothesis testing for responsiveness was based on the concept that the correlation between the change scores of related constructs (GPE scale and SDQ) must be higher than with unrelated constructs (the depression and mobility items of the EQ-5D-3L). Hypothesis testing was quantified by the Pearson correlation coefficient for normal data distribution and by a Spearman correlation coefficient for all other distributions. Correlation coefficients between the SPADI-D change score and the change scores of the SDQ and the GPE scale were expected to be above 0.50, and the correlations with the EQ-5D-3L mobility and depression items were expected to be lower than 0.20.11
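The two correlation coefficients used for hypothesis testing can be sketched in plain Python. This is an illustration under stated assumptions rather than the SPSS routines used in the study: Spearman's coefficient is computed as the Pearson correlation of average ranks (ties share the mean rank), and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

A hypothesis about related constructs would then be considered confirmed when the coefficient exceeds 0.50, and one about unrelated constructs when it stays below 0.20.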

Results

A total of 356 patients participated at baseline, 114 of whom did not return the SPADI-D follow-up assessment at 26 weeks. In total, 242 patients returned the SPADI-D, of whom 5 were excluded due to the missing-item criterion, resulting in 237 patients included in the analysis (66.6% of the baseline population). Some (22%) of the patients who were included in the test-retest measurement were not included in the responsiveness cohort, as they did not return the SPADI at 26 weeks or had missing items. The mean ± SD age of the total baseline population was 49.5 ± 13.1 years, and 47% were men.

The physical therapists used a variety of shoulder diagnoses to label the patients; however, the majority of patients were labeled as having subacromial impingement. The physical therapists also used a variety of treatment techniques, mainly including advice, exercise, and mobilization/manipulation of the shoulder or thoracic spine. After 12 weeks, the majority of patients (59.5%) stopped therapy. Overall, the median number of treatment sessions was 6. The characteristics of the participants are presented in TABLE 1.

TABLE 1. Baseline Characteristics of the Participants, Per Analysis

| Characteristic | Total Cohort (n = 356) | Follow-up Cohort (n = 237) | Test-Retest Cohort, Complete (n = 74) | Test-Retest Cohort, Without Extreme Values (n = 72) |
| --- | --- | --- | --- | --- |
| Sex (male), n (%) | 166 (47) | 109 (46) | 29 (39) | 29 (40) |
| Age, y* | 49.5 ± 13.1 | 50.0 ± 12.9 | 51.4 ± 12.7 | 51.5 ± 12.9 |
| SPADI-D score* | 46.7 ± 21.3 | 47.0 ± 21.5 | 50.8 ± 22.6 | 50.2 ± 22.6 |
| Use of medication, n (%) | 171 (48) | 117 (49) | 37 (50) | 37 (51) |
| Pain intensity (NRS) | 6 (4–7) | 6 (4–7) | 6 (4–8) | 6 (4–8) |

Number of treatment sessions for patients who stopped therapy within 12 wk: 6 (4–9)

Abbreviations: NRS, numeric rating scale; SPADI-D, Dutch version of the Shoulder Pain and Disability Index.

*Values are mean ± SD.

Values are median (interquartile range).

A total of 141 patients (59.5%) stopped therapy sessions after 12 weeks.

The data of the SPADI-D at baseline and the change scores of both the SPADI-D and SDQ were considered to be normally distributed, in contrast to those of the EQ-5D-3L.

Interpretability

The mean score of the SPADI-D at baseline of the total population with shoulder pain was 46.7 ± 21.3, and at 26 weeks was 23.9 ± 24.2 points.

At baseline, only 1 patient had a SPADI-D score of zero, and none of the patients showed a score of 100; the highest score was 92 (0.3% of all patients). About 8.1% (n = 29) of the patients scored in the lower part of the range of the scale (a score between 0 and 15), and only 2.2% (n = 8) of the patients scored in the upper part of the range of the scale (between 85 and 100). After 6 months, 13.5% (n = 32) of the patients had a score of zero and none (0%) of the patients had a score of 100; the highest score was 89 (0.4%). We therefore concluded that there were no signs of floor and ceiling effects.

TABLE 2 shows the mean change per category on the GPE scale. A total of 139 patients were considered recovered (mean ± SD change score between baseline and 26 weeks of −33.4 ± 19.5), and 95 were not considered importantly changed (a change score between baseline and 26 weeks of −8.9 ± 21.4). The MIC was 20 points, resulting in a change of 42.8% of the baseline score. The sensitivity and specificity were both 0.75. Subgroup analysis resulted in similar results: the MIC for patients with a high baseline score was 43.0% (27.9 points), with a sensitivity of 0.82 and specificity of 0.77, and for patients with a low baseline score was 42.7% (12.2 points), with a sensitivity of 0.81 and specificity of 0.82.

TABLE 2. Mean Change Per Category on the GPE Scale*

| GPE Scale | Total: Patients, n (n = 237) | Total: SPADI-D Change (Baseline - 26 wk) | High Baseline Score: Patients, n (n = 120) | High Baseline Score: SPADI-D Change (Baseline - 26 wk) | Low Baseline Score: Patients, n (n = 117) | Low Baseline Score: SPADI-D Change (Baseline - 26 wk) |
| --- | --- | --- | --- | --- | --- | --- |
| 1. Completely recovered | 43 | −36.5 ± 22.1 | 14 | −61.4 ± 13.8 | 29 | −24.4 ± 13.4 |
| 2. Much improved | 96 | −32.0 ± 18.1 | 48 | −42.2 ± 16.8 | 48 | −21.7 ± 12.8 |
| 3. Slightly improved | 61 | −12.5 ± 21.5 | 36 | −21.2 ± 20.0 | 25 | −0.04 ± 17.4 |
| 4. No change | 28 | −2.0 ± 16.5 | 17 | −7.7 ± 13.8 | 11 | 6.8 ± 17.2 |
| 5. Slightly worse | 6 | −4.4 ± 34.0 | 3 | −25.5 ± 37.3 | 3 | 16.8 ± 12.6 |
| 6. Much worse | 3 | 7.1 ± 11.7 | 2 | 4.0 ± 14.7 | 1 | 13.11 |
| 7. Worse than ever | 0 | | 0 | | 0 | |

Abbreviations: GPE, Global Perceived Effect; SPADI-D, Dutch version of the Shoulder Pain and Disability Index.

*Values are mean ± SD unless otherwise indicated.

The visual anchor-based MIC distribution is presented in FIGURE 1. It shows that the SPADI-D is capable of discriminating between patients who are importantly improved versus those who are not importantly changed.

FIGURE 1. Visual anchor-based MIC distribution. Distribution of change scores on the SPADI-D of patients who reported an important improvement (n = 139) compared with those who reported no important change (n = 95) on the first anchor (Global Perceived Effect scale). The left quadrant above the line represents the misclassified patients who felt importantly improved but were not classified as such by their SPADI-D change score (23.7%). In the lower right quadrant, beneath the green line are the patients who were misclassified, as they considered themselves as not importantly improved but, according to their SPADI-D change score, were classified as importantly improved (25.3%). Abbreviations: MIC, minimal important change; SPADI-D, Dutch version of the Shoulder Pain and Disability Index.

For the alternative anchor, on which “slightly improved” was considered to be importantly improved, the MIC was 16 points.

Measurement Error

The 2 patients with extreme values showed a change score between baseline and retest (7 days or less after baseline) of −31 and −30 points, respectively. These patients were no longer under physical therapy treatment after 3 weeks and felt completely recovered after 6 weeks. The analysis of variance revealed that there were systematic errors. With the outliers included, the mean ± SD difference was −4.1 ± 10.7 between baseline and retest (50.8 versus 46.7). After exclusion of the 2 extreme values, the mean ± SD difference was −3.4 ± 9.9 (50.2 versus 46.8). FIGURE 2 shows the Bland-Altman plot illustrating the systematic bias. The SEM was 7.1 and the SDC was 19.7.

FIGURE 2. Bland-Altman plot, illustrating the mean difference between 2 measurements and the limits of agreement. Abbreviation: SPADI-D, Dutch version of the Shoulder Pain and Disability Index.

Responsiveness

The AUC was 0.81, with a 95% confidence interval ranging from 0.75 to 0.87. FIGURE 3 shows the ROC curve.

FIGURE 3. Receiver operating characteristic curve, based on anchor 1, resulting in an area under the curve of 0.81.

Hypothesis testing for responsiveness resulted in a Spearman correlation between the SPADI-D change score and the GPE scale of 0.53. The Pearson correlation between the SPADI-D change score and the SDQ change score was 0.71. The Spearman correlation between the change score of the SPADI-D and the EQ-5D-3L depression item was 0.06 and the EQ-5D-3L mobility item was 0.12. Based on the AUC values and with all hypotheses confirmed, we consider the SPADI-D to be a responsive measurement instrument.

Discussion

This study shows that the SPADI-D is responsive, making it a useful evaluative instrument to assess functional disability in longitudinal studies in patients with shoulder pain visiting a physical therapist. The SPADI-D can detect important changes. A change larger than 43% from the baseline score is considered to be a clinically relevant and important change. However, the measurement error should be taken into account when used for decision making in individual patients.

Comparison to the Literature

Interpretability

Our study showed no signs of floor and ceiling effects, similar to earlier research.6,14

The MIC in our study was 20. One other study reported an MIC of 20.3 based on the ROC method, also using a similar GPE scale (an 18-point Likert scale) as an anchor, with a similar choice in dividing patients as “recovered” and “not importantly changed.”14 That study population consisted of patients with rotator cuff disorders who were referred by their general practitioner to the physical medicine and rehabilitation department of a hospital. The patients in our study were comparable in age, sex, and work absence to those of the previous study; however, the baseline SPADI score in the previous study was approximately 5 points higher than that in our study population.14 One study used a study population with upper extremity disorders, and calculated the MIC using mean change scores for patients with small but meaningful global change on a global disability rating scale they developed, resulting in an MIC of 13 points.38 However, none of the above studies assessed whether the MIC varied between high or low baseline scores.

Measurement Error

Only a small number of studies assessed the measurement error of the SPADI.1,6,14,47 One study reported an SEM of 7.0 (95% confidence interval: 6.0, 8.5) and an SDC of 19.4.6 The sample of that study showed a higher level of pain-related disability, as the SPADI baseline score was approximately 7 points higher than that of our study.6 Another study reported an SDC of 19.7,14 and 1 study reported a smallest detectable difference of 17 points.47 A study using a different study population (patients who had undergone total shoulder arthroplasty or hemi-arthroplasty) reported an SDC of 18 points.1 All these SDC values are comparable to those of our study, when the results of the analysis that excluded outliers were used. We feel that the most appropriate analysis is the one that excluded the outliers, resulting in an SDC of 19.7. However, the analysis that included the outliers resulted in an SDC of 22.5. The MIC was higher than the SDC when the outliers were excluded.

Responsiveness

The AUC in our study (0.81) was comparable with that of other studies (range, 0.80–0.92), despite using different GPE scales (5-point and an 18-point Likert scale).14,34,40

The Spearman correlation with the GPE scale found in our study was comparable with that of a previous study.34 No other studies used the SDQ change score as a comparator, although the construct of this questionnaire is comparable with the SPADI. One study used correlations between the SPADI and other pain-related disability questionnaires (Croft index; Disabilities of the Arm, Shoulder and Hand Questionnaire [DASH]; Problem Elicitation Technique [PET], and Health Assessment Questionnaire [HAQ]) and perceived improvement, and they were all above 0.49, except for the HAQ.40 Range of motion was also used as a comparator19,36,47; however, we feel that this measures a different construct and is therefore not appropriate.

Strengths and Limitations

This study has some limitations. We did not use the GPE scale to check whether patients were indeed stable within 7 days between the test and retest, which could have influenced the measurement error. However, the 7-day time frame we used is commonly accepted.42 Moreover, the median duration of shoulder pain at the start of inclusion was 16 weeks in our study population. Physical therapists usually treat patients with shoulder pain for a mean ± SD of 11 ± 11.3 weeks.26 It is therefore unlikely that patients would recover within 1 week. We checked data for patients with extreme change scores, as there is always the chance that a patient's condition will improve or worsen within this time frame. There was a systematic error, with a mean difference of −3.4 points between test and retest, suggesting a very small improvement. The 2 patients with extreme values were no longer under treatment after 3 weeks, and it is therefore likely that these patients were an exception and had indeed changed substantially. We reported the results for both the population with extreme values and without extreme values, so clinicians can take this into consideration.

One of the strengths of this study is that our population consisted of patients visiting a physical therapist. The SPADI is frequently used by physical therapists and pain/activity limitations are important outcome measures, thus it is important to assess its measurement properties in this study population. Moreover, this study had a relatively large sample size. Another strength of this study is that we assessed whether the MIC would vary over different parts of the complete range of SPADI scores (eg, high versus low baseline SPADI scores). This is important for clinical as well as research purposes, as it reflects that when symptoms are severe they can change more dramatically (in absolute terms) and be of greater importance to patients than when patients have a lower baseline score.

Implications for Clinical Practice

Patients with a change score of 43% or more of their baseline SPADI-D score considered themselves to be importantly improved; therefore, a change score of 43% in individual patients may be regarded as clinically relevant. A change score of less than 20 points could be due to measurement error. An example for clinicians: if a patient had a baseline SPADI-D score of 50 and a SPADI-D score of 20 at follow-up, one could consider this to be real change, as it is greater than the measurement error and clinically relevant, the change score being greater than the MIC (43%). However, when a patient has a baseline of 35 points and scores 20 points at follow-up, this could be considered clinically relevant, as it is a change of 43%, but this change could still be a measurement error. A change score of 15 points is beneath 19.7 and could be due to measurement error. Clinicians have to take the measurement error into account when they use the SPADI-D for evaluative purposes in individual patients.
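The clinical examples above can be reproduced with a small decision sketch. It is illustrative only: the function name and dictionary keys are hypothetical, the MIC is entered as 42.8% of the baseline score (the unrounded value reported in the Results, quoted as 43% elsewhere), and treating a change as beyond measurement error only when it exceeds the SDC of 19.7 points is an assumption of the example.

```python
SDC_POINTS = 19.7     # smallest detectable change reported in this study
MIC_FRACTION = 0.428  # MIC as a fraction of baseline (42.8%, quoted as 43%)


def interpret_change(baseline: float, follow_up: float) -> dict:
    """Classify an individual SPADI-D change against the SDC and the MIC."""
    improvement = baseline - follow_up  # positive values mean improvement
    return {
        "improvement": improvement,
        # Larger than what measurement error alone could explain.
        "beyond_measurement_error": improvement > SDC_POINTS,
        # At least the minimal important change relative to baseline.
        "clinically_relevant": baseline > 0
        and improvement / baseline >= MIC_FRACTION,
    }
```

Under these assumptions, a change from 50 to 20 points is both clinically relevant and beyond measurement error, whereas a change from 35 to 20 points is clinically relevant but still within measurement error.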

Conclusion

The present study found the SPADI-D to be a responsive instrument for assessing patients who seek physical therapy care for shoulder pain and functional disability. The SPADI-D was able to detect changes larger than 43% of the baseline score, which is considered to be clinically relevant and important change. However, when making decisions based on SPADI-D scores in individual patients, measurement error must be taken into account.

Key points

Findings

This study indicates that the SPADI-D is responsive to change over time, but that measurement error should be taken into account when the instrument is used in clinical practice.

Implications

The SPADI-D is a useful questionnaire in physical therapy practice.

Caution

The interpretability, responsiveness, and measurement error should also be assessed in other study populations (eg, postoperative patients), as these could influence the results.

References
