Research Article
International Journal of MS Care
Psychometric assessments are tests or questionnaires that have been designed to measure constructs of interest in an individual or a target population. A goal of many of these self-report instruments is to provide researchers with the ability to gather subjective information in a manner that might allow for quantitative analysis and interpretation of these results. This requires the instrument of choice to have adequate psychometric properties of reliability and validity. Much research has been conducted on creating self-report quality of life questionnaires for individuals with multiple sclerosis (MS). This article focuses on one in particular, the Modified Fatigue Impact Scale (MFIS). The article starts with a brief description of the rationale, construction, and scoring of the inventory. Next, the best available reliability and validity data on the MFIS are presented. The article concludes with a brief discussion on the interpretation of scores, followed by suggestions for future research. This summative analysis is intended to examine whether the instrument is adequately measuring the impact of fatigue and whether the scores allow for meaningful interpretations.
Multiple sclerosis (MS) is a chronic, progressive, and degenerative disease of the central nervous system characterized by demyelination1 and axonal deterioration.2 Symptoms of MS include abnormal gait, deficient balance, muscle weakness, spasticity, and fatigue, all of which can reduce physical function and profoundly affect health and quality of life.1 Fatigue is one of the most common symptoms, affecting more than 75% of people with MS.3 4 Fatigue has been defined as “a subjective lack of physical or mental energy that is perceived by the individual or caregiver to interfere with activities of daily living.”5 Fatigue can be considered primary or secondary. Primary fatigue refers to factors directly associated with the disease process, whereas secondary fatigue relates to the consequences of primary fatigue and may result from lack of conditioning, depression, and medication side effects.6 Despite the known prevalence of fatigue in MS and its negative effects on quality of life, the pathophysiology of MS-related fatigue remains unclear. Additionally, because fatigue is subjective, measuring it and its impact on quality of life has proved difficult.
According to the Multiple Sclerosis Council for Clinical Practice and Guidelines, a subjective (self-reported) measure of fatigue should be based on the individual's assessment of fatigue and its impact on quality of life,7 which requires the measure to have adequate psychometric properties including reliability and validity. After performing a review of the literature, the Council recommended using the Modified Fatigue Impact Scale (MFIS).7 Although the Council called for further psychometric evaluation, researchers have been repeatedly using the MFIS in the absence of such a comprehensive evaluation of the instrument.
The MFIS is a modified version of the 40-item Fatigue Impact Scale (FIS), which was originally developed to assess the effects of fatigue on quality of life in patients with chronic diseases, specifically MS.3 The FIS has patients rate the extent to which fatigue has affected their life in the past 4 weeks on a questionnaire consisting of 10 “physical” items, 10 “cognitive” items, and 20 “social” items, with 0 indicating “no problem” and 4 indicating “extreme problem.” The maximum possible score is 160. The MFIS evolved from the FIS during the development of a clinical inventory assessing overall quality of life in individuals with MS, the Multiple Sclerosis Quality of Life Inventory (MSQLI). During the phase 2 field testing of the MSQLI, the 40-item FIS was abbreviated into the 21-item MFIS by “eliminating items which appeared both content-redundant and had high inter-item correlations,”8 but the exact procedures and data have not been published. The MFIS contains 9 “physical” items, 10 “cognitive” items, and 2 “psychosocial” items. The maximum possible score is 84, with higher scores indicating a greater impact on quality of life (Appendix 1). The original intention was to use the total score to reflect a global (unidimensional) score.3 9 The authors of the FIS and those involved in the modification of the FIS into the MFIS have not published evidence verifying the underlying structure of the instrument or the rationale for the three subscales and the items selected to create the subscales.
The underlying structure of the MFIS has been examined through principal components analysis with varimax rotation using data from 181 MS patients from four different European countries: Belgium (n = 51), Italy (n = 50), Slovenia (n = 50), and Spain (n = 30).10 All 21 items met the required factor loading of 0.500 on at least one factor. However, the first item, “I feel less alert” (a cognitive item), had a factor loading of 0.495, just below the threshold, and loaded on the psychosocial factor at 0.599. The eighth item, “I am less motivated to participate in social activities” (a psychosocial item), had a factor loading of 0.528 but also loaded on the physical factor at 0.499. The ninth item, “I am less motivated to do things away from home,” from the psychosocial subscale, was actually assigned to the physical factor. These results do not agree with the original subscale structure, leading Kos and colleagues to recommend caution when interpreting the psychosocial subscale.10 This lack of agreement could confound interpretations of the MFIS as an outcome measure.
The total score of the MFIS ranges from 0 to 84. The ranges of scores for each subscale are as follows: physical, 0 to 36; cognitive, 0 to 40; and psychosocial, 0 to 8. No data have so far been published regarding population norms for the MFIS and its subscales. Some studies use a total score of 38 as a cutoff to discriminate fatigued from nonfatigued individuals.10 11 The score of 38 was based on a study that correlated the MFIS with another fatigue inventory and its defined scores for fatigued and nonfatigued.12 Verification of these results has been less than adequate, with researchers simply citing the study of Flachenecker et al.12 as a rationale for using a cutoff score of 38 to discriminate fatigued from nonfatigued individuals with MS. The lack of population norms and a possibly inappropriate cutoff score raise some questions about the instrument and the ability to interpret scores.
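The scoring arithmetic described above can be illustrated with a minimal sketch. The item-to-subscale assignment shown here is an assumption based on the commonly distributed 21-item form (this article confirms only that item 1 is cognitive, items 4, 14, and 17 are physical, and items 8 and 9 are psychosocial), and treating a total of exactly 38 as "fatigued" is likewise an assumption of the sketch rather than a validated rule.

```python
# Hypothetical item-to-subscale map (1-based item numbers). This is an
# assumption for illustration; verify against the form actually administered.
SUBSCALES = {
    "physical": (4, 6, 7, 10, 13, 14, 17, 20, 21),        # 9 items, range 0-36
    "cognitive": (1, 2, 3, 5, 11, 12, 15, 16, 18, 19),    # 10 items, range 0-40
    "psychosocial": (8, 9),                               # 2 items, range 0-8
}
FATIGUE_CUTOFF = 38  # cutoff used in some studies; not a population norm


def score_mfis(responses):
    """Return subscale sums and the total for 21 responses rated 0-4."""
    if len(responses) != 21:
        raise ValueError("expected 21 item responses")
    if any(r not in (0, 1, 2, 3, 4) for r in responses):
        raise ValueError("each response must be an integer from 0 to 4")
    scores = {
        name: sum(responses[i - 1] for i in items)
        for name, items in SUBSCALES.items()
    }
    scores["total"] = sum(responses)
    return scores


def is_fatigued(total, cutoff=FATIGUE_CUTOFF):
    """Apply the total-score cutoff; using ">=" is an assumption of this sketch."""
    return total >= cutoff
```

A respondent who marks every item at the maximum would receive 36, 40, and 8 on the three subscales and 84 overall, which matches the ranges given above.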
The reliability of the MFIS has been evaluated in a small number of articles10 13 and in the MSQLI technical inventory, which includes the MFIS. When considering reliability, two questions come to mind: 1) Is the MFIS reliable (internal consistency)? 2) Is the impact of fatigue experienced by individuals with MS stable and reproducible with repeated measurements? The MSQLI technical inventory provides some of the better reliability (internal consistency) evidence. The phase 2 field testing of the MSQLI instrument studied a sample of 300 individuals with MS selected from 4 MS clinics in Canada. The sampling of subjects was focused on gender and a commonly used measure of disability, the Expanded Disability Status Scale (EDSS), keeping it similar to those of previous epidemiological studies.8 The reported internal consistency of all the MFIS scores was “excellent,” with the following Cronbach α values: total, 0.81; cognitive, 0.95; physical, 0.91; and psychosocial, 0.81.8 The technical guide suggests that the MFIS can be used as a comprehensive (total score) and multidimensional (separate subscales) assessment of the impact of fatigue. The internal consistency reported by Kos et al.10 in 2005 showed similar results for the total score and two of the subscales, physical and cognitive (Cronbach α values of 0.92, 0.88, and 0.92, respectively). However, the Cronbach α for the psychosocial subscale was 0.65.10 The lack of agreement with the psychosocial subscale could be explained by the fact that the studies involved individuals with different cultural backgrounds, which could also influence the other subscales. Lack of agreement does not invalidate the subscale but does suggest further exploration of cultural differences as a confounding variable.
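For readers less familiar with the statistic, Cronbach α for k items is k/(k − 1) multiplied by one minus the ratio of the summed item variances to the variance of the total score. The following is a minimal sketch of that calculation; the function name and the tiny data sets are hypothetical and are not drawn from the studies cited here.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    if len(rows) < 2:
        raise ValueError("need at least two respondents")
    k = len(rows[0])
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's summed score
    total_var = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical data: two items that move together perfectly
consistent = [[0, 0], [1, 1], [2, 2]]
# Hypothetical data: two items that are uncorrelated
unrelated = [[1, 0], [0, 1], [1, 1], [0, 0]]
```

Note that α depends on the number of items as well as on their intercorrelations, so a two-item subscale such as the psychosocial subscale can yield a lower α than longer subscales even when its items are reasonably related.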
The second question regarding reliability relates to whether the impact of fatigue experienced by the individuals varies from test to test. The MFIS has been used as an outcome measure in several clinical trials,14 15 which assumes that the impact of fatigue persists and is the same within an individual and across repeated measures. A few studies have been conducted to investigate the test-retest reproducibility of the MFIS.10 13 In 2003, Kos et al.13 conducted a study in which participants (n = 51) completed the MFIS on two separate occasions at the same time of day. The subscales, along with the total score, were considered stable over the 3 days using a Wilcoxon signed rank test to assess differences from day to day. Similar results were observed in a subsequent 2005 study10 in a much larger sample (n = 181). Once again, the MFIS was administered on two occasions 3 days apart, with no significant difference found in scores (intraclass correlation coefficient [ICC] of the subscales ranged from 0.84 to 0.91 and of the total was 0.91). However, a limitation of this study is the lack of explanation in the methodology regarding administration of the MFIS and whether confounding variables such as sleep or caffeine intake were accounted for. These data suggest that the MFIS has adequate test-retest reliability; however, because confounding variables may have affected the test results, this reliability should be verified in a study in which these variables are controlled. Another factor that could affect reproducibility is that fatigue itself may not be stable in a person with MS.
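To make the ICC concrete, the sketch below computes a one-way random-effects ICC for two administrations per participant. The cited studies do not state here which ICC form was used, so the one-way form is an assumption chosen for simplicity, and the function name and example pairs are hypothetical.

```python
def icc_oneway(pairs):
    """One-way random-effects ICC(1,1) for (test, retest) score pairs."""
    n = len(pairs)
    k = 2  # two administrations per participant
    if n < 2:
        raise ValueError("need at least two participants")
    subject_means = [(a + b) / k for a, b in pairs]
    grand_mean = sum(subject_means) / n
    # Between-subject mean square
    msb = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    # Within-subject mean square
    msw = sum(
        (a - m) ** 2 + (b - m) ** 2
        for (a, b), m in zip(pairs, subject_means)
    ) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


# Hypothetical total scores on day 1 and day 4 for three participants
example_pairs = [(30, 32), (45, 44), (60, 61)]
```

An ICC near 1 indicates that most of the score variance lies between participants rather than between occasions. It does not by itself show that confounders such as sleep or caffeine intake were held constant.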
An instrument can be reliable but not valid; this is akin to the important distinction between precision and accuracy. Validity refers to how well the instrument is measuring the construct of interest. In assessing whether an instrument is valid, one should consider the nomological network under which the instrument was constructed. The nomological network under which the MFIS was constructed has not been directly established or clarified. Therefore, we must first identify what factors should correlate highly with the MFIS, as well as those factors that should be unrelated or weakly correlated. If the instrument is assessing what it purports to assess, it should correlate highly with measures of similar constructs (convergent validity), and it should correlate weakly with measures of constructs that are theoretically unrelated (discriminant validity). One way to assess these aspects of validity is to administer the MFIS simultaneously with other related and unrelated psychometric instruments and assess the strength of their correlations.
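The correlational approach described above can be sketched as follows. The sketch uses the Pearson coefficient, which is an assumption because the studies reviewed here do not all specify the coefficient used, and every score in the example is invented purely for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical scores for five respondents (not data from any cited study)
mfis_total = [20, 35, 41, 52, 60]
related_fatigue_scale = [3.0, 4.1, 4.8, 5.5, 6.2]  # expected to converge
unrelated_measure = [4, 1, 5, 2, 4]                 # expected to diverge

convergent = pearson_r(mfis_total, related_fatigue_scale)
discriminant = pearson_r(mfis_total, unrelated_measure)
```

Under this logic, evidence for validity would consist of a high value for the first coefficient together with a value near zero for the second.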
Because the MFIS assesses the impact of fatigue, it should converge with other subjective measures of fatigue. The MFIS does have moderate to strong correlations with the Fatigue Severity Scale (FSS), a measure of fatigue in general use.11 12 Tellez et al.11 reported a correlation coefficient of 0.68 between the FSS score and the total MFIS score. The subscales also correlated with the FSS, with correlation coefficients of 0.75 for the physical subscale, 0.44 for the cognitive subscale, and 0.62 for the psychosocial subscale.11 Kos et al.13 reported a correlation of 0.66 between the FSS and the total MFIS score. With both instruments assessing fatigue, a much higher correlation might be expected; however, it should be considered that the MFIS assesses primarily the impact of fatigue, while the FSS also assesses severity and frequency. When the FSS is correlated with the separate subscales from the MFIS, the physical subscale shows a stronger correlation than the other subscales. This is not surprising given that the FSS questions on the impact of fatigue focus more on the physical aspects of daily living. Results from the phase 2 trial assessing the validity of the MSQLI indicated significant correlations of the 36-item Short Form Health Status Survey (SF-36) vitality scale (r = −0.59), the Sickness Impact Profile (SIP) Sleep and Rest Scale (r = 0.47), and the SIP Alertness Scale (r = 0.65) with the MFIS.8 Both the SF-36 and the SIP are widely used instruments. The correlation with the vitality scale helps establish validity, because it is logical that fatigue could be accompanied by decreased vitality. The correlations with the Sleep and Rest Scale and the Alertness Scale might suggest problems with the MFIS in that it cannot distinguish between sleepiness and alertness with respect to impact of fatigue.
Depression could be a symptom of fatigue, and fatigue could be a symptom of depression, leading to a long-standing debate about the causal direction between these variables. Tellez et al.11 found a clear relationship between depression and fatigue in individuals with MS, with a significant correlation (r = 0.70) between the MFIS total score and depression measured with the Beck Depression Inventory (BDI). Depression and fatigue are often associated with disease-specific variables such as disability and disease duration. When EDSS scores were controlled, the correlation between depression and fatigue remained positive.11 When these variables were analyzed using multiple linear regression, EDSS score, depression, and disease duration were all independent predictors of all the MFIS subscale and total scores.11 In addition, Greim et al.16 reported that depressed MS patients subjectively felt significantly more tired than nondepressed MS patients (mean [SD] = 49.8 [13.1] vs. mean [SD] = 32.6 [16.8]; P < .003), supporting the notion that a depressed mood can affect the subjective estimation of fatigue.
The validity of a subjective instrument can be strengthened when it is paired with an objective measure, provided that the measures are appropriate surrogates. The study conducted by Greim et al.16 assessed objective and subjective measures of mental and physical fatigue in individuals with and without MS. Participants completed subjective measures of fatigue, specifically the MFIS, before and after a 30-minute vigilance test (an objective test of mental fatigue) and a hand-dynamometer test (an objective test to assess physical fatigue). Depression was controlled during the study because of its possible influence on subjective and objective measures of fatigue. A significant correlation between the subjective feelings (MFIS) and the objective measures was observed in the individuals with MS. The individuals with higher levels of subjective fatigue performed worse on the objective measures. However, the depression scale (BDI) was significantly correlated with the measures of fatigue, confirming previous data indicating an interaction between fatigue and depression that can confound study results and subsequent interpretations.
Considerable research has been conducted to find ways to reduce the impact of fatigue in individuals with MS. Various pharmacologic agents are used to this end, and data from these studies can provide validity evidence for the MFIS. Modafinil (Provigil), a commonly prescribed medication used to reduce the symptoms of fatigue with the aim of reducing the impact of fatigue on quality of life, was tested during a 9-week single-blind study involving a sample of 72 individuals with MS.14 All participants received the medication, but the treatment sequence was blinded. The medication regimen included a placebo run-in period (weeks 1–2), 200 mg/day of modafinil (weeks 3–4), 400 mg/day of modafinil (weeks 5–6), and a placebo washout period (weeks 7–9). The MFIS was one of the main outcome variables for this study. The MFIS total score, along with the scores for the three subscales, was significantly reduced when individuals were administered 200 mg/day of modafinil.14 The mean total score fell from 44.7 to 37.7 (a 7-point reduction). However, when participants were administered the higher dosage (400 mg/day), the mean MFIS score (42.1) did not differ from that of the placebo run-in period. The mean MFIS score following the placebo washout period was roughly equal to that after the placebo run-in period (43.0 and 44.7, respectively).14 This does provide some evidence that the MFIS can detect a change in scores; however, the lack of a true control group and the failure to control for other variables such as sleep and depression may confound the outcomes. These results suggest that the MFIS is sensitive to change but that the interpretation of these results is limited to just that: a reduction in MFIS scores.
Little to no research has been published outlining what a change in MFIS scores (increasing or decreasing) reflects objectively. Currently the interpretation is limited to the direction of change: a reduction in scores, no significant change in scores, or an increase in scores. The terms “clinically meaningful” and “clinically relevant” have been used in a number of studies in an attempt to give meaning to the results and allow for some kind of interpretation of the scores.15 17 However, these studies do not provide sufficient evidence for their selection of what is deemed clinically meaningful. A study by Kos et al.17 considered a change in score of 10 or more to be clinically relevant, based on other studies that found a difference in MFIS scores of 7 to 20.1. Another article used a cut-point of 45 or more for the total MFIS score as a study entry criterion, without providing a clear rationale for the selection of that score.15 The lack of objective anchors for the MFIS total and subscale scores limits the interpretations of this instrument's results.
A more recent article found flaws in the underlying structure of the MFIS and used the Rasch measurement model to determine that the use of the total score is invalid.18 The Rasch model represents the interactions between subjects and the test items to produce linear measurements: “The model states that the probability of a person giving a certain answer to an item is a logistic function of the difference between the person's ‘ability’ and the item's ‘difficulty.’”18 In the case of the MFIS, a person with high levels of fatigue would affirm items expressing high levels of fatigue, while a person with low levels of fatigue would have difficulty affirming these items. According to Wright and Linacre,19 the Rasch model is the only way in which ordinal observations of clinical phenomena can be converted into linear measurements. The 21-item MFIS did not fit the Rasch model, a result that does not support the use of the total score as a global index but rather indicates that the scale is multidimensional.18 Mills and colleagues18 further explored this concept and were able to achieve fit to the Rasch model through the removal of the following items: 4 (“I am clumsy and uncoordinated”; physical subscale), 14 (“I am physically uncomfortable”; physical subscale), and 17 (“I am less able to complete tasks that require physical effort”; physical subscale). The authors then identified two subscales of the MFIS: physical and cognitive. The psychosocial subscale was eliminated because the items were found to be part of the physical subscale, consistent with the findings of Kos et al.10 The authors argue that studies in which the global MFIS score was used as an outcome measure or a selection tool might be invalid or subject to misinterpretations. Despite the problems regarding the underlying structure, it appears that the physical and cognitive subscales can be useful as outcome measures.
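The logistic relationship quoted above can be written out in a short sketch. This shows only the simplest, dichotomous form of the Rasch model, in which an item is either affirmed or not. The MFIS uses five response categories, so an analysis of it would require a polytomous extension that is not shown here. The function and parameter names are hypothetical.

```python
import math


def rasch_probability(person_level, item_difficulty):
    """Probability of affirming an item under the dichotomous Rasch model.

    Both arguments are on the same logit scale. For the MFIS, person_level
    would stand for the person's fatigue level and item_difficulty for how
    much fatigue an item expresses (illustrative naming only).
    """
    return 1.0 / (1.0 + math.exp(-(person_level - item_difficulty)))


# When the person's level equals the item's difficulty, the probability is 0.5
p_equal = rasch_probability(0.0, 0.0)
# A person well above the item's difficulty is likely to affirm it
p_high = rasch_probability(2.0, 0.0)
# A person well below the item's difficulty is unlikely to affirm it
p_low = rasch_probability(-2.0, 0.0)
```

Because the probability depends only on the difference between the two parameters, persons and items are placed on a single common scale, which is what fit to the model is taken to justify.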
Clearly, the MFIS as an outcome measure has some problems resulting in limitations in interpreting the scores. The lack of agreement between the underlying structure and its subscales raises complex issues, especially if the MFIS total score is invalid. The MFIS total score has been commonly reported as an outcome measure, but given the results reported by Mills et al.,18 the interpretations of studies using the MFIS total scores may need to be reevaluated. Additionally, the potential effects of confounding variables—specifically, depression—on MFIS scores may lead to misinterpretation of the results of uncontrolled studies. There may be no good way to separate these constructs in some samples such as people with MS. Probably the biggest problem with MFIS score interpretation is the lack of objective anchors. As the measurement stands, the interpretations are limited to whether a score changed significantly and the direction of that change. No studies have yet been published that objectively show what a change in the MFIS represents. In other words, if an MS patient's physical subscale score on the MFIS improves by 5 points, does that mean that the person can walk farther and faster? Further research on the MFIS is needed to address some of the problems described here and to identify what a change in MFIS score represents on an objective measure of quality of life in the physical and cognitive domains.
The Modified Fatigue Impact Scale (MFIS) is a widely used self-report measure of fatigue in people with MS, but its reliability and validity have not been adequately addressed.
The MFIS has various problems that result in limitations in interpreting the scores.
In particular, the scale's lack of agreement between the underlying structure and its subscales raises complex issues and should be considered before use of this instrument in clinical research and practice.
1. Romberg A, Virtanen A, Aunola S, Karppi SL, Karanko H, Ruutiainen J. Exercise capacity, disability and leisure physical activity of subjects with multiple sclerosis. Mult Scler. 2004; 10: 212–218.
2. Carroll CC, Gallagher PM, Seidle ME, Trappe SW. Skeletal muscle characteristics of people with multiple sclerosis. Arch Phys Med Rehabil. 2005; 86: 224–229.
3. Fisk JD, Pontefract A, Ritvo PG, Archibald CJ, Murray TJ. The impact of fatigue on patients with multiple sclerosis. Can J Neurol Sci. 1994; 21: 9–14.
4. Freal JE, Kraft GH, Coryell JK. Symptomatic fatigue in multiple sclerosis. Arch Phys Med Rehabil. 1984; 65: 135–138.
5. Kos D, Nagels G, D'Hooghe MB, Duportail M, Kerckhofs E. A rapid screening tool for fatigue impact in multiple sclerosis. BMC Neurol. 2006; 6: 27.
6. Lapierre Y, Hum S. Treating fatigue. Int MS J. 2007; 14: 64–71.
7. Multiple Sclerosis Council. Fatigue and multiple sclerosis—clinical practice guideline. Paralyzed Veterans of America; 1998. http://www.kintera.org/AccountTempFiles/Account403152/ECSoft/MS-FatigueCPG.pdf 2003.
8. Ritvo PG, Fischer JS, Miller DM, Andrews H, Paty DW, LaRocca NG. MSQLI: Multiple Sclerosis Quality of Life Inventory: A User's Manual. New York, NY: National Multiple Sclerosis Society; 1997.
9. Krupp LB, LaRocca NG, Muir-Nash J, Steinberg AD. The fatigue severity scale: application to patients with multiple sclerosis and systemic lupus erythematosus. Arch Neurol. 1989; 46: 1121–1123.
10. Kos D, Kerckhofs E, Carrea I, Verza R, Ramos M, Jansa J. Evaluation of the Modified Fatigue Impact Scale in four different European countries. Mult Scler. 2005; 11: 76–80.
11. Tellez N, Rio J, Tintore M, Nos C, Galan I, Montalban X. Does the Modified Fatigue Impact Scale offer a more comprehensive assessment of fatigue in MS? Mult Scler. 2005; 11: 198–202.
12. Flachenecker P, Kumpfel T, Kallmann B, et al. Fatigue in multiple sclerosis: a comparison of different rating scales and correlation to clinical parameters. Mult Scler. 2002; 8: 523–526.
13. Kos D, Kerckhofs E, Nagels G, et al. Assessing fatigue in multiple sclerosis: Dutch Modified Fatigue Impact Scale. Acta Neurol Belg. 2003; 103: 185–191.
14. Rammohan KW, Rosenberg JH, Lynn DJ, Blumenfeld AM, Pollak CP, Nagaraja HN. Efficacy and safety of modafinil (Provigil) for the treatment of fatigue in multiple sclerosis: a two centre phase 2 study. J Neurol Neurosurg Psychiatry. 2002; 72: 179–183.
15. Stankoff B, Waubant E, Confavreux C, et al. Modafinil for fatigue in MS: a randomized placebo-controlled double-blind study. Neurology. 2005; 64: 1139–1143.
16. Greim B, Benecke R, Zettl UK. Qualitative and quantitative assessment of fatigue in multiple sclerosis (MS). J Neurol. 2007; 254(suppl 2): II58–II64.
17. Kos D, Duportail M, D'hooghe M, Nagels G, Kerckhofs E. Multidisciplinary fatigue management programme in multiple sclerosis: a randomized clinical trial. Mult Scler. 2007; 13: 996–1003.
18. Mills RJ, Young CA, Pallant JF, Tennant A. Rasch analysis of the Modified Fatigue Impact Scale (MFIS) in multiple sclerosis. J Neurol Neurosurg Psychiatry. 2010; 81: 1049–1051.
19. Wright BD, Linacre JM. Observations are always ordinal; measurements, however, must be interval. Arch Phys Med Rehabil. 1989; 70: 857–860.
Financial Disclosures: The author has no conflicts of interest to disclose.