Thursday, November 15, 2018

Statistical Concepts of Measurement


Measurement can be defined as the process of assigning numbers to variables to represent qualities and quantities of characteristics.  A number can be assigned to qualitative data.  For example, the number "1" can be used to identify/classify males and the number "2" can be used to identify/classify females.  Another example is diagnostic testing studies.  The number "0" can be used to classify people who had a negative test result for a diagnosis and the number "1" can be used to classify people who had a positive test result for a diagnosis.

A number can also reflect an amount or quantity of a variable.  A continuous variable can theoretically be measured on a continuum within a defined range.  Goniometry is measurement of the range of motion of a joint and can be measured as a continuous variable in units of degrees.  A physical therapist may use goniometry to measure the amount of a patient's knee flexion range of motion as 100 degrees.  However, in practical application, a continuous variable can never be measured exactly due to lack of precision and measurement error.  The true amount of knee range of motion cannot be measured due to lack of precision of goniometric measurement. 

Another example is the hemoglobin A1c test.  The hemoglobin A1c test measures the percentage of a patient's hemoglobin that is glycated.  The hemoglobin A1c test is considered a standard test for measuring glucose control in patients with diabetes.  However, the hemoglobin A1c test has some measurement error related to reliability and validity of the test. 

To be clear, goniometry and the hemoglobin A1c test have established validity and are standard methods of measurement.  Yet, no test is perfectly accurate.  The key is that the test or measurement method should have an acceptable amount of measurement error.

Discrete variables are described in whole units of measurement.  Heart rate is measured in beats per minute and is not recorded as a decimal or fraction; therefore, heart rate is a discrete variable.  When a qualitative variable can have only two values, like a positive or negative test result, the variable is called a dichotomous variable.

The statistical concept of measurement also involves rules of measurement.  These rules dictate how numbers can be assigned to measure a variable.  In the case of gender, "1" can be assigned to represent males and "2" can be assigned to represent females.  Rules of measurement are important because such rules determine which mathematical operations can be performed on a set of data.  Consider the variable of gender.  If a study included five males and 10 females, the total number of study participants would be 15.  If one instead summed the codes that represent males and females (1 and 2, respectively), the result would be 25, which does not correspond to the number of study participants.

1 = male, 2 = female
(1 x 5) + (2 x 10) = 25 study participants
Obviously, the correct answer is 15 study participants.
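
The miscount above can be reproduced in a few lines of Python.  This is only a minimal sketch: the coding (1 = male, 2 = female) and the group sizes come from the example, while the variable names are illustrative.

```python
from collections import Counter

# Codes follow the example above: 1 = male, 2 = female.
gender_codes = [1] * 5 + [2] * 10

# Summing the codes treats category labels as quantities.
sum_of_codes = sum(gender_codes)

# Counting the observations gives the actual number of participants.
n_participants = len(gender_codes)

# Frequencies per category are the appropriate summary for nominal data.
counts = Counter(gender_codes)

print(sum_of_codes)    # 25
print(n_participants)  # 15
print(dict(counts))    # {1: 5, 2: 10}
```

The codes work only as labels here, so counting how often each one occurs is meaningful while adding them together is not.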

Statistical analysis of data is based on the rules that are applied to a measurement.  Data are analyzed according to different levels of measurement.  Nominal level data are also referred to as categorical data.  The gender variable is an example of nominal level data.  Diagnosis is another example of nominal level data.  Nominal level data can be expressed as counts/frequencies.

Measurement on an ordinal scale requires data that can be ranked.  One example of an ordinal scale is the measurement of pain on a numeric rating scale of 0 to 10, where "0" is defined as "no pain" and "10" is defined as "the worst pain ever experienced".  Another example is the measurement of loss of physical function.  Loss of physical function can be measured in terms of classifications, such as minor, moderate, and severe.  These classifications of loss of physical function can be placed on an ordinal scale.  Ordinal level data can be used for descriptive analyses, such as frequencies (like nominal data).  For example, a group of researchers may report the number of study participants who fall within different categories of physical disability.  Technically, ordinal data cannot be analyzed using arithmetic operations, because the distances between ranks are not known to be equal.  However, one could argue that ordinal data can be analyzed using arithmetic operations (such as a mean pain rating) and that such analyses can be interpreted from a practical perspective.  In fact, peer-reviewed journals have published studies where ordinal data have been analyzed in such a manner. (https://academic.oup.com/ptj/article/90/9/1239/2737986?searchresult=1)
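
As a rough illustration of the summaries discussed for ordinal data, the sketch below computes frequencies, a median, and a mean for a set of pain ratings.  The ratings are hypothetical values made up for this example, not data from any study.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical pain ratings (0 = no pain, 10 = worst pain ever experienced).
pain_ratings = [2, 3, 3, 4, 5, 7, 8]

# Frequencies and the median rely only on the rank order of the ratings.
frequencies = Counter(pain_ratings)
median_rating = median(pain_ratings)

# The mean additionally assumes equal distances between ratings,
# which an ordinal scale does not guarantee.
mean_rating = mean(pain_ratings)

print(dict(frequencies))
print(median_rating)          # 4
print(round(mean_rating, 2))  # 4.57
```

The median depends only on ordering, whereas the mean treats the step from 2 to 3 as equal to the step from 7 to 8.  That assumption is the center of the debate over arithmetic on ordinal data.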

Data on the interval scale are rank-ordered (like ordinal data) but also have equal distances or intervals between units of measurement.  However, interval-level data do not have a true zero measurement.  Consider the measurement of temperature in degrees Celsius.  The measurement of 0 degrees Celsius is assigned arbitrarily.  Indeed, temperature can be measured in negative units (for example, -10 degrees Celsius).  Temperature is a measurement of the amount of heat.  Since 0 degrees Celsius does not represent the total absence of heat, the measurement of 0 degrees Celsius can be considered "artificial".  A strength of interval data is that addition and subtraction (and therefore differences and means) can be used to analyze the data since equal distances between units of measurement exist.  Ratios of interval-level values, however, are not meaningful because the zero point is arbitrary.
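
One way to see the effect of an arbitrary zero is to compare differences and ratios before and after a change of scale.  The sketch below converts Celsius to Kelvin.  The Kelvin scale and the two example temperatures are not from the text above and are used here only for illustration.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert degrees Celsius to kelvins (shift the zero point by 273.15)."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0  # illustrative temperatures in degrees Celsius

# Differences survive the change of scale: 10 degrees either way.
difference_c = warm_c - cool_c
difference_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

# Ratios do not: "twice as hot" in Celsius is an artifact of the zero point.
ratio_c = warm_c / cool_c
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(difference_c, round(difference_k, 2))  # 10.0 10.0
print(ratio_c, round(ratio_k, 4))            # 2.0 1.0353
```

The difference is the same on both scales, but the ratio changes from 2.0 to about 1.035, so 20 degrees Celsius cannot be described as "twice as warm" as 10 degrees Celsius.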

The highest level of measurement is the ratio scale.  Ratio-level data have the properties of the interval scale, but also have a true zero measurement.  A true zero measurement on the ratio scale reflects the total absence of the variable property, and negative values are not possible.  The measurement of force in Newtons is an example of ratio-level data.  Because ratio data have equal intervals and a true zero measurement, all mathematical and statistical operations can be used for data analyses.
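
The four levels can be thought of as cumulative: each level allows the summaries of the levels below it plus some of its own.  The lookup below is a hypothetical sketch of that idea.  The specific summaries listed for each level follow a commonly cited convention and are an assumption of this example, not a definitive rule.

```python
from typing import List

# Levels of measurement, ordered from lowest to highest.
LEVELS = ["nominal", "ordinal", "interval", "ratio"]

# Summaries that become meaningful at each level (illustrative, not exhaustive).
ADDED_AT_LEVEL = {
    "nominal": ["count", "mode"],
    "ordinal": ["median", "percentile"],
    "interval": ["mean", "standard deviation"],
    "ratio": ["ratio of two values", "coefficient of variation"],
}

def permissible_summaries(level: str) -> List[str]:
    """Return the summaries allowed at `level`, including all lower levels."""
    if level not in LEVELS:
        raise ValueError(f"unknown level of measurement: {level!r}")
    allowed: List[str] = []
    for name in LEVELS[: LEVELS.index(level) + 1]:
        allowed.extend(ADDED_AT_LEVEL[name])
    return allowed

print(permissible_summaries("ordinal"))
# ['count', 'mode', 'median', 'percentile']
```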

So, why is identification of the level of measurement (nominal, ordinal, interval, or ratio) important?  In the field of statistics, the most important reason may be the selection of appropriate statistical procedures, based on the level of data measurement.  A simple example is gender.  For the purpose of recording gender data, the number "1" may be used to represent males and the number "2" may be used to represent females.  This coding of data is often necessary for using statistical analysis software.  However, a mean calculated from such codes is not meaningful.  If a study includes five males and five females, the mean of the codes would equal 1.5.  A mean of 1.5 does not represent the gender variable, and we cannot make inferences from such a statistical analysis.
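
The 1.5 figure can be checked directly.  The sketch below uses the coding from the example (1 = male, 2 = female) and contrasts the mean of the codes with proportions per category.  The variable names are illustrative.

```python
from collections import Counter
from statistics import mean

# Five males (code 1) and five females (code 2), as in the example above.
gender_codes = [1] * 5 + [2] * 5

# The software will happily compute a mean, but 1.5 is not a gender.
mean_code = mean(gender_codes)

# Counts and proportions describe the nominal variable meaningfully.
counts = Counter(gender_codes)
proportions = {code: n / len(gender_codes) for code, n in counts.items()}

print(mean_code)    # 1.5
print(proportions)  # {1: 0.5, 2: 0.5}
```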

Collection and analysis of ordinal-level data occur frequently in social and health sciences.  As previously mentioned, ordinal data have been analyzed using arithmetic operations.  Although applying arithmetic operations to ordinal data is fundamentally inappropriate, such procedures have made interpretation of ordinal data more practical.  I will not attempt to debate "for or against" the use of arithmetic operations in ordinal data analysis.  The purpose of my comments is to make the reader aware of this topic.

