Wednesday 7 December 2011

Validity and Reliability


VALIDITY
· Validity refers to the accuracy of an assessment: whether or not it measures what it is supposed to measure. Even if a test is reliable, it may not provide a valid measure.
Type of Validity: Content
Definition: The extent to which the content of the test matches the instructional objectives.
Example: A semester or quarter exam that only includes content covered during the last six weeks is not a valid measure of the course's overall objectives; it has very low content validity.

Type of Validity: Criterion
Definition: The extent to which scores on the test are in agreement with (concurrent validity) or predict (predictive validity) an external criterion.
Example: If the end-of-year math tests in 4th grade correlate highly with the statewide math tests, they would have high concurrent validity.

Type of Validity: Construct
Definition: A construct is a property that is offered to explain some aspect of human behavior, such as mechanical ability, intelligence, or introversion. Construct validity is the extent to which a test actually measures the construct it claims to measure.
Example: In early self-esteem studies, self-esteem referred to a person's sense of self-worth or self-respect. Clinical observations in psychology had shown that people who had low self-esteem often had depression. Therefore, to establish the construct validity of the self-esteem measure, the researchers showed that those with higher scores on the self-esteem measure had lower depression scores, while those with low self-esteem had higher rates of depression.
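As a rough illustration of the concurrent validity example, the agreement between two sets of scores can be expressed as a Pearson correlation coefficient. The sketch below is only one way this might be computed: the helper name `pearson` and both score lists are invented for illustration and do not come from any real test.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six 4th-grade students (invented data).
end_of_year_scores = [55, 62, 70, 78, 85, 91]
statewide_scores = [50, 65, 68, 80, 83, 95]

# A coefficient close to 1 would suggest high concurrent validity.
concurrent_validity = pearson(end_of_year_scores, statewide_scores)
print(round(concurrent_validity, 2))
```

With these made-up numbers the coefficient comes out close to 1, which is the situation the example describes as high concurrent validity.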

Factors affecting validity

Factor: Nature of the group
Explanation: The validity coefficient should be checked for consistency across subgroups which differ in any characteristic (e.g. age, gender, educational level).

Factor: Sample heterogeneity
Explanation: A wider range of scores results in a higher validity coefficient; a narrow range of scores lowers it (the range restriction phenomenon).

Factor: Criterion-predictor relationship
Explanation: There must be a linear relationship between predictor and criterion. Otherwise the Pearson correlation coefficient, which only captures linear association, would be of no use!

Factor: Validity-reliability proportionality
Explanation: Reliability has a limiting influence on validity: we simply cannot validate an unreliable measure!

Factor: Moderator variables
Explanation: Variables like age, gender, and personality characteristics may help to predict performance for particular variables only, so keep them in mind!

Factor: Criterion contamination
Explanation: Reduce bias by measuring the contaminating influences, then correct for them statistically by use of partial correlation.
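Two of the factors listed here, range restriction and the partial-correlation correction for criterion contamination, can be shown with a little arithmetic. The sketch below uses the standard first-order partial correlation formula; the function names and all numbers are assumptions chosen only for illustration.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation of x and y with the influence of z removed from both."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

# Range restriction: the same invented data, first over the full range of
# predictor scores, then only for the top half of the group.
predictor = list(range(1, 11))
criterion = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]
full_range_r = pearson(predictor, criterion)
restricted_r = pearson(predictor[5:], criterion[5:])
print(round(full_range_r, 3), round(restricted_r, 3))

# Criterion contamination: hypothetical correlations between predictor (x),
# criterion (y) and a contaminating variable (z).
corrected_r = partial_correlation(r_xy=0.5, r_xz=0.5, r_yz=0.5)
print(round(corrected_r, 3))
```

In this toy data the coefficient drops once the group is restricted to the higher scorers, and the predictor-criterion correlation shrinks after the contaminating variable is partialled out.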

RELIABILITY

· The degree of consistency between two measures of the same thing (Mehrens and Lehmann, 1987).

· The measure of how stable, dependable, trustworthy, and consistent a test is in measuring the same thing each time (Worthen et al., 1993).

TYPES OF RELIABILITY

Type of Reliability: TEST-RETEST
Definition: The same form of a test is administered on two or more separate occasions to the same group of examinees, and the scores are compared.
Example: Examinees may adapt to the test format and thus tend to score higher on later administrations. Hence, careful implementation of the test-retest approach is strongly recommended.

Type of Reliability: EQUIVALENT FORM
Definition: Two different forms of a test, based on the same content, are administered on one occasion to the same examinees.
Example: An examinee who took Form A earlier could not share the test items with another student who might take Form B later, because the two forms have different items.

Type of Reliability: INTERNAL CONSISTENCY
Definition: A coefficient computed from the scores obtained on a single test or survey, showing how consistently its items measure the same thing.
Example: When no pattern is found in the students' responses, the test is probably too difficult and students are just guessing the answers at random.

Type of Reliability: SPLIT HALF
Definition: A measure of consistency where a test is split in two and the scores for each half of the test are compared with one another.
Example: Take a Math test and divide its items into two parts. If you correlate the scores on the first half of the items with the scores on the second half, they should be highly correlated if the test is reliable.

Type of Reliability: INTER RATER
Definition: When multiple people give assessments of some kind, or are the subjects of some test, similar people should produce the same resulting scores.
Example: Two people may be asked to categorize pictures of animals as being dogs or cats. A perfectly reliable result would be that they both classify the same pictures in the same way.
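The split-half and inter-rater examples can both be worked through by hand. The sketch below assumes items are scored 1 (correct) or 0 (wrong) and uses simple percent agreement for the two raters; the function names and the data are hypothetical, and other agreement statistics could be used instead.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation; assumes neither list has zero variance."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """Correlate each student's score on the first half of the items
    with their score on the second half."""
    half = len(item_scores[0]) // 2
    first_half = [sum(row[:half]) for row in item_scores]
    second_half = [sum(row[half:]) for row in item_scores]
    return pearson(first_half, second_half)

def percent_agreement(rater_a, rater_b):
    """Share of items that two raters classified in the same way."""
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Invented Math test: five students, six items, 1 = correct, 0 = wrong.
math_test = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0, 1],
]
split_half_r = split_half_reliability(math_test)
print(round(split_half_r, 2))

# Invented classifications of five animal pictures by two raters.
rater_1 = ["dog", "cat", "dog", "dog", "cat"]
rater_2 = ["dog", "cat", "cat", "dog", "cat"]
agreement = percent_agreement(rater_1, rater_2)
print(agreement)
```

A perfectly reliable pair of raters would reach an agreement of 1.0; here they disagree on one picture out of five.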



FACTORS THAT LOWER THE RELIABILITY OF ASSESSMENTS


  1. Insufficient number of tasks
Remedy: Accumulate results from several assessments.
  2. Poorly structured assessment procedures
Remedy: Carefully define the nature of the tasks, the conditions for obtaining the assessment, and the criteria for scoring and judging the results.
  3. Dimensions of performance are specific to the tasks
Remedy: Increase the generalizability of performance by selecting tasks that have dimensions like those in similar tasks.
  4. Inadequate scoring guides for judgemental scoring
Remedy: Use scoring rubrics or rating scales that specifically describe the criteria and levels of quality.
  5. Scoring judgements that are influenced by personal bias
Remedy: Check scores with those of an independent judge. Receive training in scoring and rating if possible.
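The first remedy in this list, accumulating results from several assessments, is often motivated with the Spearman-Brown formula from classical test theory, which is not named in this post. The sketch below assumes the added tasks are comparable to the existing ones; the function name and the starting reliability of 0.5 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when an assessment is made `length_factor`
    times as long using comparable tasks (Spearman-Brown formula)."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Hypothetical single assessment with a reliability of 0.5.
single = 0.5
doubled = spearman_brown(single, 2)      # two comparable assessments combined
quadrupled = spearman_brown(single, 4)   # four comparable assessments combined
print(round(doubled, 2), round(quadrupled, 2))
```

Under these assumptions the predicted reliability rises from 0.5 to about 0.67 with two assessments and to 0.8 with four.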
 
The relationship between validity and reliability.
At best, we have a measure that has both high validity and high reliability. It yields consistent results in repeated application and it accurately reflects what we hope to represent.
It is possible to have a measure that has high reliability but low validity: one that is consistent in getting bad information or consistent in missing the mark. It is also possible to have one that has low reliability and low validity: inconsistent and not on target.
Finally, it is not possible to have a measure that has low reliability and high validity - you can't really get at what you want or what you're interested in if your measure fluctuates wildly.
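The last point can be made concrete with a result from classical test theory: the correlation between a test and a criterion cannot exceed the square root of the product of their reliabilities. The sketch below is only an illustration of that bound; the function name and the reliability values are assumptions.

```python
from math import sqrt

def validity_ceiling(test_reliability, criterion_reliability=1.0):
    """Upper bound on the validity coefficient under classical test theory."""
    return sqrt(test_reliability * criterion_reliability)

# Hypothetical reliabilities, with a perfectly reliable criterion by default.
reliable_test = validity_ceiling(0.81)
unreliable_test = validity_ceiling(0.25)
both_imperfect = validity_ceiling(0.64, 0.81)
print(round(reliable_test, 2), round(unreliable_test, 2), round(both_imperfect, 2))
```

A test with a reliability of 0.25 can correlate at most 0.5 with even a perfectly reliable criterion, which is why low reliability rules out high validity.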

1 comment:

  1. Assalamualaikum...

    When creating any assessment test, a researcher needs to run a pilot study first to see whether the test is truly valid and reliable before it is distributed to students. This is why it is important for a teacher to understand how to measure validity and reliability... something to think about.
