KPD3016 AT16(A) TEAM 7: Validity and Reliability

VALIDITY

· Validity refers to the accuracy of an assessment: whether or not it measures what it is supposed to measure. Even if a test is reliable, it may not provide a valid measure.

Type of Validity	Definition	Example
Content	The extent to which the content of the test matches the instructional objectives	A semester or quarter exam that only includes content covered during the last six weeks is not a valid measure of the course's overall objectives -- it has very low content validity.
Criterion	The extent to which scores on the test are in agreement with (concurrent validity) or predict (predictive validity) an external criterion.	If the end-of-year math tests in 4th grade correlate highly with the statewide math tests, they would have high concurrent validity.
Construct	The extent to which the test measures the theoretical construct it is intended to measure. A construct is a property that is offered to explain some aspect of human behavior, such as mechanical ability, intelligence, or introversion.	In early self-esteem studies, self-esteem referred to a person's sense of self-worth or self-respect. Clinical observations in psychology had shown that people with low self-esteem often had depression. Therefore, to establish the construct validity of the self-esteem measure, the researchers showed that those with higher scores on the measure had lower depression scores, while those with lower scores had higher rates of depression.
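To make the concurrent validity example in the table above concrete, here is a minimal Python sketch that computes a validity coefficient as the Pearson correlation between classroom test scores and an external criterion. The score lists and the helper name `pearson_r` are invented for illustration only and do not come from any real study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six pupils (made up for this example).
classroom_test = [55, 62, 70, 74, 81, 90]   # end-of-year class test
statewide_test = [58, 60, 72, 70, 85, 88]   # external criterion

# A coefficient close to 1 would suggest high concurrent validity.
validity_coefficient = pearson_r(classroom_test, statewide_test)
print(round(validity_coefficient, 2))
```

For predictive validity the calculation is the same, except that the criterion scores are collected at a later time.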

Factor Affecting Validity	Explanation
Nature of the group	The validity coefficient should be checked for consistency across subgroups that differ in any characteristic (e.g. age, gender, educational level).
Sample heterogeneity	A wider range of scores results in a higher validity coefficient (range restriction phenomenon)
Criterion-predictor relationship	There must be a linear relationship between predictor and criterion. Otherwise, the Pearson correlation coefficient understates the strength of the relationship.
Validity-reliability proportionality	Reliability has a limiting influence on validity – we simply cannot validate an unreliable measure!
Moderator variables	Variables such as age, gender, and personality characteristics may mean that a test predicts performance for particular subgroups only, so keep them in mind.
Criterion contamination	Reduce bias by measuring the contaminating influences, then correct for them statistically using partial correlation.
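The partial correlation mentioned under criterion contamination can be sketched as follows. This is only an illustration using the standard first-order partial correlation formula; the function name and the three input correlations are hypothetical values, not results from real data.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y with the influence of z removed.

    r_xy: correlation between predictor (x) and criterion (y)
    r_xz: correlation between predictor and the contaminating variable (z)
    r_yz: correlation between criterion and the contaminating variable
    """
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical values: the raw validity coefficient is 0.60, but both the
# test and the criterion correlate 0.50 with a contaminating variable.
corrected = partial_correlation(r_xy=0.60, r_xz=0.50, r_yz=0.50)
print(round(corrected, 2))
```

In this made-up case the corrected coefficient is noticeably lower than the raw one, which shows how contamination can inflate an apparent validity coefficient.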

RELIABILITY

· The degree of consistency between two measures of the same thing (Mehrens and Lehmann, 1987).

· The measure of how stable, dependable, trustworthy, and consistent a test is in measuring the same thing each time (Worthen et al., 1993).

TYPES OF RELIABILITY	DEFINITION	EXAMPLE
TEST-RETEST	The same form of a test is administered on two or more separate occasions to the same group of examinees.	Examinees may adapt to the test format and thus tend to score higher on later administrations. Hence, careful implementation of the test-retest approach is strongly recommended.
EQUIVALENT FORM	Two different forms of a test, based on the same content, are administered on one occasion to the same examinees.	An examinee who took Form A earlier could not share the test items with another student who might take Form B later, because the two forms have different items.
INTERNAL CONSISTENCY	A coefficient computed from the scores obtained on a single test or survey, showing how consistently its items measure the same thing.	When no pattern is found in the students' responses, the test is probably too difficult and students are just guessing the answers randomly.
SPLIT HALF	A measure of consistency where a test is split in two and the scores for each half of the test are compared with one another.	Suppose you have a math test and divide its items into two parts. If you correlate the scores on the first half of the items with the scores on the second half, they should be highly correlated if the test is reliable.
INTER-RATER	When multiple people are giving assessments of some kind or are the subjects of some test, similar people should produce the same resulting scores.	Two people may be asked to categorize pictures of animals as being dogs or cats. A perfectly reliable result would be that they both classify the same pictures in the same way.
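Several of the reliability types in the table above can be estimated with a few lines of Python. The sketch below is one possible approach, not the only one: it uses the Spearman-Brown correction for split-half reliability, Cronbach's alpha for internal consistency, and percent agreement plus Cohen's kappa for inter-rater reliability. These particular coefficients are common choices that the table itself does not name, and all data and function names are made up for illustration.

```python
from math import sqrt
from statistics import variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def split_half_reliability(item_scores):
    """Correlate first-half and second-half totals, then apply the
    Spearman-Brown correction to estimate full-length reliability."""
    half = len(item_scores[0]) // 2
    first = [sum(row[:half]) for row in item_scores]
    second = [sum(row[half:]) for row in item_scores]
    r_half = pearson_r(first, second)
    return 2 * r_half / (1 + r_half)


def cronbach_alpha(item_scores):
    """Internal consistency from a single administration of the test."""
    k = len(item_scores[0])
    item_vars = [variance([row[i] for row in item_scores]) for i in range(k)]
    total_var = variance([sum(row) for row in item_scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def percent_agreement(rater_a, rater_b):
    """Share of cases that two raters classified identically."""
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Agreement between two raters, corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    p_expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (p_observed - p_expected) / (1 - p_expected)


# Hypothetical data: rows are examinees, columns are items (1 = correct).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
# Hypothetical dog/cat classifications of six pictures by two raters.
rater_a = ["dog", "cat", "dog", "dog", "cat", "cat"]
rater_b = ["dog", "cat", "cat", "dog", "cat", "cat"]

print(round(split_half_reliability(scores), 2))
print(round(cronbach_alpha(scores), 2))
print(round(percent_agreement(rater_a, rater_b), 2))
print(round(cohen_kappa(rater_a, rater_b), 2))
```

Test-retest and equivalent-form reliability can be estimated with the same `pearson_r` helper by correlating the scores from the two occasions or the two forms.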

FACTORS THAT LOWER THE RELIABILITY OF ASSESSMENTS

Insufficient number of tasks

Remedy: Accumulate results from several assessments

Poorly structured assessment procedures

Remedy: Carefully define the nature of the tasks, the conditions for obtaining the assessment, and the criteria for scoring and judging the results.

Dimensions of performance are specific to the tasks

Remedy: Increase generalizability of performance by selecting tasks that have dimensions like those in similar tasks

Inadequate scoring guides for judgemental scoring

Remedy: Use scoring rubrics or rating scales that specifically describe the criteria and levels of quality.

Scoring judgements that are influenced by personal bias

Remedy: Check scores with those of an independent judge. Receive training in scoring and rating if possible
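The first remedy in the list above, accumulating results from several assessments, can be illustrated with the Spearman-Brown prophecy formula, which predicts how reliability changes when an assessment is lengthened with comparable tasks. The list above does not mention this formula; it is used here as one common way to show the effect, and the starting reliability of 0.50 is an assumed value.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when the number of comparable tasks is
    multiplied by length_factor (e.g. 2 means twice as many tasks)."""
    return (length_factor * reliability) / (
        1 + (length_factor - 1) * reliability
    )


# Assumed reliability of a single short assessment.
single_assessment = 0.50

# Predicted reliability after pooling 1, 2, and 4 comparable assessments.
predictions = {k: spearman_brown(single_assessment, k) for k in (1, 2, 4)}
for k, value in predictions.items():
    print(k, round(value, 2))
```

Under these assumptions the predicted reliability rises as more comparable tasks are pooled, which is why an insufficient number of tasks lowers reliability.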

THE RELATIONSHIP BETWEEN VALIDITY AND RELIABILITY

At best, we have a measure that has both high validity and high reliability. It yields consistent results in repeated application and it accurately reflects what we hope to represent.

It is possible to have a measure that has high reliability but low validity: one that is consistent in getting bad information or consistent in missing the mark. It is also possible to have one that has low reliability and low validity: inconsistent and not on target.

Finally, it is not possible to have a measure that has low reliability and high validity - you can't really get at what you want or what you're interested in if your measure fluctuates wildly.
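The point that an unreliable measure cannot be highly valid can be put into numbers. In classical test theory, the correlation between a test and a criterion cannot exceed the square root of the product of their reliabilities. The short sketch below applies that bound; the function name and the reliability values are assumptions chosen for illustration.

```python
from math import sqrt


def max_validity(test_reliability, criterion_reliability=1.0):
    """Upper bound on the validity coefficient, given the reliability
    of the test and of the criterion (classical test theory)."""
    return sqrt(test_reliability * criterion_reliability)


# Assumed reliabilities, paired with a perfectly reliable criterion.
ceiling_low = max_validity(0.25)    # an unreliable test
ceiling_high = max_validity(0.81)   # a fairly reliable test

print(round(ceiling_low, 2), round(ceiling_high, 2))
```

Even with a perfect criterion, the unreliable test in this example is capped at a much lower validity coefficient than the more reliable one.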

1 comment:

Hasnor Izzati Che Razali, 27 December 2011 at 02:15
Assalamualaikum...

When designing any assessment test, a researcher needs to run a pilot study first to see whether the test is truly valid and reliable before it is distributed to students. This is why it is important for a teacher to understand how to measure validity and reliability... something to think about.

Wednesday, 7 December 2011
