Multiple-choice tests are used to measure students' learning achievement in most courses at Universitas Terbuka (UT). To ensure the quality of these tests, item analysis based on classical test theory is carried out regularly. This paper analyzes the multiple-choice items of UT's End-of-Semester Examination using the program ITEMAN. The data were the answer sheets of students who took eight courses in the first and second semesters of 2009: MKDU4111, PEMA4210, MKDU4109, ISIP4215, EKMA4214, ESPA4112, BIOL4110, and BIOL4119. The results showed that the test items were of fairly good quality. Average item difficulty was fair, as indicated by mean P values ranging from 0.328 to 0.461. The discrimination index was good for about 75% of the courses in both semesters, with values ranging from 0.304 to 0.451 for the first-semester 2009 tests and from 0.343 to 0.382 for the second-semester 2009 tests. Meanwhile, test reliability could be considered good except for the courses PEMA4210 (first semester 2009) and MKDU4111 (second semester 2009).
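The three classical test theory statistics named above (item difficulty P, discrimination index, and reliability) can be illustrated with a short computation. The following is a minimal sketch, not ITEMAN's own implementation. It assumes dichotomous (0/1) item scores, takes the uncorrected point-biserial correlation between item and total score as the discrimination index, and uses KR-20 as the reliability coefficient. The function name `item_analysis` and the sample score matrix are hypothetical and are not data from this study.

```python
from statistics import mean, pvariance


def item_analysis(scores):
    """Classical item analysis for dichotomous (0/1) item scores.

    scores: list of rows, one row per examinee, one 0/1 entry per item.
    Returns (per_item, kr20), where per_item is a list of
    (difficulty_p, point_biserial) tuples, one per item.
    """
    n_items = len(scores[0])
    totals = [sum(row) for row in scores]   # total score per examinee
    mean_total = mean(totals)
    var_total = pvariance(totals)           # population variance of totals
    sd_total = var_total ** 0.5

    per_item = []
    for j in range(n_items):
        item = [row[j] for row in scores]
        p = mean(item)                      # difficulty: proportion correct
        if 0 < p < 1 and sd_total > 0:
            # Mean total score of examinees who answered this item correctly.
            mean_correct = mean(t for t, x in zip(totals, item) if x == 1)
            # Point-biserial correlation between item score and total score
            # (uncorrected: the item itself is included in the total).
            r_pb = (mean_correct - mean_total) / sd_total * (p / (1 - p)) ** 0.5
        else:
            r_pb = 0.0                      # undefined when nobody or everybody is correct
        per_item.append((p, r_pb))

    # KR-20 reliability for dichotomous items.
    sum_pq = sum(p * (1 - p) for p, _ in per_item)
    if n_items > 1 and var_total > 0:
        kr20 = (n_items / (n_items - 1)) * (1 - sum_pq / var_total)
    else:
        kr20 = 0.0
    return per_item, kr20


# Hypothetical scores: 5 examinees by 4 items (illustration only).
sample = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
per_item, kr20 = item_analysis(sample)
for number, (p, r_pb) in enumerate(per_item, start=1):
    print(f"item {number}: P = {p:.3f}, point-biserial = {r_pb:.3f}")
print(f"KR-20 = {kr20:.3f}")
```

In this sketch a higher P means an easier item, so the mean P values of 0.328 to 0.461 reported above correspond to items that fewer than half of the examinees answered correctly.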
Jurnal Pendidikan Terbuka Dan Jarak Jauh is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.