Evaluating assessment performance
[[category:Open assessment]]
{{Lecture}}
'''Evaluating assessment performance''' is a lecture about the factors that constitute the performance, the ''goodness'', of assessments and how they can be evaluated within a single general framework.
This page was converted from an encyclopedia article to a lecture because other descriptive pages about the same topic already exist. The old content was archived and can be found [http://en.opasnet.org/en-opwiki/index.php?title=Evaluating_assessment_performance&oldid=4383 here].
==Scope==
'''Purpose:''' To summarize what assessments are about overall, what factors constitute the overall performance of an assessment, how those factors are interrelated, and how the performance, the ''goodness'', of an assessment can be evaluated.
'''Intended audience:''' Researchers (especially at the doctoral student level) in any field of science (mainly natural scientists rather than social scientists).
'''Duration:''' 1 hour 15 minutes
==Definition==
To understand this lecture, it is recommended to also acquaint oneself with the following lectures:
* [[Open assessment in research]]
* [[Assessments - science-based decision support]]
* [[Variables - evolving interpretations of reality]]
* [[Science necessitates collaboration]]
==Result==
[[Image:Evaluating assessment performance.ppt|slides]]
* Introduction through analogy: what makes a mobile phone good?
** chain from production to use
** goodness from whose perspective, the producer's or the user's? Can both fit into one framework?
** phone functionalities (quality of content)
** user interface, appearance design, use context, packaging, logistics, marketing, sales (applicability)
** mass production/customization: components, code, assembly (efficiency)
* Assessments
** serve two masters: truth (science) and practical need (societal decision making, policy)
** must meet the needs of their use
** must strive for truth
** both requirements must be met, which is not easy, but possible
** are a business of creating understanding about reality
*** asking the right questions, providing good answers, getting the (questions and) answers where they are needed
*** getting the questions right (according to need) is primary; getting the answers right is conditional on the previous
* Contemporary conventions of addressing performance:
** quality assurance/control: process approach
** uncertainty assessment: product approach
* Performance in the context of assessments as science-based decision support: properties of a good assessment
** takes the production point of view
** quality of content
*** informativeness, calibration, relevance
** applicability
*** availability, usability, acceptability
** efficiency
** different properties have different points of reference and criteria
** a means of managing design and execution or evaluating past work
** not an orthogonal set: applicability is conditional on quality of content; efficiency is conditional on both quality of content and applicability
* Methods of evaluation
** informativeness and calibration - uncertainty analysis, discrepancy analysis between the result and another estimate (assumed as a gold standard)
** relevance - scope vs. need
** usability - (assumed) intended user opinion, participant rating of the (technical) quality of information objects
** availability - observed access to assessment information by intended users
** acceptability of premises - (assumed) acceptance of premises by intended users, or others interested or affected
** acceptability of process - scientific acceptability of the definition, given the scope → peer review
** efficiency - estimation of spent effort (given the outcome)
* To what extent is the addressing of these properties built into open assessment and Opasnet?
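The first evaluation method above, discrepancy analysis of informativeness and calibration against a gold-standard estimate, can be sketched numerically. This is a minimal illustration, not part of the lecture: the function names, the interval-coverage measure of calibration, the mean-width measure of informativeness, and all numbers are assumptions chosen for the example.

```python
# Illustrative sketch: evaluating informativeness and calibration of an
# assessment's probabilistic estimates against reference ("gold standard")
# values. All names and numbers are hypothetical.

def calibration(intervals, references):
    """Fraction of reference values that fall inside the estimate's
    uncertainty intervals (higher = better calibrated)."""
    hits = sum(lo <= ref <= hi for (lo, hi), ref in zip(intervals, references))
    return hits / len(references)

def informativeness(intervals):
    """Mean uncertainty-interval width; a narrower interval is more
    informative, so a lower value means a more informative estimate."""
    return sum(hi - lo for lo, hi in intervals) / len(intervals)

# Hypothetical 90% uncertainty intervals reported by an assessment ...
estimates = [(0.8, 1.4), (2.0, 3.5), (0.1, 0.6)]
# ... and later measurements taken as the gold standard.
measured = [1.1, 3.9, 0.3]

print(calibration(estimates, measured))   # 2 of 3 references covered
print(informativeness(estimates))         # average interval width
```

The sketch makes the trade-off in the slides concrete: narrowing the intervals raises informativeness but risks lowering calibration, which is why the two are evaluated together.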
Exercises:
{|{{prettytable}}
! Topic
! Task description
|-----
| Quality of content
| Consider the concepts informativeness, calibration and relevance. What is their point of reference, practical need or truth? Which attributes and sub-attributes of assessments and variables do they relate to? Now choose one variable that was developed in the Tuesday and Wednesday exercises. What can you say about its informativeness, calibration and relevance? Then consider the case assessment (ET-CL). What can you say about its informativeness, calibration and relevance? What implications does this give for future work regarding the variable and the assessment? Give suggestions for improving their quality of content. Think of ways in which assessments are or could be evaluated against these properties in practical terms. Prepare a presentation including a brief introduction to the concepts, an evaluation of the quality of content of the variable and the assessment, practical means of evaluation against these properties, and suggestions for improving the quality of content, as well as reasoning to back up your suggestions.
|-----
| Usability and availability
| Consider the concepts usability and availability. What is their point of reference, practical need or truth? Assuming that the information content of a variable or an assessment were complete, correct and exact, what would make it usable and available, or not, for someone who needs that particular chunk of information? Now choose one variable that was developed in the Tuesday and Wednesday exercises. What can you say about its usability and availability given the purpose and users defined in the scope of the case assessment (ET-CL)? Then consider the case assessment. What can you say about its usability and availability given the purpose and users defined in the scope? What implications does this give for future work regarding the variable and the assessment? Give suggestions for improving their usability and availability. Think of ways in which assessments are or could be evaluated against these properties in practical terms. Prepare a presentation including a brief introduction to the concepts and why they are important aspects of assessment performance, an evaluation of the usability and availability of the variable and the assessment, practical means of evaluation against these properties, and suggestions for improving the usability and availability, as well as reasoning to back up your suggestions.
|-----
| Acceptability
| Consider the concept acceptability and its disaggregation into acceptability of premises and acceptability of process. Which attributes and sub-attributes of variables and assessments does acceptability relate to? Now choose one variable that was developed in the Tuesday and Wednesday exercises. What can you say about its acceptability given the scope of the variable? Then consider the case assessment. What can you say about its acceptability given the scope of the assessment? What aspects are most crucial regarding the acceptability of the assessment? Give suggestions for improving the acceptability of the assessment. Think of ways in which assessments are or could be evaluated against these properties in practical terms. Prepare a presentation including a brief introduction to the concept and why it is an important aspect of assessment performance, an evaluation of the acceptability of the variable and the assessment, practical means of evaluation against these properties, and suggestions for improving the acceptability, as well as reasoning to back up your suggestions.
|-----
| Efficiency
| Consider the concept efficiency. How is efficiency dependent on quality of content and applicability? What else influences the efficiency of assessments? Now explore the case assessment, how it is designed and being executed, and think about its efficiency. If you were responsible for making the assessment and provided with scarce resources, what would you do in order to get the best outcome? What if you were responsible for making a somewhat related and partially overlapping assessment, e.g. [[Bioher]], [[Claih]], TAPAS or some other, again with scarce resources: what would you tell the manager of ET-CL in order to maximize the efficiency of your assessment? What if you were responsible for a group of overlapping assessments, say ET-CL, [[Claih]] and [[Bioher]]: would that change your management decisions? Think of ways in which assessments are or could be evaluated against these properties in practical terms. Prepare a presentation including a brief introduction to the concept and why it is an important aspect of assessment performance, practical means of evaluation against these properties, an explanation of how you as an assessment manager would try to maximize efficiency in the above-mentioned situations, as well as reasoning to back up your hypothetical management decisions.
|-----
| Peer review
| Acquaint yourself with how peer review is defined in the context of open assessment. Consider its similarities with, and deviations from, your prior perception of what peer review is. Which attributes of variables and assessments does peer review (in open assessment) address, and how? Which properties of good assessments does peer review relate to? Take the position of a peer reviewer and review one variable and the case assessment. Would you accept them? Prepare a presentation including a brief introduction to the method and how it is used in evaluating assessment performance.
|}
Latest revision as of 12:02, 26 March 2009. This page is a lecture. The page identifier is Op_en2082.