16 Jun 1997 ............... Length about 1600 words (11000 bytes).
This is a WWW version of a document. You may copy it.

Developing evaluation for MANTCHI

A Technical Report
Stephen W. Draper
Department of Psychology
University of Glasgow
Glasgow G12 8QQ U.K.
email: steve@psy.gla.ac.uk
WWW URL: http://www.psy.gla.ac.uk/~steve


This is an interim note, written in May 1997, on the evaluation methods being developed as part of MANTCHI. That version was distributed to the external assessor; it has since been revised as a working account of our ideas on evaluation method.


Part of what we proposed in MANTCHI was to do a significant amount of evaluation of the effects on learners of interventions related to the project. As the project plans to use the MANs to deliver tutorial material, we began by studying some existing tutorial delivery to develop our evaluation methods and to provide a basis for later comparisons with new interventions during the project.

This was possible as we could appoint an experienced evaluator, Margaret Brown, from the start of the project in early 1997. She has studied parts of the tutorial provision in at least 4 of the HCI courses delivered in the period January-March 1997. (Two sizeable draft evaluation reports are being circulated.)

Our aim is to develop a method for evaluating tutorial provision of all kinds, including innovations developed during the project; to apply the method; and to report findings from this application. We also expect this work to illuminate the general issues of tutorial provision, at least in the teaching we do.

What is meant by "tutorial"

Our definition is very broad: it excludes primary exposition such as lectures, and probably lab. classes, but includes all other kinds of feedback and other input to students from the teaching staff. Since the project concerns possible automation of tutorial provision, we obviously do not want to define "tutorial" in terms of human contact hours. Similarly, there is no strong prior agreement about the amount of tutorial provision necessary, so we need to study all the channels and needs that may underlie the wide variations in provision commonly observed.

Our starting point for an evaluation method

Our starting point was the method of Integrative Evaluation (Draper et al. 1996) developed during the TILT project. This features open-ended observation in classrooms, and also pre- and post-tests based on learning objectives elicited from the teachers.

Tutorial provision rationales (TPR)

In integrative evaluation, one of our first steps is to elicit explicit learning objectives from teachers. In MANTCHI, we are attempting to elicit from teachers a rationale for the tutorial provision they make. (Two are appended.) Not only are teachers in HE currently even less practised at articulating TPRs than learning objectives for courses, but they seem to raise different issues. Teachers often emphasise aims concerning the experiences they want students to have, rather than testable learning objectives. We should probably therefore try to devise suitable tests to indicate how well these aims are being achieved (e.g. questions about attitudes or experience).

TPRs will be written up in a separate technical report.

New ideas on evaluation method

In reviewing the first drafts of the evaluation studies so far, a number of new issues about how we should do evaluation have already emerged.

One issue is that we now have to test the success of three things:
Learning objectives
Learning aims
The occasion or activity itself

1. Testing learning aims

As mentioned, general learning aims seem important to the HCI teachers sampled so far. That is, instead of saying things such as "have a detailed knowledge of what the Theory of action is, and be able to apply it" (a typical examinable learning objective), they may say "be able to adopt a user-centered perspective on any new situation". Could we devise (with a lot of help from the teacher concerned) a short answer quiz question that would nevertheless be diagnostic of such aims? Teachers are sceptical, but it seems worth trying. After all, they say that they can typically tell from the kinds of things students say to them whether or not that student has "got it". One approach will be questions on their attitude to the subject or to a specific problem.

2. Evaluate against the success of the "occasion"

We used to evaluate mainly against the teachers' learning objectives. As discussed above, broader aims seem too important to omit in tutorial provision, and we need to evaluate against these as well. Furthermore, some of the occasions we have already observed in MANTCHI seem to require us to distinguish learning outcomes from a notion of whether the occasion or activity was itself a "success". The use of Superjanet videoconferencing on one occasion was an important case. The conference (like, in fact, a lecture or tutorial) is a social and in a sense theatrical occasion. If it is badly stage-managed it can feel like a failure, an aimless occasion; yet this may be independent of the learning outcomes. (A lecture that feels aimless may nevertheless turn out to make you think about the subject.)
Another example would be how I felt after many of my PhD supervisions with my supervisor: I would feel depressed and as if the session had not been productive; but next morning I would wake up thinking about different issues, i.e. I had in fact learned from these occasions.

We can test for this with questions such as "Did it feel a success?", "Can you say what the point of the occasion was?", and "What did you learn?"

This issue of occasion is or may be related to:

3. Learning Management/admin

Laurillard (1993, p.103) offers a model of the learning and teaching process. One of its features is the idea that any academic subject has two levels: that of public, formal, conceptual description, and that of personal experience (typically "taught" in labs). I argue (Draper 1997) that there is a third important "layer": that of managing or administering the process. If you analyse the content of questions from students (whether in class, in tutorials, or over email), a large proportion are about practical administration: when an assignment is due, what the question means, where the next lecture will be. It is likely that a key function of tutorials (that mustn't be lost if they are automated by internet delivery of some kind) is to deliver such information. We are moving towards asking, in every tutorial situation we evaluate, what proportion of the time or of exchanges concerns the management layer, and what proportion the actual subject matter.
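As a toy sketch of the proposed measure (the coded exchange labels below are invented for illustration; only the idea of classifying exchanges and computing the management-layer proportion comes from the text):

```python
from collections import Counter

# Hypothetical coding of the exchanges observed in one tutorial:
# each exchange is classified as belonging to the "management" layer
# (deadlines, what the question means, room changes) or to "content".
exchanges = ["management", "content", "management", "content",
             "content", "management", "content", "content"]

counts = Counter(exchanges)
management_proportion = counts["management"] / len(exchanges)
print(f"management layer: {management_proportion:.0%} of exchanges")
```

In practice the coding of each exchange would of course be done by the observer, not invented; the calculation itself is trivial once the coding exists.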

4. Major and minor feedback

Intuitively, tutorials are there to provide students with "feedback". Laurillard's model emphasises iteration: one activity is the teacher giving an exposition (e.g. a lecture), a second is the student re-expressing this (e.g. an essay), a third is the teacher commenting on this (e.g. marks and comments on the essay), and a fourth is the learner having another go at expression. However, many learner-teacher exchanges, particularly in tutorials, are on a much shorter timescale than this: asking for clarification on what is wanted for one of those major activities (e.g. asking what kind of essay is required, or what the teacher meant by a comment on the last assignment). Although related to the point above about a management layer in the model, some of this minor (short timescale) feedback is about conceptual content. It is probably worth distinguishing major and minor feedback, particularly if feedback is sensitive to timescale, since some alternatives to face-to-face tutorials will have much longer round-trip response times.

5. Elicit aims etc. from students as well as teachers

We thus have 3 types of criteria against which to evaluate: aims, objectives, and activities/occasions. In addition, we have already seen cases where different participants have markedly different views on each of these. Thus we should make a systematic effort (as we have not before) to elicit such criteria from the students as well as from the teachers. For example, students may learn something they value even if the teacher did not mention this as an objective. In addition to the teachers' and learners' views, we as evaluators may sometimes have a third perspective. For instance, the video conference felt like a failure to both teachers and learners, but as an observer I felt that I was learning what the limitations and pre-requisites for conferencing are, and that the students probably learned that too.

We thus now have 3 by 3 = 9 types of criteria to consider in evaluating particular "tutorial" activities.

Summary: new methods/instruments

At least in principle, we should elicit criteria from not one but three kinds of stakeholder (teachers, learners, evaluators).
For each, use the TPR as the instrument.

At or after each occasion, as well as possibly measuring learning objectives by, say, a quiz, measure:
whether the occasion itself felt a success to the participants;
how far the broader learning aims were met (e.g. by questions on attitudes or experience);
what proportion of the exchanges concerned the management layer rather than the subject matter;
the balance of major and minor feedback.

Draper,S.W., Brown, M.I., Henderson,F.P. & McAteer,E. (1996) "Integrative evaluation: an emerging role for classroom studies of CAL" Computers and Education vol.26 no.1-3, pp.17-32 and Computer assisted learning: selected contributions from the CAL 95 symposium Kibby,M.R. & Hartley,J.R. (eds.) (Pergamon: Oxford) pp.17-32. And a web document URL: http://www.psy.gla.ac.uk/~steve/IE.html

Draper, S.W. (1997) "Adding (negotiated) management to models of learning and teaching??" Itforum (email list: invited paper) Published 18 April 1997. And a web document URL: http://www.psy.gla.ac.uk/~steve/TLP.management.html

Laurillard, D. (1993) Rethinking University Teaching: A framework for the effective use of educational technology (Routledge: London)