16 Jun 1997 ............... Length about 1600 words (11000 bytes).
This is a WWW version of a document. You may copy it.
Developing evaluation for MANTCHI
A Technical Report
by
Stephen W. Draper
Department of Psychology
University of Glasgow
Glasgow G12 8QQ U.K.
email: steve@psy.gla.ac.uk
WWW URL: http://www.psy.gla.ac.uk/~steve
This is an interim note, written in May 1997, on the evaluation methods
being developed as part of MANTCHI. That version was distributed to the
external assessor, and has since been revised as a working account of our
evaluation method ideas.
Part of what we proposed in MANTCHI was to do a significant amount of
evaluation of the effects on learners of interventions related to the project.
As the project plans to use the MANs to deliver tutorial material, we began by
studying some existing tutorial delivery to develop our evaluation methods and
to provide a basis for later comparisons with new interventions during the
project.
This was possible as we could appoint an experienced evaluator, Margaret Brown,
from the start of the project in early 1997. She has studied parts of the
tutorial provision in at least 4 of the HCI courses delivered in the period
January-March 1997. (Two sizeable draft evaluation reports are being
circulated.)
Our aim is to develop a method for evaluating tutorial provision of all
kinds, including innovations developed during the project; to apply the
method; and to report findings from this application. We also expect this
work to illuminate the general issues of tutorial provision, at least in
the teaching we do.
Our definition of tutorial provision is very broad: it excludes primary
exposition such as lectures, and probably lab. classes, but includes all
other kinds of feedback
and other input to students from the teaching staff. Since the project
concerns possible automation of tutorial provision, we obviously do not want to
define "tutorial" in terms of human contact hours. Similarly, there is no
strong prior agreement about the amount of tutorial provision necessary, so we
need to study all the channels and needs that may underlie the wide variations
in provision commonly observed.
Our starting point was the method of Integrative Evaluation (Draper et
al. 1996) developed during the TILT project. This features open-ended
observation in classrooms, and also pre- and post-tests based on learning
objectives elicited from the teachers.
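By way of illustration only (the objectives and scores below are invented
for the example, not drawn from our studies), the pre/post comparison is
made per learning objective:

    # Hypothetical mean quiz scores (proportion correct) per objective.
    pre  = {"theory of action": 0.3, "heuristic evaluation": 0.5}
    post = {"theory of action": 0.7, "heuristic evaluation": 0.6}

    for objective in pre:
        gain = post[objective] - pre[objective]
        print(f"{objective}: pre {pre[objective]:.1f} -> "
              f"post {post[objective]:.1f}, gain {gain:+.1f}")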
In Integrative Evaluation, one of our first steps is to elicit explicit
learning objectives from teachers. In MANTCHI, we are attempting to elicit
from teachers a rationale for the tutorial provision they make: a "tutorial
provision rationale" (TPR). (Two are appended.) Not only are teachers in HE
currently even less practised at articulating TPRs than learning objectives
for courses, but TPRs seem to raise
different issues. Teachers often emphasise aims concerning the experiences
they want students to have, rather than testable learning objectives. We
should probably therefore try to devise suitable tests to indicate how well
these aims are being achieved (e.g. questions about attitudes or experience).
We will write up the TPRs in a separate TR.
In reviewing the first drafts of the evaluation studies so far, a number
of new issues about how we should do evaluation have already emerged.
One issue is having to test the success of 3 things:
Learning objectives
Learning aims
"Occasions"
As mentioned, general learning aims seem important to the HCI teachers
sampled so far. That is, instead of saying things such as "have a detailed
knowledge of what the Theory of action is, and be able to apply it" (a typical
examinable learning objective), they may say "be able to adopt a
user-centred perspective on any new situation". Could we devise (with a lot
of help from the teacher concerned) a short-answer quiz question that would
nevertheless be
diagnostic of such aims? Teachers are sceptical, but it seems worth trying.
After all, they say that they can typically tell from the kinds of things
students say to them whether or not that student has "got it". One approach
will be questions on their attitude to the subject or to a specific problem.
We used to evaluate mainly against the teachers' learning objectives.
As discussed above, broader aims seem too important to omit in tutorial
provision, and we need to evaluate against these as well. Furthermore, some of
the occasions we have already observed in MANTCHI seem to require us to
distinguish learning outcomes from a notion of whether the occasion or activity
was itself a "success". The use of Superjanet videoconferencing on one
occasion was an important case. The conference (like, in fact, a lecture or
tutorial) is a social and, in a sense, theatrical occasion. If it is badly
stage-managed it can feel like a failure, an aimless occasion; yet this may be
independent of the learning outcomes. (A lecture that feels aimless may
nevertheless turn out to make you think about the subject.)
Another example would be how I felt after many of my PhD supervisions with my
supervisor: I would feel depressed and as if it hadn't been productive; but
next morning I would wake up thinking about different issues, i.e. in fact
I had learned from these occasions.
We can test for this with questions such as "Did it feel a success?", "Can
you say what the point of the occasion was?", and "What did you learn?"
This issue of occasion may be related to:
- The stage-management aspect of teaching.
- A sense of closure at the end: telling students what they learned and why.
- The factor for "organised" in course feedback questionnaires.
- The panic lecturers often feel about managing the occasion: being there,
not having a riot, not running out of things to say, being unable to answer
some question.
- An occasion is a mathemagenic activity. (Teachers have to learn how to
"do" each type of activity, i.e. how to manage each type of occasion.)
Laurillard (1993, p.103) offers a model of the learning and teaching
process. One of its features is the idea that any academic subject has two
levels: that of public, formal, conceptual description, and that of personal
experience (typically "taught" in labs). I argue (Draper 1997) that there is a
third important "layer": that of managing or administering the process. If you
analyse the content of questions from students (whether in class, in
tutorials, or over email), a large proportion are about practical
administration: when an
assignment is due, what the question means, where the next lecture will be. It
is likely that a key function of tutorials (that mustn't be lost if they are
automated by internet delivery of some kind) is to deliver such information.
We are moving towards asking, in every tutorial situation we evaluate, what
proportion of the time or of the exchanges is to do with the management
layer, and what proportion with the actual subject matter.
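As a minimal illustration of what such coding might yield (the category
labels and counts below are invented for the example, not taken from our
observations), the proportion is simply a tally over the coded exchanges:

    from collections import Counter

    # Hypothetical coding of the exchanges in one observed tutorial:
    # "management" = practical administration (deadlines, locations, what
    # is wanted); "content" = the actual subject matter.
    exchanges = ["management", "content", "management", "management",
                 "content", "management", "content", "management"]

    tally = Counter(exchanges)
    total = sum(tally.values())
    for category, count in tally.items():
        print(f"{category}: {count}/{total} = {count / total:.0%}")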
Intuitively, tutorials are there to provide students with "feedback".
Laurillard's model emphasises iteration: one activity is the teacher giving an
exposition (e.g. a lecture), a second is the student re-expressing this (e.g.
an essay), a third is the teacher commenting on this (e.g. marks and comments
on the essay), and a fourth is the learner having another go at expression.
However, many learner-teacher exchanges, particularly in tutorials, are on
a much shorter timescale than this: asking for clarification on what is
wanted for one
of those major activities (e.g. asking what kind of essay is required, what the
teacher meant by this comment on the last assignment). Although related to the
point above about a management layer in the model, some of this minor (short
timescale) feedback is about conceptual content. It is probably worth
distinguishing major and minor feedback, particularly if it is true that it is
sensitive to timescale, since some alternatives to face-to-face tutorials will
have much longer round-trip response times.
We thus have 3 types of criteria against which to evaluate: aims,
objectives, activities/occasions. In addition, we have already seen cases
where different participants have markedly different views on each of these.
Thus we should make a systematic effort (as we have not before) to elicit such
criteria from the students as well as from the teachers. For example, students
may learn something they value even if the teacher did not mention this as an
objective. In addition to the teachers' and learners' views, we as evaluators
may sometimes have a third perspective. For instance, the video conference
felt like a failure to both teachers and learners, but as an observer I
felt that I was learning about what the limitations and prerequisites for
conferencing are, and that the students probably learned that too.
We thus now have 3 by 3 = 9 types of criteria to consider in evaluating
particular "tutorial" activities.
At least in principle, we should elicit criteria from not 1 but 3 kinds
of stakeholder (teachers, learners, evaluators).
For each, use the TPR as the instrument.
At/after each occasion, as well as possibly measuring learning objectives
by, say, a quiz, measure:
- Aims
- Success of the occasion (management)
- Content as conceptual vs. admin.
- What actually got learned, as opposed to what was supposed to be learned,
i.e. OEO of unexpected benefits.
- Short and long timescales to feedback loops.
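Putting this together, a sketch of the record we might keep for each
occasion (the structure and field names below are our own invention, for
illustration only) would cover the 9 stakeholder-by-criterion cells plus
the measures just listed:

    from dataclasses import dataclass, field

    STAKEHOLDERS = ("teacher", "learner", "evaluator")
    CRITERIA = ("aims", "objectives", "occasion")   # 3 by 3 = 9 cells

    @dataclass
    class OccasionRecord:
        # One evaluated "tutorial" occasion (hypothetical structure).
        occasion: str
        judgements: dict = field(default_factory=dict)   # the 9 cells
        admin_proportion: float = 0.0    # conceptual vs. admin content
        unexpected_benefits: list = field(default_factory=list)

        def judge(self, stakeholder, criterion, note):
            assert stakeholder in STAKEHOLDERS and criterion in CRITERIA
            self.judgements[(stakeholder, criterion)] = note

    record = OccasionRecord("SuperJANET videoconference")
    record.judge("evaluator", "occasion",
                 "felt a failure, yet revealed prerequisites for conferencing")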
Draper, S.W., Brown, M.I., Henderson, F.P. & McAteer, E. (1996)
"Integrative evaluation: an emerging role for classroom studies of CAL"
Computers and Education vol.26, no.1-3, pp.17-32; also in Kibby, M.R. &
Hartley, J.R. (eds.) Computer assisted learning: selected contributions
from the CAL 95 symposium (Pergamon: Oxford) pp.17-32. Web version:
http://www.psy.gla.ac.uk/~steve/IE.html

Draper, S.W. (1997) "Adding (negotiated) management to models of learning
and teaching??" Itforum (email list: invited paper), published 18 April
1997. Web version: http://www.psy.gla.ac.uk/~steve/TLP.management.html

Laurillard, D. (1993) Rethinking university teaching: A framework for the
effective use of educational technology (Routledge: London)