28 Sep 2002 ............... Length about 900 words (6000 bytes).
This is a WWW document maintained by
Steve Draper, installed at http://www.psy.gla.ac.uk/~steve/grumps/minips.html.
List of miniprojects for Grumps
This is my view of the list of REDDI (rapidly evolving data-driven
investigation) miniprojects / client projects, as of the date at the top of
this page, which are now in prospect for Grumps. They are roughly in order of
definiteness, i.e. with the remoter hopes towards the end.
- Richard Thomas has some specific hypotheses about how to explain
keystroke time distributions. He plans to test them on the level1 DCS lab
data to be collected starting Oct 2002.
- Quintin and Steve: feeding back to individual students some statistics on
their own data. The software to do this was created in an IT project, and so
already exists; adding or changing the statistics displayed should be fairly
easy. There are two motives for this miniproject: a) to return something to
the students for agreeing to give data; b) to develop its possible use in
promoting "reflection" by the students on their own activity, and learning on
the course.
- Steve and Quintin: collecting data from the PRS lecture theatre handsets
across uses, and developing their use. Part of this will be collecting and
presenting records of use. Part will be linking the use in level 1 lectures
to other data on the same set of students.
- Rebecca Black's student project will be looking at various variables in
the level 1 DCS class, and testing whether they predict exam performance.
This is not using Grumps data collection (mostly) BUT is an important example
of a REDDI: fishing in a big pool of data.
- Gregor is interested in looking for misconceptions, and in the
hypothesised cumulative nature of the course material, where once a student
has fallen off the bandwagon they don't recover.
- Gregor is interested in relating type of motivation or "goal"
(basically, for interest vs. to get the qualification) to persistence in the
course.
- Karen Renaud is interested in studying interruptions in the level 1 lab.
- Dag at Simula in Norway has expressed strong interest and may use Grumps
in a future experiment there.
- We plan to select a single Bioinformatics tool with a substantial user
community, implemented in Java. We hope to develop further both a generic
"bytecode" tool for extracting data, and a particular implementation for the
selected tool; and then collect data from its user community. The "client" may
be the developer/maintainer of the tool, although it would be feasible to act
as clients ourselves, collecting data for rather generic user tool design
purposes i.e. measuring what parts are used, relative command frequencies,
common sequences of commands.
- It would be good for Grumps if we persuaded Computing Services to act as
clients, and used Grumps to collect usage data in CSCE clusters with a view to
informing students about current and predicted usage in the clusters.
- Cara McNish from UWA may be interested in using Grumps in connection with
her software for automatic submission of assignments.
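The generic usage analysis mentioned for the Bioinformatics tool (relative
command frequencies, common sequences of commands) could be sketched roughly
as follows. This is a minimal illustration only, assuming a hypothetical log
that has already been reduced to an ordered list of command names; the real
extraction tool and log format are still to be designed.

```python
from collections import Counter

def usage_stats(events):
    """From an ordered list of command-name strings (a hypothetical
    reduced usage log), compute relative command frequencies and the
    most common consecutive command pairs."""
    freq = Counter(events)
    total = sum(freq.values())
    # Relative frequency of each command over the whole session.
    rel_freq = {cmd: n / total for cmd, n in freq.items()}
    # Common sequences, here just pairs of consecutive commands.
    bigrams = Counter(zip(events, events[1:]))
    return rel_freq, bigrams.most_common(3)

# Invented example data, purely for illustration:
events = ["open", "align", "save", "open", "align", "align", "save"]
rel, common = usage_stats(events)
```

Longer command sequences could be found the same way by counting n-tuples
instead of pairs; the point is only that such questions reduce to simple
counting over the collected event stream.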
Range of investigation types
We may be able to illustrate a range of types of investigation.
One type of variation is in the sense of "data mining": whether an
investigation defines a set of data and goes "fishing" by induction, seeking
patterns that simply emerge, versus coming with a specific hypothesis about
what pattern to look for and testing for that. Induction is likely to require
more data.
Another type of variation is in the number of kinds of data combined.
We may be able to illustrate this, and furthermore its importance in some
cases for getting meaning out of the data. Grumps itself just gives software
usage or user input data. We may also use: PRS handset data, stills
from CCTV, human observation to produce hypotheses which are then tested (as
Kate Gilmore did), deliberate artificial testing of users to produce reference
or calibration data, and other databases on the same users (e.g. class exam
results).
Another type of variation is in the bottlenecks. One bottleneck can be
statistics: finding the right way to express a question in terms of
(statistical) functions of the data.
Another bottleneck can be artificial testing or calibration. Whether
designing the extraction tool, or designing a data filter, or finding what
indicators in the data correspond to a particular meaning (such as the user
switching tasks), a key method is likely to be active experimentation where
someone imitates a user to generate data of known meaning, while the resulting
data is watched (preferably in real time) to establish correspondences, test
data cleaning validity, etc.
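The calibration idea above, where someone imitates a user to generate data of
known meaning, amounts to comparing a scripted action sequence against what
the logger actually recorded. A minimal sketch, with invented event names and
a deliberately simple notion of "match":

```python
def calibration_report(scripted, logged):
    """Compare a scripted sequence of known user actions against the
    events the logger recorded, to test data-cleaning validity.
    Both arguments are hypothetical lists of event-name strings."""
    # Actions we performed but the logger never recorded.
    missed = [a for a in scripted if a not in logged]
    # Events recorded that correspond to no scripted action.
    spurious = [e for e in logged if e not in scripted]
    return {"missed": missed,
            "spurious": spurious,
            "match": not missed and not spurious}

# Invented example: one spurious event slipped through the filter.
report = calibration_report(
    scripted=["click_menu", "type_text", "save"],
    logged=["click_menu", "type_text", "save", "timer_tick"])
```

A real calibration would also need to check timing and ordering, not just
presence, but even this crude comparison establishes the correspondences the
paragraph describes.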