Last changed 6 Apr 1998. Length about 2,000 words (13,000 bytes).
This is a WWW document maintained by Steve Draper, installed at http://www.psy.gla.ac.uk/~steve/miraplans.html.
You may copy it.
MIRA plans
by
Stephen W. Draper
Started April 1998.
This sketches plans for the last two MIRA workshops, and other things.
These people might be invited. They need to be cleared with the EC, and
then written to with the invitation (to get them to put it in their diaries).
Miche (not at City)
Victoria (at City; should be encouraged to come?)
Clare Harvey
Photo woman (GETI? company)
Stefano Mizzaro
Annelise
Josianne? (has now come three times, but is not an official member)
Pia Borlund
Iain Campbell (to talk about his program)
Smeaton's yankee
Date and call for papers for the final conference; in fact, therefore,
decisions about its format and themes are probably needed too.
Dates, hence length, hence content of the Dublin workshop.
Fund a 4th working group (me and Dunlop).
Whether to spend more or less on inviting people.
Last workshop week: Oct. 1998, Dublin. (?Wed-Fri 28-30 Oct 1998)
Schedule largely set by a) reports from the working subgroups and b) rehearsal
of activities for the final workshop/conference.
- A. Reports from the working groups. 1 hour each? Certainly a report; perhaps
ask for a trial interactive exercise.
A1 Joemon / Finland: photo retrieval domain study
A2 Fabio: relevance consensus tests
A3 Miche: designing an MMTC proposal
A4 Dunlop&Draper: start work on a demo illustrating the onion evaluation
model w.r.t. versions of a single piece of IR software. Do evaluation tests,
and save the videos of users failing.
- B. CVR: 1 hour session: planning for the final conference. Presumably the
aims are to get ideas, promises of preparatory work, and consensus within
MIRA. Might not be necessary if enough of the planning below looks acceptable
and/or can be agreed over email within MIRA.
- C. Sessions prompting people to write papers for the final conference.
?Perhaps get MIRA people each to come up with an idea and present a 10 min.
outline?
- D. A debate / panel on interactive TREC. To what extent do we think it is
adequate? Get Smeaton's Yank to present what it is, and perhaps defend it.
Dunlop&Draper, if no-one else, could concoct an "against" position.
- E. 30 mins on senses of information need. E.g. a 10 min. talk by Pia
Borlund; possibly ditto by Stefano; a demo (examples? exercise?) by
Dunlop&Draper to illustrate all this.
E2 ?Pair this with a similar session on senses of relevance judgement? Try to
balance: an abstract statement, one or more good examples, and interactive
exercises e.g. to show the degree of consensus. And relate it forward to
possible implications for an MMTC.
- F1. 30 min. session, plus available as a demo over the evenings etc.
Get Iain Campbell to give a talk on his software. Its importance is that it is
a) image retrieval and b) a radical no-query, all-relevance-feedback
technique. Not just interesting as an approach in itself: it of course also
invalidates the old TC tests as an evaluation approach.
F2 30 min. talk/session. Clare Harvey: we missed her this time. Continue our
education on evaluation techniques.
F3 30 min. talk/session. Invite Dunlop's GETY woman to re-give her talk on a
particular domain of image retrieval.
Final one: Easter 1999, on an island near Glasgow. [Cumbrae; Arran;
Bute; Orkney? The Trossachs and a cruise on Loch Katrine?]
It was decided to have 2 half-weeks (one for IRSG), back to back with a free
weekend in between.
Two aims:
- 1. Dissemination. Any gesture, particularly one with a wider audience, would
do to make it look as if we are disseminating. If, however, we really want to
change minds, then two things are needed:
1a Getting a range of influential yet susceptible attendees
1b Putting on exercises and demos rather than just a few monologues to get the
key points across.
- 2. Motivate wider attendance: this probably means a conference and
publication format. However, some of the topics proposed below would probably
motivate DARPA officials to attend if we notify them, as they relate to what
could be a sensible way forward for testbeds in IR.
- A. Paper sessions. Could go for parallel sessions: more time and more
discussion for each speaker. Or could go for a high-discussion format:
speakers warned to design a 30 min. session led by only a 5 min. monologue ...
A2 [NONE] It would be entirely possible to invite and publish papers without
having the author give a talk at all.
- B. Possible gimmick. Video all sessions; get revelation to mount this video
archive ASAP on the web, both for its own sake for the IR community and as a
multimedia test set (a modern counterpart to a collection of academic papers).
But video multiple copies live (e.g. one super-VHS for the records plus a
chain of 4 domestic VHS recorders); have the tapes instantly available during
the workshop for replay, for viewing the sessions you missed, and for
continuing the discussion; give one of the tapes to each speaker as a leaving
present.
- C. Demos / exercises. Have one from each of our working groups:
C1 Fabio (also Yves/Fermi??): image retrieval relevance tests. The exercise
would be to get every attendee to rate each of a set of test images for
relevance, then pool the answers and show the degree of (lack of) consensus.
(A toy sketch of this pooling step appears after this list.)
C2 Joemon??
C3 Miche: proposal for an MMTC. At the least, some example simulated
information needs, which participants take and then use a test program to try
to satisfy; followed by scoring of the results using previously obtained
domain-user relevance judgements?
C4. Dunlop&Draper: a demo illustrating the onion model of evaluation with
reference to a piece of IR software. For each onion layer, identify at least
one specific IR design issue, and show 2 or more variants of the software with
and without specific support for it. For instance: a) Raya's study mentioned
that her pupils failed to recognise a search engine form/box on screen if it
wasn't labelled in a way they could recognise; b) failing to have the software
highlight, and auto-scroll documents to, the words matching the query has
often been reported to result in users failing to recognise a selected
document as in fact useful to them.
*C5. Re-run Annelise's exercise. Seems a pity to waste all the work that went
into it. This would mean inviting her. Would it be too heavyweight for the
conference? Still, also impressive as a demo of what serious evaluation could
be like.
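
As promised under C1, here is a toy code sketch of that pooling step: a
minimal illustration only, not anything planned or written within MIRA. The
function name, the scoring rule (agreement with the majority vote), and the
example ratings are all invented for illustration.

from collections import Counter

# Toy sketch for exercise C1: each attendee rates each test image as
# relevant (1) or not relevant (0); consensus per image is the fraction
# of judgements agreeing with the majority (1.0 = unanimous).
def consensus_scores(ratings):
    """ratings: dict mapping image id -> list of 0/1 judgements."""
    scores = {}
    for image, judgements in ratings.items():
        majority_count = Counter(judgements).most_common(1)[0][1]
        scores[image] = majority_count / len(judgements)
    return scores

# Invented example: three attendees, three test images.
ratings = {
    "image_A": [1, 1, 1],  # unanimous    -> 1.00
    "image_B": [1, 0, 1],  # split 2 vs 1 -> 0.67
    "image_C": [1, 0, 0],  # split 2 vs 1 -> 0.67
}
for image, score in sorted(consensus_scores(ratings).items()):
    print(f"{image}: {score:.2f}")

Low scores across many images would be exactly the "(lack of) consensus" the
exercise is meant to expose.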
- D. Panels (debates). Something you can get out of workshops and conferences
which you can't usually get from journals is the opinions of experienced
researchers about issues they don't research themselves (particularly, their
reasons for NOT working on those issues). Panels allow the organisers to
dictate the question and elicit interesting views. Some sample panel topics
[this is only a very preliminary list]:
- Interactive TREC / non-interactive TREC: useless? ...
- What kind of relevance should we measure? (Mizzaro lists dozens of kinds)
- Interactive system performance depends on the work context, so no standard
test bed is of any use or generality.
- All kinds / levels of evaluation are needed. Each panellist speaks briefly
about a different level. [This could be a talk session rather than a debate;
but might be a good way to drive home our point about the diversity of
approaches actually needed if IR evaluation is to get real.]