Last changed 29 Sep 2002 ............... Length about 900 words (6000 bytes).
This is a WWW document maintained by Steve Draper, installed at

Web site logical path: [] [~steve] [grumps] [this page]

Grumps Client needs

Grumps clients are pursuing REDDIs (rapidly evolving data-driven investigations). These depend upon two kinds of software (besides whatever software is being studied):

For a Grumps client to carry out a REDDI, investigating data for some purpose of their own, they need to cover many other functions as well. There are basically three ways of covering each of these functions: by human expertise the client already has (in addition to their domain knowledge), by human expertise supplied by Grumps as additional support, or by pre-fabricated solutions, already created and accumulated, that clients can re-use. (For instance, clients using the level 1 computing lab data can probably share some data retrieval and data cleaning work; the display software written to support Quintin's project might turn out to be a default skeleton other clients could use; external data mining software packages could turn out to help with some of these functions.)

In many cases these types of support or expertise may not require much effort as measured in expert-person-hours, yet unless every one of them is supplied in one way or another, a REDDI will probably wither. The speed of turnround on these calls on expertise may also matter, since REDDIs are meant to be highly iterative. The types are:

  1. Maintenance / installation of the Grumps software. Within the lifetime of the project, it is unlikely ever to be so robust and well engineered that it can be used without any help.
  2. Database management. The Grumps software can typically produce a big flood of data, to be collected in some sizeable relational database. Someone has to decide on a machine to host this, set it up, and do "sysadmin" type management of it.
  3. A client to drive the investigation, with an active goal or theory or other purpose for using the data. At the least it is from them that decisions flow about whether that purpose is achieved already, or if not then in what way not.
  4. Someone to direct collection: decide on what data to collect. Typically also the client.
  5. A data retriever: someone who can write (and debug) successful SQL expressions to extract data of interest to the investigation from the repository.
  6. A data cleaner: someone who writes the code to clean up the data for the investigation. Essentially the same skills as the retriever, but the function of cleaning is likely to take much greater amounts of time. On the other hand, data cleaning actions may be re-usable across miniprojects to a useful extent.
  7. A software writer for producing displays of the (retrieved and cleaned) data. Without a means of presenting results effectively to the investigators, nothing is achieved.
  8. A statistician: turning a hypothesis from a hunch into an English spec., and then the spec. into a precise question that can be asked of the data and numbers, requires at least some statistical expertise.
  9. A user liaison or RA: in many cases, the human users being studied will need to be "handled", from informing and persuading them, to collecting consents, to conducting controlled experiments where some users do prescribed tasks.
  10. Running calibration experiments. There is often a need to run mini-experiments with the data collection on, where you (or users taking your instructions for a set task) do specific actions, and then you look at the generated data to find, or validate, associations between human actions and patterns in the data. This may be used to adjust the collector, or a filter, or a data cleaning operation.
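As a rough illustration of the retrieval (5) and cleaning (6) functions above, here is a minimal sketch in Python using an in-memory SQLite database. The events table, its columns, and the sample records are invented for illustration only; they are not the actual Grumps schema.

```python
import sqlite3
from datetime import datetime

# Hypothetical event table; the real Grumps schema will differ.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (user_id TEXT, timestamp TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("u1", "2002-09-29 10:00:00", "login"),
     ("u1", "2002-09-29 10:00:00", "login"),   # duplicate record
     ("u2", "not-a-timestamp", "login"),       # malformed timestamp
     ("u2", "2002-09-29 10:05:00", "login")])

# Retrieval (function 5): an SQL expression extracting the events of
# interest to the investigation.
rows = conn.execute(
    "SELECT user_id, timestamp, action FROM events WHERE action = 'login'"
).fetchall()

# Cleaning (function 6): drop records with malformed timestamps,
# then remove exact duplicates.
def is_valid_ts(ts):
    try:
        datetime.strptime(ts, "%Y-%m-%d %H:%M:%S")
        return True
    except ValueError:
        return False

clean = sorted({r for r in rows if is_valid_ts(r[1])})
print(clean)
```

In practice the cleaning step dominates the effort, as noted above, but steps of this kind (timestamp validation, de-duplication) are exactly the ones that could be accumulated and re-used across miniprojects.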
