Intelligent Tutoring Systems

Abstract:
This section gives an introductory overview of Intelligent Tutoring Systems (ITSs) and describes the tasks that the main modules of a general ITS have to fulfil. Learner modelling itself, with a special focus on language learning, is covered in the next subchapter.
Some parts of this section are based on Schulmeister, Rolf, 1997 and Beck, Joseph ; Stern, Mia et al., 1996. An introduction to ITSs can also be found in, e.g., chapter 6 of Schulmeister's book.
When Artificial Intelligence developed in the 1960s, AI techniques were adopted for ITSs almost immediately. One of the first systems considered an ITS was SCHOLAR, developed by J.R. Carbonell (Carbonell, Jaime R., 1970). Its aim was to teach South American geography. The system had a tutoring module that was able to infer the appropriate steps through the program from the student's responses. One important development here was the clear division between the inference engine and the domain knowledge. These two modules are already among the necessary components of an ITS. Usually one considers the following parts:
Figure 1: ITS-Modules

Domain Knowledge

The domain knowledge usually consists of declarative, procedural and sometimes heuristic knowledge. Research has basically assumed two models for the domain knowledge. The black-box model makes no claim about how humans draw inferences. The glass-box model, on the other hand, claims to reproduce intelligent reasoning and structuring of information. More on this topic can be found in, e.g., Goerz, Günther ; Rollinger, Claus-Rainer et al. (Ed.), 2000.
In an ICALL system (Intelligent Computer-Assisted Language Learning) the domain knowledge would mainly cover the linguistic knowledge needed to use a language. This might include a grammar and a lexicon; a corpus might also be added to this list.

Expert Model

The expert model is similar to the domain knowledge. It differs in that it models how an expert would use the knowledge contained in the domain knowledge, so it is more than a mere representation of knowledge. To assess a learner's solution, the system can compare it with the solution that the expert model produces from the same domain knowledge. Some ITS models leave out this part and consider it included in the domain knowledge.

Student Model (or learner model)

The learner model is built so that the system can refer to the learner's probable current knowledge. This information is used mainly by the pedagogical module. The system tries to determine the current knowledge either on the basis of a subset of the domain knowledge or from the deviation of the learner's performance from the expert's performance in the same situation (overlay model). Sometimes Bayesian networks are used to determine the student's knowledge probabilistically: each node in the network carries a probability indicating how likely it is that the student knows the corresponding item. During the learning process every input from the learner is analysed, and the learner's knowledge is inferred from it. This does not take general learning behaviour into account, as the student's actions can only be observed through "a narrow channel". It is sometimes difficult to assess whether the learner does not know something or whether a strategy different from the one the expert model suggests was used to acquire the knowledge.
The knowledge contained in the learner model can theoretically range from the very general and simple statement "the learner knows this domain" to a detailed listing of all the actions the learner has taken. Most systems lie in between, i.e. they use some general and some specific information to represent the learner's knowledge. They mostly use the same granularity as the domain knowledge base.
Self, John A. in: O'Shea, Tim ; Sgurev, Vasil (Ed.), 1988 identifies six groups of functions that a learner model can have.
To diagnose errors, most systems rely either on error libraries or on some method of machine learning. A particular problem is that learner errors tend to occur not singly but in compounds, which complicates the analysis. Usually some heuristics are developed to reduce the search space of hypotheses about the errors.
Schulmeister, Rolf, 1997 explicitly notes the lack of psychological principles for evaluating the psychological plausibility of a solution. He sees such principles as an important step towards reliable performance evaluation.
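The probabilistic estimate of the learner's knowledge described above can be illustrated with a minimal sketch. It is not taken from any of the cited systems: it keeps one probability per item of the domain knowledge and updates a single node with Bayes' rule, which is a strong simplification of a full Bayesian network. The names `OverlayLearnerModel` and `update_mastery`, the guess and slip parameters, and the example item names are all assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical parameters (not from the source text): the chance of a correct
# answer without knowing the item (guess) and of an error despite knowing it (slip).
P_GUESS = 0.2
P_SLIP = 0.1


def update_mastery(prior: float, correct: bool) -> float:
    """Bayes' rule for a single node: P(item known | observed answer)."""
    if correct:
        like_known, like_unknown = 1 - P_SLIP, P_GUESS
    else:
        like_known, like_unknown = P_SLIP, 1 - P_GUESS
    evidence = like_known * prior + like_unknown * (1 - prior)
    return like_known * prior / evidence


@dataclass
class OverlayLearnerModel:
    """One mastery estimate per item, at the granularity of the domain knowledge."""
    domain_items: list[str]
    mastery: dict[str, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Start every item at 0.5, i.e. no assumption either way.
        for item in self.domain_items:
            self.mastery.setdefault(item, 0.5)

    def observe(self, item: str, correct: bool) -> None:
        """Update the estimate for one item from one learner answer."""
        self.mastery[item] = update_mastery(self.mastery[item], correct)

    def probably_known(self, threshold: float = 0.8) -> set[str]:
        """Items whose estimate has reached the (assumed) threshold."""
        return {item for item, p in self.mastery.items() if p >= threshold}


model = OverlayLearnerModel(["dative_case", "verb_second"])
model.observe("dative_case", correct=True)
model.observe("verb_second", correct=False)
```

Because each answer is only a narrow observation, a single wrong answer lowers the estimate without setting it to zero; a real system would also need dependencies between items, which this sketch leaves out.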
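A minimal sketch of an error library may make the diagnosis step more concrete. The three rules, their names and the German example sentence are hypothetical and far simpler than what an actual ICALL system would need. The point is only that every matching rule is reported, since learner errors often occur in compounds.

```python
from typing import Callable, NamedTuple


class ErrorRule(NamedTuple):
    name: str
    matches: Callable[[str, str], bool]  # (learner_answer, expected_answer)


def capitalisation(got: str, exp: str) -> bool:
    # Same words when case is ignored, but at least one differs in case.
    return (set(got.split()) != set(exp.split())
            and set(got.lower().split()) == set(exp.lower().split()))


def missing_word(got: str, exp: str) -> bool:
    return len(got.split()) < len(exp.split())


def word_order(got: str, exp: str) -> bool:
    # Same words (ignoring case) in a different order.
    got_words, exp_words = got.lower().split(), exp.lower().split()
    return got_words != exp_words and sorted(got_words) == sorted(exp_words)


# Hypothetical, deliberately tiny error library for illustration only.
ERROR_LIBRARY = [
    ErrorRule("capitalisation", capitalisation),
    ErrorRule("missing_word", missing_word),
    ErrorRule("word_order", word_order),
]


def diagnose(learner_answer: str, expected_answer: str) -> list[str]:
    """Return the names of all rules that fire, in library order."""
    return [rule.name for rule in ERROR_LIBRARY
            if rule.matches(learner_answer, expected_answer)]


# A compound error: wrong capitalisation and wrong word order at once.
print(diagnose("ich habe gesehen den Hund", "Ich habe den Hund gesehen"))
```

With a larger library the number of candidate hypotheses grows quickly, which is where the heuristics for reducing the search space come in.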

Pedagogical Module (or tutor module)

The tutor model is responsible for presenting the right learning material at the right time. It therefore simulates the teacher by deciding, on the basis of the expert knowledge, which action to take next. Since the student model is also used as input to this module, the different actions reflect the different needs of each student.
Two main tutoring strategies are used in most systems.
After the decision on a "meta-strategy", low-level aspects must be considered. Following Beck, Joseph ; Stern, Mia et al., 1996 this includes:
What many systems lack in this respect is what Schulmeister calls the "grammar of interaction". The system usually has no knowledge about the structure of the situation or about general rules of interaction. Finally, tutor models are criticized because "students are never involved in the development of tutor models", which leads to rather abstract models of students.
This module organizes the various task types and their content. In an ICALL system, for example, it should decide whether a vocabulary test or a grammar exercise is presented to the learner, and at which level.
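The kind of decision described here can be sketched as follows. This is only one possible selection policy under stated assumptions: the learner model is reduced to a dictionary of mastery estimates per topic, the module always targets the weakest topic, and the mapping from estimate to difficulty level is invented for the example. The names `Task` and `select_next_task` are hypothetical.

```python
from typing import NamedTuple


class Task(NamedTuple):
    task_type: str  # e.g. "vocabulary_test" or "grammar_exercise"
    topic: str
    level: int      # 1 = easiest, 3 = hardest


def select_next_task(mastery: dict[str, float], tasks: list[Task]) -> Task:
    """Pick a task on the learner's weakest topic at a matching level."""
    candidates = [t for t in tasks if t.topic in mastery]
    if not candidates:
        raise ValueError("no task matches a topic in the learner model")
    # Sorting first makes the choice deterministic when estimates are tied.
    topics = sorted({t.topic for t in candidates})
    weakest = min(topics, key=lambda topic: mastery[topic])
    # Hypothetical mapping from a mastery estimate in [0, 1] to levels 1..3.
    target_level = 1 + min(2, int(mastery[weakest] * 3))
    on_topic = [t for t in candidates if t.topic == weakest]
    return min(on_topic, key=lambda t: abs(t.level - target_level))


tasks = [
    Task("grammar_exercise", "dative_case", 1),
    Task("grammar_exercise", "dative_case", 3),
    Task("vocabulary_test", "food_vocabulary", 2),
]
next_task = select_next_task({"dative_case": 0.3, "food_vocabulary": 0.9}, tasks)
```

In this example the learner is weakest on the dative case, so the easy grammar exercise is selected; with a higher estimate for that topic the same policy would move to the harder exercise.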

Communication Model

The communication model is the model that appears as the "intelligent" model to the user. The system reacts flexibly in presenting learning materials and adaptively with regard to the assumed knowledge of the learner. Different forms of interaction are possible:
In general there are two types of systems. The first follows the instruction concept. This is the more guiding type of system, which does not allow the learner to move around freely in the learning environment. It interacts a lot with the learner in order to guide the learner through units and exercises. The microworld concept (or construction concept), on the other hand, is freer and interferes little with the learner's actions. It requires the learner to form hypotheses about the material and to experiment with these hypotheses in the learning process.
It is not clear whether a natural-language interface is really necessary for an ITS. Some authors claim that an ITS should at least realize a dialog which comes as close as possible to a natural dialog. This point must not be confused with language learning. Language is simply the main instrument of communication and should therefore be used as much as possible, so that the learner can express himself in the way he is used to.
The main reason for choosing an ITS is still the argument of individualization. An ITS should be able to adapt itself so that the learner is ideally supported in his learning needs. An ITS aims to avoid "learning errors" such as floundering or overlooking learning opportunities.
A major difficulty of ITSs is the gap between cognitive concepts, which can be modelled according to the needs of the computer, and pedagogical and psychological theories, which according to Schulmeister, Rolf, 1997 have a different methodological status. Hence the latter cannot be implemented in a computer program. One major problem is the meaning of "understanding". Usual definitions of "understanding" cannot be applied to ITSs. True cognitive systems, which might be able to "understand", have not been developed yet. The difference can also be described as the difference between "cognition" in cognitive psychology on the one hand and in cognitive science on the other.
The use of simpler models of learning in ITSs than those of cognitive psychology simply reflects the hope for theories that can one day be implemented in a computer.

Adaptivity - Dimensions for categorization

Some parts of this section are based on Beck, Joseph ; Stern, Mia et al., 1996. One dimension can be the level of simulation a program tries to achieve. Some systems are simulations that try to reproduce a working part of the real world as realistically as possible, whereas other programs are further away from reality. Some programs even teach in a "decontextualized" manner, which constitutes the opposite extreme of this dimension.
Another dimension is the type of knowledge that is being taught. There have been several attempts at classifying "educational objectives" (cf. Bloom, Benjamin (Ed.), 1956). One could also ask what the student will be able to do after learning all the material of a unit. This question is similar to the previous one, but not identical. On the one hand the learner might be able to perform skills; on the other hand the learner might now know an abstract theory. The most common type of knowledge is some kind of procedural skill: the learner should be able to perform a particular task. Since some systems are based on research in the cognitive psychology of human skill acquisition, they may be called cognitive tutors (Beck, Joseph ; Stern, Mia et al., 1996). This is usually the case when the analysis of the learner's actions is in some way based on a "cognitively" plausible expert model.
Knowledge-based tutors, in contrast to the former type, are systems which aim to teach things like concepts and "mental models" (frameworks). Since less is known about how humans acquire concepts, these systems tend to have a lot more domain knowledge. General teaching strategies are used to present the right material to the student at the right time. Explanations of the student's actions are generated with the help of the knowledge base. In both dimensions the actual systems must be seen along a continuum. There are, for example, systems which teach how to use e-mail programs in UNIX, which is a procedural skill, but which use a large knowledge base to develop the explanations and the feedback.
There has been a series of conferences called ITS 'Year, for which proceedings were printed. The most recent one took place in 2000 in Montreal, Canada.
Books, articles etc.: Sleeman, D. ; Brown, J. S. (Ed.), 1982, Woolf, B. P. in: Kearsley, G.P. (Ed.), 1987, Sama Nwana, Hyacinth (Ed.), 1993, Yazdani, Masoud (Ed.), 1993, Greer, Jim (Ed.), 1994