Case-based Reasoning: Rough Sets vs Bayesian Networks
 

Case-based  reasoning

            Case-based reasoning is using previous solutions to problems to solve new ones.  When faced with a new problem, the idea is to find a similar past case and use the solution to solve the new problem.  Each time a problem has been solved, it is retained for future reference, increasing the number of cases that are available (AICom Paper).  This is actually something people do naturally without even thinking about it.  For example, a doctor that examines a patient may be reminded of a previous patient in terms of their symptoms.  He may or may not decide to use the same treatment based on the outcome of the first patient.  However, using case-based reasoning allows someone who has not experienced the same situations to use the information of others’ successes and failures to solve their own problem.

            Case-based reasoning began in the early 1980’s with Roger Schank’s model of dynamic memory.  This became the basis for the first CBR systems: Janet Kolodner’s CYRUS and Michael Lebowitz’s IPP.  Its popularity grew in the 1990’s, especially in the international community (Nupedia).  Currently the number of research projects and activities is steadily growing.  Now there are several uses for CBR, including diagnosis, help desk, decision support, and architectural and industrial design (aiai.ed).

CBR relies on the amount and quality of the collected data, the amount of background knowledge, and how the cases are compared.  CBR is used mainly for frequently changing domains.  The underlying processes governing the domain is unclear.

A case refers to a problem situation A past case, or stored case contains a description of a problem, its solution, outcome, and notes about how the solution was derived.  A new or unsolved case simply contains the description of the problem to be solved.

Case-based reasoning is a cycle.  There are four major steps in the process of case-based reasoning, each of which can be further broken down.  They are:

1. Retrieve the most relevant cases from the case library

2. Reuse the solution from the chosen case to solve the current problem

3. Revise the solution if necessary.

4. Retain the final solution as a new case in memory.

            The retrieve step begins with a problem description, and its end result should be the best matching previous case.  It involves 4 subtasks: Identify Features, Initially Match, Search, and Select (AICom Paper).  One item of concern in this step is how to judge similarity between cases.  Some systems use surface, syntactical similarities among problem descriptors, while others use deeper, semantical similarities (AICOM Paper).  In identification, the goal is to understand the problem rather than to simply use the input descriptors.  To identify the relevant problem features, the system must filter out unimportant problem descriptors, infer other relevant problem features, and check whether the feature values make sense.  Initially matching retrieves a set of reasonable candidates.  This is done by using the problem descriptors as indexes to the case memory.

            The reuse step focuses on the differences among the past and current case, and what part of a retrieved case can be transferred to the new case.  In simple tasks the differences are considered non relevant and the solution of the retrieved case is directly transferred to the new case as its solution class.  However, most systems take into account the differences and have to go through an adaptation process.  There are two ways to reuse cases:       

1. reuse the past case solution(transformational reuse), or

2. reuse the past method that constructed the solution(derivational reuse).

Transformational reuse takes the old solution and applies some transformational operators to transform it into a new solution for the new case.  Derivational reuse takes the method to solve the old problem and tries it on the new problem.

            The revise step can be further broken down into the subtasks of evaluating the solution and repairing the fault.  The evaluation task must be done outside the CBR system.  It involves applying the solution in a real environment.  Case repair involves detecting the errors of the current solution and retrieving or generating explanations for them.  Then the failure explanations are used to modify the solution so the failures do not occur.

            The retain step is important because this is how the case base is enlarged.  It  involves extracting, indexing, and integrating.  In extracting, a decision must be made about what to use as the source of learning.  Some examples are the relevant problem descriptors, problem solutions, and explanation of why a solution is a solution to a problem.  Failures may also be extracted and retained.  Indexing can be done in more than one way, and is a much focused problem in case-based reasoning.  The final step of integrating is to update the case base with the new knowledge.  The index strengths or importances for a particular case  are adjusted due to the success or failure of using the case to solve the input problem.

CBR depends on the use of a widespread library of cases.  Therefore the representation of these cases is especially important in being able to solve new cases.  Two of the influential case memory models are the Dynamic Memory Model and the Category and Exemplar Model.

Dynamic Memory Model

Memory Organization Packets are the basic unit in dynamic memory. There are two kinds of  Memory Organization Packets which are used to represent information about classes of events. Instances representing cases, events or objects, and abstractions representing generalized versions of instances or of other abstractions.

The basic idea is to Organizing specific cases which share similar properties under a more general structure is the idea behind a generalized episode. A GE contains three different types of objects: norms, cases and indices. Norms are one of the objects contained in a GE which features characteristics common to all cases indexed under a GE. Indices are features which discriminate between a GE’s cases. An index may point to a more specific generalized episode or to a case.

The primary role of a GE is as an indexing structure for storing, matching and retrieval of cases. During case storage when a feature (i.e., index name and index value) of a new case matches a feature of an existing case a new GE is created. The two cases are then discriminated by indexing them under different indices below the new GE (assuming the cases are not identical). Thus, the memory is dynamic in that similar parts of two cases are dynamically generalized into a new GE, the cases being indexed under the GE by their differences.

Category and Exemplar Model

Cases are associated with a category. Different case features are assigned different importance in describing a case's membership to a category. Three types of indices are used. Feature links point from features to a case or category, case links  point from categories to its associated cases, and difference links point from categories to cases that only differ in a small number of features.

A feature is described by a name-value pair. A new case is stored by searching for a matching case and by establishing the relevant feature indices. If a case is found with to be very similar to a new case, the new case may not be retained, or the two cases may be merged.

 
Resources

http://www.aiai.ed.ac.uk/links/cbr.html

http://www.nupedia.com/article/short/Case-Based+Reasoning/

Klaus-Dieter Althof, Eric Auriol, Ralph Barlette, and Michel Manago. A Review of Industrial Case-Based   Reasoning Tools. AI Intelligence, 1995.

David B. Leake. Case-Based Reasoning – Experiences, Leasons and Future Directions. AAAI/MIT Press, 1996.

Main | Case-Based | Rough Sets | Bayesian | Results