As identified above, four types of features can be derived as outputs from the MCRDR component of the RM algorithm. The MCRDR and ANN components are integrated by codifying the relevant features taken from MCRDR and converting them into a single input array of values, x, which is passed to the second component for processing. Five association methods were developed for integrating the two portions of the hybrid system. With the exception of the last method, each of them used a discrete on/off (1 and 0, respectively) input sequence.
5.2.4.1 Class Association (CA)
The class association (CA) method analyses every path returned from the MCRDR component and determines the identified classes. Each classification that has been identified by the user is given an index number identifying which input neuron is its associated point of connection. From this information an input array can be formed, where each class currently identified by the expert has a single input into the network. If a class was found in a terminating node of a path taken during the MCRDR inferencing process, then its input is set to on; otherwise it is off.
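The encoding described above can be sketched as follows. This is a minimal illustration rather than the thesis implementation: it assumes each path is a list of rule identifiers ending at its terminating rule, and the names `class_association`, `rule_class` and `class_index` are hypothetical.

```python
def class_association(paths, rule_class, class_index):
    """Build an on/off input array with one slot per known class.

    paths       -- list of paths; each path is a list of rule ids whose
                   last element is the terminating rule (assumed layout)
    rule_class  -- maps a rule id to the class it concludes (hypothetical)
    class_index -- maps a class to its input neuron index (hypothetical)
    """
    x = [0] * len(class_index)
    for path in paths:
        terminating_rule = path[-1]
        # Switch on the input for the class given by the terminating rule.
        x[class_index[rule_class[terminating_rule]]] = 1
    return x


# Hypothetical knowledge base with four rules and three classes.
rule_class = {1: "A", 2: "B", 3: "C", 4: "A"}
class_index = {"A": 0, "B": 1, "C": 2}

print(class_association([[1, 2], [4]], rule_class, class_index))  # [1, 1, 0]
```

Two rules that conclude the same class share one input, which is why this method yields the smallest input space of the five.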
The primary advantage of this association method is that it reduces the number of neurons created in the ANN, thereby also reducing the size of the input space. However, the reduced input space also carries less information about the case, which potentially limits the network's learning ability and may allow less generalization. CA was rarely used in this thesis’s final results, as the other methods outperformed it in most situations.
5.2.4.2 Attribute Association (AA)
The attribute association (AA) method analyses every path returned from the MCRDR component and extracts all the attributes that were used in making each rule fire. Each attribute identified by the user is given an index number identifying which input neuron is its associated point of connection. From this information an input array can be formed, where each attribute currently identified by the expert has a single input into the network. If an attribute was found in a node of a path taken during the MCRDR inferencing process, then its input is set to on. An added complication arises when a rule contains a ‘not’ attribute condition. This can be resolved either by using a negative on value, -1, or by treating the negated attribute as a different attribute. If it is treated as a different attribute, it is given its own index and associated network input. The second option was the only one used in this thesis. A further variation is to treat an attribute as on only if it was used in the terminating rule’s condition, instead of considering every rule along the path.
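A minimal sketch of this encoding is given below, using the second option for negated conditions (a separate input for each ‘not’ attribute). The data layout is an assumption made for illustration: each rule's condition is taken to be a list of `(attribute, negated)` pairs, and the names `attribute_association`, `rule_conditions` and `attr_index` are hypothetical.

```python
def attribute_association(paths, rule_conditions, attr_index,
                          terminating_only=False):
    """Build an on/off input array with one slot per attribute.

    rule_conditions -- maps a rule id to a list of (attribute, negated)
                       pairs (assumed representation of a rule condition)
    attr_index      -- maps (attribute, negated) to an input neuron index,
                       so a negated attribute has its own input
    terminating_only -- if True, only the last rule of each path is used
    """
    x = [0] * len(attr_index)
    for path in paths:
        rules = path[-1:] if terminating_only else path
        for rule in rules:
            for condition in rule_conditions[rule]:
                x[attr_index[condition]] = 1
    return x


# Hypothetical rules: rule 2 contains the negated condition "not rash".
rule_conditions = {
    1: [("fever", False)],
    2: [("cough", False), ("rash", True)],
}
attr_index = {
    ("fever", False): 0,
    ("cough", False): 1,
    ("rash", False): 2,
    ("rash", True): 3,
}

print(attribute_association([[1, 2]], rule_conditions, attr_index))
# [1, 1, 0, 1]
print(attribute_association([[1, 2]], rule_conditions, attr_index,
                            terminating_only=True))
# [0, 1, 0, 1]
```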
One interesting aspect of this method is that it takes contextually located symbols and uses them in a globalised environment. It resembles the commonly tried keyword method, except that the keywords are selected online by the user, which may have interesting applications in areas such as information filtering. Another advantage of this method is that if a path that is not usually followed is taken, and its rules share attributes with many other rules, then the system could potentially still achieve a good generalised estimate. However, the usefulness of this association method depends on the situation in which it is used. The greatest problem area is when the firing of a particular attribute is highly contextually dependent. In such situations the inputs fire globally, regardless of context, rather than only when required. Once again, this study found that the method was rarely better than the others.
5.2.4.3 Rule Path Association (RPA)
The rule path association (RPA) method analyses every path returned from the MCRDR component and determines the rules that fired. Each rule has an index number identifying its associated input neuron. From this information an input array can be formed, where each rule currently identified by the expert has a single input into the network. If a rule is found in a path taken, then it must have fired during the inferencing process; therefore, its input neuron is set to on.
This method was consistently one of the better performing association methods. While it does produce significantly more input nodes, it allows significantly more information to be passed on from the first component. One advantage is that if a case only diverges slightly from a previous case, then the input to the network only changes slightly. Therefore, a lot of contextual information is passed on to the network.
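The following sketch illustrates the RPA encoding and the observation that two cases whose paths diverge only slightly produce similar input arrays. It assumes paths are lists of rule identifiers; `rule_path_association` and `rule_index` are hypothetical names.

```python
def rule_path_association(paths, rule_index):
    """Build an on/off input array with one slot per rule.

    Every rule that appears anywhere on a returned path is switched on.
    rule_index maps a rule id to its input neuron index (hypothetical).
    """
    x = [0] * len(rule_index)
    for path in paths:
        for rule in path:
            x[rule_index[rule]] = 1
    return x


# Hypothetical knowledge base with five rules.
rule_index = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}

# Two cases that share rules 1 and 2 and differ only in the final rule.
x_a = rule_path_association([[1, 2, 3]], rule_index)
x_b = rule_path_association([[1, 2, 4]], rule_index)

print(x_a)  # [1, 1, 1, 0, 0]
print(x_b)  # [1, 1, 0, 1, 0]

# Number of inputs that differ between the two cases.
differing = sum(a != b for a, b in zip(x_a, x_b))
print(differing)  # 2
```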
5.2.4.4 Terminating Rule Association (TRA)
The terminating rule association (TRA) method is essentially a subset of the RPA method. It analyses every path returned from the MCRDR component and determines the rules that fired at the end of each path. Therefore, the input for a neuron is switched on only if the associated rule fired and was either a leaf node or had no children that fired. This method creates neurons at the same time as the RPA integration method, but significantly reduces the number of inputs that fire. This technique did occasionally work well; however, it does not convey any contextual information. Therefore, it is better suited to situations where the path information inhibits learning. For instance, when many rules have been added as contradictions, the paths may not contribute as well. The RPA method, by contrast, works well when the rules created are specializations of their parents.
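A minimal sketch of the TRA encoding follows. It assumes that each returned path is a list of rule identifiers whose last element is the terminating rule (the rule that fired and had no firing children); `terminating_rule_association` and `rule_index` are hypothetical names.

```python
def terminating_rule_association(paths, rule_index):
    """Build an on/off input array with one slot per rule.

    Only the last rule on each path is switched on, so the rules
    higher up the path contribute nothing to the input.
    """
    x = [0] * len(rule_index)
    for path in paths:
        x[rule_index[path[-1]]] = 1
    return x


# Hypothetical knowledge base with five rules, as for RPA.
rule_index = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}

# Two paths through the tree; only rules 3 and 4 terminate a path.
print(terminating_rule_association([[1, 2, 3], [1, 4]], rule_index))
# [0, 0, 1, 1, 0]
```

The same number of input neurons exists as under RPA, but only one fires per path.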
5.2.4.5 Decreasing Rule Path Association (DRPA)
The decreasing rule path association (DRPA) method is the only method used in this thesis that does not use the on/off approach. This method could be viewed as a combination of the RPA and TRA association methods. For each rule that is encountered in a classification path, an input is given to the associated neuron. The value of the neuron’s input, however, is determined by the rule's distance from the terminating rule. Generally, this involves giving the terminating rule’s neuron an activation of 1, and a decreasing activation for each input that connects to a rule higher up in the MCRDR tree. The amount of decrease can be determined using either a static or a relative approach. In the static approach, a fixed amount, such as 0.25, is simply subtracted for each level. The relative approach alters the decrease according to how many rules are in the path. One simple relative approach is a linear decrease from 1 to 0 from the terminating rule to the root using equation 5-1.
$$v_{r_i} = \frac{r_{d_i}}{\lVert P \rVert}, \quad r \in P, \; P \in \wp(R) \qquad \text{(5-1)}$$

where $P$ is the set of rules in an individual path; $r_{d_i}$ is the depth of the $i$th rule, $r$, in the path ($r_d = 0$ at the root node and $r_d = \lVert P \rVert$ at the terminating rule); and $v_{r_i}$ is the activation value for the $i$th rule. Alternatively, a non-linear method could be used, such as a sigmoidal or exponential function. In this thesis only the linear method was used.
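The static and linear schemes for a single path can be sketched as below. The sketch makes several assumptions that are not stated in the text: the path is a list of rule identifiers that excludes the root, so that depths run from 1 to the path length; the static scheme is clipped at 0 so that long paths do not yield negative activations; and the function names are hypothetical.

```python
def linear_activations(path):
    """Linear scheme of equation 5-1: activation = depth / path length.

    path -- rule ids from just below the root to the terminating rule
            (assumption: the root, at depth 0, is not listed)
    """
    n = len(path)
    return {rule: depth / n for depth, rule in enumerate(path, start=1)}


def static_activations(path, step=0.25):
    """Static scheme: subtract a fixed step per level above the terminating rule.

    Clipping at 0.0 is an assumption made for this sketch.
    """
    n = len(path)
    return {rule: max(0.0, 1.0 - step * (n - depth))
            for depth, rule in enumerate(path, start=1)}


# Hypothetical two-rule path: rule 2 is the parent of terminating rule 5.
print(linear_activations([2, 5]))  # {2: 0.5, 5: 1.0}
print(static_activations([2, 5]))  # {2: 0.75, 5: 1.0}
```

In both schemes the terminating rule receives an activation of 1. The two differ in how quickly the activation falls off for rules higher in the tree.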
The original idea behind this method was to introduce a means of removing some degree of discreteness from the inputs. This was expected to help the ANNs develop, as such algorithms often perform better with continuous inputs. This method was occasionally found to produce a slightly better response than the RPA method; however, it generally did not significantly aid learning. One problem was that the level of a particular node relative to the terminating node varied from case to case. Thus, sometimes a node would contribute strongly to the input space and at other times it would have little effect at all. A second issue occurred when a node was the parent of multiple firing branches. If one branch was short and the other was long, then the node could have either a high or a low activation value. In this implementation it was always set to the highest possible value.
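The handling of a node shared by several firing branches can be illustrated with the sketch below, which combines the linear activations of all paths and keeps the highest value for each rule. As before, paths are assumed to be lists of rule identifiers excluding the root, and `drpa_inputs` and `rule_index` are hypothetical names.

```python
def drpa_inputs(paths, rule_index):
    """Build a continuous input array using linear DRPA activations.

    A rule that lies on more than one path keeps the highest of its
    candidate activations.
    """
    x = [0.0] * len(rule_index)
    for path in paths:
        n = len(path)
        for depth, rule in enumerate(path, start=1):
            i = rule_index[rule]
            x[i] = max(x[i], depth / n)
    return x


# Hypothetical knowledge base: rule 1 is the parent of a short branch
# (ending at rule 2) and of a long branch (ending at rule 5).
rule_index = {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}

print(drpa_inputs([[1, 2], [1, 3, 4, 5]], rule_index))
# [0.5, 1.0, 0.5, 0.75, 1.0]
```

Rule 1 would receive 0.5 from the short branch and 0.25 from the long one, so it is given 0.5.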