permanent residents. This publicly funded health insurance scheme is administered by a federal government department called the Health Insurance Commission (HIC). In addition, the Australian Department of Health and Ageing (DoHA), after consultation with the medical fraternity, publishes a manual called the Medicare Benefits Schedule (MBS), in which it details each medical treatment procedure and the associated rebate to the medical service providers who provide such services. When a patient visits a medical service provider, the HIC will refund the patient or pay the medical service provider at the rate published in the MBS (the MBS is publicly available online from http://www.health.gov.au/pubs/mbs/mbs/css/index.htm).
Therefore, the description of each medical treatment procedure in the MBS should be clear and open to only one interpretation by a reasonable medical service provider, since ambiguities would lead to the wrong medical treatment procedure being used to invoice the patient or the HIC. However, the MBS has developed over the years, and is derived through extensive consultations with medical service providers over a lengthy period. Consequently, there may exist inconsistencies or ambiguities within the schedule. In this chapter, we propose to use text mining methodologies to discover whether there are any ambiguities in the MBS.
Figure 1. An overview of the MBS structure in 1999
MBS 1999
  Category 1 Professional Attendance: Group A1 General Practitioner, Group A2 Other Non-preferred, ..., Group A15 Medical (Emergency)
  Category 2 Diagnostic Procedures: Group D1 Misc., Group D2 Nuclear
  Category 3 Therapeutic Procedures: Group T1 Misc., Group T2 Radiation, ..., Group T9 Anaesthesia
  Category 4 Oral Services: Group O1 Consultation, Group O2 Assistance, ..., Group O9 Nerve Blocks
  Category 5 Diagnostic Imaging: Group I1 Ultrasound, Group I2 Tomography, ..., Group I5 Magnetic Resonance
  Category 6 Pathology Services: Group P1 Haematology, Group P2 Chemical, ..., Group P11 Specimen referred
  Category 7 Cleft Lip and Cleft Palate Services: Group C1 Orthodontic, Group C2 Maxillofacial, Group C3 Prosthodontic
The MBS is divided into seven categories, each of which describes a collection of treatments of a particular type, such as diagnostic treatments, therapeutic treatments, oral treatments, and so on. Each category is further divided into groups. For example, in category 1, there are 15 groups, A1, A2, …, A15. Within each group, there are a number of medical procedures, each denoted by a unique item number. In other words, the MBS is arranged as a hierarchical tree, designed so that it is easy for medical service providers to find the items that represent the medical procedures provided to the patient. This underlying MBS structure is outlined in Figure 1.
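As a rough illustration of this hierarchy, the categories and groups named in Figure 1 can be held in a small lookup structure. This is only a sketch: the dictionary layout and the helper `category_of` are our own hypothetical choices, only the groups actually shown in the figure are listed, and item numbers are left out because none are given here.

```python
# Partial sketch of the MBS hierarchy, transcribed from Figure 1.
# Groups elided in the figure ("...") are deliberately not filled in.
MBS_1999 = {
    1: ("Professional Attendance", ["A1", "A2", "A15"]),
    2: ("Diagnostic Procedures", ["D1", "D2"]),
    3: ("Therapeutic Procedures", ["T1", "T2", "T9"]),
    4: ("Oral Services", ["O1", "O2", "O9"]),
    5: ("Diagnostic Imaging", ["I1", "I2", "I5"]),
    6: ("Pathology Services", ["P1", "P2", "P11"]),
    7: ("Cleft Lip and Cleft Palate Services", ["C1", "C2", "C3"]),
}


def category_of(group_code):
    """Return (category number, category name) for a group code, or None.

    Hypothetical helper: walks the tree from the category level down to
    the group level, as a provider would when locating an item.
    """
    for number, (name, groups) in MBS_1999.items():
        if group_code in groups:
            return number, name
    return None
```

A call such as `category_of("I5")` walks from the category level down to the group level; a full version would add a third level mapping each group to its item numbers and descriptions.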
This chapter evaluates the following:
• Hypothesis: Given the arrangement of the items in the way they are organised in the MBS (Figure 1), are there any ambiguities within this classification? Here, ambiguity is measured in terms of a confusion table comparing the classification given by the application of text mining techniques and the classification given in the MBS. Ideally, if the items are arranged without any ambiguities at all (as measured by text mining techniques), the confusion table should be diagonal, with zero off-diagonal terms.
• Optimal grouping: Assuming that the classification given in the MBS is ambiguous (as revealed in our subsequent investigation of the hypothesis), what is the “optimal” arrangement of the item descriptions using text mining techniques (here “optimal” is measured with respect to text mining techniques)? In other words, we wish to find an “optimal” grouping of the item descriptions such that there will be a minimum of misclassifications.
The benefits of this work are as follows:
• From the DoHA point of view, it will allow the discovery of any existing ambiguities in the MBS. In order to make the procedures described in the MBS as distinct as possible, the described methodology can be used to evaluate the hypothesis when designing the MBS, so that no ambiguities remain from a text mining point of view. This will lead to a better description of the procedures, so that there will be little misinterpretation by medical service providers.
• From a service provider’s point of view, the removal of ambiguities would allow efficient computer-assisted searching. This will limit misinterpretation, and allow the implementation of a semi-automatic process for the generation of claims and receipts.
• Although the “optimal grouping” process is motivated mainly by curiosity, it may assist the HIC in re-grouping some of the existing descriptions of items in the MBS, so that there will be fewer opportunities for misinterpretation.
Obviously, the validity of the described method depends on the ability of text mining techniques to classify a set of documents unambiguously. Unfortunately, this cannot be guaranteed, as new text mining techniques are constantly being developed. However, the value of the work presented in this paper lies in the ability to use existing text mining techniques to discover, as far as possible, any ambiguities within the MBS. This is bound to be a conservative measure, as we can only discover ambiguities as far as the existing tools allow. There will be other ambiguities which remain undiscovered by current text mining techniques. But at least our approach will clear up some of the existing ambiguities. In other words, the text mining techniques do not claim to be exhaustive. Instead, they will indicate ambiguities as far as possible, given their limitations.
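To make the confusion-table measure used in the hypothesis concrete, the following minimal sketch tabulates how often items that the MBS places in one category are assigned to each category by some classifier. The function names and the toy labels are hypothetical and purely illustrative; they are not results from the MBS, and no particular text mining technique is assumed.

```python
from collections import Counter


def confusion_table(mbs_labels, predicted_labels, categories):
    """Rows: category given in the MBS; columns: category assigned by the classifier."""
    counts = Counter(zip(mbs_labels, predicted_labels))
    return [[counts[(row, col)] for col in categories] for row in categories]


def off_diagonal_total(table):
    """Number of items whose assigned category differs from the MBS category."""
    return sum(
        value
        for i, row in enumerate(table)
        for j, value in enumerate(row)
        if i != j
    )


# Toy data (invented for illustration): six items, three categories.
categories = [1, 2, 3]
mbs = [1, 1, 2, 2, 3, 3]
predicted = [1, 1, 2, 3, 3, 3]

table = confusion_table(mbs, predicted, categories)
```

With these toy labels the table has a single off-diagonal entry (one category 2 item assigned to category 3), so `off_diagonal_total(table)` is 1. A classification with no ambiguities, in the sense used here, would give a purely diagonal table and a total of 0.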
The structure of this chapter is as follows: In the next section, we describe what text mining is, and how our proposed techniques fall into the general fabric of text mining research. In the following section, we will describe the “bag of words” approach to text mining. This is the simplest method in that it does not take any cognizance of semantics among the words; each word is treated in isolation. In addition, this will give an answer to the hypothesis as stated above. If ambiguities are discovered by using such a simple