A Pattern Attribute is a tool for extracting information from free text. Consider the task of analysing clinical notes to determine whether a patient is diabetic. A typical clinical note might be:
family hist diabetes
This note contains two key phrases: “family hist” and “diabetes”. Together, they mean that a relative of the patient has diabetes; the patient may or may not have the disease.
Now consider another clinical note:
fh diabetes. ? anaemia
The full stop indicates that there are two independent pieces of information in this note. The first is that there is a family history of diabetes (“fh” is short for “family history”) and the second is that the patient might be anaemic.
These two examples demonstrate three complications of text analysis:
- Key terms such as “diabetes” have many synonyms and are often misspelled.
- The meaning of a key term in a block of text depends on what other key terms are nearby.
- A single piece of text generally contains several important pieces of information.
Pattern Attributes address these issues with a three-phase analysis of text:
- The text is converted to a list of keywords.
- The converted text is searched for meaningful patterns.
- Attributes are put into the case to represent these patterns.
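The three phases can be illustrated with a minimal Python sketch applied to the two clinical notes above. This is not how Pattern Attributes are implemented; the synonym table, the pattern table, the attribute names and the functions `to_keywords`, `find_patterns` and `analyse` are all hypothetical and exist only for this illustration. Splitting the note on full stops is a simplifying assumption.

```python
import re

# Hypothetical synonym table: maps phrases, abbreviations and a common
# misspelling to a canonical keyword. Illustrative only.
KEYWORDS = {
    "family hist": "FAMILY_HISTORY",
    "family history": "FAMILY_HISTORY",
    "fh": "FAMILY_HISTORY",
    "diabetes": "DIABETES",
    "diabetis": "DIABETES",
    "anaemia": "ANAEMIA",
    "anemia": "ANAEMIA",
    "?": "POSSIBLE",
}

# Hypothetical patterns: a sequence of keywords and the attribute it produces.
PATTERNS = {
    ("FAMILY_HISTORY", "DIABETES"): "family_history_of_diabetes",
    ("POSSIBLE", "ANAEMIA"): "possible_anaemia",
    ("DIABETES",): "diabetes",
    ("ANAEMIA",): "anaemia",
}


def to_keywords(segment):
    """Phase 1: convert a piece of text to a list of keywords."""
    tokens = re.findall(r"\?|[a-z]+", segment.lower())
    keywords = []
    i = 0
    while i < len(tokens):
        for n in (2, 1):  # try the two-word phrase before the single word
            phrase = " ".join(tokens[i:i + n])
            if i + n <= len(tokens) and phrase in KEYWORDS:
                keywords.append(KEYWORDS[phrase])
                i += n
                break
        else:
            i += 1  # not a known key term: ignore it
    return keywords


def find_patterns(keywords):
    """Phase 2: search the keyword list for meaningful patterns."""
    longest_first = sorted(PATTERNS, key=len, reverse=True)
    found = []
    i = 0
    while i < len(keywords):
        for pattern in longest_first:
            if tuple(keywords[i:i + len(pattern)]) == pattern:
                found.append(PATTERNS[pattern])
                i += len(pattern)
                break
        else:
            i += 1  # keyword is not part of any pattern
    return found


def analyse(note):
    """Phase 3: put an attribute into the case for each pattern found."""
    case = {}
    for segment in note.split("."):  # each sentence is analysed independently
        for attribute in find_patterns(to_keywords(segment)):
            case[attribute] = True
    return case


print(analyse("family hist diabetes"))
print(analyse("fh diabetes. ? anaemia"))
```

In this sketch the longer pattern is tried first, so “family hist diabetes” yields only a family-history attribute and not an attribute saying the patient has diabetes. This reflects the point that the meaning of a key term depends on the key terms near it.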