MEDICAL VALIDATION OF THE ISABEL DIAGNOSTIC REMINDER SYSTEM (Isabel DRS)
Isabel is a novel computerised system that aims to support clinical decision making. The Isabel
DRS is intended to provide diagnostic assistance to clinicians in situations of uncertainty by suggesting prompts based on the patient’s clinical features.
Isabel diagnostic advice is derived from searching unformatted medical textual content (statistical natural language processing). This powerful approach permits users to search by concept matching as well as word matching, and is significantly different from previous diagnostic expert systems. The clinical features input into the Isabel
system are also entered as unstructured free text, so that clinicians in busy environments can use it easily. These features make the system user-friendly, but they also raise a number of important questions about the operational consequences of its use, which have been evaluated in clinical practice.
Validation of the Isabel DRS has been conducted by independent researchers as well as by system developers and other academic collaborators. These studies explored Isabel’s accuracy, its dependence on user input, and its impact when used in a real-life clinical environment. No expert diagnostic system has previously been validated in a real-life controlled study.
Research question 1:
How well does the Isabel DRS perform for diagnostically challenging cases when whole text data entry is used?
Time period: 2005
Setting: VA Medical Center, Northport, NY and the Department of Medicine, SUNY at Stony Brook, NY
Investigators: Mark L. Graber, MD and Ashlei Mathews
50 cases from the “Case Records of Massachusetts General Hospital” were selected from a total of 61 cases from 2004/2005. Each case had a documented ‘correct’ diagnosis. Case histories (history, physical examination findings and laboratory test results, but not data from tables and figures) were pasted en bloc as natural text data entry. Findings were compared to the recommended strategy of entering discrete key findings.
Using whole text entry, the correct diagnosis was suggested in 37 of the 50 cases (74%). Using key findings entered manually, the correct diagnosis was suggested in 48 of the 50 cases (96%). The 2 missed diagnoses (progressive multifocal encephalopathy, and nephrogenic fibrosing dermopathy) were not included in the database at the time of the study.
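The two hit rates above can be recomputed from the reported counts, and an interval estimate gives a sense of how much uncertainty a sample of 50 cases carries. The following is a minimal sketch only: the Wilson score interval is an assumption chosen here for illustration, not a method reported by the study, and the function name is hypothetical.

```python
import math

def wilson_interval(successes, total, z=1.96):
    """Return (proportion, lower, upper) for a Wilson score interval.

    The Wilson interval is an illustrative choice; the study itself
    reports only the raw proportions.
    """
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = z * math.sqrt(
        p * (1 - p) / total + z ** 2 / (4 * total ** 2)
    ) / denom
    return p, centre - half_width, centre + half_width

# Counts reported for the 50 selected cases.
for label, hits in [("whole text entry", 37), ("key findings", 48)]:
    p, low, high = wilson_interval(hits, 50)
    print(f"{label}: {p:.0%} (approx. 95% interval {low:.0%} to {high:.0%})")
```

With only 50 cases, each interval is fairly wide, which is worth bearing in mind when comparing the two entry methods.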
Research question 2:
What is the impact of the Isabel DRS on diagnostic errors in a simulated environment?
Time period: 2005
Setting: Department of Pediatrics, University of Virginia, Charlottesville, VA and Dept of Health Evaluation Sciences, University of Virginia, Charlottesville, VA
Investigators: Stephen M Borowitz, M.D., Larissa R Amy, M.S., Jason A Lyman, M.D., Patrick A Brown, M.D. and Mark J Mendelsohn, M.D.
25 resident physicians were presented with a set of six simulated cases of differing difficulty. For each case, participants developed a list of likely diagnoses and an initial management plan before and after using the Isabel system. The quality of responses was compared to responses of a panel of three expert pediatric clinicians. Primary outcome measures were a change in the number of clinically important diagnoses included in the differential diagnosis, and a change in a previously validated diagnostic quality score (DQS).
In 15 of the 150 cases completed (10%), Isabel prompted the user to include a major diagnosis that they should have considered but had not. For each of the six cases, the mean diagnostic quality score increased significantly after residents consulted the Isabel system (0.028 ± 0.049, 95% CI 0.020 - 0.036, p<0.001).
Research question 3:
How accurate is the Isabel DRS when used in a paediatric critical care unit in a developing nation?
Time period: January 2000-July 2002
Setting: Department of Pediatrics, Seth GS Medical College & KEM Hospital, Mumbai, India
Investigators: SB Bavdekar and M Pawar
Resident medical officers extracted key clinical and laboratory findings from admission notes and from the results of investigations carried out within 30 minutes of admission for patients admitted to a pediatric intensive care unit in a metropolitan hospital in India. These findings were submitted to the Isabel DRS and the list of diagnoses it generated was collected. The outcome measure studied was the presence of the final diagnosis in the list generated by the Isabel DRS.
200 subjects (boys 111, girls 89, aged 28 days-12 years) were analyzed. Congenital heart disease, respiratory tract infections, meningitis, tetanus and septicemia were the most frequently encountered diagnoses. The Isabel DRS was accurate in 80.5% of the cases.
Results were published in the Indian Pediatrics Journal in November 2005.
Research question 4:
Does the Isabel DRS have an educational impact on the differential diagnosis generated by medical students?
Time period: 2003-2004
Setting: Golisano Children's Hospital at Strong, Rochester, NY
Investigators: FA Maffei, EB Nazarian, P Ramnarayan, NJ Thomas, JS Rubenstein
Medical students were randomly assigned to either the Isabel group (in which the students used Isabel in addition to their standard resources to generate differential diagnoses for patients) or a control group. Benefit was assessed quantitatively using a standardised post-rotation test given to all students. User feedback and qualitative information were collected by means of a questionnaire and interviews.
43 students were randomised (22 Isabel group, 21 control). 15 students in the Isabel group completed the trial. 12/15 students found the Isabel DRS to be more useful than standard resources; 10/15 students reported that Isabel often or always provided an additional diagnosis not initially considered. All students interviewed agreed that the use of web-based tools should be incorporated into medical education after formal evaluation of such tools. There was no difference in post-rotation test scores between the two groups (ISABEL 73 vs. control 74, p = 0.67), perhaps due to small numbers.
These results were presented at the 15th Annual Pediatric Critical Care Colloquium, NYC in October 2004.
Maffei FA, Nazarian EB, Ramnarayan P, Thomas NJ, Rubenstein JS. Use of a Web-based Tool to Enhance Medical Student Learning in the Pediatric Intensive Care Unit and Inpatient Wards. Ped Crit Care Med. 6(1):109.
MEDICAL VALIDATION OF THE ISABEL DRS BY SYSTEM DEVELOPERS OR ACADEMIC COLLABORATORS
Research question 1:
Does the Isabel paediatric DRS suggest ‘clinically relevant’ diagnoses for a wide variety of cases, real and hypothetical?
Time period: August-December 2000
99 hypothetical case scenarios, and clinical data from
100 real patients, were used to test the performance of the Isabel
pediatric DRS. The ‘correct’ or final diagnosis was known
for all cases. Hypothetical cases were provided by 12 different pediatricians,
and clinical data from real patients were collected from emergency departments
at 4 NHS sites. Isabel suggested the ‘correct’
or final diagnosis in 91% of the hypothetical cases, and 95% of the real
cases.
A two-person expert panel also provided an ‘optimal’ set of 2-3 diagnoses for each real case that juniors would have needed to work up in order to ensure safe decision making. In 73% of cases, Isabel displayed all such diagnoses.
Results of this scientific study were peer-reviewed and published in
the Archives of Disease in Childhood.
Ramnarayan P, Tomlinson
A, Rao A, Coren M, Winrow A, Britto J. ISABEL: a web-based differential
diagnostic aid for pediatrics: results from an initial performance evaluation.
Arch Dis Child. 2003 May;88(5):408-13
Research question 2:
Does the Isabel DRS suggest ‘clinically relevant’ diagnoses in an adult emergency medicine setting?
Time period: September 2004-August 2005
Setting: 3 NHS emergency departments
Sponsor: Department of Health, Skipton House, London
Clinical data collected from patients presenting to three emergency departments with an acute
medical problem were entered by a research assistant into the diagnostic system. The displayed
results were assessed against final discharge diagnoses from patients who were admitted to hospital
(diagnostic accuracy) and against a set of 'appropriate' diagnoses for each case provided by an expert
panel (potential utility).
Data were collected from 594 patients (53.4% of screened attendances).
Mean age was 49.4 years (95% CI 47.7-51.1) and most patients had significant past illnesses.
The majority were assessed first by junior doctors (70%). 266/594 (44.6%) were admitted to hospital. Overall,
the diagnostic system displayed the final discharge diagnosis in 95% of inpatients and 90% of 'must-not-miss'
diagnoses suggested by the expert panel. The discharge diagnosis appeared within the first ten suggestions in 78% of cases.
Research question 3:
How does the Isabel paediatric DRS influence clinicians’ decision-making in a controlled environment?
Time period: January 2002-August 2002
Funding & Sponsor: NHS R&D Unit, London
Academic advisers: Centre for Health Informatics (CHIME), London
Dr Jeremy Wyatt, Knowledge Management Centre (UCL), currently Associate Director, R&D, NICE, UK
Since Isabel depends on users providing free text input, and relies on them to effectively process the advice suggested to make changes to their decisions, it is important to establish the impact of Isabel’s advice on ordinary clinicians. Even though Isabel may be extremely accurate, it is possible that users may not benefit from its use, due to poor quality input, or lack of confidence in the advice offered. It was important that extraneous factors such as lack of access to the Internet did not affect the study. Hence, it was conducted in an experimental setting.
76 clinicians of different grades (consultants, registrars, SHOs and medical students) assessed 24 paediatric case simulations and made clinical decisions both before and after receiving Isabel advice for the case. Changes in their decisions were recorded.
A two-person consultant panel independently provided ‘gold standard’ decisions for each case against which the user’s decisions were judged.
751 sets of decisions (before and after Isabel) were available at the end of this study.
In 95 cases, at least one ‘gold standard’ diagnosis was considered by the user only after Isabel advice (1 in 8 cases, 12.5%).
In 141 cases (19%), clinicians failed to include any of the ‘gold standard’ diagnoses before Isabel advice. This reduced to 101 cases after Isabel advice (13%).
70 important tests were ordered by the users only after Isabel advice was provided.
No adverse diagnoses were considered due to Isabel advice. Only 7 costly (unnecessary) tests were performed post-Isabel.
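The percentages quoted for this study follow directly from the reported counts and the 751 available decision sets. The short sketch below recomputes them; the variable and function names are illustrative, and the difference of 40 cases is derived here from the reported counts rather than stated by the study.

```python
TOTAL_DECISION_SETS = 751  # sets of decisions available at the end of the study

# Counts taken from the results listed above.
reported_counts = {
    "gold standard diagnosis considered only after advice": 95,
    "no gold standard diagnosis before advice": 141,
    "no gold standard diagnosis after advice": 101,
}

def percent(count, total=TOTAL_DECISION_SETS):
    """Express a count as a percentage of the total decision sets."""
    return 100 * count / total

for label, count in reported_counts.items():
    print(f"{label}: {count}/{TOTAL_DECISION_SETS} = {percent(count):.1f}%")

# Derived figure (not reported in the study): the drop in cases with
# no gold standard diagnosis once Isabel advice had been seen.
fewer_missed = (
    reported_counts["no gold standard diagnosis before advice"]
    - reported_counts["no gold standard diagnosis after advice"]
)
print(f"fewer cases with no gold standard diagnosis: {fewer_missed}")
```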
Results of this study were published in BMC Medical Informatics & Decision Making in May 2006. They have also been presented in abstract form at various conferences including the Royal College of Paediatrics and Child Health annual meeting 2003, UK.
Research question 4:
Does free text data entry into Isabel affect the quality of advice provided?
Due to the presence of a customised thesaurus that converts most medical abbreviations, slang and inaccurate terminology provided by users into appropriate medical terms, Isabel provides users the ability to express clinical features in free natural language text. At the back-end, Isabel’s powerful natural language processing software also extracts key ideas from medical text accurately. These facilities allow users to perform rapid searches. Studies of previous diagnostic systems suggested that users took 20-40 minutes to enter clinical data using a controlled vocabulary.
The time taken to enter clinical data into Isabel was measured in the previous study (question 3). Participants took a median time of 6 minutes [total time taken to read and analyze the case history, elicit and enter key clinical features, and enter their differential diagnosis, investigations and treatment before looking at the Isabel results] and a further minute to process the advice provided.
Since each case was assessed by an average of 30 users in the study, providing 30 different expressions of user variability in input per case (24 x 30 combinations), Isabel’s advice was examined for each combination. For each case, irrespective of the user, ‘gold standard’ diagnoses were displayed by Isabel consistently. This indicates that the concern regarding inconsistent advice resulting from the free text entry of data is unfounded.
Results from this study are being written up and are due for submission for peer-review.
Research question 5:
Is it possible to objectively measure changes in clinical decision-making quality invoked by decision-support tools?
Since most of Isabel’s effect is on clinical decision-making in an acute setting, it is vital that an objective and sensitive instrument is used to measure these changes. Due to the lack of such a tool in the literature, a study was undertaken to develop and validate a new metric. Using a combination of old scoring systems, and novel concepts, a reliable and valid score was developed. This tool would be useful to measure the impact of Isabel’s advice on clinical decisions in a real life study.
Time period: September 2001-January 2002
Academic advisers: Centre for Health Informatics (CHIME), London
Dr Jeremy Wyatt, Knowledge Management Centre (UCL), currently Associate Director, R&D, NICE, UK
Results were peer-reviewed and published in the Journal of the American Medical Informatics Association
Ramnarayan
P, Kapoor RR, Coren M, Nanduri V, Tomlinson AL, Taylor PM, Wyatt JC, Britto
JF. Measuring the impact of diagnostic decision support on the quality
of clinical decision making: development of a reliable and valid composite
score. J Am Med Inform Assoc. 2003 Nov-Dec;10(6):563-72. Epub 2003 Aug
04.
This article was also discussed by Eta Berner in an excellent editorial in the same issue, which details the difficulties traditionally associated with the evaluation of the impact of diagnostic aids.
Berner ES. Diagnostic decision support
systems: how to determine the gold standard? J Am Med Inform Assoc. 2003
Nov-Dec;10(6):608-10
Research question 6:
Does the Isabel pediatric DRS improve diagnostic decision-making
in the NHS?
Time period: July 2002-May 2003
Funding & Sponsor: Department of Health, Skipton House, London
Academic advisers: Centre for Health Informatics (CHIME), London
Dr Jeremy Wyatt, Knowledge Management Centre (UCL), currently Associate Director, R&D, NICE, UK
In this study, the impact of Isabel on decision-making
by junior doctors in 4 NHS pediatric units was measured in their natural
work environment. Juniors chose to use Isabel on cases
that they needed diagnostic advice on, rather than on all cases. Their decisions,
before Isabel advice was provided, and after, were recorded.
On more than 500 occasions, doctors attempted to access Isabel but abandoned the attempt because of slow connection speeds on local computers or NHS network problems.
The computer/doctors ratio averaged 1/10 in
the centers. These few computers were not dedicated to Isabel
use; they were also used for checking lab results, accessing patient
admin systems etc.
Doctors recorded complete data on 125 patients.
104 available medical records were examined in this study.
4 consultants independently examined the medical
records to provide ‘gold standard’ decisions for safe
and appropriate patient management. They were blinded to the doctors’
decisions (both before and after Isabel advice).
In 47 cases (45%), the doctors failed
to consider all ‘important’ diagnoses for the patient
(as judged by the panel), implying unsafe diagnostic assessment
in nearly half the cases.
In 14 out of these 47 cases, Isabel
advice prompted the doctor to include all ‘important’
diagnoses (28%), rendering the assessment safe. This implies that
Isabel produced a meaningful change in the quality of diagnostic
decision making in 14/104 (13.5%) cases.
In a further 5 cases, Isabel
advice contained the appropriate suggestions, but they were ignored
by the doctors. Thus, Isabel had the potential
to convert 19 unsafe diagnostic assessments to safe diagnostic workups
(17% of the total).
Isabel advice prompted the doctors to perform 6
‘important’ tests.
Median extra time taken by the doctors to process Isabel
advice was 1 min 38 sec.
Results of this study were published in BMC Medical Informatics & Decision Making in November 2006.
They have also been presented in abstract form at various conferences
including the Royal College of Pediatrics and Child Health annual meeting
2004, UK.
Research question 7:
Is the Isabel paediatric DRS useful in settings other than acute paediatrics?
The impact of the Isabel DRS on decision-making
in acute general pediatrics prompted an examination of the potential
utility of the system in a critical care setting. In this setting, where
diagnostic decision making was not a large component of practice, it was
unclear whether Isabel would be of any benefit.
Clinical data was collected from 5 pediatric
critical care units (3 in the USA, 2 in the UK) on patients admitted
in a 3 month period in 2003.
The admitting team’s initial diagnostic assessment was also
recorded.
Isabel advice based on the
patients’ clinical features was shown to a consultant intensive
care physician at each unit. They identified clinically important
diagnoses for patient management, some of which were present in
the advice provided by Isabel.
Overall, in 40% of the 206 cases, Isabel
suggested ‘clinically important’ diagnostic alternatives
that were missed by the admitting team.
Research question 8:
Is the Isabel DRS useful in primary care practice?
Although Isabel was primarily developed
for use in a secondary care setting, there has been considerable interest
in the use of the DRS in primary care. One study that assessed the potential
utility of Isabel in primary care, conducted by Dr Claire
Scott, examined 1000 consecutive cases of negligence claims.
151 pediatric cases were identified; 104 had been labeled ‘failure/delay in diagnosis’ by a medical expert working for the MPS, in an earlier assessment carried out separately from this study.
Clinical features of the child at each GP assessment
were gathered from expert reports, case précis written by
medico-legal advisers, or original claim letters.
These clinical features were entered into Isabel
DRS, and the results compared to the final patient outcome.
Complete data was available in 88 cases. The
average number of GP consultations was 5, and the average number
of GPs involved in each case was 3.
Isabel displayed the correct diagnosis (as judged from the patient’s outcome) in 69% of the cases. However, since in 17% of cases the GP was thought to have acted appropriately, Isabel could have altered the patient’s outcome in 52% of cases by suggesting the correct diagnosis.
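The 52% figure is a simple subtraction of the two reported percentages. The sketch below makes the step explicit; it assumes, as the wording of the summary appears to, that the cases in which the GP acted appropriately all fall within the cases where Isabel displayed the correct diagnosis. The function name is hypothetical.

```python
def potential_impact(displayed_correct_pct, gp_appropriate_pct):
    """Percentage of cases where a displayed correct diagnosis could
    have changed the outcome.

    Assumes the appropriately managed cases are a subset of the cases
    in which the correct diagnosis was displayed (an assumption made
    for this illustration, not a detail confirmed by the study).
    """
    if gp_appropriate_pct > displayed_correct_pct:
        raise ValueError("subset assumption cannot hold for these inputs")
    return displayed_correct_pct - gp_appropriate_pct

# Percentages reported for the 88 cases with complete data.
print(potential_impact(69, 17))  # prints 52
```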
This study was reported in the Medical Protection Society, UK Casebook Journal.
Research question 9:
Does the provision of handheld computers connected to the Internet increase accessibility to the Isabel DRS?
Primary researcher: Dr Richard Paget
Funding & Sponsor: The Mercers, London
It is widely accepted that the potential utility of decision support systems
is limited by their accessibility. Since the Isabel DRS
is served on the Internet, and Internet access via desktop computers is
scarce, this study examined if the provision of personal handheld computers
connected to the Internet using wireless technology would increase access
to the Isabel DRS.
Junior doctors at 4 NHS hospitals were provided
access to XDA devices that connected to the Internet using GPRS
technology after a control period.
During the initial control period of 2 months, Isabel access via standard desktop use was monitored at each centre. After introductory training on the XDAs and a ‘run-in’ period, Isabel DRS access via handheld computers was monitored for a further 2 months.
There was a 500% increase in access to the
Isabel DRS during the handheld period.