Studies on SRT Errors

Error Studies in Speech Recognition

Written by Laura Bryan, MT (ASCP), CHDS, AHDI-F

An exhaustive search of peer-reviewed literature published in the past 10 years reveals a paucity of research on the use of speech recognition technology to capture clinician dictation. The following are excerpts from studies of speech recognition technology used for medical record documentation.


1) Reports dictated with voice recognition took 50% longer to dictate despite being 24% shorter than those conventionally transcribed; 2) there were 5.1 errors per case, and 90% of all voice recognition dictations contained errors prior to report signoff while 10% of transcribed reports contained errors; 3) after signoff, 35% of VR reports still had errors. (Pezzullo et al., 2008)


Despite the frequent introduction of voice recognition (VR) into radiology departments, little evidence still exists about its impact on workflow, error rates and costs. …42% and 30% of the finalized VR reports for each of the radiologists investigated contained errors. (Strahan & Schneider-Kolsky, 2010)


Despite the potential to dominate radiology reporting, current speech recognition technology is thus far a weak and inconsistent alternative to traditional human transcription. This is attributable to poor accuracy rates, in spite of vendor claims, and the wasted resources that go into correcting erroneous reports. (Voll, Atkins, & Forster, 2008)


Automatic speech recognition technology has a high frequency of transcription errors, necessitating careful proofreading and report editing. … (22%) contained errors. Report error rates by individual radiologists ranged from 0% to 100%. … The most frequent types of errors were wrong-word substitution, nonsense phrases, and missing words. Fifty-five of 88 radiologists (63%) believed that overall error rates did not exceed 10%, and 67 of 88 radiologists (76%) believed that their own individual error rates did not exceed 10%… More than 20% of our reports contained potentially confusing errors… (Quint, Quint, & Myles, 2008)


…a consistent claim across the speech recognition industry is that SRTs can reduce costs due to faster turn-around times of medical documentation, higher efficiency, and increased accuracy. Little research exists on the impact of SRT technologies on the actual work of creating medical records. (David, Chand, & Sankaranarayanan, 2014)


Twenty respondents had been selected to test the system and the outcome from the experiment showed that speech recognition application has faster time compared to text writing. However, data captured through speech recognition once translated to health records is always inaccurate. It was noted that the human factors such as accent and tone in speaking affect the translation of speech recognition into medical records. (Abd Ghani & Dewi, 2012)


Our key recommendation from this study is that as the QA function is removed through the implementation of new technologies, more attention needs to be paid on the potential impacts of this decision, on the quality of the documentation produced. (David et al., 2014)


In the physician-as-editor model, it is assumed that the physician will find errors, edit the document, and do the proper formatting. There is evidence, however, that this assumption does not necessarily hold, and that physicians do not take the time to proof-read and edit their records. (David et al., 2014)


Furthermore, hospital administrators need to consider how to best maintain QA functions when the method of medical record production undergoes drastic transformation as when once-and-done production technologies are introduced. (David et al., 2014)


(UACs) fell into nine major categories (in order of decreasing frequency): 1) more/new work for clinicians; 2) unfavorable workflow issues; 3) never ending system demands; 4) problems related to paper persistence; 5) untoward changes in communication patterns and practices; 6) negative emotions; 7) generation of new kinds of errors; 8) unexpected changes in the power structure; and 9) overdependence on the technology. (Ash, Sittig, Dykstra, & Campbell, 2009)


The results demonstrated that on the average, on the order of 315,000 errors in one million dictations were surfaced. This shows that medical errors occur in dictation and quality assurance measures are needed in dealing with those errors… Anecdotal evidence points to the belief that records created directly by physicians alone will have fewer errors and thus be more accurate. This research demonstrates this is not necessarily the case when it comes to physician dictation. As a result, the place of quality assurance in the medical record production workflow needs to be carefully considered before implementing a “once-and-done” (i.e., physician-based) model of record creation. (David et al., 2014)


At least one major error was found in 23% of ASR reports, as opposed to 4% of conventional dictation transcription reports (p < 0.01). Major errors were more common in breast MRI reports (35% of ASR and 7% of conventional reports), the lowest error rates occurring in reports of interventional procedures (13% of ASR and 4% of conventional reports) and mammography reports (15% of ASR and no conventional reports) (p < 0.01). (Basma, Lord, Jacks, Rizk, & Scaranelo, 2012)


Errors were divided into two categories, significant but not likely to alter patient management and very significant with the meaning of the report affected, thus potentially affecting patient management (nonsense phrase). Three hundred seventy-nine finalized CR (plain film) reports and 631 non-CR (ultrasound, CT, MRI, nuclear, interventional) finalized reports were examined. Eleven percent of the reports in the CR group had errors. Two percent of these reports contained nonsense phrases. Thirty-six percent of the reports in the non-CR group had errors and out of these, 5% contained nonsense phrases. (Chang, Strahan, & Jolley, 2011)

"My reports -- and I try to be careful -- average seven errors per report, which go from punctuation to ludicrous," said Dr. Michael McNamara Jr. from Case Western Reserve University School of Medicine. "[Voice recognition software] inserts a 'no,' it drops a 'no' -- it's a very dangerous weapon and we have to use it very, very carefully," he said.



Note: To the author’s knowledge, this list represents the extent of peer-reviewed research papers published in the last 10 years studying the use of speech recognition in healthcare. 

Abd Ghani, M. K., & Dewi, I. N. (2012). Comparing speech recognition and text writing in recording patient health records. In 2012 IEEE-EMBS Conference on Biomedical Engineering and Sciences (pp. 365–370). IEEE. doi:10.1109/IECBES.2012.6498100

Ash, J. S., Sittig, D. F., Dykstra, R., & Campbell, E. (2009). The unintended consequences of computerized provider order entry: Findings from a mixed methods exploration. International Journal of Medical Informatics, 78(Suppl 1), 1–14. doi:10.1016/j.ijmedinf.2008.07.015

Basma, S., Lord, B., Jacks, L. M., Rizk, M., & Scaranelo, A. M. (2012). Error rates in breast imaging reports: Comparison of automatic speech recognition and dictation transcription. American Journal of Roentgenology, 197(4).

Chang, C. A., Strahan, R., & Jolley, D. (2011). Non-clinical errors using voice recognition dictation software for radiology reports: a retrospective audit. Journal of Digital Imaging, 24(4), 724–8. doi:10.1007/s10278-010-9344-z

David, G. C., Chand, D., & Sankaranarayanan, B. (2014). Error rates in physician dictation: quality assurance and medical record production. International Journal of Health Care Quality Assurance, 27(2), 99–110. doi:10.1108/IJHCQA-06-2012-0056

Pezzullo, J. A., Tung, G. A., Rogg, J. M., Davis, L. M., Brody, J. M., & Mayo-Smith, W. W. (2008). Voice recognition dictation: radiologist as transcriptionist. Journal of Digital Imaging, 21(4), 384–9. doi:10.1007/s10278-007-9039-2

Quint, L. E., Quint, D. J., & Myles, J. D. (2008). Frequency and spectrum of errors in final radiology reports generated with automatic speech recognition technology. Journal of the American College of Radiology: JACR, 5(12), 1196–9. doi:10.1016/j.jacr.2008.07.005

Strahan, R. H., & Schneider-Kolsky, M. E. (2010). Voice recognition versus transcriptionist: error rates and productivity in MRI reporting. Journal of Medical Imaging and Radiation Oncology, 54(5), 411–4. doi:10.1111/j.1754-9485.2010.02193.x

Voll, K., Atkins, S., & Forster, B. (2008). Improving the utility of speech recognition through error detection. Journal of Digital Imaging, 21(4), 371–7. doi:10.1007/s10278-007-9034-7

