Background: The lack of large cohorts of patients with rotator cuff tears is a barrier to clinical and genomic research.

Objective: To develop and validate an electronic medical record (EMR)–based algorithm to identify individuals with and without rotator cuff tear.

Design: We used a deidentified version of the EMR of more than 2 million subjects. A screening algorithm was applied to classify subjects into likely rotator cuff tear and likely normal rotator cuff groups. From these groups, 500 likely rotator cuff tear subjects and 500 likely normal rotator cuff subjects were randomly chosen for algorithm development. Chart review of all 1000 subjects confirmed the true phenotype (rotator cuff tear or normal rotator cuff) based on magnetic resonance imaging and operative reports. A phenotyping algorithm was then developed using logistic regression and subsequently validated.

Results: The variables significantly predicting rotator cuff tear included the number of times a Current Procedural Terminology code related to rotator cuff procedures was used (odds ratio [OR] = 3.3; 95% confidence interval [CI]: 1.6-6.8 for ≥3 vs 0), the number of times a term related to rotator cuff lesions occurred in radiology reports (OR = 2.2; 95% CI: 1.2-4.1 for ≥1 vs 0), and the number of times a term related to rotator cuff lesions occurred in physician notes (OR = 4.5; 95% CI: 2.2-9.1 for 1 or 2 times vs 0). This phenotyping algorithm had a specificity of 0.89 (95% CI: 0.79-0.95) for rotator cuff tear, an area under the curve (AUC) of 0.842, and positive and negative diagnostic likelihood ratios (DLR+ and DLR−) of 5.94 (95% CI: 3.07-11.48) and 0.363 (95% CI: 0.291-0.453), respectively.

Conclusion: Our informatics algorithm enables identification of cohorts of individuals with and without rotator cuff tear from an EMR-based data set with moderate accuracy.
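For readers less familiar with the diagnostic likelihood ratios reported in the Results, the following minimal Python sketch shows how DLR+ and DLR− follow from a chart-review confusion matrix using the standard definitions DLR+ = sensitivity / (1 − specificity) and DLR− = (1 − sensitivity) / specificity. The function name `summarize`, the `DiagnosticSummary` class, and the counts used in the example are hypothetical and chosen only for illustration; they are not the study's data, and this is not the authors' code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DiagnosticSummary:
    sensitivity: float
    specificity: float
    dlr_positive: float
    dlr_negative: float


def summarize(tp: int, fn: int, fp: int, tn: int) -> DiagnosticSummary:
    """Compute sensitivity, specificity, DLR+ and DLR- from confusion-matrix counts.

    tp/fn: chart-confirmed tears that the algorithm flagged / missed.
    fp/tn: chart-confirmed normal cuffs that the algorithm flagged / cleared.
    """
    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one confirmed case and one confirmed non-case")
    if fp == 0 or tn == 0:
        # DLR+ is undefined when specificity is 1; DLR- when specificity is 0.
        raise ValueError("likelihood ratios are undefined for specificity of 0 or 1")

    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return DiagnosticSummary(
        sensitivity=sensitivity,
        specificity=specificity,
        dlr_positive=sensitivity / (1 - specificity),
        dlr_negative=(1 - sensitivity) / specificity,
    )


# Hypothetical counts for illustration only (NOT taken from the study).
example = summarize(tp=60, fn=40, fp=10, tn=90)
print(f"sensitivity={example.sensitivity:.2f}, specificity={example.specificity:.2f}")
print(f"DLR+={example.dlr_positive:.2f}, DLR-={example.dlr_negative:.2f}")
```

A DLR+ well above 1 means a positive algorithm result raises the odds of a true tear, while a DLR− well below 1 means a negative result lowers them; this is the sense in which the reported values of 5.94 and 0.363 can be read.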
ASJC Scopus subject areas
- Physical Therapy, Sports Therapy and Rehabilitation
- Clinical Neurology