Abstract
Artificial intelligence technology is being widely developed in dermatology. However, comprehensive data on the diagnostic performance of artificial intelligence in skin cancer remain lacking. We aimed to evaluate the diagnostic accuracy of artificial intelligence in skin cancer detection. MEDLINE, Embase, Cochrane Library, Web of Science, and Scopus were searched from database inception to 9 April 2025. Studies were included if they exclusively assessed the diagnostic accuracy of artificial intelligence for primary cutaneous malignancies. Artificial intelligence performance in skin cancer diagnosis was evaluated using accuracy, area under the curve, sensitivity, and specificity. Twenty-eight systematic reviews and meta-analyses were included. Across the studies, reported sensitivity ranged from 83.7% to 94.4% for basal cell carcinoma, 57.0% to 90.1% for squamous cell carcinoma, and 48% to 100% for melanoma. Specificity ranged from 77.9% to 96% for basal cell carcinoma, 92.6% to 98% for squamous cell carcinoma, and 36% to 100% for melanoma. Area under the curve values extracted from the reviews varied widely, generally ranging from 0.61 to 0.99. Narrative comparisons within the included studies suggested that deep learning models frequently demonstrated diagnostic performance non-inferior or superior to that of human clinicians, although prospective validation in real-world clinical workflows remains limited. Current evidence suggests that artificial intelligence technologies have potential for skin cancer diagnosis, but with important limitations. Variability in diagnostic metrics, driven largely by data heterogeneity and differing validation strategies, poses significant challenges. Emerging evidence suggests that future research should transition toward multimodal artificial intelligence systems that integrate structured clinical metadata with image analysis. This transition will require methodological standardization and validation in real-world settings.
Citation
ID: 9544
Ref Key: jeon2026evaluating