The Digital Dermatoscope: AI’s Accuracy in Detecting Lesions in Skin of Colour

Artificial Intelligence (AI) is rapidly reshaping healthcare, from streamlining administrative workflows to improving clinical diagnostics. One of its most promising applications lies in dermatology, where AI models trained on skin lesion images can outperform general practitioners (GPs) in diagnostic accuracy for visually assessed conditions [1]. In the UK context, the need for such innovation becomes even more apparent [2].

Melanoma remains the fifth most common cancer in the UK, and approximately 100,000 non-melanoma skin cancers are diagnosed there every year [3, 4]. Over 1.2 million dermatology referrals occur annually, with approximately 60% flagged as urgent suspected cancer [2]. However, only about 6% of these referrals result in a cancer diagnosis, which points to severe over-triage that consumes specialist resources and contributes to delays in care [2]. This increasing demand places pressure on an already strained system and highlights the need for more efficient triage solutions.

The strain on dermatology services is part of a larger NHS capacity crisis, which AI could help alleviate. As of January 2025, 7.4 million NHS patients were awaiting treatment, with only 58.9% seen within the 18-week standard, far beneath the 92% NHS benchmark [2]. Dermatology is particularly affected because a large and growing patient base depends on a small, specialised workforce of senior doctors and nurses. GPs' specialist knowledge is also a concern: undergraduate training includes at most 6 days of dermatology teaching, even though 25% of GP appointments concern the skin [5]. Limited training, combined with the low number of consultants available to advise GPs, leads to delays in diagnosis with potentially life-threatening consequences [5].

Here, AI offers transformative potential. Machine learning algorithms, especially deep learning classifiers, show promise in identifying malignant lesions earlier, streamlining triage by prioritising high-risk cases and reducing unnecessary referrals. They can act as clinical decision aids, particularly for less experienced practitioners, supporting equitable and efficient care. Figure 1 shows how AI can contribute at several steps of the referral pathway: pre-primary care, primary care, triage and secondary care [6].

Figure 1: Illustration of how AI technologies can be integrated at various stages of the skin cancer diagnostic pathway. Figure from a study in 2025 by Jones et al. [6]

In addition to improving clinical workflow, AI could reduce healthcare costs. Cost analysis indicates that teledermatology saves £47.97 and 5 minutes per consultation compared with in-person evaluations [7]. Triage through AI could reduce unnecessary interventions, conserving both time and resources. Despite these advantages, a critical question persists: can these AI systems serve everyone equally, regardless of skin tone? Emerging research suggests the answer is a cautious “not yet” [8-11].

AI’s current blind spots risk entrenching long-standing disparities in dermatology care, particularly for people with skin of colour (SOC), meaning types IV-VI on the Fitzpatrick scale, a classification running from Type I (very fair) to Type VI (deeply pigmented dark brown or black skin) [12]. Addressing these inequalities is essential to ensuring that AI serves as a truly inclusive and effective healthcare tool. To fully harness AI’s potential, we must look at its historical development in dermatology and how these challenges arose.

A Brief History of Dermatology AI

AI entered dermatology in 1987, with early microcomputers assisting dermatopathology diagnosis [13]. Progress was initially slow due to limited labelled data and hardware constraints. Over time, artificial neural networks (ANNs) emerged to help distinguish benign from malignant lesions. Advances in deep learning, especially convolutional neural networks (CNNs), revolutionised AI’s ability to analyse complex image data [14]. By the 2000s, neural networks were being applied to conditions like melanoma, non-melanoma skin cancer, vitiligo, and atopic dermatitis, laying the foundation for today’s AI tools [13]. Understanding how these systems analyse skin is key to addressing the equity concerns.

AI dermatology tools primarily rely on visual data: clinical images, labelled by clinicians or confirmed via biopsy, from which they learn to identify patterns associated with malignancy [18]. AI frequently uses machine learning (ML), where algorithms detect patterns in labelled data to make predictions [19, 20]. Many ML systems are built on ANNs, computational models inspired by the structure of the human brain [20]. For image classification, CNNs are particularly effective. Classification begins by passing the image’s pixel data through many convolutional layers, which act like filters to detect specific patterns or features in the image, for example the blue-grey veil seen in melanomas (Figure 2). Successive layers build up a hierarchical understanding, with deeper layers detecting more complex patterns [19]. Recent developments have led to the creation of certified AI tools designed for dermatological diagnostics.

Figure 2: This diagram illustrates how Convolutional neural networks (CNNs) function in distinguishing a melanoma from a benign naevus. Figure from a 2020 study by Du‐Harpur et al. [19]
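
The filtering step described above can be sketched in a few lines of Python. This is a minimal illustration, not how any certified tool is implemented: the 4x4 greyscale patch and the hand-written edge filter are hypothetical values, whereas a trained CNN learns thousands of filter weights from labelled images.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like deep learning libraries, this does not flip the kernel, so strictly
    speaking it is a cross-correlation.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    feature_map = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        feature_map.append(row)
    return feature_map


def relu(feature_map):
    """Keep positive responses and set negative ones to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


# Hypothetical greyscale patch: dark left half (0), bright right half (9).
patch = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]

# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = relu(convolve2d(patch, vertical_edge))
for row in feature_map:
    print(row)  # the middle column holds the strongest response
```

The filter responds only where the patch changes from dark to bright. Stacking many such filters, each feeding the next layer, is what lets a CNN move from edges and colours to lesion-level features.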

The UK’s National Institute for Health and Care Excellence (NICE) has reviewed AI applications such as the Deep Ensemble for Recognition of Malignancy (DERM) and Moleanalyzer Pro [2]. DERM is a UKCA-certified Class IIa medical device that analyses dermoscopic images for malignancy [15, 16]. Moleanalyzer Pro uses CNNs to assess melanoma risk via clinical images, with or without dermoscopy [17]. These tools show promise, but broad implementation awaits further clinical validation, particularly across diverse patient populations.

Real Data, Real Disparities

Visible symptoms of skin disease often present differently depending on the patient’s skin tone, which affects AI performance. For instance, erythema may appear pink on lighter skin (types I-III) but deep purple or even invisible on darker tones (types IV-VI) [21]. Despite this, most publicly available datasets, for example the International Skin Imaging Collaboration (ISIC) archive, are heavily biased toward lighter skin types I-III [9]. This lack of diversity reduces model accuracy and generalisability across SOC, particularly types IV-VI [9]. To understand how these disparities translate into real-world outcomes, we must examine how models perform on representative datasets.

Studies testing AI models on more diverse datasets reveal stark performance gaps when applied to SOC. For example, the Diverse Dermatology Images (DDI) dataset was developed to assess the performance of AI models on SOC. When state-of-the-art models were tested on DDI, their performance dropped by 27-36% compared to benchmark results on lighter skin tones [10]. This substantial decline underscores how limited training diversity creates critical blind spots even in the most advanced algorithms.

Similarly, the DERM-003 tool, evaluated in a large European clinical trial, demonstrated high overall diagnostic accuracy. However, 96.9% of participants had lighter skin types, while only 2.2% had darker skin [16]. Notably, no cancers were detected in the darker-skinned subgroup, making any meaningful assessment of performance on SOC infeasible. This lack of demographic diversity severely limits the generalisability of the tool and highlights a common shortcoming in AI validation studies. Yet, even when datasets are balanced by skin tone, challenges persist in model performance.

Balancing datasets by skin tone does not necessarily resolve performance disparities in AI models. A 2024 study trained separate AI models on equal numbers of images for melanoma and basal cell carcinoma (BCC), stratified by skin tone [6]. Despite equivalent dataset sizes, the model trained on darker skin underperformed (AUROC = 0.500) compared to its counterpart trained on lighter skin (AUROC = 0.598). These findings suggest that even with balanced data, intrinsic challenges in disease presentation across skin tones may affect model efficacy, further complicating efforts toward equity in dermatologic AI.
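
For readers unfamiliar with AUROC: it is the probability that a model gives a randomly chosen malignant lesion a higher score than a randomly chosen benign one, so 1.0 is perfect ranking and 0.5 is no better than chance. The sketch below computes it separately for each skin tone group. The records are invented purely for illustration and are not data from any study cited here, and the function names are hypothetical.

```python
def auroc(labels, scores):
    """Fraction of (malignant, benign) pairs ranked correctly; ties count as half."""
    malignant = [s for y, s in zip(labels, scores) if y == 1]
    benign = [s for y, s in zip(labels, scores) if y == 0]
    if not malignant or not benign:
        raise ValueError("AUROC needs at least one malignant and one benign case")
    correct = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                correct += 1.0
            elif m == b:
                correct += 0.5
    return correct / (len(malignant) * len(benign))


def auroc_by_group(records):
    """Compute AUROC per group from (group, label, score) tuples."""
    grouped = {}
    for group, label, score in records:
        labels, scores = grouped.setdefault(group, ([], []))
        labels.append(label)
        scores.append(score)
    return {group: auroc(labels, scores) for group, (labels, scores) in grouped.items()}


# Invented records: (Fitzpatrick group, 1 = malignant / 0 = benign, model score).
records = [
    ("I-III", 1, 0.9), ("I-III", 1, 0.7), ("I-III", 0, 0.4), ("I-III", 0, 0.2),
    ("IV-VI", 1, 0.6), ("IV-VI", 1, 0.3), ("IV-VI", 0, 0.6), ("IV-VI", 0, 0.3),
]

results = auroc_by_group(records)
print(results)
```

Reporting the metric per group, rather than one pooled figure, is what exposes a gap like the one described above. The function also raises an error for a group with no malignant cases, because such a subgroup cannot be scored at all.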

Diversity needs to be incorporated, but SOC data are both scarce and variable in quality. The ISIC archive lacks adequate skin tone metadata [10]. The Fitzpatrick17k dataset includes diverse tones but relies on crowdsourced labels and lacks biopsy confirmation [9]. Furthermore, melanoma incidence is significantly lower in SOC populations (1.0 per 100,000 in Black Americans versus 23.5 in white Americans), making it harder to build balanced datasets. Still, Black melanoma patients have worse survival rates (73% vs. 88% for white patients) due to late detection [9]. This contrast, rarer disease but worse outcomes, demands AI that can detect early warning signs effectively across all skin tones, regardless of rarity. These differences in presentation across skin tones are vital and also need to be addressed in medical education.

Humans struggle too; it’s not just AI. Racial inequalities persist in dermatology: 47% of dermatologists felt their teaching was inadequate for diagnosing skin diseases in darker skin tones [22], and clinicians show a 6% drop in diagnostic accuracy when diagnosing conditions on dark skin versus light [11]. Dermatology textbooks include just 4-18% images of dark skin, leaving clinicians less prepared to recognise signs in SOC [24]. This disparity can lead to misdiagnosis; for example, psoriasis can present differently on different skin types. On type VI skin, psoriatic plaques appear dark brown with pink-white areas, whilst on type II skin, plaques present as classic bright red, erythematous lesions. On type IV skin, psoriasis can present as small, violaceous plaques with minimal scale, resembling lichen planus (shown in Figure 3), which can lead to incorrect management. Lighter skin is treated as the ‘norm’ and darker skin is often underrepresented. Addressing this disparity requires not only improving representation in textbooks and training materials but also confronting the structural racism that has historically shaped medical knowledge and practice. Enhancing diverse skin tone training for doctors and for AI is critical to achieving equitable care and reducing disparities in diagnosis and treatment. Fortunately, technical methods to mitigate AI bias are being actively explored.

Figure 3: Psoriatic lesions on multiple skin tones. Figure from a study in 2023 by Chatrath et al. [24]

Debiasing the Machine

Because bias in AI reflects bias in data, several technical strategies aim to reduce skin tone bias in AI models, though ethical and practical challenges remain. Constructive Latent Group Reweighting (CLGR) and contrastive learning teach models to “unlearn” skin tone during training. CLGR narrowed the performance gap between light and dark skin by 1-2%, and contrastive learning improved dark-skin classification performance by 2.2%, a small but meaningful improvement in clinical AI [9]. However, these techniques raise ethical questions: should models learn to ignore skin tone altogether, or be altered to account for it? What happens when these debiasing methods introduce new errors? The effectiveness of these solutions also hinges on accurate labelling of skin tone.

Accurately labelling skin tone in AI datasets is difficult, and labelling errors can reinforce algorithmic bias. Many studies use an automated metric called Individual Typology Angle (ITA) to classify images. ITA misclassifies by more than one Fitzpatrick type in 30% of cases [25]. Although dermatologist labelling may be more accurate, it is time-intensive and subjective. New automated labelling methods need better validation, especially for clinical use. Nevertheless, when diversity is incorporated correctly, AI can perform even better than human clinicians.
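
For context, ITA is calculated from an image’s colour values in the CIELAB colour space as ITA = arctan((L* − 50) / b*) × 180/π, where L* is lightness and b* is the yellow-blue component. The sketch below shows the calculation under stated assumptions: the cut-offs are the commonly quoted ITA skin colour bands, pairing each band with a single Fitzpatrick type is an approximation made here for illustration, and the L* and b* values are hypothetical rather than measured.

```python
import math

# Commonly quoted ITA cut-offs in degrees, from lightest to darkest.
# Pairing each band with one Fitzpatrick type is an assumption for illustration.
ITA_BANDS = [
    (55.0, "very light", "I"),
    (41.0, "light", "II"),
    (28.0, "intermediate", "III"),
    (10.0, "tan", "IV"),
    (-30.0, "brown", "V"),
]


def individual_typology_angle(l_star, b_star):
    """Return ITA in degrees from CIELAB lightness (L*) and yellow-blue (b*).

    atan2 gives the same result as arctan((L* - 50) / b*) when b* is positive,
    which is typical for skin, and avoids dividing by zero when b* is 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def classify_ita(ita_degrees):
    """Map an ITA value to a (skin colour band, approximate Fitzpatrick type) pair."""
    for lower_bound, band, fitzpatrick in ITA_BANDS:
        if ita_degrees > lower_bound:
            return band, fitzpatrick
    return "dark", "VI"


# Hypothetical average skin-pixel values from two images.
for l_star, b_star in [(70.0, 15.0), (35.0, 15.0)]:
    ita = individual_typology_angle(l_star, b_star)
    print(round(ita, 2), classify_ita(ita))
```

Because the angle depends only on L* and b*, anything that shifts those values in a photograph, such as lighting or camera settings, also shifts the resulting label, which may help explain why automated labels need validation.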

Despite these limitations, AI has been shown to be highly effective when trained on diverse data, matching or exceeding human diagnostic accuracy even in SOC. A 2024 study found an AI model achieved Top 1 diagnostic accuracy of 86.5% on 163 images from Fitzpatrick IV-VI patients, outperforming previously reported clinician accuracy of 44.3% [23, 26]. This shows that equitable AI is possible, but only if diversity is built into the training and testing pipeline.

AI holds immense potential to improve dermatology care, especially in overstretched health systems like the NHS. But as it stands, many AI tools work well for lighter skin and falter on darker tones. Changing this means acting on the issues discussed in this article.

First, we need to address the issues within AI datasets. We need to curate more biopsy-confirmed images of all skin tones and improve the accuracy of skin tone labelling [9, 25]. Secondly, advancing debiasing methods, such as emerging techniques like CLGR and contrastive learning, is vital. These methods show promise, but further development and ethical evaluation are needed to ensure they reduce disparities without introducing unintended new risks [9]. Finally, we must address structural inequalities in dermatology education. The underrepresentation of skin types IV-VI in textbooks and training contributes to gaps in clinician knowledge and perpetuates unconscious bias. Enhancing SOC education for both current and future clinicians is vital to delivering safe and equitable care [11, 22, 23]. In conditions such as melanoma, where early detection significantly impacts outcomes [9], equitable AI performance becomes not only a technical challenge, but a moral imperative.

References

1. Li Z, Koban KC, Schenck TL, Giunta RE, Li Q, Sun Y. Artificial intelligence in dermatology image analysis: current developments and future trends. Journal of clinical medicine. 2022 Nov 18;11(22):6826.

2. National Institute for Health and Care Excellence (NICE). Artificial intelligence (AI) technologies for assessing and triaging skin lesions within the urgent suspected skin cancer pathway: early value assessment [Internet]. NATIONAL INSTITUTE FOR HEALTH AND CARE EXCELLENCE; 2024 Aug [cited 2025 Jul 17] p. 1–17. Available from: https://www.nice.org.uk/guidance/hte24/documents/accessible-version

3. National Institute for Health and Care Excellence (NICE). How common is melanoma? [Internet]. National Institute for Health and Care Excellence (NICE). 2022 [cited 2025 Jul 18]. Available from: https://cks.nice.org.uk/topics/melanoma/background-information/prevalence/

4. National Institute for Health and Care Excellence (NICE). Skin cancers – recognition and referral: How common is it? [Internet]. National Institute for Health and Care Excellence (NICE). 2025 [cited 2025 Jul 18]. Available from: https://cks.nice.org.uk/topics/skin-cancers-recognition-referral/background-information/prevalence/

5. Eedy D. Dermatology: a specialty in crisis. Clinical medicine. 2015 Dec;15(6):509.

6. Jones OT, Matin RN, Walter FM. Using artificial intelligence technologies to improve skin cancer detection in primary care. The Lancet Digital Health. 2025 Jan 1;7(1):e8-10.

7. Marsden H, Kemos P, Venzi M, Noy M, Maheswaran S, Francis N, Hyde C, Mullarkey D, Kalsi D, Thomas L. Accuracy of an artificial intelligence as a medical device as part of a UK-based skin cancer teledermatology service. Frontiers in Medicine. 2024 Mar 22;11:1302363.

8. Liu Y, Primiero CA, Kulkarni V, Soyer HP, Betz-Stablein B. Artificial intelligence for the classification of pigmented skin lesions in populations with skin of color: a systematic review. Dermatology. 2023 Aug 3;239(4):499-513.

9. Bevan PJ, Atapour-Abarghouei A. Detecting melanoma fairly: Skin tone detection and debiasing for skin lesion classification. In: MICCAI Workshop on Domain Adaptation and Representation Transfer 2022 Sep 15 (pp. 1-11). Cham: Springer Nature Switzerland.

10. Daneshjou R, Vodrahalli K, Novoa RA, Jenkins M, Liang W, Rotemberg V, Ko J, Swetter SM, Bailey EE, Gevaert O, Mukherjee P. Disparities in dermatology AI performance on a diverse, curated clinical image set. Science advances. 2022 Aug 12;8(31):eabq6147.

11. Diao JA, Adamson AS. Representation and misdiagnosis of dark skin in a large-scale visual diagnostic challenge. Journal of the American Academy of Dermatology. 2022 Apr 1;86(4):950-1.

12. Cohen PR, DiMarco MA, Geller RL, Darrisaw LA, Geller R, Darrisaw L. Colorimetric scale for skin of color: a practical classification scale for the clinical assessment, dermatology management, and forensic evaluation of individuals with skin of color. Cureus. 2023 Nov 1;15(11).

13. Rundle CW, Hollingsworth P, Dellavalle RP. Artificial intelligence in dermatology. Clinics in Dermatology. 2021 Jul 1;39(4):657-66.

14. Li CX, Shen CB, Xue K, Shen X, Jing Y, Wang ZY, Xu F, Meng RS, Yu JB, Cui Y. Artificial intelligence in dermatology: past, present, and future. Chinese medical journal. 2019 Sep 5;132(17):2017-20.

15. Thomas L, Hyde C, Mullarkey D, Greenhalgh J, Kalsi D, Ko J. Real-world post-deployment performance of a novel machine learning-based digital health technology for skin lesion assessment and suggestions for post-market surveillance. Frontiers in Medicine. 2023 Oct 31;10:1264846.

16. Marsden H, Morgan C, Austin S, DeGiovanni C, Venzi M, Kemos P, Greenhalgh J, Mullarkey D, Palamaras I. Effectiveness of an image analyzing AI-based Digital Health Technology to identify Non-Melanoma Skin Cancer and other skin lesions: Results of the DERM-003 study. Frontiers in Medicine. 2023 Oct 6;10:1288521.

17. Yazdanparast T, Shamsipour M, Ayatollahi A, Delavar S, Ahmadi M, Samadi A, Firooz A. Comparison of the Diagnostic Accuracy of Teledermoscopy, Face-to-Face Examinations and Artificial Intelligence in the Diagnosis of Melanoma. Indian journal of dermatology. 2024 Jul 1;69(4):296-300.

18. Li Z, Koban KC, Schenck TL, Giunta RE, Li Q, Sun Y. Artificial intelligence in dermatology image analysis: current developments and future trends. Journal of clinical medicine. 2022 Nov 18;11(22):6826.

19. Du‐Harpur X, Watt FM, Luscombe NM, Lynch MD. What is AI? Applications of artificial intelligence to dermatology. British Journal of Dermatology. 2020 Sep 1;183(3):423-30.

20. De A, Sarda A, Gupta S, Das S. Use of artificial intelligence in dermatology. Indian journal of dermatology. 2020 Sep 1;65(5):352-7.

21. Finlay AY, Griffiths TW, Belmo S, Chowdhury MM. Why we should abandon the misused descriptor ‘erythema’. The British Journal of Dermatology. 2021 Sep 30;185(6):1240.

22. Lester JC, Taylor SC, Chren MM. Under‐representation of skin of colour in dermatology images: not just an educational issue. British Journal of Dermatology. 2019 Jun 1;180(6):1521-2.

23. Gangal A, Stoff B. Costly savings: An ethical analysis of copay assistance programs in dermatology. Journal of the American Academy of Dermatology. 2021 Dec 1;85(6):1665-6.

24. Chatrath S, Bradley L, Kentosh J. Dermatologic conditions in skin of color compared to white patients: similarities, differences, and special considerations. Archives of Dermatological Research. 2023 Jul;315(5):1089-97.

25. Groh M, Harris C, Soenksen L, Lau F, Han R, Kim A, Koochek A, Badri O. Evaluating deep neural networks trained on clinical images in dermatology with the Fitzpatrick 17k dataset. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition 2021 (pp. 1820-1828).

26. Schneider JG, Mamelak AJ, Tejani I, Jarmain T, Moy RL. Diagnosis of Skin Disease in Moderately to Highly Pigmented Skin by Artificial Intelligence. Journal of Drugs in Dermatology: JDD. 2023 Jul 1;22(7):647-52.

Jasmin Luo

Third Year Medical Student
University of Sunderland Medical School
