Wiley, Journal Paper, 2026

OpthaNet: Attention-Integrated Architecture for High-Precision Multi-Class Ophthalmic Image Classification

Abstract

This study investigated the efficacy of pre-trained deep learning models for multi-class classification of eye diseases, namely cataract, diabetic retinopathy, and glaucoma, using fundus images. Although CNN and transformer-based models have each been explored extensively in ophthalmic diagnostics, direct comparative analyses remain limited. Moreover, recent high-performing systems frequently rely on heavy backbones, ensembles, or large-scale domain pretraining, which can be impractical for resource-constrained screening pipelines. We evaluated three models (EfficientNetB3, MobileNetV2, and Vision Transformer), each with tailored modifications. EfficientNetB3 and MobileNetV2 were augmented with an attention-enhanced feature refinement module and the custom OpthaHead classifier, while the Vision Transformer was optimized through META customization. The proposed design explicitly targets two practical bottlenecks observed in ophthalmic transfer learning: insufficient feature selectivity for subtle lesions and structural regions, and overfitting or instability in the final decision layers when training data are limited. The optimized EfficientNetB3 reached 96.04% accuracy, a 10.84% improvement over its baseline, and MobileNetV2 improved by 11.26%, balancing accuracy and computational efficiency. META customization boosted Vision Transformer performance by over 18%, indicating that reducing model complexity benefits transformers on limited medical data. This study demonstrates strong performance for AI-driven eye disease classification and highlights the potential of AI tools for early detection, improving clinical decision-making and patient outcomes.
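
The abstract does not specify which attention mechanism the feature refinement module uses. As a minimal illustration of the general idea, the sketch below assumes a squeeze-and-excitation style channel attention, written without any framework so the arithmetic stays visible. The function names, toy feature map, and hand-picked weights are all hypothetical and are not taken from the paper; a real module would learn its weights and operate on backbone feature maps.

```python
import math

def global_avg_pool(feature_map):
    """Squeeze step: average each channel's H x W grid to one scalar."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

def dense(vector, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]

def relu(vector):
    return [max(0.0, v) for v in vector]

def sigmoid(vector):
    return [1.0 / (1.0 + math.exp(-v)) for v in vector]

def channel_attention(feature_map, w1, b1, w2, b2):
    """Rescale each channel by a gate in (0, 1) derived from channel averages."""
    squeezed = global_avg_pool(feature_map)   # one value per channel
    hidden = relu(dense(squeezed, w1, b1))    # bottleneck layer
    gates = sigmoid(dense(hidden, w2, b2))    # one gate per channel
    refined = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return refined, gates

# Toy input (hypothetical): 2 channels of 2 x 2 activations.
features = [
    [[1.0, 1.0], [1.0, 1.0]],
    [[4.0, 4.0], [4.0, 4.0]],
]
# Hypothetical hand-picked weights; a trained module would learn these.
w1, b1 = [[-1.0, 1.0]], [0.0]           # 2 channels -> 1 hidden unit
w2, b2 = [[1.0], [-1.0]], [0.0, 0.0]    # 1 hidden unit -> 2 channel gates

refined, gates = channel_attention(features, w1, b1, w2, b2)
```

With these toy weights the first channel receives a gate close to 1 and the second a gate close to 0, which is the selectivity effect such a module is meant to provide: informative channels are kept while others are suppressed before the classifier head.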

Key Achievements

Optimized EfficientNetB3 achieved 96.04% accuracy
EfficientNetB3 improved by 10.84% over its baseline
MobileNetV2 improved by 11.26% while maintaining computational efficiency
META customization boosted Vision Transformer performance by over 18%
Classifies cataract, diabetic retinopathy, and glaucoma from fundus images
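
The figures above do not say whether "improved by 10.84%" means percentage points or a relative gain. The short sketch below works out the baseline accuracy each reading would imply for EfficientNetB3. The derived baselines are illustrations under those two assumptions, not values reported in the paper, and the function name is hypothetical.

```python
def implied_baseline(final_acc, improvement, relative):
    """Back out the baseline accuracy (in %) implied by a reported improvement.

    relative=False: improvement is in percentage points (final = base + gain).
    relative=True:  improvement is a relative gain (final = base * (1 + gain/100)).
    """
    if relative:
        return final_acc / (1.0 + improvement / 100.0)
    return final_acc - improvement

final_acc = 96.04   # reported accuracy of the optimized EfficientNetB3 (%)
gain = 10.84        # reported improvement over its baseline (%)

baseline_if_points = implied_baseline(final_acc, gain, relative=False)
baseline_if_relative = implied_baseline(final_acc, gain, relative=True)

print(f"percentage-point reading: baseline of about {baseline_if_points:.2f}%")
print(f"relative reading:         baseline of about {baseline_if_relative:.2f}%")
```

The two readings differ by roughly 1.4 percentage points of baseline accuracy, so the full paper's results tables should be consulted to confirm which one applies.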

Topics

Deep Learning, Eye Disease, Diabetic Retinopathy, Glaucoma, EfficientNet, Vision Transformer