Evaluating clinical diversity and plausibility of synthetic capsule endoscopic images

Anuja Vats; Marius Pedersen; Ahmed Mohammed; Øistein Hovde

TL;DR

The paper tackles the scarcity of diverse, annotated wireless capsule endoscopy (WCE) image atlases for training and education. It introduces a StyleGAN2-based pipeline that learns clinically meaningful attributes from unlabeled multi-institution data and enables controllable image generation and synthetic disease progression. Through three expert-driven subjective tasks, the study shows that synthetic WCE images closely resemble real ones and that the generated disease progressions are plausibly monotonic, although expert judgments remain somewhat subjective. The resulting synthetic atlas offers a privacy-preserving, broadly accessible resource for training, evaluation, and domain adaptation in WCE-related AI and medical education.

Abstract

Wireless Capsule Endoscopy (WCE) is increasingly used as an alternative imaging modality for complete and non-invasive screening of the gastrointestinal tract. Although this is advantageous in reducing unnecessary hospital admissions, it also demands that a WCE diagnostic protocol be in place so that larger populations can be screened effectively. This calls for training and education protocols attuned specifically to this modality. As with training in other modalities such as traditional endoscopy, CT, and MRI, a WCE training protocol would require an atlas comprising a large corpus of images that vividly depict pathologies and abnormalities, ideally observed over a period of time. Since such comprehensive atlases are presently lacking in WCE, in this work we propose a deep learning method that uses already available studies across different institutions to create a realistic WCE atlas using StyleGAN. We identify clinically relevant attributes in WCE such that synthetic images can be generated with selected attributes on cue. Beyond this, we also simulate several disease progression scenarios. The generated images are evaluated for realism and plausibility through three subjective online experiments with the participation of eight gastroenterology experts from three geographical locations and with varying years of experience. The results from the experiments indicate that the images are highly realistic and the disease scenarios plausible. The images comprising the atlas are publicly available for use in training applications as well as for supplementing real datasets for deep learning.

Paper Structure

This paper contains 8 sections, 1 equation, 10 figures, and 3 tables.

Related Work
Methodology
Task 1: Visual Turing Test
Task 2: Ranking realism
Task 3: Synthetic Disease Progression
Results and Analysis
Applications
Conclusion

Figures (10)

Figure 1: Synthetic images from the atlas with the following variations: (a) Vascular: variations in the vascular pattern underneath the mucosa, used as an indicator of tissue health. (b) Abnormal: variations that are pathological in nature, such as the development of inflammation, ulcer, edema, etc. (c) Debris: variations that simulate various levels and types of occlusions expected in WCE images. (d) View and Rotation: variations simulating different viewpoints as well as free rotation of the capsule as it traverses the tract. (e) Anatomical: variations relating to peristalsis as well as those arising from different parts of the tract. (f) WCE modality: variations that reflect changes from one capsule modality to another (such as changes in organ colors, illumination, etc.).
Figure 2: The figure illustrates the setup used for subjective online evaluation. (a.) Screen as observed during the visual Turing test for each image. (b.) Screen as observed in the ranking realism task. (c.) Screen as seen in synthetic disease progression task. Please zoom in for clarity, best viewed in color.
Figure 3: Krippendorff agreement coefficients among eight users for visual Turing test. Only the upper triangular portion of this symmetric matrix is presented for clarity.
Figure 4: Distribution of the difficulty/easiness of the Turing test as rated by users. Although individual opinions on the difficulty level of images vary between users (for example, UserIDs 2 and 6), the overall distribution of the scores averages out, with a slightly longer tail towards easy than difficult.
Figure 5: Images that were labeled wrongly by an overwhelming majority of the experts. All experts labeled image (a.) as real, and 7/8 experts labeled images (b.) and (c.) as real, although they are actually generated. Similarly, all experts labeled image (d.) as generated, and 7/8 experts labeled images (e.) and (f.) as generated, although they are actually real.
...and 5 more figures
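
The pairwise agreement matrix described in the Figure 3 caption can be illustrated with a minimal sketch of Krippendorff's alpha. The sketch assumes nominal labels (for example "real" versus "generated") and one label per expert per image; this listing does not state which variant of the coefficient the authors used, and the function names `krippendorff_alpha_nominal` and `pairwise_alpha` are hypothetical.

```python
from collections import Counter
from itertools import combinations, permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: list of lists; each inner list holds the labels that the
    raters gave to one image (None marks a missing rating).
    """
    coincidences = Counter()
    for unit in units:
        values = [v for v in unit if v is not None]
        m = len(values)
        if m < 2:
            continue  # an image rated by fewer than two raters is not pairable
        # Every ordered pair of ratings within an image adds 1/(m-1).
        for i, j in permutations(range(m), 2):
            coincidences[(values[i], values[j])] += 1.0 / (m - 1)

    totals = Counter()
    for (c, _k), weight in coincidences.items():
        totals[c] += weight
    n = sum(totals.values())

    observed = sum(w for (c, k), w in coincidences.items() if c != k)
    expected = sum(
        totals[c] * totals[k] for c in totals for k in totals if c != k
    ) / (n - 1)
    if expected == 0:
        return float("nan")  # undefined when only one label ever occurs
    return 1.0 - observed / expected


def pairwise_alpha(ratings_by_user):
    """Upper-triangular agreement: alpha for every pair of users.

    ratings_by_user: dict mapping a user id to a list of labels,
    aligned so that index i refers to the same image for all users.
    """
    result = {}
    for a, b in combinations(sorted(ratings_by_user), 2):
        units = [list(pair) for pair in zip(ratings_by_user[a], ratings_by_user[b])]
        result[(a, b)] = krippendorff_alpha_nominal(units)
    return result
```

Calling `pairwise_alpha` on a dictionary with eight user ids would yield the 28 pairs that make up the upper triangle of a symmetric 8 x 8 matrix. The labels in any such call are illustrative inputs, not data from the study.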
