This post introduces the paper “A Morphology Focused Diffusion Probabilistic Model for Synthesis of Histopathology Images” by Moghadam et al.1

(Figures and tables in this post are from the paper1)

Overview of This Paper

  • The authors1 applied a diffusion probabilistic model to histopathology images to produce high-quality synthetic images.
  • The goals of this paper:
    • To explore the utility of the diffusion probabilistic model for synthesizing histopathology images.
    • To compare the performance of the diffusion model with other generative models.
  • According to the authors, this is the first study to apply diffusion models to histopathology images.
  • The authors applied the U-Net-based diffusion model architecture proposed by Dhariwal et al.2 to the histopathology image synthesis task.
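As background, the forward (noising) process of a denoising diffusion probabilistic model3 can be sketched in a few lines. The closed-form sampling of x_t from x_0 is standard DDPM; the schedule values below are common defaults, not necessarily the paper's settings:

```python
import numpy as np

# DDPM forward process: x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,
# with eps ~ N(0, I). Schedule constants here are illustrative defaults.

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Linear variance schedule beta_1 .. beta_T."""
    return np.linspace(beta_start, beta_end, T)

def forward_noise(x0, t, alpha_bar, rng):
    """Sample x_t from q(x_t | x_0) in closed form."""
    eps = rng.standard_normal(x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

T = 1000
betas = linear_beta_schedule(T)
alpha_bar = np.cumprod(1.0 - betas)  # cumulative product of (1 - beta_t)

rng = np.random.default_rng(0)
x0 = rng.standard_normal((128, 128, 3))  # a toy "image" patch
xt, eps = forward_noise(x0, t=T - 1, alpha_bar=alpha_bar, rng=rng)
# At the final step alpha_bar is tiny, so x_t is close to pure Gaussian noise.
print(alpha_bar[-1], xt.shape)
```

The reverse process, which the U-Net learns, runs this chain backwards by predicting the noise eps at each step.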

img

Which Points Are Better Than Others?

  • Performance:
    • The proposed diffusion model outperforms ProGAN on all metrics (IS, FID, sFID, Improved Precision and Recall).
    • Images synthesized by the diffusion model show features of specific cell types.
  • Experiments:
    • Pathologists participated in the experiment to identify real images and synthetic images generated by the diffusion model.

Proposed Architecture

  • The backbone of the proposed method is a neural network similar to the U-Net-based model improved by Dhariwal et al.2
    • That model2 is inspired by the U-Net used by Ho et al.3 for diffusion models.
    • The model operates at three image resolutions, which allows it to capture both local and global features in images.
  • To reduce artifacts such as checkerboard patterns or aliasing in the synthesized images, it uses BigGAN downsampling/upsampling residual blocks4.
    • These kinds of artifacts are especially critical noise for histopathology images.
  • An embedding layer is used to inject the timestep into the diffusion network.
    • Every timestep can access the rest of the model's weights.
  • Genotypes of the images are used for learning via a separate embedding layer.
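Timestep injection in DDPM-style U-Nets is commonly implemented with a sinusoidal embedding that is then fed to every residual block. A minimal sketch of that embedding (the dimension and the scaling constant 10000 are conventional choices, not confirmed from the paper):

```python
import numpy as np

# Sinusoidal timestep embedding, as used in DDPM-style U-Nets. The integer
# timestep t is mapped to a dense vector so the network can condition on it.

def timestep_embedding(t, dim):
    """Map an integer timestep t to a `dim`-dimensional sin/cos vector."""
    half = dim // 2
    # Geometrically spaced frequencies from 1 down to ~1/10000.
    freqs = np.exp(-np.log(10000.0) * np.arange(half) / half)
    args = t * freqs
    return np.concatenate([np.sin(args), np.cos(args)])

emb = timestep_embedding(t=500, dim=128)
print(emb.shape)  # a 128-dimensional vector
```

In a full model, this vector typically passes through a small MLP and is added to each residual block's activations, which is how "all the timesteps can access the rest of the model's weights": the weights are shared across timesteps and only this embedding distinguishes them.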

Performance Evaluation

  • Two networks are used to generate synthetic histopathology images.
    • Proposed method based on the proposed diffusion model1.
    • Related work’s network5 utilized ProGAN6.
      • Authors slightly modified ProGAN by referring to cGANs proposed by Miyato et al.7.
  • Two pathologists rated the quality of the images synthesized by the two networks.

Dataset

  • The dataset is from The Cancer Genome Atlas (TCGA) archive8.
    • 344 whole slide images (WSIs) of low-grade glioma, representing its three major genomic subtypes.
    • Each image is approximately 100K x 100K pixels.
  • Annotations:
    • Each slide image is pixel-wise annotated by a professional pathologist using the authors' annotation tool. These annotations will be publicly available.
  • Patches:
    • A maximum of 100 tumor patches of size 512x512 pixels are collected from each slide.
    • They are downscaled to 128x128.
    • In total, there are 33,777 patches (128x128).
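A quick sanity check of these patch counts, plus an illustrative 512x512 → 128x128 downscale. The paper does not specify its resampling method, so the 4x4 block averaging below is an assumption:

```python
import numpy as np

# Patch budget: at most 100 patches per slide across 344 slides.
n_slides = 344
max_patches_per_slide = 100
total_reported = 33777

upper_bound = n_slides * max_patches_per_slide  # 34,400
print(total_reported <= upper_bound)            # some slides yield fewer than 100

# Illustrative 512x512 -> 128x128 downscale via 4x4 block averaging
# (an assumption; the paper does not state the resampling method).
patch = np.random.default_rng(0).random((512, 512, 3))
small = patch.reshape(128, 4, 128, 4, 3).mean(axis=(1, 3))
print(small.shape)
```

The reported total (33,777) sitting just under the 34,400 upper bound is consistent with most slides contributing the full 100 patches.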

Results

Quality of synthesized images

  • Subjective evaluation:
    • The authors state that the images synthesized by the proposed method are of higher quality than ProGAN's, as shown in Fig. 5.
  • Inception Score (IS), Frechet Inception Distance (FID), and sFID
    • The diffusion model shows better Inception Score and sFID values in Table 2.
    • The authors mention that the proposed diffusion method outperforms ProGAN across all three metrics.
    • They also note that the diffusion model shows lower values on FID and sFID, indicating that it generates perceptual features more robustly than ProGAN.

img

img

  • Improved Precision and Recall Metrics9
    • Improved Recall: the percentage of real data features that fall within the manifold of synthesized data features.
    • Improved Precision: the ratio of synthesized data features that fall within the manifold of real data features.
    • The proposed diffusion model performed better than ProGAN on both metrics.
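The improved precision/recall metrics9 estimate each feature manifold with k-nearest-neighbour hyperspheres: a point is "covered" if it lies inside any sphere whose radius is the distance from a reference point to its k-th nearest neighbour. A toy sketch on random features (k and the feature dimension here are illustrative, not the paper's settings):

```python
import numpy as np

def knn_radii(feats, k=3):
    """Distance from each point to its k-th nearest neighbour."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    d_sorted = np.sort(d, axis=1)
    return d_sorted[:, k]  # column 0 is the point itself (distance 0)

def coverage(query, ref, radii):
    """Fraction of query points inside the manifold estimated around ref."""
    d = np.linalg.norm(query[:, None, :] - ref[None, :, :], axis=-1)
    return np.mean((d <= radii[None, :]).any(axis=1))

rng = np.random.default_rng(0)
real = rng.normal(0.0, 1.0, size=(200, 16))
fake = rng.normal(0.0, 1.0, size=(200, 16))  # a well-matched "generator"

precision = coverage(fake, real, knn_radii(real))  # fake inside real manifold
recall = coverage(real, fake, knn_radii(fake))     # real inside fake manifold
print(precision, recall)
```

In the real metric, `real` and `fake` would be Inception feature vectors of real and synthesized patches rather than random draws.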

img

img

Rating by pathologists

  • The authors selected an equal number of real and synthesized images.
  • Two pathologists were shown the images and asked two questions about each image:
    1. Do you think this image is real or fake?
    2. How confident are you in your answer?
  • The authors concluded that the images synthesized by their proposed diffusion model look extremely similar to real images:
    • The pathologists could not distinguish real images from synthesized ones.
    • They were less confident when judging synthetic images.

Visual Observation

  • The diffusion model can generate images that have features of specific cell types, while ProGAN's images have unclear features. This suggests that the diffusion model is capable of learning the specific features of each cell type.

What I Learned

  • Merits of diffusion models:
    • Stable training, easy model scaling, and good distribution coverage.
    • Higher stability:
      • Denoising steps and strong conditioning on the input images smooth the data distribution.
    • More diverse images:
      • With better distribution coverage.
    • Less overfitting10.
  • Demerits of GAN:
    • According to this paper, mode collapse and instabilities:
      • GANs directly produce images from complex latent spaces in a single shot.
      • Their discriminator easily overfits10.
    • These issues make the network unsuitable for generating samples from rare conditions or imbalanced datasets.
  • Inception score (IS)11:
    • IS is defined using the Kullback-Leibler (KL) divergence to measure the difference between two probability distributions.
    • A paper12 mentions that IS may not be a suitable metric for generative models trained on datasets other than ImageNet.
  • Frechet inception distance (FID)13:
    • FID is a metric to compare the distribution of training dataset with the one of synthesized data.
    • Low FID values mean that these distributions are similar.
    • FID utilizes Inception-V3 latent space.
    • Both datasets are fed into the Inception-V3 model, and the mean and covariance of the “pool_3” layer activations are used to calculate the FID.
  • sFID14:
    • A modified version of FID.
    • FID has less sensitivity to spatial heterogeneity because it uses the “pool_3” layer, which compresses spatial information.
    • On the other hand, sFID employs the initial channels from an intermediate layer, meaning it can capture spatial similarity better than FID in some situations15.
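Putting the FID formula into code: FID is the Fréchet distance between two Gaussians fitted to the feature statistics, FID = ||mu1 - mu2||² + Tr(S1 + S2 - 2(S1·S2)^(1/2)). The toy random features below stand in for real Inception-V3 "pool_3" activations:

```python
import numpy as np

def psd_sqrt(mat):
    """Matrix square root of a symmetric PSD matrix via eigendecomposition."""
    vals, vecs = np.linalg.eigh(mat)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return vecs @ np.diag(np.sqrt(vals)) @ vecs.T

def fid(feat1, feat2):
    """Frechet distance between Gaussians fitted to two feature sets."""
    mu1, mu2 = feat1.mean(0), feat2.mean(0)
    s1 = np.cov(feat1, rowvar=False)
    s2 = np.cov(feat2, rowvar=False)
    # Tr((S1 S2)^{1/2}) via the symmetric form (S1^{1/2} S2 S1^{1/2})^{1/2}.
    r1 = psd_sqrt(s1)
    covmean = psd_sqrt(r1 @ s2 @ r1)
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2.0 * covmean))

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=(500, 8))
b = rng.normal(0.0, 1.0, size=(500, 8))  # same distribution -> small FID
c = rng.normal(3.0, 1.0, size=(500, 8))  # shifted distribution -> large FID
print(fid(a, b) < fid(a, c))  # matched distributions score lower (better)
```

This illustrates why lower FID means the distributions are more similar: both the mean-shift term and the covariance term vanish when the two feature distributions coincide.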

  1. Puria Azadi Moghadam, Sanne Van Dalen, Karina C. Martin, Jochen Lennerz, Stephen Yip, Hossein Farahani and Ali Bashashati. A Morphology Focused Diffusion Probabilistic Model for Synthesis of Histopathology Images. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 1999-2008, 2023. doi.org/10.1109/WACV56688.2023.00204 or arXiv:2209.13167↩︎ ↩︎ ↩︎ ↩︎

  2. Prafulla Dhariwal and Alex Nichol. Diffusion Models Beat GANs on Image Synthesis. arXiv preprint arxiv:2105.05233↩︎ ↩︎ ↩︎

  3. Jonathan Ho, Ajay Jain and Pieter Abbeel. Denoising Diffusion Probabilistic Models. arXiv preprint arxiv:2006.11239↩︎

  4. Andrew Brock, Jeff Donahue and Karen Simonyan. Large Scale GAN Training for High Fidelity Natural Image Synthesis. arXiv preprint arxiv:1809.11096↩︎

  5. Adrian B Levine, Jason Peng, David Farnell, Mitchell Nursey, Yiping Wang, Julia R Naso, Hezhen Ren, Hossein Farahani, Colin Chen, Derek Chiu, et al. Synthesis of diagnostic quality cancer pathology images by generative adversarial networks. The Journal of pathology, 252(2):178–188, 2020. doi.org/10.1002/path.5509↩︎

  6. Tero Karras, Timo Aila, Samuli Laine and Jaakko Lehtinen. Progressive Growing of GANs for Improved Quality, Stability, and Variation. arXiv preprint arxiv:1710.10196↩︎

  7. Takeru Miyato and Masanori Koyama. cGANs with Projection Discriminator. arXiv preprint arxiv:1802.05637↩︎

  8. Robert L. Grossman, Allison P. Heath, Vincent Ferretti, Harold E. Varmus, Douglas R. Lowy, Warren A. Kibbe, and Louis M. Staudt. Toward a shared vision for cancer genomic data. New England Journal of Medicine, 375(12):1109– 1112, 2016. doi.org/10.1056/NEJMp1607591↩︎

  9. Tuomas Kynkäänniemi, Tero Karras, Samuli Laine, Jaakko Lehtinen, and Timo Aila. Improved precision and recall metric for assessing generative models. Advances in Neural Information Processing Systems, 32, 2019. link ↩︎

  10. Zhisheng Xiao, Karsten Kreis, and Arash Vahdat. Tackling the generative learning trilemma with denoising diffusion gans. In International Conference on Learning Representations, 2021. ↩︎ ↩︎

  11. Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training gans. Advances in neural information processing systems, 29, 2016. ↩︎

  12. Shane Barratt and Rishi Sharma. A Note on the Inception Score. arXiv preprint arxiv:1801.01973↩︎

  13. Tuomas Kynkäänniemi, Tero Karras, Miika Aittala, Timo Aila and Jaakko Lehtinen. The Role of ImageNet Classes in Fréchet Inception Distance. arXiv preprint arxiv:2203.06026↩︎

  14. Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, and Zbigniew Wojna. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2818–2826, 2016. ↩︎

  15. Charlie Nash, Jacob Menick, Sander Dieleman and Peter W. Battaglia. Generating Images with Sparse Representations. arXiv preprint arxiv:2103.03841↩︎