[Paper Reading] MCANet: Medical Image Segmentation with Multi-Scale Cross-Axis Attention

Summary

Link: https://arxiv.org/abs/2312.08866
Medical image segmentation is one of the key challenges in medical image processing and computer vision. Because lesions and organs vary widely in size and shape, it is crucial to capture multi-scale information effectively and to establish long-range dependencies between pixels. This paper proposes Multi-scale Cross-axis Attention (MCA), built on efficient axial attention, to address both problems. MCA computes bidirectional cross-attention between two parallel axial attention paths to better capture global information. In addition, to handle the large variation in the size and shape of lesions and organs, each axial attention path applies several strip-shaped (bar) convolution kernels of different sizes, which makes MCA more efficient at encoding spatial information. The authors build MCA on the MSCAN backbone to form a network named MCANet. With only 4M+ parameters, MCANet outperforms most previous methods that use heavy backbones (such as Swin Transformer) on four challenging tasks: skin lesion segmentation, cell nucleus segmentation, abdominal multi-organ segmentation, and polyp segmentation. Code is available at https://github.com/haoshao-nku/medical_seg.git.
Keywords: medical image segmentation, self-attention, cross-axis attention, multi-scale features.


Summary of contributions

  1. A new method, Multi-scale Cross-axis Attention (MCA), is proposed for medical image segmentation.

  2. MCA improves on conventional axial attention in two ways to suit the characteristics of medical images. First, it uses strip-shaped (bar) convolutions to introduce multi-scale features and localize the target region better. Second, it builds dual cross-attention between the two spatial axial attentions to make better use of the multi-scale features and to identify blurred boundaries of target regions.

  3. MCA is lightweight, and so is its decoder. Table I shows that the small variant of MCA has only 0.14M parameters, which makes it better suited to practical application scenarios.

  4. MCA effectively encodes global context while accounting for the varied sizes and shapes of lesions and organs, although how to handle these characteristics more effectively still needs further exploration.

Results achieved

In medical image segmentation, MCANet significantly improves the accuracy and robustness of segmentation by introducing a multi-scale cross-axis attention mechanism. Compared with the traditional axial attention mechanism, MCANet pays more attention to the shape and size characteristics of the lesion area or organ at different scales, thereby locating the target area more accurately.

First, MCANet integrates multi-scale features through strip-shaped (bar) convolutions, so that it can adapt to lesions and organs of different sizes and shapes. This helps the model localize the target region more accurately.
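The idea of multi-scale strip convolution can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' implementation: the kernel lengths `(3, 7, 11)`, the fixed box-filter weights, and the helper names `strip_conv` and `multi_scale_strip` are all hypothetical, and a real layer would learn its weights per channel.

```python
import numpy as np

def strip_conv(x, size, axis):
    """Convolve a 2-D map with a 1-D kernel of length `size` along `axis`."""
    # Assumption: fixed box-filter weights stand in for learned weights.
    kernel = np.ones(size) / size
    return np.apply_along_axis(
        lambda line: np.convolve(line, kernel, mode="same"), axis, x
    )

def multi_scale_strip(x, sizes=(3, 7, 11), axis=1):
    """Sum strip convolutions of several lengths (sizes are illustrative)."""
    return sum(strip_conv(x, s, axis) for s in sizes)

# A single active pixel shows how each path spreads information along one axis only.
feat = np.zeros((16, 16))
feat[8, 8] = 1.0
horizontal = multi_scale_strip(feat, axis=1)  # spreads along the row
vertical = multi_scale_strip(feat, axis=0)    # spreads along the column
```

Short kernels keep the response concentrated near the pixel and long kernels reach further along the axis, so summing them mixes several receptive-field lengths in one path.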

Secondly, MCANet innovatively builds a dual-cross attention mechanism to cross-connect horizontal and vertical axis attention. This design can better utilize multi-scale information and enhance the model's awareness of global context, thereby segmenting medical images more accurately.
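A rough sketch of cross-connected axial attention, again in NumPy, may make the design easier to picture. Several things here are assumptions rather than details from the article: the helper name `axial_cross_attention`, the single attention head, the absence of learned query/key/value projections, the choice of which path supplies queries for which axis, and the final sum of the two outputs.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def axial_cross_attention(query_src, kv_src, axis):
    """Attend along one spatial axis of (H, W, C) maps.

    Queries come from one path and keys/values from the other path.
    Assumption: no learned projections and a single head.
    """
    # Move the attended axis to position 1: every slice along axis 0
    # becomes an independent token sequence.
    q = np.moveaxis(query_src, axis, 1)
    kv = np.moveaxis(kv_src, axis, 1)
    scores = q @ kv.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    out = softmax(scores) @ kv
    return np.moveaxis(out, 1, axis)

rng = np.random.default_rng(0)
f_x = rng.normal(size=(6, 5, 4))  # hypothetical horizontal-path features
f_y = rng.normal(size=(6, 5, 4))  # hypothetical vertical-path features

# Each axis uses queries from the other path, then the two results are summed.
fused = (axial_cross_attention(f_y, f_x, axis=1)
         + axial_cross_attention(f_x, f_y, axis=0))
```

Because each output is a weighted average of the other path's values along a single row or column, the two paths exchange information without ever forming a full pixel-to-pixel attention map.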

Experimental results on the DSB2018 dataset show that MCANet has achieved significant performance improvements in medical image segmentation tasks. This method effectively solves the problems encountered by the traditional axial attention mechanism when processing medical images, and provides new ideas and methods for the development of the field of medical image segmentation.

MCANet has achieved excellent performance in the field of medical image segmentation through a multi-scale cross-axis attention mechanism.

Conclusion

The article introduces MCANet, a multi-scale cross-axis attention model for medical image segmentation. By establishing bidirectional cross-attention across the two spatial dimensions, the model uses directional information to overcome some of the challenges of medical image segmentation. The article also notes that combining multi-scale convolutional features with axial attention helps with the difficulty of achieving long-range interactions on small medical image datasets.

The article discusses the advantages and limitations of axial attention. Axial attention captures global information at a lower computational cost than full self-attention. It can also learn a position bias, but doing so requires a large segmentation dataset. In many medical image segmentation tasks the datasets are relatively small, which makes long-range interactions hard to learn. The article therefore proposes bidirectional cross-attention to make better use of directional information.
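The saving in computational cost can be checked with simple arithmetic. Full self-attention on an H x W feature map scores every pixel against every other pixel, while axial attention scores each pixel only against its own row and its own column. The function name `attention_pairs` and the 64 x 64 map size are illustrative choices, not values from the article.

```python
def attention_pairs(height, width):
    """Count query-key pairs scored by full vs. axial attention on an H x W map."""
    n = height * width
    full = n * n                  # every pixel attends to every pixel
    axial = n * (height + width)  # every pixel attends to its row and its column
    return full, axial

full, axial = attention_pairs(64, 64)
print(full, axial, full / axial)  # 16777216 524288 32.0
```

For a square map the ratio is H/2, so the gap widens as the resolution grows.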

MCANet is an effective medical image segmentation model that overcomes some of the challenges encountered in processing small medical image datasets by combining multi-scale convolutional features and bidirectional cross-attention. This model has broad application prospects and can provide solutions for various medical image segmentation tasks. In addition, the article also mentions some potential application areas of MCANet, such as for 3D medical image segmentation or for solving other image segmentation problems.

Origin blog.csdn.net/hhhhhhhhhhwwwwwwwwww/article/details/135191070