Understanding spacing, resampling, and downsampling

Suppose the size of the 3D data we read is (75, 512, 512) in (z, y, x) order, and the spacing is (0.703125, 0.703125, 5.0) mm in (x, y, z) order.
We need to adjust the spacing of the z-axis to 1 mm, so we first calculate the real (physical) size of the 3D data: (75 * 5, 512 * 0.703125, 512 * 0.703125) = (375, 360, 360) mm.
If the spacing of the x, y, and z axes is adjusted to 1 mm, the 3D data size becomes (375 / 1, 360 / 1, 360 / 1) = (375, 360, 360).

In other words, when the spacing is (0.703125, 0.703125, 5.0), the size of the 3D data we read is (75, 512, 512).
When the spacing is (1, 1, 1), the size of the 3D data is (375, 360, 360).

In both cases the real size of the 3D data remains unchanged: (375, 360, 360) mm.
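The arithmetic above can be sketched in a few lines of plain Python. The function names here are hypothetical helpers for illustration, not part of SimpleITK or SciPy, and the sketch assumes the array shape is in NumPy (z, y, x) order while the spacing is in SimpleITK (x, y, z) order.

```python
# Array shape in NumPy order (z, y, x), as produced by sitk.GetArrayFromImage
shape_zyx = (75, 512, 512)
# Spacing in SimpleITK order (x, y, z), in millimetres
spacing_xyz = (0.703125, 0.703125, 5.0)


def physical_extent_mm(shape_zyx, spacing_xyz):
    """Physical size per axis in (z, y, x) order: voxel count * spacing."""
    spacing_zyx = spacing_xyz[::-1]  # reverse so both tuples use the same axis order
    return tuple(n * s for n, s in zip(shape_zyx, spacing_zyx))


def shape_at_spacing(shape_zyx, spacing_xyz, target_spacing_xyz):
    """Voxel count per axis (z, y, x) needed to cover the same extent at a new spacing."""
    extent = physical_extent_mm(shape_zyx, spacing_xyz)
    target_zyx = target_spacing_xyz[::-1]
    return tuple(round(e / t) for e, t in zip(extent, target_zyx))


extent = physical_extent_mm(shape_zyx, spacing_xyz)
print(extent)  # (375.0, 360.0, 360.0)
print(shape_at_spacing(shape_zyx, spacing_xyz, (1.0, 1.0, 1.0)))  # (375, 360, 360)
```

The physical extent is the quantity that stays fixed. Changing the spacing only changes how many voxels are used to cover it.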

If we downsample the 3D data, for example to a size of (375, 256, 256), then the spacing becomes (375 / 375, 360 / 256, 360 / 256) = (1, 1.40625, 1.40625) in (z, y, x) order, that is, (1.40625, 1.40625, 1) in (x, y, z) order.

That is the calculation. In code, it is much less cumbersome.

We can resample and downsample the 3D data with a single statement.

Suppose the array size (not the real size) of the input CT image is (75, 512, 512) and the spacing is (0.703125, 0.703125, 5.0), and you want to adjust the z-axis spacing to 1 mm and downsample each slice to 256 * 256. You need to:
1. Divide the z-axis spacing of the CT by the spacing you want: ct.GetSpacing()[-1] / 1
2. Compute the in-plane zoom factor (new size / old size): down_scale = 256 / 512
Then execute
ct_array = ndimage.zoom(ct_array, (ct.GetSpacing()[-1] / 1, down_scale, down_scale))
After running this statement, the size becomes (75 * 5 / 1, 256, 256) = (375, 256, 256).
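As a quick check before running the (slow) interpolation, the zoom factors and the resulting shape can be worked out in plain Python. This is only a sketch: `zoom_factors` and `zoomed_shape` are hypothetical helper names, and the shape prediction relies on `ndimage.zoom` sizing its output by rounding input size times zoom factor on each axis.

```python
def zoom_factors(shape_zyx, spacing_xyz, target_z_spacing, target_plane_size):
    """Per-axis zoom factors in NumPy (z, y, x) order, suitable for ndimage.zoom."""
    z_factor = spacing_xyz[2] / target_z_spacing    # resample z to the target spacing
    y_factor = target_plane_size / shape_zyx[1]     # downsample rows
    x_factor = target_plane_size / shape_zyx[2]     # downsample columns
    return (z_factor, y_factor, x_factor)


def zoomed_shape(shape_zyx, factors):
    """Predicted output shape: each axis is round(size * factor)."""
    return tuple(int(round(n * f)) for n, f in zip(shape_zyx, factors))


factors = zoom_factors((75, 512, 512), (0.703125, 0.703125, 5.0), 1.0, 256)
print(factors)                                 # (5.0, 0.5, 0.5)
print(zoomed_shape((75, 512, 512), factors))   # (375, 256, 256)
```

Note that the in-plane factor is smaller than 1 when downsampling: a factor of 0.5 halves 512 to 256, whereas a factor of 2 would double it.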
At this point the size is what you wanted and the z-axis spacing has become 1 mm. In (x, y, z) order the spacing is (360 / 256, 360 / 256, 375 / 375) = (1.40625, 1.40625, 1).

Next, you will perform other preprocessing operations on the image, such as slicing (the slice call below is pseudocode for your own cropping step):
new_ct_array = slice(ct_array)
new_ct = sitk.GetImageFromArray(new_ct_array)
sitk.GetImageFromArray builds the image from a bare NumPy array, which carries no spacing information, so the spacing of the new image defaults to (1, 1, 1).

So the spacing of new_ct should be set to (1.40625, 1.40625, 1):
new_ct.SetSpacing((1.40625, 1.40625, 1))

This completes the preprocessing.
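Rather than hard-coding (1.40625, 1.40625, 1), the spacing to restore can be derived from the array shapes before and after the zoom. The following is a minimal sketch under the assumption that the physical extent is unchanged by the zoom; `spacing_after_resize` is a hypothetical helper name.

```python
def spacing_after_resize(old_spacing_xyz, old_shape_zyx, new_shape_zyx):
    """New spacing in (x, y, z) order so that voxel count * spacing stays constant."""
    old_shape_xyz = old_shape_zyx[::-1]  # NumPy (z, y, x) -> SimpleITK (x, y, z)
    new_shape_xyz = new_shape_zyx[::-1]
    return tuple(
        s * old_n / new_n
        for s, old_n, new_n in zip(old_spacing_xyz, old_shape_xyz, new_shape_xyz)
    )


new_spacing = spacing_after_resize(
    (0.703125, 0.703125, 5.0),  # spacing reported by the original image
    (75, 512, 512),             # shape before the zoom
    (375, 256, 256),            # shape after the zoom
)
print(new_spacing)  # (1.40625, 1.40625, 1.0)
```

Use the shapes from immediately before and after the zoom. Cropping slices afterwards changes the shape but not the spacing, so a cropped shape must not be fed into this formula.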

The preprocessing code for the LiTS challenge is attached below to aid understanding.

"""
Build the training dataset used to train the network.
Takes about forty minutes and produces roughly 3 GB of training data.
"""

import os
import sys
sys.path.append(os.path.split(sys.path[0])[0])
import shutil
from time import time
import numpy as np
from tqdm import tqdm
import SimpleITK as sitk
import scipy.ndimage as ndimage

import lits_para as para


if os.path.exists(para.training_set_path):
    shutil.rmtree(para.training_set_path)



new_ct_path = os.path.join(para.training_set_path, 'image')
new_seg_dir = os.path.join(para.training_set_path, 'label')

# Recreate the output directories removed above so that sitk.WriteImage has somewhere to write
os.makedirs(new_ct_path, exist_ok=True)
os.makedirs(new_seg_dir, exist_ok=True)



start = time()
# Sort the files by the integer that follows the last '-' in the file name
patients = os.listdir(para.train_ct_path)
patients.sort(key=lambda x: int(x.split('.')[0].split('-')[-1]))
for i in tqdm(range(len(patients))):

    # Read the CT and the ground truth into memory
    ct = sitk.ReadImage(os.path.join(para.train_ct_path, patients[i]), sitk.sitkInt16)
    ct_array = sitk.GetArrayFromImage(ct)

    seg = sitk.ReadImage(os.path.join(para.train_seg_path, patients[i].replace('volume', 'segmentation')), sitk.sitkUInt8)
    seg_array = sitk.GetArrayFromImage(seg)

    # Merge the liver and liver-tumor labels of the ground truth into a single label
    seg_array[seg_array > 0] = 1

    # Clip gray values that fall outside the thresholds
    # print(ct_array.max())
    # print(ct_array.min())
    # ct_array[ct_array > para.upper] = para.upper
    # ct_array[ct_array < para.lower] = para.lower

    # Downsample the CT in the axial plane and resample so that the z-axis spacing of every volume becomes 1 mm
    print(ct.GetSpacing())
    # e.g. (0.703125, 0.703125, 5.0)
    print(ct.GetSpacing()[-1] / para.slice_thickness, para.down_scale, para.down_scale)
    # e.g. 5.0 0.5 0.5
    # Cubic interpolation for the CT, nearest neighbour for the label so that label values stay intact
    ct_array = ndimage.zoom(ct_array, (ct.GetSpacing()[-1] / para.slice_thickness, para.down_scale, para.down_scale), order=3)
    seg_array = ndimage.zoom(seg_array, (ct.GetSpacing()[-1] / para.slice_thickness, para.down_scale, para.down_scale), order=0)
    # ct_array shape goes from (75, 512, 512) to (375, 256, 256)
    


    # Other operations
    # Find the first and last slices that contain liver, then expand outward in each direction
    z = np.any(seg_array, axis=(1, 2))
    start_slice, end_slice = np.where(z)[0][[0, -1]]

    # Expand by expand_slice slices in both directions
    start_slice = max(0, start_slice - para.expand_slice)
    end_slice = min(seg_array.shape[0] - 1, end_slice + para.expand_slice)

    # If fewer than size slices remain, skip this volume; such cases are rare, so this is not a concern
    if end_slice - start_slice + 1 < para.size:
        print('!!!!!!!!!!!!!!!!')
        print(patients[i], 'has too few slices:', ct_array.shape[0])
        print('!!!!!!!!!!!!!!!!')
        continue

    ct_array = ct_array[start_slice:end_slice + 1, :, :]
    seg_array = seg_array[start_slice:end_slice + 1, :, :]

    # Finally, save the data as nii.gz
    new_ct = sitk.GetImageFromArray(ct_array)

    new_ct.SetDirection(ct.GetDirection())
    new_ct.SetOrigin(ct.GetOrigin())
    # In-plane spacing grows by 1 / down_scale; the z spacing becomes slice_thickness
    new_spacing = (ct.GetSpacing()[0] / para.down_scale, ct.GetSpacing()[1] / para.down_scale, para.slice_thickness)
    print(new_spacing)
    # e.g. (1.40625, 1.40625, 1)
    # Finally, set the adjusted spacing
    new_ct.SetSpacing(new_spacing)

    new_seg = sitk.GetImageFromArray(seg_array)

    new_seg.SetDirection(ct.GetDirection())
    new_seg.SetOrigin(ct.GetOrigin())
    # The label was zoomed with the same factors, so it gets the same spacing as the CT
    new_seg.SetSpacing(new_spacing)

    sitk.WriteImage(new_ct, os.path.join(new_ct_path, 'volume-' + str(i) + '.nii.gz'))
    sitk.WriteImage(new_seg, os.path.join(new_seg_dir, 'segmentation-' + str(i) + '.nii.gz'))
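
The slice-cropping step in the script can also be looked at in isolation. Below is a minimal pure-Python sketch of the same logic. The function name `crop_range` and the example values for `expand` and `min_slices` are hypothetical (the real values live in `lits_para`, which is not shown here). Unlike the script, this sketch returns None when no slice contains any label instead of raising an error.

```python
def crop_range(slice_has_label, expand, min_slices):
    """Return the inclusive (start, end) slice range to keep, or None to skip the volume.

    slice_has_label: one boolean per z slice, True if the slice contains any label.
    expand: number of extra slices to keep on each side of the labelled region.
    min_slices: minimum number of slices a usable volume must retain.
    """
    labelled = [i for i, flag in enumerate(slice_has_label) if flag]
    if not labelled:
        return None  # assumption: a volume without any label is skipped
    start = max(0, labelled[0] - expand)
    end = min(len(slice_has_label) - 1, labelled[-1] + expand)
    if end - start + 1 < min_slices:
        return None  # too few slices left after cropping
    return start, end


# 45 slices in total, label present in slices 10..39
flags = [False] * 10 + [True] * 30 + [False] * 5
print(crop_range(flags, expand=20, min_slices=48))  # None (only 45 slices exist)
print(crop_range(flags, expand=3, min_slices=32))   # (7, 42)
```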

Origin blog.csdn.net/Wjeana/article/details/127409525