- Source: https://github.com/caffe2/tutorials/blob/master/Image_Pre-Processing_Pipeline.ipynb
- content
- resizing
- rescaling
- HWC to CHW
- RGB to BGR
- image prep for Caffe2 ingestion
- Caffe Prefers CHW Order
- H: Height
- W: Width
- C: Channel (as in color)
- For GPU processing, which is what Caffe2 excels at, this order needs to be CHW. For CPU processing, this order is generally HWC.
- The reason is cuDNN, the library that accelerates processing on GPUs. The tutorial states that it uses only CHW, and sums it up by saying that CHW is faster.
- Caffe Uses BGR Order
- Due to legacy support of OpenCV in Caffe and how it handles images in Blue-Green-Red (BGR) order instead of the more commonly used Red-Green-Blue (RGB) order, Caffe2 also expects BGR order.
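The two conventions above (CHW layout and BGR channel order) can be sketched on a toy array. This is only an illustration with NumPy: the array `hwc_rgb` is a made-up stand-in for an image, not data from the tutorial.

```python
import numpy as np

# A tiny hypothetical "image": 2 rows (H), 3 columns (W), 3 channels (C), RGB order.
hwc_rgb = np.arange(2 * 3 * 3, dtype=np.float32).reshape(2, 3, 3)

# HWC -> CHW: move the channel axis to the front.
chw_rgb = np.transpose(hwc_rgb, (2, 0, 1))

# RGB -> BGR: reverse the channel axis (axis 0 after the transpose).
chw_bgr = chw_rgb[::-1, :, :]

print(hwc_rgb.shape)  # (2, 3, 3)
print(chw_bgr.shape)  # (3, 2, 3)
```

`np.transpose(x, (2, 0, 1))` gives the same result as the chained `x.swapaxes(1, 2).swapaxes(0, 1)` used later in these notes.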
1. Import modules
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from __future__ import unicode_literals

%matplotlib inline
import skimage
import skimage.io as io
import skimage.transform
import sys
import numpy as np
import math
from matplotlib import pyplot
import matplotlib.image as mpimg
print("Required modules imported.")
2. Test an image: RGB -> BGR
# You can load either local IMAGE_FILE or remote URL
# For Round 1 of this tutorial, try a local image.
IMAGE_LOCATION = 'images/cat.jpg'

# For Round 2 of this tutorial, try a URL image with a flower:
# IMAGE_LOCATION = "https://cdn.pixabay.com/photo/2015/02/10/21/28/flower-631765_1280.jpg"
# IMAGE_LOCATION = "images/flower.jpg"

# For Round 3 of this tutorial, try another URL image with lots of people:
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/1/18/NASA_Astronaut_Group_15.jpg"
# IMAGE_LOCATION = "images/astronauts.jpg"

# For Round 4 of this tutorial, try a URL image with a portrait!
# IMAGE_LOCATION = "https://upload.wikimedia.org/wikipedia/commons/9/9a/Ducreux1.jpg"
# IMAGE_LOCATION = "images/Ducreux.jpg"

img = skimage.img_as_float(skimage.io.imread(IMAGE_LOCATION)).astype(np.float32)

# test color reading
# show the original image
pyplot.figure()
pyplot.subplot(1,2,1)
pyplot.imshow(img)
pyplot.axis('on')
pyplot.title('Original image = RGB')

# show the image in BGR - just doing RGB->BGR temporarily for display
imgBGR = img[:, :, (2, 1, 0)]
#pyplot.figure()
pyplot.subplot(1,2,2)
pyplot.imshow(imgBGR)
pyplot.axis('on')
pyplot.title('OpenCV, Caffe2 = BGR')
3. Rotation and mirroring
Images taken with phones and other devices differ in orientation, mirroring, and capture mode, as the code below shows. Sometimes the photo's EXIF data tells us what to do; sometimes we have to rotate the image by several angles and generate derived images to decide.
# Image came in sideways - it should be a portrait image!
# How you detect this depends on the platform
# Could be a flag from the camera object
# Could be in the EXIF data
# ROTATED_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/8/87/Cell_Phone_Tower_in_Ladakh_India_with_Buddhist_Prayer_Flags.jpg"
ROTATED_IMAGE = "images/cell-tower.jpg"
imgRotated = skimage.img_as_float(skimage.io.imread(ROTATED_IMAGE)).astype(np.float32)
pyplot.figure()
pyplot.imshow(imgRotated)
pyplot.axis('on')
pyplot.title('Rotated image')

# Image came in flipped or mirrored - text is backwards!
# Again detection depends on the platform
# This one is intended to be read by drivers in their rear-view mirror
# MIRROR_IMAGE = "https://upload.wikimedia.org/wikipedia/commons/2/27/Mirror_image_sign_to_be_read_by_drivers_who_are_backing_up_-b.JPG"
MIRROR_IMAGE = "images/mirror-image.jpg"
imgMirror = skimage.img_as_float(skimage.io.imread(MIRROR_IMAGE)).astype(np.float32)
pyplot.figure()
pyplot.imshow(imgMirror)
pyplot.axis('on')
pyplot.title('Mirror image')
Flip the mirrored image
# Run me to flip the image back and forth
imgMirror = np.fliplr(imgMirror)
pyplot.figure()
pyplot.imshow(imgMirror)
pyplot.axis('off')
pyplot.title('Mirror image')
Rotate the image
# Run me to rotate the image 90 degrees
imgRotated = np.rot90(imgRotated, 3)
pyplot.figure()
pyplot.imshow(imgRotated)
pyplot.axis('off')
pyplot.title('Rotated image')
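When the orientation is available from EXIF, the flip and rotation above can be chosen automatically. The sketch below assumes the EXIF Orientation value has already been read by some other means. `apply_orientation` is a hypothetical helper, not part of the tutorial, and it handles only the most common values.

```python
import numpy as np

def apply_orientation(img, orientation):
    """Return an upright version of an HWC image for a few EXIF Orientation values.

    Hypothetical helper: only values 1, 2, 3, 6 and 8 are handled here.
    """
    if orientation == 1:          # already upright
        return img
    if orientation == 2:          # mirrored left-right
        return np.fliplr(img)
    if orientation == 3:          # upside down
        return np.rot90(img, 2)
    if orientation == 6:          # needs a 90-degree clockwise turn
        return np.rot90(img, 3)   # np.rot90 turns counter-clockwise, so 3 turns
    if orientation == 8:          # needs a 90-degree counter-clockwise turn
        return np.rot90(img, 1)
    raise ValueError("unhandled orientation: {}".format(orientation))

sideways = np.zeros((300, 400, 3), dtype=np.float32)  # landscape-shaped array
upright = apply_orientation(sideways, 6)
print(upright.shape)  # (400, 300, 3)
```

Orientation 6 corresponds to the `np.rot90(imgRotated, 3)` call used above.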
4. Resizing (sizing)
Images in a Caffe2 pipeline should be square. For performance, the image also needs to be resized, which usually means shrinking the original.
# Model is expecting 224 x 224, so resize/crop needed.
# First, let's resize the image to 256*256
orig_h, orig_w, _ = img.shape
print("Original image's shape is {}x{}".format(orig_h, orig_w))
input_height, input_width = 224, 224
print("Model's input shape is {}x{}".format(input_height, input_width))
img256 = skimage.transform.resize(img, (256, 256))

# Plot original and resized images for comparison
f, axarr = pyplot.subplots(1,2)
axarr[0].imshow(img)
axarr[0].set_title("Original Image (" + str(orig_h) + "x" + str(orig_w) + ")")
axarr[0].axis('on')
axarr[1].imshow(img256)
axarr[1].axis('on')
axarr[1].set_title('Resized image to 256x256')
pyplot.tight_layout()

print("New image shape:" + str(img256.shape))
5. Resizing while keeping the aspect ratio (scaling)
print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
print("Model's input shape is {}x{}".format(input_height, input_width))
aspect = img.shape[1]/float(img.shape[0])
print("Original aspect ratio: " + str(aspect))
if(aspect>1):
    # landscape orientation - wide image
    res = int(aspect * input_height)
    imgScaled = skimage.transform.resize(img, (input_height, res))
if(aspect<1):
    # portrait orientation - tall image
    res = int(input_width/aspect)
    imgScaled = skimage.transform.resize(img, (res, input_width))
if(aspect == 1):
    imgScaled = skimage.transform.resize(img, (input_height, input_width))
pyplot.figure()
pyplot.imshow(imgScaled)
pyplot.axis('on')
pyplot.title('Rescaled image')
print("New image shape:" + str(imgScaled.shape) + " in HWC")
At this point only one dimension matches what the model input requires. We still need to crop along the other dimension to get a square.
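The shape arithmetic in the scaling step can be isolated into a small function. The sketch below only computes the target shape (no resizing), using the same truncating `int()` as the code above. `scaled_shape` is a hypothetical name introduced for illustration.

```python
def scaled_shape(height, width, target=224):
    """Return (new_h, new_w) so the shorter side equals target and the aspect ratio is kept."""
    if width >= height:
        # landscape or square: height becomes target, width scales with it
        return target, int(target * width / float(height))
    # portrait: width becomes target, height scales with it
    return int(target * height / float(width)), target

print(scaled_shape(360, 480))  # (224, 298)
print(scaled_shape(480, 360))  # (298, 224)
```

For a 360x480 image this leaves 298 - 224 = 74 extra columns, which the cropping step then has to remove.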
6. Cropping
Change of strategy: crop directly from the middle of the image.
- Just grab the exact dimensions you need from the middle!
- Resize to a square that's pretty close then grab from the middle.
- Use the rescaled image and grab the middle.
# Compare the images and cropping strategies
# Try a center crop on the original for giggles
print("Original image shape:" + str(img.shape) + " and remember it should be in H, W, C!")
def crop_center(img,cropx,cropy):
    y,x,c = img.shape
    startx = x//2-(cropx//2)
    starty = y//2-(cropy//2)
    return img[starty:starty+cropy,startx:startx+cropx]
# yes, the function above should match resize and take a tuple...

pyplot.figure()
# Original image
imgCenter = crop_center(img,224,224)
pyplot.subplot(1,3,1)
pyplot.imshow(imgCenter)
pyplot.axis('on')
pyplot.title('Original')

# Now let's see what this does on the distorted image
img256Center = crop_center(img256,224,224)
pyplot.subplot(1,3,2)
pyplot.imshow(img256Center)
pyplot.axis('on')
pyplot.title('Squeezed')

# Scaled image
imgScaledCenter = crop_center(imgScaled,224,224)
pyplot.subplot(1,3,3)
pyplot.imshow(imgScaledCenter)
pyplot.axis('on')
pyplot.title('Scaled')
pyplot.tight_layout()
As you can see, none of the results looks very good except the last one. Another strategy is to rescale the real data to the best size you can, and then fill the rest of the image with information.
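The comment in the code above notes that `crop_center` should really take a tuple, like `resize` does. A minimal sketch of such a variant follows. `crop_center_hw` is a hypothetical name, and the size check is an addition that the original function does not have.

```python
import numpy as np

def crop_center_hw(img, output_shape):
    """Center-crop an HWC image to output_shape = (height, width)."""
    crop_h, crop_w = output_shape
    h, w = img.shape[:2]
    if crop_h > h or crop_w > w:
        raise ValueError("crop {} is larger than image {}".format(output_shape, (h, w)))
    # Same start-offset arithmetic as crop_center above.
    top = h // 2 - crop_h // 2
    left = w // 2 - crop_w // 2
    return img[top:top + crop_h, left:left + crop_w]

demo = np.zeros((360, 480, 3), dtype=np.float32)
print(crop_center_hw(demo, (224, 224)).shape)  # (224, 224, 3)
```

Taking `(height, width)` in the same order as `skimage.transform.resize` avoids having to remember that the original function expects x (width) before y (height).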
7. Upscaling
imgTiny = "images/Cellsx128.png"
imgTiny = skimage.img_as_float(skimage.io.imread(imgTiny)).astype(np.float32)
print("Original image shape: ", imgTiny.shape)
imgTiny224 = skimage.transform.resize(imgTiny, (224, 224))
print("Upscaled image shape: ", imgTiny224.shape)

# Plot original
pyplot.figure()
pyplot.subplot(1, 2, 1)
pyplot.imshow(imgTiny)
pyplot.axis('on')
pyplot.title('128x128')

# Plot upscaled
pyplot.subplot(1, 2, 2)
pyplot.imshow(imgTiny224)
pyplot.axis('on')
pyplot.title('224x224')

Original image shape: (128, 128, 4)
Upscaled image shape: (224, 224, 4)
Note that there is now an extra channel: alpha (transparency).
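If the model expects three channels, one simple option is to drop the alpha channel by slicing. The sketch below uses a random array as a stand-in for the PNG. It is an assumption that discarding transparency is acceptable for the image at hand; the tutorial does not prescribe this step.

```python
import numpy as np

# Hypothetical stand-in for a 128x128 RGBA PNG loaded as float32 in HWC order.
rgba = np.random.rand(128, 128, 4).astype(np.float32)

# Keep only the first three channels (R, G, B); the fourth is alpha.
rgb = rgba[:, :, :3]

print(rgba.shape)  # (128, 128, 4)
print(rgb.shape)   # (128, 128, 3)
```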
imgTiny = "images/Cellsx128.png"
imgTiny = skimage.img_as_float(skimage.io.imread(imgTiny)).astype(np.float32)
print("Image shape before HWC --> CHW conversion: ", imgTiny.shape)
# swapping the axes to go from HWC to CHW
# uncomment the next line and run this block!
imgTiny = imgTiny.swapaxes(1, 2).swapaxes(0, 1)
print("Image shape after HWC --> CHW conversion: ", imgTiny.shape)
imgTiny224 = skimage.transform.resize(imgTiny, (224, 224))
print("Image shape after resize: ", imgTiny224.shape)
# we know this is going to go wrong, so...
try:
    # Plot original
    pyplot.figure()
    pyplot.subplot(1, 2, 1)
    pyplot.imshow(imgTiny)
    pyplot.axis('on')
    pyplot.title('128x128')
except:
    print("Here come bad things!")
    # hands up if you want to see the error (uncomment next line)
    #raise

Image shape before HWC --> CHW conversion: (128, 128, 4)
Image shape after HWC --> CHW conversion: (4, 128, 128)
Image shape after resize: (224, 224, 128)
Here come bad things!
We need 3 channels, but now there are 128: resize treats the first two axes as height and width, so the axes must not be swapped before resizing. Next, test an image that is smaller than the required input and is not square.
imgTiny = "images/Cellsx128.png"
imgTiny = skimage.img_as_float(skimage.io.imread(imgTiny)).astype(np.float32)
imgTinySlice = crop_center(imgTiny, 128, 56)

# Plot original
pyplot.figure()
pyplot.subplot(2, 1, 1)
pyplot.imshow(imgTiny)
pyplot.axis('on')
pyplot.title('Original')

# Plot slice
pyplot.figure()
pyplot.subplot(2, 2, 1)
pyplot.imshow(imgTinySlice)
pyplot.axis('on')
pyplot.title('128x56')

# Upscale?
print("Slice image shape: ", imgTinySlice.shape)
imgTiny224 = skimage.transform.resize(imgTinySlice, (224, 224))
print("Upscaled slice image shape: ", imgTiny224.shape)

# Plot upscaled
pyplot.subplot(2, 2, 2)
pyplot.imshow(imgTiny224)
pyplot.axis('on')
pyplot.title('224x224')
Stretching can be a fatal mistake. To deal with this, in some cases you can fill the rest of the image with white, black, or possibly noise, or even use a PNG with transparency and set a mask on the image so that the model ignores the transparent areas.
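The padding idea can be sketched with `np.pad`: instead of stretching the 128x56 slice, pad it with a constant value until it is square, then resize. `pad_to_square` is a hypothetical helper, and the black fill (0.0) is just one of the options mentioned above.

```python
import numpy as np

def pad_to_square(img, fill=0.0):
    """Pad an HWC image with a constant value so height == width, keeping it centered."""
    h, w = img.shape[:2]
    size = max(h, w)
    pad_top = (size - h) // 2
    pad_left = (size - w) // 2
    pad_width = (
        (pad_top, size - h - pad_top),     # rows added above / below
        (pad_left, size - w - pad_left),   # columns added left / right
        (0, 0),                            # channels are left untouched
    )
    return np.pad(img, pad_width, mode="constant", constant_values=fill)

slice_img = np.ones((56, 128, 3), dtype=np.float32)  # stand-in for the 128x56 slice
square = pad_to_square(slice_img)
print(square.shape)  # (128, 128, 3)
```

The square result can then be resized to 224x224 without changing the aspect ratio of the original content.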
8. Final preprocessing and batching
Convert RGB -> BGR and HWC -> CHW, then add a fourth dimension (N) to the image to track the number of images.
# This next line helps with being able to rerun this section
# if you want to try the outputs of the different crop strategies above
# swap out imgScaled with img (original) or img256 (squeezed)
imgCropped = crop_center(imgScaled,224,224)
print("Image shape before HWC --> CHW conversion: ", imgCropped.shape)

# (1) Since Caffe expects CHW order and the current image is HWC,
#     we will need to change the order.
imgCropped = imgCropped.swapaxes(1, 2).swapaxes(0, 1)
print("Image shape after HWC --> CHW conversion: ", imgCropped.shape)

pyplot.figure()
for i in range(3):
    # For some reason, pyplot subplot follows Matlab's indexing
    # convention (starting with 1). Well, we'll just follow it...
    pyplot.subplot(1, 3, i+1)
    pyplot.imshow(imgCropped[i], cmap=pyplot.cm.gray)
    pyplot.axis('off')
    pyplot.title('RGB channel %d' % (i+1))

# (2) Caffe uses a BGR order due to legacy OpenCV issues, so we
#     will change RGB to BGR.
imgCropped = imgCropped[(2, 1, 0), :, :]
print("Image shape after BGR conversion: ", imgCropped.shape)

# for discussion later - not helpful at this point
# (3) (Optional) We will subtract the mean image. Note that skimage loads
#     image in the [0, 1] range so we multiply the pixel values
#     first to get them into [0, 255].
#mean_file = os.path.join(CAFFE_ROOT, 'python/caffe/imagenet/ilsvrc_2012_mean.npy')
#mean = np.load(mean_file).mean(1).mean(1)
#img = img * 255 - mean[:, np.newaxis, np.newaxis]

pyplot.figure()
for i in range(3):
    # For some reason, pyplot subplot follows Matlab's indexing
    # convention (starting with 1). Well, we'll just follow it...
    pyplot.subplot(1, 3, i+1)
    pyplot.imshow(imgCropped[i], cmap=pyplot.cm.gray)
    pyplot.axis('off')
    pyplot.title('BGR channel %d' % (i+1))

# (4) Finally, since caffe2 expect the input to have a batch term
#     so we can feed in multiple images, we will simply prepend a
#     batch dimension of size 1. Also, we will make sure image is
#     of type np.float32.
imgCropped = imgCropped[np.newaxis, :, :, :].astype(np.float32)
print('Final input shape is:', imgCropped.shape)
- The shapes before and after the HWC to CHW change: the 3, which is the number of color channels, moved to the beginning.
- In the pictures above you can see that the color order was switched too. RGB became BGR. Blue and Red switched places.
- The final input shape: the last change to the image was to add the batch dimension at the beginning, so that now you have (1, 3, 224, 224) for:
- 1 image in the batch,
- 3 color channels (in BGR),
- 224 height,
- 224 width.
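The final conversion steps can be wrapped into one function and applied to several images to build a batch larger than 1. This is a sketch with NumPy only: `to_caffe2_input` is a hypothetical name, the random arrays stand in for already-cropped 224x224 images, and mean subtraction is left out as in the code above.

```python
import numpy as np

def to_caffe2_input(img_hwc_rgb):
    """Convert one cropped HWC/RGB float image to a 1xCxHxW/BGR float32 array."""
    chw = np.transpose(img_hwc_rgb, (2, 0, 1))   # HWC -> CHW
    bgr = chw[::-1, :, :]                        # RGB -> BGR
    return bgr[np.newaxis, :, :, :].astype(np.float32)  # prepend batch dimension

# Four hypothetical cropped images, stacked along the batch axis.
images = [np.random.rand(224, 224, 3) for _ in range(4)]
batch = np.concatenate([to_caffe2_input(im) for im in images], axis=0)
print(batch.shape)  # (4, 3, 224, 224)
```

All images must share the same height and width before they can be concatenated, which is one more reason for the resize and crop steps earlier in these notes.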