[Running Experiments 07] RuntimeError: Argument #6: Padding size should be less than the corresponding input dimension

While trying to run an experiment recently, part of our code was:

import torch
import torchvision.transforms as T
from PIL import Image
from sklearn.decomposition import PCA

patch_h = 28
patch_w = 28
feat_dim = 768

# Blur, resize to 392x392 (28 patches x 14 px per DINOv2 patch), center-crop and normalize.
transform = T.Compose([
    T.GaussianBlur(9, sigma=(0.1, 2.0)),
    T.Resize((patch_h * 14, patch_w * 14)),
    T.CenterCrop((patch_h * 14, patch_w * 14)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])

# Load DINOv2 ViT-B/14 from a local torch.hub checkout.
dinov2_vitb14 = torch.hub.load('', 'dinov2_vitb14', source='local').cuda()

features = torch.zeros(4, patch_h * patch_w, feat_dim)
imgs_tensor = torch.zeros(4, 3, patch_h * 14, patch_w * 14).cuda()

img_path = f'/home/wangzhenkuan/val_cropped/cropped_(25, 140, 39, 143)_obj365_val_000000685822.jpg'
img = Image.open(img_path).convert('RGB')
imgs_tensor[0] = transform(img)[:3]

# Extract per-patch features for the whole batch.
with torch.no_grad():
    features_dict = dinov2_vitb14.forward_features(imgs_tensor)
    features = features_dict['x_norm_patchtokens']

# Flatten all patch tokens, project to 3 PCA components, and rescale the first component.
features = features.reshape(4 * patch_h * patch_w, feat_dim).cpu()
pca = PCA(n_components=3)
pca.fit(features)
pca_features = pca.transform(features)
pca_features[:, 0] = (pca_features[:, 0] - pca_features[:, 0].min()) / (pca_features[:, 0].max() - pca_features[:, 0].min())
new_pca_features = pca_features.flatten()
print(new_pca_features, new_pca_features.shape)

Encountered this error:

RuntimeError: Argument #6: Padding size should be less than the corresponding input dimension, but got: padding (4, 4) at dimension 2 of input 4


According to the code and the error message, the problem is actually in the T.GaussianBlur(9, ...) step rather than in T.Resize(). A Gaussian blur with kernel size 9 first applies a reflection padding of 9 // 2 = 4 pixels on each side, and reflection padding requires the padding to be smaller than the corresponding input dimension.

In my code I resize the image to (patch_h * 14, patch_w * 14), but the blur runs before the resize, and the input image is only 14×3 pixels. The padding of (4, 4) therefore exceeds the image height of 3, which triggers the error.
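This can be confirmed in isolation with a minimal sketch on a blank image of the same size (the exact error wording may differ slightly across torchvision/PyTorch versions):

import torchvision.transforms as T
from PIL import Image

tiny = Image.new('RGB', (14, 3))            # same size as the cropped image: width 14, height 3
blur = T.GaussianBlur(9, sigma=(0.1, 2.0))  # kernel size 9 -> reflection padding of 4 per side
try:
    blur(tiny)
except RuntimeError as e:
    print(e)  # padding (4, 4) exceeds the height dimension of 3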

To fix this, I need to make sure the image entering the pipeline is larger than the blur's padding in both dimensions, i.e. more than 4 pixels given the kernel size of 9. One option is to resize the input image before applying the transforms; another is to reorder the pipeline so that T.Resize() runs before T.GaussianBlur(), as sketched below. The rest of this post uses the first option.
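A sketch of the reordering option, reusing the same patch_h and patch_w as above (note that blurring after upsampling changes the blur's effective scale relative to the original pipeline):

import torchvision.transforms as T

patch_h, patch_w = 28, 28

transform = T.Compose([
    T.Resize((patch_h * 14, patch_w * 14)),   # upsample to 392x392 first ...
    T.GaussianBlur(9, sigma=(0.1, 2.0)),      # ... so the 4 px reflection padding fits
    T.CenterCrop((patch_h * 14, patch_w * 14)),
    T.ToTensor(),
    T.Normalize(mean=(0.485, 0.456, 0.406), std=(0.229, 0.224, 0.225)),
])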

Let's take a look at the size of this photo:

from PIL import Image

image_path = "/home/wangzhenkuan/val_cropped/cropped_(25, 140, 39, 143)_obj365_val_000000685822.jpg"
img = Image.open(image_path)
width, height = img.size
print(f"图片尺寸:宽度 = {
      
      width}px,高度 = {
      
      height}px")

which prints:

Image size: width = 14px, height = 3px

So I resize the image to 14×14 before applying the transforms:

img = Image.open(img_path).convert('RGB').resize((14, 14))

With this change, the script no longer raises the error, and the output is:

[ 7.44867728e-01  2.34980489e+00 -2.27559823e-02 ...  3.59724475e-01
  9.42175007e+00  2.56441818e+01] (9408,)
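As a sanity check, the length 9408 is exactly what the code produces: 4 images × 28 × 28 patch tokens × 3 PCA components = 9408 values.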


Origin: blog.csdn.net/wzk4869/article/details/131857028