Downloading Remote Images to Local Disk in Java

Copyright notice: this is an original article by the author and may not be reproduced without the author's permission. https://blog.csdn.net/wangchengming1/article/details/81949376

The requirement is as follows:

{
    "gid":"question:9002511534732700001-subject:0-qtype:2",
    "qtypeId":"2",
    "qtypeName":"单选题",
    "qtypeAlias":"danxuan",
    "difficulty":1,
    "subjectId":"0",
    "subjectName":"",
    "intro":"",
    "prompt":"<p>以下三个选项不是小动物的是?</p>",
    "score":"0",
    "tip":"可能是一种文具。",
    "parse":"<p>选项AB都是动物,选项C是铅笔。</p>",
    "comment":"",
    "kplist":[
        {
            "id":28,
            "name":"词义",
            "school_id":0
        }
    ],
    "cognition":[

    ],
    "book":[

    ],
    "parent_id":"0",
    "options":{
        "A":"<img width=\"300px\" src=\"http://static.dev.anoah.com/uploads/onlinedocument/fd/fa/c20cea6c/a7e617d4/bedf966e6aef/original.png\"/>",
        "B":"<img width=\"300px\" src=\"http://static.dev.anoah.com/uploads/onlinedocument/57/d0/75ffdd8e/cf7d4e87/1396663d1ee4/original.jpg\"/>",
        "C":"<img width=\"300px\" src=\"http://static.dev.anoah.com/uploads/onlinedocument/02/d4/a839b912/bc06fef9/8f0d40c362e7/original.jpg\"/>"
    },
    "isPicRes":0,
    "answer":"C"
}

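The response has the structure shown above. I want to download the three images it references to local disk, so that the resources are available locally.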

Here is a utility class I wrote for this:


import java.io.File;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.net.URL;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

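/**
 * Utility for downloading images.
 * 
 * @author cm_wang
 *
 */
public class DownloadImgUtils {

  private static final Logger logger = LoggerFactory.getLogger(DownloadImgUtils.class);

  // Matches one <img ...> tag that has a src attribute. [^>]* keeps each match inside
  // a single tag; a greedy ".*" would merge several tags on one line into one match.
  private static final Pattern IMGURL_REG = Pattern.compile("<img\\b[^>]*?\\bsrc=[^>]*>");
  // Matches an http/https URL plus the delimiter that ends it (quote, > or whitespace)
  private static final Pattern IMGSRC_REG = Pattern.compile("(http|https):\"?(.*?)(\"|>|\\s+)");

  /**
   * Finds the img tags in a piece of text.
   * 
   * @param content the text to scan, for example a serialized JSON response
   * @return one entry per matched img tag
   */
  public static List<String> getImageUrl(String content) {
    Matcher matcher = IMGURL_REG.matcher(content);
    List<String> listImgUrl = new ArrayList<String>();
    while (matcher.find()) {
      listImgUrl.add(matcher.group());
    }
    return listImgUrl;
  }

  /**
   * Extracts the src URLs from img tags.
   * 
   * @param listImageUrl img tags, as returned by getImageUrl
   * @return the URLs found; when the input is JSON-escaped, each URL still ends
   *         with the backslash of the closing \" and the caller must strip it
   */
  public static List<String> getImageSrc(List<String> listImageUrl) {
    List<String> listImgSrc = new ArrayList<String>();
    for (String image : listImageUrl) {
      Matcher matcher = IMGSRC_REG.matcher(image);
      while (matcher.find()) {
        // Drop the trailing delimiter character that the regex consumed
        String match = matcher.group();
        listImgSrc.add(match.substring(0, match.length() - 1));
      }
    }
    return listImgSrc;
  }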

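  /**
   * Downloads each resource into savePath, skipping files that already exist.
   * 
   * @param listImgSrc resource URLs, as returned by getImageSrc
   * @param savePath   local root directory to save into
   */
  public static void Download(List<String> listImgSrc, String savePath) {
    for (String src : listImgSrc) {
      // Strip backslashes left over from JSON escaping
      String url = src.replace("\\", "");
      if (!url.startsWith("http")) {
        continue;
      }
      try {
        URL uri = new URL(url);
        // Use the path below /uploads/onlinedocument/ as the relative file name.
        // A fixed character offset (indexOf("/") + 46) gives the same result
        // only for this exact host and prefix.
        String prefix = "/uploads/onlinedocument/";
        String path = uri.getPath();
        String imageName = path.startsWith(prefix)
            ? path.substring(prefix.length())
            : path.replaceFirst("^/+", "");
        File file = new File(savePath, imageName);
        if (file.exists()) {
          continue;
        }
        File parent = file.getParentFile();
        if (parent != null && !parent.exists()) {
          parent.mkdirs();
        }
        logger.info("Downloading: " + url);
        // try-with-resources closes both streams even if the copy fails
        try (InputStream in = uri.openStream();
            FileOutputStream fo = new FileOutputStream(file)) {
          byte[] buf = new byte[1024];
          int length;
          while ((length = in.read(buf)) != -1) {
            fo.write(buf, 0, length);
          }
        }
        logger.info(file.getPath() + " downloaded");
      } catch (Exception e) {
        // Log the cause and carry on with the remaining URLs
        logger.error("Download failed: " + url, e);
      }
    }
  }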

}

Usage looks roughly like this:

    // result is the raw JSON response string
    JSONObject data = JSONObject.parseObject(result);
    // Find the img tags
    List<String> imgList = DownloadImgUtils.getImageUrl(data.toString());
    // Check whether the JSON contains any media resources
    if(!CollectionUtils.isEmpty(imgList)) {
      // Extract the src of each img
      List<String> srcList = DownloadImgUtils.getImageSrc(imgList);
      String savePath = "D:/files/";
      // Start downloading
      DownloadImgUtils.Download(srcList, savePath);
    }
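
One detail a download step like this may need to handle: URL.openStream() applies no timeout by default, so a stalled server can block the loop indefinitely, and an interrupted transfer can leave a partial file behind. The following is only a sketch of one way to deal with both, not part of the original utility. The class name TimedDownloader, the method signature and the ".part" suffix are all made up for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Hypothetical helper, not part of the original DownloadImgUtils. */
class TimedDownloader {

  /**
   * Copies the resource at url into target and returns the number of bytes written.
   * Both timeouts are in milliseconds and are chosen by the caller.
   */
  static long download(URL url, Path target, int connectTimeoutMs, int readTimeoutMs)
      throws IOException {
    URLConnection conn = url.openConnection();
    conn.setConnectTimeout(connectTimeoutMs);
    conn.setReadTimeout(readTimeoutMs);

    Path parent = target.toAbsolutePath().getParent();
    if (parent != null) {
      Files.createDirectories(parent);
    }
    // Write to a temporary sibling first, so a failed transfer never leaves a
    // partial file that a later "already exists" check would skip.
    Path tmp = target.resolveSibling(target.getFileName() + ".part");
    try (InputStream in = conn.getInputStream()) {
      long bytes = Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
      Files.move(tmp, target, StandardCopyOption.REPLACE_EXISTING);
      return bytes;
    } finally {
      // No-op after a successful move; removes the leftover after a failure
      Files.deleteIfExists(tmp);
    }
  }
}
```

A call might look like TimedDownloader.download(new URL(src), Paths.get(savePath, imageName), 5000, 15000), where the two timeout values are arbitrary and should be tuned to the network in question.
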
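  • The idea is roughly this: first use a regex to find the <img> tags, then find the src attribute in each tag, then download from the URL that src points to. The code is fairly rough and does not cover every edge case; I will refine it later.

  • Also, sometimes it is not only images that need downloading but other media resources as well. In that case there is no need for a regex that matches img; matching http or https directly is enough.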
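
The two approaches described above can be sketched side by side. This is a minimal illustration rather than the author's implementation: the class UrlExtractor and both regexes are assumptions introduced here. It assumes URLs are absolute, contain no whitespace or quotes, and that the JSON serializer does not escape forward slashes. A regex is not an HTML parser, so unusual markup may slip through.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not part of the original DownloadImgUtils. */
class UrlExtractor {

  // Captures the src value of an <img> tag in one pass. The optional backslash
  // tolerates the \" that JSON serialization puts around attribute values.
  private static final Pattern IMG_SRC = Pattern.compile(
      "<img\\b[^>]*?\\bsrc\\s*=\\s*\\\\?[\"']?(https?://[^\\s\"'<>\\\\]+)",
      Pattern.CASE_INSENSITIVE);

  // Any absolute http/https URL, whatever tag or attribute it appears in.
  private static final Pattern ANY_URL = Pattern.compile("https?://[^\\s\"'<>\\\\]+");

  /** URLs that appear as the src of an img tag. */
  static List<String> imgSrcs(String text) {
    return collect(IMG_SRC, 1, text);
  }

  /** Every http/https URL in the text, for media other than images. */
  static List<String> allUrls(String text) {
    return collect(ANY_URL, 0, text);
  }

  private static List<String> collect(Pattern pattern, int group, String text) {
    List<String> out = new ArrayList<>();
    Matcher m = pattern.matcher(text);
    while (m.find()) {
      out.add(m.group(group));
    }
    return out;
  }
}
```

Because both patterns stop at a backslash or quote, the returned URLs need no trailing-character cleanup before being passed to a download step.
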
