Python crawler "Hello World"-level beginner example (2): using json to fetch data from China Weather (weather.com.cn)

1. Code first, no preamble

Python 2.7 version

#!/usr/bin/python2.7
#-*- coding=UTF-8 -*-

import urllib
import json

def get_dic(url):
    page = urllib.urlopen(url)
    html = page.read()
    page.close()
    dic=json.loads(html)
    return dic

dic = get_dic("http://www.weather.com.cn/data/cityinfo/101010100.html")

print dic['weatherinfo']['city']
print dic['weatherinfo']['ptime']
print dic['weatherinfo']['temp1']
print dic['weatherinfo']['temp2']
print dic['weatherinfo']['weather']

Python 3.5 version

#!/usr/bin/python3.5
#-*- coding=UTF-8 -*-

import urllib.request
import json

def get_dic(url):
    page = urllib.request.urlopen(url)
    html = page.read().decode('utf-8')
    page.close()
    dic=json.loads(html)
    return dic

dic = get_dic("http://www.weather.com.cn/data/cityinfo/101010100.html")

print(dic['weatherinfo']['city'])
print(dic['weatherinfo']['ptime'])
print(dic['weatherinfo']['temp1'])
print(dic['weatherinfo']['temp2'])
print(dic['weatherinfo']['weather'])

Here is the result:
[Screenshot: program output]

2. A brief look at the method (using the 2.7 code as an example)

def get_dic(url):
    page = urllib.urlopen(url)
    html = page.read()
    page.close()
    dic=json.loads(html)
    return dic

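This function uses urlopen and read from the urllib library to fetch the data at the URL. Unlike an ordinary web page, this one returns JSON rather than HTML, so the next step is the important one: json.loads decodes the returned JSON text into a Python dictionary.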
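To see the decoding step in isolation, here is a minimal sketch in Python 3 that runs json.loads on a hard-coded string instead of a live response. The sample text and the helper name parse_weather are assumptions for illustration: the keys mirror the ones used in this article, but the values are made up.

```python
import json

# Hypothetical sample shaped like the response used in this article;
# the values are invented for illustration only.
sample = ('{"weatherinfo": {"city": "Beijing", "temp1": "18", '
          '"temp2": "31", "weather": "sunny", "ptime": "18:00"}}')

def parse_weather(text):
    """Decode a JSON string and return the nested 'weatherinfo' dictionary."""
    dic = json.loads(text)      # JSON object -> Python dict
    return dic['weatherinfo']

info = parse_weather(sample)
print(type(info).__name__)      # the decoded object is a plain dict
print(info['city'])
```

Working from a fixed string like this is also a convenient way to try out the parsing code without depending on the network.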

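That's it. With this function we now have a Python dictionary containing the weather data from the page, and all that remains is to read the values by their keys. Simple, isn't it?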
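Indexing with dic['weatherinfo']['city'] raises KeyError if a field is missing from the response. As a hedged alternative (not what the code in this article does), dict.get can supply a fallback value. The function name summarize and the sample data below are hypothetical.

```python
def summarize(dic):
    """Build a one-line summary from the decoded dictionary."""
    info = dic.get('weatherinfo', {})
    # dict.get returns the fallback instead of raising KeyError
    # when a key is absent.
    city = info.get('city', 'unknown')
    weather = info.get('weather', 'unknown')
    temp1 = info.get('temp1', '?')
    temp2 = info.get('temp2', '?')
    return '{}: {}, {} / {}'.format(city, weather, temp1, temp2)

# Made-up sample data using the same keys as the article.
sample = {'weatherinfo': {'city': 'Beijing', 'weather': 'sunny',
                          'temp1': '18', 'temp2': '31'}}
print(summarize(sample))
print(summarize({}))   # no 'weatherinfo' at all: fallbacks are used
```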

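To find out which keys are available, iterate over the dictionary before accessing it:

for key in dic['weatherinfo']:
    print key,dic['weatherinfo'][key]

This prints every key together with its value, so you can see what the dictionary contains and pick the fields you need.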
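The loop above uses the Python 2 print statement. A rough Python 3 equivalent might look like the sketch below, which uses items() to get each key and value together. The helper name list_fields and the sample dictionary are assumptions for illustration, and the entries are sorted only to make the output order predictable.

```python
def list_fields(dic):
    """Return one 'key value' line per entry under 'weatherinfo'."""
    pairs = sorted(dic['weatherinfo'].items())   # sorted by key
    return ['{} {}'.format(key, value) for key, value in pairs]

# Made-up sample data using the same keys as the article.
sample = {'weatherinfo': {'city': 'Beijing', 'weather': 'sunny'}}
for line in list_fields(sample):
    print(line)
```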
Source code download links are attached as well:
Python 2.7 version
Python 3.5 version
