urllib errors caused by differences between Python 2 and Python 3, and the string issues they lead to

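Python 2 has two libraries, urllib and urllib2. Python 3 has no urllib2 (its contents were merged into the urllib package), so code ported from 2 to 3 raises errors. The Python 2 code is as follows: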

#!/usr/bin/env python
# -*- coding:UTF-8 -*-
import urllib
import urllib2
import json
deviceID="0000000666"
apikey = "a7e72c97-3aab-44db-af13-d9af0aee6506"
s = "s"
door = "door"
PIR = "pir"
Leak = "leak"
Smoke = "smoke"
Remote = "remote"
def http_post(data):
    try:
        url = 'http://www.linksprite.io/api/http'
        jdata = json.dumps(data)
        # print jdata
        req = urllib2.Request(url, jdata)
        req.add_header('Content-Type','application/json')
        # Send the request once; each extra urlopen(req) call would POST again
        response = urllib2.urlopen(req)
        # Read the body once: a second read() returns an empty string
        body = response.read()
        print body
        return body
    except urllib2.URLError:
        print "connect failed"
        return "connect failed"

def http_post1(data):

    url = 'http://www.linksprite.io/api/http'
    jdata = json.dumps(data)
    # print jdata
    req = urllib2.Request(url, jdata)
    req.add_header('Content-Type','application/json')
    response = urllib2.urlopen(req)
    body = response.read()  # read once; a second read() returns an empty string
    print body
    return body

values ={
    "action":"update",
    "apikey":apikey,
    "deviceid":deviceID,
    "params":
    {

    "d4": "0",
    "d3": "1",
    "d2": "1",
    "d1": "1",
    # "Door":door,
    # "PIR":PIR,
    # "Leak":Leak,
    # "Smoke":Smoke,
    # "Remote":Remote,
    # "SOS":s
    }}
print http_post(values)

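This Python source sends a POST request to the linkspriteIO platform. The request updates the data stored on the platform, which in turn controls a LinkNode R4/R8, so the relay switches of the LinkNode R4/R8 can be operated remotely.
In Python 3, Request moved from urllib2 to urllib.request, and URLError moved from urllib2 to urllib.error. After moving from Python 2.7 to Python 3.5, the code was changed as follows: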

import urllib
from urllib import request
from urllib import error
# import urllib2
import json

def http_post(data):

    deviceID = "0000000666"
    apikey = "a7e72c97-3aab-44db-af13-d9af0aee6506"
    s = "s"
    door = "door"
    PIR = "pir"
    Leak = "leak"
    Smoke = "smoke"
    Remote = "remote"

    try:
        url = 'http://www.linksprite.io/api/http'
        jdata = json.dumps(data)
        # print jdata
        req = request.Request(url, jdata)
        req.add_header('Content-Type','application/json')
        # print(req)
        # print(type(req))
        # # print(request.urlopen(req))
        # print(type(request.urlopen(str.encode(req))))
        # response = request.urlopen(str.encode(req))
        response = request.urlopen(req)
        body = response.read()  # read once; a second read() returns b''
        print(body)
        return body
    except error.URLError:
        print("connect failed")
        return "connect failed"

def http_post1(data):

    url = 'http://www.linksprite.io/api/http'
    jdata = json.dumps(data)
    # print jdata
    req = request.Request(url, jdata)
    req.add_header('Content-Type','application/json')
    response = request.urlopen(req)
    body = response.read()  # read once; a second read() returns b''
    print(body)
    return body

values ={
    "action":"update",
    "apikey":"a7e72c97-3aab-44db-af13-d9af0aee6506",
    "deviceid":"0000000666",
    "params":
    {

    "d4": "0",
    "d3": "0",
    "d2": "0",
    "d1": "0",
    # "Door":door,
    # "PIR":PIR,
    # "Leak":Leak,
    # "Smoke":Smoke,
    # "Remote":Remote,
    # "SOS":s
    }}
http_post(values)

This produces the following error:

TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.

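The error is raised at the line response = request.urlopen(req). Comparing urllib2's urlopen with urllib's request.urlopen, the two look no different. Searching online showed that the problem lies in the conversion between Python 3's bytes and str types. Once jdata is encoded, turning it from str into bytes, the code runs normally.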

jdata = json.dumps(data).encode("utf-8")
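
To see the fix in isolation, here is a minimal Python 3 sketch that builds the same kind of request without sending it. The helper name build_post_request and the shortened payload are illustrative assumptions and not part of the original script; only the URL is taken from the code above.

```python
import json
from urllib import request


def build_post_request(url, data):
    """Build a JSON POST request whose body is bytes, as Python 3 requires."""
    # json.dumps returns str; encode it so urlopen accepts it as POST data.
    body = json.dumps(data).encode("utf-8")
    req = request.Request(url, data=body)
    req.add_header("Content-Type", "application/json")
    return req


# Hypothetical, shortened payload; nothing is sent over the network here.
req = build_post_request(
    "http://www.linksprite.io/api/http",
    {"action": "update", "params": {"d1": "0"}},
)
print(type(req.data))    # <class 'bytes'>
print(req.get_method())  # POST
```

Passing req to request.urlopen would then send the request without the TypeError.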

Below is a brief introduction to how the string types changed between Python 2 and Python 3.

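Compared with Python 2, the string types in Python 3 are str and bytes: str corresponds to Python 2's unicode, and bytes corresponds to Python 2's str. This matters in practice. For example, data fetched from redis arrives as bytes, and comparing it with the str items in a list always gives False.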

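In Python 2, mixing unicode and str triggers implicit type conversion. In Python 3 they are two entirely separate types: an == comparison between them is always False, and ordering comparisons such as < raise TypeError.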

print(u'' == '') # Python2 -> True
print(b'' == '') # Python3 -> False
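
The redis case mentioned above can be reproduced without a redis server. In this minimal Python 3 sketch, the bytes value is a stand-in for data returned by an external source (an assumption for illustration); membership tests against a list of str fail until the value is decoded.

```python
allowed = ["door", "pir", "leak"]

raw = b"door"  # stand-in for a value read from an external source such as redis

print(raw in allowed)                  # False: bytes never equals str
print(raw.decode("utf-8") in allowed)  # True once decoded to str
```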

In Python 2, unicode and str both actually inherit from basestring:

# python2
isinstance('', basestring) # True
isinstance(u'', basestring) # True

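Handling string encoding in Python 2 is often confusing: should I call decode or encode? With ASCII-only data you can even mix the two up and no exception is raised. The mistake only surfaces later, when non-ASCII data goes through the implicit ASCII conversion and raises an error.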

# python2
s = ''
print(type(s)) # str
s.encode('utf-8') # wrong call, but no error is raised
s.decode('utf-8') # correct call

But in Python 3 the two types are completely different.
Correct usage in Python 3

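If you look at the methods of str and bytes objects in Python 3, you will see that most of them are the same: string-handling methods such as split and startswith exist on both. The most important difference is that str only has encode, while bytes only has decode.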

# python3
s = ''
s.encode('utf-8')
s.decode('utf-8') # AttributeError: 'str' object has no attribute 'decode'

# method arguments must still match the type of the object itself
b = b''
b.startswith('') # TypeError: startswith first arg must be bytes or a tuple of bytes, not str
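
A quick way to confirm this split in Python 3 is to inspect the two types directly. The sketch below uses only built-ins; the variable names and sample text are illustrative.

```python
text = "héllo"
data = text.encode("utf-8")  # str -> bytes

print(hasattr(text, "decode"))       # False: str has no decode
print(hasattr(data, "encode"))       # False: bytes has no encode
print(data.decode("utf-8") == text)  # True: the round trip restores the text

# Shared methods exist on both types, but arguments must match the type.
print(text.startswith("h"))   # True
print(data.startswith(b"h"))  # True
try:
    data.startswith("h")
except TypeError as exc:
    print("TypeError:", exc)
```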

Beyond that, in Python 2 it was common to convert types simply by calling str(obj). That used to work, but it no longer does for bytes:

# python3
b = b'hello world'
str(b) # "b'hello world'"

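As the code above shows, str() does turn the bytes into a str, but the result carries an extra b prefix (and quotes), because it is the repr of the bytes object. The correct approach here is: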

# python3
if isinstance(b, bytes):
    b = b.decode('utf-8')
else:
    b = str(b)
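
The same check can be wrapped in a small helper so the conversion happens in one place. The name to_text is hypothetical, and defaulting to UTF-8 is an assumption; use whatever encoding the data source actually produces.

```python
def to_text(value, encoding="utf-8"):
    """Return value as str, decoding bytes instead of calling str() on them."""
    if isinstance(value, (bytes, bytearray)):
        return value.decode(encoding)
    return str(value)


print(to_text(b"hello world"))  # hello world
print(to_text("hello world"))   # hello world
print(to_text(42))              # 42
print(str(b"hello world"))      # b'hello world' (the unwanted form)
```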

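In addition, many standard library functions now restrict the types they accept. For example, the hashing functions in hashlib only accept bytes, and json.loads only accepted str before Python 3.6 (later versions also accept bytes).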
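
A short Python 3 sketch of both restrictions follows. The sample text and JSON payload are made up for illustration.

```python
import hashlib
import json

text = "hello"

# hashlib works on bytes, so passing a str raises TypeError.
try:
    hashlib.sha256(text)
except TypeError as exc:
    print("TypeError:", exc)

digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
print(len(digest))  # 64 hex characters

# Decoding first also works on Python versions before 3.6,
# where json.loads rejects bytes.
payload = b'{"d1": "0"}'
params = json.loads(payload.decode("utf-8"))
print(params["d1"])  # 0
```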

Python 3's switch to UTF-8 as the default encoding solves many problems.

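Compared with Python 2, string handling in Python 3 may be a little more tedious, but it is much safer, and some very strange problems are caught early; the clear split between decode and encode is one example. Because of these changes, we need to watch the places where bytes can appear (usually data coming from outside the program), convert the type there, and use str consistently at the data-exchange layer.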

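Compared with Python 2, the names str and bytes are also closer to what the types really are. This is how I remember the relationship: str is a sequence of Unicode code points, each of which can be seen as the character's unique identifier in the world, while bytes is the binary data actually stored once the str is encoded with some encoding (such as utf-8). Unicode is a kind of protocol, and utf-8 is one way of implementing it.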
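
The distinction between code points and encoded bytes can be checked directly. In this minimal Python 3 sketch the sample text is arbitrary, and utf-16-le is included only as a second encoding for contrast.

```python
text = "中文"

# str: a sequence of Unicode code points
print(len(text))                      # 2
print([hex(ord(ch)) for ch in text])  # ['0x4e2d', '0x6587']

# bytes: the same text stored with a particular encoding
utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
print(len(utf8))   # 6
print(len(utf16))  # 4

# Different byte sequences, same code points after decoding
print(utf8.decode("utf-8") == utf16.decode("utf-16-le"))  # True
```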

Reference: Python 3 string issues (Python3 字符串问题)
