Using the json module in Python 3

1 Overview

JSON (JavaScript Object Notation) is a widely used lightweight data format. The json module in the Python standard library provides functions for processing JSON data.

A very commonly used basic data structure in Python is a dictionary. Its typical structure is as follows:

d = {
    'a': 123,
    'b': {
        'x': ['A', 'B', 'C']
    }
}

And the structure of JSON is as follows:

{
    "a": 123,
    "b": {
        "x": ["A", "B", "C"]
    }
}

As you can see, a Python dict and a JSON object are very close, and the main function of the json library in Python is to convert between the two.

2 Read JSON

The json.loads method converts JSON data held in a str, bytes, or bytearray object into a Python object; a JSON object becomes a dict. Its full interface signature (as of Python 3.6) is as follows:

json.loads(s, *, encoding=None, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)

2.1 The simplest example

The most basic way to use json.loads is to pass it a str containing JSON data:

>>> import json
>>> json.loads('{"a": 123}')
{'a': 123}

Note

In Python, a str value can be enclosed in a pair of single quotes or a pair of double quotes:

>>> 'ABC' == "ABC"
True

Therefore, it is legal and equivalent to use single or double quotes for str keys and values when defining a dict:

>>> {"a": 'ABC'} == {'a': "ABC"}
True

However, in JSON, strings can only be enclosed in double quotes, so the JSON content passed to json.loads must use double quotes for its strings. Otherwise, a decoding error occurs:

>>> json.loads("{'a': 123}")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/decoder.py", line 355, in raw_decode
    obj, end = self.scan_once(s, idx)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes: line 1 column 2 (char 1)

If the Python string literal itself is enclosed in double quotes, the double quotes inside the JSON need to be escaped:

>>> json.loads("{\"a\": 123}")
{'a': 123}
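When the input comes from an untrusted source, the decoding error shown above can be caught instead of letting it propagate. The following is a minimal sketch; the helper name parse_or_none is hypothetical and only used for illustration:

```python
import json

def parse_or_none(text):
    """Return the decoded value, or None when text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as exc:
        # JSONDecodeError carries the position of the problem.
        print('invalid JSON at line %d column %d: %s'
              % (exc.lineno, exc.colno, exc.msg))
        return None

ok = parse_or_none('{"a": 123}')     # valid: double-quoted key
bad = parse_or_none("{'a': 123}")    # invalid: single-quoted key
```

json.JSONDecodeError is a subclass of ValueError, so code that already catches ValueError also handles it.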

2.2 bytes and bytearray data

json.loads can also handle bytes and bytearray objects whose content is JSON data:

>>> json.loads('{"a": 123}'.encode('UTF-8'))
{'a': 123}
>>> json.loads(bytearray('{"a": 123}', 'UTF-8'))
{'a': 123}

2.3 Encoding format

The encoding parameter of json.loads has no actual effect; it was deprecated and then removed in Python 3.9.

A str in Python 3 is already Unicode text, so when the s parameter is a str, json.loads does not need to decode anything. Note that the str must not start with a BOM character.

When the s parameter is a bytes or bytearray, json.loads automatically detects whether it is encoded as UTF-8, UTF-16, or UTF-32, and decodes it into a str for subsequent processing.
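The encoding behavior described above can be checked with a short sketch. The sample text and variable names are illustrative only:

```python
import json

text = '{"a": "é"}'

# bytes input: the encoding is detected automatically.
from_utf8 = json.loads(text.encode('utf-8'))
from_utf16 = json.loads(text.encode('utf-16'))
from_utf32 = json.loads(text.encode('utf-32'))

# str input: a leading BOM character is rejected.
try:
    json.loads('\ufeff' + text)
    bom_rejected = False
except json.JSONDecodeError:
    bom_rejected = True
```

All three byte encodings decode to the same dict, while the str that starts with a BOM raises a decoding error.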

2.4 Data type conversion

JSON can represent four primitive types of data:

  1. string

  2. number

  3. boolean

  4. null

and two structured data types:

  1. object

  2. array

In the default implementation, the data conversion correspondence between JSON and Python is as follows:

JSON             Python
object           dict
array            list
string           str
number (int)     int
number (real)    float
true             True
false            False
null             None

The actual conversion is as follows:

>>> json.loads("""
... {
...     "obj": {
...             "str": "ABC",
...             "int": 123,
...             "float": -321.89,
...             "bool_true": true,
...             "bool_false": false,
...             "null": null,
...             "array": [1, 2, 3]
...     }
... }""")
{'obj': {'str': 'ABC', 'int': 123, 'float': -321.89, 'bool_true': True, 'bool_false': False, 'null': None, 'array': [1, 2, 3]}}

For data of type number in JSON, the following points need to be noted:

  1. The precision of a real number in JSON should not exceed the precision of the Python float type; otherwise precision is lost. For example:

    >>> json.loads('3.141592653589793238462643383279')
    3.141592653589793
  2. The JSON standard does not include the non-number NaN, positive infinity Infinity, or negative infinity -Infinity, but by default json.loads converts NaN, Infinity, and -Infinity in the JSON string to float('nan'), float('inf'), and float('-inf') in Python. Note that NaN, Infinity, and -Infinity in the JSON must be correctly cased and spelled out in full. For example:

    >>> json.loads('{"inf": Infinity, "nan": NaN, "ninf": -Infinity}')
    {'inf': inf, 'nan': nan, 'ninf': -inf}

2.5 Custom JSON object conversion type

By default, json.loads converts object data in JSON to the dict type. The object_hook parameter can be used to change the object that is constructed.

object_hook accepts a function whose input is the dict converted from a JSON object and whose return value is the custom object. For example:

>>> class MyJSONObj:
...     def __init__(self, x):
...             self.x = x
...
>>> def my_json_obj_hook(data):
...     print('obj_hook data: %s' % data)
...     return MyJSONObj(data['x'])
...
>>> result = json.loads('{"x": 123}', object_hook=my_json_obj_hook)
obj_hook data: {'x': 123}
>>> type(result)
<class '__main__.MyJSONObj'>
>>> result.x
123

When objects in JSON are nested, json.loads traverses the object tree depth-first and passes the object data of each level to object_hook. The Python object constructed from a leaf-node JSON object is then passed, as a value inside the parent's dict, to the object_hook call for the parent node. For example:

>>> class MyJSONObj:
...     def __init__(self, x, y):
...             self.x = x
...             self.y = y
...
>>> def my_json_obj_hook(data):
...     print('obj_hook data: %s' % data)
...     return MyJSONObj(**data)
...
>>> result = json.loads('{"x": {"x": 11, "y": 12}, "y": {"x": 21, "y":22}}', object_hook=my_json_obj_hook)
obj_hook data: {'x': 11, 'y': 12}
obj_hook data: {'x': 21, 'y': 22}
obj_hook data: {'x': <__main__.MyJSONObj object at 0x10417ef28>, 'y': <__main__.MyJSONObj object at 0x10417ed68>}
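A common practical use of this depth-first behavior is to turn every JSON object into an attribute-style object. The sketch below uses types.SimpleNamespace from the standard library; the function name to_namespace and the sample data are illustrative assumptions:

```python
import json
from types import SimpleNamespace

def to_namespace(data):
    # Called once per JSON object, innermost objects first, so nested
    # values have already been converted when the parent is processed.
    return SimpleNamespace(**data)

config = json.loads(
    '{"name": "demo", "size": {"w": 3, "h": 4}}',
    object_hook=to_namespace,
)
area = config.size.w * config.size.h
```

This only works when every key is a valid Python identifier; otherwise the attributes cannot be reached with dot notation.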

In addition to object_hook, there is also an object_pairs_hook parameter, which can likewise be used to change the type of Python object that json.loads constructs. The difference from object_hook is that the function passed in receives not a dict but a list of tuples. Each tuple has two elements: the first is the key in the JSON data and the second is the value corresponding to that key. For example, for the JSON object

{
    "a": 123,
    "b": "ABC"
}

The corresponding input data is

[
    ('a', 123),
    ('b', 'ABC')
]

When both object_hook and object_pairs_hook are specified in a call to json.loads, object_pairs_hook takes priority and object_hook is ignored.
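The input format and the priority rule can both be observed in a small sketch. The names pairs_hook, dict_hook, and seen are hypothetical and exist only for this illustration:

```python
import json

seen = []

def pairs_hook(pairs):
    # pairs is a list of (key, value) tuples in document order.
    seen.append(pairs)
    return dict(pairs)

def dict_hook(data):
    # Never reached when object_pairs_hook is also given.
    raise AssertionError('object_hook should not be called')

result = json.loads(
    '{"a": 123, "b": "ABC"}',
    object_hook=dict_hook,
    object_pairs_hook=pairs_hook,
)
```

After the call, seen holds the single list of pairs that was passed in, and dict_hook was never invoked.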

2.6 Custom JSON number conversion type

In the default implementation, real numbers in JSON are converted to the Python float type and integers to the int type. As with object_hook, custom conversion logic can be specified through the parse_float and parse_int parameters. The input to each of these functions is a str holding the JSON real number or integer. In the following example, we convert real numbers to numpy.float64 and integers to numpy.int64:

>>> import numpy
>>> def my_parse_float(f):
...     print('%s(%s)' % (type(f), f))
...     return numpy.float64(f)
...
>>> def my_parse_int(i):
...     print('%s(%s)' % (type(i), i))
...     return numpy.int64(i)
...
>>> result = json.loads('{"i": 123, "f": 321.45}', parse_float=my_parse_float, parse_int=my_parse_int)
<class 'str'>(123)
<class 'str'>(321.45)
>>> type(result['i'])
<class 'numpy.int64'>
>>> type(result['f'])
<class 'numpy.float64'>
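The same mechanism works with standard-library types. As a sketch, passing decimal.Decimal as parse_float avoids the precision loss mentioned in section 2.4, because the digits are handed over as a str and never pass through float:

```python
import json
from decimal import Decimal

result = json.loads(
    '{"pi": 3.141592653589793238462643383279}',
    parse_float=Decimal,
)
pi = result['pi']  # a Decimal holding every digit from the JSON text
```

The trade-off is that the result contains Decimal objects, which json.dumps does not serialize by default.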

2.6.1 Custom NaN, Infinity and -Infinity conversion types

Since standard JSON does not support NaN, Infinity and -Infinity, parse_float does not receive these values. To customize the objects they are converted to, use another parameter, parse_constant. In the following example, these values are also converted to the numpy.float64 type:

>>> def my_parse_constant(data):
...     print('%s(%s)' % (type(data), data))
...     return numpy.float64(data)
...
>>> result = json.loads('{"inf": Infinity, "nan": NaN, "ninf": -Infinity}', parse_constant=my_parse_constant)
<class 'str'>(Infinity)
<class 'str'>(NaN)
<class 'str'>(-Infinity)
>>> result['inf']
inf
>>> type(result['inf'])
<class 'numpy.float64'>
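parse_constant can also be used the other way around, to refuse these non-standard values altogether. A minimal sketch, in which the function name reject_constant is an assumption made for illustration:

```python
import json

def reject_constant(name):
    # name is the literal text: 'NaN', 'Infinity' or '-Infinity'.
    raise ValueError('non-standard JSON constant: %s' % name)

# Ordinary numbers are unaffected.
strict_ok = json.loads('{"x": 1.5}', parse_constant=reject_constant)

try:
    json.loads('{"x": NaN}', parse_constant=reject_constant)
    rejected = False
except ValueError:
    rejected = True
```

This gives a stricter parser that only accepts the number forms defined by the JSON specification.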

2.7 Non-object top-level values

According to the JSON specification, a JSON text may consist of a single value rather than a complete object. This value can be a string, a number, a boolean, null, or an array. Besides these types given by the JSON specification, json.loads also accepts NaN, Infinity, or -Infinity as a top-level value:

>>> json.loads('"hello"')
'hello'
>>> json.loads('123')
123
>>> json.loads('123.34')
123.34
>>> json.loads('true')
True
>>> json.loads('false')
False
>>> print(json.loads('null'))
None
>>> json.loads('[1, 2, 3]')
[1, 2, 3]

2.8 Duplicate key names

Within a single JSON object there should be no duplicate key names, but the JSON specification does not define how to handle this situation. In json.loads, when the JSON data contains duplicate key names, later values overwrite earlier ones:

>>> json.loads('{"a": 123, "b": "ABC", "a": 321}')
{'a': 321, 'b': 'ABC'}
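If silently overwriting is not acceptable, duplicates can be detected with the object_pairs_hook parameter from section 2.5, since it receives every key/value pair before a dict is built. The following is a sketch; unique_keys is a hypothetical helper name:

```python
import json

def unique_keys(pairs):
    # Build the dict by hand so a repeated key can be detected.
    result = {}
    for key, value in pairs:
        if key in result:
            raise ValueError('duplicate key: %r' % key)
        result[key] = value
    return result

clean = json.loads('{"a": 123, "b": "ABC"}', object_pairs_hook=unique_keys)

try:
    json.loads('{"a": 123, "b": "ABC", "a": 321}',
               object_pairs_hook=unique_keys)
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)
```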

2.9 Processing JSON data files

When the JSON data is stored in a file, the json.load method can be used to read the data from the file and convert it to a Python object. The first parameter of json.load is a file object pointing to the JSON data file.

For example, suppose the file /tmp/data.json contains the following:

{
    "a": 123,
    "b": "ABC"
}

You can use the code in the following example to read and convert JSON data in a file:

>>> with open('/tmp/data.json') as jf:
...     json.load(jf)
...
{'a': 123, 'b': 'ABC'}

Besides file objects, any file-like object that implements the read method can be passed as the fp parameter, such as io.StringIO in the following example:

>>> import io
>>> sio = io.StringIO('{"a": 123}')
>>> json.load(sio)
{'a': 123}

The meaning and usage of the other parameters of json.load are the same as for json.loads above and are not repeated here.

3 Generate JSON

The json.dumps method converts a Python object into a string representing JSON data. Its full interface signature is as follows:

json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

Its first parameter, obj, is the data object to be converted.

>>> json.dumps({'a': 123, 'b': 'ABC'})
'{"a": 123, "b": "ABC"}'

3.1 Encoding format

The ensure_ascii parameter of json.dumps controls how non-ASCII characters appear in the generated JSON string. Its default value is True, in which case all non-ASCII characters are escaped as \uXXXX sequences. If you do not want this automatic escaping, set it to False and the characters are kept as they are. For example:

>>> json.dumps({'数字': 123, '字符': '一二三'})
'{"\\u6570\\u5b57": 123, "\\u5b57\\u7b26": "\\u4e00\\u4e8c\\u4e09"}'
>>> json.dumps({'数字': 123, '字符': '一二三'}, ensure_ascii=False)
'{"数字": 123, "字符": "一二三"}'
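Both forms describe the same data; only the text representation differs. The sketch below compares them, using illustrative variable names:

```python
import json

data = {'数字': 123, '字符': '一二三'}

escaped = json.dumps(data)                       # \uXXXX escapes
readable = json.dumps(data, ensure_ascii=False)  # characters kept as-is

# Decoding either string gives back the original data.
same_data = json.loads(escaped) == json.loads(readable) == data

# The escaped form is pure ASCII but longer once stored as bytes.
ascii_size = len(escaped.encode('ascii'))
utf8_size = len(readable.encode('utf-8'))
```

ensure_ascii=False is usually the better choice when the output is written as UTF-8 and read by people; the default is safer when the transport may not handle non-ASCII text.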

3.2 Data Type Conversion

In the default implementation, the Python objects that json.dumps can process, and all values nested inside them, must be of type dict, list, tuple, str, int, float, bool, or None. The data conversion relationship between these types and JSON is as follows:

Python                                        JSON
dict                                          object
list, tuple                                   array
str                                           string
int, float, int- and float-derived Enums      number
True                                          true
False                                         false
None                                          null

The actual conversion situation is as follows:

>>> json.dumps(
...     {
...             'str': 'ABC',
...             'int': 123,
...             'float': 321.45,
...             'bool_true': True,
...             'bool_false': False,
...             'none': None,
...             'list': [1, 2, 3],
...             'tuple': (12, 34)
...     }
... )
'{"str": "ABC", "int": 123, "float": 321.45, "bool_true": true, "bool_false": false, "none": null, "list": [1, 2, 3], "tuple": [12, 34]}'
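Because list and tuple both map to a JSON array, the conversion is not fully reversible. A short sketch of the round trip, with illustrative variable names:

```python
import json

original = {'point': (12, 34)}

# dumps writes the tuple as an array; loads reads every array as a list.
restored = json.loads(json.dumps(original))

came_back_as_list = isinstance(restored['point'], list)
```

Code that depends on getting tuples back has to convert them explicitly after loading.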

Although the JSON standard does not support NaN, Infinity and -Infinity, the default implementation of json.dumps converts float('nan'), float('inf') and float('-inf') to the constants NaN, Infinity, and -Infinity. For example:

>>> json.dumps(
...     {
...             'nan': float('nan'),
...             'inf': float('inf'),
...             '-inf': float('-inf')
...     }
... )
'{"nan": NaN, "inf": Infinity, "-inf": -Infinity}'

Since these constants may make the generated JSON string unreadable by other JSON implementations, the allow_nan parameter of json.dumps can be set to False to prevent this. In that case, json.dumps raises a ValueError when these values appear in the Python object being processed.
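The strict behavior can be sketched as follows. Replacing non-finite values with None before encoding is only one possible workaround, and the sanitize helper is a hypothetical name that handles a flat dict only:

```python
import json
import math

data = {'nan': float('nan'), 'x': 1.5}

try:
    json.dumps(data, allow_nan=False)
    error = None
except ValueError as exc:
    error = exc

def sanitize(value):
    # Map NaN and the infinities to None, which becomes JSON null.
    if isinstance(value, float) and not math.isfinite(value):
        return None
    return value

strict_text = json.dumps(
    {key: sanitize(value) for key, value in data.items()},
    allow_nan=False,
)
```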

3.3 Circular references

The json.dumps method checks whether the Python object contains circular references and raises an exception if one is found. For example:

>>> circular_obj = {}
>>> circular_obj['self'] = circular_obj
>>> circular_obj
{'self': {...}}
>>> json.dumps(circular_obj)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
ValueError: Circular reference detected

If you don't want json.dumps to check for circular references, you can set the check_circular parameter to False. A circular reference then leads to infinite recursion instead:

>>> json.dumps(circular_obj, check_circular=False)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 238, in dumps
    **kw).encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
RecursionError: maximum recursion depth exceeded while encoding a JSON object

3.4 JSON string output format

The indent parameter of json.dumps controls line breaks and indentation in the JSON string.

The default value of indent is None, in which case the JSON string has no line breaks or indentation:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}))
{"a": 123, "b": {"x": 321, "y": "ABC"}}

When indent is 0 or negative, the JSON string contains line breaks but no indentation:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=-1))
{
"a": 123,
"b": {
"x": 321,
"y": "ABC"
}
}
>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=0))
{
"a": 123,
"b": {
"x": 321,
"y": "ABC"
}
}

When indent is a positive integer, in addition to line breaks, each level of the object hierarchy is indented by the specified number of spaces:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=2))
{
  "a": 123,
  "b": {
    "x": 321,
    "y": "ABC"
  }
}

indent can also be a str, in which case each level is indented with that string, such as a tab \t:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent='\t'))
{
        "a": 123,
        "b": {
            "x": 321,
            "y": "ABC"
        }
}

Another parameter of json.dumps, separators, sets the output separators. Its value should be a two-element tuple: the first value is the separator between items, and the second is the separator between a key and its value. Its default value depends on the indent parameter above. When indent is None, the default separators value is (', ', ': '), that is, each separator is followed by a space. When indent is not None, the default is (',', ': '), that is, only the key-value separator is followed by a space; the item separator has no trailing space because a line break follows it.

One possible use of the separators parameter is to reduce the size of the JSON string by removing all non-essential formatting characters. In that case, set separators to (',', ':') and either leave indent unset or explicitly set it to None:

>>> print(json.dumps({'a': 123, 'b': {'x': 321, 'y': 'ABC'}}, indent=None, separators=(',', ':')))
{"a":123,"b":{"x":321,"y":"ABC"}}
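The saving can be measured directly. In this sketch the variable names are illustrative, and the data is the same as in the example above:

```python
import json

data = {'a': 123, 'b': {'x': 321, 'y': 'ABC'}}

default_text = json.dumps(data)
compact_text = json.dumps(data, separators=(',', ':'))

# One space is saved for every item separator and key separator.
saved = len(default_text) - len(compact_text)

# Whitespace between tokens is not significant in JSON.
same_data = json.loads(compact_text) == json.loads(default_text)
```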

3.5 Converting custom Python objects

The default implementation of json.dumps can only convert the built-in types listed above. To convert a custom object, use the default parameter. This parameter accepts a function that takes the Python object to be converted and returns a dict (or another serializable value) representing it. json.dumps calls the default function starting from the top of the object reference tree and works through the tree level by level, so you do not need to implement the traversal yourself; you only need to handle the object at the current level. For objects it cannot handle, the function should raise a TypeError. For example:

>>> class MyClass:
...     def __init__(self, x, y):
...             self.x = x
...             self.y = y
...
>>> def my_default(o):
...     if isinstance(o, MyClass):
...             print('%s.y: %s' % (type(o), o.y))
...             return {'x': o.x, 'y': o.y}
...     # Returning o unchanged would be reported as a circular reference.
...     raise TypeError('%r is not JSON serializable' % o)
...
>>> obj = MyClass(x=MyClass(x=1, y=2), y=11)
>>> json.dumps(obj, default=my_default)
<class '__main__.MyClass'>.y: 11
<class '__main__.MyClass'>.y: 2
'{"x": {"x": 1, "y": 2}, "y": 11}'
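The return value of the default function does not have to be a dict; any value that json.dumps can already serialize works. The sketch below handles two standard-library types; the function name encode_extra and the sample data are assumptions made for illustration:

```python
import json
from datetime import date

def encode_extra(o):
    if isinstance(o, date):
        return o.isoformat()   # date -> JSON string
    if isinstance(o, set):
        return sorted(o)       # set -> JSON array, in a stable order
    # Anything else is reported as unsupported.
    raise TypeError('Object of type %s is not JSON serializable'
                    % type(o).__name__)

text = json.dumps(
    {'day': date(2020, 1, 2), 'tags': {'b', 'a'}},
    default=encode_extra,
)
```

Note that this is one-way: json.loads returns a plain str and a list, not a date and a set.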

3.6 Non-string type key names

In Python, any hashable object can be used as a dict key, while the JSON specification allows only strings as key names. json.dumps therefore checks key types, but it widens the allowed range: str, int, float, bool and None values can all be used as keys. When a key is not a str, it is converted to the corresponding str value. For example:

>>> json.dumps(
...     {
...             'str': 'str',
...             123: 123,
...             321.54: 321.54,
...             True: True,
...             False: False,
...             None: None
...     }
... )
'{"str": "str", "123": 123, "321.54": 321.54, "true": true, "false": false, "null": null}'

And when other types of key names appear, an exception is thrown by default:

>>> json.dumps({(1,2): 123})
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
TypeError: keys must be a string

The skipkeys parameter of json.dumps changes this behavior. When skipkeys is set to True, an illegal key type does not raise an exception; the key is skipped instead:

>>> json.dumps({(1,2): 123}, skipkeys=True)
'{}'
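Two consequences of these key rules are worth seeing in code. First, non-str keys do not survive a round trip. Second, when dropping entries with skipkeys is not acceptable, the keys can be converted before encoding. The helper name stringify_keys is hypothetical, and using repr for the conversion is just one possible choice:

```python
import json

# Keys come back as str after a round trip.
round_trip = json.loads(json.dumps({1: 'one', None: 'nothing'}))

def stringify_keys(d):
    # Keep str keys as they are, convert every other key with repr().
    return {k if isinstance(k, str) else repr(k): v for k, v in d.items()}

text = json.dumps(stringify_keys({(1, 2): 123, 'ok': True}))
```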

3.7 Generate JSON file

When you need to save the generated JSON data to a file, you can use the json.dump method. Compared with json.dumps, it takes one more parameter, fp, the file object the JSON data is written to. For example:

>>> with open('/tmp/data.json', mode='w') as jf:
...     json.dump({'a': 123}, jf)
...

This writes the JSON data to the file /tmp/data.json. After the code is executed, the content of the file is

{"a": 123}

The json.dump method can also accept other file-like objects:

>>> sio = io.StringIO()
>>> json.dump({'a': 123}, sio)
>>> sio.getvalue()
'{"a": 123}'

The other parameters of json.dump are used in the same way as those of json.dumps and are not repeated here.
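Putting json.dump and json.load together, a file round trip can be sketched as follows. The temporary directory and the explicit encoding='utf-8' are choices made for this illustration, not requirements of the json module:

```python
import json
import os
import tempfile

data = {'a': 123, '字符': '一二三'}
path = os.path.join(tempfile.mkdtemp(), 'data.json')

# mode='w' replaces any existing content; the explicit encoding avoids
# depending on the platform default when ensure_ascii is False.
with open(path, mode='w', encoding='utf-8') as jf:
    json.dump(data, jf, ensure_ascii=False, indent=2)

with open(path, encoding='utf-8') as jf:
    restored = json.load(jf)
```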

4 JSON decoding and encoding class implementation

The four methods json.loads, json.load, json.dumps and json.dump do their work through two classes, json.JSONDecoder and json.JSONEncoder. These two classes can therefore also be used directly to perform the functions described above:

>>> json.JSONDecoder().decode('{"a": 123}')
{'a': 123}
>>> json.JSONEncoder().encode({'a': 123})
'{"a": 123}'

The parameters of json.loads, json.load, json.dumps and json.dump are mostly passed on to the constructors of json.JSONDecoder and json.JSONEncoder, so these four methods meet most needs. When you need a custom subclass of json.JSONDecoder or json.JSONEncoder, simply pass the subclass as the cls parameter. These methods also accept **kw: when the constructor of the custom class takes parameters beyond the standard list, **kw forwards them to that constructor.
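A sketch of a custom encoder passed through cls, including one extra constructor parameter delivered via **kw. The class name TaggedSetEncoder and its tag parameter are hypothetical and exist only for this illustration:

```python
import json

class TaggedSetEncoder(json.JSONEncoder):
    def __init__(self, *, tag='set', **kwargs):
        # Standard parameters (indent, sort_keys, ...) go to the base class.
        super().__init__(**kwargs)
        self.tag = tag

    def default(self, o):
        # Called only for objects the base encoder cannot handle.
        if isinstance(o, set):
            return {self.tag: sorted(o)}
        return super().default(o)  # raises TypeError

# tag is not a standard json.dumps parameter; it reaches __init__ via **kw.
text = json.dumps({'s': {3, 1, 2}}, cls=TaggedSetEncoder, tag='items')
```

Overriding the default method of a subclass is equivalent to passing a default function, but it keeps the conversion logic and its configuration in one reusable class.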
