A Python pitfall: using an empty list as a default parameter made me think my code was haunted

In Python, do not use a list or any other mutable container as a default parameter value. Otherwise you are likely to run into some very strange problems.

Suppose you call a function with the same arguments. Run by hand, it gives the correct result every time. But called repeatedly in a loop, it gives a different result each time, and each call returns its data piled on top of the data from the previous call. If so, congratulations: you have probably hit the same problem I did.

First, the background of the problem

I was recently developing a web application with the Django framework and wrote a function that fetches data from an interface, processes it, and returns it to the front-end page.

Then something strange happened. When I ran the module containing this function on its own in PyCharm, or called the function by itself, the returned data was fine. Even when I ran it many times in quick succession, the results were always the same.

But when I ran it inside the Django project as a whole and triggered the request from the front-end page, the first back-end call to the function was fine, yet if the same request was sent again shortly afterwards, the returned data was stacked on top of the data from the previous call. If the same request was sent again after a longer interval, the first call was fine once more and the second went wrong again.

This left me baffled for days; I thought I had run into haunted code. I racked my brains and could not work out what had gone wrong. I even suspected that the front-end Ajax request was somehow calling the function more than once, that the interface server's response mechanism was at fault, or that my program logic was wrong. Step-by-step checks ruled all of these out; every one of them was fine.

This is the most troublesome kind of bug. When you know where the problem is, you can look up the answer; when you cannot even see where it is, all you can do is check things one by one.

As I investigated, I narrowed the scope step by step and finally pinned the root cause on the function's parameters: I had used a list as a default parameter value.

As far as the application scenario goes, this is logically fine; the Python interpreter allows it and raises no error. The catch is the way Python handles object references, which produces two different behaviours: if you manually run the program several times in quick succession, nothing goes wrong; but if the function is called repeatedly within a single run, for example in a loop, you may get abnormal results.

So why does using an empty list as a default parameter cause a problem?

More precisely: using any mutable container (such as a list or a dictionary) as a default parameter can lead to this problem.

Second, reproducing the problem

The original function in the project is too long, so to highlight the problem I stripped it down and wrote two sample functions for comparison:
[Image: source code of the two sample functions, get_data_1 and get_data_2]

1. The role of the two functions and the difference between them

Both functions do the same thing:
Each takes a name as store_name and checks whether page_num has reached page 2. If not, it appends the data to the list page_data, adds 1 to the page number, and calls itself again. If page_num is page 2, it returns the whole processed list.

The difference: one uses an empty list directly as the default value; for the other, an empty list is first assigned to a variable, and that variable is then passed to the function as a positional argument.
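The original listing survives only as an image, so here is a minimal reconstruction of the two functions from the description above. The signatures and the starting page number of 0 are assumptions, chosen so that a single call appends the name twice, which matches the results reported later in the article.

```python
def get_data_1(store_name, page_num=0, page_data=[]):
    """Version 1: an empty list literal is the default value of page_data."""
    if page_num != 2:
        page_data.append(store_name)  # collect the data for this page
        return get_data_1(store_name, page_num + 1, page_data)
    return page_data


def get_data_2(store_name, page_data, page_num=0):
    """Version 2: the caller must pass the list in as a positional argument."""
    if page_num != 2:
        page_data.append(store_name)  # collect the data for this page
        return get_data_2(store_name, page_data, page_num + 1)
    return page_data
```

Apart from where page_data comes from, the two bodies are deliberately identical, so any difference in behaviour can only come from the default value.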

2. What happens when each is called once in a single run?

We run the module directly and call each of the two functions once, passing in the name "KFC", to see what happens.
(Note: the first function has an empty list as the default for its parameter, so it runs even if we pass no argument for page_data. To call the second function, you must first define an empty list and pass it in as an argument. The effect appears to be the same.)

Each function is called only once. The calling code:

[Image: code calling each function once with "KFC"]

Intuitively, we would expect the two functions to give the same result.

Now look at the output:
[Image: output of the single calls]
Sure enough, the output is identical, so a single call works. Running the module by hand over and over also gives the same result every time.
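The single-call experiment can be sketched as follows. The two function bodies are reconstructed from the article's description, and the starting page number of 0 is an assumption made so that one call appends the name twice.

```python
def get_data_1(store_name, page_num=0, page_data=[]):
    if page_num != 2:
        page_data.append(store_name)
        return get_data_1(store_name, page_num + 1, page_data)
    return page_data


def get_data_2(store_name, page_data, page_num=0):
    if page_num != 2:
        page_data.append(store_name)
        return get_data_2(store_name, page_data, page_num + 1)
    return page_data


result_1 = get_data_1("KFC")             # relies on the default list
page_data = []                           # the caller creates the list
result_2 = get_data_2("KFC", page_data)  # and passes it in positionally

print(result_1)  # ['KFC', 'KFC']
print(result_2)  # ['KFC', 'KFC']
```

Because every fresh run of the module starts a new interpreter, repeating this by hand cannot reveal any difference between the two functions.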

3. What happens when the function is called several times in a single run?

A for loop calls each of the two functions twice. The code:
[Image: code calling each function twice in a for loop]
The results:
[Image: output of the repeated calls]
This result is not normal.

With the empty list as the default parameter:

On the first iteration of the for loop, get_data_1 runs its internal cycle twice and produces the list ['KFC', 'KFC'].

On the second iteration, it picks up the list left by the first iteration, uses it as the default parameter, and appends to it twice more inside the function, so the second iteration returns ['KFC', 'KFC', 'KFC', 'KFC'].

Yet before the second iteration we assigned nothing to page_data; we wanted the function to run with its default. Strangely, although no value was passed, it used the data produced by the previous iteration.

With the empty list passed as a positional argument:

Even when called twice in quick succession, the output is the same both times, as expected.

Normally, unless we intend to limit how many times a function may be called, repeated calls with the same arguments should return the same result; the output should not change just because the function has been called more often. So get_data_1 clearly has a problem and get_data_2 does not.
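The repeated-call behaviour described above can be reproduced with a short sketch. The function bodies are again a reconstruction, not the original code, and the starting page number of 0 is an assumption.

```python
def get_data_1(store_name, page_num=0, page_data=[]):
    if page_num != 2:
        page_data.append(store_name)
        return get_data_1(store_name, page_num + 1, page_data)
    return page_data


def get_data_2(store_name, page_data, page_num=0):
    if page_num != 2:
        page_data.append(store_name)
        return get_data_2(store_name, page_data, page_num + 1)
    return page_data


results_1 = []
for _ in range(2):
    # list(...) takes a snapshot, because the returned list keeps changing
    results_1.append(list(get_data_1("KFC")))

results_2 = []
for _ in range(2):
    page_data = []  # a new list for every call
    results_2.append(get_data_2("KFC", page_data))

print(results_1)  # [['KFC', 'KFC'], ['KFC', 'KFC', 'KFC', 'KFC']]
print(results_2)  # [['KFC', 'KFC'], ['KFC', 'KFC']]
```

The snapshot copy matters: without it, both entries of results_1 would be the same list object and would both show four items.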

So what causes this?

The only difference between get_data_1 and get_data_2 is that get_data_1 uses an empty list as a default parameter. After further research I concluded that the problem is caused by a characteristic of the Python language.

Third, the issues hidden behind the problem

We have to start with four related questions. Together they explain why the project misbehaved even though manually re-running a separate module was fine, while repeated calls in a loop went wrong.

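1. What does the Python interpreter do when a variable is assigned?

When we assign a value to a variable, the Python interpreter creates an object in memory to hold the value and makes the variable refer to it. When you assign to a second variable a value equal to the first, the interpreter may not allocate new memory: for values such as small integers and short strings, CPython simply points the second variable at the object that already exists, and every use of the second variable reads from that same memory. (This reuse is an implementation detail that applies to certain immutable values; two separately created lists, for example, are always distinct objects.)

A demonstration:
Open IPython and, in the interactive session, assign values to a and b; id(a) and id(b) return their memory addresses.
[Image: IPython session showing id(a) and id(b)]
We can see that when a and b are assigned the same content, their memory addresses are identical, which confirms the behaviour described above.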
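A small sketch of the id() experiment follows. The value 100 is an assumption, since the original session is only available as an image. The identical ids for the two integers rely on CPython's caching of small integers, which is an implementation detail; the list comparison shows that equal values do not always share one object.

```python
a = 100
b = 100
print(id(a) == id(b))  # True on CPython: small integers are cached

x = []
y = []
print(x == y)          # True: the two lists have equal values
print(id(x) == id(y))  # False: they are two separate objects

z = x                  # assignment copies the reference, not the list
print(id(z) == id(x))  # True: z and x are the same object
```

The last case is the one that matters for this article: several names, or several calls, can end up referring to one and the same list.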

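2. When append() changes a list's value, does its memory address change?

The answer is no.
This can be demonstrated too:
[Image: interactive session showing id() of a list before and after append()]
After we define an empty list, even if append() adds content and changes its value, the id of the variable is the same before and after.

What does this show? When a mutable object is changed, the Python interpreter modifies the value stored in memory in place rather than allocating new memory for the new value.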
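The point about append() can be checked with a few lines. The variable names are illustrative; the contrast with the + operator is added here to show what an operation that does create a new object looks like.

```python
items = []
id_before = id(items)

items.append("KFC")              # mutates the existing list in place
id_after_append = id(items)
print(id_before == id_after_append)  # True: still the same object

combined = items + ["KFC"]       # + builds and returns a new list
print(combined is items)         # False: a different object
print(items)                     # ['KFC']: the original is unchanged by +
```

In-place methods such as append() are exactly the kind of operation that can silently modify a shared default value.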

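3. When does the Python interpreter release memory?

When we run the code in a module, the Python interpreter executes it from top to bottom. An object is released once nothing refers to it any more, but module-level variables and the functions defined in the module (together with their default values) stay referenced until the run ends, so they remain in memory and can be used at any time. Only when the whole program finishes is that memory released.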

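4. When a Django project is running, do Python variable assignments stay in memory the whole time?

There are two cases:
In development (DEBUG=True, running under the development server), Django loads an autoreloader that restarts the server when source files change. Until a restart happens, the project's variable assignments stay in memory; only a restart releases the memory and allocates it afresh.
In production (DEBUG=False), the service is not restarted automatically, so variable assignments stay in memory for as long as the process runs.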
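The difference between re-running a script by hand and calling a function repeatedly inside one long-lived process can be simulated without Django. In this sketch, collect is a hypothetical stand-in for the article's function, and each subprocess plays the role of a manual run in a brand-new interpreter.

```python
import subprocess
import sys

# Code that is run in a brand-new interpreter each time.
SNIPPET = '''
def collect(name, bucket=[]):
    bucket.append(name)
    return bucket

print(len(collect("KFC")))
'''


def run_in_fresh_process():
    """Run SNIPPET in a new Python process and return what it printed."""
    completed = subprocess.run(
        [sys.executable, "-c", SNIPPET],
        capture_output=True, text=True, check=True,
    )
    return completed.stdout.strip()


# Two separate runs: each process starts with a new default list.
fresh_runs = [run_in_fresh_process() for _ in range(2)]


def collect(name, bucket=[]):
    bucket.append(name)
    return bucket


# Two calls inside this one process: the default list is shared.
same_process_runs = [len(collect("KFC")) for _ in range(2)]

print(fresh_runs)         # ['1', '1']
print(same_process_runs)  # [1, 2]
```

A web server behaves like the second case: successive requests are handled by the same process, so anything stored on a function object survives from one request to the next.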

With these four questions clear, we can go back and analyse why using an empty list as a default parameter causes the problem.

Fourth, what the Python interpreter does during the whole sequence of calls

For the details of the analysis, see the text on the pictures.
[Images: annotated step-by-step analysis of the calls]
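To confirm that my analysis is correct, we can print the memory address of page_data after get_data_1 and after get_data_2 and look at the output.
[Image: code printing id(page_data) for both functions]
Output:
[Image: the printed memory addresses]
The results are as expected:
In the first for loop, get_data_1 is executed twice and the printed address of page_data never changes. The interpreter is using the value stored at the same memory address each time; even though that value was changed during the first call, the same object is used again on the next call.

In the second for loop, the printed address of page_data differs between the two calls. Before each call to get_data_2 the variable page_data is re-initialised with a new list, so whatever happened to page_data in the previous call, each call starts from an empty list.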
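The address check can be sketched like this. The function bodies are a reconstruction (starting page 0 is an assumption), and the list named kept is an addition: it keeps every list alive so that the interpreter cannot reuse a freed address and make two different lists appear to share an id.

```python
def get_data_1(store_name, page_num=0, page_data=[]):
    if page_num != 2:
        page_data.append(store_name)
        return get_data_1(store_name, page_num + 1, page_data)
    return page_data


def get_data_2(store_name, page_data, page_num=0):
    if page_num != 2:
        page_data.append(store_name)
        return get_data_2(store_name, page_data, page_num + 1)
    return page_data


ids_1 = []
for _ in range(2):
    ids_1.append(id(get_data_1("KFC")))  # id of the returned default list

kept = []   # hold references so addresses are not recycled
ids_2 = []
for _ in range(2):
    page_data = []                        # a new list object each time
    kept.append(get_data_2("KFC", page_data))
    ids_2.append(id(page_data))

print(ids_1[0] == ids_1[1])  # True: the same object on both calls
print(ids_2[0] == ids_2[1])  # False: a different object on each call
```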

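So the conclusion is: a default value is evaluated once, when the def statement runs, and is stored on the function object rather than being re-created on each call. Because of the way Python handles object references, when a mutable container is used as a default parameter, your program may change its value through an operation that happens not to change its memory address. The interpreter then modifies the value in place and does not release it, so the next time you rely on the default you assume it is still the value you set (an empty list) when in fact it has already been modified.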
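One way to see where the default lives is to inspect the function's __defaults__ attribute, a standard attribute of Python function objects. The function body below is a reconstruction of get_data_1 with an assumed starting page of 0.

```python
def get_data_1(store_name, page_num=0, page_data=[]):
    if page_num != 2:
        page_data.append(store_name)
        return get_data_1(store_name, page_num + 1, page_data)
    return page_data


# __defaults__ is a tuple of the default values: (0, []) at this point.
defaults_before = list(get_data_1.__defaults__[1])  # copy of the empty list

returned = get_data_1("KFC")
defaults_after = get_data_1.__defaults__[1]

print(defaults_before)             # []
print(defaults_after)              # ['KFC', 'KFC']
print(returned is defaults_after)  # True: the result is the stored default
```

The list returned to the caller is the very object stored on the function, which is why later calls see the earlier data.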

Fifth, how to avoid the problem

1. Do not use a mutable container (a list, a dictionary and so on) as a default parameter; pass it as a positional argument instead.

You can follow the approach of get_data_2: assign a new list (or other mutable container) before each call and pass it in as a positional argument.

2. If you must use one, check inside the function whether it still holds the initial value you expect and, if not, reassign it.

But this only works in some cases. When a function calls itself repeatedly and you also need to call it from an external loop, you cannot tell whether a given call came from the external loop or from the function's own internal cycle, so you cannot decide what the value should be each time.

Take get_data_1 above: it is not suited to this change, because it uses page_num to decide how many times to call itself and updates page_data as it goes. When I call it from an outer loop, I cannot tell whether a call belongs to my outer loop or to the function's internal cycle, so I cannot determine what the parameter's value should be.
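As a possible way around this limitation, a widely used Python idiom is to make the default None and create the list inside the function. The sketch below applies it to a reconstructed version of the function; get_data_3 is a hypothetical name and the starting page of 0 is an assumption. An outer call that omits page_data arrives with None, while the function's own recursive calls always pass the list along, so the two cases can be told apart.

```python
def get_data_3(store_name, page_num=0, page_data=None):
    """Variant using None as a sentinel instead of a mutable default."""
    if page_data is None:
        page_data = []  # a new list for every outer call
    if page_num != 2:
        page_data.append(store_name)
        # The recursive call passes the list explicitly, so it is not None.
        return get_data_3(store_name, page_num + 1, page_data)
    return page_data


results = [get_data_3("KFC") for _ in range(2)]
print(results)  # [['KFC', 'KFC'], ['KFC', 'KFC']]
```

None is safe as a default because it is immutable: no call can modify it, so every call that relies on the default starts from the same state.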

A final thought: this feels like a headache I should never have had.

Life is short, so I learned Python; having learned Python, life is even shorter.

Origin blog.csdn.net/Jacky_kplin/article/details/103625734