Lookup Efficiency of Dictionaries and Lists in Python

Sequences


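       Python has several built-in container types. Sequences share a common set of operations, chiefly concatenation (+), repetition (*), indexing, slicing, and membership testing. This article looks at two built-in types, the list and the dictionary (strictly speaking, a dictionary is a mapping rather than a sequence), and compares only how efficiently an element can be looked up in each.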
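
As a quick illustration of the sequence operations listed above, here is a minimal sketch using lists. The variable names are made up for the example.

```python
# Common sequence operations, shown on lists (names are illustrative only).
a = [1, 2, 3]
b = [4, 5]

joined = a + b          # concatenation ("add")
repeated = b * 2        # repetition ("multiply")
second = a[1]           # indexing by offset
tail = joined[2:]       # slicing
has_three = 3 in a      # membership test

print(joined, repeated, second, tail, has_three)
```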

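Lists


       The list is Python's most flexible ordered collection type. Its properties:

1. An ordered collection of arbitrary objects

2. Accessed by offset

3. Variable length, heterogeneous, and arbitrarily nestable

4. Belongs to the mutable sequence category

5. An array of object references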
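
The five properties above can be seen in a few lines. This is only a sketch; the values and names are hypothetical and chosen to make each property visible.

```python
# Hypothetical values chosen only to illustrate the list properties.
inner = [1, 2]
items = ["text", 3.14, inner]   # ordered, heterogeneous, nested

first = items[0]                # read by offset
items.append(None)              # variable length, mutated in place
length = len(items)

inner.append(3)                 # the list stores a reference to `inner`...
shared = items[2]               # ...so the change is visible through `items`
same_object = shared is inner

print(first, length, shared, same_object)
```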

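Dictionaries


       Apart from the list, the dictionary is the most flexible built-in data structure in Python. Its properties:

1. Accessed by key rather than by offset

2. A collection of arbitrary objects addressed by key, not by position (dictionaries preserve insertion order since Python 3.7; earlier versions guaranteed no order)

3. Variable length, heterogeneous, and arbitrarily nestable

4. Belongs to the mutable mapping category

5. A table of object references (a hash table)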
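
A short sketch of these properties follows. The keys and values are hypothetical; the only requirement Python places on a key is that it be hashable.

```python
# Hypothetical record used only to illustrate the dictionary properties.
inner = {"x": 1}
record = {"name": "list", 42: [1, 2], ("a", "b"): inner}  # mixed key and value types, nesting

by_key = record["name"]          # read by key, not by offset
record["new"] = True             # grows in place
del record[42]                   # shrinks in place
missing = record.get("absent")   # get() returns None for an unknown key
size = len(record)
same_object = record[("a", "b")] is inner   # values are stored as references

print(by_key, missing, size, same_object)
```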

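      In Python, the dictionary is implemented as a hash table. In other words, a dictionary is backed by an array, and the position of an entry in that array is computed by applying a hash function to its key. A good hash function spreads the keys evenly across the array, which is what keeps lookups fast.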
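
To make the idea concrete, here is a deliberately simplified hash table using separate chaining. It is not how CPython lays out its dict (CPython uses open addressing and resizes the table as it fills); the class name `ToyHashTable` and the bucket count of 8 are assumptions made for illustration.

```python
class ToyHashTable:
    """Fixed-size hash table with separate chaining (illustrative only)."""

    def __init__(self, n_buckets=8):
        self.buckets = [[] for _ in range(n_buckets)]

    def _index(self, key):
        # The hash function turns the key into an array index.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only one bucket is scanned, never the whole table.
        for k, v in self.buckets[self._index(key)]:
            if k == key:
                return v
        return default


table = ToyHashTable()
table.put("alpha", 1)
table.put("beta", 2)
table.put("alpha", 3)   # replaces the earlier value for "alpha"
print(table.get("alpha"), table.get("beta"), table.get("gamma"))
```

The point of the sketch is that `get` never looks at more than one bucket, so its cost depends on how evenly the keys are spread rather than on how many entries the table holds.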

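Lookup Efficiency Analysis


       The discussion above shows the basic difference: a list is read by offset, whereas a dictionary is read by the hash of the key. As a result, finding an element in a list when only its value is known means scanning the list from the front, while a dictionary can go almost directly to the entry.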
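
One distinction is worth keeping in mind here: reading a list at a known offset is itself a constant-time operation; what is slow is finding an element by value. The sketch below, with made-up data, puts the three access patterns side by side.

```python
# Made-up data; the complexity notes are the point, not the values.
names = ["ann", "bob", "cid"]
ages = {"ann": 31, "bob": 27, "cid": 45}

by_offset = names[2]            # O(1): jumps straight to the offset
position = names.index("cid")   # O(n): compares elements from the front
by_key = ages["cid"]            # O(1) on average: hash lookup

print(by_offset, position, by_key)
```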

The following experiment measures the lookup efficiency:

#!/usr/bin/env python3
# -*- coding:utf-8 -*-
###############################################
# File Name   : test.py
# Author      : Younger Liu
# Mail        : [email protected]
# Created Time: Tue 19 Jun 2018 08:58:42 AM CST
# Description : 
###############################################

import time
import random
import string

MAX_COUNT = 1000000
ALPHABET = string.ascii_letters + string.digits

def random_key():
    # random.sample draws without replacement, so the 8 characters are distinct
    return ''.join(random.sample(ALPHABET, 8))

def test_dict(data, key=None):
    # Hash lookup: the cost does not depend on when the key was inserted
    b_time = time.perf_counter()
    data.get(key)
    a_time = time.perf_counter()
    print("Elapsed time %d us to query [%s]" % ((a_time - b_time) * 1000 * 1000, key))

def test_array(data, key=None):
    # Linear search: compare elements one by one until a match is found
    b_time = time.perf_counter()
    for ele in data:
        if ele == key:
            break
    a_time = time.perf_counter()
    print("Elapsed time %d us to query [%s]" % ((a_time - b_time) * 1000 * 1000, key))

def gen_arr():
    # Build the list and report its first and last elements
    arr = [random_key() for _ in range(MAX_COUNT)]
    return arr, arr[0], arr[-1]

def gen_dict():
    # Build the dict and remember the first and last keys generated
    table = {}
    first = last = None
    for i in range(MAX_COUNT):
        ele = random_key()
        table[ele] = ele
        if i == 0:
            first = ele
        last = ele
    return table, first, last

if __name__ == '__main__':
    # '111111' has repeated characters, so random_key() can never produce it
    arr, first_ele, last_ele = gen_arr()
    print("-----Query first ele in array-------")
    test_array(arr, first_ele)
    print("-----Query last ele in array-------")
    test_array(arr, last_ele)
    print("-----Query ele not in array-------")
    test_array(arr, '111111')
    table, first_key, last_key = gen_dict()
    print("-----Query first generated ele in dict-------")
    test_dict(table, first_key)
    print("-----Query last generated ele in dict-------")
    test_dict(table, last_key)
    print("-----Query ele not in dict-------")
    test_dict(table, '111111')

The output is as follows:

-----Query first ele in array-------
Elapsed time 10 us to query [dRP3aEZQ]
-----Query last ele in array-------
Elapsed time 37236 us to query [6agsBq37]
-----Query ele not in array-------
Elapsed time 38666 us to query [111111]
-----Query first generated ele in dict-------
Elapsed time 12 us to query [DkYN2xJL]
-----Query last generated ele in dict-------
Elapsed time 3 us to query [BrNCeWP2]
-----Query ele not in dict-------
Elapsed time 2 us to query [111111]

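Repeated runs give similar results.

The results show that a dictionary lookup takes at most about a dozen microseconds no matter which key is requested, whereas searching the list takes time proportional to how far the element is from the front: 10 us for the first element, but roughly 37 to 39 ms for the last element or for one that is absent. For finding an element by value, the dictionary is therefore far more efficient than the list.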
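
For a steadier measurement than a single timed call, the standard library's `timeit` module can repeat each lookup many times. The sketch below is one possible way to rerun the comparison, not the original experiment: it assumes a smaller data set (`N = 100_000`) and integer elements instead of random strings, and it times only the worst case of an absent element.

```python
import timeit

N = 100_000                          # assumed size, smaller than the experiment above
data_list = list(range(N))
data_dict = dict.fromkeys(data_list)
missing = -1                         # not present in either container

# Each statement is executed 20 times; timeit returns the total in seconds.
list_time = timeit.timeit(lambda: missing in data_list, number=20)
dict_time = timeit.timeit(lambda: missing in data_dict, number=20)

print("list: %.6f s, dict: %.6f s" % (list_time, dict_time))
```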

Reposted from blog.csdn.net/iamonlyme/article/details/81226050