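Today let's go through the key points of the Python set type. Please note that all the information below is drawn from the company training course slides by Nagiza F. Samatova, NC State Univ. (all rights reserved). The main points covered are: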
• Mutable unordered collection of objects
• Items: immutable (hashable), heterogeneous types, unique
• Operations and methods: create, update, access, query, remove
• Traversal: by item with the in operator
A set is a mutable collection of immutable (hashable) objects.
set_obj = {1, 2, 3}
tuple_set_obj = {(1, 2, 3), (4, 5, 6)}
print('set_obj: {}'.format(set_obj))
print('tuple_set_obj: {}'.format(tuple_set_obj))
list_set_obj = {[1, 2, 3]}
print('list_set_obj: {}'.format(list_set_obj))
A set cannot contain list objects, because a list is mutable and therefore not hashable.
Items must be hashable (for example int, float, str, or a tuple whose items are all hashable).
# output:
set_obj: {1, 2, 3}
tuple_set_obj: {(1, 2, 3), (4, 5, 6)}
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-563-4e03cceaee62> in <module>
3 print('set_obj: {}'.format(set_obj))
4 print('tuple_set_obj: {}'.format(tuple_set_obj))
----> 5 list_set_obj = {[1,2,3]}
6 print('list_set_obj: {}'.format(list_set_obj))
TypeError: unhashable type: 'list'
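The hashability rule above can be checked directly with the built-in hash(). The following is only a small sketch: the helper name is_hashable is hypothetical and introduced purely for illustration.

```python
def is_hashable(obj):
    """Return True if obj could be used as a set item (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True

print(is_hashable(42))                  # True
print(is_hashable('python'))            # True
print(is_hashable((1, 2, 3)))           # True: tuple of hashable items
print(is_hashable([1, 2, 3]))           # False: list is mutable
print(is_hashable((1, [2, 3])))         # False: tuple that contains a list
print(is_hashable(frozenset({1, 2})))   # True: frozenset is immutable

# A frozenset can therefore be nested inside a set, while a plain set cannot.
nested = {frozenset({1, 2}), frozenset({3})}
```

Note that a tuple is only hashable when everything inside it is hashable too.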
Set: No Duplicate Objects
For background, see the article 《set去重原理》 (how set deduplication works).
Each set item is being hashed:
• i.e., mapped to a hash value
• for fast membership checks
Duplicates hash to the same value and compare equal
• i.e., only one copy is kept
set_obj = {1, 1, 1}
tuple_set_obj = {(1, 2, 3), (1, 2, 3)}
print('set_obj: {}'.format(set_obj))
print('tuple_set_obj: {}'.format(tuple_set_obj))
#output:
set_obj: {1}
tuple_set_obj: {(1, 2, 3)}
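Deduplication is driven by equality plus hash value, not by type. A short sketch of what that means in practice (the variable names are made up for this example):

```python
# 1, 1.0 and True compare equal and hash to the same value,
# so a set keeps only one of them.
mixed = {1, 1.0, True}
print(len(mixed))  # 1

# Common use: drop duplicates from a list (the original order is not kept).
names = ['a', 'b', 'a', 'c', 'b']
unique_names = set(names)
print(sorted(unique_names))  # ['a', 'b', 'c']

# If the original order matters, dict.fromkeys keeps the first occurrences.
ordered_unique = list(dict.fromkeys(names))
print(ordered_unique)  # ['a', 'b', 'c']
```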
Set is an unordered collection
So a set does not support random access by index.
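A quick sketch of what happens when indexing is attempted, and two common workarounds. Which item next(iter(s)) returns is arbitrary, so the code does not rely on it.

```python
s = {'b', 'a', 'c'}

# Indexing a set raises TypeError because sets have no positions.
try:
    s[0]
except TypeError as exc:
    print('indexing failed:', exc)

# Workaround 1: sort into a list when a stable order is needed.
ordered = sorted(s)
print(ordered[0])  # a

# Workaround 2: grab an arbitrary item without removing it.
any_item = next(iter(s))
```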
CREATE A SET OBJECT
empty
s = set()
Note: s = {} creates an empty dict, not an empty set.
via coercion of another object (type conversion)
s = { any_type_immutable_object }
s = set(any_type_mutable_or_immutable_object)  # must be iterable, and its items must be hashable
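One point worth checking when creating an empty set: the literal {} produces a dict, so set() is the way to get an empty set. A minimal sketch (variable names are illustrative only):

```python
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__)  # dict
print(type(empty_set).__name__)   # set

# set() accepts any iterable whose items are hashable.
from_list = set([1, 2, 2, 3])
from_range = set(range(3))

# A set comprehension is another way to build a set.
squares = {n * n for n in range(4)}
```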
str to set
str_obj = 'python'
set_via_symbol = {str_obj}
set_via_constructor = set(str_obj)
print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))
# output:
set_via_symbol: {'python'}
set_via_constructor: {'h', 'o', 'n', 't', 'y', 'p'}
tuple to set
tuple_obj = ('one', 'two')
set_via_symbol = {tuple_obj}
set_via_constructor = set(tuple_obj)
print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))
# output:
set_via_symbol: {('one', 'two')}
set_via_constructor: {'two', 'one'}
list to set
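Note that a list cannot be turned into a set with {list_obj}, because a set cannot contain mutable objects.
list_obj = ['one', 'two']
# set_via_symbol = {list_obj}  # TypeError: unhashable type: 'list'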
set_via_constructor = set(list_obj)
# print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))
# output:
set_via_constructor: {'two', 'one'}
dict to set
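Note that a dict cannot be turned into a set with {dict_obj}, because a set cannot contain mutable objects. set(dict_obj) builds a set from the dict's keys.
dict_obj = {'script': 'python', 'version': '3.8'}
# set_via_symbol = {dict_obj}  # TypeError: unhashable type: 'dict'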
set_via_constructor = set(dict_obj)
# print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))
# output:
set_via_constructor: {'version', 'script'}
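Since set(dict_obj) only picks up the keys, here is a small sketch of how the values or the key/value pairs could be collected instead. It assumes the values are hashable, which holds for the strings used here.

```python
dict_obj = {'script': 'python', 'version': '3.8'}

keys_set = set(dict_obj)             # same result as set(dict_obj.keys())
values_set = set(dict_obj.values())  # works only if every value is hashable
items_set = set(dict_obj.items())    # set of (key, value) tuples

print(sorted(keys_set))    # ['script', 'version']
print(sorted(values_set))  # ['3.8', 'python']
```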
Set Operations
UNION, INTERSECTION, DIFFERENCE, SYMMETRIC DIFFERENCE
set_obj = {1, 2, 3, 4, 5, 6, 7, 8}
set_obj2 = {1, 3, 5, 7, 9}
set_difference = set_obj - set_obj2
set_union = set_obj | set_obj2
set_intersection = set_obj & set_obj2
set_symmetric_difference = set_obj ^ set_obj2
print('set_difference: {}'.format(set_difference))
print('set_union: {}'.format(set_union))
print('set_intersection: {}'.format(set_intersection))
print('set_symmetric_difference: {}'.format(set_symmetric_difference))
# output:
set_difference: {8, 2, 4, 6}
set_union: {1, 2, 3, 4, 5, 6, 7, 8, 9}
set_intersection: {1, 3, 5, 7}
set_symmetric_difference: {2, 4, 6, 8, 9}
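The four operators above also exist as methods, and as in-place operators that change the left-hand set. A short sketch for comparison; the comments show the resulting set values, not the printed order.

```python
a = {1, 2, 3, 4}
b = [3, 4, 5]  # a list, not a set

# Operators need sets on both sides; the methods accept any iterable.
union_result = a.union(b)                       # {1, 2, 3, 4, 5}
intersection_result = a.intersection(b)         # {3, 4}
difference_result = a.difference(b)             # {1, 2}
symmetric_result = a.symmetric_difference(b)    # {1, 2, 5}

# In-place variants modify the existing set instead of building a new one.
c = {1, 2, 3}
c |= {3, 4}   # same effect as c.update({3, 4})
c -= {1}      # same effect as c.difference_update({1})
print(sorted(c))  # [2, 3, 4]
```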
Sets: Other Methods
s = {1, 3, 7, 9}
s2 = {3, 7, 8, 12}
Operation | Description | Result (each applied to the original s) |
---|---|---|
s.remove(7) | Remove element | {1, 3, 9} |
s.copy() | Return copy of the set | {1, 3, 7, 9} |
s.add(10) | Add given element to the set | {1, 3, 7, 9, 10} |
s.isdisjoint(s2) | Return True if sets have NO common elements | False |
s.intersection(s2) | Return new set with common elements to another one | {3, 7} |
s = {1, 2, 3, 4, 5, 7}
print('original s:\n id: {}\t set content:{}'.format(id(s), s))
print('copy s:\n id: {}\t set content:{}'.format(id(s.copy()), s.copy()))
s.remove(2)
print('after remove 2:\nid: {}\t set content:{}'.format(id(s), s))
s.add(8)
print('after add 8:\nid: {}\t set content:{}'.format(id(s), s))
Sets are Mutable: Can change their value
After remove and add, the id of the set stays the same.
copy returns a different set object.
# output:
original s:
 id: 2239126441536	 set content:{1, 2, 3, 4, 5, 7}
copy s:
 id: 2239126982720	 set content:{1, 2, 3, 4, 5, 7}
after remove 2:
id: 2239126441536	 set content:{1, 3, 4, 5, 7}
after add 8:
id: 2239126441536	 set content:{1, 3, 4, 5, 7, 8}
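Besides remove, a set offers a few other in-place removal methods that behave differently when the item is missing. A minimal sketch of the difference:

```python
s = {1, 3, 7, 9}
original_id = id(s)

s.discard(100)  # missing item: silently does nothing

try:
    s.remove(100)  # missing item: raises KeyError
except KeyError:
    print('remove() raises KeyError for a missing item')

popped = s.pop()  # removes and returns an arbitrary item
remaining = len(s)

s.clear()  # empties the set in place; it is still the same object
print(id(s) == original_id)  # True
```

discard is usually the safer choice when it is not known whether the item is present.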
QUERY AND ITERATE
in operator
● to check if an item is in a set
● Does NOT require traversal of the entire set (fast operation)
s = {'python'}
print('python is in set: {}'.format('python' in s))
print('java is in set: {}'.format('java' in s))
# output:
python is in set: True
java is in set: False
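The membership check is fast because the item is looked up by its hash (constant time on average), whereas a list has to be scanned item by item. The rough timing sketch below uses the standard timeit module; the sizes are arbitrary and the actual numbers will vary from machine to machine.

```python
import timeit

n = 50_000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# Time the same membership test against both containers.
list_time = timeit.timeit(lambda: target in as_list, number=50)
set_time = timeit.timeit(lambda: target in as_set, number=50)

print('list: {:.6f}s'.format(list_time))
print('set:  {:.6f}s'.format(set_time))
```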
Unsupported Operations
Concatenation with (+)
s1 = {1, 2, 3}
s2 = {4, 5}
s = s1 + s2
# output:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-590-a84a0ef792a5> in <module>
1 s1 = {1,2,3}
2 s2 = {4,5}
----> 3 s = s1 + s2
TypeError: unsupported operand type(s) for +: 'set' and 'set'
Replication with (*)
s1 = {1, 2, 3}
s = s1 * 2
# output:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
<ipython-input-591-75b9dae5da0b> in <module>
1 s1 = {1,2,3}
----> 2 s = s1 * 2
TypeError: unsupported operand type(s) for *: 'set' and 'int'
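Where + would be used on lists, sets are combined with union instead. A brief sketch, including why replication has no useful meaning for a set (s3 is an extra set added just for this example):

```python
s1 = {1, 2, 3}
s2 = {4, 5}
s3 = {5, 6}

combined = s1 | s2                    # instead of s1 + s2
merged_all = set().union(s1, s2, s3)  # combine several sets in one call

# Replicating a set would add nothing new: duplicates collapse anyway.
replicated = set(list(s1) * 2)
print(replicated == s1)  # True
```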
Set Traversal
Approaches: by item
s1 = {1, 2, 3}
for item in s1:
    print(item)
# output:
1
2
3
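Because a set is unordered, the order seen during traversal should not be relied on. When a repeatable order matters, one option is to sort first, as in this small sketch (the fruit names are just sample data):

```python
s1 = {'pear', 'apple', 'fig'}

# Sorting gives a repeatable order regardless of how the set stores its items.
in_order = sorted(s1)
for position, item in enumerate(in_order, start=1):
    print(position, item)

# A set comprehension traverses one set to build another.
lengths = {len(item) for item in s1}
```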
PERFORMANCE
Optimized for
● Store a unique collection of objects
● Add new items
● Remove / discard any item: fast for sets compared to lists (tuples are immutable, so items cannot be removed from them at all)
● Remove duplicate items: fast for sets compared to lists
● Check membership using in operator: faster compared to lists / tuples
● Data wrangling operations fast for sets compared to lists / tuples: Unions, Intersections, Symmetric difference
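To make the last point concrete, here is a rough sketch comparing a list-based intersection with the set operator. The function names and data sizes are invented for this example, and the timings are only indicative.

```python
import timeit

left = list(range(0, 1000))
right = list(range(500, 1500))

def common_with_lists():
    # Each "x in right" scans the list, so the cost grows with both lengths.
    return [x for x in left if x in right]

def common_with_sets():
    # Hash-based lookup avoids the repeated scans.
    return set(left) & set(right)

# Both approaches find the same common items.
same_result = set(common_with_lists()) == common_with_sets()
print(same_result)  # True

list_time = timeit.timeit(common_with_lists, number=3)
set_time = timeit.timeit(common_with_sets, number=3)
print('lists: {:.6f}s  sets: {:.6f}s'.format(list_time, set_time))
```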