I have a list of test failures, as shown below:
all_failures = [
'test1/path/to/test1/log/failure_reason1',
'test1/path/to/test1/log/failure_reason2',
'test2/path/to/test2/log/failure_reason1',
'test2/path/to/test2/log/failure_reason2',
'test3/path/to/test3/log/failure_reason1',
'test4/path/to/test4/log/failure_reason1',
]
I am trying to construct a JSON-like object by parsing each failure in the list. So far, I have written the following code:
for failure in all_failures:
    data = failure.split('/',1)
    test = data[0]
    failure_details_dict[test] = []
    data = '/' + data[1]
    data = data.rsplit('/', 1)
    test_details_dict['path'] = data[0] + '/'
    test_details_dict['reason'] = data[1]
    failure_details_dict[test].append(test_details_dict)
    test_details_dict = {}
for key,value in failure_details_dict.items():
    print(key)
    print(value)
    print()
The output I am getting is:
test4
[{'reason': 'failure_reason1', 'path': '/path/to/test4/log/'}]
test3
[{'reason': 'failure_reason1', 'path': '/path/to/test3/log/'}]
test1
[{'reason': 'failure_reason2', 'path': '/path/to/test1/log/'}]
test2
[{'reason': 'failure_reason2', 'path': '/path/to/test2/log/'}]
whereas the expected output is:
{
    "test1": [
        {
            "path": "/path/to/test1/log/",
            "reason": "failure_reason1"
        },
        {
            "path": "/path/to/test1/log/",
            "reason": "failure_reason2"
        }
    ],
    "test2": [
        {
            "path": "/path/to/test2/log/",
            "reason": "failure_reason1"
        },
        {
            "path": "/path/to/test2/log/",
            "reason": "failure_reason2"
        }
    ],
    "test3": [
        {
            "path": "/path/to/test3/log/",
            "reason": "failure_reason1"
        }
    ],
    "test4": [
        {
            "path": "/path/to/test4/log/",
            "reason": "failure_reason1"
        }
    ]
}
As we can see, I have not been able to add the second path and failure reason to the same key. For example, test1 and test2 each have two failure reasons, but only the last one is kept.
Can someone please help me understand what I am missing? Thank you!
Reason
You are overwriting failure_details_dict[test] with a new empty list on every iteration of the loop, so anything appended earlier for the same test is discarded.
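To see the effect in isolation, here is a minimal sketch of the same mistake. The `pairs` list and the `overwritten` name are made up for illustration and are not part of the original code.

```python
# Two failures for the same (hypothetical) test name.
pairs = [('test1', 'failure_reason1'), ('test1', 'failure_reason2')]

overwritten = {}
for test, reason in pairs:
    overwritten[test] = []            # resets the list every time test1 is seen
    overwritten[test].append(reason)  # so only the latest reason survives

print(overwritten)  # {'test1': ['failure_reason2']}
```

The first reason is lost because the second iteration replaces the list that held it.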
Solution
You should assign the empty list only once per key.
There are several ways to do that:
- Non-pythonic way (NOT RECOMMENDED): check for the key before assigning.
    if test not in failure_details_dict:
        failure_details_dict[test] = []
- Replace the assignment with a dict.setdefault call. This does not affect other interactions with failure_details_dict.
    failure_details_dict.setdefault(test, [])  # instead of failure_details_dict[test] = []
- Use collections.defaultdict instead of dict. This WILL affect other interactions with failure_details_dict: looking up any missing key with [] creates that key with an empty list.
    from collections import defaultdict
    failure_details_dict = defaultdict(list)  # instead of {}
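As a rough sketch of the defaultdict option, the loop could look like the following. The shortened input list and the names `failure_details`, `missing` and `plain` are assumptions chosen for illustration. The last lines show the side effect mentioned above: reading a missing key inserts it.

```python
from collections import defaultdict

all_failures = [
    'test1/path/to/test1/log/failure_reason1',
    'test1/path/to/test1/log/failure_reason2',
    'test3/path/to/test3/log/failure_reason1',
]

failure_details = defaultdict(list)
for failure in all_failures:
    test, rest = failure.split('/', 1)   # test name, then everything else
    path, reason = rest.rsplit('/', 1)   # log path, then the failure reason
    # No initialisation needed: a missing key starts as an empty list.
    failure_details[test].append({'path': f'/{path}/', 'reason': reason})

# Side effect: merely reading a missing key creates it.
missing = failure_details['test9']
print('test9' in failure_details)  # True

# Converting to a plain dict restores the usual KeyError / .get() behaviour.
plain = dict(failure_details)
print(plain.get('test10'))  # None, and 'test10' is not added
```

If that side effect is unwanted elsewhere in the program, dict.setdefault is probably the safer choice.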
Example
Here is your code, refactored to use dict.setdefault:
all_failures = [
'test1/path/to/test1/log/failure_reason1',
'test1/path/to/test1/log/failure_reason2',
'test2/path/to/test2/log/failure_reason1',
'test2/path/to/test2/log/failure_reason2',
'test3/path/to/test3/log/failure_reason1',
'test4/path/to/test4/log/failure_reason1',
]
failure_details_dict = {}
for failure in all_failures:
    key, *paths, reason = failure.split('/')
    failure_details_dict.setdefault(key, []).append({
        'path': f"/{'/'.join(paths)}/",
        'reason': reason,
    })
for key, value in failure_details_dict.items():
    print(key)
    print(value)
    print()
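Since the goal is a JSON-like result, the dictionary can also be serialised with the standard json module instead of printing each key. This is a sketch on a shortened input list; `json_text` is a name chosen here for illustration.

```python
import json

all_failures = [
    'test1/path/to/test1/log/failure_reason1',
    'test1/path/to/test1/log/failure_reason2',
    'test3/path/to/test3/log/failure_reason1',
]

failure_details_dict = {}
for failure in all_failures:
    key, *paths, reason = failure.split('/')
    failure_details_dict.setdefault(key, []).append({
        'path': f"/{'/'.join(paths)}/",
        'reason': reason,
    })

# indent=4 pretty-prints the nested structure as JSON text.
json_text = json.dumps(failure_details_dict, indent=4)
print(json_text)
```

json.dumps returns a string; json.dump (without the "s") writes to an open file object instead.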
Conclusion
- If you want a simple change, use the dict.setdefault method.
- If you access failure_details_dict in multiple places and want a default value on every access, use the collections.defaultdict class.
Extra
How can we modify the code so that the 'path' key is stored only once per test and only the 'reason' values are repeated? In general, what would be the best way to store the data in JSON format?
You can restructure your JSON like this:
{
    "test1": {
        "path": "/path/to/test1/log/",
        "reason": [
            "failure_reason1",
            "failure_reason2"
        ]
    },
    "test2": {
        "path": "/path/to/test2/log/",
        "reason": [
            "failure_reason1",
            "failure_reason2"
        ]
    },
    "test3": {
        "path": "/path/to/test3/log/",
        "reason": [
            "failure_reason1"
        ]
    },
    "test4": {
        "path": "/path/to/test4/log/",
        "reason": [
            "failure_reason1"
        ]
    }
}
Using this code:
all_failures = [
'test1/path/to/test1/log/failure_reason1',
'test1/path/to/test1/log/failure_reason2',
'test2/path/to/test2/log/failure_reason1',
'test2/path/to/test2/log/failure_reason2',
'test3/path/to/test3/log/failure_reason1',
'test4/path/to/test4/log/failure_reason1',
]
failure_details_dict = {}
for failure in all_failures:
    key, *paths, reason = failure.split('/')
    failure_details_dict.setdefault(key, {
        'path': f"/{'/'.join(paths)}/",
        'reason': [],
    })['reason'].append(reason)
for key, value in failure_details_dict.items():
    print(key)
    print(value)
    print()
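One caveat with this layout: setdefault keeps the path from the first failure seen for each test, so a later failure of the same test under a different path would be filed under the first path. If that can happen in your data, one possible variation is to group reasons by path as well. This is only a sketch; the second input line, with its `other_log` directory, is a hypothetical example and `grouped` is a name chosen for illustration.

```python
all_failures = [
    'test1/path/to/test1/log/failure_reason1',
    'test1/path/to/test1/other_log/failure_reason2',  # hypothetical second path
]

grouped = {}
for failure in all_failures:
    key, *paths, reason = failure.split('/')
    path = f"/{'/'.join(paths)}/"
    # Outer dict: test name -> inner dict; inner dict: path -> list of reasons.
    grouped.setdefault(key, {}).setdefault(path, []).append(reason)

print(grouped)
```

If every test has exactly one log path, as in the question's data, the simpler structure above is enough.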