I rarely work with geological or mineral data and have no experience in the field. I had some free time today, so I decided to build an object detection system on an artificially generated dataset. First, a look at the results:
The dataset is synthetic, so the images look somewhat unnatural, as shown below:
The core technique is PIL's Image.paste method. With well-chosen combinations of foreground objects and backgrounds, a wide variety of datasets can be generated.
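A minimal sketch of the idea, assuming Pillow is installed: a foreground cut-out is pasted onto a background at a random position, and the paste box doubles as the bounding-box annotation. The function name `paste_object`, the solid-color stand-in images, and the sizes are illustrative assumptions, not the original generation script.

```python
import random

from PIL import Image


def paste_object(background, foreground, rng=random):
    """Paste `foreground` onto a copy of `background` at a random position.

    Returns the composited image and the pixel box (xmin, ymin, xmax, ymax).
    Hypothetical helper for illustration only.
    """
    bg_w, bg_h = background.size
    fg_w, fg_h = foreground.size
    if fg_w > bg_w or fg_h > bg_h:
        raise ValueError("foreground must fit inside the background")
    # Pick a top-left corner that keeps the whole object inside the image.
    x = rng.randint(0, bg_w - fg_w)
    y = rng.randint(0, bg_h - fg_h)
    canvas = background.copy()
    # Use the alpha channel as the paste mask when the cut-out has one,
    # so transparent pixels do not overwrite the background.
    mask = foreground if foreground.mode == "RGBA" else None
    canvas.paste(foreground, (x, y), mask)
    return canvas, (x, y, x + fg_w, y + fg_h)


# Stand-in images; in practice these would come from Image.open(...).
background = Image.new("RGB", (640, 480), (120, 110, 100))
mineral = Image.new("RGBA", (80, 60), (200, 40, 40, 255))
image, box = paste_object(background, mineral, random.Random(0))
```

Because the paste position is chosen by the script, the annotation is known exactly and no manual labeling is needed.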
The annotation file in YOLO format is as follows:
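For reference, a YOLO label file has one line per object: `class_id x_center y_center width height`, with the four box values normalized to the range 0 to 1 by the image width and height. The sketch below converts a pixel box to such a line; the helper name `to_yolo_line` and the sample numbers are illustrative assumptions.

```python
def to_yolo_line(class_id, box, image_size):
    """Convert a pixel box (xmin, ymin, xmax, ymax) to a YOLO label line."""
    xmin, ymin, xmax, ymax = box
    img_w, img_h = image_size
    # YOLO stores the box center and size, normalized by the image dimensions.
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Made-up example: an 80x60 object whose top-left corner is at (100, 200)
# in a 640x480 image.
line = to_yolo_line(0, (100, 200, 180, 260), (640, 480))
print(line)  # 0 0.218750 0.479167 0.125000 0.125000
```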
The annotation file in VOC format is as follows:
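Pascal VOC annotations are XML files, one per image, with an `<object>` element per box that holds the class name and the pixel coordinates in `<bndbox>`. A minimal sketch using the standard library follows; the function name, the file name, and the class name `mineral` are hypothetical, and only the most commonly used fields are written.

```python
import xml.etree.ElementTree as ET


def build_voc_xml(filename, image_size, objects):
    """Build a minimal Pascal VOC annotation as an XML string.

    `objects` is a list of (class_name, (xmin, ymin, xmax, ymax)) tuples.
    """
    width, height = image_size
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"  # assumes RGB images
    for name, box in objects:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = name
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        # VOC keeps absolute pixel corners, unlike YOLO's normalized center/size.
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")


xml_text = build_voc_xml("0001.jpg", (640, 480), [("mineral", (100, 200, 180, 260))])
```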
The code for randomly splitting the dataset into training and test sets is as follows:
import os
import random
import shutil
import uuid

# dataDir, saveDir, train_ratio and id_flag are assumed to be defined earlier
# (source directory, output directory, training fraction, rename-with-UUID switch).

# Collect sample IDs from the label file names (extension stripped).
pic_ids_list = [
    os.path.splitext(one)[0].strip() for one in os.listdir(dataDir + "labels/")
]
print("pic_ids_list_length: ", len(pic_ids_list))
# Randomly pick the training samples; everything else goes to the test set.
train_num = int(train_ratio * len(pic_ids_list))
train_list = random.sample(pic_ids_list, train_num)
train_set = set(train_list)
test_list = [one for one in pic_ids_list if one not in train_set]
print("train_list_length: ", len(train_list))
print("test_list_length: ", len(test_list))
# Create the output directories for the test set
testImgDir = saveDir + "images/test/"
testTxtDir = saveDir + "labels/test/"
os.makedirs(testImgDir, exist_ok=True)
os.makedirs(testTxtDir, exist_ok=True)
for one_id in test_list:
    # Optionally rename each sample with a fresh UUID.
    if id_flag:
        new_id = str(uuid.uuid4())
    else:
        new_id = one_id
    shutil.copy(dataDir + "images/" + one_id + ".jpg", testImgDir + new_id + ".jpg")
    shutil.copy(dataDir + "labels/" + one_id + ".txt", testTxtDir + new_id + ".txt")
print(
    "================================Test Dataset Build Success================================"
)
# Create the output directories for the training set
trainImgDir = saveDir + "images/train/"
trainTxtDir = saveDir + "labels/train/"
if not os.path.exists(trainImgDir):
    os.makedirs(trainImgDir)
if not os.path.exists(trainTxtDir):
    os.makedirs(trainTxtDir)
# Move only the training samples; the test samples were copied above.
for one_id in train_list:
    if id_flag:
        new_id = str(uuid.uuid4())
    else:
        new_id = one_id
    shutil.move(
        dataDir + "images/" + one_id + ".jpg", trainImgDir + new_id + ".jpg"
    )
    shutil.move(
        dataDir + "labels/" + one_id + ".txt", trainTxtDir + new_id + ".txt"
    )
print(
    "================================Train Dataset Build Success================================"
)
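After a split like this it is worth checking that no sample landed in both subsets and that every image has a label file. The sketch below assumes the images/<subset>/ and labels/<subset>/ layout used above; the function names `stems` and `check_split` are hypothetical. The overlap check is only meaningful when the original file names are kept, since freshly generated UUIDs never collide.

```python
import os


def stems(directory):
    """Return the set of file names in `directory` without extensions."""
    return {os.path.splitext(name)[0] for name in os.listdir(directory)}


def check_split(save_dir):
    """Report overlap between subsets and images that lack a label file."""
    report = {}
    for subset in ("train", "test"):
        images = stems(os.path.join(save_dir, "images", subset))
        labels = stems(os.path.join(save_dir, "labels", subset))
        report[subset] = {"images": images, "unlabeled": images - labels}
    # Any ID present in both subsets means test data leaked into training.
    report["overlap"] = report["train"]["images"] & report["test"]["images"]
    return report
```

Calling `check_split(saveDir)` after the split should give an empty `overlap` set and empty `unlabeled` sets for both subsets.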
I trained for 100 epochs with the default parameters. The results are as follows:
F1 curve:
Data visualization:
PR curve:
Training visualization:
Batch examples:
Example inference results are as follows: