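The table contains a lot of junk data, and the goal is to delete the duplicates while keeping exactly one row from each group (whether two rows are duplicates is decided by several columns together).
Approach 1: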
DELETE
FROM
vitae a
WHERE
(a.peopleId, a.seq) IN (
SELECT
peopleId,
seq
FROM
vitae
GROUP BY
peopleId,
seq
HAVING
count(*) > 1
)
AND rowid NOT IN (
SELECT
min(rowid)
FROM
vitae
GROUP BY
peopleId,
seq
HAVING
count(*) > 1
)
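Unfortunately, this fails with an error. In MySQL the likely cause is that a DELETE may not select from its own target table inside a subquery (error 1093, "You can't specify target table ... for update in FROM clause").
Fix: wrap each subquery in a derived table (SELECT t.* FROM (...) t), so the grouped result is materialized first and the DELETE no longer reads vitae directly: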
DELETE
FROM
vitae a
WHERE
(a.peopleId, a.seq) IN (
SELECT t.* FROM (SELECT
peopleId,
seq
FROM
vitae
GROUP BY
peopleId,
seq
HAVING
count(*) > 1) t
)
AND rowid NOT IN (
SELECT t.* FROM (SELECT
min(rowid)
FROM
vitae
GROUP BY
peopleId,
seq
HAVING
count(*) > 1) t
)
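The derived-table workaround can be tried on a small throwaway table. The following is a minimal sketch, not the author's exact statement: the table vitae_demo and its data are hypothetical, the id column stands in for the rowid column used above, and the two conditions are collapsed into one NOT IN (groups without duplicates simply keep their only row, so no HAVING is needed).

```sql
-- Hypothetical demo table; `id` plays the role of `rowid` in the statement above.
CREATE TABLE vitae_demo (
    id       INT PRIMARY KEY,
    peopleId INT NOT NULL,
    seq      INT NOT NULL
);

INSERT INTO vitae_demo (id, peopleId, seq) VALUES
    (1, 1, 1),
    (2, 1, 1),   -- duplicate of id 1
    (3, 1, 1),   -- duplicate of id 1
    (4, 2, 1),   -- unique
    (5, 3, 2),
    (6, 3, 2);   -- duplicate of id 5

-- Keep the smallest id of every (peopleId, seq) group and delete the rest.
-- The grouped SELECT is wrapped in a derived table (keepers) so the DELETE
-- does not select from vitae_demo directly in its own subquery.
DELETE FROM vitae_demo
WHERE id NOT IN (
    SELECT keep_id
    FROM (
        SELECT MIN(id) AS keep_id
        FROM vitae_demo
        GROUP BY peopleId, seq
    ) AS keepers
);

-- Rows with id 1, 4 and 5 should remain.
SELECT id, peopleId, seq FROM vitae_demo ORDER BY id;
```

Running the inner SELECT on its own before the DELETE is a cheap way to check which rows will be kept.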
Approach 2: the ideal "deduplicate, keep one" SQL
DELETE consum_record
FROM
consum_record,
(
SELECT
min(id) id,
user_id,
monetary,
consume_time
FROM
consum_record
GROUP BY
user_id,
monetary,
consume_time
HAVING
count(*) > 1
) t2
WHERE
consum_record.user_id = t2.user_id
and consum_record.monetary = t2.monetary
and consum_record.consume_time = t2.consume_time
AND consum_record.id > t2.id;
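To see the statement above at work, here is a small sketch under stated assumptions: the table consum_record_demo, its column types and its rows are made up for illustration, and the comma join is written as an explicit JOIN with aliases, which should be equivalent.

```sql
-- Hypothetical demo table mirroring the columns used in the statement above.
CREATE TABLE consum_record_demo (
    id           INT PRIMARY KEY,
    user_id      INT NOT NULL,
    monetary     DECIMAL(10, 2) NOT NULL,
    consume_time DATETIME NOT NULL
);

INSERT INTO consum_record_demo (id, user_id, monetary, consume_time) VALUES
    (1, 10, 9.90, '2020-01-01 10:00:00'),
    (2, 10, 9.90, '2020-01-01 10:00:00'),  -- duplicate of id 1
    (3, 10, 9.90, '2020-01-01 10:00:00'),  -- duplicate of id 1
    (4, 11, 5.00, '2020-01-02 12:00:00'),  -- unique
    (5, 10, 9.90, '2020-01-03 10:00:00');  -- same user and amount, different time

-- t2 holds one row per duplicated group, carrying the smallest id (keep_id).
-- Every row that matches a group on all three columns and has a larger id
-- than keep_id is deleted.
DELETE r
FROM consum_record_demo AS r
JOIN (
    SELECT MIN(id) AS keep_id, user_id, monetary, consume_time
    FROM consum_record_demo
    GROUP BY user_id, monetary, consume_time
    HAVING COUNT(*) > 1
) AS t2
  ON  r.user_id = t2.user_id
  AND r.monetary = t2.monetary
  AND r.consume_time = t2.consume_time
WHERE r.id > t2.keep_id;

-- Rows with id 1, 4 and 5 should remain.
SELECT id, user_id, monetary, consume_time FROM consum_record_demo ORDER BY id;
```

One caveat of this pattern: `=` never matches NULL, so rows whose comparison columns contain NULL are not treated as duplicates by the join and would be left in place.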
=======================================================
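-- Same pattern on a users table, where name + age define a duplicate
-- and the row with the smallest user_id in each group is kept.
DELETE users FROM users, (SELECT min(user_id) user_id, name, age FROM users
GROUP BY name, age
HAVING COUNT(*) > 1) AS t1
WHERE users.name = t1.name AND users.age = t1.age
AND users.user_id > t1.user_id;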
Read the consum_record statement above carefully and its logic is not hard to work out; it can be understood in roughly three steps:
(SELECT min(id) id, user_id, monetary, consume_time FROM consum_record GROUP BY user_id, monetary, consume_time HAVING count(*) > 1 ) t2
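This subquery finds the duplicated records and collects them into a set (the derived table t2) that holds the smallest ID of each duplicate group.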
consum_record.user_id = t2.user_id and consum_record.monetary = t2.monetary and consum_record.consume_time = t2.consume_time
This joins the table to t2 on the columns that define a duplicate.