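One MySQL post a day, no. 15: perfect deduplication in MySQL

The table contains a lot of junk rows. The goal is to delete the duplicates and keep exactly one row from each group, where two rows count as duplicates when several fields all match.

Approach 1: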

DELETE
FROM
 vitae a
WHERE
 (a.peopleId, a.seq) IN (
  SELECT
   peopleId,
   seq
  FROM
   vitae
  GROUP BY
   peopleId,
   seq
  HAVING
   count(*) > 1
 )
AND rowid NOT IN (
 SELECT
  min(rowid)
 FROM
  vitae
 GROUP BY
  peopleId,
  seq
 HAVING
  count(*) > 1
)

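Unfortunately, MySQL rejects this statement. The subqueries read from vitae, the same table the DELETE is modifying, which MySQL does not allow (most likely error 1093, "You can't specify target table ... for update in FROM clause").

Fix: wrap each subquery in a derived table (aliased t here), so the DELETE no longer selects from its target table directly: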

DELETE
FROM
 vitae a
WHERE
 (a.peopleId, a.seq) IN (
  SELECT t.* FROM (SELECT
   peopleId,
   seq
  FROM
   vitae
  GROUP BY
   peopleId,
   seq
  HAVING
   count(*) > 1) t
 )
AND rowid NOT IN (
 SELECT t.* FROM (SELECT
  min(rowid)
 FROM
  vitae
 GROUP BY
  peopleId,
  seq
 HAVING
  count(*) > 1) t
)
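To see which rows the statement above keeps, here is a minimal sketch that runs the same logic on a few made-up rows. It uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption of this example and not part of the original post: SQLite's implicit rowid plays the role of the rowid column, the table alias a is dropped, and SQLite does not raise MySQL's error, so the derived-table wrapper is kept only to mirror the statement. The function name dedupe_vitae and the sample data are hypothetical.

```python
import sqlite3

def dedupe_vitae(rows):
    """Insert (peopleId, seq) rows, delete duplicates, return the surviving rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE vitae (peopleId INTEGER, seq INTEGER)")
    conn.executemany("INSERT INTO vitae (peopleId, seq) VALUES (?, ?)", rows)

    # Same shape as the fixed statement: rows that belong to a duplicated
    # (peopleId, seq) group, except the row with the smallest rowid per group.
    conn.execute("""
        DELETE FROM vitae
        WHERE (peopleId, seq) IN (
            SELECT t.peopleId, t.seq FROM (
                SELECT peopleId, seq FROM vitae
                GROUP BY peopleId, seq
                HAVING count(*) > 1) t
        )
        AND rowid NOT IN (
            SELECT t.keep_id FROM (
                SELECT min(rowid) AS keep_id FROM vitae
                GROUP BY peopleId, seq
                HAVING count(*) > 1) t
        )
    """)

    survivors = conn.execute(
        "SELECT rowid, peopleId, seq FROM vitae ORDER BY rowid"
    ).fetchall()
    conn.close()
    return survivors

# Made-up sample: (1, 1) appears twice, (2, 1) three times, (1, 2) once.
vitae_sample = [(1, 1), (1, 1), (1, 2), (2, 1), (2, 1), (2, 1)]
vitae_survivors = dedupe_vitae(vitae_sample)
for row in vitae_survivors:
    print(row)
```

Each duplicated group is reduced to its first-inserted row, and the row that was never duplicated is left alone.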

Approach 2: the perfect "dedupe and keep one" SQL

DELETE consum_record
FROM
 consum_record, 
 (
  SELECT
   min(id) id,
   user_id,
   monetary,
   consume_time
  FROM
   consum_record
  GROUP BY
   user_id,
   monetary,
   consume_time
  HAVING
   count(*) > 1
 ) t2
WHERE
 consum_record.user_id = t2.user_id 
 and consum_record.monetary = t2.monetary
 and consum_record.consume_time = t2.consume_time
AND consum_record.id > t2.id;



=======================================================


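The same pattern applied to a users table, with (name, age) as the duplicate key:

DELETE users FROM users, (
    SELECT min(user_id) AS user_id, name, age
    FROM users
    GROUP BY name, age
    HAVING COUNT(*) > 1) AS t1
WHERE users.name = t1.name
  AND users.age = t1.age
  AND users.user_id > t1.user_id;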

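The consum_record statement in approach 2 is not hard to follow once you read it closely. It breaks down into roughly three steps:

(SELECT min(id) id, user_id, monetary, consume_time FROM consum_record GROUP BY user_id, monetary, consume_time HAVING count(*) > 1 ) t2 finds the duplicated records and forms a set (derived table t2) that holds the smallest id of each group of duplicates.

consum_record.user_id = t2.user_id and consum_record.monetary = t2.monetary and consum_record.consume_time = t2.consume_time joins each row to its group on the fields that define a duplicate.

consum_record.id > t2.id then restricts the match to rows whose id is larger than the group's smallest id. Those rows are deleted, so the row with the smallest id in each group is the one that stays.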
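The steps can be traced on a small made-up data set. The sketch below again uses Python's built-in sqlite3 module as a stand-in for MySQL, which is an assumption of this example. SQLite has no multi-table DELETE, so the join is first run as a SELECT to list the ids that would be removed, and the delete is then written as DELETE ... WHERE id IN (...). The helper trace_dedupe, the column types, and the sample rows are hypothetical.

```python
import sqlite3

# Step 1: the derived table t2, one row per duplicated group with its smallest id.
T2_SQL = """
    SELECT min(id) AS id, user_id, monetary, consume_time
    FROM consum_record
    GROUP BY user_id, monetary, consume_time
    HAVING count(*) > 1
"""

# Steps 2 and 3: join on the duplicate-defining fields, then keep only
# the rows whose id is larger than the group's smallest id.
DOOMED_SQL = f"""
    SELECT r.id
    FROM consum_record r
    JOIN ({T2_SQL}) t2
      ON r.user_id = t2.user_id
     AND r.monetary = t2.monetary
     AND r.consume_time = t2.consume_time
    WHERE r.id > t2.id
"""

def trace_dedupe(rows):
    """Return (ids selected for deletion, ids left after the delete)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE consum_record ("
        "id INTEGER PRIMARY KEY, user_id INTEGER, "
        "monetary INTEGER, consume_time TEXT)"
    )
    conn.executemany("INSERT INTO consum_record VALUES (?, ?, ?, ?)", rows)

    doomed = sorted(row[0] for row in conn.execute(DOOMED_SQL))
    conn.execute(f"DELETE FROM consum_record WHERE id IN ({DOOMED_SQL})")
    kept = [row[0] for row in
            conn.execute("SELECT id FROM consum_record ORDER BY id")]
    conn.close()
    return doomed, kept

# Made-up rows: ids 1-3 are one duplicate group, 5-6 another, 4 is unique.
demo_rows = [
    (1, 10, 100, "2018-09-01"), (2, 10, 100, "2018-09-01"),
    (3, 10, 100, "2018-09-01"), (4, 20, 50, "2018-09-02"),
    (5, 30, 70, "2018-09-03"), (6, 30, 70, "2018-09-03"),
]
doomed_ids, kept_ids = trace_dedupe(demo_rows)
print("deleted ids:", doomed_ids)
print("kept ids:", kept_ids)
```

Running the join as a plain SELECT before the DELETE is also a reasonable safety habit on a real table, since it shows exactly which rows would go.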


Reposted from blog.csdn.net/weixin_39666581/article/details/82633682