Friday, March 23, 2012

help on find duplicates and delete procedure

Hi everybody I need help on finding duplicates and deleting the duplicate record depending on name and fname , deleting the duplicates and leaving only the first one.

my PERSON table is this below:

ID name fname ownerid id2

1 a b
2 c c
3 e f
4 a b 1 10
5 c c 2 11

I have this query below that returns records 1 and 4 and 2 and 5 since they have the same name and fname

select * from ( Select name ,fname, count(1) as cnt from PERSON group by
name,Fname ) where cnt > 1

ID name fname ownerid id2

1 a b
4 a b 1 10

2 c c
5 c c 2 11

With this result I need to delete the second record of each group but update the first records with the ownerid and id2 of the second record that would be deleted... I don't know how to proceed with this..

thanks
alexhow do you know which one is the "second" record?

and why go to the trouble of updating the one you keep? why not delete the one that doesn't have values in ownerid and id2?|||how do you know which one is the "second" record?

and why go to the trouble of updating the one you keep? why not delete the one that doesn't have values in ownerid and id2?

the second record is the one with ownerid and id2 well the reason behind is that the second record is inserted from another table and the id of the first one is the one i need due to relationships with other tables|||I see trouble ahead ;) If you update the missing data in the "first" of the duplicate rows, then how are you going to figure out which row is which when you later attempt to delete the "second" row?

Keep in mind that your question/description is rife with suggestions of "order" in your data. Beware "order" where there is none but that which is artificially created. (I need to get a job writing fortune cookies).

Probably you will need to add a column or work in a temporary table so that you can artificially add a "flag" of some type so you know which rows are the "good" rows and which ones you will later slaughter mercilessly after brazenly and carelessly using them only for their data.|||I see trouble ahead ;) If you update the missing data in the "first" of the duplicate rows, then how are you going to figure out which row is which when you later attempt to delete the "second" row?

Keep in mind that your question/description is rife with suggestions of "order" in your data. Beware "order" where there is none but that which is artificially created. (I need to get a job writing fortune cookies).

Probably you will need to add a column or work in a temporary table so that you can artificially add a "flag" of some type so you know which rows are the "good" rows and which ones you will later slaughter mercilessly after brazenly and carelessly using them only for their data.

yes thanks for your suggestion will try it then...|||this might work for u ...
[code]
delete from Table1 where ID in
(select ID from Table1, (Select name ,fname, count(*) as cnt from Table1 group by name,Fname having count(*) > 1) as xx
where Table1.name=xx.name and Table1.fname=xx.fname and Table1.ownerid is not null and Table1.id2 is not null)
[code]

No comments:

Post a Comment