"I don't know
why we are here, but I'm pretty sure that it is not in order to
enjoy ourselves." - Ludwig Wittgenstein (1889-1951) |
Read
first then play the video:
SQL-VIDEO -Troubleshooting
Deleting duplicated records
Troubleshooting Deleting
duplicated records
Introduction
Your organization has a lot
of duplicated records in their Account Employee table. This is a daily
problem in any organization and you should be prepared to fix the
problem. You have been assigned to clean all of the duplicated
records.
Begin by logging in as the
iself user.
SQL> CONNECT iself/schooling
Set the linesize to 100 and
the pagesize to 55.
SQL> SET LINESIZE 100 PAGESIZE 55
CREATE TABLE AS "
First, let's create a table
with lots of duplicated records. Create a table named dup_emp and copy
all of the EMP's records into it.
SQL> CREATE TABLE dup_emp
AS (SELECT * FROM emp)
SQL> /
Query the dup_emp table.
SQL> SELECT * FROM dup_emp
SQL> /
INSERT INTO " SELECT
command
Insert all the accounting
department rows into the dup_emp table.
SQL> INSERT INTO dup_emp
SELECT * FROM emp WHERE deptno = 10
SQL> /
SQL> commit;
Query all the duplicated
records from the dup_emp table order by the employee name.
SQL> SELECT * FROM dup_emp
WHERE deptno = 10
ORDER BY ename
SQL> /
Notice that all of the
employees of department 10 have been duplicated. Don"t try to
understand the following PL/SQL statements. You will learn how to
write a PL/SQL statement at the PL/SQL book.
--You will learn about the
following PL/SQL code on the PL/SQL subject
--We will cover about the
cursor and for loop statements later on PL/SQL.
--Only do it to delete all
the duplicate records. After finishing PL/SQL, you should understand
it.
Using PL/SQL to delete
duplicated records
Write a procedure to delete
all of the duplicated records.
SQL> DECLARE
SQL> -- Get the unique empno.
SQL> CURSOR c_empno IS
SQL> SELECT DISTINCT empno AS empno FROM dup_emp;
SQL> -- Get all duplicated records for an employee.
SQL> CURSOR c_duprec (p_empno NUMBER) IS
SQL> SELECT rowid FROM dup_emp WHERE empno = p_empno;
SQL> first_one BOOLEAN;
SQL> BEGIN
SQL> FOR v_empno IN c_empno LOOP
SQL> first_one := TRUE;
SQL> FOR v_duprec IN c_duprec (v_empno.empno) LOOP
SQL> IF NOT first_one THEN
SQL> DELETE FROM dup_emp
SQL> WHERE rowid = v_duprec.rowid;
SQL> COMMIT;
SQL> END IF;
SQL> first_one := FALSE;
SQL> END LOOP;
SQL> END LOOP;
SQL> END;
SQL> /
Query again the dup_emp
table order by the department and employee number.
Then check for duplication.
SQL> SELECT * FROM dup_emp
WHERE deptno = 10
ORDER BY ename
SQL> /
Notice that there are no duplicated records.
Using SQL to delete
duplicated records
You could write a delete
statement to remove all duplicated records. The following statement
will delete all records that have the same employee number except the
first one.
In this time, you did not have any duplicated record to be deleted.
SQL> DELETE FROM dup_emp
WHERE ROWID IN (SELECT MAX(ROWID)
FROM dup_emp
GROUP BY empno
HAVING COUNT (empno) > 1)
SQL> /
Drop the dup_emp table.
SQL> DROP TABLE dup_emp
SQL> /
"Good people do
not need laws to tell them to act responsibly, while bad people
will find a way around the laws." - Plato (427-347 B.C.) |
Questions:
Q: Why it is important to
eliminate duplicate records?
Q: What does the following
SQL statement?
SQL> DELETE FROM dup_emp
WHERE ROWID IN (SELECT MAX(ROWID)
FROM dup_emp
GROUP BY empno
HAVING COUNT (empno) > 1)
SQL> /
|