Monday, April 30

How to do an "insert ignore" in Oracle

Ok, I should start off with a disclaimer. This is not a good idea to do.
Using the "insert ignore" statement is a way to let MySQL insert data from a dataset which may contain duplicate constraints to existing data, and simply skip the duplicate row.
Sounds like a great way to screw up data doesn't it? Should not be used unless you really know what is going on.

Ok, in MySQL we can do this.
mysql> insert ignore into a select * from b;
Query OK, 1 row affected (0.00 sec)
Records: 2 Duplicates: 1 Warnings: 0
But can this be done in Oracle without to much fuss? After a discussion on IRC a guy asked why can't we simply use the merge statement?
Well we can and it is probably a quite good suggestion if you *really* want to ditch those dupe rows.
-- So what do we got?
SQL> select * from a;

A B
---------- ----------
1 2
2 3

SQL> select * from b;

A B
---------- ----------
1 3
3 4

SQL> merge into a using b on (a.a=b.a)
when not matched then insert values (b.a,b.b);

1 row merged.

SQL> select * from a;

A B
---------- ----------
1 2
2 3
3 4

SQL>

-- Let's rollback that and have a look at the exeuction plan.

SQL> rollback;

Rollback complete.

SQL> set autotrace on
SQL> merge into a using b on (a.a=b.a)
when not matched then insert values (b.a,b.b);

1 row merged.


Execution Plan
----------------------------------------------------------
Plan hash value: 1973318225

-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 2 | 64 | 7 (15)| 00:00:01 |
| 1 | MERGE | A | | | | |
| 2 | VIEW | | | | | |
|* 3 | HASH JOIN OUTER | | 2 | 64 | 7 (15)| 00:00:01 |
| 4 | TABLE ACCESS FULL| B | 2 | 52 | 3 (0)| 00:00:01 |
| 5 | TABLE ACCESS FULL| A | 2 | 12 | 3 (0)| 00:00:01 |
-----------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

3 - access("A"."A"(+)="B"."A")

Note
-----
- dynamic sampling used for this statement


Statistics
----------------------------------------------------------
0 recursive calls
5 db block gets
15 consistent gets
0 physical reads
0 redo size
819 bytes sent via SQL*Net to client
778 bytes received via SQL*Net from client
4 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processed

SQL>
Looks ok.

2 comments:

Anonymous said...

Or you could use 10gR2's new dml error logging capabiltiies.

Hampus said...

Yes we could use the dml error logging, it is better general solution. However, using error logging with a reject limit would require us to create the err$ table before doing the insert and also maintain it. More the kind of thing we want to do during bulk loads.
The merge version is more a generic solution at the app level (still bad though).

exec DBMS_ERRLOG.CREATE_ERROR_LOG('A');

insert into a
select * from b log errors reject limit unlimited;

SQL> select a,b from err$_a;

A B
----- -----
1 3

A "dont log errors" feature would be good.