This post is partly a reminder for myself; it’s about a part of SQL, which I tend to forget to use. But if I forget about it, then that means that others are likely to as well, right? (heh)
Anyhow, I was trying to create a process for maintaining sample data in an application–the users can install an application, muck up the sample data all they want, and then hit a “reset” button which takes everything back to the beginning. Pretty straightforward; just delete the data, and re-insert it. But the sample data is just to get people started using the apps, so we can’t just wipe the table–they may be using some of the data, or may have made changes that they want to keep, etc.
So we wanted to do something like this:
insert into table1 ( id, col1, col2 ) select 1 id, 'Sample' c1, 'Value' c2 from dual where not exists ( select null from table1 where id = 1;
That’d work, but in my case, I was loading 10 different tables, and anywhere from 3 to 15 different rows per table. I knew that I’d end up with a different id in the data I was inserting and in the not exists test sooner or later due to excessive cut-and-paste. (Alternatively, I could wrap each insert statement in a begin/end block and catch the duplicate ID exception–but who wants to do all that?) So I started trying to make something that would be simpler to maintain.
My solution was a private procedure that took a table name and an associative array of column names and values, and then built up the above insert statement using dynamic SQL. Effective, but rather ugly. I’m not even going to bother posting it here.
One of my co-workers then suggested that I use merge. The syntax is a little more complex than the above insert statement, but I can use one merge for all of the rows in a table–that’s a definite win. And I don’t have to type any of the values twice, so that source of error is gone!
merge into table1 dest using ( select 1 id, 'Sample' c1, 'Value' c2 from dual union all select 2, 'More', 'Value2' from dual ... ) src on ( dest.id = src.id ) when not matched then insert (id, col1, col2 ) values (src.id, src.col1, src.col2 );
It’s fast, it’s secure, and it’s easy to maintain. The only downside is that it wasn’t my first approach.