Copying in large dataset using copy_from, using temp table to do ON CONFLICT, CardinalityViolation?

I have a table with emails and names, where the email column is unique. I followed a post on here that explained how to do a copy_from when you are dealing with a unique column. The process is: A) create a temp table based on the table you are inserting the data into, B) copy the data from the file into the temp table, and C) insert from the temp table into the main table with an ON CONFLICT clause to handle the unique constraint. See the code below. For some reason, this results in the error:

psycopg2.errors.CardinalityViolation: ON CONFLICT DO UPDATE command cannot affect row a second time
HINT: Ensure that no rows proposed for insertion within the same command have duplicate constrained values.

What could be causing this? I would have thought that copying from an unconstrained temp table into a constrained main table (using ON CONFLICT) would work without a problem, and sometimes it does: only certain files trigger the error.

with connection.cursor() as stmt:
    # A) Staging table with main's columns but none of its constraints
    stmt.execute('CREATE TEMP TABLE tmp_main AS SELECT * FROM main WHERE 0 = 1')

    # B) Bulk-load the CSV into the staging table
    with open(file) as f:
        stmt.copy_from(f, 'tmp_main', columns=('email', 'name'), sep=',')

    # C) Upsert from the staging table into main, resolving conflicts on email
    stmt.execute(
        'INSERT INTO main (email, name) '
        'SELECT email, name FROM tmp_main '
        'ON CONFLICT (email) DO UPDATE SET name = coalesce(excluded.name, main.name)'
    )
    stmt.execute('DROP TABLE tmp_main')
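The HINT in the error message points at the likely cause: the input file itself contains the same email more than once, so the single INSERT ... ON CONFLICT statement would have to update the same row twice, which PostgreSQL refuses to do. One way around it is to deduplicate the rows before they reach the INSERT, for example with SELECT DISTINCT ON (email) in the SQL, or in Python before the COPY. Below is a minimal sketch of the Python approach; dedupe_rows is a hypothetical helper, and it mirrors the coalesce semantics above by letting a later non-null name overwrite an earlier one while a null name never clobbers an existing value.

```python
def dedupe_rows(rows):
    """Collapse duplicate emails from an iterable of (email, name) tuples.

    Mirrors coalesce(excluded.name, main.name): a later non-null name
    replaces the stored one; a later null name leaves it untouched.
    First-seen order of emails is preserved (dicts keep insertion order).
    """
    merged = {}
    for email, name in rows:
        if email not in merged or name is not None:
            merged[email] = name
    return list(merged.items())
```

You would run the parsed CSV rows through this helper and feed the result to copy_from (e.g. via an io.StringIO buffer) instead of the raw file, so the staging table never contains two rows with the same email.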


Read more here: https://stackoverflow.com/questions/66325679/copying-in-large-dataset-using-copy-from-using-temp-table-to-do-on-conflict-ca

Content Attribution

This content was originally published by max at Recent Questions - Stack Overflow, and is syndicated here via their RSS feed. You can read the original post over there.
