Insert by batch

Hi,
I need to insert around 10,000 records, but when I use the following code to execute it, it's very slow:


QueryRunner runner = new QueryRunner(persistence.getDataSource());
for (int i = 1; i < 10000; i++) {
    String sql = "insert into tmp(id, name) values('" + i + "', 'name')";
    runner.update(sql);
}

Do you have any way to insert faster, or how can I insert by batch in CUBA?
Thanks

Hi Thao,

When you create a QueryRunner passing the DataSource, it will create and commit a new transaction on every operation, which can be slow. Try opening a transaction and committing it after all records are inserted (or in smaller batches):


QueryRunner runner = new QueryRunner();
try (Transaction tx = persistence.createTransaction()) {
    for (int i = 1; i < 10000; i++) {
        String sql = ...;
        runner.update(persistence.getEntityManager().getConnection(), sql);
    }
    tx.commit();
}
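As a side note beyond what the thread covers: DbUtils' QueryRunner also has a batch method that submits all rows as a single parameterized JDBC batch, avoiding both per-row SQL string building and per-row round trips. A minimal sketch of the parameter matrix it expects (the class and method names here are made up for illustration):

```java
public class BatchParams {

    // Build the parameter matrix for "insert into tmp(id, name) values(?, ?)":
    // one row per record to insert, one column per '?' placeholder.
    static Object[][] buildParams(int count) {
        Object[][] params = new Object[count - 1][];
        for (int i = 1; i < count; i++) {
            params[i - 1] = new Object[] { i, "name" };
        }
        return params;
    }
}
```

Inside the transaction you would then call something like `runner.batch(persistence.getEntityManager().getConnection(), "insert into tmp(id, name) values(?, ?)", BatchParams.buildParams(10000))` in place of the per-row `runner.update` calls.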

Hi Konstantin,

Can I batch-insert entities as well? I have a collection of entities Collection<? extends Entity> ents = … and want to batch-insert them without writing an SQL query…

I followed an example, as below, but it takes too much time to insert only 500 rows.
The memory release for the EntityManager is also missing!
Do you know how I can make it better?

        EntityManager entityManager = persistence.getEntityManager();

        final List<T> savedEntities = new ArrayList<T>(entities.size());


        int i = 0;
        for (T t : entities) {
            savedEntities.add(persistOrMerge(t, entityManager));
            i++;
            if (i % batchSize == 0) {
                // Flush a batch of inserts and release memory.
                entityManager.flush();
                // here the memory release is missing
            }
        }

Hi Gent,

Neither CUBA nor JPA has an API for batch writing, but you can try to specify the eclipselink.jdbc.batch-writing property in the app.properties file of your core module to make EclipseLink write in batches where possible.
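For example, the entry in app.properties could look like the following (the batch size value is just an illustration; the default behavior depends on your JDBC driver):

```properties
eclipselink.jdbc.batch-writing = JDBC
eclipselink.jdbc.batch-writing.size = 100
```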

If it doesn’t work, you’ll have to do massive inserts through JDBC.
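A minimal sketch of such a JDBC insert, assuming you obtain the Connection from the current CUBA transaction as shown earlier (the class and method names below are hypothetical): one parameterized statement is reused and sent to the database in batches via addBatch/executeBatch, so there is one round trip per chunk instead of one per row.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

public class JdbcBatchInsert {

    // Pure helper: split the id range [1, count) into chunks of at most batchSize ids.
    static List<int[]> chunkIds(int count, int batchSize) {
        List<int[]> chunks = new ArrayList<>();
        for (int start = 1; start < count; start += batchSize) {
            int end = Math.min(start + batchSize, count);
            int[] chunk = new int[end - start];
            for (int i = start; i < end; i++) {
                chunk[i - start] = i;
            }
            chunks.add(chunk);
        }
        return chunks;
    }

    // Reuse one PreparedStatement and execute it batch by batch.
    static void insertAll(Connection conn, int count, int batchSize) throws SQLException {
        String sql = "insert into tmp(id, name) values(?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (int[] chunk : chunkIds(count, batchSize)) {
                for (int id : chunk) {
                    ps.setInt(1, id);
                    ps.setString(2, "name");
                    ps.addBatch();
                }
                ps.executeBatch(); // one round trip per chunk
            }
        }
    }
}
```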

As for releasing memory consumed by EntityManager, just start a new transaction for every batch and get a new EntityManager instance. Then commit the transaction instead of flushing EntityManager.

BTW, how long does it take to insert 500 new entities in the usual way? On my machine with a local PostgreSQL database, the following method inserts 500 users in less than 200 ms:


public long insertUsers(int count) {
    long start = System.currentTimeMillis();
    Transaction tx = persistence.createTransaction();
    try {
        EntityManager em = persistence.getEntityManager();
        for (int i=0; i<count; i++) {
            User user = metadata.create(User.class);
            user.setLogin(RandomStringUtils.randomAlphanumeric(20));
            user.setName(RandomStringUtils.randomAlphanumeric(20));
            user.setGroup(em.getReference(Group.class, UUID.fromString("0fa2b1a5-1d68-4d69-9fbd-dff348347f93")));
            em.persist(user);
            if (i % 100 == 0) {
                tx.commitRetaining();
                em = persistence.getEntityManager();
            }
        }
        tx.commit();
    } finally {
        tx.end();
    }
    return System.currentTimeMillis() - start;
}

Thanks, it's faster now after using commitRetaining instead of flushing the EntityManager. I also had slow performance because I was running the CUBA app locally while my database was in the production cloud.