How to Submit Millions of ERPNext Invoices Without Losing Days

Bulk-importing records into ERPNext is straightforward, but bulk-submitting them, especially with ledger or stock impact, can be painfully slow. If you've ever watched a long queue of invoices crawl through submission one at a time, this guide is for you.


Handling Huge Data Volumes



Before reading this, you can read our guide on how to split a huge Excel or CSV file and upload it here.


When you call .submit() on a Sales Invoice, ERPNext doesn’t just change the docstatus. It runs validations, posts General Ledger entries, updates stock, and sometimes triggers tax and revenue reports. That means multiple SQL writes, potential conflicts, and yes — locking issues.

If you're submitting thousands or millions of invoices with overlapping warehouses, items, or dates, you'll likely see:

  1. Deadlocks (QueryDeadlockError)
  2. Slow row-level locks
  3. Stalling threads and failing batches
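
For contrast, the naive baseline is a plain sequential loop, shown in the sketch below (a minimal illustration using the same Frappe calls as the scripts that follow). Every invoice waits for the previous one's commit, which is why large batches take hours:

import frappe

SITE = "benchui2.erpgulf.com"

frappe.init(site=SITE)
frappe.connect()

# One invoice at a time: each .submit() runs the full validation,
# GL posting and stock update pipeline before the next can start.
invoices = frappe.get_all("Sales Invoice", filters={"docstatus": 0}, fields=["name"])
for inv in invoices:
    doc = frappe.get_doc("Sales Invoice", inv["name"])
    doc.submit()
    frappe.db.commit()

frappe.db.close()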



Step 1: Break the Load into Chunks


We'll start by splitting your pending invoices into multiple chunk files.

import frappe
import os
import math

SITE = "benchui2.erpgulf.com"
CHUNKS = 5
OUTPUT_DIR = "invoice_chunks"

frappe.init(site=SITE)
frappe.connect()

# 📥 Get all draft invoices
invoices = frappe.get_all("Sales Invoice", filters={"docstatus": 0}, fields=["name"])
invoice_names = [inv["name"] for inv in invoices]
total = len(invoice_names)
chunk_size = math.ceil(total / CHUNKS)

# 📦 Create output folder and files
os.makedirs(OUTPUT_DIR, exist_ok=True)

for i in range(CHUNKS):
    chunk = invoice_names[i * chunk_size : (i + 1) * chunk_size]
    with open(f"{OUTPUT_DIR}/invoices_{i+1}.txt", "w") as f:
        f.write("\n".join(chunk))

print(f"✅ Split {total} invoices into {CHUNKS} files.")
frappe.db.close()



This splits the work into near-even, non-overlapping chunks, so no invoice is submitted twice. You now have invoices_1.txt through invoices_5.txt, ready for processing.
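
To run it, save the script (for example as split_invoices.py, name assumed) inside your bench's sites directory and execute it with the bench virtualenv's Python, e.g. ~/frappe-bench/env/bin/python split_invoices.py (path assumed; adjust to your bench). frappe.init(site=...) typically resolves the site relative to that directory.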



Step 2: Submit Each Chunk in Parallel with Threading


Each chunk file is processed in turn, with up to 10 worker threads submitting its invoices concurrently to boost throughput:


import frappe
import os
import traceback
from concurrent.futures import ThreadPoolExecutor

SITE = "benchui2.erpgulf.com"
CHUNK_FOLDER = "invoice_chunks"
CHUNK_FILES = [
    "invoices_1.txt",
    "invoices_2.txt",
    "invoices_3.txt",
    "invoices_4.txt",
    "invoices_5.txt"
]
MAX_WORKERS = 10  # Threads per chunk
BASE_DIR = os.path.dirname(__file__)

for idx, filename in enumerate(CHUNK_FILES, start=1):
    INVOICE_FILE = os.path.join(BASE_DIR, CHUNK_FOLDER, filename)

    try:
        with open(INVOICE_FILE) as f:
            invoice_names = [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        print(f"🚫 Chunk {idx}: File not found → {INVOICE_FILE}", flush=True)
        continue

    print(f"📦 Chunk {idx}: Submitting {len(invoice_names)} invoices from {filename}", flush=True)

    def submit_invoice(name):
        try:
            # frappe.local is thread-local, so every worker thread needs
            # its own init + connect to get its own DB connection.
            frappe.init(site=SITE)
            frappe.connect()
            frappe.local.dev_server = False

            doc = frappe.get_doc("Sales Invoice", name)
            doc.submit()
            frappe.db.commit()
            print(f"✔️ Chunk {idx}: {name}", flush=True)
        except frappe.exceptions.QueryDeadlockError as e:
            print(f"⚠️ Chunk {idx}: Deadlock on {name} -> {e}", flush=True)
            traceback.print_exc()
        except Exception as e:
            print(f"❌ Chunk {idx}: Failed {name} -> {e}", flush=True)
            traceback.print_exc()
        finally:
            frappe.db.close()

    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        executor.map(submit_invoice, invoice_names)

print("\n✅ All chunk files processed. Done.")



This script:

  1. Processes each chunk independently
  2. Uses up to 10 threads to maximize concurrency
  3. Skips failed invoices without halting the batch
  4. Prints live progress to the screen
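
Note that the chunk files themselves run one after another inside a single process; it is each chunk's invoices that run in parallel. If you want the chunks in parallel too, launch one copy of the script per chunk file, keeping the combined thread count within what your MariaDB server tolerates.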



You May Need to Adjust Some MariaDB Settings

If you see deadlocks, group invoice chunks by warehouse or customer to reduce row lock contention, as in the sketch below.
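
Here is one way to do that, a minimal sketch replacing the splitter from Step 1. It groups by the invoice's customer field; swap in a warehouse field from the invoice items if warehouses are your contention point:

import frappe
import os
from collections import defaultdict

SITE = "benchui2.erpgulf.com"
CHUNKS = 5
OUTPUT_DIR = "invoice_chunks"

frappe.init(site=SITE)
frappe.connect()

# Pull the customer alongside the name so each customer's invoices
# stay inside a single chunk (one writer per customer's ledger rows).
invoices = frappe.get_all(
    "Sales Invoice",
    filters={"docstatus": 0},
    fields=["name", "customer"],
)

groups = defaultdict(list)
for inv in invoices:
    groups[inv["customer"]].append(inv["name"])

# Greedily assign whole customer groups, largest first, always topping
# up the smallest chunk so sizes stay roughly balanced.
chunks = [[] for _ in range(CHUNKS)]
for names in sorted(groups.values(), key=len, reverse=True):
    min(chunks, key=len).extend(names)

os.makedirs(OUTPUT_DIR, exist_ok=True)
for i, chunk in enumerate(chunks, start=1):
    with open(f"{OUTPUT_DIR}/invoices_{i}.txt", "w") as f:
        f.write("\n".join(chunk))

frappe.db.close()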

For large servers (32 GB+ RAM), tune MariaDB with:


innodb_buffer_pool_size = 24G
innodb_read_io_threads = 16
innodb_write_io_threads = 16
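
These settings belong in your MariaDB server config under the [mysqld] section (commonly /etc/mysql/mariadb.conf.d/50-server.cnf, though the path varies by distribution), followed by a restart: sudo systemctl restart mariadb. The buffer pool is the big lever: the more of the working set that stays in memory, the shorter each transaction holds its row locks.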



Finally, retry deadlocked invoices later; optionally log failures to a file so you can re-run only those, as in the sketch below.
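
A minimal sketch of that idea, assuming a hypothetical failed_invoices.txt log file (not part of the scripts above). Writes are serialized with a lock because the worker threads share the file:

import threading

FAILED_LOG = "failed_invoices.txt"  # hypothetical log file name
log_lock = threading.Lock()         # worker threads share this file, so serialize writes

def log_failure(name):
    # Append the invoice name so a later pass can retry just these.
    with log_lock:
        with open(FAILED_LOG, "a") as f:
            f.write(name + "\n")

Call log_failure(name) inside the QueryDeadlockError handler from Step 2; once every chunk finishes, add failed_invoices.txt to CHUNK_FILES and re-run. Deadlock victims usually succeed on a quieter second pass.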



Compared to submitting invoices sequentially, this approach:

  1. Cuts runtime by 4–6x depending on your server
  2. Minimizes lock contention
  3. Fully utilizes CPU cores and disk I/O
  4. Gives you visibility over failures without manual tracking



ERPNext is powerful — but for bulk operations, you need thoughtful batching, parallelism, and lock safety. This multi-threaded pipeline turns invoice submission from hours into minutes, and makes your system feel as fast as your hardware deserves.


