How to Submit Millions of ERPNext Invoices Without Losing Days

Bulk-importing records into ERPNext is straightforward, but bulk-submitting them, especially with ledger or stock impact, can be painfully slow. If you've ever watched a long queue of invoices crawl through submission one at a time, this guide is for you.


Handling Huge Data Volumes



Before reading this, you can read our guide on how to split a huge Excel or CSV file and upload it here.


When you call .submit() on a Sales Invoice, ERPNext doesn’t just change the docstatus. It runs validations, posts General Ledger entries, updates stock, and sometimes triggers tax and revenue reports. That means multiple SQL writes, potential conflicts, and yes — locking issues.

If you're submitting thousands or millions of invoices with overlapping warehouses, items, or dates, you'll likely see:

  1. Deadlocks (QueryDeadlockError)
  2. Slow row-level locks
  3. Stalling threads and failing batches
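
For contrast, the naive baseline is a plain sequential loop, shown in the sketch below (a minimal illustration using the same Frappe calls as the scripts that follow). Every invoice waits for the previous one's commit, which is why large batches take hours:

import frappe

SITE = "benchui2.erpgulf.com"

frappe.init(site=SITE)
frappe.connect()

# One invoice at a time: each .submit() runs the full validation,
# GL posting and stock update pipeline before the next can start.
invoices = frappe.get_all("Sales Invoice", filters={"docstatus": 0}, fields=["name"])
for inv in invoices:
    doc = frappe.get_doc("Sales Invoice", inv["name"])
    doc.submit()
    frappe.db.commit()

frappe.db.close()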



Step 1: Break the Load into Chunks


We'll start by splitting your pending invoices into multiple chunk files.

import frappe
import os
import math

SITE = "benchui2.erpgulf.com"
CHUNKS = 5
OUTPUT_DIR = "invoice_chunks"

frappe.init(site=SITE)
frappe.connect()

# 📥 Get all draft invoices
invoices = frappe.get_all("Sales Invoice", filters={"docstatus": 0}, fields=["name"])
invoice_names = [inv["name"] for inv in invoices]
total = len(invoice_names)
chunk_size = math.ceil(total / CHUNKS)

# 📦 Create output folder and files
os.makedirs(OUTPUT_DIR, exist_ok=True)

for i in range(CHUNKS):
    chunk = invoice_names[i * chunk_size : (i + 1) * chunk_size]
    with open(f"{OUTPUT_DIR}/invoices_{i+1}.txt", "w") as f:
        f.write("\n".join(chunk))

print(f"✅ Split {total} invoices into {CHUNKS} files.")
frappe.db.close()



This splits the work into near-even, non-overlapping chunks, so no invoice is submitted twice. You now have invoices_1.txt through invoices_5.txt, ready for processing.
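
To run it, save the script (for example as split_invoices.py, name assumed) inside your bench's sites directory and execute it with the bench virtualenv's Python, e.g. ~/frappe-bench/env/bin/python split_invoices.py (path assumed; adjust to your bench). frappe.init(site=...) typically resolves the site relative to that directory.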



Step 2: Submit Each Chunk in Parallel with Threading


Each chunk file is processed in turn, with up to 10 worker threads submitting its invoices concurrently to boost throughput:


import frappe
import os
import traceback
from concurrent.futures import ThreadPoolExecutor

SITE = "benchui2.erpgulf.com"
CHUNK_FOLDER = "invoice_chunks"
CHUNK_FILES = [
    "invoices_1.txt",
    "invoices_2.txt",
    "invoices_3.txt",
    "invoices_4.txt",
    "invoices_5.txt"
]
MAX_WORKERS = 10  # Threads per chunk
BASE_DIR = os.path.dirname(__file__)

for idx, filename in enumerate(CHUNK_FILES, start=1):
    INVOICE_FILE = os.path.join(BASE_DIR, CHUNK_FOLDER, filename)

    try:
        with open(INVOICE_FILE) as f:
            invoice_names = [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        print(f"🚫 Chunk {idx}: File not found → {INVOICE_FILE}", flush=True)
        continue

    print(f"📦 Chunk {idx}: Submitting {len(invoice_names)} invoices from {filename}", flush=True)

    def submit_invoice(name):
        try:
            # frappe.local is thread-local, so every worker thread needs
            # its own init + connect to get its own DB connection.
            frappe.init(site=SITE)
            frappe.connect()
            frappe.local.dev_server = False

            doc = frappe.get_doc("Sales Invoice", name)
            doc.submit()
            frappe.db.commit()
            print(f"✔️ Chunk {idx}: {name}", flush=True)
        except frappe.exceptions.QueryDeadlockError as e:
            print(f"⚠️ Chunk {idx}: Deadlock on {name} -> {e}", flush=True)
            traceback.print_exc()
        except Exception as e:
            print(f"❌ Chunk {idx}: Failed {name} -> {e}", flush=True)
            traceback.print_exc()
        finally:
            frappe.db.close()

    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as executor:
        executor.map(submit_invoice, invoice_names)

print("\n✅ All chunk files processed. Done.")



This script:

  1. Processes each chunk independently
  2. Uses up to 10 threads to maximize concurrency
  3. Skips failed invoices without halting the batch
  4. Prints live progress to the screen
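
Note that the chunk files themselves run one after another inside a single process; it is each chunk's invoices that run in parallel. If you want the chunks in parallel too, launch one copy of the script per chunk file, keeping the combined thread count within what your MariaDB server tolerates.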



You May Need to Adjust Some MariaDB Settings

If you see deadlocks, group invoice chunks by warehouse or customer to reduce row lock contention, as in the sketch below.
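
Here is one way to do that, a minimal sketch replacing the splitter from Step 1. It groups by the invoice's customer field; swap in a warehouse field from the invoice items if warehouses are your contention point:

import frappe
import os
from collections import defaultdict

SITE = "benchui2.erpgulf.com"
CHUNKS = 5
OUTPUT_DIR = "invoice_chunks"

frappe.init(site=SITE)
frappe.connect()

# Pull the customer alongside the name so each customer's invoices
# stay inside a single chunk (one writer per customer's ledger rows).
invoices = frappe.get_all(
    "Sales Invoice",
    filters={"docstatus": 0},
    fields=["name", "customer"],
)

groups = defaultdict(list)
for inv in invoices:
    groups[inv["customer"]].append(inv["name"])

# Greedily assign whole customer groups, largest first, always topping
# up the smallest chunk so sizes stay roughly balanced.
chunks = [[] for _ in range(CHUNKS)]
for names in sorted(groups.values(), key=len, reverse=True):
    min(chunks, key=len).extend(names)

os.makedirs(OUTPUT_DIR, exist_ok=True)
for i, chunk in enumerate(chunks, start=1):
    with open(f"{OUTPUT_DIR}/invoices_{i}.txt", "w") as f:
        f.write("\n".join(chunk))

frappe.db.close()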

For large servers (32 GB+ RAM), tune MariaDB with:


innodb_buffer_pool_size = 24G
innodb_read_io_threads = 16
innodb_write_io_threads = 16
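
These settings belong in your MariaDB server config under the [mysqld] section (commonly /etc/mysql/mariadb.conf.d/50-server.cnf, though the path varies by distribution), followed by a restart: sudo systemctl restart mariadb. The buffer pool is the big lever: the more of the working set that stays in memory, the shorter each transaction holds its row locks.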



Finally, retry deadlocked invoices later; optionally log failures to a file so you can re-run only those, as in the sketch below.
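
A minimal sketch of that idea, assuming a hypothetical failed_invoices.txt log file (not part of the scripts above). Writes are serialized with a lock because the worker threads share the file:

import threading

FAILED_LOG = "failed_invoices.txt"  # hypothetical log file name
log_lock = threading.Lock()         # worker threads share this file, so serialize writes

def log_failure(name):
    # Append the invoice name so a later pass can retry just these.
    with log_lock:
        with open(FAILED_LOG, "a") as f:
            f.write(name + "\n")

Call log_failure(name) inside the QueryDeadlockError handler from Step 2; once every chunk finishes, add failed_invoices.txt to CHUNK_FILES and re-run. Deadlock victims usually succeed on a quieter second pass.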



Compared to submitting invoices sequentially, this approach:

  1. Cuts runtime by 4–6x depending on your server
  2. Minimizes lock contention
  3. Fully utilizes CPU cores and disk I/O
  4. Gives you visibility over failures without manual tracking



ERPNext is powerful — but for bulk operations, you need thoughtful batching, parallelism, and lock safety. This multi-threaded pipeline turns invoice submission from hours into minutes, and makes your system feel as fast as your hardware deserves.


