[12.0][FIX] queue_job: don't enqueue dependent jobs after a retryable postpone (closed cursor) #944

Open
MiquelRForgeFlow wants to merge 1 commit into OCA:12.0 from ForgeFlow:12.0-fix-queue_job-closed-cursor-retryable

@MiquelRForgeFlow


Problem

Graph jobs that hit a Postgres serialization/concurrency error crash the /queue_job/runjob request with:

psycopg2.OperationalError: Unable to use a closed cursor.

Traceback tail:

controllers/main.py, in _enqueue_dependent_jobs -> job.enqueue_waiting()
job.py, in enqueue_waiting -> self.env.cr.execute(...)
OperationalError: Unable to use a closed cursor.

Root cause

In RunJobController.runjob, the except RetryableJobError handler calls the local retry_postpone(), which does:

job.env.clear()
with odoo.registry(job.env.cr.dbname).cursor() as new_cr:
    job.env = job.env(cr=new_cr)
    ...
    new_cr.commit()
# new_cr is closed here, but job.env.cr still points at it

The handler ended with env.cr.rollback() but no return, so control fell through to self._enqueue_dependent_jobs(env, job) -> job.enqueue_waiting(), which runs self.env.cr.execute(...) on the closed cursor. This only triggers for jobs in a dependency graph that hit a retryable error, which is why it's intermittent.
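
The stale-cursor mechanism can be reproduced without Odoo or Postgres. The following is a minimal sketch under stated assumptions: `FakeCursor`, `FakeJob`, `temporary_cursor` and `ClosedCursorError` are hypothetical stand-ins invented for illustration, not queue_job or psycopg2 APIs, and the query string is a placeholder.

```python
from contextlib import contextmanager


class ClosedCursorError(Exception):
    """Hypothetical stand-in for psycopg2.OperationalError."""


class FakeCursor:
    """Hypothetical cursor that refuses to execute once closed."""

    def __init__(self):
        self.closed = False

    def execute(self, query):
        if self.closed:
            raise ClosedCursorError("Unable to use a closed cursor.")
        return query

    def commit(self):
        pass

    def close(self):
        self.closed = True


@contextmanager
def temporary_cursor():
    """Yield a cursor and close it on exit, like a `with ... cursor()` block."""
    cr = FakeCursor()
    try:
        yield cr
    finally:
        cr.close()


class FakeJob:
    """Hypothetical job that keeps a reference to whatever cursor it was given."""

    def __init__(self, cr):
        self.cr = cr

    def enqueue_waiting(self):
        return self.cr.execute("-- placeholder query")


def retry_postpone(job):
    with temporary_cursor() as new_cr:
        job.cr = new_cr  # the job now points at the temporary cursor
        new_cr.commit()
    # new_cr is closed here, but job.cr still refers to it


job = FakeJob(FakeCursor())
retry_postpone(job)
try:
    job.enqueue_waiting()  # the fall-through call from the handler
    outcome = "ok"
except ClosedCursorError as exc:
    outcome = str(exc)
```

In this sketch `outcome` ends up holding the closed-cursor message, because the job object outlives the `with` block that owned its cursor.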

Fix

Return right after the rollback in the RetryableJobError branch. A postponed job isn't done, so it must not release its dependents; returning early also avoids using the closed cursor. This matches upstream OCA queue_job behaviour in 14.0/15.0/16.0.
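
The control-flow change can be illustrated in isolation. This is a simplified sketch, not the actual controller code: `run_job`, its `return_after_postpone` flag and the local `RetryableJobError` class are hypothetical names used only to contrast the two behaviours.

```python
class RetryableJobError(Exception):
    """Hypothetical stand-in for the queue_job exception of the same name."""


def run_job(perform, enqueue_dependents, return_after_postpone=True):
    """Run a job and release its dependents only if it was not postponed."""
    try:
        perform()
    except RetryableJobError:
        # The real handler postpones the job and rolls back at this point.
        if return_after_postpone:
            return "postponed"  # the fix: stop here
        # Without the return, execution falls through to the line below.
    enqueue_dependents()
    return "enqueued_dependents"


def raise_retryable():
    raise RetryableJobError("serialization failure")


released = []
fixed = run_job(raise_retryable, lambda: released.append("fixed"))
buggy = run_job(
    raise_retryable,
    lambda: released.append("buggy"),
    return_after_postpone=False,
)
success = run_job(lambda: None, lambda: released.append("success"))
```

Only the `buggy` and `success` runs reach `enqueue_dependents`; with the early return, a postponed job leaves its dependents untouched.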

…pone

When a job's perform() raised a PG serialization error, the runjob controller wrapped it as RetryableJobError and called retry_postpone(), which reassigns job.env to a temporary cursor inside a `with` block and closes it on exit. The handler then fell through to _enqueue_dependent_jobs() -> Job.enqueue_waiting(), running self.env.cr.execute() on that now-closed cursor and raising
"psycopg2.OperationalError: Unable to use a closed cursor".

A postponed job is not done and must not release its dependent jobs, so return right after the rollback. This also avoids touching the closed cursor. Aligns 12.0 with the upstream 14.0+ behaviour.
@OCA-git-bot


Hi @guewen,
some modules you are maintaining are being modified, check this out!
