Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent test failure: completer blocked on send #601

Open
bgentry opened this issue Sep 18, 2024 · 0 comments
Open

Intermittent test failure: completer blocked on send #601

bgentry opened this issue Sep 18, 2024 · 0 comments
Labels
bug Something isn't working

Comments

@bgentry
Copy link
Contributor

bgentry commented Sep 18, 2024

I noticed this failure. It

Run go test -p 1 -race ./... -timeout 2m
  go test -p 1 -race ./... -timeout [2](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:2)m
  shell: /usr/bin/bash -e {0}
  env:
    ADMIN_DATABASE_URL: ***localhost:54[3](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:3)2
    DATABASE_URL: ***127.0.0.1:5[4](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:4)32/river_dev?sslmode=disable
    TEST_DATABASE_URL: ***127.0.0.1:[5](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:5)432/river_test?sslmode=disable
    GOTOOLCHAIN: local
ok  	github.com/riverqueue/river	37.0[6](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:6)0s
ok  	github.com/riverqueue/river/internal/cmd/testdbman	(cached)
ok  	github.com/riverqueue/river/internal/dblist	1.068s
ok  	github.com/riverqueue/river/internal/dbunique	1.184s
panic: test timed out after 2m0s
running tests:
	TestAsyncCompleter/SlowerContinuousCompletion (1m59s)

goroutine 14318 [running]:
testing.(*M).startAlarm.func1()
	/opt/hostedtoolcache/go/1.22.[7](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:7)/x64/src/testing/testing.go:2366 +0x265
created by time.goFunc
	/opt/hostedtoolcache/go/1.22.7/x64/src/time/sleep.go:177 +0x45

goroutine 1 [chan receive, 1 minutes]:
testing.tRunner.func1()
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1650 +0x9b2
testing.tRunner(0xc00017[8](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:8)820, 0xc00003fa40)
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:16[9](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:10)5 +0x25e
testing.runTests(0xc0001281f8, {0x[10](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:11)d9980, 0x9, 0x9}, {0x7f5cfe781a48?, 0xc000114540?, 0x10e1240?})
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:2159 +0x8bf
testing.(*M).Run(0xc00012d040)
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:2027 +0xf18
github.com/riverqueue/river/internal/riverinternaltest.WrapTestMain(0xc00012d040)
	/home/runner/work/river/river/internal/riverinternaltest/riverinternaltest.go:291 +0x136
github.com/riverqueue/river/internal/jobcompleter.TestMain(...)
	/home/runner/work/river/river/internal/jobcompleter/main_test.go:10
main.main()
	_testmain.go:73 +0x2d6

goroutine 14204 [select]:
github.com/jackc/pgx/v5/pgxpool.(*Pool).backgroundHealthCheck(0xc000000b40)
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:4[11](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:12) +0x1d3
github.com/jackc/pgx/v5/pgxpool.NewWithConfig.func3()
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:275 +0x68
created by github.com/jackc/pgx/v5/pgxpool.NewWithConfig in goroutine 70
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:273 +0xa65

goroutine 551 [chan send, 1 minutes]:
github.com/riverqueue/river/internal/jobcompleter.(*AsyncCompleter).JobSetStateIfRunning.func1()
	/home/runner/work/river/river/internal/jobcompleter/job_completer.go:172 +0x26e
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/home/runner/go/pkg/mod/golang.org/x/sync@v0.8.0/errgroup/errgroup.go:78 +0x92
created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 363
	/home/runner/go/pkg/mod/golang.org/x/sync@v0.8.0/errgroup/errgroup.go:75 +0x[12](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:13)5

goroutine 34 [chan receive, 1 minutes]:
testing.tRunner.func1()
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1650 +0x9b2
testing.tRunner(0xc000196000, 0xc84ab8)
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1695 +0x25e
created by testing.(*T).Run in goroutine 1
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1742 +0x826

goroutine 362 [semacquire, 1 minutes]:
sync.runtime_Semacquire(0xc000112f90?)
	/opt/hostedtoolcache/go/1.22.7/x64/src/runtime/sema.go:62 +0x25
sync.(*WaitGroup).Wait(0xc000112f88)
	/opt/hostedtoolcache/go/1.22.7/x64/src/sync/waitgroup.go:116 +0xa5
golang.org/x/sync/errgroup.(*Group).Wait(0xc000112f80)
	/home/runner/go/pkg/mod/golang.org/x/sync@v0.8.0/errgroup/errgroup.go:56 +0x38
github.com/riverqueue/river/internal/jobcompleter.(*AsyncCompleter).Start.func1()
	/home/runner/work/river/river/internal/jobcompleter/job_completer.go:200 +0x16a
created by github.com/riverqueue/river/internal/jobcompleter.(*AsyncCompleter).Start in goroutine 33
	/home/runner/work/river/river/internal/jobcompleter/job_completer.go:193 +0x1c5

goroutine 33 [chan receive, 1 minutes]:
github.com/riverqueue/river/rivershared/startstop.(*BaseStartStop).Stop(0xc0001b04b0)
	/home/runner/work/river/river/rivershared/startstop/start_stop.go:165 +0x3a
testing.(*common).Cleanup.func1()
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1175 +0x17a
testing.(*common).runCleanup(0xc000179860, 0x0)
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:[13](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:14)53 +0x262
testing.tRunner.func2()
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1683 +0x51
testing.tRunner(0xc000179860, 0xc000123c80)
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1695 +0x25e
created by testing.(*T).Run in goroutine 34
	/opt/hostedtoolcache/go/1.22.7/x64/src/testing/testing.go:1742 +0x826

goroutine 380 [select]:
github.com/jackc/pgx/v5/pgxpool.(*Pool).backgroundHealthCheck(0xc000000180)
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:411 +0x1d3
github.com/jackc/pgx/v5/pgxpool.NewWithConfig.func3()
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:275 +0x68
created by github.com/jackc/pgx/v5/pgxpool.NewWithConfig in goroutine 6
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:273 +0xa65

goroutine 10797 [select]:
github.com/jackc/pgx/v5/pgxpool.(*Pool).backgroundHealthCheck(0xc000000a80)
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:411 +0x1d3
github.com/jackc/pgx/v5/pgxpool.NewWithConfig.func3()
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:275 +0x68
created by github.com/jackc/pgx/v5/pgxpool.NewWithConfig in goroutine 31
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:273 +0xa65

goroutine 9247 [select]:
github.com/jackc/pgx/v5/pgxpool.(*Pool).backgroundHealthCheck(0xc0000009c0)
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:411 +0x1d3
github.com/jackc/pgx/v5/pgxpool.NewWithConfig.func3()
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:275 +0x68
created by github.com/jackc/pgx/v5/pgxpool.NewWithConfig in goroutine 69
	/home/runner/go/pkg/mod/github.com/jackc/pgx/v5@v5.7.1/pgxpool/pool.go:273 +0xa65
FAIL	github.com/riverqueue/river/internal/jobcompleter	120.026s
?   	github.com/riverqueue/river/internal/jobstats	[no test files]
ok  	github.com/riverqueue/river/internal/leadership	1.386s
ok  	github.com/riverqueue/river/internal/maintenance	4.811s
ok  	github.com/riverqueue/river/internal/notifier	1.736s
ok  	github.com/riverqueue/river/internal/notifylimiter	1.210s
?   	github.com/riverqueue/river/internal/rivercommon	[no test files]
ok  	github.com/riverqueue/river/internal/riverinternaltest	1.069s
?   	github.com/riverqueue/river/internal/riverinternaltest/riverdrivertest	[no test files]
ok  	github.com/riverqueue/river/internal/riverinternaltest/sharedtx	1.087s
ok  	github.com/riverqueue/river/internal/testdb	(cached)
ok  	github.com/riverqueue/river/internal/util/chanutil	1.261s
ok  	github.com/riverqueue/river/internal/util/dbutil	1.1[14](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:15)s
ok  	github.com/riverqueue/river/internal/util/hashutil	(cached)
?   	github.com/riverqueue/river/internal/workunit	[no test files]
ok  	github.com/riverqueue/river/rivermigrate	2.660s
ok  	github.com/riverqueue/river/rivertest	1.3[20](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:21)s
FAIL)

In particular, this bit jumped out:

goroutine 551 [chan send, 1 minutes]:
github.com/riverqueue/river/internal/jobcompleter.(*AsyncCompleter).JobSetStateIfRunning.func1()
	/home/runner/work/river/river/internal/jobcompleter/job_completer.go:172 +0x26e
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/home/runner/go/pkg/mod/golang.org/x/sync@v0.8.0/errgroup/errgroup.go:78 +0x92
created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 363
	/home/runner/go/pkg/mod/golang.org/x/sync@v0.8.0/errgroup/errgroup.go:75 +0x[12](https://github.com/riverqueue/river/actions/runs/10930264275/job/30342796640#step:7:13)5

That line indicates an issue with shutdown ordering, because a completed job is trying to send notice to a subscription channel nobody is listening to, which blocks the shutdown.

@bgentry bgentry added the bug Something isn't working label Sep 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

1 participant