Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Recover from panics that happen during fetch #39

Open
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

edsrzf
Copy link
Contributor

@edsrzf edsrzf commented Aug 11, 2019

Now a panic inside fetch doesn't take down the whole process.

This also adds a new optional Recover field to Config structs to allow control over transforming recovered values into errors. When it's nil, fmt.Errorf("%v", v) is used.

Fixes #31.

This also adds a new optional Recover field to Config structs, to allow
control over transforming recovered values into errors. When it's nil,
fmt.Errorf("%v", v) is used.
@edsrzf
Copy link
Contributor Author

edsrzf commented Aug 11, 2019

I'm not sure why CI is failing, but I can't reproduce it locally and can't see how it would be related to this change.

Copy link

@gracenoah gracenoah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test failure is random. It will pass if you re-run the test. It would be good to add a test for this PR though. 1 for no handler and 1 for custom panic handler.

$ go test ./... -test.count 100
?       github.com/vektah/dataloaden    [no test files]
ok      github.com/vektah/dataloaden/example    4.148s
?       github.com/vektah/dataloaden/example/pkgname    [no test files]
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=20) %!s(int=4)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
--- FAIL: TestUserLoader (0.04s)
    --- FAIL: TestUserLoader/it_sent_two_batches (0.00s)
        Error Trace:    usersliceloader_test.go:90
        Error:          "[%!s(int=5) %!s(int=50)]" should have 3 item(s), but has 2
        Test:           TestUserLoader/it_sent_two_batches
FAIL
FAIL    github.com/vektah/dataloaden/example/slice      4.155s
ok      github.com/vektah/dataloaden/pkg/generator      4.264s

Fetch func(keys []{{.KeyType.String}}) ([]{{.ValType.String}}, []error)

// Wait is how long wait before sending a batch
Wait time.Duration

// MaxBatch will limit the maximum number of keys to send in one batch, 0 = not limit
MaxBatch int

// Recover is a function to transform a recovered value into an error.
// If a function is not supplied, the value is formatted with fmt.Errorf("%v", v).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should specify that if a function is provided, it must not return nil.

@@ -12,6 +12,7 @@ var tpl = template.Must(template.New("generated").
package {{.Package}}

import (
"fmt"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like the rest of the imports use 4 spaces rather than tabs. It's not great, but let's stay consistent for now?

@edsrzf
Copy link
Contributor Author

edsrzf commented Aug 12, 2019

Added tests and hey, whaddya know, there was a bug.

@gracenoah
Copy link

I see one reason these tests are so flaky: they are abusing go's test parallelism. Depending on the number of cores (and therefore default test parallelism), different behaviours will happen. When a test calls t.Parallel(), it blocks and is resumed later. Running the tests with -parallel 2 reproduces the CI failures. I suspect that circle CI used to run on a machine with more cores. For now, try adding -parallel 4 to all go test commands in https://github.com/vektah/dataloaden/blob/master/.circleci/config.yml#L10 . Hopefully that will make the tests pass at least sometimes.

@Clark-zhang
Copy link

Hello, guys.
If you are going to use this patch, please notice to change close(b.done) line.

func (b *{{.Name|lcFirst}}Batch) end(l *{{.Name}}) {
        defer func() {
                if r := recover(); r != nil {
                        if l.recover != nil {
                                b.error = []error{l.recover(r)}
                        } else {
                                b.error = []error{fmt.Errorf("%v", r)}
                        }
                }
                 //causing panic: close of closed channel. this line should inside if, as the channel will be closed in function end.
                close(b.done)
        }()
        b.data, b.error = l.fetch(b.keys)
        close(b.done)
}

//working version
func (b *{{.Name|lcFirst}}Batch) end(l *{{.Name}}) {
        defer func() {
                if r := recover(); r != nil {
                        if l.recover != nil {
                                b.error = []error{l.recover(r)}
                        } else {
                                b.error = []error{fmt.Errorf("%v", r)}
                        }
                      close(b.done)
                }
        }()
        b.data, b.error = l.fetch(b.keys)
        close(b.done)
}

b.error = []error{fmt.Errorf("%v", r)}
}
}
close(b.done)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this line will cause "panic: close of closed channel", should inside the if-recover case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you tried this code and run into it? The channel needs to always be closed when the function exits. Otherwise Load calls will never return.

We've since forked this code and are running panic-handling almost identical to this in production with no issues.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, i have tried the patch, but I got "panic: close of closed channel".
From my opinion, it's because if l.fetch(b.keys) successfully execute, the channel will be closed, but inside defer func, we close it again. So, the second time we close it, we have to make sure there was panic, means inside r := recover(); && r != nil.

@gpopovic
Copy link

gpopovic commented May 2, 2020

Any news on this one?

@Maleandr
Copy link

Can it be merged?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Panics inside fetch are not recovered
5 participants