add symmetric flag to PowerMethod #1470
closes #1468 Signed-off-by: Edoardo Pasca <edo.paskino@gmail.com>
add as unit test a check with a blurring operator, which should evaluate the same in case
Currently unit tests fail on https://github.com/TomographicImaging/CIL/actions/runs/4736725424/jobs/8408634645?pr=1470#step:4:1570
I'm looking into this.
Thanks for adding this - this is an important fix and I believe it is correctly implemented. My only comment is that currently two terms are used "symmetric" and "diagonalisable". It is unclear if we mean the same by these. For example, in the old version "symmetric" was checked by whether the range and domain are the same, which is not sufficient. For clarity, I recommend replacing "symmetric" by "diagonalisable".
I am not sure that this is correct. Especially when we leave the default to be False. In the case where A is not normal, has eigenvalues 1 and 2 and is diagonalisable but not normal. In this case has eigenvalues |
I have done two things: changed symmetric to square. If the matrix is not square then we find the absolute value of the largest eigenvalue of $A^TA$. There are cases where the power method won't converge, and I highlighted this in a warning that is raised if we reach the maximum number of iterations without converging. N.B. we care about diagonalisability because, without it, the power method may not converge. We care about a matrix being symmetric because this means the power method converges more quickly, but it doesn't actually require any change in the code for this speed-up! I think the confusion here and in the initial issue was between symmetric (the matrix contents) and square (the matrix size). The question is now whether it deals with the issue: #1468
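To make the symmetric/square distinction concrete, here is a small NumPy check. The matrix is a made-up example (it is not taken from the PR or the issue): it is square and diagonalisable with eigenvalues 1 and 2, but it is not symmetric, and its largest eigenvalue is not its operator norm.

```python
import numpy as np

# Hypothetical example matrix: upper triangular, so the eigenvalues 1 and 2
# are the diagonal entries. It is square and diagonalisable but not normal.
A = np.array([[1.0, 1.0],
              [0.0, 2.0]])

# Largest eigenvalue in absolute value (what a direct power iteration targets).
largest_eig = np.max(np.abs(np.linalg.eigvals(A)))

# Operator 2-norm: square root of the largest eigenvalue of A^T A.
spectral_norm = np.sqrt(np.max(np.linalg.eigvalsh(A.T @ A)))

print("largest |eigenvalue|:", largest_eig)
print("operator 2-norm:     ", spectral_norm)
```

For a symmetric matrix the two numbers coincide, so the gap only shows up in the square, non-symmetric case.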
To deal with issue #1468, specifically the matrix A = {{0, 1}, {0, 0}}, the power method now raises an error if the matrix is nilpotent, i.e. the largest eigenvalue is zero. The next question is whether this is the behaviour we want, or whether we want the power method to output zero in this case.
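For reference, a minimal NumPy sketch of what happens to a direct power iteration on this matrix. This is plain NumPy, not CIL's implementation, and the starting vector is an arbitrary choice.

```python
import numpy as np

A = np.array([[0.0, 1.0],
              [0.0, 0.0]])   # nilpotent: A @ A is the zero matrix

x = np.array([1.0, 1.0])     # arbitrary non-zero starting vector (assumption)
estimate = None
for _ in range(10):
    x = A @ (x / np.linalg.norm(x))
    estimate = np.linalg.norm(x)
    if estimate == 0.0:
        # The iterate has collapsed to the zero vector, so it cannot be
        # normalised again; stop here instead of dividing by zero.
        break

print("direct power iteration estimate:", estimate)
print("eigenvalues:", np.linalg.eigvals(A))
print("operator 2-norm:", np.linalg.norm(A, 2))
```

The iterate hits the zero vector after two steps, which is why the choice is between raising an error and returning zero. Note that the operator 2-norm of this matrix is 1, not 0, so the eigenvalue and the norm are different quantities here.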
Thank you @MargaretDuff for looking into this so carefully. I think you are right about the source of confusion and the suggestion to keep using the matrix itself in case of a square matrix. On the case of nilpotent matrices - I note that for example matlab eig and eigs happily return zero(s). I could see us moving ahead with returning zero, which seems to me more useful if the user just wants to know what the largest eigenvalue is. Then let for example FISTA deal with checking whether the value is zero, i.e. whether it can be used for a step size calculation. I think the confusion around symmetric possibly also had to do with ensuring a real-valued largest eigenvalue. How do we handle it if this is not the case? For example, what would happen now if we have a real-valued, square but non-symmetric matrix like, in matlab notation, A = [[-1,2], [-3,-4]], which has complex eigenvalues -2.5 +/- 1.9365i? Does the power method converge to anything? To one of these complex-valued eigenvalues, and which one, since their magnitude is the same?
Yep, this would deal with Kris' issue but we should check if it breaks anything else.
I think if the matrix has a unique largest eigenvalue then we should be fine for convergence. If there is not a unique largest eigenvalue but the matrix is Hermitian then we should also be fine converging to the magnitude of the largest eigenvalue, but the eigenvector won't converge (we don't use that so that's fine). In your example, it doesn't converge, I think because the eigenvectors aren't orthogonal and don't span the space of 2D complex vectors. The case of A = [[-1,2], [-3,-4]] doesn't converge correctly and is worrying. Maybe a discussion about how to check convergence?
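The behaviour on A = [[-1,2], [-3,-4]] can be reproduced with a few lines of NumPy. This is a sketch of a plain direct power iteration, not CIL's code; the starting vector and the iteration count are arbitrary assumptions.

```python
import numpy as np

A = np.array([[-1.0, 2.0],
              [-3.0, -4.0]])

# Complex-conjugate pair -2.5 +/- 1.9365i; both have modulus sqrt(10).
eigvals = np.linalg.eigvals(A)
moduli = np.abs(eigvals)

# Direct power iteration with a real starting vector (arbitrary choice).
x = np.array([1.0, 0.0])
direct = []
for _ in range(200):
    x = A @ (x / np.linalg.norm(x))
    direct.append(np.linalg.norm(x))

# How much the estimate still moves over the last 50 iterations.
spread = max(direct[-50:]) - min(direct[-50:])

print("eigenvalue moduli:", moduli)
print("spread of late estimates:", spread)
print("operator 2-norm:", np.linalg.norm(A, 2))
```

The late estimates keep moving instead of settling on a single value, whereas the operator 2-norm of this matrix is still a well-defined real number.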
I haven't looked enough into how the power method behaves in such a case. I wonder if this is beyond the present issue and could be dealt with - for now - by creating a separate issue to keep track of it, and simply throwing an error or warning in case it does not converge within a specified number of iterations? We could consider allowing the user to pass a "force" flag or similar if they know what they are doing and definitely want to get whatever value the method has reached in the specified number of iterations. Should we (do we already?) also allow the user to specify the number of iterations and/or tolerance?
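One possible shape for this, sketched in plain NumPy. The function name, the `force` flag and the default values are all hypothetical and only illustrate the suggestion; none of them is CIL's API.

```python
import warnings
import numpy as np

def checked_power_iteration(A, max_iteration=100, tolerance=1e-5, force=False):
    """Hypothetical helper: direct power iteration with a convergence check.

    Returns (estimate, converged). If the tolerance is not reached within
    max_iteration steps it raises, unless force=True, in which case it warns
    and returns the last estimate.
    """
    x = np.ones(A.shape[1])          # arbitrary starting vector (assumption)
    estimate = float("nan")
    previous = np.inf
    for _ in range(max_iteration):
        x = A @ (x / np.linalg.norm(x))
        estimate = np.linalg.norm(x)
        if abs(estimate - previous) < tolerance:
            return estimate, True
        previous = estimate
    message = f"tolerance {tolerance} not reached in {max_iteration} iterations"
    if not force:
        raise RuntimeError(message)
    warnings.warn(message)
    return estimate, False

# Usage: a symmetric matrix converges, so no error or warning is triggered.
value, converged = checked_power_iteration(np.diag([2.0, 1.0]))
print(value, converged)
```

With `force=True` the caller still gets a number back, but the `converged` flag and the warning make it explicit that the tolerance was not met.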
After a chat with @paskino I have:
Open for a chat about what to name things
Resume: If the operator If |
With @gfardell, @paskino and @jakobsj we discussed:
Signed-off-by: Margaret Duff <43645617+MargaretDuff@users.noreply.github.com>
@gfardell and I just had a chat about the power method. We decided that, instead of a flag with options True, False and None, we would use a flag called "method" with options "auto", "direct_only" and "composed_with_adjoint". Any thoughts on these names?
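To help judge the names, here is a rough NumPy sketch of how the three options could dispatch. It is an illustration only and not the CIL implementation: it works on dense arrays, the function name is hypothetical, and "auto" uses squareness as a stand-in for the check that range and domain are equal.

```python
import numpy as np

def power_method_sketch(A, method="auto", max_iteration=100):
    """Hypothetical dense-matrix sketch of the proposed `method` options."""
    if method not in ("auto", "direct_only", "composed_with_adjoint"):
        raise ValueError(f"unknown method: {method!r}")
    if method == "auto":
        # Assumption: for a plain array, "range equals domain" is
        # approximated by the matrix being square.
        square = A.shape[0] == A.shape[1]
        method = "direct_only" if square else "composed_with_adjoint"

    x = np.ones(A.shape[1])          # arbitrary starting vector (assumption)
    for _ in range(max_iteration):
        u = x / np.linalg.norm(x)
        if method == "direct_only":
            x = A @ u                # x_{k+1} = A (x_k / ||x_k||)
        else:
            x = A.T @ (A @ u)        # x_{k+1} = A^T A (x_k / ||x_k||)

    norm_x = np.linalg.norm(x)
    # The composed variant iterates on A^T A, so take the square root.
    return norm_x if method == "direct_only" else np.sqrt(norm_x)

# Usage: for a symmetric matrix both explicit options agree.
S = np.array([[2.0, 1.0],
              [1.0, 3.0]])
print(power_method_sketch(S, method="direct_only"))
print(power_method_sketch(S, method="composed_with_adjoint"))
```

A rectangular array passed with `method="auto"` falls through to the composed variant in this sketch, since the direct iteration is not defined when the input and output sizes differ.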
I've suggested some logic and parameter changes. You also need to check in the file with the LinearOperator::Calculate_norm update.
Signed-off-by: Gemma Fardell <47746591+gfardell@users.noreply.github.com>
…g/CIL into power_method_symmetric
Describe your changes
This pull request introduces a new argument to the power method, called "method".
If `method="direct_only"` then the power method computes $x_{k+1} = A (x_k / \|x_k\|)$ and then $\|A\| = \|x_k\|$.
If `method="composed_with_adjoint"` then the power method computes $x_{k+1} = A^TA(x_k/\|x_k\|)$ and $\|A\| = \sqrt{\|x_k\|}$, i.e. it applies the power method to $A^TA$ before taking the square root.
If `method="auto"` then the code checks whether the operator range and domain are equal: if they are, it uses the "direct_only" method; if they aren't, or if no range and domain geometries are defined, it uses the "composed_with_adjoint" method.
The LinearOperator::Calculate_norm method has been updated to use the "composed_with_adjoint" method to match the behaviour of norm with e.g. Matlab.
A `check_convergence` flag has been added in the case `return_all=True` to check if the tolerance has been reached at the end of the iterations. This is only relevant in the case that a user would use the power method directly with `return_all=True`.
Link relevant issues
closes #1468
Checklist when you are ready to request a review
Contribution Notes
Please read and adhere to the developer guide and local patterns and conventions.