Unable to connect via ssh despite --ssh flag #233

Open
sansarsecondary opened this issue Jan 4, 2022 · 2 comments

Comments

@sansarsecondary

All my job submissions have the --ssh flag enabled. I am unable to SSH into the dsub cloud worker machine via either the web GUI or other gcloud-supported means.

Task log files contain this:
2022/01/04 10:18:43 Failed to handle connection: handshake: ssh: disconnect, reason 11: Bye Bye

dsub ... --provider google-cls-v2 --ssh

Any thoughts on why this may be occurring? I have tried giving the machine a public IP, but that does not make a difference. The job itself executes just fine.

Instances in the project that were not launched by dsub have no trouble connecting via SSH.

@wnojopra
Contributor

wnojopra commented Jan 4, 2022

Hi @sansarsecondary! I haven't seen that error in dsub before. Could you help me debug with the following:

  1. Could you please run dstat with the --full option on a job where you could not ssh into the machine, and look for any messages related to SSH? I'm looking to see if there are any errors or warnings popping up here. Notable places to look would be in the events field and the status-detail.
  2. Could you describe in a bit more detail what happens when you try to SSH via the web GUI? What errors show up?
  3. How long after job start does the Failed to handle connection error message show up in the log?
  4. You mention that all of your job submissions have the --ssh flag enabled. Around how many jobs is this? Admittedly I typically test --ssh with no more than a few jobs at once.
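
For step 1, a minimal sketch of how the dstat output could be scanned for SSH-related messages. It assumes the `dstat --full` output has been saved as JSON and parsed into Python objects; the exact record layout is not confirmed here, so the search simply walks every nested field (including `events` and `status-detail`). The function name and the sample record are hypothetical.

```python
import json


def find_ssh_mentions(node, path=""):
    """Recursively collect (path, text) pairs for every string containing 'ssh'."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            hits.extend(find_ssh_mentions(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            hits.extend(find_ssh_mentions(value, f"{path}[{index}]"))
    elif isinstance(node, str) and "ssh" in node.lower():
        hits.append((path, node))
    return hits


# Hypothetical sample standing in for saved dstat output; real records will differ.
sample_json = (
    '[{"job-id": "example-job", '
    '"status-detail": "ssh: example message", '
    '"events": [{"name": "start"}, {"name": "ssh example event"}]}]'
)

for location, text in find_ssh_mentions(json.loads(sample_json)):
    print(f"{location}: {text}")
```

Because the search is case-insensitive and field-agnostic, it should still surface SSH messages if they appear somewhere other than the two fields named above.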

@rivershah

I found the reason for the error. The firewall rules for the project were corrupted, and SSH traffic was being blocked. In case another user runs into the same issue, please ensure that SSH ingress traffic is enabled.
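
A rough way to check whether SSH ingress is being blocked is to probe TCP port 22 on the worker from the machine you connect from. This is only a sketch: the function name is hypothetical, port 22 is assumed to be the SSH port in use, and the address in the usage comment is a placeholder.

```python
import socket


def ssh_port_reachable(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example usage (placeholder address, replace with the worker's IP):
# print(ssh_port_reachable("203.0.113.10"))
```

A `False` result here is consistent with a firewall dropping the traffic, though it does not prove it; a `True` result only shows that the port accepts connections, not that SSH authentication will succeed. The project's current rules can be inspected with `gcloud compute firewall-rules list`.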
