Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[DigitalOcean] droplet integration #3832

Open
wants to merge 65 commits into
base: master
Choose a base branch
from

Conversation

asaiacai
Copy link
Contributor

@asaiacai asaiacai commented Aug 14, 2024

This adds digital ocean droplets to the sky.

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • All smoke tests: pytest tests/test_smoke.py --do

@moinnadeem
Copy link

@asaiacai I would be really interested in using this work -- do you know when you might be able to land this PR?

@asaiacai
Copy link
Contributor Author

asaiacai commented Aug 31, 2024 via email

@asaiacai asaiacai marked this pull request as ready for review September 11, 2024 00:07
@asaiacai
Copy link
Contributor Author

@Michaelvll i think this PR is finally ready, once you have a free moment! I've currently disabled tests for gpu droplets as they are still in early access, but I think I got the remaining tests to work on normal CPU droplets fine. Let me know if you find any issues.

@asaiacai
Copy link
Contributor Author

also this catalog update is required to pass the sky serve requests since I added cheaper 2vcpu instances.

skypilot-org/skypilot-catalog#83

Copy link
Collaborator

@cblmemo cblmemo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this amazing work @asaiacai ! This would be very helpful. Left some discussions ;)

docs/source/getting-started/installation.rst Outdated Show resolved Hide resolved
sky/adaptors/do.py Show resolved Hide resolved
sky/clouds/do.py Outdated Show resolved Hide resolved
sky/clouds/do.py Outdated Show resolved Hide resolved
sky/clouds/do.py Outdated Show resolved Hide resolved
sky/templates/do-ray.yml.j2 Show resolved Hide resolved
tests/test_smoke.py Show resolved Hide resolved
tests/skyserve/update/bump_version_after.yaml Outdated Show resolved Hide resolved
sky/clouds/do.py Outdated
Comment on lines 36 to 38
clouds.CloudImplementationFeatures.DOCKER_IMAGE:
'Docker container images as runtime environments'
f' are not supported in {_REPR}. Try using in `run`',
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why don't we support it? I saw relevant code snippet in the ray yaml file.

Copy link
Contributor Author

@asaiacai asaiacai Sep 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I included a startup script that installs docker, but ssh comes up in parallel before the startup script finishes running resulting in the docker environment setup failing. Docker does eventually get setup on the machine. If I for example, reattempt to launch on the same cluster after waiting for like 10 seconds, I can successfully use the docker environment, but the times i tested it never worked on freshly provisioned instances. Let me know what would make sense here.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to use a base image with docker installed? Or, can we wait for the docker to finish setup for DO cloud?

def _check_docker_installed(self):
no_exist = 'NoExist'
cleaned_output = self._run(
f'command -v {self.docker_cmd} || echo {no_exist!r}')
if no_exist in cleaned_output or 'docker' not in cleaned_output:
logger.error(
f'{self.docker_cmd.capitalize()} not installed. Please use an '
f'image with {self.docker_cmd.capitalize()} installed.')

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i added waiting for docker to _check_docker_installed but let me know if this should only be specific to DO

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern is that it will take too long to error out for an image without docker installation. Is it possible that we somehow detect if the docker installation is in progress, and immediately return if we found no installation?

sky/clouds/do.py Show resolved Hide resolved
asaiacai and others added 2 commits September 17, 2024 12:45
Co-authored-by: Tian Xia <cblmemo@gmail.com>
Co-authored-by: Tian Xia <cblmemo@gmail.com>
@asaiacai asaiacai requested a review from cblmemo October 27, 2024 23:35
@asaiacai
Copy link
Contributor Author

@cblmemo reran smoke, but if you have a moment to review again, thanks in advance!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants