Configure the GPU Server #5

Open
dmccreary opened this issue Jul 19, 2019 · 4 comments

@dmccreary
Collaborator

Our sites may not have good Internet access, so we need to be able to set up and configure a local GPU cluster to train our models.

We want someone to document how to set up a server with GPUs and then let students train their models on it.

@dmccreary added the "help wanted" label Jul 19, 2019
@parkererickson
Collaborator

Jon and I were talking about this briefly. I'm not sure if the hardware would have enough throughput, but if we had like 8-10 Docker containers up with TensorFlow and Jupyter Notebooks on them, we could have everybody access their own container and go from there.
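
One way to sketch the per-student container idea is a small Python script that prints one `docker run` command per student. This is only an illustration under assumptions that are not from this thread: the `tensorflow/tensorflow:latest-gpu-jupyter` image, the `--gpus all` flag (which needs Docker 19.03 or later plus the NVIDIA container toolkit), the `studentN` container names, and the host port range starting at 8889 are all placeholders.

```python
def docker_run_command(index, base_port=8888,
                       image="tensorflow/tensorflow:latest-gpu-jupyter"):
    """Build a `docker run` command for one student's Jupyter container.

    The image name, container name, and port layout are assumptions.
    """
    host_port = base_port + index  # each student gets a distinct host port
    return [
        "docker", "run", "-d",
        "--gpus", "all",               # expose every GPU to the container
        "--name", f"student{index}",   # hypothetical naming scheme
        "-p", f"{host_port}:8888",     # Jupyter listens on 8888 inside
        image,
    ]


def all_commands(count=10):
    """Return one command per student, numbered from 1."""
    return [docker_run_command(i) for i in range(1, count + 1)]


if __name__ == "__main__":
    for cmd in all_commands():
        print(" ".join(cmd))
```

Each student would then open their own port in a browser. Whether two GPUs can serve 8-10 simultaneous training jobs is exactly the throughput question raised above, and this sketch does not answer it.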

@dmccreary
Collaborator Author

We now have a loaner server that we need to set up and configure. It has two large Nvidia GPUs on it.

We need someone to get TensorFlow and the DonkeyCar software running on it and configure it to allow people to load their images and run the training program. They then need to be able to transfer their files to their car.

See the documentation page here: http://docs.donkeycar.com/guide/train_autopilot/
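
The load, train, and transfer cycle described above could look roughly like the following, assuming the commands are run on the GPU server and the server can reach the car over the local network. The `~/mycar` layout, the `mypilot.h5` model name, and the `pi@192.168.1.50` address are hypothetical placeholders. The `manage.py train` form follows the linked DonkeyCar guide as best I recall it, so check it against the installed version.

```python
def training_workflow(car_host, car_dir="~/mycar", model_name="mypilot.h5"):
    """Return the shell commands for one train-and-deploy cycle.

    Meant to be pasted into a shell on the GPU server. `car_host` is a
    placeholder such as "pi@192.168.1.50".
    """
    rsync = "rsync -rv --progress --partial"
    return [
        # 1. Copy the recorded images (tub data) from the car to the server.
        f"{rsync} {car_host}:{car_dir}/data/ {car_dir}/data/",
        # 2. Train a model on the copied data.
        f"python {car_dir}/manage.py train --tub {car_dir}/data "
        f"--model {car_dir}/models/{model_name}",
        # 3. Copy the trained model back to the car.
        f"{rsync} {car_dir}/models/ {car_host}:{car_dir}/models/",
    ]


if __name__ == "__main__":
    for command in training_workflow("pi@192.168.1.50"):
        print(command)
```

The function only builds strings, so nothing is copied or trained until someone runs the printed commands by hand.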

@dmccreary changed the title from "Create a Strategy for Local GPU Training Servers" to "Configure the GPU Server" on Aug 27, 2019
@dmccreary
Collaborator Author

dmccreary commented Sep 11, 2019

Neal Kelly and I got our GPU server working last weekend. The good news is we trained some models with 10K images in under 5 minutes!! That server really rocks!

We installed Python, TensorFlow, and Jupyter Notebooks, and yesterday I finally got the SSH system working. Next, I will set up 10 accounts (one for each car) called arl1, arl2, arl3, and so on, and assign each car to one account. They can then SSH in and train their models. We still need to figure out what type of virtual environment will work.
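
A minimal sketch of the account setup, assuming a Linux server with the standard `useradd` tool and that the names run from arl1 through arl10. The script only prints the commands so they can be reviewed before anyone runs them. The `/bin/bash` login shell is an assumption, and passwords or SSH keys would still need to be set separately.

```python
def account_names(count=10, prefix="arl"):
    """Return account names such as arl1 ... arl10, one per car."""
    return [f"{prefix}{n}" for n in range(1, count + 1)]


def useradd_commands(names, shell="/bin/bash"):
    """Build one `useradd` command per account.

    -m creates the home directory and -s sets the login shell.
    """
    return [["sudo", "useradd", "-m", "-s", shell, name] for name in names]


if __name__ == "__main__":
    for cmd in useradd_commands(account_names()):
        print(" ".join(cmd))
```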

@dmccreary self-assigned this Sep 11, 2019
@parkererickson
Collaborator

Very cool! I usually just don't use a virtual environment (irresponsible, I know), but DonkeyCar uses Miniconda. Here are their install instructions: http://docs.donkeycar.com/guide/host_pc/setup_ubuntu/
