How to Run a Jupyter Notebook inside a PyTorch Container for Accelerated Machine Learning Using the E2E GPU Wizard

Launching an Instance Based on the PyTorch Container with Jupyter Notebook Running

Using the myaccount portal:

  1. Sign in to the myaccount portal.

  2. Navigate to GPU Wizard

    1. Under the Compute menu, click GPU.

    2. Then click GPU Wizard.

    3. For the NGC Container PyTorch, click Next under the Actions column.

    4. Choose a GPU card according to your requirements; the A100 is recommended.

      ../../_images/n1.png
    5. Choose a plan according to your requirements.

    6. Optionally, add an SSH key (recommended) or subscribe to CDP backup.

    7. Click “Create my node”.

    8. Wait a few minutes and confirm that the node is in the running state.

../../_images/n2.png
  1. Open a terminal on your local PC and run the following command:

ssh -NL localhost:1234:localhost:8888 root@<your_node_ip>
../../_images/n3.png
  1. The command normally prints no output, which indicates that the tunnel was established without errors. Leave it running while you use the notebook.

  2. Open a web browser on your local PC and go to http://localhost:1234

../../_images/n4.png
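If local port 1234 is already in use on your PC, the same tunnel can be opened on another local port. The following is a minimal sketch, not part of the original setup: build_tunnel_cmd is a hypothetical helper, and 203.0.113.10 is a documentation placeholder standing in for your node IP. It only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Sketch: build the tunnel command from this guide with a configurable local port.
# build_tunnel_cmd is a hypothetical helper; 203.0.113.10 is a placeholder IP.

build_tunnel_cmd() {
    # $1 = node IP, $2 = local port (defaults to 1234, as used in this guide)
    printf 'ssh -NL localhost:%s:localhost:8888 root@%s\n' "${2:-1234}" "$1"
}

# Default local port, as in the guide
build_tunnel_cmd 203.0.113.10

# Alternative local port if 1234 is already taken on your PC
build_tunnel_cmd 203.0.113.10 5678
```

With the second form, the notebook would be reached at http://localhost:5678 instead of http://localhost:1234. The remote side stays on port 8888, where the notebook listens inside the node.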
  1. Congratulations! You can now run your Python code inside this Jupyter notebook, which comes with PyTorch and the libraries most frequently used in machine learning preconfigured.

  2. To get the most out of GPU acceleration, use RAPIDS and DALI, which are already installed inside this container.

  3. RAPIDS and DALI accelerate machine learning tasks beyond training itself, such as data loading and preprocessing.
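Before relying on GPU acceleration, it can help to confirm that a GPU is actually visible. The sketch below assumes the nvidia-smi tool is available where you run it (for example, in an SSH session on the node); gpu_summary is a hypothetical helper name, and the fallback messages are illustrative.

```shell
#!/bin/sh
# Sketch: print the name and total memory of each visible GPU.
# gpu_summary is a hypothetical helper, not something shipped with the container.

gpu_summary() {
    if command -v nvidia-smi >/dev/null 2>&1; then
        # Query only the fields of interest, one CSV row per GPU
        nvidia-smi --query-gpu=name,memory.total --format=csv,noheader 2>/dev/null \
            || echo "nvidia-smi failed; check the GPU driver"
    else
        echo "nvidia-smi not found"
    fi
}

gpu_summary
```

If the output lists the card you selected in the wizard, the GPU is visible in that environment.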

How to Run the FashionMNIST Image Recognition PyTorch Notebook

  1. Download the notebook to your local PC from the following link: https://pytorch.org/tutorials/_downloads/af0caf6d7af0dda755f4c9d7af9ccc2c/quickstart_tutorial.ipynb

  2. Upload the notebook to your Jupyter Notebook environment.

  3. Choose Run All from the Cell menu.

../../_images/n5.png
  1. You can observe the complete machine learning workflow and an accuracy of approximately 63% after 5 epochs of training.

../../_images/n6.png

Troubleshooting steps if the notebook isn’t accessible

  1. SSH into your node using its public IP: ssh username@<public-ip>

../../_images/n7.png
  1. Check the output of the docker ps command.

../../_images/n8.png
  1. If it shows a running container that uses the PyTorch image, run the following command to start Jupyter Notebook inside it:

# Find the ID of the first running container whose row mentions "pytorch"
cnid=$(docker ps | grep pytorch | head -1 | awk '{print $1}')
# Start Jupyter Notebook inside that container, in the background
/usr/bin/nohup /usr/bin/docker exec "$cnid" \
  /usr/local/bin/jupyter-notebook --no-browser --port=8888 --allow-root \
  --NotebookApp.token='' --NotebookApp.password='' \
  --NotebookApp.quit_button=False \
  --NotebookApp.notebook_dir=/home/pytorch &
../../_images/n9.png
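The container lookup in the command above can be tried on its own before anything is executed. This is a minimal sketch on made-up sample rows: the container IDs and the image tag "xx" are placeholders, and first_pytorch_container is a hypothetical helper wrapping the same grep, head, and awk pipeline. It also adds a guard for the case where no matching container is found, which the original command does not have.

```shell
#!/bin/sh
# Sketch: pick the ID of the first container whose `docker ps` row mentions "pytorch".
# first_pytorch_container is a hypothetical helper; it reads `docker ps` output on stdin.

first_pytorch_container() {
    grep pytorch | head -1 | awk '{print $1}'
}

# Sample rows are made up for illustration; real IDs and image tags will differ.
sample='CONTAINER ID   IMAGE                      COMMAND   STATUS
a1b2c3d4e5f6   nvcr.io/nvidia/pytorch:xx  "bash"    Up 2 hours
0f9e8d7c6b5a   redis:7                    "redis"   Up 2 hours'

cnid=$(printf '%s\n' "$sample" | first_pytorch_container)

if [ -z "$cnid" ]; then
    # Without this guard, docker exec would be called with an empty container ID
    echo "no running PyTorch container found" >&2
else
    echo "would start jupyter-notebook in container $cnid"
fi
```

On the node, the same helper would be fed real data with docker ps | first_pytorch_container. An empty result means no PyTorch container is running and it has to be started first.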
  1. If docker ps doesn’t list any running container, start the container first.

    1. Run docker ps -a to list all containers, including stopped ones.

../../_images/n10.png
  1. Copy the container ID and run docker start <container_id>

../../_images/n11.png
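  1. Run the following command to start Jupyter Notebook inside the container:

# Find the ID of the first running container whose row mentions "pytorch"
cnid=$(docker ps | grep pytorch | head -1 | awk '{print $1}')
# Start Jupyter Notebook inside that container, in the background
/usr/bin/nohup /usr/bin/docker exec "$cnid" \
  /usr/local/bin/jupyter-notebook --no-browser --port=8888 --allow-root \
  --NotebookApp.token='' --NotebookApp.password='' \
  --NotebookApp.quit_button=False \
  --NotebookApp.notebook_dir=/home/pytorch &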
../../_images/n12.png
  1. Create the SSH tunnel again on your local PC with ssh -NL localhost:1234:localhost:8888 root@<your_node_ip>, then try to access the Jupyter notebook at http://localhost:1234

../../_images/n13.png

Why doesn’t my notebook have a password?

The notebook is reached through an SSH tunnel, which creates an encrypted connection between the node and your PC, so only users with SSH access to the node can open it. If you still want to set a password, do the following:

  1. Stop any existing Jupyter Notebook processes inside the container.

  2. Edit the field --NotebookApp.password='Your password' in the above commands. Note that Jupyter treats this value as a hashed password (for example, the output of notebook.auth.passwd() in the classic Notebook), so a plain-text value may not work as expected.

  3. To keep the password across reboots, open the file /etc/rc.local on your server and set the password as shown below.

../../_images/n14.png