### Melissa's AXOL1TL Training for NRP Nautilus

Creating and running a test pod, using the image ``gitlab-registry.nrp-nautilus.io/mquinnan/axol1tl-hub:axo-env-gpu-conda``:

```
#in same directory as axo-pod.yml
kubectl create -f axo-pod.yml -n axol1tl

#check status
kubectl --namespace=axol1tl get pod

#enter pod once ready
kubectl exec -it axo-pod -n axol1tl -- /bin/bash

#run testing script:

bash
conda activate axo-env
git clone https://gitlab.nrp-nautilus.io/mquinnan/nrp-axo.git .
export KERAS_BACKEND=torch

ray start --head

python test_contrastive_vae_tuner.py

#exit and delete pod when done
exit
kubectl delete -f axo-pod.yml -n axol1tl
```
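
Rather than re-running the status check by hand, the wait for readiness can be scripted. The following is a minimal sketch, assuming the pod is named `axo-pod` in the `axol1tl` namespace as in the commands above; the `wait_for_pod` helper name and the 300s default timeout are illustrative choices, not part of this repository.

```shell
# Sketch: block until the test pod is Ready instead of polling by hand.
# NAMESPACE and POD match the names used above; the helper name and
# the default timeout are assumptions made for this example.
NAMESPACE="axol1tl"
POD="axo-pod"

wait_for_pod() {
  # First argument: optional timeout (defaults to 300s).
  kubectl wait --for=condition=Ready "pod/${POD}" \
    -n "${NAMESPACE}" --timeout="${1:-300s}"
}

# Usage against the cluster (commented out so sourcing this file is harmless):
# wait_for_pod && kubectl exec -it "${POD}" -n "${NAMESPACE}" -- /bin/bash
```

`kubectl wait` exits non-zero if the pod is not Ready before the timeout, so chaining with `&&` only opens the shell once the pod is usable.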

For jobs, the yml script is still in progress (pending). The script to run for jobs is ``contrastive_vae_tuner.py``.

Data files are located in the ``/axovol/`` persistent volume.
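
Before starting a run inside the pod, it can help to confirm the volume is actually mounted. A small sketch follows; the `check_data_dir` helper is hypothetical and only assumes the `/axovol/` path mentioned above.

```shell
# Sketch: confirm the persistent volume is mounted before starting a run.
# check_data_dir is a hypothetical helper; it defaults to the /axovol path.
check_data_dir() {
  dir="${1:-/axovol}"
  if [ -d "${dir}" ]; then
    # List the contents so the expected data files can be checked by eye.
    ls -lh "${dir}"
  else
    echo "data directory not found: ${dir}" >&2
    return 1
  fi
}

# Usage inside the pod:
# check_data_dir && python test_contrastive_vae_tuner.py
```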

### Running a Job on a Ray Cluster

```
#make the raycluster
helm install raycluster kuberay/ray-cluster -f ray-cluster.yaml -n axol1tl

#check status
kubectl get rayclusters -n axol1tl

#start the job
kubectl apply -f tuner-job.yaml -n axol1tl

#view the job
kubectl get jobs -n axol1tl

#view the pod
kubectl get pods -n axol1tl

#debugging (complete tuner-job- with the pod suffix shown by "kubectl get pods")
kubectl logs tuner-job- -n axol1tl

#delete the job
kubectl delete job tuner-job -n axol1tl

#delete the ray cluster
kubectl delete raycluster raycluster-kuberay -n axol1tl
```
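
The pod created by a job gets a generated name suffix, so looking it up each time is tedious. The sketch below shows one way to find the pod and read its logs without typing the suffix. It assumes the Job object is named `tuner-job` (inferred from the delete command above); the `job_pods` and `job_logs` helper names are illustrative.

```shell
# Sketch: find and read the tuner job's logs without typing the pod suffix.
# Assumes the Job object is named "tuner-job" (inferred from the delete
# command above); the helper names are made up for this example.
NAMESPACE="axol1tl"
JOB="tuner-job"

# Pods created by a Job carry the label job-name=<job name>.
job_pods() {
  kubectl get pods -n "${NAMESPACE}" -l "job-name=${JOB}" -o name
}

# "job/<name>" lets kubectl pick a pod that belongs to the Job.
# Extra arguments (for example --follow or --tail=50) are passed through.
job_logs() {
  kubectl logs "job/${JOB}" -n "${NAMESPACE}" "$@"
}

# Usage against the cluster:
# job_pods
# job_logs --follow
```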