add tf limits

Cornelius Specht 2024-05-03 11:23:17 +02:00
parent 5fc8d072ec
commit dc09353259
1 changed files with 23 additions and 4 deletions


@ -5,7 +5,7 @@
## Introduction
The **Sandbox Development Environment** enables students and lecturers to create and work with interactive case studies. It also provides a development environment for researchers and advanced programmers.
The Sandbox is available at: https://sandbox.iuk.hdm-stuttgart.de
The Sandbox is available at: [https://sandbox.iuk.hdm-stuttgart.de](https://sandbox.iuk.hdm-stuttgart.de)
Please handle the resources carefully and follow the fair use principle as well as German law.
## Getting Started
@ -16,12 +16,12 @@ Currently, inside the technical lectures students (in the role User) get in touc
1. **Sign in**, use your **HdM Credentials** at [Sandbox](https://sandbox.iuk.hdm-stuttgart.de/)
2. Select the image you want to start (three options)
2. Select the environment you want to start (three options; GPU environments are limited to 10 instances in total)
1. **Datascience environment**
2. **GPU PyTorch environment** (PyTorch library with GPU support; choose it only if you really need the graphics card, otherwise you block resources that others need)
3. **GPU TF environment** (TensorFlow library with GPU support; choose it only if you really need the graphics card, otherwise you block resources that others need)
3. Create or upload a .ipynb file:
1. . **Create an empty .ipynb file:**
1. **Create an empty .ipynb file:**
![sandbox launcher](res/sandbox_launcher.png "Sandbox Launcher")
2. **Upload an existing .ipynb file:**
![sandbox upload file](res/sandbox_upload_file_selector.png "Sandbox upload file")
@ -69,4 +69,23 @@ Each instance has following resource limits:
For GPU-enabled environments, 40 GB of shared (time-sliced) GPU memory is available. For additional information, please see the [official NVIDIA documentation](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html).
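As a rough capacity estimate (a sketch, not an official figure): if 40 GB is taken as 40 × 1024 MB and every TensorFlow session reserves the full 6024 MB limit recommended in the Limitations section, the number of sessions that fit at the same time can be computed as follows. Both assumptions are simplifications, since other processes also use GPU memory.

```python
TOTAL_GPU_MB = 40 * 1024     # 40 GB shared GPU memory, assumed to mean 40 * 1024 MB
LIMIT_PER_SESSION_MB = 6024  # per-session limit from the Limitations section

# Integer division: only whole sessions count
max_sessions = TOTAL_GPU_MB // LIMIT_PER_SESSION_MB
print(max_sessions)  # 6
```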
### Limitations
Please carefully follow the terms and conditions on the [Sandbox](https://sandbox.iuk.hdm-stuttgart.de) website. The free HDD space cannot be extended. The Sandbox should only be used for short-running GPU tasks. For longer-running training jobs, please use the [Training Environment](sandbox/training.md). After GPU usage, please stop your kernel/instance or free the blocked GPU resources manually. After 30 minutes of inactivity, the instance will be removed automatically. The HDD space is persistent, but will be deleted after 6 months.
Please carefully follow the terms and conditions on the [Sandbox](https://sandbox.iuk.hdm-stuttgart.de) website. The free HDD space cannot be extended. The Sandbox should only be used for short-running GPU tasks. For longer-running training jobs, please use the [Training Environment](sandbox/training.md). After GPU usage, please stop your kernel/instance or free the blocked GPU resources manually. After 30 minutes of inactivity, the instance will be removed automatically. The HDD space is persistent, but will be deleted after 6 months.
- **Important:** In the **GPU TF environment** only, apply the following memory limit:
By default, TensorFlow allocates all available GPU memory at the start of each session. The following code limits it to 6024 MB of GPU memory. If you do not limit the memory, no other user can use the GPU.
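```
import tensorflow as tf

# Limit TensorFlow to 6024 MB on the first GPU; run this before any other TensorFlow code.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=6024)])
    except RuntimeError as e:
        # Virtual devices cannot be changed once the GPU has been initialized.
        print(e)
```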
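The same limit can be wrapped in a small helper. This is a minimal sketch, not part of the Sandbox itself: the function name `limit_gpu_memory` is hypothetical, and it assumes TensorFlow 2.4 or newer, where `tf.config.set_logical_device_configuration` is the non-experimental counterpart of the call above. It returns `False` instead of failing when TensorFlow or a GPU is missing.

```python
def limit_gpu_memory(limit_mb=6024):
    """Cap TensorFlow's memory on the first GPU; return True if the cap was applied.

    Hypothetical helper, assuming TensorFlow >= 2.4.
    """
    try:
        import tensorflow as tf
    except ImportError:
        return False  # TensorFlow is not installed in this environment

    gpus = tf.config.list_physical_devices('GPU')
    if not gpus:
        return False  # no GPU visible, nothing to limit

    try:
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=limit_mb)])
    except RuntimeError as e:
        # Raised when the GPU was already initialized before this call
        print(e)
        return False
    return True
```

Call `limit_gpu_memory()` in the first cell of the notebook, before any model or tensor is created.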
At the end of your notebook, clear your session so that the GPU resources are released.
```
tf.keras.backend.clear_session()
```
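To make sure the cleanup also runs when a cell fails, the call can be placed in a `finally` block. This is a sketch under assumptions: `run_with_cleanup` and `job` are hypothetical names, and the TensorFlow import is optional so that the pattern can be tried without a GPU.

```python
def run_with_cleanup(job):
    """Run job() and always clear the Keras session afterwards (hypothetical helper)."""
    try:
        return job()
    finally:
        try:
            import tensorflow as tf
            tf.keras.backend.clear_session()  # same cleanup call as above
        except ImportError:
            pass  # TensorFlow not installed; nothing to clear


# Example: the placeholder job stands in for a training run
result = run_with_cleanup(lambda: sum(range(10)))
print(result)
```

An exception raised inside `job` still propagates to the caller; the `finally` block only guarantees that the session is cleared first.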