dev_environment #2

Merged
grosse merged 4 commits from dev_environment into master 2024-05-14 09:23:10 +00:00
6 changed files with 96 additions and 33 deletions

@ -14,7 +14,7 @@
## NAS ## NAS
Additional 20TB backup storage is located at IKARUS NAS. Additional 20TB backup storage is located at TAURUS NAS.

@ -2,19 +2,10 @@
## Conceptual Design ## Conceptual Design
![Sandbox Architecture](res/sandbox-architecture.png) ![Sandbox Architecture](res/sandbox-architecture.png)
## Development Environment
An interactive, browser-based, development playground provides beginners and advanced users an isolated and safe AI playground.
## Training Environment
For long-running processes (building/training), a user-friendly pipeline is available.
## Use Cases ## Use Cases
The sandbox offers the possibility to provide interactive use cases and demo scripts for online courses.
## Data Generator The platform is a web-based, interactive environment running on isolated, privacy-friendly containers, designed to support an interdisciplinary user base comprising normal users, advanced users and scientific users. It facilitates teaching, research, and AI experimentation, integrating four perspectives: Ethics, Law, Business and Computer Science.
In order to analyse the training behaviour of machine learning models, synthetic data can be generated according to the user requirements. (tbd)
## Data Pool
The Data Pool allows the user to exchange data with other sandbox users, advanced users, and scientific users. Therefore the users can use the storage for their assignments (notebook files), data (training) and other sandbox driven data.
## Sandbox Use Cases
The following scenarios can be derived for the system: on the one hand, the system is to be used for teaching and on the other hand, the system is to be used for training.
Therefore, a mixed-use architecture is provided, which allows the system to be used for these two different requirements; hence the following role concept has been designed:
### Roles ### Roles
@ -29,8 +20,20 @@ Therefore, a mixed-use architecture is provided, which allows the system to be u
* Provide/develop (own) examples/demos. * Provide/develop (own) examples/demos.
**Scientific User** **Scientific User**
* Data generation using synthetic data generator (tbd)
* GPU usage * GPU usage
* version control (GIT) * version control (GIT)
* CI/CD for long running tasks * CI/CD for long running tasks
* Storage * Storage
## Development Environment
An interactive, browser-based, development playground provides beginners and advanced users an isolated and safe AI playground. The platform is a web-based, interactive Jupyter Notebook environment powered by Kubernetes, designed to provide scalable and efficient computational resources. Users can create, edit, and run Jupyter Notebooks in their browsers, benefiting from the integrated environment to dynamically allocate computing power, manage resources, and ensure high availability. This setup allows for seamless collaboration, reproducibility, and the ability to handle complex data science workflows and provide a playground for AI, making it ideal for researchers, educators, and developers.
## Training Environment
The training environment empowers scientific users with a suite of flexible and advanced toolsets. This includes access to GPU-supported high-memory/CPU instances, ideal for tackling computationally intensive tasks and long-running training pipelines. For enhanced reproducibility and collaboration, the environment utilizes isolated containers with full version control, all seamlessly managed by a robust Kubernetes infrastructure. This ensures a consistent and scalable platform for your scientific workflows.
## Data Pool
The data pool caters to a wide range of users by offering a variety of flexible data storage options. Basic and advanced users alike can leverage the user-friendly Sandbox browser-based Explorer for intuitive data management. This web interface provides drag-and-drop functionality and visual tools, making data exploration and organization effortless. For a more streamlined workflow, the Share UI provides a standalone interface. For power users, a powerful Share CLI is available, enabling them to efficiently save and manage their long-running trained models. This command-line interface integrates seamlessly with common data science tools and scripting languages, allowing for automation and customization.

@ -1,13 +1,16 @@
# Data Pool # Data Pool
In the following section we describe how to store data on the Sandbox. There are three different ways to achieve this: inside the Sandbox, headless file upload on object storage, and Git LFS. In the following section we describe how to store data on the Sandbox. There are four different ways to achieve this: inside the Sandbox, headless file upload (Share CLI) and user interface (Share UI) on object storage, and Git LFS.
![Sandbox Data Pool](res/sandbox_datapool.png)
## Inside the Sandbox ## Inside the Sandbox
To store the data inside the Sandbox, you just have to drag & drop or click on the upload button to save the file to your running instance. You can also create folders and new Notebooks. Please see the free space limitations [here](sandbox/dev_env#resources.md). To store the data inside the Sandbox, you just have to drag & drop or click on the upload button to save the file to your running instance. You can also create folders and new Notebooks. Please see the free space limitations [here](sandbox/dev_env#resources.md).
![file upload](res/sandbox_upload_file_selector.png "File Upload") ![file upload](res/sandbox_upload_file_selector.png "File Upload")
## Headless file storage ## Share CLI (Headless)
To use the headless object storage, you can upload a file via the REST interface or curl. The JSON response message provides you with the destination URL. To use the headless object storage, you can upload a file via the REST interface or curl. The JSON response message provides you with the destination URL.
**Upload Example** **Upload Example**
@ -37,8 +40,23 @@ To use the headless object storage, you can upload a file via REST-Interface or
``` ```
## Share UI
[Share UI](https://share.sandbox.iuk.hdm-stuttgart.de/) is a web application that allows file transfer between all user groups. The user interface allows you to drag and drop files onto the sandbox, delete them and set passwords, which provides additional security for the data. Once the file has been successfully uploaded, a URL can be used to make the file(s) available to course participants or other user groups.
## Git LFS ## Git LFS
We highly recommend the following solution only for users who are familiar with git command line tools! Git Large File Storage (LFS) is an open source Git extension for versioning large files. Git LFS replaces large files (audio, samples, datasets, videos) with a text pointer inside git. The files are stored on our Gitea server. We highly recommend the following solution only for users who are familiar with git command line tools! Git Large File Storage (LFS) is an open source Git extension for versioning large files. Git LFS replaces large files (audio, samples, datasets, videos) with a text pointer inside git. The files are stored on our Gitea server.
For further information visit [Git LFS](https://git-lfs.com/). For further information visit [Git LFS](https://git-lfs.com/).
## Troubleshooting
- Files do not exist anymore after a certain period of time: The shared space is limited to 3 or 6 months.
- Impossible to curl from Share UI: The Share UI is intended for UI users; please use the Share CLI instead.
## Useful Links
- [Share CLI](https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload/uuid/filename)
- [Share UI](https://share.sandbox.iuk.hdm-stuttgart.de/)
- [Curl](https://curl.se/docs/tutorial.html)
- [GIT LFS](https://git-lfs.com/)

@ -5,22 +5,23 @@
## Introduction ## Introduction
The **Sandbox Development Environment** enables students and lecturers to create and work with interactive case studies. It also provides a development environment for researchers and advanced programmers. The **Sandbox Development Environment** enables students and lecturers to create and work with interactive case studies. It also provides a development environment for researchers and advanced programmers.
The Sandbox is available at: https://sandbox.iuk.hdm-stuttgart.de The Sandbox is available at: [https://sandbox.iuk.hdm-stuttgart.de](https://sandbox.iuk.hdm-stuttgart.de)
Please handle the resources carefully and follow the fair use principle as well as German law. Please handle the resources carefully and follow the fair use principle as well as German law.
## Getting Started ## Getting Started
Within the sandbox, the different disciplines of the IKID project can provide tasks to be worked on by the respective student groups. Both text-based tasks and programmatic tasks can be processed. For example, Markdown files can be created for editing textual tasks. Within the sandbox, the different disciplines of the IKID project can provide tasks to be worked on by the respective student groups. Both text-based tasks and programmatic tasks can be processed. For example, Markdown files can be created for editing textual tasks.
Currently, it is planned for the technical lectures that students get in touch with the programming language Python for the first time. Therefore, the Sandbox platform was created, in which experiments with Python can be carried out. Currently, in the technical lectures, students (in the role User) get in touch with programming languages for the first time. Therefore, the Sandbox platform was created, in which experiments and use cases with Python can be carried out at all user levels.
1. **Sign in**, use your **HdM Credentials** at [Sandbox](https://sandbox.iuk.hdm-stuttgart.de/) 1. **Sign in**, use your **HdM Credentials** at [Sandbox](https://sandbox.iuk.hdm-stuttgart.de/)
2. Select the image you want to start (two options) 2. Select the environment you want to start (three options; GPU environments are limited to 10 instances in total)
1. **Datascience environment** 1. **Datascience environment**
2. **Datascience GPU environment** (choose only if you really need the graphics card, otherwise you block resources from those who need them) 2. **GPU PyTorch environment** (PyTorch library with GPU support; choose only if you really need the graphics card, otherwise you block resources from those who need them)
3. **GPU TF environment** (TensorFlow library with GPU support; choose only if you really need the graphics card, otherwise you block resources from those who need them)
3. Create or upload a .ipynb file: 3. Create or upload a .ipynb file:
1. . **create an empty .ipynb file:** 1. **create an empty .ipynb file:**
![sandbox launcher](res/sandbox_launcher.png "Sandbox Launcher") ![sandbox launcher](res/sandbox_launcher.png "Sandbox Launcher")
2. **upload an existing .ipynb file:** 2. **upload an existing .ipynb file:**
![sandbox upload file](res/sandbox_upload_file_selector.png "Sandbox upload file") ![sandbox upload file](res/sandbox_upload_file_selector.png "Sandbox upload file")
@ -37,22 +38,27 @@ Currently, it is planned for the technical lectures that students get in touch w
## Technical Overview ## Technical Overview
### Environments ### Environments
The Sandbox provides multiple scientific environments. If any additional packages or libraries are need, please open an issue on our [GIT](https://git.sandbox.iuk.hdm-stuttgart.de/). The Sandbox provides multiple scientific environments. If any additional packages or libraries are needed, check if they are available on [PyPI](https://pypi.org/) and use pip to install them. If a problem occurs, please open an issue on our [GIT](https://git.sandbox.iuk.hdm-stuttgart.de/).
#### Datascience environment #### Datascience environment
* Available Data Science image is based on [Data Science Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience/latest) * Available Data Science image is based on [Data Science Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience/latest)
* the most common data analysis libraries for Julia, Python, and R are included * the most common data analysis libraries for Python are included.
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience/issues/new) if any additional packages are needed or issues occurred. Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience/issues/new) if any additional packages are needed or issues occurred.
#### Datascience GPU environment #### GPU PyTorch environment
* Available GPU image is based on [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest) * Available GPU image is based on [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest)
* support for NVIDIA A100 GPU computations with the most common GPU-capable Python libraries such as TensorFlow, PyTorch and Keras. * support for NVIDIA A100 GPU computations with the Python library PyTorch.
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred. Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred.
#### GPU TF environment
* Available GPU image is based on [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest)
* support for NVIDIA A100 GPU computations with the Python library TensorFlow.
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred.
### Resources ### Resources
Each instance has the following resource limits: Each instance has the following resource limits:
@ -63,4 +69,37 @@ Each instance has following resource limits:
For GPU-enabled environments, 40 GB of shared (time-sliced) GPU memory is available. For additional information, please see the [official nvidia documentation](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html). For GPU-enabled environments, 40 GB of shared (time-sliced) GPU memory is available. For additional information, please see the [official nvidia documentation](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html).
### Limitations ### Limitations
Please carefully follow the terms and conditions on the [Sandbox](https://sandbox.iuk.hdm-stuttgart.de) website. The free HDD space cannot be extended. The Sandbox should only be used for short-running GPU tasks. For longer-running trainings, please use the [Training Environment](sandbox/training.md). After GPU usage, please stop your kernel/instance or free the blocked GPU resources manually. After 30 minutes of inactivity, the instance will be removed automatically. The HDD space is persistent, but will be deleted after 6 months. Please carefully follow the terms and conditions on the [Sandbox](https://sandbox.iuk.hdm-stuttgart.de) website. The free HDD space cannot be extended. The Sandbox should only be used for short-running GPU tasks. For longer-running trainings, please use the [Training Environment](sandbox/training.md). After GPU usage, please stop your kernel/instance or free the blocked GPU resources manually. After 30 minutes of inactivity, the instance will be removed automatically. The HDD space is persistent, but will be deleted after 6 months.
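Since the persistent HDD space is deleted after 6 months, it can help to pack a working folder into a single archive that is easy to download. The following is only a minimal sketch using the Python standard library; the function name `archive_directory` and the folder and file names are hypothetical and not part of the Sandbox itself.

```python
import shutil
import tempfile
from pathlib import Path


def archive_directory(source_dir, target_base):
    """Zip source_dir into <target_base>.zip and return the archive path."""
    archive = shutil.make_archive(str(target_base), "zip", root_dir=str(source_dir))
    return Path(archive)


# Demo on a throwaway directory; inside the Sandbox, source_dir would be
# your own working folder (its real path is not assumed here).
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    work.mkdir()
    (work / "notes.ipynb").write_text("{}")
    archive = archive_directory(work, Path(tmp) / "backup")
    archive_name = archive.name
    archive_exists = archive.exists()

print(archive_name, archive_exists)
```

The archive is written next to the source folder rather than inside it, so the zip file does not end up containing itself.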
- **Important** For the **GPU TF environment** only, use the following memory limitation:
TensorFlow allocates all available resources at the beginning of each session. The following chunk of code sets the limit to 6024 MB of GPU memory. If you don't limit the memory, no other user gets any GPU support.
```
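import tensorflow as tf

# Cap TensorFlow at 6024 MB on the first GPU so other users keep GPU access.
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        tf.config.experimental.set_virtual_device_configuration(
            gpus[0],
            [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=6024)])
    except RuntimeError as e:
        # Virtual devices must be configured before the GPU is initialized.
        print(e)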
```
At the end of your file, you have to clear your session so that the GPU resource is released.
```
tf.keras.backend.clear_session()
```
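The two steps above (limit the memory first, clear the session at the end) can be wrapped into small helpers. This is a sketch under assumptions, not the Sandbox's official method: the helper names `limit_gpu_memory` and `release_session` are hypothetical, and the code uses the non-experimental `tf.config.set_logical_device_configuration` call, which is assumed to be available in the installed TensorFlow version. The import is guarded so the snippet also runs where TensorFlow is not installed.

```python
try:
    import tensorflow as tf
except ImportError:  # e.g. outside the GPU TF environment
    tf = None


def limit_gpu_memory(limit_mb=6024):
    """Cap the first GPU at limit_mb MB; return True only if a cap was set."""
    if tf is None:
        return False
    gpus = tf.config.list_physical_devices("GPU")
    if not gpus:
        return False
    try:
        tf.config.set_logical_device_configuration(
            gpus[0],
            [tf.config.LogicalDeviceConfiguration(memory_limit=limit_mb)],
        )
    except RuntimeError as err:
        # Raised when the GPU was already initialized before this call.
        print(err)
        return False
    return True


def release_session():
    """Clear the Keras session state at the end of a notebook."""
    if tf is not None:
        tf.keras.backend.clear_session()


capped = limit_gpu_memory(6024)  # call before building any model
# ... build and train the model here ...
release_session()
```

Calling `limit_gpu_memory` in the first notebook cell matters, because the limit can no longer be applied once TensorFlow has initialized the GPU.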
## Troubleshooting
- When using JavaScript-based plugins, it can happen that they either do not load or run buggy: make sure you use Chrome. Other browsers are often not supported by third-party plugins.
- Notebooks/Server/Dev Environment is going down: To avoid blocking resources, each environment has a certain timeout (frontend inactivity), so idling notebooks get culled to free resources for other users.
- My favourite xyz Python package is missing: use conda/pip (or inside a notebook: !pip install) to add additional packages.
- All my data is gone after the semester break: the persistent storage of every(!) user gets recycled each semester break. Please back up your data locally if needed.
- Package conflicts: a common cause is installing an unspecified library version; please specify or upgrade all dependencies manually.
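For the package-related items above, the sketch below shows one way to check which version of a package is installed and to build a pip command that pins an exact version. It only builds the command and does not run it. The helper names are hypothetical, and `numpy==1.26.4` is just an example value, not a version the Sandbox requires.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def pinned_install_command(package, version):
    """Build (but do not run) a pip command that installs one exact version."""
    return [sys.executable, "-m", "pip", "install", f"{package}=={version}"]


cmd = pinned_install_command("numpy", "1.26.4")  # example package and version
print(" ".join(cmd))
print(installed_version("numpy"))
```

Using `sys.executable -m pip` targets the interpreter of the running kernel, which avoids installing into a different Python environment by accident. The resulting list can be passed to `subprocess.run` if the installation should actually be performed.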
## Useful Links
- [Jupyter Documentation](https://docs.jupyter.org/en/latest/)
- [pip](https://pip.pypa.io/en/stable/user_guide/)
- [python](https://docs.python.org/3.11/)

@ -2,14 +2,17 @@
![Sandbox Architecture](res/sandbox-architecture.png) ![Sandbox Architecture](res/sandbox-architecture.png)
## Use Cases
A brief overview of how to create [Use Cases](use_cases.md) on the Sandbox
## Development Environment ## Development Environment
A [Development Environment](dev_env.md) or Playground for beginners and advanced users. A [Development Environment](dev_env.md) or Playground for beginners and advanced users.
## Use Cases
A brief overview of how to create [Use Cases](use_cases.md) on the Sandbox
## Data Pool ## Data Pool
Details about how to store and manage data on the Sandbox: Additional information: [Data Pool](data_pool.md) Details about how to store and manage data on the Sandbox: Additional information: [Data Pool](data_pool.md)
## Training Environment ## Training Environment
(tbd) Detailed information about how to perform GPU-supported model training on the Sandbox environment.
## Data Generator
(tbd)

Binary file not shown.
