# Compare commits

1 commit

| Author | SHA1 | Date |
|---|---|---|
| Cornelius Specht | 863ada64c9 | |
````diff
@@ -1,21 +1,18 @@
-steps:
+pipeline:
   create-book:
     image: peaceiris/mdbook:v0.4.30
     commands:
-      - mdbook init --theme light
       - mdbook build
-  build_and_release:
-    image: maltegrosse/woodpecker-buildah:0.0.12
+  publish-container:
+    image: woodpeckerci/plugin-docker-buildx:2.1.0
+    secrets: [docker_username, docker_password]
+    group: docker
     settings:
-      registry: git.sandbox.iuk.hdm-stuttgart.de
-      repository: grosse/sandbox-docs-public
-      tag: latest
-      architectures: amd64
-      context: Dockerfile
-      imagename: sandbox-docs-public
-      username:
-        from_secret: docker_username
-      password:
-        from_secret: docker_password
+      registry: https://git.sandbox.iuk.hdm-stuttgart.de
+      repo: git.sandbox.iuk.hdm-stuttgart.de/grosse/sandbox-docs-public
+      dockerfile: Dockerfile
+      tags: latest
+
+branches:
+  exclude: cspecht
````
````diff
@@ -1,5 +1,7 @@
 FROM nginx:alpine3.17-slim
 WORKDIR /app
 
+COPY . .
+
 COPY ./nginx.conf /etc/nginx/nginx.conf
 COPY ./book /app/static
````
````diff
@@ -1,4 +1,3 @@
 [book]
 title = "Sandbox Documentation"
 language = "en"
-[output.html]
````
````diff
@@ -1,17 +1,8 @@
 # Introduction
-This is the official documenation of the IKID [Sandbox](https://sandbox.iuk.hdm-stuttgart.de/) project.
 
-## Project Description
-Interdisciplinary AI Exploratorium: Integrated Teaching for the Responsible Use of Artificial Intelligence based on Physical-Virtual Demonstrators is a project funded by BMBF and MWK BW. (German: Interdisziplinäres KI-Exploratorium: Integrierte Lehre zur verantwortungsvollen Nutzung Künstlicher Intelligenz auf Basis physisch-virtueller Demonstratoren, short IKID)
+## Introduction
+Public documentation for **sandbox**.
 
-The majority of current AI teaching formats view AI from a single isolated perspective. However, a responsible use of AI requires a comprehensive view from different perspectives: Technology, Economics, Law, and Ethics. This project aims to fill this gap in current university teaching in an innovative way. Within this project, an AI Exploratorium will be created, which will present eight different use cases and their complexity of AI and thus make them directly tangible for students. To ensure that these use cases are not limited by physical access, an IT infrastructure (Sandbox) will be created that allows students to use this infrastructure remotely as well.
+## Purpose
 
-The Sandbox allows both teachers and students to work independently on a containerized platform which includes a web based integrated development environment in the context of the AI Exploratorium. This enables collaborative work on case studies available on a shared data pool. Additionally, synthetic data can be generated to simulate subject related scenarios in order to analyse AI related behaviour within training pipelines.
 
 ## Sponsors
 
-<img src="res/logo_bmbf.svg" width="150"/>
-<img src="res/logo_mwk.png" width="150"/>
-<img src="res/logo_hdm.png" width="120"/>
-<img src="res/logo_IKID.png" width="120"/>
-<img src="res/logo_IAAI.svg" width="150"/>
````
````diff
@@ -9,9 +9,13 @@
 - [Software](architecture/software.md)
 
 # Playground
-- [Use Cases](sandbox/use_cases.md)
+- [Overview](sandbox/overview.md)
 - [Development Environment](sandbox/dev_env.md)
+  - [Getting Started](sandbox/dev_env#getting-started.md)
+  - [Environments](sandbox/dev_env#environments.md)
+  - [Resources](sandbox/dev_env#resources.md)
+  - [Limitations](sandbox/dev_env#limitations.md)
+- [Use Cases](sandbox/use_cases.md)
 - [Data Pool](sandbox/data_pool.md)
 - [Training Environment](sandbox/training.md)
+- [Data Generator](sandbox/data_generator.md)
````
````diff
@@ -14,7 +14,7 @@
 ## NAS
 
-Additional 20TB backup storage is located at TAURUS NAS.
+Additional 20TB backup storage is located at IKARUS NAS.
````
````diff
@@ -1,11 +1,20 @@
 # Architecture Overview
 
 ## Conceptual Design
-![Sandbox Architecture](res/sb-overview.png)
+![Sandbox Architecture](res/sandbox-architecture.png)
+
+## Development Environment
+An interactive, browser-based, development playground provides beginners and advanced users an isolated and safe AI playground.
+
+## Training Environment
+For long running processes (building/training) a userfriendly pipeline is available.
+
 ## Use Cases
-The platform is a web-based, interactive environment runing on isolated and privacy friendly containers, designed to support an interdisciplinary user base comprising normal users, advanced users and scientific users. It facilitates teaching, research, and AI experimentation integrating four perspectives from Ethics, Law, Business and Computer Science.
+The sandbox offers the possibility to provide interactive use cases and demo scripts for online courses.
+
+## Data Generator
+In order to analyse training behaviour of maschine learning models, synthetic data can be generator according to the user requirements. (tbd)
+
+## Data Pool
+The Data Pool allows the user to exchange data with other sandbox users, advanced users, and scientific users. Therefore the users can use the storage for their assignments (notebook files), data (training) and other sandbox driven data.
+
+## Sandbox Use Cases
+The following scenarios can be derived for the system: on the one hand, the system is to be used for teaching and on the other hand, the system is to be used for training.
+
+Therefore, a mixed-use architecture is provided, which allows the system to be used for these two different requirements, hence following role concept has been designed:
 
 ### Roles
````
````diff
@@ -20,20 +29,8 @@ The platform is a web-based, interactive environment runing on isolated and priv
 * Provide/develop (own) examples/demos.
 
 **Scientific User**
+* Data generation using synthetic data generator (tbd)
 * GPU usage
 * version control (GIT)
 * CI/CD for long running tasks
 * Storage
 
-## Development Environment
-An interactive, browser-based, development playground provides beginners and advanced users an isolated and safe AI playground. The platform is a web-based, interactive Jupyter Notebook environment powered by Kubernetes, designed to provide scalable and efficient computational resources. Users can create, edit, and run Jupyter Notebooks in their browsers, benefiting from the integrated environment to dynamically allocate computing power, manage resources, and ensure high availability. This setup allows for seamless collaboration, reproducibility, and the ability to handle complex data science workflows and provide a playground for AI, making it ideal for researchers, educators, and developers.
-
-## Training Environment
-The training environment empowers scientific users with a suite of flexible and advanced toolsets. This includes access to GPU-supported high-memory/CPU instances, ideal for tackling computationally intensive tasks and long-running training pipelines. For enhanced reproducibility and collaboration, the environment utilizes isolated containers with full version control, all seamlessly managed by a robust Kubernetes infrastructure. This ensures a consistent and scalable platform for your scientific workflows.
-
-## Data Pool
-The data pool caters to a wide range of users by offering a variety of flexible data storage options. Basic and advanced users alike can leverage the user-friendly Sandbox browser-based Explorer for intuitive data management. This web interface provides a drag-and-drop functionality and visual tools, making data exploration and organization effortless. For a more streamlined workflow, the Share UI provides a standalone interface. For power users, a powerful Share CLI is available, enabling them to efficiently save and manage their long-running trained models. This command-line interface integrates seamlessly with common data science tools and scripting languages, allowing for automation and customization.
````
Image changed: 266 KiB → 70 KiB
Image deleted: 246 KiB
Image deleted: 23 KiB
````diff
@@ -1,8 +1,6 @@
 # Software
 ## Software Defined Architecture
-![SDA](res/sda.png)
+![SDA](res/vms.png)
 
 The server is separated into three virtual machines, according to their responsibilities, and is highly extendable regarding new external hardware.
 - Master VM: Management environment for software orchestration
 - Node 1 VM: Application node for software foundation like databases, sandbox controller, ci/cd controller
````
Images deleted: 304 KiB, 29 KiB, 27 KiB, 16 KiB, 163 KiB
````diff
@@ -1,4 +1,6 @@
 # Data Generator
 
-Currently under heavy development, coming soon....stayed tuned!
+Coming Soon ...
````
````diff
@@ -1,17 +1,17 @@
 # Data Pool
 
-In the following section we describe how to store data on the Sandbox. There are four different ways to do achieve: Inside the Sandbox, headless file upload & user interface on Object storage and Git LFS.
-
-![Sandbox Data Pool](res/sandbox_datapool.png)
+# Storage
+
+In the following section we describe how to store data on the Sandbox. There are three diffrent ways to do so: Inside the Sandbox, Object Storage, Git LFS.
 
 ## Inside the Sandbox
-To store the data inside the Sandbox, you just have to drag & drop or click on the upload button to save the file to your running instance. You can also create folders and new Notebooks. Please see the free space limitations [here](sandbox/dev_env#resources.md).
+To store the data inside the Sandbox, you just have to drag & drop or click on the upload button to save the file to your running instance. You can also create folders and new Notebooks. The only limitation is, that each user has 1GB of storage.
 ![file upload](res/sandbox_upload_file_selector.png "File Upload")
 
-## Share CLI (Headless)
-To use the headless object storage, you can upload a file via REST-Interface or curl. The json response message provides you the destination url. The upload is only available from the Sandbox.
+## Object storage
+To use the Object storage, you can upload a file via REST-Interface and access it by the key you get provided in the response. If you want to upload your file:
 
 **Upload Example**
 ```python
@@ -23,22 +23,7 @@ To use the headless object storage, you can upload a file via REST-Interface or
 r = requests.post(url, files=files)
 ```
 
-or using curl as command line tool:
-
-```
-curl -F fileUpload=@file.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload
-```
-
-**Example JSON response**
-```json
-{
-    "PublicUrl": "https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload/f2a69e9a-f60b-418d-a678-efce181fb8a5/untitled.txt",
-    "Size": 11,
-    "Expiration": "2023-10-04T00:00:00Z"
-}
-```
-
-**Import Example**
+**Usage Example**
 
 ```python
 import pandas as pd
@@ -46,24 +31,8 @@ curl -F fileUpload=@file.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/
 df = pd.read_csv(url)
 ```
 
-## Share UI
-[Share UI](https://share.sandbox.iuk.hdm-stuttgart.de/) is a web application that allows file transfer between all user groups. The user interface allows you to drag and drop files onto the sandbox, delete them and set passwords, which provides additional security for the data. Once the file has been successfully uploaded, a URL can be used to make the file(s) available to course participants or other user groups.
-
 ## Git LFS
 The following solution we highly recomment only for users which are familiar with git command line tools! Git Large File Storage (LFS). An open source Git extension for versioning large files. Git LFS replaces large files (audio, sample, datasets, videos) by a text pointer inside git. The files get stored on our gitea Server.
 For further information visit [Git LFS](https://git-lfs.com/).
 
-## Troubleshooting
-- Files do not exist anymore after a certain period of time: The shared space is limited to 3 or 6 months.
-- Impossible to curl from Share UI: The Share UI is mainly for UI users only, please use the Share CLI instead
-
-## Useful Links
-- [CLI Share](https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload/uuid/filename)
-- [Share UI](https://share.sandbox.iuk.hdm-stuttgart.de/)
-- [Curl](https://curl.se/docs/tutorial.html)
-- [GIT LFS](https://git-lfs.com/)
````
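The **Example JSON response** deleted in the middle hunk above is what the upload endpoint returns; a minimal stdlib sketch of consuming it (the payload is copied verbatim from the removed docs, nothing is sent over the network) is:

```python
import json

# Example response body, copied from the removed "Share CLI (Headless)" docs.
raw = """
{
    "PublicUrl": "https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload/f2a69e9a-f60b-418d-a678-efce181fb8a5/untitled.txt",
    "Size": 11,
    "Expiration": "2023-10-04T00:00:00Z"
}
"""

resp = json.loads(raw)
public_url = resp["PublicUrl"]  # the URL you would hand to pd.read_csv(...)
filename = public_url.rsplit("/", 1)[-1]
print(filename, resp["Size"])
```

The `PublicUrl` field is exactly what the surviving **Usage Example** feeds into `pd.read_csv(url)`.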
````diff
@@ -3,31 +3,29 @@
 ![sandbox](https://uptime.monitoring.iuk.hdm-stuttgart.de/api/badge/1/status)
 
 ## Introduction
-The **Sandbox Development Environment** enables students and lecturers to create and work with interactive case studies. It also provides a development environment for researchers and advanced programmers.
-
-The Sandbox is available at: [https://sandbox.iuk.hdm-stuttgart.de](https://sandbox.iuk.hdm-stuttgart.de)
-
-Please carefully handle the resources and follow the fair use principle and as well as the German laws.
+use cases, limitations + t&c
 
 ## Getting Started
 
-Within the sandbox, the different disciplines of the IKID project can provide tasks to be worked on by the respective student groups. Both text-based tasks and programmatic tasks can be processed. For example, Markdown files can be created for editing textual tasks.
-Currently, inside the technical lectures students (in the role User) get in touch with prorgramming languages for the first time. Therefore, the Sandbox platform was created, in which experiments and Usecases with Python of all User levels can be carried out.
+Within the sandbox, the different disciplines inside the IKID project can provide tasks to be worked on by the respective student groups. Both text-based tasks and programmatic tasks can be provided and processed. For example, Markdown files can be created for editing textual tasks. These files can be converted from the source form (unformatted) to the target form (formatted) using a simple syntax.
+Currently, it is planned for the technical lectures that the students get first in touch with the programming language Python. Therefore, the Sandbox platform was created, in which experiments with Python can be carried out. But if needed, it is possible to add more supported languages in the future.
+
+[Sandbox](https://sandbox.iuk.hdm-stuttgart.de/)
+(only accessible from the HdM-Network)
 
-1. **Sign in**, use your **HdM Credentials** at [Sandbox](https://sandbox.iuk.hdm-stuttgart.de/)
-2. Select the environment you want to start (three options, GPU enviroments are limited to 10 instances in total)
+1. **Sign in**, use your **HdM Credentials**
+2. Select the image you want to start (two options)
    1. **Datascience environment**
-   2. **GPU PyTorch environment** (PyTorch library with GPU support, choose only if you realy need the graphic card, otherwise you block resources from those who need them)
-   3. **GPU TF environment** (Tensorflow library with GPU support, choose only if you realy need the graphic card, otherwise you block resources from those who need them)
-3. Create or upload a .ipynb file:
-   1. **create a empty .ipynb file:**
+   2. **GPU environment** (choose only if you realy need the graphic card, otherwise you steal resources from those who need them)
+3. Create or upload a .ipynb file to start with
+   1. . **create a empty .ipynb file:**
 ![sandbox launcher](res/sandbox_launcher.png "Sandbox Launcher")
    2. **upload a existing .ipynb file:**
 ![sandbox upload file](res/sandbox_upload_file_selector.png "Sandbox upload file")
 4. **open** the file from the filebrowser & start working!
 ![sandbox .ipynb file](res/sandbox_ipynb_example.png "Sandbox Notebook").
-5. **After you finished your work dont forget to shutdown your server!** Therefore, you should shutdown your server to release server resources. **Select File, Hub Control Panel**
+5. **After you finished your work dont forget to shutdown your server!** Therfore you should shutdown your server to release server resources. **Select File, Hub Control Panel**
 ![file menue](res/sandbox_file_menu.png "File menue")
 7. Select **Stop Server**
 ![stop server](res/sandbox_stop_server.png "Stop Server")
````
|
@ -35,98 +33,22 @@ Currently, inside the technical lectures students (in the role User) get in touc
|
||||||
![logout](res/sandbox_logout.png)
|
![logout](res/sandbox_logout.png)
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
## Technical Overview
|
## Technical Overview
|
||||||
|
|
||||||
|
Which python packages are installed, How can I install a python package?
|
||||||
### Environments
|
### Environments
|
||||||
The Sandbox provides multiple scientific environments. If any additional packages or libraries are needed, check if they are available on [PyPi](https://pypi.org/) and use pip to install it. If a problem occurs, please open an issue on our [GIT](https://git.sandbox.iuk.hdm-stuttgart.de/).
|
|
||||||
|
|
||||||
|
|
||||||
#### Datascience environment
|
#### Datascience environment
|
||||||
* Available Data Science image is based on [Data Science Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience/latest)
|
* Available Data Science image is based on [Official Data Science Image](https://hub.docker.com/r/jupyter/datascience-notebook/tags/)
|
||||||
* most common data analysis library's included for Python are available.
|
* most common data analysis library's included for Julia, Python, R
|
||||||
|
|
||||||
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience/issues/new) if any additional packages are needed or issues occurred.
|
#### GPU environment
|
||||||
|
* Available GPU image is based on [Official GPU Image](https://hub.docker.com/r/cschranz/gpu-jupyter)
|
||||||
|
* support added for the NVIDIA GPU A100 calculations based on python most common GPU-able libraries Tensorflow, PyTorch and Keras.
|
||||||
#### GPU PyTorch environment
|
|
||||||
* Available GPU image is based on [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest)
|
|
||||||
* support added for the NVIDIA GPU A100 computations based on python library PyTorch.
|
|
||||||
|
|
||||||
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred.
|
|
||||||
|
|
||||||
#### GPU TF environment
|
|
||||||
* Available GPU image is based on [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest)
|
|
||||||
* support added for the NVIDIA GPU A100 computations based on python library Tensorflow.
|
|
||||||
|
|
||||||
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred.
|
|
||||||
|
|
||||||
#### GPU Ollama environment
|
|
||||||
* Available Ollama image is based on [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest)
|
|
||||||
* support added for the NVIDIA GPU A100 computations based on python library PyTorch.
|
|
||||||
|
|
||||||
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred.
|
|
||||||
|
|
||||||
|
|
||||||
### Resources
|
### Resources
|
||||||
Each instance has following resource limits:
|
|
||||||
- maximum 2 physical CPUs, guranteed 0.5 CPUs
|
|
||||||
- maximum 10GB DDR memory, guranteed 1GB
|
|
||||||
- maximum of 1GB HDD
|
|
||||||
|
|
||||||
For GPU enabled environments, 40GB shared (time sliced) GPU memory is availble. For additional information, please see the [official nvidia documentation](https://docs.nvidia.com/datacenter/tesla/mig-user-guide/index.html).
|
|
||||||
|
|
||||||
### Limitations
|
### Limitations
|
||||||
Please carefully follow the terms and conditions at [Sandbox](https://sandbox.iuk.hdm-stuttgart.de) website. The HDD free space can not be extended. The Sandbox should only be used for short running GPU tasks. For longer running trainings, please use the [Training Environment](sandbox/training.md). After GPU usage please stop your kernel/instance or free the blocked GPU resources manually. After 30 minutes of inactivity, the instanced will automatically removed. The HDD space is persistent, but will be deleted after 6 months.
|
|
||||||
|
|
||||||
- **Important** Only for **GPU TF environment** use following memory limitation:
|
|
||||||
Tensorflow allocates in the beginning of each session all available resources. With the following chunk of code the memory limitation is set to 6024 MB GPU memory. If you dont limit the memory, all other users dont have any GPU support.
|
|
||||||
|
|
||||||
```
|
|
||||||
gpus = tf.config.experimental.list_physical_devices('GPU')
|
|
||||||
if gpus:
|
|
||||||
|
|
||||||
try:
|
|
||||||
tf.config.experimental.set_virtual_device_configuration(gpus[0], [tf.config.experimental.VirtualDeviceConfiguration(memory_limit=6024)])
|
|
||||||
|
|
||||||
except RuntimeError as e:
|
|
||||||
print(e)
|
|
||||||
```
|
|
||||||
At the end of your file you have to clean your session that the GPU resource is released.
|
|
||||||
|
|
||||||
```
|
|
||||||
tf.keras.backend.clear_session()
|
|
||||||
```
|
|
||||||
|
|
||||||
## Troubleshooting
|
|
||||||
- when using javascript based plugins, it can happened that they either not load or run buggy: make sure you use chrome. Other browsers are often not supported by third party plugins
|
|
||||||
- Notebooks/Server/Dev Environment is going down: To avoid blocking resources, each environment has a certain timeout (frontend inactivity), so idling notebooks get culled to free resources for other user.
|
|
||||||
- my favourite xyz python packages is missing: use conda/pip (or inside a notebook: !pip install) to add additional packages
|
|
||||||
- all my data is gone after the semester break: all persistent storages get recycled of every(!) user each semester break. Please backup your data locally if needed
|
|
||||||
- package conflicts: common issue is to install a unspecific library version, please specify or upgrade all dependencies manually.
|
|
||||||
|
|
||||||
## Large-Language-Models
|
|
||||||
An advanced Pytorch development environment is preinstalled with [Ollama](https://ollama.com/), which makes it easy to download and run different LLMs.
|
|
||||||
|
|
||||||
If you want to run Ollama including the [OpenWebUI](https://openwebui.com/), following commands need to be executed in multiple Terminal windows:
|
|
||||||
![terminal](res/sandbox-terminal.png "Terminal")
|
|
||||||
|
|
||||||
1. execute `ollama serve` to start the ollama backend
|
|
||||||
2. execute `ollama run mistral:7b` to download and run a specific model in a second terminal.
|
|
||||||
|
|
||||||
Following steps are only needed if you want to access the WebUI. It also showcases how other http services can temporally be tunneled and exposed to public.
|
|
||||||
![tunnel](res/sandbox-tunnel.png "Tunnel")
|
|
||||||
|
|
||||||
3. execute `open-webui serve` to serve the WebUI locally on port 8080 in another terminal.
|
|
||||||
4. visit [tun.iuk.hdm-stuttgart.de](https://tun.iuk.hdm-stuttgart.de) to obtain a token with your browser
|
|
||||||
5. execute `pgrok init --remote-addr tun.iuk.hdm-stuttgart.de:80 --forward-addr https://{user}.tun.iuk.hdm-stuttgart.de --token {token}` in the terminal. Replace `{user}` and `{token}` with your username and the previous obtained token.
|
|
||||||
6. execute `pgrok http 8080` to run the tunnel and expose the webui. Now you are able to access the webui at ´https://{user}.tun.iuk.hdm-stuttgart.
|
|
||||||
|
|
||||||
### Notes
|
|
||||||
Please use the tunnel only temporally and carefully. The tunnel only support http(s) tunnels. Tunneled services are public available and accessible by anyone! If you want to train/finetune any LLMs, please use the [Training Environment](training.md) instead.
|
|
||||||
|
|
||||||
## Useful Links
|
|
||||||
|
|
||||||
- [Jupyter Documentation](https://docs.jupyter.org/en/latest/)
|
|
||||||
- [pip](https://pip.pypa.io/en/stable/user_guide/)
|
|
||||||
- [python](https://docs.python.org/3.11/)
|
|
||||||
- [Ollama](https://ollama.com/)
|
|
|
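Step 5 of the removed LLM tunneling instructions substitutes two placeholders, `{user}` and `{token}`, into a single `pgrok init` command; a small sketch of that substitution (the `jdoe`/`example-token` values here are made up for illustration) is:

```python
def pgrok_init_cmd(user: str, token: str) -> str:
    """Fill the {user} and {token} placeholders from step 5 of the removed docs."""
    return (
        "pgrok init --remote-addr tun.iuk.hdm-stuttgart.de:80 "
        f"--forward-addr https://{user}.tun.iuk.hdm-stuttgart.de "
        f"--token {token}"
    )

print(pgrok_init_cmd("jdoe", "example-token"))
```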
````diff
@@ -0,0 +1,15 @@
+# Overview Sandbox
+
+![Sandbox Architecture](res/sandbox-architecture.png)
+
+## Development Environment
+Playground...Getting started... Use cases
+
+## Use Cases
+
+## Data Pool
+
+## Training Environment
+(tbd)
+
+## Data Generator
+(tbd)
````
Image deleted: 501 KiB
Image changed: 266 KiB → 70 KiB
Images deleted: 20 KiB, 10 KiB, 12 KiB, 71 KiB, 24 KiB, 38 KiB
|
@ -1,114 +1,6 @@
|
||||||
# Training Environment
|
# Training Environment
|
||||||
|
|
||||||
This documentation is for advanced users which are aware of following tools: git, python/R, cuda, pytorch/tensorflow and basic container knowledge.
|
|
||||||
![repos](./res/training.svg)
|
|
||||||
## Overview
|
|
||||||
Available are two worker agents with
|
|
||||||
- 12 physical CPUs
|
|
||||||
- 40 GB memory
|
|
||||||
- 20 GB Nvidia GPU memory
|
|
||||||
- 100 GB Hdd Diskspace
|
|
||||||
|
|
||||||
Only two pipelines can run in parallel to ensure having the promised hardware resources. If more jobs occur, they will be stored in a queue and released after the fifo principle. Storage is not persistent - every build/training job needs to saved somewhere external.
|
# Training
|
||||||
|
Coming Soon ...
|
||||||
|
|
||||||
|
|
||||||
## Development
|
|
||||||
|
|
||||||
### Git
|
|
||||||
Create a new git repository and commit your latest code here: https://git.sandbox.iuk.hdm-stuttgart.de/
|
|
||||||
|
|
||||||
Repositories can be private or public - depends on your use case.
|
|
||||||
|
|
||||||
|
|
||||||
### CI
|
|
||||||
Connect your newly created repository here: https://ci.sandbox.iuk.hdm-stuttgart.de/
|
|
||||||
1. After login, click on "+ Add repository"
|
|
||||||
![repos](./res/sandbox-ci-repos.png)
|
|
||||||
2. Enable the specific repository
|
|
||||||
|
|
||||||
3. Go to the repositories [overview site](https://ci.sandbox.iuk.hdm-stuttgart.de/repos) and select your enabled repository
|
|
||||||
4. Go to settings (clicking the settings icon)
|
|
||||||
![repos](./res/sandbox-ci-settings.png)
|
|
||||||
5. Set a reasonable timeout in minutes (e.g. 360 minutes for 6hours) if some training crashes/hangs
|
|
||||||
6. Add additional settings like secrets or container registries, see the official [documentation](https://woodpecker-ci.org/docs/usage/project-settings) for additional settings
|
|
||||||
|
|
||||||
|
|
||||||
### Pipeline File
|
|
||||||
An example script can be found here:
|
|
||||||
|
|
||||||
https://git.sandbox.iuk.hdm-stuttgart.de/grosse/test-ci
|
|
||||||
|
|
||||||
|
|
||||||
1. Create a new file in your repository `.woodpecker.yml` (or different regarding repository settings above)
|
|
||||||
2. The content can look like following:
|
|
||||||
|
|
||||||
```
steps:
  "train":
    image: nvcr.io/nvidia/tensorflow:23.10-tf2-py3
    commands:
      - echo "starting python script"
      - python run.py
  "compress and upload":
    image: alpine:3
    commands:
      - apk --no-cache add zip curl
      - zip mymodel.zip mymodel.keras
      - curl -F fileUpload=@mymodel.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload
```
See the official [documentation](https://woodpecker-ci.org/docs/usage/workflow-syntax) for the syntax.

Generally, the pipeline consists of different steps, and for each step a different container environment can be chosen. In the example above, an official TensorFlow container with Python 3 is first used to run the training script. In most cases you can find predefined containers on [Dockerhub](https://hub.docker.com/), or GPU-enabled containers at [NVIDIA](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch). If needed, custom images can be created and stored internally (in the Sandbox Git package registry) or in any other publicly available container registry. In the second step, the model is compressed and pushed to the temporary Sandbox storage.
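The `run.py` called in the "train" step is project-specific. As an illustration only, a minimal placeholder could look like the sketch below - the dictionary standing in for a model is made up, and the only contract with the next step is that `mymodel.keras` exists afterwards:

```
import json
import os

def train():
    # Placeholder for a real training loop (TensorFlow, PyTorch, ...).
    return {"weights": [0.1, 0.2, 0.3], "epochs": 10}

model = train()
with open("mymodel.keras", "w") as f:
    json.dump(model, f)

print("saved mymodel.keras,", os.path.getsize("mymodel.keras"), "bytes")
```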
3. Commit and push

4. See the current state of the pipelines at the [overview site](https://ci.sandbox.iuk.hdm-stuttgart.de/repos)
### Exporting trained model
We provide internal storage on which uploaded files are kept for 3 months before disposal.

To upload a file, you can either use a simple curl command `curl -F fileUpload=@mymodel.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload` or a short Python script:
```
import requests

myurl = 'https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload'

print("uploading file")
files = {
    'fileUpload': ('mymodel.keras', open('mymodel.keras', 'rb'), 'application/octet-stream')
}

response = requests.post(myurl, files=files)
print(response, response.text)
```
which returns a JSON response with the download URL of your uploaded file:
```
{"PublicUrl":"https://storage.sandbox.iuk.hdm-stuttgart.de/upload/49676006-94e4-4da6-be3f-466u786768979/mymodel.keras","Size":97865925,"Expiration":"2024-03-30T00:00:00Z"}
```
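To use the response programmatically, the fields can be parsed with Python's standard `json` module; the response string below repeats the illustrative example above:

```
import json

# Illustrative response as returned by the share storage.
raw = '{"PublicUrl":"https://storage.sandbox.iuk.hdm-stuttgart.de/upload/49676006-94e4-4da6-be3f-466u786768979/mymodel.keras","Size":97865925,"Expiration":"2024-03-30T00:00:00Z"}'

info = json.loads(raw)
print(info["PublicUrl"])   # where the file can be downloaded
print(info["Expiration"])  # date after which the file is disposed of
```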
## Troubleshooting

- The first time an external container image is pulled it can take quite a while, depending on its size, as some registries (like Dockerhub) limit the download speed. The Sandbox Git also supports hosting container images.
- Choose a proper way to output reasonable logs during your training, so the build log is not spammed too heavily.
- Training exits after 60 minutes: increase the maximum duration in the CI repository settings.
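One simple way to keep the build log small is to emit a progress line only every N epochs. This generic sketch is not tied to any framework, and the loss value is a placeholder:

```
EPOCHS = 100
LOG_EVERY = 10

messages = []
for epoch in range(1, EPOCHS + 1):
    loss = 1.0 / epoch  # placeholder for a real training metric
    if epoch % LOG_EVERY == 0 or epoch == EPOCHS:
        messages.append(f"epoch {epoch}/{EPOCHS} loss={loss:.4f}")

# 10 lines instead of 100 reach the CI log.
print("\n".join(messages))
```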
## Advanced Parameters (Matrix Workflows)

The Woodpecker CI YAML definition files support [matrix workflows](https://woodpecker-ci.org/docs/usage/matrix-workflows), which execute multiple pipeline runs with all combinations of the predefined variables.
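As a sketch, a matrix with two illustrative variables (`EPOCHS` and `OPTIMIZER`; names and values are made up for this example) could be defined like this:

```
matrix:
  EPOCHS:
    - 10
    - 50
  OPTIMIZER:
    - adam
    - sgd

steps:
  "train":
    image: nvcr.io/nvidia/tensorflow:23.10-tf2-py3
    commands:
      - python run.py --epochs $EPOCHS --optimizer $OPTIMIZER
```

This would start four pipeline runs, one per combination of the two variables.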
See the [test-ci](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/test-ci/src/branch/matrix) matrix branch as an example of defining multiple pipeline runs with different epochs and optimizers. In the CI, each run is shown with a label for its parameters:

![repos](./res/matrix-ci.png)
## Useful Links
- [Sandbox GIT](https://git.sandbox.iuk.hdm-stuttgart.de/)
- [Sandbox CI](https://ci.sandbox.iuk.hdm-stuttgart.de)
- [Git](https://git-scm.com/docs/gittutorial)
- [Woodpecker Syntax](https://woodpecker-ci.org/docs/2.3/usage/workflow-syntax)
- [PyTorch](https://pytorch.org/docs/stable/index.html)
- [TensorFlow](https://www.tensorflow.org/versions/r2.15/api_docs/python/tf)
- [NVIDIA PyTorch Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
- [NVIDIA Tensorflow Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow)
- [Dockerhub](https://hub.docker.com/)
# Use Cases

To do: ipynb example syntax + markdown + tex + voila sliders (interactive dashboards).

## Example Python
Inside the Notebook file you can write normal Python syntax and use plotting libraries to visualize your data and show the insights.

![Sandbox Example Python](res/sandbox_example_python.png)
## Example Markdown

Inside Notebooks it is possible to write Markdown text. This allows users and lecturers to write formatted text inside the code editor, to create and answer assignments.
| Markdown Syntax | Description |
| --- | --- |

If you need more or advanced syntax to format your text with Markdown, have a look
## Example interactive dashboard

The following example shows the use of an interactive dashboard. The user interface makes it possible for the end user to experiment and interact more easily with the Notebook.

![Sandbox Architecture](res/sandbox_example_ui.png)
## Example tex

Ideas:

- image use case example
- python code example
- markdown example (image, headline, bullet points)
- tex
- voila

## Useful Links

- [Example Notebooks](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/notebook-examples)
- [Cheat Sheet](res/cheatsheet.pdf)