Compare commits

No commits in common. "master" and "dev_environment" have entirely different histories.

15 changed files with 48 additions and 76 deletions

@ -1,21 +1,18 @@
steps: pipeline:
create-book: create-book:
image: peaceiris/mdbook:v0.4.30 image: peaceiris/mdbook:v0.4.30
commands: commands:
- mdbook init --theme light
- mdbook build - mdbook build
build_and_release:
image: maltegrosse/woodpecker-buildah:0.0.12 publish-container:
image: woodpeckerci/plugin-docker-buildx:2.1.0
secrets: [docker_username, docker_password]
group: docker
settings: settings:
registry: git.sandbox.iuk.hdm-stuttgart.de registry: https://git.sandbox.iuk.hdm-stuttgart.de
repository: grosse/sandbox-docs-public repo: git.sandbox.iuk.hdm-stuttgart.de/grosse/sandbox-docs-public
tag: latest dockerfile: Dockerfile
architectures: amd64 tags: latest
context: Dockerfile
imagename: sandbox-docs-public
username:
from_secret: docker_username
password:
from_secret: docker_password
branches:
exclude: cspecht

@ -1,5 +1,7 @@
FROM nginx:alpine3.17-slim FROM nginx:alpine3.17-slim
WORKDIR /app WORKDIR /app
COPY . .
COPY ./nginx.conf /etc/nginx/nginx.conf COPY ./nginx.conf /etc/nginx/nginx.conf
COPY ./book /app/static COPY ./book /app/static

@ -1,4 +1,3 @@
[book] [book]
title = "Sandbox Documentation" title = "Sandbox Documentation"
language = "en" language = "en"
[output.html]

@ -9,9 +9,9 @@
- [Software](architecture/software.md) - [Software](architecture/software.md)
# Playground # Playground
- [Overview](sandbox/overview.md)
- [Use Cases](sandbox/use_cases.md)
- [Development Environment](sandbox/dev_env.md) - [Development Environment](sandbox/dev_env.md)
- [Use Cases](sandbox/use_cases.md)
- [Data Pool](sandbox/data_pool.md) - [Data Pool](sandbox/data_pool.md)
- [Training Environment](sandbox/training.md) - [Training Environment](sandbox/training.md)
- [Data Generator](sandbox/data_generator.md)

@ -1,7 +1,7 @@
# Architecture Overview # Architecture Overview
## Conceptual Design ## Conceptual Design
![Sandbox Architecture](res/sb-overview.png) ![Sandbox Architecture](res/sandbox-architecture.png)
## Use Cases ## Use Cases

Binary file not shown (size before: 246 KiB).

@ -11,7 +11,7 @@ To store the data inside the Sandbox, you just have to drag & drop or click on t
## Share CLI (Headless) ## Share CLI (Headless)
To use the headless object storage, you can upload a file via the REST interface or curl. The JSON response message provides the destination URL. The upload is only available from the Sandbox. To use the headless object storage, you can upload a file via the REST interface or curl. The JSON response message provides the destination URL.
**Upload Example** **Upload Example**
```python ```python
@ -22,17 +22,10 @@ To use the headless object storage, you can upload a file via REST-Interface or
files = {'fileUpload': (filename, open(filename, 'rb'),'text/csv')} files = {'fileUpload': (filename, open(filename, 'rb'),'text/csv')}
r = requests.post(url, files=files) r = requests.post(url, files=files)
``` ```
or using curl as command line tool:
```
curl -F fileUpload=@file.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload
```
**Example JSON response** **Example JSON response**
```json ```json
{ {
"PublicUrl": "https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload/f2a69e9a-f60b-418d-a678-efce181fb8a5/untitled.txt", "PublicUrl": "https://storage.sandbox.iuk.hdm-stuttgart.de/upload/a1236b2b-49bf-4047-a536-20dab15b7777/untitled.txt",
"Size": 11, "Size": 11,
"Expiration": "2023-10-04T00:00:00Z" "Expiration": "2023-10-04T00:00:00Z"
} }

@ -60,13 +60,6 @@ Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterla
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred. Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred.
#### GPU Ollama environment
* The available Ollama image is based on the [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest)
* Support is included for NVIDIA A100 GPU computations based on the Python library PyTorch.
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occurred.
### Resources ### Resources
Each instance has the following resource limits: Each instance has the following resource limits:
- maximum 2 physical CPUs, guaranteed 0.5 CPUs - maximum 2 physical CPUs, guaranteed 0.5 CPUs
@ -104,29 +97,9 @@ Please carefully follow the terms and conditions at [Sandbox](https://sandbox.iu
- all my data is gone after the semester break: the persistent storage of every(!) user is recycled each semester break. Please back up your data locally if needed - all my data is gone after the semester break: the persistent storage of every(!) user is recycled each semester break. Please back up your data locally if needed
- package conflicts: a common cause is installing a library without specifying its version; please pin or upgrade all dependencies manually. - package conflicts: a common cause is installing a library without specifying its version; please pin or upgrade all dependencies manually.
## Large-Language-Models
An advanced PyTorch development environment is preinstalled with [Ollama](https://ollama.com/), which makes it easy to download and run different LLMs.
If you want to run Ollama together with the [OpenWebUI](https://openwebui.com/), the following commands need to be executed in multiple terminal windows:
![terminal](res/sandbox-terminal.png "Terminal")
1. execute `ollama serve` to start the ollama backend
2. execute `ollama run mistral:7b` to download and run a specific model in a second terminal.
The following steps are only needed if you want to access the WebUI. They also showcase how other HTTP services can temporarily be tunneled and exposed to the public.
![tunnel](res/sandbox-tunnel.png "Tunnel")
3. execute `open-webui serve` to serve the WebUI locally on port 8080 in another terminal.
4. visit [tun.iuk.hdm-stuttgart.de](https://tun.iuk.hdm-stuttgart.de) to obtain a token with your browser
5. execute `pgrok init --remote-addr tun.iuk.hdm-stuttgart.de:80 --forward-addr https://{user}.tun.iuk.hdm-stuttgart.de --token {token}` in the terminal. Replace `{user}` and `{token}` with your username and the previously obtained token.
6. execute `pgrok http 8080` to run the tunnel and expose the webui. Now you are able to access the webui at ´https://{user}.tun.iuk.hdm-stuttgart.
### Notes
Please use the tunnel only temporarily and carefully. The tunnel only supports http(s) traffic. Tunneled services are publicly available and accessible by anyone! If you want to train/finetune any LLMs, please use the [Training Environment](training.md) instead.
## Useful Links ## Useful Links
- [Jupyter Documentation](https://docs.jupyter.org/en/latest/) - [Jupyter Documentation](https://docs.jupyter.org/en/latest/)
- [pip](https://pip.pypa.io/en/stable/user_guide/) - [pip](https://pip.pypa.io/en/stable/user_guide/)
- [python](https://docs.python.org/3.11/) - [python](https://docs.python.org/3.11/)
- [Ollama](https://ollama.com/)

src/sandbox/overview.md (new file, 18 lines)

@ -0,0 +1,18 @@
# Overview Sandbox
![Sandbox Architecture](res/sandbox-architecture.png)
## Use Cases
A brief overview of how to create [Use Cases](use_cases.md) on the Sandbox.
## Development Environment
A [Development Environment](dev_env.md) or Playground for beginners and advanced users.
## Data Pool
Details about how to store and manage data on the Sandbox: [Data Pool](data_pool.md)
## Training Environment
Detailed information about how to perform GPU-supported model training on the Sandbox environment.

Binary file not shown.

Binary file not shown (size before: 501 KiB).

Binary file not shown (size before: 12 KiB).

Binary file not shown (size before: 71 KiB).

@ -58,8 +58,7 @@ steps:
``` ```
See the official [documentation](https://woodpecker-ci.org/docs/usage/workflow-syntax) for the syntax. See the official [documentation](https://woodpecker-ci.org/docs/usage/workflow-syntax) for the syntax.
Generally, the pipeline consists of several steps, and each step can use a different container environment. In the example above, an official TensorFlow container with Python 3 is used first to run the training script. In most cases you can find predefined containers at [Dockerhub](https://hub.docker.com/) or GPU-supported containers at [NVIDIA](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch). If needed, custom images can be created and stored internally (in the Sandbox Git package repository) or in any other publicly available container repository. In the second step, the model is compressed and pushed to the temporary Sandbox storage. Generally, the pipeline consists of several steps, and each step can use a different container environment. In the example above, an official TensorFlow container with Python 3 is used first to run the training script. In the second step, the model is compressed and pushed to the temporary Sandbox storage.
3. Commit and push 3. Commit and push
4. See current state of the pipelines at the [overview site](https://ci.sandbox.iuk.hdm-stuttgart.de/repos) 4. See current state of the pipelines at the [overview site](https://ci.sandbox.iuk.hdm-stuttgart.de/repos)
@ -96,12 +95,6 @@ which returns a json with the download url of your uploaded file.
- Choose a reasonable logging level for your training so it won't spam the logs too heavily - Choose a reasonable logging level for your training so it won't spam the logs too heavily
- training exits after 60 minutes: increase the maximum duration in the CI repository settings - training exits after 60 minutes: increase the maximum duration in the CI repository settings
## Advanced Parameters (Matrix Workflows)
The Woodpecker CI YAML definition files support [matrix workflows](https://woodpecker-ci.org/docs/usage/matrix-workflows), so that one pipeline run is executed for every combination of the predefined variables.
See the matrix branch of [test-ci](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/test-ci/src/branch/matrix) for an example that defines multiple pipeline runs with different epochs and optimizers. In the CI, each run is shown with a label for each parameter:
![repos](./res/matrix-ci.png)
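To illustrate what "all combinations" means here, the following is a minimal sketch of the expansion in plain Python. It does not call Woodpecker; the variable names `EPOCHS` and `OPTIMIZER` and their values are hypothetical stand-ins for whatever the matrix section of a workflow file defines.

```python
from itertools import product


def expand_matrix(matrix: dict) -> list[dict]:
    """Return one dict of variable assignments per pipeline run.

    Each run receives exactly one value for every matrix variable, so the
    number of runs is the product of the lengths of the value lists.
    """
    keys = list(matrix)
    value_lists = [matrix[key] for key in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


# Hypothetical matrix: two epoch settings combined with two optimizers.
matrix = {"EPOCHS": [10, 20], "OPTIMIZER": ["adam", "sgd"]}
runs = expand_matrix(matrix)

for run in runs:
    print(run)
```

Two variables with two values each yield four runs, so the total training time and GPU usage grow multiplicatively with every variable added to the matrix.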
## Useful Links ## Useful Links
- [Sandbox GIT](https://git.sandbox.iuk.hdm-stuttgart.de/) - [Sandbox GIT](https://git.sandbox.iuk.hdm-stuttgart.de/)
- [Sandbox CI](https://ci.sandbox.iuk.hdm-stuttgart.de) - [Sandbox CI](https://ci.sandbox.iuk.hdm-stuttgart.de)
@ -111,4 +104,3 @@ See the [test-ci](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/test-ci/src/br
- [TensorFlow](https://www.tensorflow.org/versions/r2.15/api_docs/python/tf) - [TensorFlow](https://www.tensorflow.org/versions/r2.15/api_docs/python/tf)
- [NVIDIA PyTorch Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) - [NVIDIA PyTorch Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
- [NVIDIA Tensorflow Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow) - [NVIDIA Tensorflow Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow)
- [Dockerhub](https://hub.docker.com/)

@ -1,11 +1,14 @@
# Use Cases # Use Cases
## Example Python tbd
ipynb example syntax + markdown + tex + voila slider (interactive dashboards)
# Example Python
Inside the Notebook file you can write normal Python syntax and use plotting libraries to visualize your data and show the insights. Inside the Notebook file you can write normal Python syntax and use plotting libraries to visualize your data and show the insights.
![Sandbox Example Python](res/sandbox_example_python.png) ![Sandbox Example Python](res/sandbox_example_python.png)
## Example Markdown # Example Markdown
Inside Notebooks it's possible to write Markdown text. This allows users and lecturers to write formatted text inside the code editor, to create and answer assignments. Inside Notebooks it's possible to write Markdown text. This allows users and lecturers to write formatted text inside the code editor, to create and answer assignments.
| Markdown Syntax | Description | | Markdown Syntax | Description |
@ -30,13 +33,8 @@ if you need more or advanced syntax to format your text with markdown have a loo
## Example interactive dashboard # Example interactive dashboard
The following example shows the use of an interactive dashboard. The user interface makes it easier for the end user to experiment and interact with the Notebook. The following example shows the use of an interactive dashboard. The user interface makes it easier for the end user to experiment and interact with the Notebook.
![Sandbox Architecture](res/sandbox_example_ui.png) ![Sandbox Architecture](res/sandbox_example_ui.png)
## Useful Links
- [Example Notebooks](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/notebook-examples)
- [Cheat Sheet](res/cheatsheet.pdf)