Compare commits

..

24 Commits

| Author | SHA1 | Message | CI (ci/woodpecker/push/woodpecker) | Date |
|---|---|---|---|---|
| Cornelius Specht | 696b12481b | add GPU environment | Pipeline was successful | 2024-07-10 07:36:12 +02:00 |
| Malte Grosse | 06dd09f7a0 | fix link | Pipeline was successful | 2024-07-10 07:24:12 +02:00 |
| Malte Grosse | f3cd4f2fff | fix link | Pipeline was successful | 2024-07-10 07:22:51 +02:00 |
| Malte Grosse | 476a2634ac | add ollama | Pipeline was successful | 2024-07-10 07:00:34 +02:00 |
| Malte Grosse | ac5fb71aa7 | update image | Pipeline was successful | 2024-06-24 12:06:23 +02:00 |
| Cornelius Specht | 840b03544f | add cheat_sheet | Pipeline was successful | 2024-06-24 10:50:22 +02:00 |
| Malte Grosse | 273f11b350 | typo | Pipeline was successful | 2024-06-14 13:15:49 +02:00 |
| Malte Grosse | 7aec926276 | added matrix | Pipeline was successful | 2024-06-14 13:13:47 +02:00 |
| Malte Grosse | 9cec953db2 | add ip whitelist | Pipeline was successful | 2024-06-13 14:50:43 +02:00 |
| Malte Grosse | 533c2f298d | add example notebook links | Pipeline was successful | 2024-06-13 14:34:58 +02:00 |
| Malte Grosse | 43801a77f2 | use case fix | Pipeline was successful | 2024-06-13 14:31:03 +02:00 |
| Malte Grosse | a0cefaac7c | rem theme | Pipeline was successful | 2024-06-13 14:26:40 +02:00 |
| Malte Grosse | d63a51e30e | added curl example | Pipeline failed | 2024-06-13 14:25:09 +02:00 |
| Malte Grosse | 82fcdf2910 | last | Pipeline was successful | 2024-05-14 18:47:06 +09:00 |
| Malte Grosse | d8bb1aa7a2 | light | Pipeline was successful | 2024-05-14 18:44:48 +09:00 |
| Malte Grosse | b7bcf231c3 | light theme | Pipeline was successful | 2024-05-14 18:40:45 +09:00 |
| Malte Grosse | f2ea003553 | fixed | Pipeline was successful | 2024-05-14 18:38:48 +09:00 |
| Malte Grosse | ee8156c522 | docker fixed | Pipeline was successful | 2024-05-14 18:34:47 +09:00 |
| Malte Grosse | d3a2521828 | t | Pipeline failed | 2024-05-14 18:33:31 +09:00 |
| Malte Grosse | 237fd559bb | t | Pipeline was successful | 2024-05-14 18:31:13 +09:00 |
| Malte Grosse | da6e7324f1 | test | Pipeline failed | 2024-05-14 18:30:12 +09:00 |
| Malte Grosse | 0302ee633d | x86 only | Pipeline failed | 2024-05-14 18:27:39 +09:00 |
| Malte Grosse | 3c0cce77a9 | pipeline run | Pipeline failed | 2024-05-14 18:26:28 +09:00 |
| Malte Grosse | 73cd54c042 | Merge pull request 'dev_environment' (#2) from dev_environment into master (Reviewed-on: #2) | — | 2024-05-14 09:23:09 +00:00 |
15 changed files with 76 additions and 48 deletions

View File

@@ -1,18 +1,21 @@
pipeline:
steps:
create-book:
image: peaceiris/mdbook:v0.4.30
commands:
- mdbook init --theme light
- mdbook build
publish-container:
image: woodpeckerci/plugin-docker-buildx:2.1.0
secrets: [docker_username, docker_password]
group: docker
build_and_release:
image: maltegrosse/woodpecker-buildah:0.0.12
settings:
registry: https://git.sandbox.iuk.hdm-stuttgart.de
repo: git.sandbox.iuk.hdm-stuttgart.de/grosse/sandbox-docs-public
dockerfile: Dockerfile
tags: latest
registry: git.sandbox.iuk.hdm-stuttgart.de
repository: grosse/sandbox-docs-public
tag: latest
architectures: amd64
context: Dockerfile
imagename: sandbox-docs-public
username:
from_secret: docker_username
password:
from_secret: docker_password
branches:
exclude: cspecht

View File

@@ -1,7 +1,5 @@
FROM nginx:alpine3.17-slim
WORKDIR /app
COPY . .
COPY ./nginx.conf /etc/nginx/nginx.conf
COPY ./book /app/static

View File

@@ -1,3 +1,4 @@
[book]
title = "Sandbox Documentation"
language = "en"
[output.html]

View File

@@ -9,9 +9,9 @@
- [Software](architecture/software.md)
# Playground
- [Overview](sandbox/overview.md)
- [Development Environment](sandbox/dev_env.md)
- [Use Cases](sandbox/use_cases.md)
- [Development Environment](sandbox/dev_env.md)
- [Data Pool](sandbox/data_pool.md)
- [Training Environment](sandbox/training.md)
- [Data Generator](sandbox/data_generator.md)

View File

@@ -1,7 +1,7 @@
# Architecture Overview
## Conceptual Design
![Sandbox Architecture](res/sandbox-architecture.png)
![Sandbox Architecture](res/sb-overview.png)
## Use Cases

Binary file not shown (image added, 246 KiB).

View File

@@ -11,7 +11,7 @@ To store the data inside the Sandbox, you just have to drag & drop or click on t
## Share CLI (Headless)
To use the headless object storage, you can upload a file via the REST interface or curl. The JSON response message provides the destination URL.
To use the headless object storage, you can upload a file via the REST interface or curl. The JSON response message provides the destination URL. The upload is only available from within the Sandbox.
**Upload Example**
```python
@@ -22,10 +22,17 @@ To use the headless object storage, you can upload a file via REST-Interface or
files = {'fileUpload': (filename, open(filename, 'rb'),'text/csv')}
r = requests.post(url, files=files)
```
or using curl as a command-line tool:
```
curl -F fileUpload=@file.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload
```
**Example JSON response**
```json
{
"PublicUrl": "https://storage.sandbox.iuk.hdm-stuttgart.de/upload/a1236b2b-49bf-4047-a536-20dab15b7777/untitled.txt",
"PublicUrl": "https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload/f2a69e9a-f60b-418d-a678-efce181fb8a5/untitled.txt",
"Size": 11,
"Expiration": "2023-10-04T00:00:00Z"
}
```
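For illustration only (an editor's sketch, not part of the change set above): the upload and the JSON response can be combined in a few lines of Python. The endpoint follows the curl example above, and `results.csv` is a placeholder filename.

```python
# Sketch: upload a file to the share storage and read the returned metadata.
# Endpoint as in the curl example above; "results.csv" is a placeholder filename.
import requests

url = "https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload"
filename = "results.csv"

with open(filename, "rb") as f:
    r = requests.post(url, files={"fileUpload": (filename, f, "text/csv")})

r.raise_for_status()
info = r.json()
print(info["PublicUrl"], info["Size"], info["Expiration"])
```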

View File

@@ -60,6 +60,13 @@ Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterla
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occur.
#### GPU Ollama environment
* The available Ollama image is based on the [Datascience GPU Container](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/-/packages/container/jupyterlab-datascience-gpu/latest)
* Support for NVIDIA A100 GPU computation is included via the Python library PyTorch (see the short check after this list).
Please open an [issue](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/jupyterlab-datascience-gpu/issues/new) if any additional packages are needed or issues occur.
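As a quick, non-authoritative sketch of what the advertised PyTorch/A100 support means in practice, the following check (assuming a CUDA-enabled PyTorch build in the image) confirms that the GPU is visible and usable:

```python
# Check that PyTorch can see and use the GPU (assumes a CUDA-enabled build).
import torch

print(torch.cuda.is_available())              # True if a GPU is visible
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))      # e.g. an NVIDIA A100
    x = torch.randn(1024, 1024, device="cuda")
    print((x @ x).sum().item())               # runs a matrix multiply on the GPU
```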
### Resources
Each instance has the following resource limits:
- maximum 2 physical CPUs, guaranteed 0.5 CPUs
@@ -97,9 +104,29 @@ Please carefully follow the terms and conditions at [Sandbox](https://sandbox.iu
- all my data is gone after the semester break: the persistent storage of every(!) user gets recycled at each semester break. Please back up your data locally if needed
- package conflicts: a common issue is installing an unspecified library version; please specify the version or upgrade all dependencies manually.
## Large-Language-Models
An advanced PyTorch development environment comes preinstalled with [Ollama](https://ollama.com/), which makes it easy to download and run different LLMs.
If you want to run Ollama including the [OpenWebUI](https://openwebui.com/), the following commands need to be executed in separate terminal windows:
![terminal](res/sandbox-terminal.png "Terminal")
1. execute `ollama serve` to start the Ollama backend
2. execute `ollama run mistral:7b` in a second terminal to download and run a specific model (a small API sketch follows after this list).
The following steps are only needed if you want to access the WebUI. They also showcase how other HTTP services can temporarily be tunneled and exposed to the public.
![tunnel](res/sandbox-tunnel.png "Tunnel")
3. execute `open-webui serve` in another terminal to serve the WebUI locally on port 8080.
4. visit [tun.iuk.hdm-stuttgart.de](https://tun.iuk.hdm-stuttgart.de) in your browser to obtain a token
5. execute `pgrok init --remote-addr tun.iuk.hdm-stuttgart.de:80 --forward-addr https://{user}.tun.iuk.hdm-stuttgart.de --token {token}` in the terminal. Replace `{user}` and `{token}` with your username and the previously obtained token.
6. execute `pgrok http 8080` to run the tunnel and expose the WebUI. Now you are able to access the WebUI at `https://{user}.tun.iuk.hdm-stuttgart.de`.
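As a hedged sketch (not part of the original steps): once `ollama serve` is running and the model from step 2 has been pulled, it can also be queried programmatically via Ollama's local REST API on port 11434; the prompt below is just a placeholder.

```python
# Sketch: query the locally served model via Ollama's REST API (default port 11434).
import requests

resp = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "mistral:7b",                             # model pulled in step 2
        "prompt": "Explain overfitting in one sentence.",  # placeholder prompt
        "stream": False,                                   # return one JSON object instead of a stream
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json()["response"])
```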
### Notes
Please use the tunnel only temporarily and carefully. It only supports HTTP(S) tunnels. Tunneled services are publicly available and accessible by anyone! If you want to train or fine-tune any LLMs, please use the [Training Environment](training.md) instead.
## Useful Links
- [Jupyter Documentation](https://docs.jupyter.org/en/latest/)
- [pip](https://pip.pypa.io/en/stable/user_guide/)
- [python](https://docs.python.org/3.11/)
- [Ollama](https://ollama.com/)

View File

@@ -1,18 +0,0 @@
# Overview Sandbox
![Sandbox Architecture](res/sandbox-architecture.png)
## Use Cases
A brief overview how to create [Use Cases](use_cases.md) on the Sandbox
## Development Environment
A [Development Environment](dev_env.md) or Playground for beginners and advanced users.
## Data Pool
Details about how to store and manage data on the Sandbox: Additional information: [Data Pool](data_pool.md)
## Training Environment
Detailed Information about how to perform a GPU supported model training on the Sandbox Environment.

Binary file not shown.

Binary file not shown (image added, 501 KiB).

Binary file not shown (image added, 12 KiB).

Binary file not shown (image added, 71 KiB).

View File

@@ -58,7 +58,8 @@ steps:
```
See the official [documentation](https://woodpecker-ci.org/docs/usage/workflow-syntax) for the syntax.
Generally, the pipeline is based on different steps, and in each step another container environment can be chosen. In the example above, an official TensorFlow container with Python 3 is first used to run the training Python script. In the second step, the model is compressed and pushed to the temporary sandbox storage.
Generally, the pipeline is based on different steps, and in each step another container environment can be chosen. In the example above, an official TensorFlow container with Python 3 is first used to run the training Python script (a minimal training-script sketch follows after the steps below). In most cases you can find predefined containers on [Dockerhub](https://hub.docker.com/) or GPU-enabled containers at [NVIDIA](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch). If needed, custom images can be created and stored internally (in the Sandbox Git package registry) or in any other publicly available container registry. In the second step, the model is compressed and pushed to the temporary sandbox storage.
3. Commit and push
4. See current state of the pipelines at the [overview site](https://ci.sandbox.iuk.hdm-stuttgart.de/repos)
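For illustration only (an editor's sketch, not taken from the repository): a minimal training script for the first pipeline step could look like the following, assuming the TensorFlow container mentioned above; the dataset and output filename are placeholders.

```python
# Hypothetical minimal train.py for the first pipeline step (placeholder data and paths).
import tensorflow as tf

# Tiny subset of MNIST so the sketch runs quickly; a real job would load its own data.
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train, y_train = x_train[:1000] / 255.0, y_train[:1000]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, epochs=1)

# Save the model so a later step can compress and upload it to the sandbox storage.
model.save("model.keras")
```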
@@ -95,6 +96,12 @@ which returns a json with the download url of your uploaded file.
- Choose a proper way to output some reasonable logs during your training, so it won't spam the logs too heavily
- training exits after 60 minutes: increase the maximum duration in the CI repository settings
## Advanced Parameters (Matrix Workflows)
The Woodpecker CI YAML definition files support [matrix workflows](https://woodpecker-ci.org/docs/usage/matrix-workflows), so that multiple pipeline runs are executed with all combinations of the predefined variables.
See the [test-ci](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/test-ci/src/branch/matrix) matrix branch as an example of defining multiple pipeline runs with different epochs and optimizers. In the CI, each parameter combination is shown with its own label:
![repos](./res/matrix-ci.png)
## Useful Links
- [Sandbox GIT](https://git.sandbox.iuk.hdm-stuttgart.de/)
- [Sandbox CI](https://ci.sandbox.iuk.hdm-stuttgart.de)
@@ -104,3 +111,4 @@ - [TensorFlow](https://www.tensorflow.org/versions/r2.15/api_docs/python/tf)
- [TensorFlow](https://www.tensorflow.org/versions/r2.15/api_docs/python/tf)
- [NVIDIA PyTorch Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch)
- [NVIDIA Tensorflow Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tensorflow)
- [Dockerhub](https://hub.docker.com/)

View File

@@ -1,14 +1,11 @@
# Use Cases
tbd
ipynb example syntax + markdown + tex + voila slider (interactive dashboards)
# Example Python
## Example Python
Inside the notebook file you can write normal Python syntax and use plotting libraries to visualize your data and show insights.
![Sandbox Example Python](res/sandbox_example_python.png)
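A small illustrative sketch of such a plotting cell (the data below is synthetic, not taken from the screenshot):

```python
# Minimal plotting example for a notebook cell; the data is synthetic.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0, 10, 200)
plt.plot(x, np.sin(x), label="sin(x)")
plt.plot(x, np.cos(x), label="cos(x)")
plt.xlabel("x")
plt.ylabel("value")
plt.legend()
plt.show()
```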
# Example Markdown
## Example Markdown
Inside notebooks it's possible to write Markdown text. This allows users and lecturers to write formatted text inside the code editor, e.g. to create and answer assignments.
| Markdown Syntax | Description |
@@ -33,8 +30,13 @@ if you need more or advanced syntax to format your text with markdown have a loo
# Example interactive dashboard
## Example interactive dashboard
The following example shows the use of an interactive dashboard. The user interface makes it possible for the end user to experiment and interact more easily with the notebook.
![Sandbox Architecture](res/sandbox_example_ui.png)
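As a hedged sketch of such an interactive element (assuming the `ipywidgets` package is available in the environment):

```python
# Minimal interactive slider in a notebook; assumes ipywidgets is installed.
import numpy as np
import matplotlib.pyplot as plt
from ipywidgets import interact

def plot_wave(frequency=1.0):
    x = np.linspace(0, 2 * np.pi, 400)
    plt.plot(x, np.sin(frequency * x))
    plt.title(f"sin({frequency:.1f} x)")
    plt.show()

# Dragging the slider re-renders the plot with the chosen frequency.
interact(plot_wave, frequency=(0.5, 5.0, 0.5))
```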
## Useful Links
- [Example Notebooks](https://git.sandbox.iuk.hdm-stuttgart.de/grosse/notebook-examples)
- [Cheat Sheet](res/cheatsheet.pdf)