grosse #1
			
				
			
		
		
		
	
										
											Binary file not shown.
										
									
								
							| 
		 After Width: | Height: | Size: 20 KiB  | 
										
											Binary file not shown.
										
									
								
							| 
		 After Width: | Height: | Size: 10 KiB  | 
| 
						 | 
				
			
			@ -1,4 +1,94 @@
 | 
			
		|||
# Training Environment 
 | 
			
		||||
 | 
			
		||||
Currently under heavy development, coming soon....stayed tuned!
 | 
			
		||||
This documentation is for advanced users which are aware of following tools: git, python/R, cuda, pytorch/tensorflow and basic container knowledge.
 | 
			
		||||
 | 
			
		||||
## Overview
 | 
			
		||||
Available are two worker agents with 
 | 
			
		||||
- 12 CPUs 
 | 
			
		||||
- 40 GB memory
 | 
			
		||||
- 20 GB Nvidia GPU memory
 | 
			
		||||
- 100 GB Hdd Diskspace
 | 
			
		||||
 | 
			
		||||
Only two pipelines can run in parallel to ensure having the promised hardware resources. If more jobs occur, they will be stored in a queue and released after the fifo principle.
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
## Development
 | 
			
		||||
 | 
			
		||||
### Git
 | 
			
		||||
Create a new git repository and commit your latest code here: https://git.sandbox.iuk.hdm-stuttgart.de/
 | 
			
		||||
 | 
			
		||||
Repositories can be private or public - depends on your use case.
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
### CI
 | 
			
		||||
Connect your newly created repository here: https://ci.sandbox.iuk.hdm-stuttgart.de/ 
 | 
			
		||||
1. After login, click on "+ Add repository"
 | 
			
		||||

 | 
			
		||||
2. Enable the specific repository
 | 
			
		||||
 | 
			
		||||
3. Go to the repositories [overview site](https://ci.sandbox.iuk.hdm-stuttgart.de/repos) and select your enabled repository
 | 
			
		||||
4. Go to settings (clicking the settings icon)
 | 
			
		||||

 | 
			
		||||
5. Set a reasonable timeout in minutes (e.g. 360 minutes for 6hours) if some training crashes/hangs
 | 
			
		||||
6. Add additional settings like secrets or container registries, see the official [documentation](https://woodpecker-ci.org/docs/usage/project-settings) for additional settings
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
### Pipeline File
 | 
			
		||||
An example script can be found here:
 | 
			
		||||
 | 
			
		||||
 https://git.sandbox.iuk.hdm-stuttgart.de/grosse/test-ci
 | 
			
		||||
 | 
			
		||||
 | 
			
		||||
1. Create a new file in your repository `.woodpecker.yml` (of different regarding repository settings above)
 | 
			
		||||
2. The content can look like following:
 | 
			
		||||
 | 
			
		||||
```
 | 
			
		||||
steps:
 | 
			
		||||
  "train":
 | 
			
		||||
    image: nvcr.io/nvidia/tensorflow:23.10-tf2-py3
 | 
			
		||||
    commands:
 | 
			
		||||
      - echo "starting python script"
 | 
			
		||||
      - python run.py
 | 
			
		||||
  "compress and upload":
 | 
			
		||||
    image: alpine:3
 | 
			
		||||
    commands:
 | 
			
		||||
      - apk --no-cache add zip curl
 | 
			
		||||
      - zip mymodel.zip mymodel.keras
 | 
			
		||||
      - curl -F fileUpload=@mymodel.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload
 | 
			
		||||
```
 | 
			
		||||
See the official [documentation](https://woodpecker-ci.org/docs/usage/workflow-syntax) for the syntax.
 | 
			
		||||
 | 
			
		||||
Generally, the pipeline is based on different steps, and in each step, another container environment can be chosen. In the example above, first an official tensorflow container with python 3 is used to run the training python script. In the second step, the model gets compressed and pushed on the temp sandbox storage.
 | 
			
		||||
3. Commit and push
 | 
			
		||||
4. See current state of the pipelines at the [overview site](https://ci.sandbox.iuk.hdm-stuttgart.de/repos)
 | 
			
		||||
 | 
			
		||||
Hints: 
 | 
			
		||||
- The first time an external container is pulled, depending on the size, container images can take quite a while as different organization (like dockerhub) limit the download speed. The Sandbox git also supports hosting container images...
 | 
			
		||||
- Choose a proper way to output some reasonable logs during your training, so it wont spam the logs too heavily
 | 
			
		||||
 | 
			
		||||
### Exporting trained model
 | 
			
		||||
We provide a 3-months disposal internal storage.
 | 
			
		||||
You can either use the a simple curl command `curl -F fileUpload=@mymodel.zip https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload` to upload a file or a simple python script 
 | 
			
		||||
 | 
			
		||||
```
 | 
			
		||||
import requests
 | 
			
		||||
import os
 | 
			
		||||
 | 
			
		||||
myurl = 'https://share.storage.sandbox.iuk.hdm-stuttgart.de/upload'
 | 
			
		||||
 | 
			
		||||
print("uploading file")
 | 
			
		||||
files = {
 | 
			
		||||
    'fileUpload':('mymodel.keras',  open('mymodel.keras', 'rb'),'application/octet-stream')
 | 
			
		||||
}
 | 
			
		||||
 | 
			
		||||
response = requests.post(myurl, files=files)
 | 
			
		||||
print(response,response.text)
 | 
			
		||||
 | 
			
		||||
```
 | 
			
		||||
 | 
			
		||||
which returns a json
 | 
			
		||||
 | 
			
		||||
```
 | 
			
		||||
{"PublicUrl":"https://storage.sandbox.iuk.hdm-stuttgart.de/upload/49676006-94e4-4da6-be3f-466u786768979/mymodel.keras","Size":97865925,"Expiration":"2024-03-30T00:00:00Z"}
 | 
			
		||||
 | 
			
		||||
```
 | 
			
		||||
		Loading…
	
		Reference in New Issue