chore: initialize screenjob project baseline

Space-Banane
2026-05-27 17:31:49 +02:00
commit 84b0df520c
9 changed files with 1045 additions and 0 deletions

README.md (normal file, 94 lines changed)

@@ -0,0 +1,94 @@
# ScreenJob
ScreenJob keeps its single-file behavior, with the code split into maintainable modules under `src/`.
## Entry point
- Primary: `python main.py "<task>"`
- Backward compatible: `python screenjob.py "<task>"`
## Install
```powershell
pip install openai pillow pyautogui python-dotenv
```
## Configure
Create a `.env` file in project root:
```env
OPENAI_API_KEY=your_key_here
```
## Usage
```powershell
python main.py "Open amazon.de and go to my orders"
```
Optional flags:
```powershell
python main.py "Open amazon.de" --model gpt-5.2 --max-steps 80
```
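The flags above could be parsed with `argparse` roughly as follows. This is a minimal sketch, not the contents of `src/cli.py`: the function name `build_parser` and the default values are assumptions taken from the example command, since the README does not state the real defaults.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the flags shown in the usage examples."""
    parser = argparse.ArgumentParser(
        prog="main.py", description="Run a ScreenJob task."
    )
    parser.add_argument("task", help="Natural-language task for the agent")
    # Placeholder defaults (assumption): copied from the example command,
    # not confirmed as the project's actual defaults.
    parser.add_argument("--model", default="gpt-5.2", help="Model name")
    parser.add_argument(
        "--max-steps", type=int, default=80, help="Upper bound on agent steps"
    )
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["Open amazon.de", "--max-steps", "40"])
print(args.task, args.model, args.max_steps)
```

Note that `argparse` exposes `--max-steps` as the attribute `max_steps`.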
## Tools exposed to the model
- `execute_command(command)`
- `sleep(seconds)` (replaces shell-based sleep calls)
- `see_screen()`
- `enhance(coordinate)`
- `click(coordinate, offset_up/down/left/right, sleep_after_seconds)`
- `type(text)`
- `press_key(key, repeats=1)`
- `task_complete(result)`
### Offset examples
- `{"coordinate":{"x":1000,"y":500},"offset_up":"2px"}`
- `{"coordinate":{"x":1000,"y":500},"offset_right":4}`
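One way the offset fields might be resolved into a final click position is sketched below. The helper names `parse_offset` and `apply_offsets` are hypothetical, and the sketch assumes screen coordinates where `y` grows downward, so `offset_up` subtracts from `y`. The README does not confirm either detail.

```python
from typing import Union

Offset = Union[int, float, str]


def parse_offset(value: Offset) -> int:
    """Accept 4, 4.0, "4" or "4px" and return whole pixels."""
    if isinstance(value, str):
        text = value.strip().lower()
        if text.endswith("px"):
            text = text[:-2]
        return int(round(float(text)))
    return int(round(value))


def apply_offsets(args: dict) -> tuple:
    """Return the (x, y) position after applying all directional offsets."""
    x = int(args["coordinate"]["x"])
    y = int(args["coordinate"]["y"])
    # Assumption: y grows downward on screen, so "up" means a smaller y.
    y -= parse_offset(args.get("offset_up", 0))
    y += parse_offset(args.get("offset_down", 0))
    x -= parse_offset(args.get("offset_left", 0))
    x += parse_offset(args.get("offset_right", 0))
    return x, y


print(apply_offsets({"coordinate": {"x": 1000, "y": 500}, "offset_up": "2px"}))
print(apply_offsets({"coordinate": {"x": 1000, "y": 500}, "offset_right": 4}))
```

Under these assumptions the two README examples resolve to `(1000, 498)` and `(1004, 500)`.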
### Multi-tool calls in one step
The agent supports multiple tool calls in a single model response and executes them in order.
Example sequence in one step:
1. `click(...)`
2. `sleep({"seconds": 1.5})`
You can also use `click(..., sleep_after_seconds=1.5)` for a one-call variant.
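The in-order execution described above can be illustrated with a small dispatcher. This is a sketch under assumptions: `run_tool_calls`, the shape of each call (`name` plus JSON `arguments`), and the stand-in tools are all hypothetical and do not reflect the actual code in `src/agent.py`.

```python
import json
from typing import Any, Callable


def run_tool_calls(calls: list, tools: dict) -> list:
    """Execute tool calls sequentially, in the order the model returned them."""
    results = []
    for call in calls:
        handler: Callable[..., Any] = tools.get(call["name"])
        if handler is None:
            results.append({"name": call["name"], "error": "unknown tool"})
            continue
        raw = call["arguments"]
        # Arguments may arrive as a JSON string or as an already-parsed dict.
        kwargs = json.loads(raw) if isinstance(raw, str) else raw
        results.append({"name": call["name"], "output": handler(**kwargs)})
    return results


# Stand-in tools that only record events; no real clicking or sleeping.
events = []


def fake_click(coordinate, sleep_after_seconds=0.0):
    events.append(("click", coordinate["x"], coordinate["y"]))
    return "clicked"


def fake_sleep(seconds):
    events.append(("sleep", seconds))
    return "slept"


calls = [
    {"name": "click", "arguments": '{"coordinate": {"x": 1000, "y": 500}}'},
    {"name": "sleep", "arguments": '{"seconds": 1.5}'},
]
results = run_tool_calls(calls, {"click": fake_click, "sleep": fake_sleep})
print(events)
```

Because the loop is strictly sequential, the `sleep` call only starts after `click` has returned.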
## Output
Each run creates:
- `screenjob_runs/run_YYYYMMDD_HHMMSS/logs/screenjob.log`
- `screenjob_runs/run_YYYYMMDD_HHMMSS/screens/*.png`
- `screenjob_runs/run_YYYYMMDD_HHMMSS/enhanced/*.png`
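A run directory with this layout could be created along these lines. The function name `create_run_dirs` is hypothetical, and the sketch assumes the timestamp is local time formatted as `%Y%m%d_%H%M%S`, which matches the `YYYYMMDD_HHMMSS` pattern but is not confirmed by the README.

```python
import tempfile
from datetime import datetime
from pathlib import Path
from typing import Optional


def create_run_dirs(base: Path, now: Optional[datetime] = None) -> dict:
    """Create logs/, screens/ and enhanced/ under a timestamped run folder."""
    stamp = (now or datetime.now()).strftime("%Y%m%d_%H%M%S")
    run_dir = base / "screenjob_runs" / f"run_{stamp}"
    dirs = {name: run_dir / name for name in ("logs", "screens", "enhanced")}
    for path in dirs.values():
        path.mkdir(parents=True, exist_ok=True)
    dirs["run"] = run_dir
    return dirs


# Demonstrate inside a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    dirs = create_run_dirs(Path(tmp), datetime(2026, 5, 27, 17, 31, 49))
    run_name = dirs["run"].name
    created = sorted(p.name for p in dirs["run"].iterdir())
    print(run_name, created)
```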
Final stdout is JSON:
```json
{
  "completed": true,
  "result": "...",
  "steps": 13,
  "elapsed_seconds": 59.691,
  "artifacts_dir": "C:\\...\\screenjob_runs\\run_..."
}
```
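A summary with these fields might be assembled and printed as in the sketch below. `build_summary` and its parameters are hypothetical names, and rounding the elapsed time to three decimals is an assumption based on the sample value.

```python
import json
import time


def build_summary(completed, result, steps, started_at, artifacts_dir, now=None):
    """Assemble the final run summary as a JSON-serializable dict."""
    # time.monotonic() is used when no explicit end time is given.
    end = now if now is not None else time.monotonic()
    return {
        "completed": completed,
        "result": result,
        "steps": steps,
        "elapsed_seconds": round(end - started_at, 3),
        "artifacts_dir": str(artifacts_dir),
    }


# Fixed timestamps keep the demonstration deterministic.
summary = build_summary(
    True, "done", 13, started_at=0.0,
    artifacts_dir="screenjob_runs/run_example", now=59.691,
)
print(json.dumps(summary, indent=2))
```

A caller that captures stdout can then recover the fields with `json.loads`.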
## Project layout
```text
main.py
screenjob.py
src/
  __init__.py
  cli.py
  agent.py
  models.py
  utils.py
```