# ScreenJob

ScreenJob keeps the behavior of the original single-file script, with the code split into maintainable modules under `src/`.

## Entry point

- Primary: `python main.py "<task>"`
- Backward compatible: `python screenjob.py "<task>"`

## Install

```powershell
pip install openai pillow pyautogui python-dotenv
```

## Configure

Create a `.env` file in the project root:

```env
OPENAI_API_KEY=your_key_here
```

## Usage

```powershell
python main.py "Open amazon.de and go to my orders"
```

Optional flags:

```powershell
python main.py "Open amazon.de" --model gpt-5.2 --max-steps 80
```
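
For reference, the flags above could be parsed along these lines. This is a minimal sketch using the standard-library `argparse` module, not the actual contents of `src/cli.py`; the `build_parser` name and the `None` defaults are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented command line."""
    parser = argparse.ArgumentParser(prog="main.py", description="Run a ScreenJob task.")
    parser.add_argument("task", help="Natural-language task for the agent")
    # The real defaults are not documented here; None is a placeholder (assumption).
    parser.add_argument("--model", default=None, help="Model name to use")
    parser.add_argument("--max-steps", type=int, default=None, help="Upper bound on agent steps")
    return parser


# Parse the same arguments as the example invocation above.
args = build_parser().parse_args(["Open amazon.de", "--model", "gpt-5.2", "--max-steps", "80"])
print(args.task, args.model, args.max_steps)
```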

## Tools exposed to the model

- `execute_command(command)`
- `sleep(seconds)` (replaces shell-based sleep calls)
- `see_screen()`
- `enhance(coordinate)`
- `click(coordinate, offset_up/down/left/right, sleep_after_seconds)`
- `type(text)`
- `press_key(key, repeats=1)`
- `task_complete(result)`

### Offset examples

- `{"coordinate":{"x":1000,"y":500},"offset_up":"2px"}`
- `{"coordinate":{"x":1000,"y":500},"offset_right":4}`
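
The two examples show that an offset may be given as a string with a `px` suffix or as a bare number. A minimal sketch of how such values might be normalized and applied follows. The helper names `parse_offset` and `apply_offsets` are hypothetical, and the sketch assumes screen coordinates where `y` grows downward, so `offset_up` subtracts from `y`.

```python
import re


def parse_offset(value) -> int:
    """Convert 4, 4.0, "4" or "2px" into a whole number of pixels."""
    if isinstance(value, bool):
        raise ValueError(f"invalid offset: {value!r}")
    if isinstance(value, (int, float)):
        return int(round(value))
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(?:px)?\s*", str(value), flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"invalid offset: {value!r}")
    return int(round(float(match.group(1))))


def apply_offsets(arguments: dict):
    """Return the (x, y) click target after applying any offsets."""
    x = arguments["coordinate"]["x"]
    y = arguments["coordinate"]["y"]
    # Assumption: y increases downward, as in typical screen coordinates.
    x += parse_offset(arguments.get("offset_right", 0)) - parse_offset(arguments.get("offset_left", 0))
    y += parse_offset(arguments.get("offset_down", 0)) - parse_offset(arguments.get("offset_up", 0))
    return x, y


print(apply_offsets({"coordinate": {"x": 1000, "y": 500}, "offset_up": "2px"}))
print(apply_offsets({"coordinate": {"x": 1000, "y": 500}, "offset_right": 4}))
```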

### Multi-tool calls in one step

The agent supports multiple tool calls in a single model response and executes them in order.  
Example sequence in one step:

1. `click(...)`
2. `sleep({"seconds": 1.5})`

You can also use `click(..., sleep_after_seconds=1.5)` for a one-call variant.
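
In-order execution of several tool calls can be sketched as a simple dispatch loop. This illustrates the idea and is not the code in `src/agent.py`: the `run_tool_calls` helper, the shape of each call (`name` plus `arguments`), and the stub handlers are all assumptions.

```python
import json
import time


def run_tool_calls(tool_calls, handlers):
    """Execute tool calls strictly in the order the model returned them."""
    results = []
    for call in tool_calls:
        arguments = call["arguments"]
        # Tool arguments often arrive as a JSON string; accept dicts too.
        if isinstance(arguments, str):
            arguments = json.loads(arguments)
        results.append(handlers[call["name"]](**arguments))
    return results


# Stub handlers for illustration only; real tools would drive the mouse.
def click(coordinate, sleep_after_seconds=0.0, **offsets):
    time.sleep(sleep_after_seconds)
    return f"clicked {coordinate['x']},{coordinate['y']}"


def sleep(seconds):
    time.sleep(seconds)
    return f"slept {seconds}s"


calls = [
    {"name": "click", "arguments": '{"coordinate": {"x": 1000, "y": 500}}'},
    {"name": "sleep", "arguments": '{"seconds": 0.1}'},
]
print(run_tool_calls(calls, {"click": click, "sleep": sleep}))
```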

## Output

Each run creates:

- `screenjob_runs/run_YYYYMMDD_HHMMSS/logs/screenjob.log`
- `screenjob_runs/run_YYYYMMDD_HHMMSS/screens/*.png`
- `screenjob_runs/run_YYYYMMDD_HHMMSS/enhanced/*.png`
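
A per-run directory tree like the one listed above could be created as follows. This is a sketch under stated assumptions: `create_run_dirs` is a hypothetical helper, and the timestamp is assumed to be local time formatted as `%Y%m%d_%H%M%S`.

```python
from datetime import datetime
from pathlib import Path


def create_run_dirs(base, now=None):
    """Create <base>/run_YYYYMMDD_HHMMSS with logs/, screens/ and enhanced/."""
    now = now or datetime.now()
    run_dir = Path(base) / f"run_{now.strftime('%Y%m%d_%H%M%S')}"
    for sub in ("logs", "screens", "enhanced"):
        # parents=True also creates the base and run directories on first use.
        (run_dir / sub).mkdir(parents=True, exist_ok=True)
    return run_dir
```

Calling `create_run_dirs("screenjob_runs")` would return the new run directory, under which the log file and screenshots can then be written.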

The final output on stdout is a JSON object:

```json
{
  "completed": true,
  "result": "...",
  "steps": 13,
  "elapsed_seconds": 59.691,
  "artifacts_dir": "C:\\...\\screenjob_runs\\run_..."
}
```
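
A calling script can consume this result by running ScreenJob as a subprocess and parsing its stdout. The sketch below assumes the JSON object is the last thing written to stdout and tolerates non-JSON lines before it; `parse_final_result` and `run_screenjob` are hypothetical helper names.

```python
import json
import subprocess
import sys


def parse_final_result(stdout: str) -> dict:
    """Parse the JSON object that ends stdout, skipping any earlier lines."""
    lines = stdout.splitlines()
    for i, line in enumerate(lines):
        if line.lstrip().startswith("{"):
            try:
                return json.loads("\n".join(lines[i:]))
            except json.JSONDecodeError:
                continue  # not the start of the final object; keep looking
    raise ValueError("no JSON object found at the end of stdout")


def run_screenjob(task: str) -> dict:
    """Run main.py with the given task and return the parsed result."""
    proc = subprocess.run(
        [sys.executable, "main.py", task],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_final_result(proc.stdout)
```

For example, `run_screenjob("Open amazon.de")["completed"]` would tell the caller whether the task finished.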

## Project layout

```text
main.py
screenjob.py
src/
  __init__.py
  cli.py
  agent.py
  models.py
  utils.py
```