# Big File Generator

A collection of Python CLI tools for creating and reading large binary files. Useful for testing disk I/O performance, storage systems, and file transfer mechanisms.

## Tools

### `make_big_file.py` - File Generator

Creates large binary files filled with zeros for testing purposes.

**Features:**

- Configurable file size with human-readable units (MB, GB, TB, etc.)
- Adjustable chunk size for write optimization
- Disk space validation before writing
- Real-time progress reporting with speed metrics
- Prevents accidental file overwrites
- Graceful interrupt handling with cleanup
- Quiet mode for scripting

**Usage:**

```bash
python make_big_file.py <output> <size> [options]
```

**Arguments:**

- `output` - Output file path
- `size` - File size (e.g., 15GB, 1.5TB, 500MB)

**Options:**

- `--chunk-size <size>` - Chunk size for writing (default: 64MB)
- `--quiet, -q` - Suppress progress output
- `--version` - Show version information
- `--help, -h` - Show help message

**Examples:**

```bash
# Create a 15GB file
python make_big_file.py output.bin 15GB

# Create a 1.5TB file with 128MB chunks
python make_big_file.py bigfile.dat 1.5TB --chunk-size 128MB

# Create a 500MB file quietly
python make_big_file.py test.bin 500MB --quiet
```
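The human-readable size strings above (for both `size` and `--chunk-size`) could be parsed along these lines. This is a sketch, not the tool's actual code; the `parse_size` name and the binary (1024-based) interpretation are assumptions, though the tool's GiB/MiB output suggests binary units:

```python
import re

# Hypothetical parser for sizes like "15GB", "1.5TB", "500MB".
# Binary (1024-based) units are an assumption based on the GiB/MiB output.
UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}

def parse_size(text):
    """Convert a human-readable size string into a byte count."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)\s*([KMGT]?B)", text.strip().upper())
    if not match:
        raise ValueError("unrecognized size: %r" % text)
    number, unit = match.groups()
    return int(float(number) * UNITS[unit])
```

For example, `parse_size("1.5TB")` yields the byte count that a 1.5 TiB file would occupy.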

### `read_big_file.py` - File Reader & Benchmark

Reads large files and measures I/O performance, optionally computing checksums.

**Features:**

- Configurable chunk size for read optimization
- Real-time progress reporting with speed metrics
- SHA256 hash computation option
- File validation before reading
- Quiet mode for scripting
- Graceful interrupt handling

**Usage:**

```bash
python read_big_file.py <input> [options]
```

**Arguments:**

- `input` - Input file path to read

**Options:**

- `--chunk-size <size>` - Chunk size for reading (default: 64MB)
- `--hash` - Compute SHA256 hash of the file
- `--quiet, -q` - Suppress progress output
- `--version` - Show version information
- `--help, -h` - Show help message

**Examples:**

```bash
# Read a large file
python read_big_file.py largefile.bin

# Read with 128MB chunks and compute hash
python read_big_file.py test.dat --chunk-size 128MB --hash

# Read quietly and compute hash
python read_big_file.py data.bin --hash --quiet
```
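The read-and-hash behavior described above amounts to reading fixed-size chunks and feeding each one to an incremental SHA256. A minimal sketch (illustrative only, not the tool's exact code; `read_and_hash` is a hypothetical name):

```python
import hashlib

# Illustrative chunked read-and-hash loop; the real read_big_file.py also
# tracks progress and speed. chunk_size mirrors the --chunk-size option.
def read_and_hash(path, chunk_size=64 * 1024**2):
    digest = hashlib.sha256()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            total += len(chunk)
    return total, digest.hexdigest()
```

Feeding chunks to `hashlib.sha256().update()` keeps memory usage bounded by the chunk size, no matter how large the file is.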

## Installation

No external dependencies required. Works with Python 3.6+.

```bash
# Clone or download the scripts
git clone <repository-url>
cd bigfilegen

# Make scripts executable (optional, Unix/Linux/Mac)
chmod +x make_big_file.py read_big_file.py
```

## Requirements

- Python 3.6 or higher
- Sufficient disk space for file creation
- Read/write permissions in target directories

## Performance Tips

### Chunk Size Optimization

- **SSDs**: Use larger chunks (64-128MB) for better performance
- **HDDs**: Use moderate chunks (32-64MB) to balance speed and memory
- **Network drives**: Experiment with different sizes based on network speed
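The chunk-size trade-off above comes down to call count versus buffer memory: a larger chunk means fewer write calls but a bigger in-memory buffer. The core write loop looks roughly like this (a sketch with an assumed `write_zeros` name, not the tool's exact code):

```python
# Illustrative zero-fill write loop; make_big_file.py adds disk space
# checks, progress reporting, and cleanup on interrupt.
def write_zeros(path, size, chunk_size=64 * 1024**2):
    chunk = b"\0" * min(chunk_size, size)  # reuse one buffer for every write
    remaining = size
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(len(chunk), remaining)
            f.write(chunk if n == len(chunk) else chunk[:n])
            remaining -= n
```

Allocating the zero buffer once and reusing it keeps the loop's memory footprint constant at one chunk.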

### File System Considerations

- **NTFS** (Windows): Supports files up to 16 EiB
- **exFAT**: Good for large files on external drives
- **ext4** (Linux): Supports files up to 16 TiB
- **APFS/HFS+** (macOS): Supports very large files

## Use Cases

- **Performance Testing**: Benchmark disk I/O speeds
- **Storage Validation**: Verify storage capacity and integrity
- **Transfer Testing**: Test file transfer mechanisms and speeds
- **Application Testing**: Test applications with large file handling
- **Disk Burn-in**: Stress test new storage devices

## Output Examples

### Creating a file:

```
Creating file: test.bin
Target size: 15.00 GiB
Chunk size: 64.00 MiB

Progress: 5% (768.00 MiB written)
Written: 1.50 GiB, Speed: 1.23 GiB/s
Progress: 10% (1.50 GiB written)
...
✓ Successfully created test.bin (15.00 GiB)
Time taken: 12.34 seconds
Average speed: 1.22 GiB/s
```

### Reading a file:

```
Reading file: test.bin
File size: 15.00 GiB
Chunk size: 64.00 MiB

Progress: 5% (768.00 MiB read)
Read: 1.50 GiB, Speed: 1.45 GiB/s
Progress: 10% (1.50 GiB read)
...
✓ Successfully read 15.00 GiB
Time taken: 10.12 seconds
Average speed: 1.48 GiB/s
SHA256: a3d5c... (if --hash was used)
```
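Figures like `15.00 GiB` in the output above can be produced with a simple binary-unit formatter; `human_size` is a hypothetical name, not necessarily what the tools use internally:

```python
# Hypothetical formatter matching the "15.00 GiB" style shown above.
def human_size(n):
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        if n < 1024 or unit == "TiB":
            return "%.2f %s" % (n, unit)
        n /= 1024
```

Dividing by 1024 until the value drops below 1024 picks the largest unit that keeps the number readable.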

## Error Handling

Both tools include comprehensive error handling:

- File existence checks
- Disk space validation
- Permission verification
- Interrupt handling (Ctrl+C)
- Automatic cleanup on errors

## Exit Codes

- `0` - Success
- `1` - General error (file not found, permission denied, etc.)
- `130` - Interrupted by user (Ctrl+C)

## License

MIT License - Feel free to use and modify as needed.

## Contributing

Contributions welcome! Feel free to submit issues or pull requests.

## Author

Created for testing and benchmarking large file operations.