Big File Generator
A collection of Python CLI tools for creating and reading large binary files. Useful for testing disk I/O performance, storage systems, and file transfer mechanisms.
Tools
make_big_file.py - File Generator
Creates large binary files filled with zeros for testing purposes.
Features:
- Configurable file size with human-readable units (GB, TB, MB, etc.)
- Adjustable chunk size for write optimization
- Disk space validation before writing
- Real-time progress reporting with speed metrics
- Prevents accidental file overwrites
- Graceful interrupt handling with cleanup
- Quiet mode for scripting
Usage:
python make_big_file.py <output> <size> [options]
Arguments:
- output: Output file path
- size: File size (e.g., 15GB, 1.5TB, 500MB)
Options:
- --chunk-size <size>: Chunk size for writing (default: 64MB)
- --quiet, -q: Suppress progress output
- --version: Show version information
- --help, -h: Show help message
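The size and --chunk-size values are human-readable strings. As a rough illustration of how such strings could be turned into byte counts, here is a minimal sketch. The name parse_size and the unit table are hypothetical, not taken from the scripts, and the sketch assumes binary multiples (1 GB = 1024**3 bytes) because the sample output reports GiB; the real tools may parse sizes differently.

```python
import re

# Assumption: units are binary multiples, matching the GiB figures shown in
# the sample output. The actual script may use a different convention.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string such as '1.5TB' or '500MB' into a byte count."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", text, re.IGNORECASE
    )
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    number, unit = match.groups()
    # Truncate to whole bytes; fractional sizes like 1.5TB are allowed.
    return int(float(number) * _UNITS[unit.upper()])
```

Under these assumptions, parse_size("15GB") gives 15 * 1024**3 bytes, and input without a recognised unit raises ValueError.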
Examples:
# Create a 15GB file
python make_big_file.py output.bin 15GB
# Create a 1.5TB file with 128MB chunks
python make_big_file.py bigfile.dat 1.5TB --chunk-size 128MB
# Create a 500MB file quietly
python make_big_file.py test.bin 500MB --quiet
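For readers curious how the generator's main features fit together (disk space validation, overwrite protection, chunked writes, cleanup on interrupt), the following is a minimal sketch of the general technique. The function write_zeros is hypothetical and is not the script's actual code; it only assumes the standard library.

```python
import os
import shutil


def write_zeros(path, total_bytes, chunk_bytes=64 * 1024**2):
    """Create a new file at path holding total_bytes zero bytes."""
    if chunk_bytes <= 0:
        raise ValueError("chunk_bytes must be positive")

    # Check free space in the directory that will hold the file.
    target_dir = os.path.dirname(os.path.abspath(path))
    free = shutil.disk_usage(target_dir).free
    if total_bytes > free:
        raise OSError(f"need {total_bytes} bytes, only {free} free")

    # Mode "xb" raises FileExistsError instead of overwriting a file.
    handle = open(path, "xb")
    # One reusable zero-filled buffer; slicing a memoryview avoids copies.
    zeros = memoryview(bytes(min(chunk_bytes, total_bytes)))
    written = 0
    try:
        with handle:
            while written < total_bytes:
                step = min(len(zeros), total_bytes - written)
                handle.write(zeros[:step])
                written += step
    except BaseException:
        # Remove the partial file on any failure, including Ctrl+C.
        os.remove(path)
        raise
    return written
```

Opening the file before the try block means the cleanup only ever deletes a file this call created, so a pre-existing file is never removed.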
read_big_file.py - File Reader & Benchmark
Reads large files and measures I/O performance, optionally computing checksums.
Features:
- Configurable chunk size for read optimization
- Real-time progress reporting with speed metrics
- SHA256 hash computation option
- File validation before reading
- Quiet mode for scripting
- Graceful interrupt handling
Usage:
python read_big_file.py <input> [options]
Arguments:
- input: Input file path to read
Options:
- --chunk-size <size>: Chunk size for reading (default: 64MB)
- --hash: Compute SHA256 hash of the file
- --quiet, -q: Suppress progress output
- --version: Show version information
- --help, -h: Show help message
Examples:
# Read a large file
python read_big_file.py largefile.bin
# Read with 128MB chunks and compute hash
python read_big_file.py test.dat --chunk-size 128MB --hash
# Read quietly and compute hash
python read_big_file.py data.bin --hash --quiet
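The reader's core loop (sequential chunked reads with an optional SHA256 digest) can be sketched as below. The function read_file and its return shape are hypothetical and only illustrate the approach; the script's own implementation may differ.

```python
import hashlib
import time


def read_file(path, chunk_bytes=64 * 1024**2, compute_hash=False):
    """Read path in chunks; return (bytes_read, seconds, sha256_hex_or_None)."""
    digest = hashlib.sha256() if compute_hash else None
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_bytes)
            if not chunk:  # an empty bytes object signals end of file
                break
            total += len(chunk)
            if digest is not None:
                digest.update(chunk)
    elapsed = time.perf_counter() - start
    hex_digest = digest.hexdigest() if digest is not None else None
    return total, elapsed, hex_digest
```

Dividing bytes_read by seconds gives the average throughput. Note that a file written moments earlier may be served from the operating system's page cache, which can make read speeds look higher than the underlying device can sustain.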
Installation
No external dependencies required. Works with Python 3.6+.
# Clone or download the scripts
git clone <repository-url>
cd bigfilegen
# Make scripts executable (optional, Unix/Linux/Mac)
chmod +x make_big_file.py read_big_file.py
Requirements
- Python 3.6 or higher
- Sufficient disk space for file creation
- Read/write permissions in target directories
Performance Tips
Chunk Size Optimization
- SSDs: Use larger chunks (64-128MB) for better performance
- HDDs: Use moderate chunks (32-64MB) to balance speed and memory
- Network drives: Experiment with different sizes based on network speed
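One way to experiment with chunk sizes is to time a full sequential read of the same file at several sizes. The helper below, time_chunk_sizes, is a hypothetical sketch and not part of the tools; results depend heavily on caching and on the device, so treat the numbers as rough guidance only.

```python
import time


def time_chunk_sizes(path, chunk_sizes):
    """Return {chunk_size: seconds} for one full sequential read per size."""
    timings = {}
    for size in chunk_sizes:
        start = time.perf_counter()
        # buffering=0 bypasses Python's own buffer so each read() asks the
        # OS for up to `size` bytes directly.
        with open(path, "rb", buffering=0) as handle:
            while handle.read(size):
                pass
        timings[size] = time.perf_counter() - start
    return timings
```

For example, time_chunk_sizes("test.bin", [32 * 1024**2, 64 * 1024**2, 128 * 1024**2]) compares the three sizes mentioned above. Repeating the run a few times helps separate real differences from cache effects.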
File System Considerations
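- NTFS (Windows): Supports files up to 16 EiB in theory; Windows implementations impose lower practical limits
- exFAT: Good for large files on external drives (theoretical per-file limit of 16 EiB)
- ext4 (Linux): Supports files up to 16 TiB with the default 4 KiB block size
- APFS/HFS+ (macOS): Supports files up to 8 EiB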
Use Cases
- Performance Testing: Benchmark disk I/O speeds
- Storage Validation: Verify storage capacity and integrity
- Transfer Testing: Test file transfer mechanisms and speeds
- Application Testing: Test applications with large file handling
- Disk Burn-in: Stress test new storage devices
Output Examples
Creating a file:
Creating file: test.bin
Target size: 15.00 GiB
Chunk size: 64.00 MiB
Progress: 5% (768.00 MiB written)
Written: 1.50 GiB, Speed: 1.23 GiB/s
Progress: 10% (1.50 GiB written)
...
✓ Successfully created test.bin (15.00 GiB)
Time taken: 12.34 seconds
Average speed: 1.22 GiB/s
Reading a file:
Reading file: test.bin
File size: 15.00 GiB
Chunk size: 64.00 MiB
Progress: 5% (768.00 MiB read)
Read: 1.50 GiB, Speed: 1.45 GiB/s
Progress: 10% (1.50 GiB read)
...
✓ Successfully read 15.00 GiB
Time taken: 10.12 seconds
Average speed: 1.48 GiB/s
SHA256: a3d5c... (if --hash was used)
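The sizes and speeds in the sample output use binary units with two decimal places. A small formatter along the following lines would produce strings in that style. The name format_size is hypothetical and the sketch is an assumption about how such output could be generated, not the scripts' actual code.

```python
def format_size(num_bytes):
    """Format a byte count with binary units, e.g. '15.00 GiB'."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB"):
        # Stop at the first unit that keeps the value below 1024,
        # or at TiB, the largest unit this sketch knows about.
        if value < 1024 or unit == "TiB":
            return f"{value:.2f} {unit}"
        value /= 1024


total = 15 * 1024**3
print(format_size(total))                  # 15.00 GiB
print(format_size(total * 0.05))           # 768.00 MiB
print(format_size(total / 12.34) + "/s")   # 1.22 GiB/s
```

The last line reproduces the average speed in the sample: 15 GiB written in 12.34 seconds.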
Error Handling
Both tools include comprehensive error handling:
- File existence checks
- Disk space validation
- Permission verification
- Interrupt handling (Ctrl+C)
- Automatic cleanup on errors
Exit Codes
- 0: Success
- 1: General error (file not found, permission denied, etc.)
- 130: Interrupted by user (Ctrl+C)
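The mapping from outcomes to exit codes can be illustrated with a small wrapper. The function run and the constant names are hypothetical and only sketch the convention; 130 follows the common shell practice of 128 plus the signal number (SIGINT is 2).

```python
import sys

EXIT_OK = 0
EXIT_ERROR = 1
EXIT_INTERRUPTED = 130  # 128 + SIGINT (signal 2), the usual shell convention


def run(task):
    """Call task() and translate its outcome into one of the exit codes."""
    try:
        task()
    except KeyboardInterrupt:
        print("Interrupted by user", file=sys.stderr)
        return EXIT_INTERRUPTED
    except OSError as error:
        # Covers FileNotFoundError, PermissionError, and similar failures.
        print(f"Error: {error}", file=sys.stderr)
        return EXIT_ERROR
    return EXIT_OK
```

A script would typically end with sys.exit(run(main)), where main is its entry point, so that callers such as shell scripts can branch on the result.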
License
MIT License - Feel free to use and modify as needed.
Contributing
Contributions welcome! Feel free to submit issues or pull requests.
Author
Created for testing and benchmarking large file operations.