Big File Generator
A collection of Python CLI tools for creating and reading large binary files. Useful for testing disk I/O performance, storage systems, and file transfer mechanisms.
Tools
make_big_file.py - File Generator
Creates large binary files filled with zeros for testing purposes.
Features:
- Configurable file size with human-readable units (GB, TB, MB, etc.)
- Adjustable chunk size for write optimization
- Disk space validation before writing
- Real-time progress reporting with speed metrics
- Prevents accidental file overwrites
- Graceful interrupt handling with cleanup
- Quiet mode for scripting
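The features above (chunked writes, disk space validation, overwrite protection, cleanup on interrupt) could be combined roughly as follows. This is a minimal sketch, not the actual code of make_big_file.py; the function name write_zero_file and its parameters are hypothetical.

```python
import os
import shutil


def write_zero_file(path, size, chunk_size=64 * 1024 * 1024):
    """Write `size` zero bytes to `path` in chunks; return bytes written.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")

    # Disk space validation before writing.
    target_dir = os.path.dirname(os.path.abspath(path))
    if shutil.disk_usage(target_dir).free < size:
        raise OSError("not enough free disk space")

    # One reusable buffer of zeros, never larger than the file itself.
    chunk = bytes(min(chunk_size, size))
    written = 0
    try:
        # Mode "xb" raises FileExistsError instead of overwriting.
        with open(path, "xb") as f:
            while written < size:
                n = min(chunk_size, size - written)
                f.write(chunk if n == len(chunk) else chunk[:n])
                written += n
    except FileExistsError:
        # The existing file is not ours; leave it untouched.
        raise
    except BaseException:
        # Any other failure, including Ctrl+C: remove the partial file.
        if os.path.exists(path):
            os.remove(path)
        raise
    return written
```

Opening with mode "xb" makes the overwrite check and the file creation a single step, so there is no window between checking for the file and creating it.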
Usage:
python make_big_file.py <output> <size> [options]
Arguments:
- output - Output file path
- size - File size (e.g., 15GB, 1.5TB, 500MB)
Options:
- --chunk-size <size> - Chunk size for writing (default: 64MB)
- --quiet, -q - Suppress progress output
- --version - Show version information
- --help, -h - Show help message
Examples:
# Create a 15GB file
python make_big_file.py output.bin 15GB
# Create a 1.5TB file with 128MB chunks
python make_big_file.py bigfile.dat 1.5TB --chunk-size 128MB
# Create a 500MB file quietly
python make_big_file.py test.bin 500MB --quiet
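Size arguments such as 15GB or 1.5TB have to be converted to a byte count. The sketch below shows one way to do that. It assumes that units are binary multiples (1GB = 1024^3 bytes), which matches the sample output where 15GB is reported as 15.00 GiB, but the real parser may differ. The name parse_size is hypothetical.

```python
import re

# Assumption: units are treated as powers of 1024.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a string such as '15GB' or '1.5TB' to a number of bytes."""
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*([KMGT]?B)\s*", text, re.IGNORECASE
    )
    if not match:
        raise ValueError(f"Invalid size: {text!r}")
    number, unit = match.groups()
    return int(float(number) * _UNITS[unit.upper()])
```

For example, parse_size("500MB") returns 524288000 under the binary-unit assumption.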
read_big_file.py - File Reader & Benchmark
Reads large files and measures I/O performance, optionally computing checksums.
Features:
- Configurable chunk size for read optimization
- Real-time progress reporting with speed metrics
- SHA256 hash computation option
- File validation before reading
- Quiet mode for scripting
- Graceful interrupt handling
Usage:
python read_big_file.py <input> [options]
Arguments:
- input - Input file path to read
Options:
- --chunk-size <size> - Chunk size for reading (default: 64MB)
- --hash - Compute SHA256 hash of the file
- --quiet, -q - Suppress progress output
- --version - Show version information
- --help, -h - Show help message
Examples:
# Read a large file
python read_big_file.py largefile.bin
# Read with 128MB chunks and compute hash
python read_big_file.py test.dat --chunk-size 128MB --hash
# Read quietly and compute hash
python read_big_file.py data.bin --hash --quiet
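A chunked read with an optional SHA256 digest might look like the sketch below. It is an illustration under assumptions, not the code of read_big_file.py; the function name read_file and its return shape are hypothetical.

```python
import hashlib
import time


def read_file(path, chunk_size=64 * 1024 * 1024, compute_hash=False):
    """Read `path` in chunks; return (bytes_read, seconds, sha256_hex_or_None)."""
    digest = hashlib.sha256() if compute_hash else None
    total = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # An empty read means end of file.
                break
            total += len(chunk)
            if digest is not None:
                # Feeding chunks incrementally keeps memory use bounded.
                digest.update(chunk)
    elapsed = time.perf_counter() - start
    hex_digest = digest.hexdigest() if digest is not None else None
    return total, elapsed, hex_digest
```

Because the digest is updated per chunk, memory use stays close to the chunk size regardless of file size.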
Installation
No external dependencies required. Works with Python 3.6+.
# Clone or download the scripts
git clone <repository-url>
cd bigfilegen
# Make scripts executable (optional, Unix/Linux/Mac)
chmod +x make_big_file.py read_big_file.py
Requirements
- Python 3.6 or higher
- Sufficient disk space for file creation
- Read/write permissions in target directories
Performance Tips
Chunk Size Optimization
- SSDs: Use larger chunks (64-128MB) for better performance
- HDDs: Use moderate chunks (32-64MB) to balance speed and memory
- Network drives: Experiment with different sizes based on network speed
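To compare chunk sizes on a given drive, one option is to time a full read at each candidate size. The sketch below is a rough illustration with hypothetical function names. Results are only indicative: the operating system's page cache can make repeated reads of the same file look much faster than the underlying device.

```python
import time


def read_throughput(path, chunk_size):
    """Read the whole file once and return throughput in bytes per second."""
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python-level buffering so chunk_size is what
    # actually gets requested from the OS.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    # Guard against a zero elapsed time on very small files.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return total / elapsed


def compare_chunk_sizes(path, chunk_sizes):
    """Return a dict mapping each chunk size to its measured throughput."""
    return {size: read_throughput(path, size) for size in chunk_sizes}
```

A call such as compare_chunk_sizes("test.bin", [32 * 1024**2, 64 * 1024**2, 128 * 1024**2]) gives one measurement per size; repeating the run a few times helps smooth out noise.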
File System Considerations
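- NTFS (Windows): Supports files up to 16 EiB in theory; practical limits are lower and depend on Windows version and cluster size
- exFAT: Good for large files on external drives; avoids the 4 GiB per-file limit of FAT32
- ext4 (Linux): Supports files up to 16 TiB with the default 4 KiB block size
- APFS/HFS+ (macOS): Support files up to 8 EiB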
Use Cases
- Performance Testing: Benchmark disk I/O speeds
- Storage Validation: Verify storage capacity and integrity
- Transfer Testing: Test file transfer mechanisms and speeds
- Application Testing: Test applications with large file handling
- Disk Burn-in: Stress test new storage devices
Output Examples
Creating a file:
Creating file: test.bin
Target size: 15.00 GiB
Chunk size: 64.00 MiB
Progress: 5% (768.00 MiB written)
Written: 1.50 GiB, Speed: 1.23 GiB/s
Progress: 10% (1.50 GiB written)
...
✓ Successfully created test.bin (15.00 GiB)
Time taken: 12.34 seconds
Average speed: 1.22 GiB/s
Reading a file:
Reading file: test.bin
File size: 15.00 GiB
Chunk size: 64.00 MiB
Progress: 5% (768.00 MiB read)
Read: 1.50 GiB, Speed: 1.45 GiB/s
Progress: 10% (1.50 GiB read)
...
✓ Successfully read 15.00 GiB
Time taken: 10.12 seconds
Average speed: 1.48 GiB/s
SHA256: a3d5c... (if --hash was used)
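The sizes in the output above use binary units with two decimal places. A formatter that produces strings in that style could be sketched as follows; the name format_size is hypothetical and the tools' actual formatting code may differ.

```python
def format_size(num_bytes):
    """Format a byte count with binary units, e.g. 16106127360 -> '15.00 GiB'."""
    value = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB", "TiB", "PiB"):
        # Stop at the first unit where the value is below 1024,
        # or at the largest unit available.
        if value < 1024 or unit == "PiB":
            return f"{value:.2f} {unit}"
        value /= 1024
```

With this helper, 5% of a 15 GiB target (805306368 bytes) formats as 768.00 MiB, consistent with the sample progress line.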
Error Handling
Both tools include comprehensive error handling:
- File existence checks
- Disk space validation
- Permission verification
- Interrupt handling (Ctrl+C)
- Automatic cleanup on errors
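As an illustration of the existence and permission checks listed above, a reader could validate its input before opening it. This is a sketch with a hypothetical function name, not the tools' actual validation code.

```python
import os


def check_input_file(path):
    """Validate that `path` is a readable regular file; return its size in bytes."""
    if not os.path.exists(path):
        raise FileNotFoundError(f"Input file not found: {path}")
    if not os.path.isfile(path):
        raise ValueError(f"Not a regular file: {path}")
    # os.access is advisory; the later open() call can still fail.
    if not os.access(path, os.R_OK):
        raise PermissionError(f"No read permission: {path}")
    return os.path.getsize(path)
```

Checking up front gives clearer error messages, but the open call itself should still be wrapped in error handling because file state can change between the check and the read.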
Exit Codes
- 0 - Success
- 1 - General error (file not found, permission denied, etc.)
- 130 - Interrupted by user (Ctrl+C)
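One way to map outcomes to these exit codes is a small wrapper around the main task. The sketch below assumes that failures surface as OSError subclasses (FileNotFoundError, PermissionError) or ValueError; the wrapper name run is hypothetical.

```python
import sys


def run(task):
    """Call `task()` and return an exit code: 0, 1, or 130."""
    try:
        task()
    except KeyboardInterrupt:
        # 130 is the conventional code for termination by Ctrl+C (128 + SIGINT).
        print("Interrupted by user", file=sys.stderr)
        return 130
    except (OSError, ValueError) as exc:
        print(f"Error: {exc}", file=sys.stderr)
        return 1
    return 0
```

A script entry point would then end with sys.exit(run(main)), where main is whatever function performs the work.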
License
MIT License - Feel free to use and modify as needed.
Contributing
Contributions welcome! Feel free to submit issues or pull requests.
Author
Created for testing and benchmarking large file operations.