# Big File Generator

A collection of Python CLI tools for creating and reading large binary files. Useful for testing disk I/O performance, storage systems, and file transfer mechanisms.

## Tools

### `make_big_file.py` - File Generator

Creates large binary files filled with zeros for testing purposes.

**Features:**

- Configurable file size with human-readable units (GB, TB, MB, etc.)
- Adjustable chunk size for write optimization
- Disk space validation before writing
- Real-time progress reporting with speed metrics
- Prevents accidental file overwrites
- Graceful interrupt handling with cleanup
- Quiet mode for scripting

**Usage:**

```bash
python make_big_file.py <output> <size> [options]
```

**Arguments:**

- `output` - Output file path
- `size` - File size (e.g., 15GB, 1.5TB, 500MB)

**Options:**

- `--chunk-size <size>` - Chunk size for writing (default: 64MB)
- `--quiet, -q` - Suppress progress output
- `--version` - Show version information
- `--help, -h` - Show help message

**Examples:**

```bash
# Create a 15GB file
python make_big_file.py output.bin 15GB

# Create a 1.5TB file with 128MB chunks
python make_big_file.py bigfile.dat 1.5TB --chunk-size 128MB

# Create a 500MB file quietly
python make_big_file.py test.bin 500MB --quiet
```

### `read_big_file.py` - File Reader & Benchmark

Reads large files and measures I/O performance, optionally computing checksums.
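The chunked read-and-checksum approach can be sketched in a few lines of Python. This is an illustrative sketch only, not the actual internals of `read_big_file.py`; the function name and signature are hypothetical:

```python
import hashlib


def read_with_hash(path, chunk_size=64 * 1024 * 1024):
    """Read a file in fixed-size chunks, returning (bytes_read, sha256_hex).

    Sketch of the chunked-read idea; chunk_size defaults to 64 MiB,
    matching the tools' default --chunk-size.
    """
    digest = hashlib.sha256()
    total = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)  # reads at most chunk_size bytes
            if not chunk:               # empty bytes object means EOF
                break
            total += len(chunk)
            digest.update(chunk)        # hash incrementally, no full-file buffer
    return total, digest.hexdigest()
```

Reading in bounded chunks keeps memory use flat regardless of file size, which is why both tools expose `--chunk-size` as a tuning knob.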
**Features:**

- Configurable chunk size for read optimization
- Real-time progress reporting with speed metrics
- SHA256 hash computation option
- File validation before reading
- Quiet mode for scripting
- Graceful interrupt handling

**Usage:**

```bash
python read_big_file.py <input> [options]
```

**Arguments:**

- `input` - Input file path to read

**Options:**

- `--chunk-size <size>` - Chunk size for reading (default: 64MB)
- `--hash` - Compute SHA256 hash of the file
- `--quiet, -q` - Suppress progress output
- `--version` - Show version information
- `--help, -h` - Show help message

**Examples:**

```bash
# Read a large file
python read_big_file.py largefile.bin

# Read with 128MB chunks and compute hash
python read_big_file.py test.dat --chunk-size 128MB --hash

# Read quietly and compute hash
python read_big_file.py data.bin --hash --quiet
```

## Installation

No external dependencies required. Works with Python 3.6+.

```bash
# Clone or download the scripts
git clone <repository-url>
cd bigfilegen

# Make scripts executable (optional; Unix/Linux/macOS)
chmod +x make_big_file.py read_big_file.py
```

## Requirements

- Python 3.6 or higher
- Sufficient disk space for file creation
- Read/write permissions in target directories

## Performance Tips

### Chunk Size Optimization

- **SSDs**: Use larger chunks (64-128MB) for better performance
- **HDDs**: Use moderate chunks (32-64MB) to balance speed and memory use
- **Network drives**: Experiment with different sizes based on network speed

### File System Considerations

- **NTFS** (Windows): Supports files up to 16 EiB
- **exFAT**: Good for large files on external drives
- **ext4** (Linux): Supports files up to 16 TiB
- **APFS/HFS+** (macOS): Supports very large files

## Use Cases

- **Performance Testing**: Benchmark disk I/O speeds
- **Storage Validation**: Verify storage capacity and integrity
- **Transfer Testing**: Test file transfer mechanisms and speeds
- **Application Testing**: Test applications that handle large files
- **Disk Burn-in**: Stress-test new storage devices

## Output Examples

### Creating a file:

```
Creating file: test.bin
Target size: 15.00 GiB
Chunk size: 64.00 MiB
Progress: 5% (768.00 MiB written)
Written: 1.50 GiB, Speed: 1.23 GiB/s
Progress: 10% (1.50 GiB written)
...
✓ Successfully created test.bin (15.00 GiB)
Time taken: 12.34 seconds
Average speed: 1.22 GiB/s
```

### Reading a file:

```
Reading file: test.bin
File size: 15.00 GiB
Chunk size: 64.00 MiB
Progress: 5% (768.00 MiB read)
Read: 1.50 GiB, Speed: 1.45 GiB/s
Progress: 10% (1.50 GiB read)
...
✓ Successfully read 15.00 GiB
Time taken: 10.12 seconds
Average speed: 1.48 GiB/s
SHA256: a3d5c... (if --hash was used)
```

## Error Handling

Both tools include comprehensive error handling:

- File existence checks
- Disk space validation
- Permission verification
- Interrupt handling (Ctrl+C)
- Automatic cleanup on errors

## Exit Codes

- `0` - Success
- `1` - General error (file not found, permission denied, etc.)
- `130` - Interrupted by user (Ctrl+C)

## License

MIT License - Feel free to use and modify as needed.

## Contributing

Contributions welcome! Feel free to submit issues or pull requests.

## Author

Created for testing and benchmarking large file operations.