HFDL - Hugging Face Download Library (v. 0.4.0)

A fast, reliable downloader for Hugging Face models and datasets. HFDL adapts to your hardware and network: it scales its thread count to your CPU cores, measures download speed and caps bandwidth use, and treats small and large files differently.

Smart Systems

HFDL incorporates several intelligent systems that work together to optimize your download experience:

1. CPU-Based Thread Auto-Scaling

What it does: Automatically determines the optimal number of threads based on your CPU cores
How it works:
- 1-2 CPU cores: Allocates 2 threads
- 3-8 cores: Uses a number of threads equal to the core count
- More than 8 cores: Caps at 8 threads to prevent overloading
Why it's smart: Balances performance and resource usage without manual tuning
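
The tiers above can be sketched in a few lines of Python. This is an illustration of the rule as described here, not HFDL's actual code; the function name `auto_thread_count` is hypothetical.

```python
import os


def auto_thread_count(cpu_cores=None):
    """Return a thread count following the tiers described above.

    Hypothetical helper for illustration; not part of the hfdl API.
    """
    if cpu_cores is None:
        # os.cpu_count() may return None if the count cannot be determined.
        cpu_cores = os.cpu_count() or 1
    if cpu_cores <= 2:
        return 2                # 1-2 cores: always 2 threads
    return min(cpu_cores, 8)    # 3-8 cores: one per core; above 8: cap at 8


for cores in (1, 4, 16):
    print(cores, "cores ->", auto_thread_count(cores), "threads")
```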

2. Size-Based File Categorization

What it does: Classifies files as "small" or "big" based on a configurable threshold (default: 100 MB)
How it works:
- Small files: Downloaded quickly, often in parallel
- Big files: Handled with bandwidth control for efficient resource allocation
Why it's smart: Optimizes download strategy based on file characteristics
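
A minimal sketch of the size split, assuming files arrive as (name, size in bytes) pairs and that 1 MB means 1024 * 1024 bytes. Both are assumptions, and `categorize_by_size` is a hypothetical name rather than an HFDL function.

```python
def categorize_by_size(files, size_threshold_mb=100):
    """Split (name, size_in_bytes) pairs into small and big file names.

    Hypothetical helper. A file counts as big only when it is strictly
    larger than the threshold.
    """
    threshold_bytes = size_threshold_mb * 1024 * 1024  # assumed MB definition
    small, big = [], []
    for name, size in files:
        if size > threshold_bytes:
            big.append(name)
        else:
            small.append(name)
    return small, big


small, big = categorize_by_size([
    ("config.json", 1_200),
    ("model.gguf", 5 * 1024 ** 3),
])
print("small:", small)
print("big:", big)
```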

3. Bandwidth Measurement and Control

What it does: Measures your download speed and limits bandwidth use to a percentage of the measured speed (default: 95%)
How it works:
- Measures initial speed with a sample file
- Allocates bandwidth across threads for large files
- Introduces micro-delays to maintain speed limits when needed
Why it's smart: Prevents network saturation while maximizing throughput
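
One way to picture the allocation and micro-delay steps is shown below. The even split across threads and the delay formula are assumptions made for illustration; `per_thread_limit` and `throttle_delay` are hypothetical names, not HFDL functions.

```python
def per_thread_limit(measured_bps, bandwidth_percentage=95, threads=4):
    """Bytes per second each thread may use, assuming an even split."""
    allowed_bps = measured_bps * bandwidth_percentage / 100
    return allowed_bps / threads


def throttle_delay(bytes_done, elapsed_s, limit_bps):
    """Seconds to pause so the average speed stays at or below limit_bps."""
    # Time the transfer should have taken at exactly the limit.
    target_elapsed_s = bytes_done / limit_bps
    return max(0.0, target_elapsed_s - elapsed_s)


limit = per_thread_limit(10_000_000, bandwidth_percentage=95, threads=4)
print("per-thread limit (bytes/s):", limit)
# In a download loop, a worker could call
# time.sleep(throttle_delay(bytes_done, elapsed_s, limit)) after each chunk.
print("pause (s):", throttle_delay(1_000_000, 0.25, 2_000_000))
```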

4. Graceful Interruption Handling

What it does: Ensures downloads can be safely interrupted without corrupted files
How it works:
- Uses a dedicated thread for interrupt signals on multi-core systems
- Implements clean shutdown procedures for all resources
Why it's smart: Provides reliability and responsiveness during long downloads
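
A common pattern for interruption without corrupted files is a shared stop flag plus a temporary file that is renamed only on completion. The sketch below shows that pattern. It is an assumption about how such a design can look, not a description of HFDL's internals, and the names `install_sigint_handler`, `write_chunks`, and the `.part` suffix are hypothetical.

```python
import os
import signal
import tempfile
import threading


def install_sigint_handler(stop_event):
    """Set stop_event on Ctrl+C. Call this from the main thread."""
    signal.signal(signal.SIGINT, lambda signum, frame: stop_event.set())


def write_chunks(chunks, dest_path, stop_event):
    """Write chunks to dest_path; return False if interrupted.

    Data goes to a ".part" file first, so dest_path only ever
    contains a complete file.
    """
    part_path = dest_path + ".part"
    with open(part_path, "wb") as fh:
        for chunk in chunks:
            if stop_event.is_set():
                return False  # partial data stays in the .part file
            fh.write(chunk)
    os.replace(part_path, dest_path)  # atomic rename on the same filesystem
    return True


with tempfile.TemporaryDirectory() as tmp:
    stop = threading.Event()
    target = os.path.join(tmp, "weights.bin")
    print(write_chunks([b"abc", b"def"], target, stop), os.path.exists(target))
```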

5. Comprehensive Error Handling

What it does: Anticipates and manages a wide range of potential errors
How it works:
- Implements custom exception hierarchy for precise error handling
- Provides fallback mechanisms and recovery strategies
Why it's smart: Maintains operation even under adverse conditions

6. Progress Tracking

What it does: Monitors and displays download progress at both file and overall levels
How it works:
- Tracks bytes downloaded for each file
- Aggregates progress across all files for overall completion percentage
Why it's smart: Provides real-time feedback with thread-safe accuracy
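
The per-file and overall tracking described above could look roughly like this. `ProgressTracker` is a hypothetical class written for illustration; a single lock guards the counters so that several download threads can report bytes concurrently.

```python
import threading


class ProgressTracker:
    """Hypothetical thread-safe byte counter, not the hfdl class."""

    def __init__(self, file_sizes):
        self._sizes = dict(file_sizes)               # name -> total bytes
        self._done = {name: 0 for name in self._sizes}
        self._lock = threading.Lock()

    def add(self, name, num_bytes):
        """Record num_bytes more downloaded for one file."""
        with self._lock:
            self._done[name] += num_bytes

    def file_percent(self, name):
        with self._lock:
            size = self._sizes[name]
            return 100.0 if size == 0 else 100.0 * self._done[name] / size

    def overall_percent(self):
        with self._lock:
            total = sum(self._sizes.values())
            done = sum(self._done.values())
            return 100.0 if total == 0 else 100.0 * done / total


tracker = ProgressTracker({"config.json": 100, "model.gguf": 300})
tracker.add("config.json", 100)
print(tracker.file_percent("config.json"), tracker.overall_percent())
```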

Installation

pip install hfdl

Or install from source:

git clone https://github.com/MubarakHAlketbi/hfdl.git
cd hfdl
pip install -e .

Quick Start

from hfdl import HFDownloader

# Basic usage
downloader = HFDownloader("MaziyarPanahi/Qwen2.5-7B-Instruct-GGUF")
downloader.download()

# Enhanced mode with custom settings
downloader = HFDownloader(
    "Anthropic/hh-rlhf",
    repo_type="dataset",
    enhanced_mode=True,
    size_threshold_mb=100,
    bandwidth_percentage=95
)
downloader.download()

Command Line Usage

CLI options are grouped into Basic, Advanced, and Output categories.

Interactive Mode

If you run hfdl without arguments, it will enter interactive mode and guide you through the process:

# Start interactive mode
hfdl

Command Examples

# Basic usage
hfdl MaziyarPanahi/Qwen2.5-7B-Instruct-GGUF

# Optimized downloading (size-based categorization and bandwidth control)
hfdl Anthropic/hh-rlhf --optimize-download

# Custom threads and directory
hfdl Anthropic/hh-rlhf --threads 4 --directory ./models

# Test what would be downloaded without downloading
hfdl Anthropic/hh-rlhf --dry-run

Available Options

Basic Options:
  -d, --directory DIR       Directory where files will be saved
  -r, --repo-type TYPE      Type of repository (model/dataset/space)
  --verify                  Verify integrity of downloaded files
  --force                   Force fresh download, overwriting existing files
  --no-resume               Disable download resuming

Advanced Options:
  --optimize-download       Enable optimized downloading with size-based
                            categorization and bandwidth control
  -t, --threads NUM         Number of download threads (auto: optimal based on
                            CPU cores, or specify a positive number)
  --size-threshold MB       Files larger than this size will use bandwidth control
  --bandwidth PERCENT       Percentage of measured bandwidth to use
  --measure-time SECS       Duration to measure initial download speed

Output Options:
  --quiet                   Suppress all output except errors
  --verbose                 Show detailed progress and debug information
  --dry-run                 Show what would be downloaded without downloading

Cross-Platform Compatibility

HFDL is designed to work seamlessly across different operating systems:

- Windows, macOS, and Linux support
- Path sanitization to handle OS-specific filename restrictions
- Adaptive file handling that respects platform limitations

Error Handling

HFDL defines a custom exception hierarchy and a set of automatic recovery mechanisms.

Error Types

Download Errors:
- HFDownloadError: Base exception for all errors
- ThreadManagerError: Thread-related errors
- FileManagerError: File operation errors
- SpeedManagerError: Speed control errors
Specific Errors:
- FileSizeError: File size calculation issues
- FileTrackingError: Progress tracking issues
- SpeedMeasurementError: Speed measurement issues
- SpeedAllocationError: Speed allocation issues
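
To show how such a hierarchy is typically used, the sketch below defines local stand-ins with the same names and catches them from most specific to most general. Only "HFDownloadError is the base for all errors" comes from the list above. Which manager error each specific error derives from is an assumption, and `classify` is a hypothetical helper.

```python
# Local stand-ins for illustration; the real classes live inside hfdl.
class HFDownloadError(Exception): pass
class ThreadManagerError(HFDownloadError): pass
class FileManagerError(HFDownloadError): pass
class SpeedManagerError(HFDownloadError): pass

# Assumed parents: file errors under FileManagerError,
# speed errors under SpeedManagerError.
class FileSizeError(FileManagerError): pass
class FileTrackingError(FileManagerError): pass
class SpeedMeasurementError(SpeedManagerError): pass
class SpeedAllocationError(SpeedManagerError): pass


def classify(exc):
    """Re-raise exc and report which handler caught it."""
    try:
        raise exc
    except SpeedMeasurementError:
        return "speed measurement"      # most specific handler first
    except FileManagerError:
        return "file handling"          # also catches its subclasses
    except HFDownloadError:
        return "other HFDL error"       # base class catches the rest


print(classify(FileSizeError("size unknown")))
```

Catching the base class last gives one fallback handler for every HFDL error while still allowing targeted handling of specific failures.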

Error Recovery

HFDL implements automatic error recovery:

- Network retry mechanisms for transient failures
- Resource cleanup to prevent leaks
- State recovery to resume interrupted operations
- Fallback to legacy mode when enhanced features encounter issues
- OS-specific path handling to prevent filename-related errors
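
As an illustration of the network retry idea, here is a generic retry loop with exponential backoff. The attempt count, delays, and exception types are assumptions chosen for the example, and `retry` is a hypothetical helper; HFDL's own retry policy may differ.

```python
import time


def retry(operation, attempts=3, base_delay=0.5,
          retry_on=(ConnectionError, TimeoutError), sleep=time.sleep):
    """Call operation(); on a transient error wait and try again.

    The wait doubles after each failure (base_delay, 2x, 4x, ...).
    The last failure is re-raised unchanged.
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise
            sleep(base_delay * 2 ** (attempt - 1))


# Example: an operation that fails twice, then succeeds.
state = {"calls": 0}

def flaky_fetch():
    state["calls"] += 1
    if state["calls"] < 3:
        raise ConnectionError("transient network failure")
    return "ok"

print(retry(flaky_fetch, base_delay=0.01))
```

Passing `sleep` as a parameter keeps the helper easy to test without real waiting.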

Thread Safety

All operations are thread-safe:

- Resource protection with proper locking mechanisms
- State consistency across concurrent operations
- Safe cleanup even during interruptions
- Error propagation to the appropriate handlers
- Thread-aware progress tracking

Testing

HFDL ships with unit, integration, error, and thread-safety tests.

Running Tests

# Install test dependencies
pip install pytest pytest-mock

# Run all tests
pytest hfdl/tests/

# Run specific test categories
pytest -v -k "error" hfdl/tests/      # Error handling tests
pytest -v -k "thread_safety" hfdl/tests/  # Thread safety tests
pytest -v hfdl/tests/test_downloader.py   # Downloader tests

Test Categories

Unit Tests:
- Component functionality
- Error handling
- Input validation
- State management
Integration Tests:
- Component interaction
- Error propagation
- Resource management
- System behavior
Error Tests:
- Error scenarios
- Recovery mechanisms
- Resource cleanup
- State consistency
Thread Safety Tests:
- Concurrent operations
- Resource contention
- State consistency
- Error handling

Contributing

1. Fork the repository
2. Create your feature branch
3. Add tests for your changes
4. Ensure all tests pass
5. Submit a pull request

Development Setup

Clone your fork of the repository (replace yourusername with your GitHub username):

git clone https://github.com/yourusername/hfdl.git
cd hfdl

Create virtual environment:

python -m venv venv
source venv/bin/activate  # Linux/Mac
venv\Scripts\activate     # Windows

Install dependencies:

pip install -e ".[dev]"

Run tests:

pytest hfdl/tests/

License

This project is licensed under the MIT License - see the LICENSE file for details.
