# Tolerance Parameter Fix - Update Notes

## What Was Wrong

The original tolerance parameter used raw Euclidean distance in RGB space (0-255 scale), which was unintuitive:

- Max possible distance in RGB = sqrt(3 × 255²) ≈ 441
- A tolerance of "30" was actually very strict (only ~7% of max distance)
- For antialiasing around text, you needed values like 150+, which wasn't obvious

## What's Fixed

**New Scale: 0-100** (percentage-based)

- 0 = exact color match only
- 30 = 30% of maximum color distance (default, RECOMMENDED)
- 100 = maximum tolerance

**Why This Matters for Antialiasing:**

Example: Gray layer (150,150,150) with black text (0,0,0)

- Antialiasing creates intermediate colors: (75,75,75), (100,100,100), (125,125,125)
- Distance from gray (150,150,150) to (75,75,75) = sqrt(3 × 75²) ≈ 130
- Old scale: You'd need tolerance ~130 (not intuitive)
- New scale: tolerance 30-45 captures these (makes sense!)

## Updated Recommendations

```bash
# Default - good for most diagrams
python layer_extractor.py diagram.pdf -t 30

# Heavy antialiasing (small text, complex diagrams)
python layer_extractor.py diagram.pdf -t 45

# Extreme antialiasing (compressed PDFs, low quality)
python layer_extractor.py diagram.pdf -t 60

# Very strict (clean diagrams, no antialiasing)
python layer_extractor.py diagram.pdf -t 15
```

## Key Point

**If you see missing pixels around text or edges → INCREASE tolerance (not decrease!)**

The antialiased pixels are "far" from the target color in RGB space, so they need higher tolerance to be captured.

## Test Your Diagram

Start with default (30), then:

- Missing pixels/gaps around text? → Try 45
- Still missing details? → Try 60
- Layers bleeding together? → Try 20
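
To make the 0-100 scale concrete, here is a minimal sketch of how a percentage tolerance could map onto a raw RGB distance threshold. It is not taken from `layer_extractor.py`: the function names, the linear mapping, and the inclusive `<=` comparison are all assumptions made for illustration.

```python
import math

# Largest possible Euclidean distance between two 8-bit RGB colors:
# sqrt(3 * 255^2), roughly 441.67.
MAX_RGB_DISTANCE = math.sqrt(3 * 255 ** 2)


def rgb_distance(a, b):
    """Euclidean distance between two (r, g, b) tuples on the 0-255 scale."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def percent_to_distance(tolerance_pct):
    """Convert a 0-100 tolerance into a raw distance threshold.

    Assumes a simple linear mapping; the real tool may differ.
    """
    if not 0 <= tolerance_pct <= 100:
        raise ValueError("tolerance must be between 0 and 100")
    return tolerance_pct / 100 * MAX_RGB_DISTANCE


def matches(pixel, target, tolerance_pct):
    """True if pixel is within tolerance_pct of target (hypothetical helper)."""
    return rgb_distance(pixel, target) <= percent_to_distance(tolerance_pct)


# Gray layer with black text: check the antialiased shades from the example.
gray = (150, 150, 150)
for shade in (125, 100, 75):
    pixel = (shade, shade, shade)
    pct = rgb_distance(pixel, gray) / MAX_RGB_DISTANCE * 100
    print(pixel, f"{pct:.1f}% of max distance,",
          "captured at -t 30:", matches(pixel, gray, 30))
```

Under these assumptions, the three shades sit at roughly 9.8%, 19.6%, and 29.4% of the maximum distance. The darkest one, (75,75,75), falls just inside the default of 30, which is why pixels closer to the text color need 45 or more.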