- Early return when no entityRanges (skip unnecessary processing)
- Escape [ and ] in link labels to prevent nested bracket issues
- Encode ) in URLs as %29 to prevent malformed markdown links (e.g. Wikipedia)
- Add 3 new test cases for the above edge cases
Extract visibility metadata from TweetWithVisibilityResults wrapper
before unwrapping. Adds is_subscriber_only field to Tweet model,
with full serialization roundtrip and test coverage.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Twitter long tweets (>280 chars) store full text in
note_tweet.note_tweet_results.result.text rather than legacy.full_text.
The parser now prioritizes note_tweet text when available.
Split the monolithic client.py (1341 lines) into three focused modules:
- graphql.py (~200 lines): queryId resolution, URL building, JS bundle
scanning, feature flag management
- parser.py (~270 lines): Tweet/User/Media/Article parsing, utility functions
(_deep_get, _parse_int, _extract_cursor, _extract_media)
- client.py (~700 lines): TwitterClient class with HTTP engine, anti-detection,
session management, and all public API methods
Backward compatibility: client.py re-exports all previously public symbols
so existing test imports work without modification. 88/88 tests pass.