PDF Phishing Triage
Static-first checklist to quickly classify phishing PDFs, extract redirects, and identify exploit indicators.
2/21/2026 • 3 min read
Goal
Quickly determine:
- Is this an exploit PDF?
- Or just a visual lure with a hyperlink?
- Does it embed payloads?
- Where does it redirect?
Keep it static unless proven otherwise.
Phase 1 - Identify File Type
file sample.pdf
Expected:
- PDF document
If you see:
- Zip archive data
- Java archive
- PE32 executable
The extension is lying. Triage the file as the type reported instead.
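If the file utility is not available, the same check can be approximated by reading the leading magic bytes directly. This is a minimal sketch: magic_type is a hypothetical helper name, and it only inspects the first four bytes, so a PDF whose header is preceded by junk (which lenient readers tolerate) would show up as unknown.

```shell
# Hypothetical helper: classify a file by its leading magic bytes.
magic_type() {
    # First 4 bytes as lowercase hex, e.g. "25504446" for "%PDF".
    sig=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    case "$sig" in
        25504446) echo "pdf" ;;            # %PDF
        504b0304) echo "zip-or-jar" ;;     # PK\x03\x04 (a JAR is a ZIP archive)
        4d5a*)    echo "pe-executable" ;;  # MZ
        *)        echo "unknown" ;;
    esac
}
# Usage: magic_type sample.pdf
```

Anything other than pdf here means the remaining phases do not apply as written.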
Phase 2 - Quick Suspicion Scan
pdfid.py sample.pdf
Look for non-zero values:
/JavaScript, /JS, /OpenAction, /Launch, /EmbeddedFile, /AA
If all are zero, it is likely a hyperlink lure.
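The scan can be narrowed to just the keywords that matter. A minimal sketch, assuming pdfid.py prints one keyword and its count per line; pdfid_flags is a hypothetical helper name.

```shell
# Hypothetical helper: read pdfid.py output on stdin and print the watched
# keywords whose count is non-zero.
pdfid_flags() {
    awk '$1 ~ /^\/(JavaScript|JS|OpenAction|Launch|EmbeddedFile|AA)$/ && $2 + 0 > 0 { print $1 }'
}
# Usage: pdfid.py sample.pdf | pdfid_flags
```

Empty output corresponds to the "all zero" case above.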
Phase 3 - Extract URI Targets (Cleanly)
pdf-parser.py -s URI --filter sample.pdf
Ultra-clean extraction:
pdf-parser.py -s URI --filter sample.pdf | sed -n 's/.*\/URI (\(.*\)).*/\1/p'
Fallback:
strings sample.pdf | grep -Eo 'https?://[^") ]+'
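A variant of the fallback that deduplicates results and does not need strings. This is a sketch only: extract_urls is a hypothetical helper name, and like the strings approach it only sees URLs stored in uncompressed objects.

```shell
# Hypothetical helper: print the unique http(s) URLs found in a file.
# grep -a treats binary input as text, so strings is not required.
extract_urls() {
    LC_ALL=C grep -a -Eo 'https?://[^") ]+' "$1" | sort -u
}
# Usage: extract_urls sample.pdf
```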
Phase 4 - Identify Clickable Overlays
pdf-parser.py -s "/Subtype /Link" sample.pdf
Look for:
/Rect [0 0 612 792]
612 x 792 points is a full US Letter page, so this often indicates a full-page invisible hyperlink.
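Pages are not always that size (A4 is 595 x 842 points), so it can help to compare the link rectangle against the page's own /MediaBox dimensions. A rough sketch: rect_coverage is a hypothetical helper, and it assumes the /MediaBox origin is 0 0 so that width and height can be passed directly.

```shell
# Hypothetical helper: percentage of the page covered by a link rectangle.
# $1 = the /Rect values "x1 y1 x2 y2"; $2, $3 = page width and height in points.
rect_coverage() {
    awk -v r="$1" -v w="$2" -v h="$3" 'BEGIN {
        split(r, a, " ")
        dx = a[3] - a[1]; if (dx < 0) dx = -dx
        dy = a[4] - a[2]; if (dy < 0) dy = -dy
        printf "%d\n", (dx * dy * 100) / (w * h)
    }'
}
# Usage: rect_coverage "0 0 612 792" 612 792
```

A value near 100 suggests a full-page overlay. A small value is more consistent with an ordinary button or inline link.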
Phase 5 - Check for Auto-Execution
pdf-parser.py -s OpenAction sample.pdf
If any object matches (or pdfid reported a non-zero /OpenAction or /AA count), an action may run when the document opens.
If nothing matches, a user click is usually required.
Phase 6 - Check for Embedded Files
pdf-parser.py -s EmbeddedFile sample.pdf
If found:
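pdf-parser.py -o <object#> -f -d embedded.bin sample.pdf
Here -f decodes the stream and -d names the output file (embedded.bin is a placeholder).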
Phase 7 - JavaScript Review
Search for JS:
pdf-parser.py -s JavaScript --filter sample.pdf
Common benign decoy:
app.alert("Browser not compatible")
Malicious JS often shows:
- eval
- obfuscation
- long encoded strings
- shellcode-like blobs
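These indicators can be roughly screened for once the JavaScript stream has been decoded. A minimal sketch: js_red_flags is a hypothetical helper, and both the keyword list (unescape and String.fromCharCode are added here as common decoding primitives) and the 200-character threshold are assumptions, not rules.

```shell
# Hypothetical helper: read decoded JavaScript on stdin and print red flags.
js_red_flags() {
    js=$(cat)
    # Dynamic evaluation and common decoding primitives (assumed list).
    printf '%s\n' "$js" | grep -Eo 'eval|unescape|String\.fromCharCode' | sort -u
    # Unbroken runs of 200+ encoding-like characters (arbitrary threshold).
    if printf '%s\n' "$js" | grep -Eq '[A-Za-z0-9+/=%]{200,}'; then
        echo "long-encoded-string"
    fi
}
# Usage: pdf-parser.py -o <object#> -f sample.pdf | js_red_flags
```

The benign decoy above produces no output. A hit is a reason to read the script, not a verdict.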
Phase 8 - Render Safely
Open visually (REMnux):
evince sample.pdf
Or render as image:
pdftoppm sample.pdf page -png
Phase 9 - Redirect Chain Mapping
If URI is found:
curl -Iks https://domain.tld
Then pivot:
- DNS
- Hosting provider
- CDN usage
- TLS filtering behavior
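To see each hop instead of only the first response, the Location headers can be pulled out of the curl output. A sketch under assumptions: redirect_hops is a hypothetical helper, and the usage line adds -L so curl follows redirects, which the command above does not do.

```shell
# Hypothetical helper: read HTTP response headers on stdin and print each
# Location target, one per line, in the order seen.
redirect_hops() {
    awk 'tolower($1) == "location:" { sub(/\r$/, "", $2); print $2 }'
}
# Usage (domain.tld is the placeholder used above):
#   curl -sIkL https://domain.tld | redirect_hops
```

This only captures HTTP-level redirects. Hops performed by JavaScript or meta refresh will not appear in the headers.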
Common Patterns Observed
1. Visual Lure Only
- Blurred invoice
- "Cannot display PDF"
- Full-page link overlay
- No exploit
2. Redirect to Stager
- PDF links to a raw IP address
- Downloads archive (.7z, .jar)
3. SPA Frontend
- Netlify / Vercel
- Vue/React bundle
- JS POST to backend API
- Credential harvesting endpoint
Triage Decision Tree
| Indicator | Likely Type |
|---|---|
| /Launch | Exploit / payload |
| /EmbeddedFile | Dropper PDF |
| Only /URI | Click-through phishing |
| High entropy streams | Obfuscation |
| SPA backend POST found | Credential kit |
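The first three rows of the table can be sketched as a lookup. likely_type is a hypothetical helper, and the precedence (/Launch over /EmbeddedFile over /URI) is an assumption about how to read the table when several indicators are present.

```shell
# Hypothetical helper: map observed indicators to the likely type in the table.
# Pass the indicator names seen in the sample as arguments.
likely_type() {
    case " $* " in
        *" /Launch "*)       echo "Exploit / payload" ;;
        *" /EmbeddedFile "*) echo "Dropper PDF" ;;
        *" /URI "*)          echo "Click-through phishing" ;;
        *)                   echo "Unclassified" ;;
    esac
}
# Usage: likely_type /URI /EmbeddedFile
```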
Reporting Checklist
- SHA256 of PDF
- Extracted URIs
- Hosting provider
- Backend endpoints
- Screenshots
- Submission to hosting abuse
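The first checklist item can be captured in a paste-ready form. A minimal sketch, assuming GNU coreutils sha256sum is available (as on REMnux); sha256_of is a hypothetical helper name.

```shell
# Hypothetical helper: print only the SHA256 digest of a file.
sha256_of() {
    sha256sum "$1" | awk '{ print $1 }'
}
# Usage: sha256_of sample.pdf
```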
Reminder
Most phishing PDFs are social engineering, not exploit chains.
Do not overcomplicate.