PDF Phishing Triage

Static-first checklist to quickly classify phishing PDFs, extract redirects, and identify exploit indicators.

2/21/2026 · 3 min read

Goal

Quickly determine:

  • Is this an exploit PDF?
  • Or just a visual lure with a hyperlink?
  • Does it embed payloads?
  • Where does it redirect?

Keep it static unless proven otherwise.


Phase 1 - Identify File Type

file sample.pdf

Expected:

  • PDF document

If you see:

  • Zip archive data
  • Java archive
  • PE32 executable

The extension is lying. Triage the file as the type that file reports, not as a PDF.
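
Where the file utility is not available, the same check can be approximated by looking at the leading bytes. The following is a minimal sketch, not a replacement for file: the function name sniff_type and the three signatures are illustrative choices, and the 1024-byte window is an assumption based on readers tolerating junk before the %PDF- header.

```python
# Coarse magic-byte check (hypothetical helper, covers only the types named above).
MAGIC_SIGNATURES = [
    (b"%PDF-", "PDF document"),
    (b"PK\x03\x04", "Zip archive (also JAR)"),
    (b"MZ", "PE executable"),
]


def sniff_type(data: bytes) -> str:
    """Return a coarse file type based on the leading bytes of the sample."""
    head = data[:1024]
    for magic, label in MAGIC_SIGNATURES:
        if head.startswith(magic):
            return label
    # Some readers accept a PDF header that is not at offset 0,
    # so look slightly further before giving up.
    if b"%PDF-" in head:
        return "PDF document (header not at offset 0)"
    return "unknown"


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        print(sniff_type(fh.read(1024)))
```

A PDF header that is not at offset 0 is worth noting in the report, since it can indicate a polyglot file.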


Phase 2 - Quick Suspicion Scan

pdfid.py sample.pdf

Look for non-zero values:

  • /JavaScript
  • /JS
  • /OpenAction
  • /Launch
  • /EmbeddedFile
  • /AA

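If all are zero, it is most likely a hyperlink lure. A non-zero /ObjStm count means keywords may be hidden inside compressed object streams, so check those before ruling anything out.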
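
To make the idea behind the scan concrete, here is a rough sketch that counts the same names in the raw bytes. It is not how pdfid.py is implemented; count_keywords and decode_name_escapes are hypothetical helpers, and decoding #xx escapes across the whole file is a simplification that can produce false positives inside binary streams.

```python
import re

# The names listed above; this list is the only input the sketch relies on.
KEYWORDS = ["/JavaScript", "/JS", "/OpenAction", "/Launch", "/EmbeddedFile", "/AA"]


def decode_name_escapes(data: bytes) -> bytes:
    """Resolve #xx hex escapes used to disguise names, e.g. /J#61vaScript."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def count_keywords(data: bytes) -> dict:
    """Count each keyword in the raw bytes, ignoring longer names that share a prefix."""
    decoded = decode_name_escapes(data)
    counts = {}
    for keyword in KEYWORDS:
        # The lookahead stops /AA from matching a longer name such as /AAPL.
        pattern = re.escape(keyword.encode()) + rb"(?![A-Za-z0-9])"
        counts[keyword] = len(re.findall(pattern, decoded))
    return counts
```

Names inside compressed object streams are not visible to this kind of raw scan, which is the same limitation noted for a zero count.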


Phase 3 - Extract URI Targets (Cleanly)

pdf-parser.py -s URI --filter sample.pdf

Ultra-clean extraction:

pdf-parser.py -s URI --filter sample.pdf | sed -n 's/.*\/URI (\(.*\)).*/\1/p'

Fallback:

strings sample.pdf | grep -Eo 'https?://[^") ]+'
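
The sed and grep one-liners can also be combined in a short script when the results need to be deduplicated or fed into other tooling. This is a sketch under simple assumptions: extract_uris is a hypothetical helper, it only sees URIs stored as uncompressed literal strings, and it handles only the \(, \) and \\ escapes.

```python
import re

# /URI (....) literal strings, allowing backslash-escaped characters inside.
URI_ACTION = re.compile(rb"/URI\s*\(((?:\\.|[^\\)])*)\)")
# Fallback: anything that looks like a bare http(s) URL.
BARE_URL = re.compile(rb"https?://[^\s\"'<>()]+")


def extract_uris(data: bytes) -> list:
    """Return unique URI targets in order of first appearance."""
    raw_hits = [m.group(1) for m in URI_ACTION.finditer(data)]
    raw_hits += [m.group(0) for m in BARE_URL.finditer(data)]

    seen = set()
    result = []
    for raw in raw_hits:
        # Undo the three common PDF string escapes, then decode leniently.
        cleaned = re.sub(rb"\\([()\\])", rb"\1", raw).decode("latin-1")
        if cleaned not in seen:
            seen.add(cleaned)
            result.append(cleaned)
    return result


if __name__ == "__main__":
    import sys

    with open(sys.argv[1], "rb") as fh:
        for uri in extract_uris(fh.read()):
            print(uri)
```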

Phase 4 - Identify Clickable Overlays

pdf-parser.py -s "/Subtype /Link" sample.pdf

Look for:

  • /Rect [0 0 612 792]

A /Rect that matches the page size (612 x 792 points is US Letter) often indicates a full-page invisible hyperlink.
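
Pages are not always US Letter, so comparing the link rectangle to the page's own /MediaBox is more reliable than matching fixed numbers. A minimal sketch of that comparison follows; the function names and the 0.8 coverage threshold are assumptions chosen for illustration, not values from any tool.

```python
def rect_coverage(rect, media_box=(0, 0, 612, 792)):
    """Return the fraction of the page area covered by a link /Rect."""
    x0, y0, x1, y1 = rect
    page_x0, page_y0, page_x1, page_y1 = media_box

    # Corners may be given in any order, so normalise them first.
    left, right = sorted((x0, x1))
    bottom, top = sorted((y0, y1))

    # Clip the rectangle to the page before measuring it.
    width = max(0.0, min(right, page_x1) - max(left, page_x0))
    height = max(0.0, min(top, page_y1) - max(bottom, page_y0))

    page_area = (page_x1 - page_x0) * (page_y1 - page_y0)
    return (width * height) / page_area if page_area > 0 else 0.0


def is_full_page_overlay(rect, media_box=(0, 0, 612, 792), threshold=0.8):
    """Flag links that cover most of the page (threshold is an assumed cut-off)."""
    return rect_coverage(rect, media_box) >= threshold
```

For example, is_full_page_overlay((0, 0, 612, 792)) is true, while a small button-sized rectangle such as (100, 700, 200, 720) is not flagged.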


Phase 5 - Check for Auto-Execution

pdf-parser.py -s OpenAction sample.pdf

If any object is returned (or pdfid reported a non-zero /OpenAction or /AA count), an action may run when the document opens.

If nothing is returned, a user click is usually required.


Phase 6 - Check for Embedded Files

pdf-parser.py -s EmbeddedFile sample.pdf

If found:

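pdf-parser.py -o <object#> -f -d extracted.bin sample.pdf

Here -d takes the output filename (extracted.bin is a placeholder), so the sample path must come last; -f decodes the stream filters before dumping.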

Phase 7 - JavaScript Review

Search for JS:

pdf-parser.py -s JavaScript --filter sample.pdf

Common benign decoy:

  • app.alert("Browser not compatible")

Malicious JS often shows:

  • eval
  • obfuscation
  • long encoded strings
  • shellcode-like blobs
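
These indicators can be turned into a quick first-pass filter over extracted script text. The sketch below is deliberately crude: js_indicators is a hypothetical helper, and unescape, String.fromCharCode, the 200-character run length, and the %u pattern are my own assumptions about what "obfuscation" and "shellcode-like" usually look like, not rules from the tools above.

```python
import re

# Calls often used to decode or execute hidden content (assumed list).
SUSPICIOUS_CALLS = ("eval", "unescape", "String.fromCharCode")
# A long unbroken run of base64/percent/backslash-style characters.
LONG_ENCODED = re.compile(r"[A-Za-z0-9+/=%\\]{200,}")
# Repeated %uXXXX units, the classic unescape()-encoded shellcode shape.
PERCENT_U_BLOB = re.compile(r"(?:%u[0-9A-Fa-f]{4}){8,}")


def js_indicators(source: str) -> list:
    """Return the names of the heuristics that fire on the script text."""
    hits = []
    for name in SUSPICIOUS_CALLS:
        if re.search(r"\b" + re.escape(name) + r"\s*\(", source):
            hits.append(name)
    if LONG_ENCODED.search(source):
        hits.append("long encoded string")
    if PERCENT_U_BLOB.search(source):
        hits.append("shellcode-like %u blob")
    return hits
```

An empty result on something like app.alert("Browser not compatible") is consistent with a decoy; any hit means the script deserves a manual read.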

Phase 8 - Render Safely

Open visually (REMnux):

evince sample.pdf

Or render as image:

pdftoppm -png sample.pdf page

Phase 9 - Redirect Chain Mapping

If a URI is found, request only the response headers and note any Location redirect:

curl -Iks https://domain.tld

Then pivot:

  • DNS
  • Hosting provider
  • CDN usage
  • TLS filtering behavior
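
When several hops are involved, the saved header output is easier to report as a list of status codes and Location targets. The sketch below assumes the headers were captured with curl -IksL (the -L flag, which follows redirects and prints headers for each hop, is my addition and not part of the command above); parse_redirect_chain is a hypothetical helper.

```python
def parse_redirect_chain(header_dump: str) -> list:
    """Turn concatenated HTTP response headers into (status, location) hops."""
    hops = []
    status = None
    location = None

    for line in header_dump.splitlines():
        line = line.strip()
        if line.upper().startswith("HTTP/"):
            # A new status line starts a new hop; store the previous one first.
            if status is not None:
                hops.append((status, location))
            parts = line.split()
            status = int(parts[1]) if len(parts) > 1 and parts[1].isdigit() else None
            location = None
        elif line.lower().startswith("location:"):
            location = line.split(":", 1)[1].strip()

    if status is not None:
        hops.append((status, location))
    return hops


if __name__ == "__main__":
    import sys

    for code, target in parse_redirect_chain(sys.stdin.read()):
        print(code, target or "-")
```

Each Location value is a candidate for the DNS and hosting pivots listed above.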

Common Patterns Observed

1. Visual Lure Only

  • Blurred invoice
  • "Cannot display PDF"
  • Full-page link overlay
  • No exploit

2. Redirect to Stager

  • PDF links to a raw IP address
  • Downloads archive (.7z, .jar)

3. SPA Frontend

  • Netlify / Vercel
  • Vue/React bundle
  • JS POST to backend API
  • Credential harvesting endpoint

Triage Decision Tree

Indicator                  Likely Type
/Launch                    Exploit / payload
/EmbeddedFile              Dropper PDF
Only /URI                  Click-through phishing
High entropy streams       Obfuscation
SPA backend POST found     Credential kit
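
The table can be read as a set of independent checks rather than a strict tree. The sketch below encodes it that way; likely_types and shannon_entropy are hypothetical helpers, the keyword counts are assumed to come from an earlier scan, and the 7.5 bits-per-byte threshold is an assumption. Entropy should be measured on decoded streams, since compressed data is high-entropy by nature.

```python
import math
from collections import Counter

# Names whose presence means the sample is more than a plain link lure.
ACTIVE_NAMES = ("/Launch", "/EmbeddedFile", "/JavaScript", "/JS", "/OpenAction", "/AA")


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def likely_types(counts: dict, max_stream_entropy: float = 0.0,
                 spa_post_found: bool = False) -> list:
    """Map observed indicators to the likely types from the table."""
    verdicts = []
    if counts.get("/Launch", 0) > 0:
        verdicts.append("Exploit / payload")
    if counts.get("/EmbeddedFile", 0) > 0:
        verdicts.append("Dropper PDF")
    only_uri = counts.get("/URI", 0) > 0 and not any(
        counts.get(name, 0) for name in ACTIVE_NAMES
    )
    if only_uri:
        verdicts.append("Click-through phishing")
    if max_stream_entropy >= 7.5:  # assumed threshold
        verdicts.append("Obfuscation")
    if spa_post_found:
        verdicts.append("Credential kit")
    return verdicts
```

Several verdicts can apply at once, for example a click-through PDF whose landing page turns out to be a credential kit.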

Reporting Checklist

  • SHA256 of PDF
  • Extracted URIs
  • Hosting provider
  • Backend endpoints
  • Screenshots
  • Submission to hosting abuse
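
The first checklist item is easy to script so the hash always comes from the exact bytes that were analysed. A minimal sketch using the standard library; sha256_of_file is a hypothetical helper name and the chunk size is arbitrary.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 65536) -> str:
    """Return the hex SHA256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys

    print(sha256_of_file(sys.argv[1]), sys.argv[1])
```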

Reminder

Most phishing PDFs are social engineering, not exploit chains.

Do not overcomplicate.

