Content Discovery

Resources

# Fuzzing Wordlists
https://github.com/fuzzdb-project/fuzzdb

# Fuzzing and Content Discovery
https://github.com/kaimi-io/web-fuzz-wordlists


Tips

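# Fuzz boundary and non-printable characters in any user input
# (0x00 and 0xFF are non-printable; the others sit just outside the 0-9, A-Z and a-z ranges: / : @ [ ` {)
# Could result in regex bypass, account takeover...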
0x00, 0x2F, 0x3A, 0x40, 0x5B, 0x60, 0x7B, 0xFF
%00, %2F, %3A, %40, %5B, %60, %7B, %FF
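
# A minimal sketch of turning the byte list above into test inputs.
# The helper names (percent_encode, fuzz_values) and the choice to place each
# byte before and after the value are assumptions for illustration only.

```python
# Byte values copied from the list above.
FUZZ_BYTES = [0x00, 0x2F, 0x3A, 0x40, 0x5B, 0x60, 0x7B, 0xFF]


def percent_encode(byte):
    """Return the %XX form of a single byte value (0-255)."""
    return f"%{byte:02X}"


def fuzz_values(value):
    """Build test inputs with each encoded byte placed before and after value."""
    candidates = []
    for byte in FUZZ_BYTES:
        encoded = percent_encode(byte)
        candidates.append(encoded + value)  # prefix position
        candidates.append(value + encoded)  # suffix position
    return candidates


# Example: candidate values for a hypothetical "username" parameter.
for candidate in fuzz_values("admin"):
    print(candidate)
```

# Each candidate can then be sent as a parameter value with any HTTP client or fuzzer.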


JS extraction

# Extract endpoints from JS files (https://github.com/jobertabma/relative-url-extractor)
$ ruby extract.rb https://hackerone.com/some-file.js
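
# A rough sketch of the idea behind this kind of extraction: pull quoted
# strings that look like relative paths out of JavaScript source.
# The regex is a simplified assumption, not the pattern relative-url-extractor
# uses, and it will miss or over-match some cases (it also matches "//host/...").

```python
import re

# Hypothetical pattern: a quoted string starting with "/", "./" or "../"
# followed by common URL path and query characters.
RELATIVE_URL = re.compile(r"""["'](\.{0,2}/[A-Za-z0-9_\-./?=&%]+)["']""")


def extract_relative_urls(js_source):
    """Return the unique relative URLs found in js_source, sorted."""
    return sorted(set(RELATIVE_URL.findall(js_source)))


sample = 'fetch("/api/v1/users?id=1"); load(\'../static/app.js\');'
for url in extract_relative_urls(sample):
    print(url)
```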


# Check for broken links (candidates for domain or account takeover)
# For Twitter, TwitterBFTD is great
https://github.com/stevenvachon/broken-link-checker
$ blc -rof --filter-level 3 https://example.com/
$ blc -rfoi --exclude linkedin.com --exclude youtube.com --filter-level 3 https://example.com/
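
# A complementary sketch, not part of broken-link-checker: list the external
# hosts a page links to, so each can be checked by hand for an expired domain
# or unclaimed account. LinkCollector and external_hosts are hypothetical
# names, and only href/src attributes are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect absolute URLs from href and src attributes."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative and protocol-relative links against the page URL.
                self.links.add(urljoin(self.base_url, value))


def external_hosts(html, base_url):
    """Return sorted hostnames linked from html that differ from base_url's host."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    own_host = urlsplit(base_url).hostname
    hosts = {urlsplit(link).hostname for link in collector.links}
    # Links without a hostname (for example mailto:) are dropped.
    return sorted(h for h in hosts if h and h != own_host)
```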


Dirsearch

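# -u: target URL, -e: extensions to test, -f: force extensions onto every wordlist entry,
# -t: number of threads, -w: wordlist
$ python3 dirsearch.py -u https://www.target.fr -f -e php,xml,txt -t 10 -w wordpress.fuzz.txt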
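
# A simplified sketch of what combining a wordlist with forced extensions
# amounts to: every entry is tried as-is and with each extension appended.
# This is an illustration under that assumption, not dirsearch's own code;
# the real tool may differ in details. expand_wordlist is a hypothetical name.

```python
def expand_wordlist(words, extensions):
    """Yield each word, then the word with every extension appended."""
    for word in words:
        word = word.strip()
        if not word or word.startswith("#"):
            continue  # skip blank lines and comment lines
        yield word
        for ext in extensions:
            yield f"{word}.{ext}"


# Example with the extensions from the command above.
for path in expand_wordlist(["wp-login", "wp-config"], ["php", "xml", "txt"]):
    print(path)
```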


gau

https://github.com/lc/gau
# getallurls (gau) fetches known URLs from AlienVault's Open Threat Exchange,
# the Wayback Machine, and Common Crawl for any given domain

# It can be used to map and discover new targets (endpoints, domains, subdomains...)

$ printf example.com | gau
$ cat domains.txt | gau
$ gau example.com
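
# gau prints one URL per line, so its output is easy to post-process.
# A minimal sketch that groups the URLs by host and collects the query
# parameter names seen; summarize is a hypothetical helper, not part of gau.

```python
from urllib.parse import parse_qsl, urlsplit


def summarize(urls):
    """Return ({host: sorted unique paths}, sorted unique parameter names)."""
    hosts = {}
    params = set()
    for line in urls:
        url = line.strip()
        if not url:
            continue
        parts = urlsplit(url)
        if not parts.hostname:
            continue  # skip lines that are not absolute URLs
        hosts.setdefault(parts.hostname, set()).add(parts.path or "/")
        # keep_blank_values keeps parameters such as "q=" that have no value.
        params.update(name for name, _ in parse_qsl(parts.query, keep_blank_values=True))
    return {host: sorted(paths) for host, paths in hosts.items()}, sorted(params)


sample = ["https://example.com/a?id=1", "https://api.example.com/v1/users?page=2"]
hosts, params = summarize(sample)
print(hosts)
print(params)
```

# To run it on real output, pass an iterable of lines (for example sys.stdin when piping gau into the script).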