# You can also use it as a CLI tool
https://github.com/tomnomnom/waybackurls
cat domains.txt | waybackurls > urls
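The `urls` file produced above can get large, so it often helps to narrow it down before reviewing it. The following is a minimal sketch, not part of waybackurls itself: the function name `filter_by_extension` and the default extension list are assumptions made for illustration.

```python
from urllib.parse import urlparse


def filter_by_extension(urls, extensions=(".pdf",)):
    """Return unique URLs whose path ends with one of the given extensions.

    `urls` is any iterable of lines, e.g. the open `urls` file.
    Order is preserved and the match is case-insensitive.
    """
    wanted = tuple(ext.lower() for ext in extensions)
    seen = set()
    kept = []
    for raw in urls:
        url = raw.strip()
        if not url or url in seen:
            continue
        # Only look at the path, so query strings such as ?v=1 are ignored
        path = urlparse(url).path.lower()
        if path.endswith(wanted):
            seen.add(url)
            kept.append(url)
    return kept
```

For example, `filter_by_extension(open("urls"), (".pdf", ".docx"))` would keep only the document links from the waybackurls output.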
# Cached pages
http://cachedview.com/
https://www.giftofspeed.com/cache-checker/
Getting PDFs from the Web Archive
# Great resource
https://openfacto.fr/2020/04/19/recuperer-des-fichiers-pdf-en-masse-sur-archive-org/
# Step 1
# Adding '*' at the end of a company URL lists all indexed documents
# You can then filter by "PDF" (search bar on the right)
https://web.archive.org/web/*/https://testcompany.fr/*
# Step 2
# The goal here is to get the list of URLs
# In the Firefox developer tools, open the Network tab
# Look for the HTTP request that returns a JSON file containing the URLs
# Use "Copy as cURL" to download the file
# Step 3
# OpenRefine can help to parse and process the file
# Filter on PDF
# Step 4
# NEVER download directly from the original site
# Go through the archived copy of the document instead
# Add this prefix to every line
https://web.archive.org/web/
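If OpenRefine is not at hand, Steps 3 and 4 can be approximated with a short script. This is a sketch under stated assumptions: it assumes the JSON saved in Step 2 is a list of rows whose first row is a header, and that the columns are named `original` and `mimetype`. Check these names against the file you actually downloaded. The function names are hypothetical.

```python
import json

WAYBACK_PREFIX = "https://web.archive.org/web/"  # prefix from Step 4


def pdf_captures(json_text):
    """Step 3: parse the saved JSON and keep only rows that look like PDFs.

    Assumes a list of rows whose first row names the columns
    (assumed here: "original" and "mimetype").
    """
    rows = json.loads(json_text)
    if not rows:
        return []
    header, *records = rows
    captures = [dict(zip(header, record)) for record in records]
    return [
        c for c in captures
        if "original" in c
        and (
            c.get("mimetype") == "application/pdf"
            or c["original"].lower().endswith(".pdf")
        )
    ]


def prefix_with_archive(captures):
    """Step 4: point every URL at the archive instead of the original site."""
    return [WAYBACK_PREFIX + c["original"] for c in captures]
```

Calling `prefix_with_archive(pdf_captures(open("urls.json").read()))` would then give one archive-prefixed line per PDF, where `urls.json` is a placeholder name for the file saved in Step 2.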
# Step 5
# To get the document, replace the '*' in the URL with the capture timestamp
# If the document was indexed several times, you can download the first or the last capture
# Also add "if_" right after the timestamp
https://web.archive.org/web/20160102030102if_/http://www.xxx.fr/document.pdf
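Step 5 can be sketched the same way. The helper names below are hypothetical, and the code assumes timestamps are 14-digit strings in the `YYYYMMDDhhmmss` form shown in the example URL, which lets them be compared as plain text.

```python
def pick_timestamp(timestamps, which="last"):
    """Choose the first or the last capture of a document.

    Timestamps of equal length (YYYYMMDDhhmmss) sort correctly as strings.
    """
    if not timestamps:
        raise ValueError("no capture timestamps given")
    return min(timestamps) if which == "first" else max(timestamps)


def capture_url(timestamp, original_url):
    """Build the download URL: archive prefix, timestamp, "if_", original URL."""
    return f"https://web.archive.org/web/{timestamp}if_/{original_url}"
```

For instance, `capture_url(pick_timestamp(stamps, "first"), url)` would target the oldest capture of `url`, given a list `stamps` collected from the JSON file.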