Wget Examples

Submitted on March 1, 2017 – 9:25 pm

This is a follow-up to my previous wget notes (1, 2, 3, 4). From time to time I find myself googling wget syntax even though I think I’ve used every option of this excellent utility over the years. Perhaps my memory is not what it used to be, but I’m probably the most frequent visitor to my own Web site… Anyway, here’s the grand list of the more useful wget snippets.

Download tar.gz and uncompress with a single command:

wget -q ${url}/archive.tar.gz -O - | tar xz

Download tar.bz2 and uncompress with a single command:
wget -q ${url}/archive.tar.bz2 -O - | tar xj

Download in the background, limit bandwidth to 200 KB/s, do not ascend to the parent URL, download only newer files, do not create new directories, accept only htm, html, php, and pdf files, and wait 5 seconds between requests:
wget -b --limit-rate=200k -np -N -m -nd --accept=htm,html,php,pdf --wait=5 "${url}"
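
Note that with -b wget detaches from the terminal and, unless you pass -o, logs its progress to wget-log in the current directory (wget-log.1 and so on if that file already exists). To keep an eye on the transfer:
tail -f wget-log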

Download recursively, span multiple hosts, convert links to local, limit recursion level to 4, fake “mozilla” user agent, ignore “robots” directives:
wget -r -H --convert-links --level=4 --user-agent=Mozilla "${url}" -e robots=off

Generate a list of broken links:
wget --spider -o broken_links.log --wait 2 -r -p "${url}" -e robots=off
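
The resulting log is verbose, so you will probably want to pull out just the failures. The exact wording differs a bit between wget versions, but something along these lines should do:
grep -B 2 -e ' 404 ' -e 'broken link' broken_links.log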

Download new PDFs from a list of URLs:
wget -r --level=1 -H --timeout=2 -nd -N -np --accept=pdf -e robots=off -i urls.txt

Save and use authentication cookie:
wget -O ~/.domain_cookie_tmp "https://domain.com/login.cgi?login=${username}&password=${password}"

grep "^cookie" ~/.domain_cookie_tmp | awk -F'=' '{print $2}' > ~/.domain_cookie
wget -c --no-cookies --header="Cookie: enc=`cat ~/.domain_cookie`" -i "${url_file}" -nc
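
If the site returns its cookie in the response headers rather than in the page body, wget's built-in cookie handling is simpler than parsing the cookie out by hand. A rough sketch, assuming the login form accepts a POST with the same login and password fields (adjust to whatever the form actually uses):
wget -q --save-cookies ~/.domain_cookies --keep-session-cookies \
--post-data "login=${username}&password=${password}" \
-O /dev/null "https://domain.com/login.cgi"
wget -c --load-cookies ~/.domain_cookies -i "${url_file}"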

Use wget with anonymous proxy:
export http_proxy=proxy_server:port
wget -Y -O /tmp/yahoo.htm "http://www.yahoo.com"

Use wget with authorized proxy:
export http_proxy=proxy_server:port
wget -Y --proxy-user=${username} --proxy-password=${password} \
-O /tmp/yahoo.htm "http://www.yahoo.com"
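
If you go through the same proxy all the time, you can put the settings into ~/.wgetrc instead of exporting variables and passing credentials on every run. Something like this should work (keep the file readable only by you, since the password is stored in plain text):
use_proxy = on
http_proxy = http://proxy_server:port/
https_proxy = http://proxy_server:port/
proxy_user = username
proxy_password = password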

Make a local mirror of a Web site, including FTP links; limit rate to 50 KB/s; wait 5 seconds between requests, with random variation; ignore the robots directive:
wget -U Mozilla -m -k -D ${domain} --follow-ftp \
--limit-rate=50k --wait=5 --random-wait -np "${url}" -e robots=off

Download images from a Web site:
wget -r -l 0 -U Mozilla -t 1 -nd -D ${domain} \
-A jpg,jpeg,gif,png "${url}" -e robots=off

Extract a list of HTTP(S) and FTP(S) links from a single URL:
wget -qO- "${url}" | grep -oE "(https?|ftps?)://[^\<\>\"\' ]+" | sort -u
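
To take it one step further and check which of those links are actually alive, feed the same list back into wget in spider mode; a non-zero exit code means the link did not check out:
wget -qO- "${url}" | grep -oE "(https?|ftps?)://[^\<\>\"\' ]+" | sort -u | \
while read -r link; do
wget -q --spider "${link}" || echo "BROKEN: ${link}"
done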

Mirror a subfolder of a site:
wget -mk -w 20 -np "${url}"

Update only changed files:
wget -mk -w 20 -N "${url}"

Mirror site with random delay between requests:
wget -w 20 --random-wait -mk "${url}"

Download a list of URLs from a file:
wget -i "${url_file}"

Resume interrupted file download:
wget -c "${file_url}"
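
On a flaky connection you can be more persistent about it: retry indefinitely, pause between attempts, and keep trying even if the connection is refused outright:
wget -c -t 0 --waitretry=10 --retry-connrefused "${file_url}"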

Download files in the background:
wget -b "${url}"

Download the first two levels of pages from a site:
wget -r -l2 "${url}"

Make a static copy of a dynamic Web site two levels deep:
wget -P /var/www/html/ -mpck -l2 --user-agent="Mozilla" -e robots=off -E "${url}"

 
