GNU wget is an unavoidably common command-line download tool. With the -r flag, wget will basically spider its way around the entire internet and download it to your computer. Moreover, unless told otherwise, it recreates the directory structure it finds at the source, so you stand little chance of finding anything on your computer afterwards. If you are like me and want to download things from a specific location on the source server into a specific folder at your destination, just the files, not the folders, then here is an example of how to tame wget’s quirky behaviour.
wget -r -l1 -np -nd --accept=pdf -c -nc "$URL"
The example does not specify a destination, so it has to be run from inside your destination folder. Before the example command, run pwd to see where you are and, if needed, cd $DESTINATION. Only then run the example.
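If you would rather not cd by hand every time, the two steps can be wrapped in a small shell function. This is only a sketch under my own assumptions: fetch_flat and the DRY_RUN switch are names I made up for illustration, and example.com stands in for a real server.

```shell
# Hypothetical helper: fetch_flat URL DESTINATION
# Changes into DESTINATION (creating it if needed) and runs the example
# command there. With DRY_RUN=1 it prints the command instead of running it.
fetch_flat() {
    url=$1
    dest=$2
    if [ -z "$url" ] || [ -z "$dest" ]; then
        echo "usage: fetch_flat URL DESTINATION" >&2
        return 2
    fi
    mkdir -p "$dest" || return 1
    (
        # The subshell keeps the cd from leaking into the calling shell.
        cd "$dest" || exit 1
        set -- -r -l1 -np -nd --accept=pdf -c -nc "$url"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "wget $*"
        else
            wget "$@"
        fi
    )
}

# Example call (placeholder URL and folder):
# fetch_flat "https://example.com/docs/" "$HOME/papers"
```

Because the cd happens inside a subshell, you end up back where you started once the download finishes.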
- -r means recursive, i.e. wget follows the links it finds and downloads those too; without this flag, wget only fetches the URLs you name explicitly
- -l1 limits the recursion depth to 1, i.e. follow links only one step away from the starting location; this can be cautiously expanded to -l2 or -l3 if needed
- -np means the same as --no-parent, i.e. only proceed downwards to subdirectories at the location, not upwards to the parent directories
- -nd means the same as --no-directories, i.e. do not re-create the directory structure at the destination
- --accept=pdf downloads only files whose names end in “pdf” and ignores the rest; if needed, replace “pdf” with another suffix, a comma-separated list of suffixes, or a wildcard pattern, or omit the entire --accept flag
- -c means continue, i.e. resume a partially downloaded file instead of starting all over from the beginning in case the download was disconnected last time
- -nc means that, if a file with the same name already exists at the destination, wget keeps it and does not save another copy under a name like File.1
- $URL can point to a directory, not necessarily to a specific file; the preceding flags in the example ensure that the contents of that directory are downloaded without creating any subdirectories at the destination
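In a script, the short flags are easy to misread, so it can help to spell the same command with wget’s long option names. The sketch below only prints the command rather than running it; long_form is a made-up helper name and the URL is a placeholder.

```shell
# Hypothetical helper: prints the example command using long option names.
# -r  = --recursive        -l1 = --level=1
# -np = --no-parent        -nd = --no-directories
# -c  = --continue         -nc = --no-clobber
long_form() {
    printf '%s ' wget \
        --recursive --level=1 --no-parent --no-directories \
        --accept=pdf --continue --no-clobber
    printf '%s\n' "$1"
}

long_form "https://example.com/docs/"
```

Once the printed line looks right, it can be pasted into the terminal from inside the destination folder.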