2.1 URL Format
==============
“URL” is an acronym for Uniform Resource Locator. A uniform resource
locator is a compact string representation for a resource available via
the Internet. Wget recognizes the URL syntax as per RFC1738. These are
the most widely used forms (square brackets denote optional parts):
http://host[:port]/directory/file
ftp://host[:port]/directory/file
You can also encode your username and password within a URL:
ftp://user:password@host/path
http://user:password@host/path
Either USER or PASSWORD, or both, may be left out. If you leave out
either the HTTP username or password, no authentication will be sent.
If you leave out the FTP username, ‘anonymous’ will be used. If you
leave out the FTP password, your email address will be supplied as a
default password.(1)
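The defaulting rules above can be sketched in a few lines of Python.
This is only an illustration built on the standard library's
'urllib.parse', not Wget's own parser; the function name
'describe_credentials' is hypothetical, and the sketch deliberately
leaves the FTP password default (your email address) and the '.netrc'
lookup unmodelled.

```python
from urllib.parse import urlsplit


def describe_credentials(url):
    """Return (scheme, username, password) for a URL.

    Assumption for illustration: only the FTP username default
    ('anonymous') described in the text is applied here.
    """
    parts = urlsplit(url)
    user, password = parts.username, parts.password
    # An FTP URL without a username falls back to 'anonymous'.
    if parts.scheme == "ftp" and user is None:
        user = "anonymous"
    # For HTTP, a missing username or password stays None, which
    # corresponds to "no authentication will be sent".
    return parts.scheme, user, password
```

For example, 'describe_credentials("ftp://host/path")' reports the
username 'anonymous' and no password, while an HTTP URL without
credentials reports neither.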
*Important Note*: if you specify a password-containing URL on the
command line, the username and password will be plainly visible to all
users on the system, by way of ‘ps’. On multi-user systems, this is a
big security risk. To work around it, use ‘wget -i -’ and feed the URLs
to Wget’s standard input, each on a separate line, terminated by ‘C-d’.
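When Wget is driven from a script, the same workaround might look like
the following sketch. It assumes a 'wget' executable is on the PATH;
the helper names 'build_invocation' and 'fetch_via_stdin' are
hypothetical. The URLs travel through a pipe rather than the argument
list, and closing the pipe plays the role of 'C-d'.

```python
import subprocess


def build_invocation(urls):
    """Return (argv, stdin_text) for a 'wget -i -' call.

    The argument list never contains a URL, so credentials embedded
    in the URLs do not show up in the process arguments.
    """
    argv = ["wget", "-i", "-"]
    # One URL per line, as Wget expects on standard input.
    stdin_text = "".join(url + "\n" for url in urls)
    return argv, stdin_text


def fetch_via_stdin(urls):
    """Run Wget, feeding the URLs on standard input (assumes wget is installed)."""
    argv, stdin_text = build_invocation(urls)
    # subprocess.run closes the pipe after writing, which signals end of input.
    return subprocess.run(argv, input=stdin_text, text=True).returncode
```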
You can encode unsafe characters in a URL as ‘%xy’, ‘xy’ being the
hexadecimal representation of the character’s ASCII value. Some common
unsafe characters include ‘%’ (quoted as ‘%25’), ‘:’ (quoted as ‘%3A’),
and ‘@’ (quoted as ‘%40’). Refer to RFC1738 for a comprehensive list of
unsafe characters.
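As a small illustration of this quoting, Python's standard
'urllib.parse.quote' produces the same '%xy' escapes. The helper
'encode_userinfo' below is a hypothetical name, and the sketch assumes
you want every reserved character in the username and password
escaped, hence 'safe=""'.

```python
from urllib.parse import quote


def encode_userinfo(user, password):
    """Percent-encode a username and password for use in a URL.

    safe="" asks quote() to escape every reserved character,
    including '/', ':' and '@'.
    """
    return quote(user, safe="") + ":" + quote(password, safe="")
```

With this helper, a password such as 'p:ss%' becomes 'p%3Ass%25', so
the ':' and '@' separators in the URL remain unambiguous.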
Wget also supports the ‘type’ feature for FTP URLs. By default, FTP
documents are retrieved in the binary mode (type ‘i’), which means that
they are downloaded unchanged. Another useful mode is the ‘a’ (“ASCII”)
mode, which converts the line delimiters between the different operating
systems, and is thus useful for text files. Here is an example:
ftp://host/directory/file;type=a
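If you generate FTP URLs programmatically, the suffix can be attached
with a helper along these lines. This is a minimal sketch; the name
'with_ftp_type' is hypothetical, and it accepts only the two modes
discussed above ('i' and 'a').

```python
def with_ftp_type(url, mode="i"):
    """Return the FTP URL with a ';type=' suffix for the given mode."""
    if not url.startswith("ftp://"):
        raise ValueError("the ;type= suffix only applies to FTP URLs")
    if mode not in ("a", "i"):
        raise ValueError("this sketch handles only modes 'a' and 'i'")
    # Drop any existing ;type= suffix before adding the requested one.
    base = url.split(";type=", 1)[0]
    return f"{base};type={mode}"
```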
Two alternative variants of URL specification are also supported,
for historical (hysterical?) reasons and because of their widespread use.
FTP-only syntax (supported by ‘NcFTP’):
host:/dir/file
HTTP-only syntax (introduced by ‘Netscape’):
host[:port]/dir/file
These two alternative forms are deprecated, and may cease being
supported in the future.
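Since these forms are deprecated, a script that still receives them
may prefer to expand them into full URLs before calling Wget. The
following is one possible sketch, not Wget's actual parsing logic; the
name 'expand_legacy' and the regular expression are assumptions made
for illustration.

```python
import re

# FTP-only shorthand: a host, a colon, then an absolute path.
_FTP_SHORTHAND = re.compile(r"^([^/:]+):(/.*)$")


def expand_legacy(spec):
    """Expand the two legacy notations into full URLs."""
    if "://" in spec:
        return spec  # already a full URL, leave it alone
    match = _FTP_SHORTHAND.match(spec)
    if match:
        host, path = match.groups()
        return f"ftp://{host}{path}"
    # Anything else is treated as the HTTP-only host[:port]/dir/file form.
    return f"http://{spec}"
```

The colon test is what separates the two notations: in 'host:/dir/file'
the colon is followed directly by a slash, whereas in
'host:8080/dir/file' it introduces a port number.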
If you do not understand the difference between these notations, or
do not know which one to use, just use the plain ordinary format you use
with your favorite browser, like ‘Lynx’ or ‘Netscape’.
---------- Footnotes ----------
(1) If you have a ‘.netrc’ file in your home directory, the password
will also be looked up there.