curl and its associated library libcurl offer many powerful features and support HTTP, HTTPS, FTP, FTPS, GOPHER, TFTP, SCP, SFTP, SMB, TELNET, DICT, LDAP, LDAPS, FILE, IMAP, SMTP, POP3, RTSP and RTMP. A good place to start understanding curl is cURL - Wikipedia, but the official site is curl. It is worth noting that curl is open source and available from GitHub at curl/curl: A command line tool and library for transferring data with URL syntax. I recommend looking at curl - Tool Documentation for the documentation, FAQ and a tutorial.
curl is a very handy little command line utility for sending data to, or getting data from, a URL using a variety of protocols. For example, curl http://www.example.com/ gets the default HTML for the example URL. If you redirect this to a file you can capture the output, as follows: curl http://www.example.com/ > output.txt. There are other neat tricks too, such as curl -I http://www.example.com/, which displays the response headers. The curl command is also ideal for use in scripts, as it sets some handy exit codes.
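Those exit codes let a script branch on success or failure. Here is a minimal sketch; a deliberately bogus protocol scheme is used so it runs without network access (with a real URL you would also add --fail so HTTP error responses produce a non-zero exit code):

```shell
# curl sets a distinct exit code per failure type; 0 means success.
# "notaproto://" is deliberately bogus so this sketch runs offline.
curl --silent "notaproto://example"
code=$?
if [ "$code" -eq 0 ]; then
    echo "transfer succeeded"
else
    echo "curl failed with exit code $code"   # 1 = unsupported protocol
fi
```

The full list of exit codes is in the curl man page under EXIT CODES.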
If all you want is to download files, then wcurl is an option well worth looking into; it is part of the curl project but provides a simpler command line. On macOS you can install it with brew install wcurl.
Handy curl examples are as follows:
curl -O "https://example.com/files/binaryfile.dat"
- this downloads a file using the filename in the URL. You can use --output filename.dat (or "-o" if you prefer) to save the file under a different name.
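As a runnable sketch of that renaming behaviour, the following uses a file:// URL purely so it works offline; the flags are exactly the same for http(s) URLs:

```shell
# Create a local file to stand in for a remote one
printf 'demo payload\n' > /tmp/binaryfile.dat
# --output saves under a name of your choosing, regardless of the filename in the URL
curl --silent --output /tmp/latest.dat "file:///tmp/binaryfile.dat"
cat /tmp/latest.dat
```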
Sometimes a URL redirects to a file; in this case the "-L" option is useful, for example:
curl -L "https://example.com/files/latestdata" --output latestdata.txt
You can also use curl for FTP; some basic examples follow:
Directory Listing: curl --ssl --ftp-pasv --user username:password ftp://ftpserver.example.com
Download File: curl --ssl --ftp-pasv --user username:password ftp://ftpserver.example.com/remote_file.txt -o ./local_file.txt
Upload File: curl --ssl --ftp-pasv --user username:password ftp://ftpserver.example.com/ -T ./file_to_upload.txt
These examples all use FTPS Explicit in Passive mode. Specifying ftps in the url will switch to FTPS Implicit which you probably don't want. Leaving the password off means curl will prompt for it.
If you wish to use a proxy server, then something like this would work:
curl --proxy hostname:8080 http://www.example.com
This will get the HTML of the home page of www.example.com via a proxy server running on port 8080 on a server called hostname.
The very useful and common Linux utility curl is available on Windows, where it works equally well. You can download it from curl - Download; I have used the Windows 64-bit build from Viktor Szakáts, but feel free to choose another option or build it yourself. Note that Viktor's build setup is on GitHub at vszakats/curl-for-win: Reproducible curl/libcurl (and OpenSSL) binaries for Windows
The best way to install curl on a Mac is to use brew; however, once this is done you may find that the default curl is still being used, so to switch to the brew-installed curl just execute the following:
ln -s /usr/local/opt/curl/bin/curl /usr/local/bin/curl
This works because the "/usr/local/bin" directory comes before the directory containing the default curl on your PATH. To undo this simply execute the following:
rm /usr/local/bin/curl
After which you will be back to the default curl.
A good tip is to use --verbose
to see what is going on and possibly get some feedback.
The HTTP cheat sheet has some good general advice.
In general it is good to first try the --verbose
option, as this will show you what is happening and is helpful for seeing what is sent and received. However, sometimes you need the next level of detail about what is going on, including how all the curl options are processed; this is where Trace options - Everything curl comes into its own.
In short "For POST, HEAD, PUT, GET: never use -X", see Unnecessary use of curl -X | daniel.haxx.se for an explanation.
Firstly you will need to surround your URLs with double quotes, for example "http://www.example.com". However, if you have a complex URL for an API call, or just something unusual, then you will need to "URL encode" your text: a space character becomes "+" or "%20" and a single quote becomes "%27"; see HTML URL Encoding Reference for a more complete table. I have had issues with the $ (dollar) symbol, but using %24 fixed this. Note that if you have trouble using things like %27 then it is also worth trying to surround your URL with single quotes.
If you get an HTTP response code of "401 Unauthorized" then you need to add some credentials; there are two ways to do this:
curl --user username:password "http://www.example.com/protected"
curl --header "Authorization: Basic dXNlcm5hbWU6cGFzc3dvcmQ=" "http://www.example.com/protected"
That random looking string is username:password
encoded in Base64. Note that Base64 is an encoding, not encryption, so one should never rely on it for security. curl also has options for passing credentials to a proxy server (--proxy-user) and, I believe, for using the logged-on user's credentials when running on Windows.
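The Base64 token can be generated on the command line with the standard base64 utility; printf is used rather than echo so that no trailing newline sneaks into the encoding:

```shell
# Encode "username:password" for use in a Basic Authorization header
printf '%s' 'username:password' | base64
# → dXNlcm5hbWU6cGFzc3dvcmQ=
```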
There are more types of HTTP header than two, but I will focus on two: those that control the request and those that control the response. There are subtleties to this depending on whether you are working with an HTTP PUT/POST or the default HTTP GET. When working with HTTP GET you may want to request a specific type of response from the server, perhaps JSON as opposed to XML. In this case browse to HTTP headers - HTTP | MDN and scroll down to the "Content Negotiation" section, where you will see a header called "Accept", which can be used as follows:
--header "Accept: application/json"
which basically asks the server for a JSON response
--header "Content-Type: application/json"
confirms the type of the payload being sent for an HTTP PUT/POST
You can get the response headers by using --verbose,
however they are written to stderr rather than stdout. An easier option is to use --dump-header
to save the response headers to a file; this has the added advantage of not prefixing each one with a "<" symbol.
curl --dump-header headers.txt https://www.geoffdoesstuff.com
This will return the webpage to the console and write the response headers to "headers.txt". There is also another option, --include,
which includes the response headers in the output, written to stdout; this is possibly the cleanest solution.
When processing the response headers I noticed that they have Windows line endings (CRLF, or Carriage Return, Line Feed), which is a pain on Linux, so piping through tr -d '\015'
fixes this. You can use "^M" instead of "\015", but the latter is easier to type! If you want to see the issue on Linux then cat -v headers.txt
will show the carriage returns as "^M". HTTP inherits this CRLF-terminated header format from RFC 822: STANDARD FOR THE FORMAT OF ARPA INTERNET TEXT MESSAGES, the specification for Internet text messages.
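The stripping can be sketched without a network call, with printf standing in for curl's saved header output:

```shell
# Two CRLF-terminated header lines, as curl would save them;
# cat -v makes the carriage returns visible as ^M
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n' | cat -v
# tr -d '\015' removes the carriage returns, leaving plain Unix line endings
printf 'HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n' | tr -d '\015' | cat -v
```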
Having used curl's cookie handling, I was impressed at how simple and easy it is. There are two switches to work with: --cookie-jar "cookies.txt"
which will write cookie data to the file and --cookie "cookies.txt"
which will read the file and send them to the server. You can use both switches, which means that cookies are sent with the request and saved in the response. Cookie data is sent to the server in HTTP headers called "Cookie" and returned in the response in "Set-Cookie".
This is all explained in curl - HTTP Cookies if you want the official documentation.
You can use --output
to write the output to a file of your choosing; whereas --remote-name
keeps the filename from the URL, --output in effect lets you rename the file on download. However, sometimes you want to throw the output away, in which case the following work:
--output /dev/null
for Linux/Mac
--output nul
for Windows
So, the obvious and easy way to get extra output is to use the verbose option, however there are other possibilities:
--verbose
this shows everything that is going on, so what the hostname resolves to, TLS information, request and response HTTP headers and more
--include
a simpler option to display just the response headers
First off, put the following into a file called "curl_output.txt":
\n
size_download: %{size_download} (bytes)\n
speed_download: %{speed_download} (bytes per second)\n
\n
time_namelookup: %{time_namelookup}\n
time_connect: %{time_connect}\n
time_appconnect: %{time_appconnect}\n
time_pretransfer: %{time_pretransfer}\n
time_redirect: %{time_redirect}\n
time_starttransfer: %{time_starttransfer}\n
--------\n
time_total: %{time_total} (seconds)\n

Then change into the directory with that file and execute the following:
curl --write-out "@curl_output.txt" --silent --output /dev/null "https://www.geoffdoesstuff.com"
size_download: 9941 (bytes)
speed_download: 20057.000 (bytes per second)

time_namelookup: 0.005119
time_connect: 0.006878
time_appconnect: 0.190811
time_pretransfer: 0.190856
time_redirect: 0.000000
time_starttransfer: 0.487045
--------
time_total: 0.495617 (seconds)

There are several more variables that you can use; I have included all the time-related ones as of curl 7.54. The full list of available variables is at curl - How To Use.
If the output you require is shorter then you can put it directly on the curl command line; to get the HTTP response status code, size and time, do the following:
curl --write-out "HTTP: %{http_code} Size: %{size_download} Time: %{time_total}" --silent --output /dev/null "https://www.geoffdoesstuff.com"
Above I have used the longer command line switch options, as generally these are clearer and easier; however, at times you just want something quick and simple, so here is a conversion table:
Full Switch | Short | Description |
---|---|---|
--cookie | -b | The file (or data) of the cookies to send |
--cookie-jar | -c | The filename where any received cookies are stored |
--data | -d | Data to be sent with a POST; use @filename.ext to read the data from a file instead of inline text |
--dump-header | -D | Needs to be followed by a file name to save the HTTP response headers in |
--head | -I | This will return headers only, no response payload. Note it is a capital letter i. |
--header | -H | Used to specify an HTTP header in the request |
--include | -i | Include the HTTP response headers in the output |
--insecure | -k | Ideally this should not be used, however it is handy when self-signed certs are being used |
--location | -L | Follow HTTP Response Codes 3xx to a new location |
--output | -o | Allows specification of filename to write output to |
--request | -X | Specify a custom request method, the default is GET unless you use --data when the default changes to POST |
--silent | -s | Hide the progress meter |
--user | -u | Pass a username/password for server authentication |
--verbose | -v | Verbose Output for diagnosis |
There are times when you need to stitch a number of options together, so these are discussed and outlined here.
The first task is to get the WSDL, so do something like this:
curl --silent "https://www.example.com/soapService?wsdl"
From this you need to find the "soapAction" and the "soap:address", where the latter is the url to call and probably matches the wsdl location. Now you need to put the request XML into a file, say request.xml and then you can use curl again to get the response, as follows:
curl --request POST --header "Content-Type: text/xml; charset=utf-8" --header "SOAPAction:thesoapaction" --data @request.xml "https://www.example.com/soapService"
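The request.xml referred to above is a normal SOAP envelope; a minimal skeleton looks like the following (this assumes SOAP 1.1, which matches the text/xml Content-Type and SOAPAction header used above, and the body contents depend entirely on the WSDL):

```xml
<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <!-- the operation element and its parameters, as defined in the WSDL, go here -->
  </soap:Body>
</soap:Envelope>
```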