Get timestamp-of-file from web-server

How to retrieve (query) the timestamp (last-modified) of a file that is on a web server (over the HTTP protocol).

wget

This is actually wget's default behavior; the -N switch is not really necessary in most cases. That switch can make a difference in how wget behaves when a local copy of the file (bearing the same name and size in bytes) already exists from a previous download:
-- without -N, wget will download a new copy (and, to avoid a naming collision with the existing copy in local storage, append ".1" to the end of its filename);
-- with -N, wget compares the local copy with the corresponding copy on the server and, if the server's copy is not newer, reports that and does not download the file again.

Once the download is complete, and provided the web server's response headers include a timestamp for the file, wget gives the server's timestamp to the just-downloaded local copy on your computer. Without this, the local copy would carry the time at which the download completed, which is what most web user agents (usually GUI web browsers) assign. wget applies the remote timestamp only to the mtime, not to the atime or ctime.
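To check which timestamps a downloader actually changed, the three times of the local copy can be printed side by side. The following is only a sketch: the function name show_times is made up for illustration, and it assumes GNU stat (the -c option is not available in BSD or macOS stat).

```shell
# Print the access, modification and status-change times of a file
# as seconds since the Unix epoch.
# Assumption: GNU stat, which provides the -c FORMAT option.
show_times() {
  stat -c 'atime=%X mtime=%Y ctime=%Z' -- "$1"
}

# Usage after a download (the filename is a placeholder):
# show_times downloaded-file.pdf
```

If the mtime is older than the moment the download finished, the server's timestamp was applied.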


curl

Unfortunately, applying the mtime (last-modified) timestamp of a file on a web server to a local copy that is downloaded from that server is NOT the default behavior of curl.

curl can, however, do this when it is invoked with the -R switch.

curl -OR <remote web path>

This will download a copy of the file to local storage (the present working directory) and, if possible (if the server provides it), set the local copy's timestamps to that of the file on the server (as reported in the HTTP headers of the server's response to the client's request). See the technical note below on which timestamps are affected.
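If a file was already downloaded without -R, its mtime can be set by hand from a known Last-Modified value. This is a rough sketch, not what curl does internally: the function name apply_remote_mtime is hypothetical, and it assumes GNU touch, whose -d option accepts HTTP-style dates.

```shell
# Set a local file's mtime from an HTTP Last-Modified value.
# Assumption: GNU touch, whose -d option parses dates such as
# "Wed, 21 Oct 2015 07:28:00 GMT".
# -m limits the change to the mtime; the atime is left alone.
apply_remote_mtime() {
  file=$1
  http_date=$2
  touch -m -d "$http_date" -- "$file"
}

# Usage (filename and date are placeholders):
# apply_remote_mtime downloaded-file.pdf "Wed, 21 Oct 2015 07:28:00 GMT"
```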

Add the -v switch to actually see the HTTP headers. There is another switch, -I, which is roughly equivalent to wget's --spider switch: it just queries the server and shows the HTTP headers of its response, but does not actually transfer the file. (-D, by contrast, saves the headers to a file while still downloading.)
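To query only the timestamp, the headers of a HEAD request (curl's -I switch) can be filtered for the Last-Modified line. The helper below is a minimal sketch; the name last_modified is invented for illustration, and it simply reads raw response headers on standard input.

```shell
# Print the value of the first Last-Modified header found in raw
# HTTP response headers read from standard input.
# tr strips the carriage returns that end HTTP header lines.
last_modified() {
  tr -d '\r' | awk 'tolower($0) ~ /^last-modified:/ {
    sub(/^[^:]*:[ \t]*/, "")   # drop the header name and separator
    print
    exit
  }'
}

# Usage (needs network access; the URL is a placeholder):
# curl -sI "<remote web path>" | last_modified
```

It prints nothing when the server does not send a Last-Modified header.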

Extra technical note: in contrast with the behavior described above for wget, curl applies the remote timestamp (the mtime on the remote server) identically to both the atime and the mtime of the local copy. (Why? Fleetwoodta (talk) 21:01, June 24, 2014 (UTC))

extra note

Compared with other HTTP user agents (download apps), including regular web browsers like Internet Explorer, Firefox and Chrome as well as wget and aria (see below), curl is more like a manual transmission. By default, it writes whatever it retrieves from the server to standard output, unless the "-o" switch is included in its command line invocation.

Likewise, the "-L" switch is necessary to complete HTTP redirects (301 and 302); by default, curl will NOT follow them. This can be of use to web developers and system administrators for identifying problems. It is not particularly desirable behavior for most end-users, though, which is why the other programs follow redirects seamlessly (automatically) by default. Graphical web browsers also do not add the redirect as a separate entry (location) in the browser history.

collisions

Speaking of curl's "manual transmission"-like character, beware of naming collisions, such as having a file with the same name already in the present working directory ("pwd") in your local storage. In other words, if the file that curl downloads from the remote web server would be written to local storage under a name identical to that of an existing file in the same directory, curl will overwrite it, without asking the end-user for confirmation or even issuing a warning. (For comparison, wget's "-N" switch may affect how it handles such collisions.)
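One way to guard against such overwrites is to pick an unused name before calling curl, in the spirit of the ".1" suffix that wget appends. The function below is only a sketch under that assumption; the name free_name is made up, and because the check and the download are separate steps, another process could still create the file in between.

```shell
# Print a filename that does not exist yet in the current directory,
# appending .1, .2, ... to the requested name as needed.
free_name() {
  name=$1
  n=1
  candidate=$name
  while [ -e "$candidate" ]; do
    candidate="$name.$n"
    n=$((n + 1))
  done
  printf '%s\n' "$candidate"
}

# Usage (needs network access; filename and URL are placeholders):
# curl -R -o "$(free_name downloaded-file.pdf)" "<remote web path>"
```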

aria

aria2c uses the same switch name as curl for this: "-R".

This switch has been available in aria2c since a release around September 2012 (the exact date is uncertain).

man aria2c

aria2c applies the timestamp only to the mtime of the local copy.

download

how to obtain: from SourceForge.net

Also available in standard (?) application software package repositories (repos) of most GNU/Linux distros.

latest available: 1.18.5

man

Documentation / help / reference:

the official README document on the developer's SourceForge.net page (called Aria2 for Windows?)

other notes

Observations about the behavior of aria(2c):

aria2c will rename the file (one Windows build of wget does this as well): even if the filename in the request to the web server contains weird characters, it saves the file under a more human-readable name.

Take, for example: http://s3.amazonaws.com/kajabi-media/attachments/9606/your-living-kitchen.pdf?AWSAccessKeyId=AKIAJQJ7TPUNH4FUNP6A&Expires=1411779744&Signature=cVgX0UYPgljcwedBdx1dhOqwMMQ%3D


09/26 17:06:03 [NOTICE] Allocating disk space. Use --file-allocation=none to disable it. See --file-allocation option in man page for more details.
[#349305 4.8MiB/5.6MiB(86%) CN:1 DL:1.8MiB]
09/26 17:06:06 [NOTICE] Download complete: c:/hp/School of Juicing _ Jay Kordich/Book Library/Your Living Kitchen/your-living-kitchen.pdf

Download Results:
gid   |stat|avg speed  |path/URI
======+====+===========+=======================================================
349305|OK  |   1.8MiB/s|c:/hp/School of Juicing _ Jay Kordich/Book Library/Your Living Kitchen/your-living-kitchen.pdf

Status Legend:
(OK):download completed.


redirects

Also, aria2c automatically follows HTTP redirects (301s and 302s). This contrasts with curl, which requires the -L switch to do so.

collisions

aria2c's default behavior for avoiding filename collisions in local storage is to simply append ".1" (and so forth, for additional would-be collisions) to the end of the name of the file that it writes to local storage.

o switch

"-o" allows you to manually specify the name that the downloaded file will have when it is written to local storage. This is the same syntax as curl's. (I wonder, is the choice of letter for this switch borrowed from curl?)

example:

aria2c -o "results-log-2014sep25.txt" -R "<remote web path>"

Firefox

DownThemAll

There is a wonderful browser extension for Mozilla Firefox called "DownThemAll!" It can be thought of as an in-browser download manager. DTA is easier to figure out (configure) than FlashGot, which relies on other programs like the ones mentioned above to do the downloading and merely triggers them.

http://www.DownThemAll.net/main/install-it/downthemall-2-0-17/

official page on Mozilla's extensions repository

Yes: in its default configuration (as installed within Firefox), DownThemAll preserves (inherits) the last-modified timestamp (mtime) whenever possible, that is, whenever the server offers it in the HTTP response headers. In fact, there is no need for the now non-working PDMT (see directly below) ...

Firefox old

A (now-stale) browser extension for Mozilla Firefox called Preserve Download Modification Timestamp (PDMT) was made by Bluefang / Sparky Bluefang.

Its last stable release is 2011.03.21.22. A newer beta (in the development channel) stands at 2013.05.11.19b.

Unfortunately, this extension no longer seems to work at all (with newer Firefoxes? 24 and onward? Fleetwoodta (talk) 21:01, June 24, 2014 (UTC))
