# ddgrab README

(c) Chris Jewell <c.jewell@lancaster.ac.uk> 2020

## Introduction

`ddgrab`, or "Dated Data Grab", provides a short script that connects to an SFTP server and copies the most recent ISO 8601 date-named folder to a local machine. For example, if the remote server has the directory structure

```
/-
 |--some_folder
  |--2020-02-01
  |--2020-10-02
```
then the script will download the folder named `2020-10-02` given a
`remote_folder` setting of `some_folder`. 
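
The selection step can be sketched as below. This is a minimal illustration, not `ddgrab`'s actual implementation: the function name `latest_dated_folder` is hypothetical, and it assumes that only names in the full `YYYY-MM-DD` form count as dated folders.

```python
import re
from datetime import date
from typing import Iterable, Optional

# Assumption: dated folders are named exactly YYYY-MM-DD.
_ISO_DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def latest_dated_folder(names: Iterable[str]) -> Optional[str]:
    """Return the name carrying the latest valid calendar date, or None."""
    dated = []
    for name in names:
        match = _ISO_DATE.fullmatch(name)
        if match is None:
            continue  # not date-shaped, e.g. "notes"
        year, month, day = (int(part) for part in match.groups())
        try:
            dated.append((date(year, month, day), name))
        except ValueError:
            continue  # date-shaped but invalid, e.g. "2020-13-40"
    return max(dated)[1] if dated else None
```

Parsing each name into a `date` rather than sorting the strings directly means impossible dates are skipped instead of being treated as the newest folder.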

It is designed to be run periodically, and uses filename/filesize combinations
to determine whether any new or modified files are available.  If they are, then
the whole folder is re-downloaded and replaces the copy on the local system.
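
A filename/filesize check of this kind might look like the following sketch. The helper names `local_manifest` and `needs_download` are hypothetical and are not part of `ddgrab`; the remote listing is assumed to be already available as a mapping from relative file path to size in bytes.

```python
from pathlib import Path
from typing import Dict


def local_manifest(folder: Path) -> Dict[str, int]:
    """Map each file's path (relative to `folder`) to its size in bytes."""
    return {
        path.relative_to(folder).as_posix(): path.stat().st_size
        for path in folder.rglob("*")
        if path.is_file()
    }


def needs_download(remote: Dict[str, int], local: Dict[str, int]) -> bool:
    """True if any remote file is missing locally or differs in size."""
    return any(local.get(name) != size for name, size in remote.items())
```

Comparing sizes is cheap because it needs only a directory listing, but it cannot detect a file whose contents change while its size stays the same.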

## Installation

Installation is via `pip`, for example:

```
pip install <path to ddgrab>
```

Note that this project uses the PEP 517 build standard
and supports `pip>=10.0.0` only.

## Usage
 
There are two ways to use `ddgrab`: from within a Python script, and from the command line.

### Within Python

To poll a server `ftp.somewhere.com` for a remote folder `some_folder`, and
download it to a local folder `/home/user/data/myprefix_2020-10-02`, use

```
from phecovid import get_data

server = "ftp.somewhere.com"
remote_folder = "some_folder"
secrets = {"username": "foo",
           "password": "bar"}
localdir = "/home/user/data"
prefix = "myprefix_"

get_data(server, remote_folder, secrets, localdir, prefix)
```

If called repeatedly with the same local directory and prefix,
the data will only be downloaded if something in the remote directory
changes.


### From the command line

`ddgrab` installs a script which can be used to run it from the command
line.  For command line use, place your secrets (username/password) in
a YAML file like so:

```
username: fred
password: pa$$w0rd
```

Assuming we've saved our secrets in a file `secrets.yaml`, use the command
line utility like so:

```
$ ddgrab --server sftp.domain.com --remote_folder foo \
   --secrets secrets.yaml --localdir downloads --dir-prefix data_
```