This is a short explanation of how to set up a Postgres database using Docker in such a way that the data still exists even if you kill Docker and remove the Docker volumes. This is called persistent data, and that’s what you want for a database. If you want to know what Docker is, check this link out.

Tested Configuration:
macOS: Sierra 10.12.6
Docker: 18.09.2 CE
Docker image version of postgres: 9.6
Docker-compose: 1.23.2

1. Requirements

Make sure you have docker and docker-compose installed on your machine. To check, type

docker --version

and make sure you have something like Docker version 18.09.2-ce, build 6247962

Then type

docker-compose --version

and make sure you have something like docker-compose version 1.23.2, build 1110ad01
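Both checks can also be done in one go. This is a minimal sketch; the helper name check_tool is made up for this example and is not part of Docker:

```shell
#!/bin/sh
# Report whether a command is available on the PATH.
# Returns 0 when it is found and 1 when it is missing.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: found"
    else
        echo "$1: missing"
        return 1
    fi
}

# Check the two tools this guide relies on.
for tool in docker docker-compose; do
    check_tool "$tool" || echo "Install $tool before continuing."
done
```

This only tells you that the commands exist, not which version you have, so the two --version commands above are still useful.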

2. Using the Postgres Image

First, create a new folder by typing:

mkdir docker_folder
cd docker_folder

Create a file called docker-compose.yml and edit it:

nano docker-compose.yml

Inside, you can write this:

version: "3"

services:
  db:
    image: postgres:9.6
    environment:
      PGDATA: /var/lib/postgresql/data/test_pgdata
      POSTGRES_USER: yourusername
      POSTGRES_PASSWORD: yourpassword
      POSTGRES_DB: test
    ports:
      - 5432:5432

PGDATA:

This environment variable configures Postgres inside the container: it tells Postgres where to store its data. In this case, inside the Postgres container, the data lives at /var/lib/postgresql/data/test_pgdata. This path is inside the container, so it doesn’t matter whether /var/lib/postgresql exists on your machine :)
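If you want to confirm the value the container actually sees, one option is to print the variable from a throwaway container. This is a sketch that assumes you run it from docker_folder, next to docker-compose.yml. The function name show_pgdata and the COMPOSE override variable are illustrative names, not Docker features:

```shell
#!/bin/sh
# COMPOSE can be overridden (for example COMPOSE=echo) to preview
# the command instead of running it.
COMPOSE="${COMPOSE:-docker-compose}"

# Start a one-off container for the db service, print the PGDATA
# variable as seen inside it, then remove the container (--rm).
show_pgdata() {
    $COMPOSE run --rm db printenv PGDATA
}

# Usage: show_pgdata
```

Calling show_pgdata should print the path you set under environment in the compose file.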

3. Run your database

docker-compose run --service-ports db

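Now you can access your database using a GUI like PSequel: connect to localhost on port 5432 with the user, password and database name from the compose file.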


In the GUI, click on the “Query” tab and run this query:

CREATE TABLE test_table
(
    id INT PRIMARY KEY NOT NULL,
    email VARCHAR(255)
)

Refresh the GUI and you should see the table. Good, it’s working: you can see test_table on the left!
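If you prefer the command line to a GUI, the same check can be done with psql, which ships inside the Postgres image. This is a sketch under a few assumptions: only one container created from postgres:9.6 is running, and the user and database names are the ones from the compose file above. The function names and the DOCKER override variable are illustrative:

```shell
#!/bin/sh
# DOCKER can be overridden to preview or stub the docker command.
DOCKER="${DOCKER:-docker}"

# Print the ID of the first running container created from postgres:9.6.
find_db_container() {
    $DOCKER ps --quiet --filter ancestor=postgres:9.6 | head -n 1
}

# List the tables of the "test" database by running psql inside the container.
list_tables() {
    $DOCKER exec "$(find_db_container)" psql -U yourusername -d test -c '\dt'
}

# Usage: list_tables
```

After creating the table, list_tables should show test_table in its output.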


So, what’s the problem?

Well, if you stop your container and launch it again, you will see that your table has disappeared: all the data seems to be gone. That’s because the data isn’t persistent yet.


Why is it important?

You need persistent data if you want to keep it in any of these cases:

Your Docker container crashes
You want to upgrade Postgres from version 9 to version 10
You want to restart the containers in docker_folder
…

Is my data really lost?

Well, not exactly. The Postgres image comes with a default Docker volume. So, when you run a Postgres container, it creates its own volume and stores the data in there.
If you want to see the name of the volume attached to your container, type docker container inspect followed by the container name or ID (docker ps lists them; with docker-compose run the name is generated, so it is usually not just db).

You will see the full configuration of your container. Look for this part:

"Mounts": [
            {
                "Type": "volume",
                "Name": "d57756a458346d0f5727278ea0b5a688b5b591c1c79dbb1c6791969c8f1402e8",
                "Source": "/var/lib/docker/volumes/d57756a458346d0f5727278ea0b5a688b5b591c1c79dbb1c6791969c8f1402e8/_data",
                "Destination": "/var/lib/postgresql/data",
                "Driver": "local",
                "Mode": "",
                "RW": true,
                "Propagation": ""
            }
        ],

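Three important elements:

Source: the path on the Docker host where the volume is physically stored (on macOS, this is inside the Docker virtual machine)
Destination: the path inside the container where the volume is mounted (here, the parent directory of your PGDATA)
Name: the name of the volume (you should also see it by typing docker volume ls)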

So your data stays inside the volume attached to your container. But each new container gets its own new volume, which is why it looks like you lost your data. If you need to get a volume back, you can attach a new container to an existing volume, for example:
docker run -e PGDATA=/var/lib/postgresql/data/test_pgdata -v d57756a458346d0f5727278ea0b5a688b5b591c1c79dbb1c6791969c8f1402e8:/var/lib/postgresql/data -p 5432:5432 postgres:9.6
Learn more by reading the official doc.
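Instead of searching the full inspect output by eye, you can ask Docker to print only the mounts. This is a sketch that uses the --format option of docker container inspect; the function name show_mounts and the DOCKER override variable are illustrative:

```shell
#!/bin/sh
# DOCKER can be overridden to preview or stub the docker command.
DOCKER="${DOCKER:-docker}"

# For the given container, print one line per mount:
# the volume name followed by the path it is mounted on.
show_mounts() {
    $DOCKER container inspect \
        --format '{{ range .Mounts }}{{ println .Name .Destination }}{{ end }}' \
        "$1"
}

# Usage: show_mounts <container name or ID>
```

The first column is the value to pass to docker run -v if you want to reattach that volume to a new container.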

4. Persistent Data

There are two options for making data persistent.

1 - Use a named Docker volume

Requirement: Please read the paragraph Is my data really lost? above.

You want to create a Docker volume with a specific name (for instance datavolume) instead of an auto-generated name like d57756a458346d0f5727278ea0b5a688b5b591c1c79dbb1c6791969c8f1402e8

Then you attach the same volume datavolume, containing your data, to every newly created container. This is how: in our docker-compose file we reference the volume inside db, and we declare it in the top-level volumes section (otherwise you’ll get an error saying that no declaration was found in the volumes section).

version: "3"

services:
  db:
    image: postgres:9.6
    environment:
      PGDATA: /var/lib/postgresql/data/test_pgdata
      POSTGRES_USER: yourusername
      POSTGRES_PASSWORD: yourpassword
      POSTGRES_DB: test
    volumes:
      - datavolume:/var/lib/postgresql/data/test_pgdata
    ports:
      - 5432:5432

volumes:
  datavolume:
    driver: local

Note: The last line “driver: local” is completely optional.

Advantage of this approach:
you can reuse a Docker volume and share it across multiple services.
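One way to check that the named volume does its job is to remove the containers and start new ones: a table created before should still be there afterwards. This is a sketch that assumes you run it from docker_folder and that the database was started with docker-compose up -d rather than with run. The function name recreate_db and the COMPOSE override variable are illustrative:

```shell
#!/bin/sh
# COMPOSE can be overridden (for example COMPOSE=echo) to preview
# the commands instead of running them.
COMPOSE="${COMPOSE:-docker-compose}"

# Stop and remove the containers, then start fresh ones in the background.
# Named volumes survive "down" unless it is given the -v flag.
recreate_db() {
    $COMPOSE down
    $COMPOSE up -d
}

# Usage: recreate_db
```

Compose normally prefixes the volume name with the project name, so docker volume ls would typically list it as docker_folder_datavolume rather than datavolume.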

2 - Bind mount (file mount)

This time, the data is stored in a directory of your machine that you choose, and there is no need for the last part of the yml file, the top-level volumes key.

version: "3"

services:
  db:
    image: postgres:9.6
    environment:
      PGDATA: /var/lib/postgresql/data/test_pgdata
      POSTGRES_USER: yourusername
      POSTGRES_PASSWORD: yourpassword
      POSTGRES_DB: test
    volumes:
      - /usr/local/var/postgres/data/test_pgdata:/var/lib/postgresql/data/test_pgdata
    ports:
      - 5432:5432

Note: you can also specify relative paths, but they should always begin with . or .. (otherwise Compose treats the value as a volume name).
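With a bind mount, the left-hand side of the volumes entry is a directory on your machine. An optional step is to create that directory yourself before starting the container, so you know exactly where it is. This is a sketch; the function name prepare_data_dir and the directory name ./pgdata are made up for this example:

```shell
#!/bin/sh
# Create the host directory for a bind mount if it does not exist yet,
# then print its absolute path.
prepare_data_dir() {
    mkdir -p "$1" && (cd "$1" && pwd)
}

# Usage: prepare_data_dir ./pgdata
```

The volumes entry could then read ./pgdata:/var/lib/postgresql/data/test_pgdata, with the relative path resolved from the folder that contains the compose file.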

5. Full volumes definition

There is a more complete, detailed way to define volumes. The Docker documentation is pretty clear; here is a sample:

version: "3.7"
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    volumes:
      - type: volume
        source: mydata
        target: /data
        volume:
          nocopy: true
      - type: bind
        source: ./static
        target: /opt/app/static

networks:
  webnet:

volumes:
  mydata:

As you can see in this example, there is one bind mount (./static) and one volume definition (mydata):

type: volume or bind (other options exist but are less used)
target: always a path (a path inside the container)
source: a host path if type is bind, the name of a volume otherwise

Resources:

Guide: linuxhint