
Docker debugging

This document describes troubleshooting methods for our Docker environment. It applies when a service managed by Docker is not running properly.

Get the list of running containers

First, check whether the containers needed for the service are running. To find the names of the containers associated with a service, go to the service list and look for the documentation or the source code of the service. For example, if Wekan is not working, its documentation points you to the service's repository. There you can look at the README.md or the Ansible playbook, either of which gives the names of the Docker containers associated with the service. For Wekan, the playbook shows that the containers are named wekandb and wekan. Once you know which containers you are looking for, check the list of containers.

You can then get the list of Docker containers in two ways:

From the Docker status page

Go to docker.fsfe.org. You'll find a list of Docker containers along with their logs.

Directly on lund

On the lund server, type the following command:

docker ps -a
docker logs <container name>

Without the -a flag the stopped containers are excluded from the list.
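On a host with many containers, the full list can be narrowed to one service. The following is a minimal sketch, assuming the container names contain the service name, as wekan and wekandb do; the function name list_containers is hypothetical and not part of our tooling. The name filter matches substrings, so filtering on wekan returns both containers.

```shell
# List every container (running or stopped) whose name contains the
# given string, together with its current status.
# Usage: list_containers wekan
list_containers() {
    docker ps -a --filter "name=$1" --format '{{.Names}}: {{.Status}}'
}
```

A status starting with "Up" means the container is running; a status starting with "Exited" also carries the exit code in parentheses.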

If the container is running, check the logs of the container for errors.
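Container logs can be long, so it often helps to look only at the most recent lines that mention a problem. This is a rough sketch rather than an established procedure: the function name recent_errors and the search pattern are assumptions, and an application may report errors with different wording.

```shell
# Print the most recent log lines of a container that mention an error.
# Usage: recent_errors <container name> [number of lines to scan]
recent_errors() {
    name="$1"
    lines="${2:-200}"
    # docker logs replays the container's stderr on stderr,
    # so merge both streams before filtering.
    docker logs --tail "$lines" --timestamps "$name" 2>&1 \
        | grep -i -E 'error|fatal|exception'
}
```

For example, recent_errors wekan 500 scans the last 500 log lines of the wekan container.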

If the container is stopped, then you will have to figure out why it does not start. Below is a list of methods to troubleshoot why a container is not starting.

Simply start the container

If the container is stopped and you want to start it, use the following command:

docker start <container name>

Then use the docker ps command to make sure the container has been started. If the service does not work after the docker start command, there are two possibilities:

1- docker start reports an error: the problem lies in the Docker configuration or in the container itself.

2- docker start succeeds: the problem lies in the application running in the container, which makes the container crash (exit code != 0).
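The two cases can be told apart by starting the container and then asking Docker for its state. The sketch below is one possible way to do this; the function name diagnose_start and the five-second default wait are assumptions. A container with a restart policy may briefly report itself as running between crashes, so treat a "running" result as a hint, not a proof.

```shell
# Start a container and report which of the two cases applies.
# Usage: diagnose_start <container name> [seconds to wait]
diagnose_start() {
    name="$1"
    wait_seconds="${2:-5}"

    # Case 1: docker start itself fails; its error message goes to stderr.
    if ! docker start "$name" >/dev/null; then
        echo "case 1: docker start failed for $name"
        return 1
    fi

    # Give the application a moment to crash if it is going to.
    sleep "$wait_seconds"

    running="$(docker inspect --format '{{.State.Running}}' "$name")"
    if [ "$running" = "true" ]; then
        echo "$name is running"
    else
        # Case 2: the container started but the application exited.
        exit_code="$(docker inspect --format '{{.State.ExitCode}}' "$name")"
        echo "case 2: $name exited with code $exit_code"
        return 2
    fi
}
```

Calling diagnose_start wekan prints one line naming the case, which tells you which of the following sections to read.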

Recreate the container

In case 1, the output of docker start indicates what went wrong. The error is probably in the configuration of the container. If the container was deployed on lund, it was probably deployed using an Ansible playbook. Fix the playbook according to the error reported by docker start, then run the playbook again. The command to do this is in the README file of the service. It should be something like ansible-playbook -i hosts drone.deploy.yml.

Run the container manually

In case 2, the docker start command worked but the container crashes immediately, so you will have to run the container manually to get more information about the error. One way to do that is to find out which parameters were used to run the container and pass them to a docker run command, so you can see what is going on. All parameters of a container are reported by the docker inspect <container name> command, but the best way to get them is to read the Ansible playbook and build the corresponding docker run command from it.
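When the playbook is not at hand, the most relevant parameters can be pulled out of docker inspect with Go templates instead of reading the full JSON. This is a minimal sketch; the function name show_run_params is hypothetical, and the selection of fields (image, restart policy, networks, environment) is only an assumption about what is usually needed to rebuild a docker run command.

```shell
# Print the parameters of an existing container that matter most
# when rebuilding its docker run command by hand.
# Usage: show_run_params <container name>
show_run_params() {
    name="$1"
    echo "image:   $(docker inspect --format '{{.Config.Image}}' "$name")"
    echo "restart: $(docker inspect --format '{{.HostConfig.RestartPolicy.Name}}' "$name")"
    echo "networks (JSON):"
    docker inspect --format '{{json .NetworkSettings.Networks}}' "$name"
    echo "env:"
    # One environment variable per line, as KEY=value.
    docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$name"
}
```

The environment list also contains variables baked into the image, so it is usually longer than the env section of the playbook.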

For example, if the Docker container wekan does not start, you can find all the parameters required to run it in the playbook:

- name: run the wekan container
  docker_container:
    name: wekan
    image: wekanteam/wekan:v0.75
    state: started
    restart: yes
    restart_policy: always
    networks:
      - name: wekan-net
        ipv4_address: '192.168.201.20'
        links:
          - wekandb:wekandb
    env:
      VIRTUAL_HOST: kan.fsfe.org
      LETSENCRYPT_HOST: kan.fsfe.org
      LETSENCRYPT_EMAIL: max.mehl@fsfe.org
      MONGO_URL: mongodb://wekandb/wekan
      ROOT_URL: https://kan.fsfe.org
      MAIL_URL: smtp://mail.fsfe.org:25/
      MAIL_FROM: admin@fsfe.org

Then translate these parameters into docker run arguments (docker run --help lists the available options):

docker run --name wekan \
           --restart always \
           --network wekan-net \
           --ip 192.168.201.20 \
           --link wekandb:wekandb \
           -e VIRTUAL_HOST=kan.fsfe.org \
           -e LETSENCRYPT_HOST=kan.fsfe.org \
           -e LETSENCRYPT_EMAIL=max.mehl@fsfe.org \
           -e MONGO_URL=mongodb://wekandb/wekan \
           -e ROOT_URL=https://kan.fsfe.org \
           -e MAIL_URL=smtp://mail.fsfe.org:25/ \
           -e MAIL_FROM=admin@fsfe.org \
           wekanteam/wekan:v0.75

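This command starts the Docker container manually in the foreground, so you can see all error messages and hopefully understand what is causing them. If a stopped container named wekan still exists, docker run refuses to reuse the name; remove the old container first with docker rm wekan, or choose a different --name.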

TechDocs/Docker/Debug (last edited 2018-05-08 17:54:32 by vincent)