  1  <div align="center"><img height="180" src="https://gitlab.iscpif.fr/gargantext/main/raw/master/images/logo.png"></div>
  2  
  3  &nbsp;
  4  # Gargantext with Haskell (Backend instance)
  5  
  6  ![Haskell](https://img.shields.io/badge/Code-Haskell-informational?style=flat&logo=haskell&color=6144b3)&nbsp;&nbsp;![Nix](https://img.shields.io/badge/Package%20manager-Nix-informational?style=flat&logo=nixos&color=6586c8)&nbsp;&nbsp;![Cabal](https://img.shields.io/badge/Tools-Cabal-informational?style=flat&logo=cabal&color=567dd9)&nbsp;&nbsp;![Stack](https://img.shields.io/badge/Tools-Stack-informational?style=flat&logo=stack&color=6144b3)&nbsp;&nbsp;![GHC](https://img.shields.io/badge/Tools-GHC-informational?style=flat&logo=&color=2E677B)&nbsp;&nbsp;![Docker](https://img.shields.io/badge/Tools-Docker-informational?style=flat&logo=docker&color=003f8c)
  7  
  8  #### Table of Contents
  9  1. [About the project](#about)
 10  2. [Installation](#install)
 11  3. [Initialization](#init)
 12  4. [Launch & develop GarganText](#launch)
5. [Use cases](#use-cases)
 14  6. [GraphQL](#graphql)
7. [PostgreSQL](#pgsql)
 16  
 17  ## About the project <a name="about"></a>
 18  
GarganText is a collaborative, decentralized, web-based macro-service platform for exploring unstructured texts. It combines natural language processing tools, text-mining bricks, complex-network analysis algorithms and interactive data visualization to pave the way toward new kinds of interactions with your textual and digital corpora.
 20  
This is free software ("free" as in *libre*), developed by the CNRS Complex Systems Institute of Paris Île-de-France (ISC-PIF) and its partners.
 22  
GarganText Project: this repo builds the backend for the frontend server built by [frontend](https://gitlab.iscpif.fr/gargantext/purescript-gargantext).
 24  
 25  
 26  ## Installation <a name="install"></a>
 27  
Disclaimer: this project is still in development, so this document remains a work in progress. Please report any issues you encounter and help improve this documentation.
 29  
 30  #### Prerequisites
 31  
 32  - Install:
 33    - git (https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
 34    - curl (https://everything.curl.dev/get)
 35  - Clone the project.
 36    ```shell
 37    git clone https://gitlab.iscpif.fr/gargantext/haskell-gargantext.git
 38    cd haskell-gargantext
 39    ```
 40  ### Installation
 41  
This project can be built with either Stack or Cabal. We keep the `cabal.project` up to date (which allows us
to build with `cabal` by default), and we support `stack` thanks to
[cabal2stack](https://github.com/iconnect/cabal2stack), which generates a valid `stack.yaml` from
a `cabal.project`. Because Gargantext requires a particular set of system dependencies (C++ libraries,
toolchains, etc.), we use [nix](https://nixos.org/) to set up a sandboxed, isolated environment that provides
all of them.
 48  
 49  #### Install Nix 
 50  
As noted above, Gargantext uses [Nix](https://github.com/NixOS/nix) only to provide system dependencies (for example, C libraries). To install [Nix](https://nixos.org/download.html):
 52  
 53  ```shell
 54  sh <(curl -L https://nixos.org/nix/install) --daemon
 55  ```
 56  
 57  Verify the installation is complete with
 58  ```shell
 59  nix-env --version
 60  nix-env (Nix) 2.19.2
 61  ```
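
The prerequisites and the Nix installation can be checked in one pass. The following is a minimal sketch, not part of the project: `missing_tools` is a hypothetical helper name, and it relies only on the POSIX `command -v` builtin.

```shell
# Print every tool from the argument list that is not found on PATH.
# Returns 0 when all tools are present, 1 otherwise.
missing_tools() {
  rc=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "$tool"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `missing_tools git curl nix-env` prints nothing when everything needed so far is installed.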
 62  
 63  **Important:** Before building the project with either `stack` or `cabal` you need to be in the correct Nix shell, which will fetch all the required system dependencies. To do so, just type **inside your haskell-gargantext folder**:
 64  
 65  ```shell
 66  nix-shell
 67  ```
 68  
This takes a while the first time, because the dependencies have to be downloaded or built; later runs reuse them.
 70  
 71  ### Build: choose cabal (new) or stack (old)
 72  
#### With Cabal (recommended)
 74  
 75  ##### Turning off optimization flags
 76  
 77  Create a `cabal.project.local` file (don't commit it to git!):
 78  ```
 79  package gargantext
 80      ghc-options: -fwrite-ide-info -hiedir=".stack-work/hiedb" -O0
 81  
 82  package gargantext-admin
 83      ghc-options: -O0
 84  
 85  package gargantext-cli
 86      ghc-options: -O0
 87  
 88  package gargantext-db-obfuscation
 89      ghc-options: -O0
 90  
 91  package gargantext-import
 92      ghc-options: -O0
 93  
 94  package gargantext-init
 95      ghc-options: -O0
 96  
 97  package gargantext-invitations
 98      ghc-options: -O0
 99  
100  package gargantext-phylo
101      ghc-options: -O0
102  
103  package gargantext-server
104      ghc-options: -O0
105  
106  package gargantext-upgrade
107      ghc-options: -O0
108  
109  package gargantext-graph
110      ghc-options: -O0
111  
112  package hmatrix
113      ghc-options: -O0
114  
115  package sparse-linear
116      ghc-options: -O0
117  ```
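
Most stanzas in the file above follow the same pattern, so they can be generated rather than typed. This is only a sketch: `o0_stanzas` is a hypothetical helper, and it does not cover the `gargantext` package, which carries extra flags in the listing above.

```shell
# Emit one "-O0" stanza per package name given as an argument,
# in the same layout as the cabal.project.local listing.
o0_stanzas() {
  for pkg in "$@"; do
    printf 'package %s\n    ghc-options: -O0\n\n' "$pkg"
  done
}
```

For example, `o0_stanzas gargantext-server gargantext-cli >> cabal.project.local` appends two stanzas to the local file.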
118  
119  ##### Building
120  
First, inside `nix-shell`:
122  ```shell
123  cabal update
124  cabal install
125  ```
126  
127  Alternatively, if you want to run the command "from the outside", in your current shell:
128  
129  ```
130  nix-shell --run "cabal update"
131  nix-shell --run "cabal install"
132  ```
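
The two ways of running a command (inside `nix-shell`, or from the outside with `nix-shell --run`) can be folded into one helper. The sketch below assumes that `nix-shell` exports the `IN_NIX_SHELL` variable inside the environment it creates; `with_nix` is a hypothetical name. Arguments are joined with spaces before being passed to `--run`, so it is only suitable for simple commands without embedded quoting.

```shell
# Run a command inside the project's Nix environment.
# When IN_NIX_SHELL is set we are already inside nix-shell, so the command
# runs directly; otherwise it is handed to `nix-shell --run`.
with_nix() {
  if [ -n "${IN_NIX_SHELL:-}" ]; then
    "$@"
  else
    nix-shell --run "$*"
  fi
}
```

For example, `with_nix cabal update` followed by `with_nix cabal install` behaves the same from either shell.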
133  
134  #### With Stack
135  
136  Install [Stack (or Haskell Tool Stack)](https://docs.haskellstack.org/en/stable/):
137  
138  ```shell
139  curl -sSL https://get.haskellstack.org/ | sh
140  ```
141  
142  Verify the installation is complete with
143  ```shell
144  stack --version
145  Version 2.9.1
146  ```
147  
NOTE: The default build (with optimizations) requires a large amount of RAM (at least 16GB). To avoid long compilation times and heavy swapping, it is recommended to run `stack build` with the `--fast` flag, i.e.:
149  
150  ```shell
151  stack build --fast
152  ```
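
The choice of flag can be scripted from the amount of RAM. This is a rough sketch: `stack_build_flags` is a hypothetical helper, and the 16 GB threshold is simply the figure quoted in the note above.

```shell
# Print the stack flag to use for a machine with the given RAM in whole GB:
# "--fast" below 16 GB, nothing otherwise.
stack_build_flags() {
  if [ "$1" -lt 16 ]; then
    echo "--fast"
  fi
}
```

On Linux, the total RAM in GB can be read with `awk '/MemTotal/ {print int($2/1024/1024)}' /proc/meminfo` and passed to the helper, for example `stack build $(stack_build_flags 8)`.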
153  
154  
155  #### Keeping the stack.yaml updated with the cabal.project
156  
157  (Section for Developers using stack only)
158  
159  Once you have a valid version of `stack`, building requires generating a valid `stack.yaml`.
160  This can be obtained by installing `cabal2stack`:
161  
162  ```shell
163  git clone https://github.com/iconnect/cabal2stack.git
164  cd cabal2stack
165  ```
166  
167  Then, depending on what build system you are using, either build with `cabal install --overwrite-policy=always` or `stack install`.
168  
169  And finally:
170  
171  ```shell
172  cabal2stack --system-ghc --allow-newer --resolver lts-21.17 --resolver-file devops/stack/lts-21.17.yaml -o stack.yaml
173  stack build
174  ```
175  
176  The good news is that you don't have to do all of this manually; during development, after modifying the
177  `cabal.project`, it's enough to do:
178  
179  ```shell
180  ./bin/update-project-dependencies
181  ```
182  
183  ## Initialization <a name="init"></a>
184  
185  #### 1. Docker-compose will configure your database and some NLP bricks (such as CoreNLP):
186  
187  ``` sh
188  # If docker is not installed:
189  # curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/docker/install_docker | sh
190  cd devops/docker
191  docker compose up
192  ```
193  Initialization schema should be loaded automatically (from `devops/postgres/schema.sql`).
194  
195  ##### (Optional) If using stack, then install:
196  ``` sh
197  stack install
198  ```
199  
200  #### 2. Copy the configuration file:
201  ``` sh
202  cp gargantext.ini_toModify gargantext.ini
203  ```
> Do not worry: `.gitignore` keeps this file from being added to the repository by mistake, so you can safely change the passwords in `gargantext.ini`.
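
Running the plain `cp` a second time would overwrite an edited configuration. A guarded copy avoids that; this is a sketch, with `copy_config` as a hypothetical helper whose defaults are the two file names used above.

```shell
# Copy the template to the configuration file unless one already exists,
# so an edited file (with changed passwords) is never overwritten.
copy_config() {
  template=${1:-gargantext.ini_toModify}
  target=${2:-gargantext.ini}
  if [ -e "$target" ]; then
    echo "$target already exists, leaving it untouched" >&2
  else
    cp "$template" "$target"
  fi
}
```

Called without arguments from the repository root, `copy_config` performs the copy shown in this step only when `gargantext.ini` is absent.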
205  
#### 3. A user has to be created first, for instance:
207  ``` sh
208  ~/.local/bin/gargantext-init "gargantext.ini"
209  ```
210  Now, `user1` is created with password `1resu`
211  
212  #### 4. Clone FRONTEND repository:
213  
214  From the Backend root folder (haskell-gargantext):
215  
216  ```shell
217  git clone ssh://git@gitlab.iscpif.fr:20022/gargantext/purescript-gargantext.git
218  ```
219  &nbsp;
220  
221  ## Launch & develop GarganText <a name="launch"></a>
222  
223  >  **Note:** here, the method with Cabal is used as default
224  
225  
226  From the Backend root folder (haskell-gargantext):
227  
228  ``` shell
229  ./start
# The start script runs the following commands:
231  # - `./bin/install` to update and build the project
232  # - `docker compose up` to run the Docker for postgresql from devops/docker folder
233  # - `cabal run gargantext-server -- --ini gargantext.ini --run Prod` to run other services through `nix-shell`
234  ```
235  
236  For frontend development and compilation, see the [Frontend Readme.md](https://gitlab.iscpif.fr/gargantext/purescript-gargantext#dev)
237  
238  ### Running tests
239  
240  From nix shell:
241  
242  ```
243  cabal v2-test --test-show-details=streaming
244  ```
245  
246  Or, from "outside":
247  ```
248  nix-shell --run "cabal v2-test --test-show-details=streaming"
249  ```
250  ### Working on libraries
251  
When development is needed on a library (for instance, the HAL crawler in https://gitlab.iscpif.fr/gargantext/crawlers):

1. Ongoing development (on a local repo):
   1. In `cabal.project`:
      - add `../hal` to `packages:`
      - turn off (temporarily) the `hal` entry in `source-repository-package`
   2. When the changes work and the tests pass, commit in the `hal` repo
2. When the changes are committed / merged:
   1. Get the commit hash and edit `cabal.project` with the **new commit id**
   2. Run `./bin/update-project-dependencies`
      - expect an error saying the sha256 hashes don't match, then update `./bin/update-project-dependencies` with the new sha256 hash
      - run `./bin/update-project-dependencies` again (to make sure it has reached a fixed point)
264  
> Note: without `stack.yaml` we would only have to fix the `source-repository-package` commit id in `cabal.project`. The sha256 is there to make sure CI reruns the tests.
266  
267  ## Use Cases <a name="use-cases"></a>
268  
269  ### Multi-User with Graphical User Interface (Server Mode)
270  
271  ``` sh
272  ~/.local/bin/stack --docker exec gargantext-server -- --ini "gargantext.ini" --run Prod
273  ```
274  
275  Then you can log in with `user1` / `1resu`
276  
277  
278  ### Command Line Mode tools
279  
280  #### Simple cooccurrences computation and indexation from a list of Ngrams
281  
282  ``` sh
283  stack --docker exec gargantext-cli -- CorpusFromGarg.csv ListFromGarg.csv Ouput.json
284  ```
285  
286  ### Analyzing the ngrams table repo
287  
288  We store the repository in directory `repos` in the [CBOR](https://cbor.io/) file format. To decode it to JSON and analyze, say, using [jq](https://shapeshed.com/jq-json/), use the following command:
289  
290  ``` sh
291  cat repos/repo.cbor.v5 | stack exec gargantext-cbor2json | jq .
292  ```
293  ### Documentation
294  
295  To build documentation, run:
296  
297  ```sh
298  stack build --haddock --no-haddock-deps --fast
299  ```
300  
The generated HTML ends up in `.stack-work/dist/x86_64-linux-nix/Cabal-3.2.1.0/doc/html/gargantext` (the exact path depends on your platform and Cabal version).
302  
303  ## GraphQL <a name="graphql"></a>
304  
305  Some introspection information.
306  
307  Playground is located at http://localhost:8008/gql
308  
309  ### List all GraphQL types in the Playground
310  
311  ```
312  {
313    __schema {
314      types {
315        name
316      }
317    }
318  }
319  ```
320  
321  ### List details about a type in GraphQL
322  
323  ```
324  {
325    __type(name:"User") {
326    	fields {
327      	name
328        description
329        type {
330          name
331        }
332    	}
333  	}
334  }
335  ```
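
The same introspection queries can be sent from the command line. The sketch below only builds the JSON body; `gql_payload` is a hypothetical helper that handles single-line queries without double quotes or backslashes. Whether the server accepts POST requests at the Playground URL, and whether it requires authentication, is an assumption this README does not confirm.

```shell
# Wrap a single-line GraphQL query (no double quotes or backslashes) in the
# {"query": ...} JSON envelope conventionally used for GraphQL over HTTP.
gql_payload() {
  printf '{"query":"%s"}\n' "$1"
}
```

Under those assumptions, listing all types would look like `curl -s -H 'Content-Type: application/json' -d "$(gql_payload '{ __schema { types { name } } }')" http://localhost:8008/gql`.
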
336  ## PostgreSQL <a name="pgsql"></a>
337  
338  ### Upgrading using Docker
339  
340  https://www.cloudytuts.com/tutorials/docker/how-to-upgrade-postgresql-in-docker-and-kubernetes/
341  
To upgrade PostgreSQL in Docker containers, for example from 11.x to 14.x, first dump all databases from the running container:
343  ```sh
344  docker exec -it <container-id> pg_dumpall -U gargantua > 11-db.dump
345  ```
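
Since the next step may delete the old volume, it is worth checking that the dump was actually written. A minimal sketch, with `dump_looks_ok` as a hypothetical helper that only tests that the file exists and is non-empty (it does not validate the SQL inside):

```shell
# Succeed only if the given dump file exists and is non-empty.
dump_looks_ok() {
  [ -s "$1" ]
}
```

For example, `dump_looks_ok 11-db.dump || echo "dump missing or empty, do not remove the volume" >&2`.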
346  
Then shut down the container and replace the `image` entry in `devops/docker/docker-compose.yaml` with `postgres:14`. It is good practice to create a new volume, say `garg-pgdata14`, and bind the new container to it. If you would rather reuse the existing volume name, remove the old container and volume first:
348  ```sh
349  docker-compose rm postgres
350  docker volume rm docker_garg-pgdata
351  ```
352  
353  Now, start the container and execute:
354  ```sh
355  # need to drop the empty DB first, since schema will be created when restoring the dump
356  docker exec -i <new-container-id> dropdb -U gargantua gargandbV5
357  # recreate the db, but empty with no schema
358  docker exec -i <new-container-id> createdb -U gargantua gargandbV5
359  # now we can restore the dump
360  docker exec -i <new-container-id> psql -U gargantua -d gargandbV5 < 11-db.dump
361  ```
362  
363  ### Upgrading using 
364  
There is a solution using `pg_upgradecluster`, but it requires managing both the version 13 and version 14 clusters. Here is a simpler way to upgrade.
366  
367  First save your data:
368  ```
369  sudo su postgres
370  pg_dumpall > gargandb.dump
371  ```
372  
373  Upgrade postgresql:
374  ```
sudo apt install postgresql-14 postgresql-client-14
376  sudo apt remove --purge postgresql-13
377  ```
378  Restore your data:
379  ```
380  sudo su postgres
381  psql < gargandb.dump
382  ```
383  
You may need to reset the `gargantua` password:
```
ALTER ROLE gargantua PASSWORD 'yourPasswordIn_gargantext.ini';
```
You may also need to change the database connection port to 5433 in your `gargantext.ini` file.
389  
390  
391  
392