<div align="center"><img height="180" src="https://gitlab.iscpif.fr/gargantext/main/raw/master/images/logo.png"></div>


# Gargantext with Haskell (Backend instance)

#### Table of Contents
1. [About the project](#about)
2. [Installation](#install)
3. [Initialization](#init)
4. [Launch & develop GarganText](#launch)
5. [Use cases](#use-cases)
6. [GraphQL](#graphql)
7. [PostgreSQL](#pgsql)

## About the project <a name="about"></a>

GarganText is a collaborative, decentralized, web-based macro-service platform for the exploration of unstructured texts. It combines tools from natural language processing, text-data-mining bricks, complex network analysis algorithms and interactive data visualization tools to pave the way toward new kinds of interactions with your textual and digital corpora.

This is free software ("libre" in French), developed by the CNRS Complex Systems Institute of Paris Île-de-France (ISC-PIF) and its partners.

GarganText Project: this repo builds the [backend](https://gitlab.iscpif.fr/gargantext/haskell-gargantext) that serves the GarganText frontend.


## Installation <a name="install"></a>

Disclaimer: since this project is still in development, this document remains a work in progress. Please report any issues you encounter and help improve this documentation.

#### Prerequisites

- Install:
  - git (https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
  - curl (https://everything.curl.dev/get)
- Clone the project.
  ```shell
  git clone https://gitlab.iscpif.fr/gargantext/haskell-gargantext.git
  cd haskell-gargantext
  ```
### Installation

This project can be built with either Stack or Cabal.
We keep the `cabal.project` up to date (which allows us
to build with `cabal` by default), but we also support `stack` thanks to
[cabal2stack](https://github.com/iconnect/cabal2stack), which generates a valid `stack.yaml` from
a `cabal.project`. Because Gargantext requires a particular set of system dependencies (C++ libraries,
toolchains, etc.), we use [nix](https://nixos.org/) to set up an environment with all the required system
dependencies, in a sandboxed and isolated fashion.

#### Install Nix

As mentioned, Gargantext requires [Nix](https://github.com/NixOS/nix) to provide system dependencies (for example, C libraries), but its use is limited to that. To install [Nix](https://nixos.org/download.html):

```shell
sh <(curl -L https://nixos.org/nix/install) --daemon
```

Verify the installation is complete with
```shell
nix-env --version
# nix-env (Nix) 2.19.2
```

**Important:** Before building the project with either `stack` or `cabal` you need to be in the correct Nix shell, which will fetch all the required system dependencies. To do so, just type **inside your haskell-gargantext folder**:

```shell
nix-shell
```

This takes some time, as it has to download and build the dependencies, but only the first time.
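
Since every build command has to run inside the Nix shell, a small guard can catch the mistake early. The following is only a sketch: `in_nix_shell` and `build_in_nix` are hypothetical helper names, not part of this repository. It relies on `nix-shell` exporting the `IN_NIX_SHELL` environment variable.

```shell
# Hypothetical helper: succeeds only when run inside a nix-shell.
# nix-shell exports IN_NIX_SHELL (set to "pure" or "impure").
in_nix_shell() {
  [ -n "${IN_NIX_SHELL:-}" ]
}

# Hypothetical wrapper: run the given command only inside nix-shell,
# otherwise print a hint on stderr and fail.
build_in_nix() {
  if in_nix_shell; then
    "$@"
  else
    echo "Not inside nix-shell: run nix-shell from the haskell-gargantext folder first." >&2
    return 1
  fi
}
```

For example, `build_in_nix cabal update` runs `cabal update` inside the Nix shell and refuses to run outside of it.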

### Build: choose cabal (new) or stack (old)

#### With Cabal (recommended)

##### Turning off optimization flags

Create a `cabal.project.local` file (don't commit it to git!):
```
package gargantext
  ghc-options: -fwrite-ide-info -hiedir=".stack-work/hiedb" -O0

package gargantext-admin
  ghc-options: -O0

package gargantext-cli
  ghc-options: -O0

package gargantext-db-obfuscation
  ghc-options: -O0

package gargantext-import
  ghc-options: -O0

package gargantext-init
  ghc-options: -O0

package gargantext-invitations
  ghc-options: -O0

package gargantext-phylo
  ghc-options: -O0

package gargantext-server
  ghc-options: -O0

package gargantext-upgrade
  ghc-options: -O0

package gargantext-graph
  ghc-options: -O0

package hmatrix
  ghc-options: -O0

package sparse-linear
  ghc-options: -O0
```

##### Building

First, inside `nix-shell`:
```shell
cabal update
cabal install
```

Alternatively, if you want to run the commands "from the outside", in your current shell:

```
nix-shell --run "cabal update"
nix-shell --run "cabal install"
```

#### With Stack

Install [Stack (or Haskell Tool Stack)](https://docs.haskellstack.org/en/stable/):

```shell
curl -sSL https://get.haskellstack.org/ | sh
```

Verify the installation is complete with
```shell
stack --version
# Version 2.9.1
```

NOTE: The default build (with optimizations) requires large amounts of RAM (at least 16GB).
To avoid long compilation times and heavy swapping on your machine, it is recommended to `stack build` with the `--fast` flag, i.e.:

```shell
stack build --fast
```


#### Keeping the stack.yaml updated with the cabal.project

(Section for Developers using stack only)

Once you have a valid version of `stack`, building requires generating a valid `stack.yaml`.
This can be obtained by installing `cabal2stack`:

```shell
git clone https://github.com/iconnect/cabal2stack.git
cd cabal2stack
```

Then, depending on which build system you are using, build with either `cabal install --overwrite-policy=always` or `stack install`.

And finally:

```shell
cabal2stack --system-ghc --allow-newer --resolver lts-21.17 --resolver-file devops/stack/lts-21.17.yaml -o stack.yaml
stack build
```

The good news is that you don't have to do all of this manually; during development, after modifying the
`cabal.project`, it's enough to do:

```shell
./bin/update-project-dependencies
```

## Initialization <a name="init"></a>

#### 1. Docker-compose will configure your database and some NLP bricks (such as CoreNLP):

``` sh
# If docker is not installed:
# curl -sSL https://gitlab.iscpif.fr/gargantext/haskell-gargantext/raw/dev/devops/docker/install_docker | sh
cd devops/docker
docker compose up
```
The initialization schema should be loaded automatically (from `devops/postgres/schema.sql`).

##### (Optional) If using stack, then install:
``` sh
stack install
```

#### 2. Copy the configuration file:
``` sh
cp gargantext.ini_toModify gargantext.ini
```
> Do not worry: `.gitignore` prevents this file from being added to the repository by mistake, so you can safely change the passwords in `gargantext.ini`.

#### 3. A first user has to be created for the instance:
``` sh
~/.local/bin/gargantext-init "gargantext.ini"
```
Now, `user1` is created with password `1resu`

#### 4. Clone FRONTEND repository:

From the Backend root folder (haskell-gargantext):

```shell
git clone ssh://git@gitlab.iscpif.fr:20022/gargantext/purescript-gargantext.git
```


## Launch & develop GarganText <a name="launch"></a>

> **Note:** here, the method with Cabal is used as default


From the Backend root folder (haskell-gargantext):

``` shell
./start
# The start script runs the following commands:
# - `./bin/install` to update and build the project
# - `docker compose up` to run the Docker for postgresql from devops/docker folder
# - `cabal run gargantext-server -- --ini gargantext.ini --run Prod` to run other services through `nix-shell`
```

For frontend development and compilation, see the [Frontend Readme.md](https://gitlab.iscpif.fr/gargantext/purescript-gargantext#dev)

### Running tests

From nix shell:

```
cabal v2-test --test-show-details=streaming
```

Or, from "outside":
```
nix-shell --run "cabal v2-test --test-show-details=streaming"
```
### Working on libraries

When development is needed on libraries (for instance, the HAL crawler in https://gitlab.iscpif.fr/gargantext/crawlers):

1. Ongoing development (on local repo):
   1. In `cabal.project`:
      - add `../hal` to `packages:`
      - turn off (temporarily) the `hal` in `source-repository-package`
   2. When changes work and tests are OK, commit in repo `hal`
2. When changes are committed / merged:
   1. Get the hash id, and edit `cabal.project` with the **new commit id**
   2. Run `./bin/update-project-dependencies`
      - you will get an error that the sha256 hashes don't match, so update `./bin/update-project-dependencies` with the new sha256 hash
      - run `./bin/update-project-dependencies` again (to make sure it's a fixed point now)

> Note: without `stack.yaml` we would only have to fix the commit id in `cabal.project` -> `source-repository-package`. The sha256 is there to make sure CI reruns the tests.

## Use Cases <a name="use-cases"></a>

### Multi-User with Graphical User Interface (Server Mode)

``` sh
~/.local/bin/stack --docker exec gargantext-server -- --ini "gargantext.ini" --run Prod
```

Then you can log in with `user1` / `1resu`


### Command Line Mode tools

#### Simple cooccurrences computation and indexation from a list of Ngrams

``` sh
stack --docker exec gargantext-cli -- CorpusFromGarg.csv ListFromGarg.csv Ouput.json
```

### Analyzing the ngrams table repo

We store the repository in the `repos` directory in the [CBOR](https://cbor.io/) file format. To decode it to JSON and analyze it, say, using [jq](https://shapeshed.com/jq-json/), use the following command:

``` sh
cat repos/repo.cbor.v5 | stack exec gargantext-cbor2json | jq .
```
### Documentation

To build documentation, run:

```sh
stack build --haddock --no-haddock-deps --fast
```

The generated documentation is in `.stack-work/dist/x86_64-linux-nix/Cabal-3.2.1.0/doc/html/gargantext`.

## GraphQL <a name="graphql"></a>

Some introspection information.
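
Introspection queries like the ones in this section can also be sent from the command line instead of the Playground. The sketch below assumes that the server accepts standard GraphQL-over-HTTP `POST` requests on the Playground URL (`http://localhost:8008/gql`), which this README does not confirm. `gql_payload`, `gql_query` and `GQL_URL` are hypothetical names, and only single-line queries are handled.

```shell
# Hypothetical helper: wrap a single-line GraphQL query in a JSON body,
# escaping backslashes and double quotes.
gql_payload() {
  escaped=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"query":"%s"}\n' "$escaped"
}

# Hypothetical helper: POST the query to the endpoint.
# Assumption: the endpoint accepts JSON POST requests at this URL;
# override it with the (hypothetical) GQL_URL variable if needed.
gql_query() {
  curl -sS -X POST -H 'Content-Type: application/json' \
    --data "$(gql_payload "$1")" \
    "${GQL_URL:-http://localhost:8008/gql}"
}
```

With the server running, `gql_query '{ __schema { types { name } } }' | jq .` should then list the type names, provided the assumption above holds.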

Playground is located at http://localhost:8008/gql

### List all GraphQL types in the Playground

```
{
  __schema {
    types {
      name
    }
  }
}
```

### List details about a type in GraphQL

```
{
  __type(name:"User") {
    fields {
      name
      description
      type {
        name
      }
    }
  }
}
```
## PostgreSQL <a name="pgsql"></a>

### Upgrading using Docker

https://www.cloudytuts.com/tutorials/docker/how-to-upgrade-postgresql-in-docker-and-kubernetes/

To upgrade PostgreSQL in Docker containers, for example from 11.x to 14.x, first dump the existing databases:
```sh
# no -t here: a pseudo-TTY would add carriage returns to the redirected dump
docker exec -i <container-id> pg_dumpall -U gargantua > 11-db.dump
```

Then, shut down the container and replace the `image` section in `devops/docker/docker-compose.yaml` with `postgres:14`. It is also good practice to create a new volume, say `garg-pgdata14`, and bind the new container to it. If you want to keep the same volume name, remember to remove the old volume first, like so:
```sh
docker-compose rm postgres
docker volume rm docker_garg-pgdata
```

Now, start the container and execute:
```sh
# need to drop the empty DB first, since schema will be created when restoring the dump
docker exec -i <new-container-id> dropdb -U gargantua gargandbV5
# recreate the db, but empty with no schema
docker exec -i <new-container-id> createdb -U gargantua gargandbV5
# now we can restore the dump
docker exec -i <new-container-id> psql -U gargantua -d gargandbV5 < 11-db.dump
```

### Upgrading using

There is a solution using `pg_upgradecluster`, but it requires managing both the version 13 and version 14 clusters. Hence, here is a simpler way to upgrade.

First save your data:
```
sudo su postgres
pg_dumpall > gargandb.dump
```

Upgrade postgresql:
```
sudo apt install postgresql-14 postgresql-client-14
sudo apt remove --purge postgresql-13
```
Restore your data:
```
sudo su postgres
psql < gargandb.dump
```

You may need to restore the gargantua password:
```
ALTER ROLE gargantua PASSWORD 'yourPasswordIn_gargantext.ini';
```
You may also need to change the database connection port to 5433 in your `gargantext.ini` file.
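
Before editing the port, it can help to check which value is currently configured. This is a minimal sketch: `ini_get` is a hypothetical helper, and `DB_PORT` is an assumed key name that should be checked against your own `gargantext.ini`.

```shell
# Hypothetical helper: print the value of KEY from an INI-style FILE
# (first match wins, surrounding whitespace is trimmed).
# Usage: ini_get FILE KEY
ini_get() {
  awk -F '=' -v key="$2" '
    { k = $1; gsub(/^[ \t]+|[ \t]+$/, "", k) }
    k == key {
      v = $0; sub(/^[^=]*=/, "", v)
      gsub(/^[ \t]+|[ \t]+$/, "", v)
      print v; exit
    }' "$1"
}

# DB_PORT is an assumed key name, not confirmed by this README.
if [ -f gargantext.ini ]; then
  echo "configured port: $(ini_get gargantext.ini DB_PORT)"
fi
```

If the printed port differs from the one the upgraded PostgreSQL cluster listens on, update the value in `gargantext.ini` accordingly.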