outlines_llama-cpp-python_Q_and_A_with_citations.ipynb
1 { 2 "cells": [ 3 { 4 "cell_type": "markdown", 5 "id": "5c8a48bb-484c-4135-ad03-5380f4348070", 6 "metadata": {}, 7 "source": [ 8 "# Generate Synthetic Data and Q&A with Citations" 9 ] 10 }, 11 { 12 "cell_type": "markdown", 13 "id": "52f5766f-a222-40d0-bb7b-87aea7f6acbb", 14 "metadata": {}, 15 "source": [ 16 "## Adapted from the Ollama notebook" 17 ] 18 }, 19 { 20 "cell_type": "markdown", 21 "id": "aff56341-244e-475b-b0ca-63e41a59ef5f", 22 "metadata": {}, 23 "source": [ 24 "## Requirements" 25 ] 26 }, 27 { 28 "cell_type": "markdown", 29 "id": "040befa2-f90d-480f-8c0e-9b2384b6e1ff", 30 "metadata": {}, 31 "source": [ 32 "### Install llama-cpp-python and outlines" 33 ] 34 }, 35 { 36 "cell_type": "code", 37 "execution_count": 1, 38 "id": "876492fb-569f-4034-9660-0c30191521dc", 39 "metadata": { 40 "execution": { 41 "iopub.execute_input": "2024-07-13T17:57:14.112190Z", 42 "iopub.status.busy": "2024-07-13T17:57:14.111193Z", 43 "iopub.status.idle": "2024-07-13T17:57:14.117229Z", 44 "shell.execute_reply": "2024-07-13T17:57:14.115930Z", 45 "shell.execute_reply.started": "2024-07-13T17:57:14.112138Z" 46 } 47 }, 48 "outputs": [], 49 "source": [ 50 "# RUN IT ONLY ONCE TO INSTALL THE REQUIREMENTS\n", 51 "# %pip install llama-cpp-python outlines" 52 ] 53 }, 54 { 55 "cell_type": "markdown", 56 "id": "a633a79a-f12f-493a-b8e3-029a008a3610", 57 "metadata": {}, 58 "source": [ 59 "For detailed installation instructions, see [llama-cpp-python installation](https://llama-cpp-python.readthedocs.io/en/stable/) and [outlines installation](https://outlines-dev.github.io/outlines/installation/)" 60 ] 61 }, 62 { 63 "cell_type": "markdown", 64 "id": "fa19089d-997d-41b9-9b1c-21f09ef03105", 65 "metadata": {}, 66 "source": [ 67 "### Pull the model from HuggingFace" 68 ] 69 }, 70 { 71 "cell_type": "markdown", 72 "id": "1a2d88f6-39df-4dde-a206-48283d9f1b8d", 73 "metadata": {}, 74 "source": [ 75 "Download a GGUF model from HuggingFace 
[here](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/tree/main), for example, the `Q4_K_M` one (it requires 4.92 GB):" 76 ] 77 }, 78 { 79 "cell_type": "code", 80 "execution_count": 2, 81 "id": "9beb92f1-79d5-4ddd-9b3c-3b1723acd620", 82 "metadata": { 83 "execution": { 84 "iopub.execute_input": "2024-07-13T17:57:14.119085Z", 85 "iopub.status.busy": "2024-07-13T17:57:14.118657Z", 86 "iopub.status.idle": "2024-07-13T17:57:14.142574Z", 87 "shell.execute_reply": "2024-07-13T17:57:14.141425Z", 88 "shell.execute_reply.started": "2024-07-13T17:57:14.119044Z" 89 } 90 }, 91 "outputs": [], 92 "source": [ 93 "# RUN IT ONLY ONCE TO DOWNLOAD THE GGUF MODEL, IN THIS CASE THE Q4_K_M\n", 94 "# !wget https://hf.co/NousResearch/Hermes-2-Pro-Llama-3-8B-GGUF/resolve/main/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf" 95 ] 96 }, 97 { 98 "cell_type": "markdown", 99 "id": "00c1d29a-11f5-4e67-b968-1d7475ec105e", 100 "metadata": {}, 101 "source": [ 102 "## Usage" 103 ] 104 }, 105 { 106 "cell_type": "markdown", 107 "id": "e71b912b-5225-4257-87bb-1327d6a4e42a", 108 "metadata": {}, 109 "source": [ 110 "### Generate Synthetic Data" 111 ] 112 }, 113 { 114 "cell_type": "markdown", 115 "id": "5e744906-879e-4a4b-90cc-1151dbdc6020", 116 "metadata": {}, 117 "source": [ 118 "#### Define Pydantic class" 119 ] 120 }, 121 { 122 "cell_type": "code", 123 "execution_count": 3, 124 "id": "26091031-2eaa-424c-9789-1f502d4697ba", 125 "metadata": { 126 "execution": { 127 "iopub.execute_input": "2024-07-13T17:57:14.144768Z", 128 "iopub.status.busy": "2024-07-13T17:57:14.144286Z", 129 "iopub.status.idle": "2024-07-13T17:57:14.246788Z", 130 "shell.execute_reply": "2024-07-13T17:57:14.245330Z", 131 "shell.execute_reply.started": "2024-07-13T17:57:14.144722Z" 132 } 133 }, 134 "outputs": [], 135 "source": [ 136 "from pydantic import BaseModel, Field\n", 137 "\n", 138 "class UserDetail(BaseModel):\n", 139 " id: int = Field(..., description=\"Unique identifier\") # so the model keeps track of the number of fake 
users\n", 140 " first_name: str\n", 141 " last_name: str\n", 142 " age: int" 143 ] 144 }, 145 { 146 "cell_type": "code", 147 "execution_count": 4, 148 "id": "fbc58b63-d071-4098-aa18-6c24115c3777", 149 "metadata": { 150 "execution": { 151 "iopub.execute_input": "2024-07-13T17:57:14.248435Z", 152 "iopub.status.busy": "2024-07-13T17:57:14.248075Z", 153 "iopub.status.idle": "2024-07-13T17:57:14.254668Z", 154 "shell.execute_reply": "2024-07-13T17:57:14.253769Z", 155 "shell.execute_reply.started": "2024-07-13T17:57:14.248401Z" 156 } 157 }, 158 "outputs": [], 159 "source": [ 160 "from typing import List\n", 161 "\n", 162 "class Users(BaseModel):\n", 163 " users: List[UserDetail]" 164 ] 165 }, 166 { 167 "cell_type": "markdown", 168 "id": "fc35fdaf-c980-41d8-9b08-25b3956fd3fa", 169 "metadata": {}, 170 "source": [ 171 "#### Load the model" 172 ] 173 }, 174 { 175 "cell_type": "code", 176 "execution_count": 5, 177 "id": "16d810bb-9189-4af4-9734-1a031bca3cec", 178 "metadata": { 179 "execution": { 180 "iopub.execute_input": "2024-07-13T17:57:14.256312Z", 181 "iopub.status.busy": "2024-07-13T17:57:14.255955Z", 182 "iopub.status.idle": "2024-07-13T17:57:19.851467Z", 183 "shell.execute_reply": "2024-07-13T17:57:19.850453Z", 184 "shell.execute_reply.started": "2024-07-13T17:57:14.256282Z" 185 } 186 }, 187 "outputs": [ 188 { 189 "name": "stderr", 190 "output_type": "stream", 191 "text": [ 192 "Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.\n" 193 ] 194 } 195 ], 196 "source": [ 197 "import llama_cpp\n", 198 "from llama_cpp import Llama\n", 199 "from outlines import generate, models\n", 200 "\n", 201 "llm = Llama(\n", 202 " \"/home/asilva/models/Hermes-2-Pro-Llama-3-8B-Q4_K_M.gguf\", # replace with your /path/to/the/model\n", 203 " n_gpu_layers=-1,\n", 204 " tokenizer=llama_cpp.llama_tokenizer.LlamaHFTokenizer.from_pretrained(\n", 205 " \"NousResearch/Hermes-2-Pro-Llama-3-8B\"\n", 206 " ),\n", 207 " 
use_mlock=True,\n", 208 " flash_attn=True,\n", 209 " verbose=False\n", 210 ")\n", 211 "model = models.LlamaCpp(llm)" 212 ] 213 }, 214 { 215 "cell_type": "code", 216 "execution_count": 6, 217 "id": "25e4dac2-e2c4-4b80-89fa-b5c538660b28", 218 "metadata": { 219 "execution": { 220 "iopub.execute_input": "2024-07-13T17:57:19.853544Z", 221 "iopub.status.busy": "2024-07-13T17:57:19.853186Z", 222 "iopub.status.idle": "2024-07-13T17:57:19.857034Z", 223 "shell.execute_reply": "2024-07-13T17:57:19.856308Z", 224 "shell.execute_reply.started": "2024-07-13T17:57:19.853526Z" 225 } 226 }, 227 "outputs": [], 228 "source": [ 229 "import warnings\n", 230 "warnings.filterwarnings(\"ignore\", category=RuntimeWarning) # ignore runtime warnings" 231 ] 232 }, 233 { 234 "cell_type": "code", 235 "execution_count": 7, 236 "id": "4c1ad604-ddd1-4dc1-a6cf-3c8a41a57553", 237 "metadata": { 238 "execution": { 239 "iopub.execute_input": "2024-07-13T17:57:19.857699Z", 240 "iopub.status.busy": "2024-07-13T17:57:19.857559Z", 241 "iopub.status.idle": "2024-07-13T17:57:22.652983Z", 242 "shell.execute_reply": "2024-07-13T17:57:22.652084Z", 243 "shell.execute_reply.started": "2024-07-13T17:57:19.857686Z" 244 } 245 }, 246 "outputs": [], 247 "source": [ 248 "generator = generate.json(model, Users)\n", 249 "response = generator(\"Create 5 fake users\", max_tokens=1024, temperature=0, seed=42)" 250 ] 251 }, 252 { 253 "cell_type": "code", 254 "execution_count": 8, 255 "id": "8774e605-6b1a-4eb8-b20e-6bfdbecf27df", 256 "metadata": { 257 "execution": { 258 "iopub.execute_input": "2024-07-13T17:57:22.654065Z", 259 "iopub.status.busy": "2024-07-13T17:57:22.653871Z", 260 "iopub.status.idle": "2024-07-13T17:57:22.661043Z", 261 "shell.execute_reply": "2024-07-13T17:57:22.660505Z", 262 "shell.execute_reply.started": "2024-07-13T17:57:22.654048Z" 263 } 264 }, 265 "outputs": [ 266 { 267 "data": { 268 "text/plain": [ 269 "[UserDetail(id=1, first_name='John', last_name='Doe', age=25),\n", 270 " UserDetail(id=2, 
first_name='Jane', last_name='Doe', age=30),\n", 271 " UserDetail(id=3, first_name='Bob', last_name='Smith', age=40),\n", 272 " UserDetail(id=4, first_name='Alice', last_name='Smith', age=35),\n", 273 " UserDetail(id=5, first_name='John', last_name='Smith', age=20)]" 274 ] 275 }, 276 "execution_count": 8, 277 "metadata": {}, 278 "output_type": "execute_result" 279 } 280 ], 281 "source": [ 282 "response.users" 283 ] 284 }, 285 { 286 "cell_type": "code", 287 "execution_count": 9, 288 "id": "81d3e795-1d52-4b51-8010-0c9ca093d903", 289 "metadata": { 290 "execution": { 291 "iopub.execute_input": "2024-07-13T17:57:22.662139Z", 292 "iopub.status.busy": "2024-07-13T17:57:22.661708Z", 293 "iopub.status.idle": "2024-07-13T17:57:22.687663Z", 294 "shell.execute_reply": "2024-07-13T17:57:22.686377Z", 295 "shell.execute_reply.started": "2024-07-13T17:57:22.662120Z" 296 } 297 }, 298 "outputs": [ 299 { 300 "name": "stdout", 301 "output_type": "stream", 302 "text": [ 303 "John\n", 304 "Doe\n", 305 "25\n", 306 "\n", 307 "Jane\n", 308 "Doe\n", 309 "30\n", 310 "\n", 311 "Bob\n", 312 "Smith\n", 313 "40\n", 314 "\n", 315 "Alice\n", 316 "Smith\n", 317 "35\n", 318 "\n", 319 "John\n", 320 "Smith\n", 321 "20\n", 322 "\n" 323 ] 324 } 325 ], 326 "source": [ 327 "for user in response.users:\n", 328 " print(user.first_name)\n", 329 " print(user.last_name)\n", 330 " print(user.age)\n", 331 " print()" 332 ] 333 }, 334 { 335 "cell_type": "markdown", 336 "id": "6caf5758-a047-45ca-b0ca-52e5173e8456", 337 "metadata": {}, 338 "source": [ 339 "### QA with Citations" 340 ] 341 }, 342 { 343 "cell_type": "markdown", 344 "id": "cb92a6ea-f10c-4c5d-b56b-08189fea42be", 345 "metadata": {}, 346 "source": [ 347 "#### Define Pydantic class" 348 ] 349 }, 350 { 351 "cell_type": "code", 352 "execution_count": 10, 353 "id": "f743b0c2-5690-4c9b-82be-118044577d5c", 354 "metadata": { 355 "execution": { 356 "iopub.execute_input": "2024-07-13T17:57:22.689614Z", 357 "iopub.status.busy": "2024-07-13T17:57:22.689161Z", 358 
"iopub.status.idle": "2024-07-13T17:57:22.712493Z", 359 "shell.execute_reply": "2024-07-13T17:57:22.711490Z", 360 "shell.execute_reply.started": "2024-07-13T17:57:22.689570Z" 361 } 362 }, 363 "outputs": [ 364 { 365 "data": { 366 "text/plain": [ 367 "{'properties': {'question': {'title': 'Question', 'type': 'string'},\n", 368 " 'answer': {'title': 'Answer', 'type': 'string'},\n", 369 " 'citations': {'items': {'type': 'string'},\n", 370 " 'title': 'Citations',\n", 371 " 'type': 'array'}},\n", 372 " 'required': ['question', 'answer', 'citations'],\n", 373 " 'title': 'QuestionAnswer',\n", 374 " 'type': 'object'}" 375 ] 376 }, 377 "execution_count": 10, 378 "metadata": {}, 379 "output_type": "execute_result" 380 } 381 ], 382 "source": [ 383 "from typing import List\n", 384 "\n", 385 "from pydantic import BaseModel\n", 386 "\n", 387 "\n", 388 "class QuestionAnswer(BaseModel):\n", 389 " question: str\n", 390 " answer: str\n", 391 " citations: List[str]\n", 392 "\n", 393 "schema = QuestionAnswer.model_json_schema()\n", 394 "schema" 395 ] 396 }, 397 { 398 "cell_type": "markdown", 399 "id": "9df7357d-6631-49d2-be43-b17bc0435e21", 400 "metadata": {}, 401 "source": [ 402 "#### Create function to generate final prompt" 403 ] 404 }, 405 { 406 "cell_type": "code", 407 "execution_count": 11, 408 "id": "746f0e22-a6a5-4925-9f97-19abde8c79ee", 409 "metadata": { 410 "execution": { 411 "iopub.execute_input": "2024-07-13T17:57:22.714215Z", 412 "iopub.status.busy": "2024-07-13T17:57:22.713820Z", 413 "iopub.status.idle": "2024-07-13T17:57:22.728386Z", 414 "shell.execute_reply": "2024-07-13T17:57:22.727263Z", 415 "shell.execute_reply.started": "2024-07-13T17:57:22.714177Z" 416 } 417 }, 418 "outputs": [], 419 "source": [ 420 "def my_final_prompt(question, context):\n", 421 " return (\n", 422 " \"<|im_start|>system\\n\"\n", 423 " \"You are a world class AI model who answers questions in JSON with correct and exact citations \"\n", 424 " \"extracted from the `Context`. 
\"\n", 425 " f\"Here's the json schema you must adhere to:\\n<schema>\\n{schema}\\n</schema><|im_end|>\\n\"\n", 426 " \"<|im_start|>user\\n\"\n", 427 " + \"`Context`: \"\n", 428 " + context\n", 429 " + \"\\n`Question`: \"\n", 430 " + question + \"<|im_end|>\"\n", 431 " + \"\\n<|im_start|>assistant\\n\"\n", 432 " \"<schema>\"\n", 433 " )" 434 ] 435 }, 436 { 437 "cell_type": "code", 438 "execution_count": 12, 439 "id": "ec7caf8e-635a-4568-8a96-ecfda57addfb", 440 "metadata": { 441 "execution": { 442 "iopub.execute_input": "2024-07-13T17:57:22.730390Z", 443 "iopub.status.busy": "2024-07-13T17:57:22.729934Z", 444 "iopub.status.idle": "2024-07-13T17:57:22.745117Z", 445 "shell.execute_reply": "2024-07-13T17:57:22.743828Z", 446 "shell.execute_reply.started": "2024-07-13T17:57:22.730347Z" 447 } 448 }, 449 "outputs": [], 450 "source": [ 451 "question = \"What did the author do during college?\"\n", 452 "context = \"\"\"\n", 453 "My name is Jason Liu, and I grew up in Toronto Canada but I was born in China.\n", 454 "I went to an arts high school but in university I studied Computational Mathematics and physics.\n", 455 "As part of coop I worked at many companies including Stitchfix, Facebook.\n", 456 "I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.\n", 457 "\"\"\"" 458 ] 459 }, 460 { 461 "cell_type": "code", 462 "execution_count": 13, 463 "id": "7324eb3c-bf00-42c9-80e8-171019bd14cf", 464 "metadata": { 465 "execution": { 466 "iopub.execute_input": "2024-07-13T17:57:22.747463Z", 467 "iopub.status.busy": "2024-07-13T17:57:22.746635Z", 468 "iopub.status.idle": "2024-07-13T17:57:22.769163Z", 469 "shell.execute_reply": "2024-07-13T17:57:22.767814Z", 470 "shell.execute_reply.started": "2024-07-13T17:57:22.747414Z" 471 } 472 }, 473 "outputs": [ 474 { 475 "name": "stdout", 476 "output_type": "stream", 477 "text": [ 478 "<|im_start|>system\n", 479 "You are a world class AI model who answers questions in JSON with 
correct and exact citations extracted from the `Context`. Here's the json schema you must adhere to:\n", 480 "<schema>\n", 481 "{'properties': {'question': {'title': 'Question', 'type': 'string'}, 'answer': {'title': 'Answer', 'type': 'string'}, 'citations': {'items': {'type': 'string'}, 'title': 'Citations', 'type': 'array'}}, 'required': ['question', 'answer', 'citations'], 'title': 'QuestionAnswer', 'type': 'object'}\n", 482 "</schema><|im_end|>\n", 483 "<|im_start|>user\n", 484 "`Context`: \n", 485 "My name is Jason Liu, and I grew up in Toronto Canada but I was born in China.\n", 486 "I went to an arts high school but in university I studied Computational Mathematics and physics.\n", 487 "As part of coop I worked at many companies including Stitchfix, Facebook.\n", 488 "I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.\n", 489 "\n", 490 "`Question`: What did the author do during college?<|im_end|>\n", 491 "<|im_start|>assistant\n", 492 "<schema>\n" 493 ] 494 } 495 ], 496 "source": [ 497 "print(my_final_prompt(question, context))" 498 ] 499 }, 500 { 501 "cell_type": "code", 502 "execution_count": 14, 503 "id": "9c87dbe9-e7f7-4496-a5fd-1df1d66e155b", 504 "metadata": { 505 "execution": { 506 "iopub.execute_input": "2024-07-13T17:57:22.771242Z", 507 "iopub.status.busy": "2024-07-13T17:57:22.770727Z", 508 "iopub.status.idle": "2024-07-13T17:57:23.002395Z", 509 "shell.execute_reply": "2024-07-13T17:57:23.001084Z", 510 "shell.execute_reply.started": "2024-07-13T17:57:22.771195Z" 511 } 512 }, 513 "outputs": [], 514 "source": [ 515 "from outlines import generate, models\n", 516 "\n", 517 "model = models.LlamaCpp(llm)\n", 518 "generator = generate.json(model, QuestionAnswer)" 519 ] 520 }, 521 { 522 "cell_type": "code", 523 "execution_count": 15, 524 "id": "e1a2de78-c638-4a42-a2f4-3ff25191cb5e", 525 "metadata": { 526 "execution": { 527 "iopub.execute_input": "2024-07-13T17:57:23.003436Z", 528
"iopub.status.busy": "2024-07-13T17:57:23.003244Z", 529 "iopub.status.idle": "2024-07-13T17:57:26.021910Z", 530 "shell.execute_reply": "2024-07-13T17:57:26.021061Z", 531 "shell.execute_reply.started": "2024-07-13T17:57:23.003420Z" 532 }, 533 "scrolled": true 534 }, 535 "outputs": [], 536 "source": [ 537 "answer = generator(my_final_prompt(question, context), max_tokens=1024, temperature=0, seed=42)" 538 ] 539 }, 540 { 541 "cell_type": "code", 542 "execution_count": 16, 543 "id": "ce7c29c8-2720-480c-a009-98713c9892a1", 544 "metadata": { 545 "execution": { 546 "iopub.execute_input": "2024-07-13T17:57:26.022960Z", 547 "iopub.status.busy": "2024-07-13T17:57:26.022769Z", 548 "iopub.status.idle": "2024-07-13T17:57:26.027693Z", 549 "shell.execute_reply": "2024-07-13T17:57:26.027020Z", 550 "shell.execute_reply.started": "2024-07-13T17:57:26.022943Z" 551 } 552 }, 553 "outputs": [ 554 { 555 "data": { 556 "text/plain": [ 557 "QuestionAnswer(question='What did Jason Liu do during college?', answer='During college, Jason Liu studied Computational Mathematics and Physics. 
He also worked at companies such as Stitchfix and Facebook, and started the Data Science club at the University of Waterloo, serving as its president for two years.', citations=['I went to an arts high school but in university I studied Computational Mathematics and physics.', 'As part of coop I worked at many companies including Stitchfix, Facebook.', 'I also started the Data Science club at the University of Waterloo and I was the president of the club for 2 years.'])" 558 ] 559 }, 560 "execution_count": 16, 561 "metadata": {}, 562 "output_type": "execute_result" 563 } 564 ], 565 "source": [ 566 "answer" 567 ] 568 }, 569 { 570 "cell_type": "code", 571 "execution_count": 17, 572 "id": "bd061544-76e4-4b0e-b06c-23775bccf1d2", 573 "metadata": { 574 "execution": { 575 "iopub.execute_input": "2024-07-13T17:57:26.029655Z", 576 "iopub.status.busy": "2024-07-13T17:57:26.029019Z", 577 "iopub.status.idle": "2024-07-13T17:57:26.055293Z", 578 "shell.execute_reply": "2024-07-13T17:57:26.054042Z", 579 "shell.execute_reply.started": "2024-07-13T17:57:26.029616Z" 580 } 581 }, 582 "outputs": [], 583 "source": [ 584 "question1 = \"Where was John born?\"\n", 585 "context1 = \"\"\"\n", 586 "John Doe is a software engineer who was born in New York, USA. \n", 587 "He studied Computer Science at the Massachusetts Institute of Technology. \n", 588 "During his studies, he interned at Google and Microsoft. \n", 589 "He also founded the Artificial Intelligence club at his university and served as its president for three years.\n", 590 "\"\"\"\n", 591 "\n", 592 "\n", 593 "question2 = \"What did Emily study in university?\"\n", 594 "context2 = \"\"\"\n", 595 "Emily Smith is a data scientist from London, England. \n", 596 "She attended the University of Cambridge where she studied Statistics and Machine Learning. \n", 597 "She interned at IBM and Amazon during her summer breaks. 
\n", 598 "Emily was also the head of the Women in Tech society at her university.\n", 599 "\"\"\"\n", 600 "\n", 601 "question3 = \"Which companies did Robert intern at?\"\n", 602 "context3 = \"\"\"\n", 603 "Robert Johnson, originally from Sydney, Australia, is a renowned cybersecurity expert. \n", 604 "He studied Information Systems at the University of Melbourne. \n", 605 "Robert interned at several cybersecurity firms including NortonLifeLock and McAfee. \n", 606 "He was also the leader of the Cybersecurity club at his university.\n", 607 "\"\"\"\n", 608 "\n", 609 "\n", 610 "question4 = \"What club did Alice start at her university?\"\n", 611 "context4 = \"\"\"\n", 612 "Alice Williams, a native of Dublin, Ireland, is a successful web developer. \n", 613 "She studied Software Engineering at Trinity College Dublin. \n", 614 "Alice interned at several tech companies including Shopify and Squarespace. \n", 615 "She started the Web Development club at her university and was its president for two years.\n", 616 "\"\"\"\n", 617 "\n", 618 "\n", 619 "question5 = \"What did Michael study in high school?\"\n", 620 "context5 = \"\"\"\n", 621 "Michael Brown is a game developer from Tokyo, Japan. \n", 622 "He attended a specialized high school where he studied Game Design. \n", 623 "He later attended the University of Tokyo where he studied Computer Science. \n", 624 "Michael interned at Sony and Nintendo during his university years. 
\n", 625 "He also started the Game Developers club at his university.\n", 626 "\"\"\"" 627 ] 628 }, 629 { 630 "cell_type": "code", 631 "execution_count": 18, 632 "id": "71990857-0e45-4ef9-92c9-ad688ae2d9d5", 633 "metadata": { 634 "execution": { 635 "iopub.execute_input": "2024-07-13T17:57:26.057195Z", 636 "iopub.status.busy": "2024-07-13T17:57:26.056750Z", 637 "iopub.status.idle": "2024-07-13T17:57:33.586142Z", 638 "shell.execute_reply": "2024-07-13T17:57:33.585021Z", 639 "shell.execute_reply.started": "2024-07-13T17:57:26.057152Z" 640 } 641 }, 642 "outputs": [ 643 { 644 "data": { 645 "text/plain": [ 646 "'Where was John born?'" 647 ] 648 }, 649 "metadata": {}, 650 "output_type": "display_data" 651 }, 652 { 653 "data": { 654 "text/plain": [ 655 "'John Doe was born in New York, USA.'" 656 ] 657 }, 658 "metadata": {}, 659 "output_type": "display_data" 660 }, 661 { 662 "data": { 663 "text/plain": [ 664 "['John Doe is a software engineer who was born in New York, USA.']" 665 ] 666 }, 667 "metadata": {}, 668 "output_type": "display_data" 669 }, 670 { 671 "name": "stdout", 672 "output_type": "stream", 673 "text": [ 674 "\n", 675 "\n", 676 "\n" 677 ] 678 }, 679 { 680 "data": { 681 "text/plain": [ 682 "'What did Emily study in university?'" 683 ] 684 }, 685 "metadata": {}, 686 "output_type": "display_data" 687 }, 688 { 689 "data": { 690 "text/plain": [ 691 "'Emily studied Statistics and Machine Learning in university.'" 692 ] 693 }, 694 "metadata": {}, 695 "output_type": "display_data" 696 }, 697 { 698 "data": { 699 "text/plain": [ 700 "['She attended the University of Cambridge where she studied Statistics and Machine Learning.']" 701 ] 702 }, 703 "metadata": {}, 704 "output_type": "display_data" 705 }, 706 { 707 "name": "stdout", 708 "output_type": "stream", 709 "text": [ 710 "\n", 711 "\n", 712 "\n" 713 ] 714 }, 715 { 716 "data": { 717 "text/plain": [ 718 "'Which companies did Robert intern at?'" 719 ] 720 }, 721 "metadata": {}, 722 "output_type": "display_data" 723 }, 
724 { 725 "data": { 726 "text/plain": [ 727 "'Robert interned at NortonLifeLock and McAfee.'" 728 ] 729 }, 730 "metadata": {}, 731 "output_type": "display_data" 732 }, 733 { 734 "data": { 735 "text/plain": [ 736 "['Robert Johnson, originally from Sydney, Australia, is a renowned cybersecurity expert. He interned at several cybersecurity firms including NortonLifeLock and McAfee.']" 737 ] 738 }, 739 "metadata": {}, 740 "output_type": "display_data" 741 }, 742 { 743 "name": "stdout", 744 "output_type": "stream", 745 "text": [ 746 "\n", 747 "\n", 748 "\n" 749 ] 750 }, 751 { 752 "data": { 753 "text/plain": [ 754 "'What club did Alice start at her university?'" 755 ] 756 }, 757 "metadata": {}, 758 "output_type": "display_data" 759 }, 760 { 761 "data": { 762 "text/plain": [ 763 "'Alice started the Web Development club at her university.'" 764 ] 765 }, 766 "metadata": {}, 767 "output_type": "display_data" 768 }, 769 { 770 "data": { 771 "text/plain": [ 772 "['Alice Williams, a native of Dublin, Ireland, is a successful web developer. She started the Web Development club at her university and was its president for two years.']" 773 ] 774 }, 775 "metadata": {}, 776 "output_type": "display_data" 777 }, 778 { 779 "name": "stdout", 780 "output_type": "stream", 781 "text": [ 782 "\n", 783 "\n", 784 "\n" 785 ] 786 }, 787 { 788 "data": { 789 "text/plain": [ 790 "'What did Michael study in high school?'" 791 ] 792 }, 793 "metadata": {}, 794 "output_type": "display_data" 795 }, 796 { 797 "data": { 798 "text/plain": [ 799 "'Michael studied Game Design in high school.'" 800 ] 801 }, 802 "metadata": {}, 803 "output_type": "display_data" 804 }, 805 { 806 "data": { 807 "text/plain": [ 808 "['Michael Brown is a game developer from Tokyo, Japan. 
He attended a specialized high school where he studied Game Design.']" 809 ] 810 }, 811 "metadata": {}, 812 "output_type": "display_data" 813 }, 814 { 815 "name": "stdout", 816 "output_type": "stream", 817 "text": [ 818 "\n", 819 "\n", 820 "\n" 821 ] 822 } 823 ], 824 "source": [ 825 "generator = generate.json(model, QuestionAnswer)  # build once, reuse for every question\n", 826 "for question, context in [\n", 827 " (question1, context1),\n", 828 " (question2, context2),\n", 829 " (question3, context3),\n", 830 " (question4, context4),\n", 831 " (question5, context5),\n", 832 "]:\n", 833 " final_prompt = my_final_prompt(question, context)\n", 834 " response = generator(final_prompt, max_tokens=1024, temperature=0, seed=42)\n", 835 " display(question)\n", 836 " display(response.answer)\n", 837 " display(response.citations)\n", 838 " print(\"\\n\\n\")" 839 ] 840 } 841 ], 842 "metadata": { 843 "kernelspec": { 844 "display_name": "Python 3 (ipykernel)", 845 "language": "python", 846 "name": "python3" 847 }, 848 "language_info": { 849 "codemirror_mode": { 850 "name": "ipython", 851 "version": 3 852 }, 853 "file_extension": ".py", 854 "mimetype": "text/x-python", 855 "name": "python", 856 "nbconvert_exporter": "python", 857 "pygments_lexer": "ipython3", 858 "version": "3.10.12" 859 } 860 }, 861 "nbformat": 4, 862 "nbformat_minor": 5 863 }