Looking at the PHP Code to Make Your Data Chat-able with Laravel, OpenAI API

Alfred Nutile
8 min read · Jun 15, 2023


You can read the first part here; this one will 🤿 into the code.

tl;dr

In this article, we’ll look at how you can use Laravel and the OpenAI API to make your documents chat-able using embeddings. Although we’re using LaraChain in this example, the code provided should work in any Laravel application.

Starting from the Request

Let’s examine what occurs when a user poses a question through the UI and trace the code from there. This approach is how I typically follow logic, particularly when debugging an issue from the outside in.

The web.php file isn’t particularly noteworthy in this case, so we’ll bypass that and proceed directly to the ProjectController. Yes, this is a busy controller. I am confident it will be refactored soon.

See the code here

$outbound = $project->outbounds()
    ->whereType(OutboundEnum::ChatUi->value)
    ->first();

/** @var Outbound $outbound */
$outbound->run(
    auth()->user(),
    $validated['question'],
    Filters::from($validated['filters'])
);

return response()->json([], 200);

This simply hands off the request and returns a temporary response right away, so the user isn’t left waiting on a blocking request. Instead, results are pushed back to the UI via Websockets. We’ll discuss more on this shortly.
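If you are wiring this up yourself without LaraChain, the simplest way to get that non-blocking behavior is to push the work onto a queued job. Here is a minimal sketch of that idea; HandleChatRequest is a hypothetical class, not LaraChain’s actual Outbound code:

<?php

namespace App\Jobs;

use App\Models\Project;
use App\Models\User;
use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class HandleChatRequest implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(
        public Project $project,
        public User $user,
        public string $question,
    ) {
    }

    public function handle(): void
    {
        // 1. Embed the question
        // 2. Query the vector store for context
        // 3. Build the prompt and stream the chat completion
        // 4. Broadcast the reply back to the UI via Websockets
    }
}

// In the controller: dispatch and return right away.
// HandleChatRequest::dispatch($project, auth()->user(), $validated['question']);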

Note: Your code can be a lot simpler. LaraChain tries to be chainable, passing a Data Object from one attached link to the next, which is overkill for most projects.

There is also a Message model. This is because I aim to create a chat experience, so we need some history of the user’s messages and some context for the chat API.


return new Message([
    'role' => 'user',
    'content' => $request,
    'user_id' => $user->id,
    'project_id' => $this->project_id,
    'embedding' => [], // basically null for now
]);

NOTE: The Message model is key, and I will talk more about the “role” column shortly.

Therefore, I pass that message into the first part of this “chain”. In a simple chain, the chat request from the user has to perform a few tasks. First, it needs to be “vectorized” and turned into something we can store as an embedding on the Message model. Then we need to search the vector database (Postgres with the pgvector extension set up) and use those results to provide some potential context for the chat API.
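For the “vectorized” step, the openai-php/laravel client (more on it below) exposes an embeddings endpoint. A rough sketch of that call, assuming $message is the Message model created above and text-embedding-ada-002 as the model:

use OpenAI\Laravel\Facades\OpenAI;

// Turn the user's question into an embedding we can store on the Message
// and reuse for the pgvector distance query below.
$result = OpenAI::embeddings()->create([
    'model' => 'text-embedding-ada-002',
    'input' => $message->content,
]);

$message->embedding = $result->embeddings[0]->embedding; // array of floats
$message->save();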

Here is the vector search code:


$query = DocumentChunk::query()
    ->join('documents', 'documents.id', '=', 'document_chunks.document_id')
    ->join('sources', 'sources.id', '=', 'documents.source_id')
    ->selectRaw(
        'document_chunks.embedding <-> ? as distance, document_chunks.content, document_chunks.embedding as embedding, document_chunks.id as id',
        [$this->response_dto->message->embedding]
    )
    ->where('sources.project_id', $this->project->id)
    ->when(! empty($this->response_dto->filters->getSources()), function ($query) {
        $query->whereIn('sources.id', $this->response_dto->filters->getSources());
    })
    ->limit($this->limit)
    ->orderByRaw('distance');

$results = $query->get();

But that might return a lot of data if my chunks are not small enough, so we trim it down a little. I use the library https://github.com/alnutile/larachain-trim-text to reduce the text as much as possible while keeping it good enough for the LLM to read.

NOTE: Token limits are key. I will talk more about tokens later, and the limits are changing. Read more here.

After that, I combine all the content into one final “Prompt” using this library: https://github.com/alnutile/larachain/wiki/Prompt-Template

With this we can make a simple prompt:

The user seeks information about recipes from our database, which is stored in CSV format. 
The context provided below includes data related to their search.
Please answer their question using only this context, and format
your response for easy readability.

Context: {context}
\n
\n
###
\n
\n

With this we can ask the system a question, giving it context from the results above by putting them into the {context} area of the prompt.
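Under the hood, that substitution can be as simple as a string replace. A minimal sketch, where $template is the prompt text above and $results are the trimmed chunks from the vector query (this is not the prompt-template library’s exact API):

// Join the trimmed chunks and drop them into the {context} slot.
$context = $results->pluck('content')->implode("\n\n");

$prompt = str_replace('{context}', $context, $template);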

All of this is trimmed as much as possible so that we leave enough “token” space for the response to fit. If we do not leave enough tokens, the response might be just one word, for example.

The default length is fixed at 2048 tokens, while the maximum can be set at 4096 tokens.

There is no simple answer to how many characters make a token. Using ChatGPT, I made this library to help count tokens: https://github.com/alnutile/larachain-token-count but again, it is a rough guess.
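If you just need a ballpark before calling the API, a common rule of thumb for English text is roughly four characters per token. A hedged sketch of that estimate; estimateTokens is a hypothetical helper, and a real tokenizer (like the package above) is more accurate:

// Very rough heuristic: ~4 characters per token for English text.
function estimateTokens(string $text): int
{
    return (int) ceil(mb_strlen($text) / 4);
}

$remaining = 4096 - estimateTokens($prompt); // space left for the reply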

So we have our context from the vector search, and we have our question from the user. Both of these are in our Message table like this:

Message ID: 1
Role: User
Content: The question here
Embedding: We did this using the OpenAI Embedding API

Message ID: 2
Role: System
Content: This is the prompt above

Note that the sorting will not be by ID; the System message comes first, then the User messages.
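Turning those rows into the array the Chat API expects is then just ordering and mapping. A small sketch of how that could look; this is my guess at the shape, not LaraChain’s exact code, and it assumes the role column stores the lowercase role names:

// System message first, then the rest of the history in creation order.
$messages = Message::query()
    ->where('project_id', $project->id)
    ->orderByRaw("CASE WHEN role = 'system' THEN 0 ELSE 1 END")
    ->orderBy('id')
    ->get()
    ->map(fn (Message $message) => [
        'role' => $message->role,
        'content' => $message->content,
    ])
    ->values()
    ->toArray();

The end result is a plain array like this: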

$messages = [
    [
        'role' => 'system',
        'content' => 'You are an AI Historian assistant...',
    ],
    [
        'role' => 'user',
        'content' => "What other makers are around the time of O'Keeffe, Georgia?",
    ],
];

With that very simple array in hand, we send it to the API.

$stream = OpenAI::chat()->createStreamed([
    'model' => 'gpt-3.5-turbo',
    'temperature' => $this->temperature,
    'messages' => $messages,
]);

This uses the https://github.com/openai-php/laravel library.

But since it is a stream, I wait and cycle through the chunks until it is done.

try {
    $stream = OpenAI::chat()->createStreamed([
        'model' => 'gpt-3.5-turbo',
        'temperature' => $this->temperature,
        'messages' => $messages,
    ]);

    $count = 0;
    $reply = '';
    $data = [];

    foreach ($stream as $response) {
        $step = $response->choices[0]->toArray();
        $content = data_get($step, 'delta.content');

        $data[] = $content;
        $reply = $reply.$content;

        // Flush a batch of tokens to the UI every 25 chunks.
        if ($count >= 25) {
            ChatReplyEvent::dispatch($project, $user, $reply);
            $count = 0;
            $reply = '';
        } else {
            $count = $count + 1;
        }
    }

    // Send whatever is left once the stream ends.
    ChatReplyEvent::dispatch($project, $user, $reply);

    return implode('', $data);
} catch (\Exception $e) {
    logger('Error talking to api', [
        $e->getMessage(),
    ]);

    if ($tries > 2) {
        return 'Error with API try again later';
    } else {
        // try again code here

        return 'Trying again due to error';
    }
}

ChatReplyEvent::dispatch($project, $user, $reply) simply broadcasts the results to the UI; see the code here and here.

I wait for a count of 25 so that I can output a bit more in the UI at a time, but I am not sure it is a good idea since it does kind of mess up the formatting.
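For reference, ChatReplyEvent is just a standard Laravel broadcast event. A minimal sketch of what such a class can look like; the channel name here is an assumption, and the real class is linked above:

<?php

namespace App\Events;

use App\Models\Project;
use App\Models\User;
use Illuminate\Broadcasting\InteractsWithSockets;
use Illuminate\Broadcasting\PrivateChannel;
use Illuminate\Contracts\Broadcasting\ShouldBroadcast;
use Illuminate\Foundation\Events\Dispatchable;
use Illuminate\Queue\SerializesModels;

class ChatReplyEvent implements ShouldBroadcast
{
    use Dispatchable, InteractsWithSockets, SerializesModels;

    public function __construct(
        public Project $project,
        public User $user,
        public string $reply,
    ) {
    }

    public function broadcastOn(): PrivateChannel
    {
        // The UI listens on this channel and appends $reply as it arrives.
        return new PrivateChannel('projects.'.$this->project->id);
    }
}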

After this response is done, though, we have one more Message to create:

Message ID: 3
Role: assistant
Content: The full response

This then allows us to continue the chat, so now let’s look at what happens when the user asks the next related question. Like before, we vectorize the user’s question and query the database to see if there are any matches. And as before, we trim the text. But now, when we update the Message model, we will have this:

Message ID: 1
Role: User
Content: The question here

Message ID: 2
Role: System
Content: The content for the new question //NOTE this is updated

Message ID: 3
Role: Assistant
Content: The previous results from the chat system.

Message ID: 4
Role: User
Content: The users current question.

So we updated the System content, added the user question, and queried the vector database for some “context”. Then we have first the system message, then the user’s first question, then the assistant’s reply, and then the latest user question.

$messages = [
    [
        'role' => 'system',
        'content' => 'You are an AI Historian assistant...', // This is updated
    ],
    [
        'role' => 'user',
        'content' => "What other makers are around the time of O'Keeffe, Georgia?",
    ],
    [
        'role' => 'assistant',
        'content' => "The previous answer to above",
    ],
    [
        'role' => 'user',
        'content' => "The most current user question here",
    ],
];

And once again, when we get a response, we add a new Message to the table so we have more history for the “assistant” role.

One thing to keep in mind: the more data here, the less token space remains for the OpenAI API response. So at some point you might keep the System message but send only the latest one or two user and assistant messages, as in the sketch below.
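A simple way to do that trimming is to always keep the system message and only the last few user and assistant turns. A hedged sketch of that idea, working on the $messages array from above:

// Keep the system prompt plus the last few messages so the request
// leaves enough token headroom for the reply.
$system = collect($messages)->firstWhere('role', 'system');

$recent = collect($messages)
    ->reject(fn ($message) => $message['role'] === 'system')
    ->take(-4); // the last two user/assistant exchanges

$messages = collect([$system])
    ->filter()
    ->merge($recent)
    ->values()
    ->toArray();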

Now let’s look at the database part of this

So now that we can see how the request pattern works, let’s follow up on how we got the data into the database and what database is being used.

Using this library https://github.com/pgvector/pgvector I enabled vector support in Postgres (this seems to be stronger than MySQL’s options).

For example, I used Forge https://forge.laravel.com to set up a Postgres database and ran this:

#!/bin/bash

# Define your database password and name here
DB_PASSWORD=""
DB_NAME=""

# Update package list
apt update
apt install -y postgresql-15-pgvector

export PGPASSWORD=$DB_PASSWORD && sudo -u postgres psql -d $DB_NAME -c "CREATE EXTENSION vector;"

echo "PostgreSQL and Vector installation complete"

NOTE: You will see that, thanks to Sail, I just run Postgres locally with that and run the rest of my Laravel app via Valet here.

As for the data you are working with, whether it is CSV, PDF or whatever, the end result is the same: take the content and break it into small chunks, say 200 tokens with overlap, into a table called document_chunks (for example). Those chunks represent the data.
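A rough sketch of that chunking step is below; it uses character counts instead of tokens to keep the example self-contained, and chunkText is a hypothetical helper:

/**
 * Split content into overlapping chunks. Character counts stand in
 * for tokens here; a token-aware splitter is better in practice.
 */
function chunkText(string $content, int $size = 800, int $overlap = 100): array
{
    $chunks = [];
    $length = mb_strlen($content);
    $step = max(1, $size - $overlap);

    for ($start = 0; $start < $length; $start += $step) {
        $chunks[] = mb_substr($content, $start, $size);
    }

    return $chunks;
}

// Each chunk becomes its own document_chunks row, saved in order.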

Let’s consider a PDF. In LaraChain, when you upload a PDF, it stores the name of the file in a table called “documents”. Then, when you run the “PDF Transformer”, it takes each page (soon this will be smaller, as I noted above) and breaks it into ordered “document_chunks”. Then each document_chunks row has its content “vectorized”. This simply means we send the data to the OpenAI embedding API and save the resulting embeddings to the table, row by row.

Here is a look at the code to vectorize the data https://github.com/alnutile/larachain/blob/main/app/Transformer/Types/EmbedTransformer.php

It calls the “ClientWrapper”

public function handle(): void
{
    /** @var EmbeddingsResponseDto $embeddings */
    $embeddings = ClientWrapper::getEmbedding($this->chunk->content);

    $this->chunk->embedding = $embeddings->embedding;
    $this->chunk->token_count = $embeddings->token_count;
    $this->chunk->save();
}

NOTE: I use a wrapper since I soon want this to work with any LLM.
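The wrapper itself can stay very thin. Here is a hedged sketch of what a getEmbedding method along those lines might look like; the DTO shape mirrors the fields used above but is my assumption, not the actual LaraChain code:

use OpenAI\Laravel\Facades\OpenAI;

class ClientWrapper
{
    public static function getEmbedding(string $content): EmbeddingsResponseDto
    {
        // Swap this body out to point at another LLM provider later.
        $response = OpenAI::embeddings()->create([
            'model' => 'text-embedding-ada-002',
            'input' => $content,
        ]);

        return new EmbeddingsResponseDto(
            embedding: $response->embeddings[0]->embedding,
            token_count: $response->usage->totalTokens,
        );
    }
}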

You can see the migration here

<?php

use App\Models\Project;
use Illuminate\Database\Migrations\Migration;
use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;

return new class extends Migration
{
    /**
     * Run the migrations.
     */
    public function up(): void
    {
        Schema::create('documents', function (Blueprint $table) {
            $table->id();
            $table->integer('token_count');
            $table->string('status')->default('pending'); // complete, running
            $table->string('type')->default('web_scrape'); // pdf_scrape
            $table->string('guid')->nullable();
            $table->foreignIdFor(Project::class);
            $table->json('meta_data')->nullable();
            $table->vector('embedding', 1536)->nullable();
            $table->timestamps();
        });
    }

    /**
     * Reverse the migrations.
     */
    public function down(): void
    {
        Schema::dropIfExists('documents');
    }
};

Later on I change $table->vector('embedding', 1536)->nullable(); to $table->vector('embedding', 2048)->nullable();

Ok that is about it.

Getting data out:

  1. Take the user’s question
  2. Vectorize it using the OpenAI Embedding API
  3. Then query your database for “like” content
  4. Trim it down so we can chat with the OpenAI Chat API
  5. Then save the questions, results, and context to the Message table so we have a history of the conversation.

Getting data in:

  1. Break up your data into small chunks of text
  2. Embed / Vectorize the data using the OpenAI Embedding API
  3. Save it to the embedding column in the chunks table
  4. Query against that.


Alfred Nutile

Creator of LaraLlama.io, an open-source product. Day-to-day developer using Laravel, PHP, Vue.js, and Inertia.