Using pgvector embedding search in Laravel
Hello everyone, I have run into a problem. Here is the function that performs the similarity search:
public function findEmbedding($document_path, $query_embedding): array {
$query = <<<EOT
SELECT embeddings.collection_id, embeddings.embedding, embeddings.document, embeddings.cmetadata, embeddings.custom_id, embeddings.uuid, embeddings.embedding <=> '%s'::vector AS distance
FROM embeddings JOIN embedding_collections ON embeddings.collection_id = embedding_collections.uuid
WHERE embeddings.collection_id = '{collection_id}'::UUID ORDER BY distance ASC
LIMIT 4
EOT;
$embedding_collections = DB::table('embedding_collections')->where('name', $document_path)->first();
$query = str_replace("%s", json_encode($query_embedding), $query);
$query = str_replace("{collection_id}", $embedding_collections->uuid, $query);
$records = DB::cursor($query);
$context = "";
$metadata = [];
foreach ($records as $record) {
$context .= $record->document;
$meta = json_decode($record->cmetadata);
$metadata[] = ['page' => $meta->page];
}
return ['context' => $context, 'metadata' => $metadata];
}
It works correctly on vector columns. Here is the migration for my table:
public function up(): void
{
Schema::create('embeddings', function (Blueprint $table) {
$table->uuid("uuid")->primary();
$table->uuid("collection_id");
$table->uuid("custom_id");
$table->json("cmetadata");
$table->text("document");
$table->vector('embedding')->nullable(); // Add the vector column length here
$table->timestamps();
});
DB::statement('ALTER TABLE embeddings ADD COLUMN embedding vector');
}
public function down(): void
{
Schema::dropIfExists('embeddings');
}
The problem is here: I cannot store the data in my database, specifically in the embedding column. Here is part of the code:
// Extract text from the specific page
$pages = $pdf->getPages();
$text = $pages[$page - 1]->getText();
$total_token = Tiktoken::count($text);
$total_token_embed += $total_token;
$vectors = $documentRepository->getQueryEmbedding($text);
// Create a Vector object for pgvector
// $embedding = new Vector($vectors);
Log::warning("vector data: $vectors");
Embedding::create([
"collection_id" => $collection->uuid,
"document" => $text,
"embedding" => $vectors,
"cmetadata" => json_encode([
"total_token" => $total_token,
"page" => $page,
"path" => $this->document->path,
"title" => $this->document->title
])
]);
I get this error in the Laravel log:
[2024-07-11 11:49:58] local.INFO: SQLSTATE[22P02]: Invalid text representation: 7 ERREUR: enregistrement litéral invalide : « [-0.030853271484375,0.032867431640625,0.02447509765625,0.0289306640625,0.045806884765625,0.037628173828125,0.04974365234375,,-0.044830322265625,-0.0279541015625,0.0143585205078125,-0.016510009765625,0.06842041015625,0.0174560546875,0.01617431640625,-0.01189422607421875,0.0001742839813232422,0.032684326171875,0.00904083251953125,-0.006702423095703125,-0.0212554931640625,0.02020263671875,0.007717132568359375,-0.0214996337890625,0.0015468597412109375,0.006137847900390625] » DETAIL: Parenthèse gauche manquante. CONTEXT: paramètre de portail non nommé $3 = '...' (Connection: pgsql, SQL: insert into "embeddings" ("collection_id", "document", "embedding", "cmetadata", "uuid", "custom_id") values (0c87e881-79da-43d1-af58-91096b40ecd7, What is this Promotion about? Promotion Period. , [-0.030853271484375,0.032867431640625,0.02447509765625,0.0289306640625,0.045806884765625,0.037628173828125,0.01224517822265625,-0.0304107666015625,-0.029083251953125,0.0008301734924316406,0.0017004013061523438,-0.05047607421875,-0.03558349609375,0.006839752197265625,-0.029571533203125-0.0092010498046875,-0.043701171875,0.03692626953125,0.0037384033203125,0.04693603515625,0.0212554931640625,0.02020263671875,0.007717132568359375,-0.0214996337890625,0.0015468597412109375,0.006137847900390625], "{"total_token":666,"page":1,"path":"public\/documents\/DdBnUsT02Hm2SMeCDdiHJREywld4fNakzxi43MAA.pdf","title":"revolut_ltd_general_partner_promotion_terms_free_trial_1.1.0_1657622768_en.pdf"}", 7de7bf56-060a-4ea6-aa57-d857b75c1e45, 70ddc18c-2392-4077-b7a6-0262a0e95655))
(In English, the PostgreSQL part of the message reads: ERROR: malformed record literal; DETAIL: Missing left parenthesis; CONTEXT: unnamed portal parameter $3.)
Nothing is stored in the embedding column of my table. I have tried everything and would appreciate any help. Thank you in advance.
One clarification: the getQueryEmbedding function returns the data as an array, so that part is working.
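For reference, pgvector's text input format for a vector is a bracketed, comma-separated list of numbers such as `[0.1,-0.2,3]`. The following is a minimal sketch, not a confirmed fix, of a helper that builds such a literal from a flat PHP array and rejects anything that is not a finite number. The function name `toVectorLiteral` is hypothetical and is not part of Laravel or pgvector.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not part of Laravel or pgvector): convert a flat
 * array of numbers into pgvector's text format, e.g. "[0.5,-0.25,3]".
 */
function toVectorLiteral(array $values): string
{
    if ($values === []) {
        throw new InvalidArgumentException('Embedding must not be empty.');
    }

    $parts = [];
    foreach ($values as $i => $value) {
        // Reject nulls, strings, nested arrays and other non-numeric elements.
        if (!is_int($value) && !is_float($value)) {
            throw new InvalidArgumentException("Element $i is not a number.");
        }
        // NAN and INF cannot be represented in a vector literal.
        if (is_float($value) && !is_finite($value)) {
            throw new InvalidArgumentException("Element $i is not finite.");
        }
        // json_encode yields a locale-independent numeric representation.
        $parts[] = json_encode($value);
    }

    return '[' . implode(',', $parts) . ']';
}

// Example: prints [0.5,-0.25,3]
echo toVectorLiteral([0.5, -0.25, 3]), PHP_EOL;
```

Under the assumption that the `embedding` column really is of pgvector's `vector` type, the result could be passed as `"embedding" => toVectorLiteral($vectors)` in the `Embedding::create([...])` call. Whether that resolves the "malformed record literal" error depends on how the column was actually created, which the post does not show conclusively. If the pgvector-php package is installed, its documented Laravel model cast may be an alternative to building the string by hand.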