Hi @johnwiggity,
Yes, you can add custom fields to the schema, but the contents of the PDF itself will not be searched. To make a PDF searchable, it needs to be parsed and its content added to the new field in the schema.
You may need to use a PHP library such as https://github.com/smalot/pdfparser for that.
Ok thanks. Would any other file type be easier to work with? I’m trying to add my VTT caption files that are used for my videos in the search.
Hi John,
In theory you can read the VTT file directly, process the text, and store it in a custom field. Then, when sending the data, read the file and save its text in your schema in something like a “captions” field.
So you could upload the VTT file as it is and have it read.
Of course, this is all theoretical, but I see no reason why the VTT file can’t be read with something like the following:
<?php
// Path to your VTT file
$vttFilePath = 'path/to/your/file.vtt';
// Read the VTT file and stop early if it cannot be read
$fileContent = file_get_contents($vttFilePath);
if ($fileContent === false) {
    exit("Could not read $vttFilePath\n");
}
// Split the content into individual lines
$lines = explode("\n", $fileContent);
// Loop through each line
foreach ($lines as $line) {
    // Remove any leading/trailing white space (including a trailing "\r")
    $line = trim($line);
    // Ignore empty lines, lines starting with "WEBVTT", and timestamp lines
    // (typically in the format '00:00:00.000 --> 00:00:00.000').
    // Note: numeric cue identifiers, if your file has them, are still output.
    if ($line !== '' && strpos($line, 'WEBVTT') !== 0 && strpos($line, '-->') === false) {
        // Output or store the text from the VTT file
        echo $line . "\n";
    }
}
?>
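If you want the captions as one string to put in that “captions” field instead of echoing them, a rough sketch could look like the one below. The function name vtt_to_caption_text is made up for this example, and it assumes the usual VTT layout: blocks separated by blank lines, with the caption text following the timestamp line. It also uses strip_tags to drop voice tags such as <v John>, which is an assumption about how clean you want the text; untested against your files, so treat it as a starting point.

```php
<?php
// Hypothetical helper (not part of any plugin): turn raw VTT content
// into a single plain-text string suitable for a custom search field.
function vtt_to_caption_text(string $vtt): string
{
    // Drop a UTF-8 byte order mark and normalise line endings
    $vtt = preg_replace('/^\xEF\xBB\xBF/', '', $vtt);
    $vtt = str_replace(["\r\n", "\r"], "\n", $vtt);

    $text = [];
    // Blocks in a VTT file are separated by one or more blank lines
    foreach (preg_split('/\n\s*\n/', trim($vtt)) as $block) {
        $lines = explode("\n", $block);
        // Find the timestamp line; blocks without one (header, NOTE, STYLE) are skipped
        $timing = null;
        foreach ($lines as $i => $line) {
            if (strpos($line, '-->') !== false) {
                $timing = $i;
                break;
            }
        }
        if ($timing === null) {
            continue;
        }
        // Everything after the timestamp line is caption text.
        // Anything before it (a cue identifier such as "1") is ignored.
        foreach (array_slice($lines, $timing + 1) as $line) {
            $line = trim(strip_tags($line));
            if ($line !== '') {
                $text[] = $line;
            }
        }
    }
    // Join all caption lines into one string
    return implode(' ', $text);
}

// Example with a small made-up VTT snippet
$sample = "WEBVTT\n\nNOTE made by hand\n\n1\n00:00:01.000 --> 00:00:04.000\n<v John>Hello and welcome.\n\n2\n00:00:04.500 --> 00:00:06.000\nLet's get started.\n";
echo vtt_to_caption_text($sample) . "\n"; // Hello and welcome. Let's get started.
```

For a real file you would pass in the result of file_get_contents and save the returned string in whatever custom field you add to the schema.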
Ok thanks a lot! I will ask my developer to see if we can make this work.