Search API and Apache Tika

With Drupal 8, you can search for words inside the documents you upload.
Seyfettin Kahveci
20 min. read

By integrating the following modules with Drupal 8, you can search for words in the files you upload.

You can install the related modules as follows:

1. Search API Setup

Before you can install the module, you must download its files. There are two common ways to do this: with Composer, or by uploading the files manually over FTP. Composer is the recommended method: it downloads the module's dependencies automatically, which saves time, and it lets you update modules later with a single command.

Download Search API with Composer

The following Composer command downloads the Search API module to the /modules/contrib/ path.

composer require drupal/search_api

Download Module and Upload via FTP

Download the Drupal 8 version of the module from https://www.drupal.org/project/search_api, unzip it, and place it under the modules/ folder.

After downloading the relevant module to the server, it must be activated from the Modules section of the interface.

2. Search API Attachments Setup

Search API Attachments must be installed to integrate your search setup with Apache Tika.

Download Search API Attachments with Composer

Using Composer, you can download the module to /modules/contrib/ with the following command.

composer require drupal/search_api_attachments

Download Module and Upload via FTP

Download the Drupal 8 version of the module from https://www.drupal.org/project/search_api_attachments, unzip it, and place it under the modules/ folder.

After the Search API Attachments module has been downloaded to the server, it must be activated from the Modules section of the interface.
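If you prefer the command line to the admin interface, the modules can also be enabled with Drush. The helper below is only a sketch: it assumes Drush is installed at vendor/bin/drush (typical for Composer-based sites, but not stated in this article), and the function name enable_modules is made up for illustration.

```shell
# Enable one or more Drupal modules with Drush.
# Assumption: Drush lives at vendor/bin/drush; set DRUSH=/path/to/drush to override.
enable_modules() {
  drush="${DRUSH:-vendor/bin/drush}"
  if [ ! -x "$drush" ]; then
    echo "drush not found at $drush" >&2
    return 1
  fi
  # "en" is the Drush alias for enabling modules; -y answers prompts with yes.
  "$drush" en -y "$@"
}
```

From the Drupal project root you would call it as `enable_modules search_api search_api_attachments`. The same helper can later enable the Database Search module used in section 5, whose machine name is `search_api_db`.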

3. Apache Tika Integration with Search API Attachments

You can download Apache Tika version 1.19.1 from https://mvnrepository.com/artifact/org.apache.tika/tika-app/1.19.1. Upload the downloaded .jar file to the sites/default/files directory.
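On a server with shell access, the .jar can also be fetched directly instead of being uploaded by hand. The sketch below builds a download URL using the standard Maven Central repository layout; the repo1.maven.org address and the function name tika_jar_url are assumptions that do not appear in this article, so check the URL against the download link on the page above before relying on it.

```shell
# Print the assumed Maven Central download URL for a given tika-app version.
tika_jar_url() {
  version="$1"
  printf 'https://repo1.maven.org/maven2/org/apache/tika/tika-app/%s/tika-app-%s.jar\n' \
    "$version" "$version"
}
```

For example, `curl -fLo sites/default/files/tika-app-1.19.1.jar "$(tika_jar_url 1.19.1)"`, run from the Drupal root, would save the file under the name used in the next step.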

After uploading the .jar file, open the settings page of the Search API Attachments module (/admin/config/search/search_api_attachments). Select Tika Extractor in the Extraction Method section. Enter java in the Path to java executable field. In the Path to Tika .jar file field, enter the path and file name of the Apache Tika file you uploaded (sites/default/files/tika-app-1.19.1.jar).

Apache Tika needs Java to extract text from files, so the server must be able to run java. If a JDK is already installed on your server, you can skip section 4.
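Before relying on the module, it can help to confirm by hand that Java and the Tika .jar work together. The following is a minimal sketch rather than the command the module itself runs; the function name tika_extract and the sample document path are hypothetical.

```shell
# Usage: tika_extract <path-to-tika-jar> <document>
# Prints the plain-text content of the document, or explains what is missing.
tika_extract() {
  jar="$1"
  doc="$2"
  if ! command -v java >/dev/null 2>&1; then
    echo "java not found on PATH" >&2
    return 1
  fi
  if [ ! -f "$jar" ]; then
    echo "Tika jar not found: $jar" >&2
    return 2
  fi
  if [ ! -f "$doc" ]; then
    echo "document not found: $doc" >&2
    return 3
  fi
  # --text asks tika-app to print the extracted plain text to standard output.
  java -jar "$jar" --text "$doc"
}
```

Calling `tika_extract sites/default/files/tika-app-1.19.1.jar /path/to/sample.pdf` should print the text of the sample file if the setup is working.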

4. JDK Installation on Server for Apache Tika

Since Apache Tika, which extracts the words from files, is written in Java, your server must be able to run Java. To run Java on your server, a JDK (or at least a Java runtime) must be installed.

JDK Installation for CentOS

If your server's operating system is CentOS, you can install the JDK on your server using the following command.

sudo yum install java-1.8.0-openjdk-devel

JDK Installation for Ubuntu

If your server's operating system is Ubuntu or another Debian-based distribution, you can install Java on your server using the following command. It installs the Java runtime (JRE), which is enough to run Apache Tika.

sudo apt install default-jre

After running the command for your server's operating system, you can use the following command to check whether Java is installed on the server.

java -version

If the command prints version information like the following, Java has been installed successfully.

openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
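The same check can be scripted, for example in a deployment script. One detail worth knowing is that java -version writes its output to standard error, so it has to be redirected before it can be captured. The function name java_version_line below is hypothetical.

```shell
# Print the first line of "java -version", or a message if Java is missing.
java_version_line() {
  if ! command -v java >/dev/null 2>&1; then
    echo "java: not installed"
    return 1
  fi
  # java -version prints to stderr, so merge it into stdout before filtering.
  java -version 2>&1 | head -n 1
}
```

Running `java_version_line` prints the version line when Java is installed, and "java: not installed" with a non-zero exit status otherwise.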

If Apache reports an error when it tries to run Java (for example, a Java Runtime Environment error) on a server where SELinux is enabled, run the following command.

setsebool -P httpd_execmem on
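To see whether the setting has taken effect, the current value of the boolean can be read back with getsebool. This is a small sketch with a hypothetical function name; it assumes the SELinux utilities are installed, and prints a note when they are not.

```shell
# Show the current value of the httpd_execmem SELinux boolean.
selinux_execmem_status() {
  if ! command -v getsebool >/dev/null 2>&1; then
    echo "SELinux tools not installed; this setting does not apply"
    return 0
  fi
  getsebool httpd_execmem
}
```

After the setsebool command above has been applied, `selinux_execmem_status` should report the boolean as on.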

5. Configuring the Search API Settings

The Search API lets us build search pages with Views. To add a new server to the Search API, you first need to enable the Database Search module.

After enabling the Database Search module, go to /admin/config/search/search-api. You will see the "Add Server" and "Add Index" buttons. The "Add Server" button lets you define the server used for searching. The "Add Index" button lets you define the indexes that can be searched on that server. A single server can hold multiple index configurations. Click the "Add Server" button.

You can enter any name in the "Server name" field. Because we enabled the Database Search module, "Database" is listed under the Backend options. If we had installed other search backends, such as the Apache Solr module, they would appear among these options as well. The "CONFIGURE DATABASE BACKEND" section contains settings such as the minimum word length required for a search term and how search terms are matched.

After defining our Database server, which performs searches from our database, we click the "Add Index" button at /admin/config/search/search-api.

In the "Server" section, we select the Server that our index will perform searches on. We choose the Database server that we previously created and proceed to the screen where we can save our Index and add our fields.

In the "Fields" tab, click on the "Add fields" button.

The machine name of the field where our files are uploaded is "field_basic_page_dosya". In the "General" section, find the "Search API Attachments" entry for this field among the available options and click the "Add" button next to it.

After that, click on the "Save Changes" button. Then, go to the "Processors" tab. From there, select "File attachments".

You can adjust which file extensions will not be included in the search by configuring the section that appears below.

To make the searched keyword appear in bold in the search results, you can select "Highlight" and configure its settings in the lower section.

After defining these settings, we move to the "View" tab of the index. Before building the search page, click the "Index now" button so that files uploaded earlier are scanned and added to the index.

Then we create the view that shows the search results: in the "View settings" section, we base the view on "Index Database Index Search File" and give its page the path "search".
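Indexing can also be triggered from the command line, which is convenient for cron jobs or for large numbers of files. The sketch below uses the sapi-i command that the Search API module provides for Drush. It assumes Drush is installed at vendor/bin/drush, and the function name and the index machine name in the usage line are hypothetical.

```shell
# Usage: index_search_api <index_machine_name>
# Assumption: Drush lives at vendor/bin/drush; set DRUSH=/path/to/drush to override.
index_search_api() {
  index_id="$1"
  drush="${DRUSH:-vendor/bin/drush}"
  if [ -z "$index_id" ]; then
    echo "usage: index_search_api <index_machine_name>" >&2
    return 64
  fi
  if [ ! -x "$drush" ]; then
    echo "drush not found at $drush" >&2
    return 1
  fi
  # sapi-i indexes the items that are still pending for the given index.
  "$drush" sapi-i "$index_id"
}
```

For an index whose machine name is, say, database_index, you would run `index_search_api database_index` from the Drupal root.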

We add the "Filename" field to the View, which will display the name of our file.

Then, we add the "Excerpt" field to the View. From its settings, we select "Use highlighted field data".

To enable search functionality in our View, we add the "Fulltext search" filter to the "Filter criteria" section.

So that visitors can type their own keywords, we expose this filter by selecting "Expose this filter to visitors" in its settings.

To retrieve searches that contain any of the words we enter in this field, we select the option "Contains any of these words" in the "Contains" section.

After that, we add files to our Basic Page content type. Then, we go to the /search address, which is created using the View, and use the Fulltext Search field, which is an Exposed filter, to search within our files.
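Because the exposed filter passes the keyword as a query parameter, the search page can also be checked from the command line. The sketch below assumes the filter identifier is search_api_fulltext, which is the usual default for the Search API fulltext filter but is not stated in this article; check the "Filter identifier" in the exposed filter settings. The function name and the example.com host are hypothetical, and only spaces are encoded.

```shell
# Usage: search_url <base-url> <keywords>
# Builds the URL of the /search page for the given keywords.
search_url() {
  base="$1"
  # Replace spaces with "+" (other special characters are not encoded here).
  keywords="$(printf '%s' "$2" | tr ' ' '+')"
  printf '%s/search?search_api_fulltext=%s\n' "$base" "$keywords"
}
```

For example, `curl -s "$(search_url https://example.com 'annual report')"` requests the results page for the words "annual report".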
