Solr 4, Jetty 9 Multicore Drupal Solr Search API on Ubuntu

First let me say I love the new UI for the Solr Admin pages (download at Apache Solr).
Install Jetty 9

Jetty 9 requires JDK 7, so if you don’t have it installed, install it first. On Ubuntu, the easiest way is to install OpenJDK with apt-get.

apt-get install -y openjdk-7-jdk
mkdir -p /usr/java
ln -s /usr/lib/jvm/java-7-openjdk-amd64 /usr/java/default
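Before going further it can help to confirm which Java version is actually on the PATH. The following is a minimal sketch, not part of the original procedure: the function names java_major_version and installed_java_version are made up for this example, and it assumes the usual version format printed by java -version (for example "1.7.0_79").

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).

# Turn a Java version string into its major version number.
# "1.7.0_79" -> 7, "11.0.2" -> 11
java_major_version() {
  v=$1
  case $v in
    1.*) v=${v#1.}; printf '%s\n' "${v%%.*}" ;;
    *)   v=${v%%.*}; printf '%s\n' "${v%%[!0-9]*}" ;;
  esac
}

# Read the quoted version string from the first "version" line of `java -version`.
# java prints this on stderr, hence the redirection.
installed_java_version() {
  java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}

# Example use (requires java on the PATH):
#   major=$(java_major_version "$(installed_java_version)")
#   [ "$major" -ge 7 ] || echo "Jetty 9 needs JDK 7 or newer" >&2
```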

Extract the tarball of Jetty 9 to a preferred directory, e.g. /opt. Then set $JETTY_HOME, create a new user called “jetty” and make it the owner of $JETTY_HOME.

tar zxvf jetty-distribution-9.0.3.v20130506.tar.gz -C /opt
mv /opt/jetty-distribution-9.0.3.v20130506/ /opt/jetty/
useradd jetty -U -s /bin/false
chown -R jetty:jetty /opt/jetty

Installing Solr Manually

Let’s put all this in /opt

# cd /opt

Let’s make a Solr user

# adduser solr

We can now start the real installation of Solr. First, download all files and uncompress them:

cd /opt
wget http://archive.apache.org/dist/lucene/solr/4.9.1/solr-4.9.1.tgz
tar -xzvf solr-4.9.1.tgz
cp -R solr-4.9.1/example /opt/solr
cd /opt/solr
java -jar start.jar

Check that it works by visiting http://YOUR_IP:8983/solr. Once it does, go back to your SSH session and stop Solr with Ctrl+C.
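If you have no browser at hand, the same check can be done from the shell. This is only a sketch: the function names are invented for the example, and it assumes curl is installed and that Solr answers on the CoreAdmin STATUS endpoint under /solr/admin/cores.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).

# Build the CoreAdmin STATUS URL for a given host and port.
solr_status_url() {
  printf 'http://%s:%s/solr/admin/cores?action=STATUS&wt=json\n' "$1" "$2"
}

# Return 0 if Solr answers within 5 seconds, non-zero otherwise.
# The port defaults to 8983 when the second argument is omitted.
solr_is_up() {
  curl -fsS -o /dev/null --max-time 5 "$(solr_status_url "$1" "${2:-8983}")"
}

# Example use:
#   solr_is_up YOUR_IP 8983 && echo "Solr is answering"
```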

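Make a place for Solr (skip this if /opt/solr already exists from the copy above)

# mkdir -p /opt/solr

Get some files (the Solr tarball can be skipped if you already downloaded it above), then unpack the Drupal module so that its Solr configuration is available later

# cd /opt
# wget http://apache.mirror.serversaustralia.com.au/lucene/solr/4.9.1/solr-4.9.1.tgz
# wget http://ftp.drupal.org/files/projects/search_api_solr-7.x-1.6.tar.gz
# tar -xzf search_api_solr-7.x-1.6.tar.gz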

After that, download the Jetty init script and register it to start automatically at boot, if that hasn’t been done already:

sudo wget -O /etc/init.d/jetty http://dev.eclipse.org/svnroot/rt/org.eclipse.jetty/jetty/trunk/jetty-distribution/src/main/resources/bin/jetty.sh
sudo chmod a+x /etc/init.d/jetty
sudo update-rc.d jetty defaults

Finally, start Jetty/Solr (the two commands below are equivalent; use either one):

sudo /etc/init.d/jetty start
sudo service jetty start

You can now access your installation just as before at http://YOUR_IP:8983/solr.

Copy the Solr application into Jetty (this is how we do it in Jetty 9)

# cp -a solr-4.9.1/dist/solr-4.9.1.war /opt/jetty/webapps/solr.war

You will also need the context file and the logging libraries for Solr to run properly on Jetty. In Jetty 9 the context file goes into webapps, next to the war.

# cp -a solr-4.9.1/example/contexts/solr-jetty-context.xml /opt/jetty/webapps/solr.xml
# cp -a solr-4.9.1/example/lib/ext/* /opt/jetty/lib/ext/

Copy the Solr config into place

# cp -a solr-4.9.1/example/solr /opt/solr
# cp -a solr-4.9.1/dist/ /opt/solr
# cp -a solr-4.9.1/contrib/ /opt/solr

Make a place for logs

# mkdir /opt/solr/logs

The logs directory must be owned by the solr user

# chown -R solr:solr /opt/solr/logs

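Open the /etc/default/jetty file and paste the following into it. Note that JETTY_HOME below points at /opt/solr, that is, the Jetty bundled with the Solr example copied there earlier; if you run the standalone Jetty 9 from /opt/jetty instead, set JETTY_HOME=/opt/jetty.

# sudo nano /etc/default/jetty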

NO_START=0 # Start on boot
JAVA_OPTIONS="-Dsolr.solr.home=/opt/solr/solr $JAVA_OPTIONS"
JAVA_HOME=/usr/java/default
JETTY_HOME=/opt/solr
JETTY_USER=solr
JETTY_LOGS=/opt/solr/logs
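
A wrong path in this file is a common reason for the init script to fail silently, so a quick sanity check can save time. The sketch below is an illustration only: check_jetty_defaults is a made-up name, and it simply sources the file in a subshell and verifies that the three directory variables point at existing directories.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# Usage: check_jetty_defaults /etc/default/jetty
check_jetty_defaults() {
  defaults_file=$1
  (
    # Subshell: the sourced variables do not leak into the caller.
    . "$defaults_file" || exit 2
    status=0
    for var in JAVA_HOME JETTY_HOME JETTY_LOGS; do
      eval "dir=\${$var:-}"
      if [ -z "$dir" ]; then
        echo "$var is not set"
        status=1
      elif [ ! -d "$dir" ]; then
        echo "$var points to a missing directory: $dir"
        status=1
      fi
    done
    exit $status
  )
}
```

The function prints one line per problem and returns a non-zero status if anything is missing.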

Open the file /opt/solr/etc/jetty-logging.xml (nano /opt/solr/etc/jetty-logging.xml) and paste this into it. The class names use the org.eclipse.jetty packages; the older org.mortbay names only exist in Jetty 6 and will not load in the Jetty 8 bundled with Solr 4.9 or in Jetty 9.

<?xml version="1.0"?>
  <!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure.dtd">
  <!-- =============================================================== -->
  <!-- Configure stderr and stdout to a Jetty rollover log file -->
  <!-- this configuration file should be used in combination with -->
  <!-- other configuration files.  e.g. -->
  <!--    java -jar start.jar etc/jetty-logging.xml etc/jetty.xml -->
  <!-- =============================================================== -->
  <Configure id="Server" class="org.eclipse.jetty.server.Server">

      <New id="ServerLog" class="java.io.PrintStream">
        <Arg>
          <New class="org.eclipse.jetty.util.RolloverFileOutputStream">
            <Arg><SystemProperty name="jetty.logs" default="."/>/yyyy_mm_dd.stderrout.log</Arg>
            <Arg type="boolean">false</Arg>
            <Arg type="int">90</Arg>
            <Arg><Call class="java.util.TimeZone" name="getTimeZone"><Arg>GMT</Arg></Call></Arg>
            <Get id="ServerLogName" name="datedFilename"/>
          </New>
        </Arg>
      </New>

      <Call class="org.eclipse.jetty.util.log.Log" name="info"><Arg>Redirecting stderr/stdout to <Ref id="ServerLogName"/></Arg></Call>
      <Call class="java.lang.System" name="setErr"><Arg><Ref id="ServerLog"/></Arg></Call>
      <Call class="java.lang.System" name="setOut"><Arg><Ref id="ServerLog"/></Arg></Call>
  </Configure>

Then, if you did not already create the solr user above, create it now, and give it ownership of /opt/solr:

sudo useradd -d /opt/solr -s /bin/false solr
sudo chown -R solr:solr /opt/solr

Solr is now one directory closer to Jetty than the default configuration shipped with Solr

# sudo nano solr/solr/collection1/conf/solrconfig.xml

<!-- modify the paths as follows, since we are one directory closer now -->

<lib dir="../../contrib/extraction/lib" regex=".*\.jar" />
<lib dir="../../dist/" regex="solr-cell-\d.*\.jar" />

<lib dir="../../contrib/clustering/lib/" regex=".*\.jar" />
<lib dir="../../dist/" regex="solr-clustering-\d.*\.jar" />

<lib dir="../../contrib/langid/lib/" regex=".*\.jar" />
<lib dir="../../dist/" regex="solr-langid-\d.*\.jar" />

<lib dir="../../contrib/velocity/lib" regex=".*\.jar" />
<lib dir="../../dist/" regex="solr-velocity-\d.*\.jar" />

<!-- <lib dir="/non/existent/dir/yields/warning" /> -->

Start Solr

# service jetty start

View the Solr admin interface (if port 8983 does not answer, also try 8080, the default port of a standalone Jetty)

http://solr.example.com:8983/solr/

==============================================

Add a core for a Drupal Search API project in a multicore Solr server

# mkdir -p /opt/solr/solr/[corename]/data
# mkdir -p /opt/solr/solr/[corename]/conf

Copy the search_api_solr config files into the conf directory of the core

# cp -a /opt/search_api_solr/solr-conf/4.x/* /opt/solr/solr/[corename]/conf/
# chown -R solr:solr /opt/solr/solr/[corename]

Visit the Solr admin page

http://solr.example.com:8983/solr/#/~cores/[corename]

Suppose the core is named drupal. To connect to a remote Solr server (not localhost), you must not copy the admin URL as it appears in the browser: the # and everything after it is a fragment that only the browser uses and that is never sent to the server. Remove it and give the client the plain core URL, for example http://solr.example.com:8983/solr/drupal.

Swagger UI (swagger.io) behaves the same way: when checking an API in the browser, the # in url/rest/# matters because it is what routes you to the Swagger API folder, but it is not part of the URL an API client should call.

Note: remember to remove the # from the Solr server URL when making calls through the SolrPhpClient library. Leaving it in causes annoying JSON validation errors around line 206 of Service.php.


$LWS = new Apache_Solr_Service('localhost', '8983', 'solr/collection1');
Without any # in solr/collection1

Same thing for multicore; the path is simply solr/ followed by the core name:
$LWS = new Apache_Solr_Service('localhost', '8983', 'solr/core2/');
and not
$LWS = new Apache_Solr_Service('localhost', '8983', 'solr/#/~cores/core2/');
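
To make the rule concrete, here is a small sketch that turns an admin URL copied from the browser into the URL a client should use. The function name solr_core_url is invented for this example; it only does string manipulation and assumes the core is served directly under the /solr context.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# $1: admin URL as copied from the browser, $2: core name
solr_core_url() {
  base=${1%%"#"*}   # drop the fragment: the "#" and everything after it
  base=${base%/}    # drop a trailing slash, if any
  printf '%s/%s\n' "$base" "$2"
}

# Example use:
#   solr_core_url 'http://solr.example.com:8983/solr/#/~cores/drupal' drupal
```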


Configuring a schema.xml for Solr

Add the database fields you want to index:

<field name="typeressource" type="string" indexed="true" stored="true" />
<field name="titre" type="text_general" indexed="true" stored="true" multiValued="true"/>

 

Phrase Query

A phrase query matches multiple terms (words) in sequence.

text:"yonik seeley"

This query will match text containing Yonik Seeley but will not match Yonik C Seeley or Seeley Yonik.

Internally, a phrase query is created when the fieldType produces multiple terms for the given value. For the example above, the fieldType splits on whitespace and lowercases the result, and we get two terms… [yonik, seeley].
If our fieldType had been string, a single term of yonik seeley would have been produced since string fields do not change values in any way.
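
When sending a phrase query over HTTP, the colon, quotes and space have to be URL-encoded. The sketch below shows one way to build such a request URL in plain shell. The function names are invented for this example, the encoder only handles ASCII input, and the core URL and the wt=json parameter are assumptions you should adapt.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this example).

# Percent-encode a string (ASCII input only).
urlencode() {
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
}

# Build a /select URL for a phrase query on one field.
# $1: core URL, $2: field name, $3: phrase
phrase_query_url() {
  printf '%s/select?q=%s&wt=json\n' "${1%/}" "$(urlencode "$2:\"$3\"")"
}

# Example use (requires curl and a running Solr):
#   curl "$(phrase_query_url http://localhost:8983/solr/collection1 text 'yonik seeley')"
```

With curl you can also let the tool do the encoding: curl -G --data-urlencode 'q=text:"yonik seeley"' followed by the /select URL of the core.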

Add a core to a multicore Solr server environment

1- First, stop the Tomcat or Jetty service before doing anything else:

sudo service jetty stop

2- Copy or rename the example collection1 core (here /opt/solr/solr/collection1) to a meaningful name, for example stores. In that case, you can run the commands below:

* Nota bene: if you installed Solr using apt-get, this can be skipped by going to /usr/share/solr …
* Nota bene: the collection1 core configuration is more complete than the multicore samples (core0/core1), so it is better to copy the collection1/conf files. This example assumes collection1 was kept from a basic single-core install.

cd /opt/solr/solr
cp -R collection1 stores    # or: mv collection1 stores
cd stores

3- If you installed Solr manually, open the file core.properties:

sudo nano core.properties

and change the core name inside to the name chosen above (stores).

4- Then remove the data directory and edit schema.xml:
Nota bene: removing the data directory is also a way to delete a whole Solr index without using curl and REST delete calls.

rm -R data
nano conf/schema.xml

5- Paste your own schema.xml here and remember to correct the directories if needed for your installation (for example Drupal Solr search); see above for what to modify in solrconfig.xml. There is a very advanced schema.xml in the Solr repository for the single-core collection1. You can probably find many more on the internet.

6- Do not forget to make solr the owner of the new core:

sudo chown -R solr:solr /opt/solr/solr/stores

7- Add the new core to solr.xml in /opt/solr/solr (remember that this solr.xml is a copy of the multicore file from the examples):

<cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:8983}" hostContext="${hostContext:solr}">
<core name="core0" instanceDir="core0" />
<core name="core1" instanceDir="core1" />
<core name="stores" instanceDir="stores" />
<shardHandlerFactory name="shardHandlerFactory">
<str name="urlScheme">${urlScheme:}</str>
</shardHandlerFactory>
</cores>

8- Start Jetty:

sudo /etc/init.d/jetty start

When you now visit your Solr instance, you should see the Dashboard with the new core listed under cores. Check the logs for any errors.
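
Steps 2 to 4 above (copy the core, rename it in core.properties, drop the old index) can be sketched as one small function. This is an illustration under assumptions, not the procedure itself: clone_core is a made-up name, it overwrites core.properties with a single name= line, and stopping Jetty, fixing ownership and restarting are still left to you.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# Usage: clone_core SOLR_HOME SOURCE_CORE NEW_CORE
clone_core() {
  solr_home=$1
  src=$2
  dst=$3
  # The source core must have a conf directory to copy.
  [ -d "$solr_home/$src/conf" ] || { echo "no conf directory in $solr_home/$src" >&2; return 1; }
  # Refuse to overwrite an existing core.
  [ ! -e "$solr_home/$dst" ] || { echo "$solr_home/$dst already exists" >&2; return 1; }
  cp -R "$solr_home/$src" "$solr_home/$dst" || return 1
  # Drop the copied index so the new core starts empty.
  rm -rf "$solr_home/$dst/data"
  # Assumption: a core.properties holding only the core name is enough.
  printf 'name=%s\n' "$dst" > "$solr_home/$dst/core.properties"
}

# Example use (run with Jetty stopped, then chown and restart as described above):
#   clone_core /opt/solr/solr collection1 stores
```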

Jetty ports and running Jetty as a service at boot

If you want Jetty to start at boot, run the following command

update-rc.d jetty defaults

If you want to change the port Jetty runs on, edit start.ini in $JETTY_HOME and change jetty.port

sudo nano /opt/jetty/start.ini

#=============
# HTTP Connector
#-------------
jetty.port=9090
http.timeout=30000
etc/jetty-http.xml
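
If you prefer not to edit the file by hand, the port change can be scripted. The sketch below is an assumption-laden example: set_jetty_port is a made-up name, and it only rewrites an existing jetty.port= line, so a start.ini without that line is left unchanged.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# Usage: set_jetty_port /opt/jetty/start.ini 9090
set_jetty_port() {
  ini=$1
  port=$2
  # Accept digits only.
  case $port in
    ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
  esac
  tmp=$(mktemp) || return 1
  # Rewrite the existing jetty.port line; writing back with cat keeps
  # the owner and permissions of the original file.
  sed "s/^jetty\.port=.*/jetty.port=$port/" "$ini" > "$tmp" && cat "$tmp" > "$ini"
  rc=$?
  rm -f "$tmp"
  return $rc
}
```

Restart Jetty afterwards so the new port takes effect.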

>>> Multicore Install : Read this link

 

extradrmtech

For 30 years I have worked on database architecture and data migration protocols. I am also a consultant in web content management solutions and media protection solutions. I am an experienced web developer with over 10 years developing PHP/MySQL, C#, and VB.Net applications, ranging from simple web sites to extensive web-based business applications. Besides my main work, I like to freelance on a few WordPress projects, because I find it a relaxing and delightful CMS. When not working, I like to dance salsa and swing and to have fun with my little family.
