ExtraDRM : Design Ressource Management

Web content and data design solutions

Lucene and Solr are two different Apache projects:

1) Lucene and Solr were not designed as a pair. The dependency runs one way: Solr uses Lucene under the hood, while Lucene knows nothing about the Solr API.

2) Lucene is a powerful search engine library that lets us add search capability to our application. It exposes an easy-to-use API while hiding the complex search-related operations. Any application can use this library, not just Solr.

3) Solr is built around Lucene. It is more than an HTTP wrapper: it adds features of its own on top of Lucene. Solr is ready to use out of the box. It is a web application that provides infrastructure-related features and much more in addition to what Lucene offers.

4) Lucene doesn’t just create the index for consumption by Solr. Lucene handles all the search-related operations. Any application can use the Lucene library.

Examples include Solr, Elasticsearch, and LinkedIn (yes, under the hood).

Lucene is a low level Java library (with ports to .NET, etc.) which implements indexing, analyzing, searching, etc.

Solr is a standalone pre-configured product/webapp which uses Lucene. If you prefer dealing with HTTP API instead of Java API, Solr is for you. Solr has also got some extra features on top (e.g. grouping).

A simple way to conceptualize the relationship between Solr and Lucene is that of a car and its engine. You can’t drive an engine, but you can drive a car. Similarly, Lucene is a programmatic library which you can’t use as-is, whereas Solr is a complete application which you can use out-of-box.

At the heart of Lucene is an Index. You pump your data into the Index, then do searches on the Index to get results out. Document objects are stored in the Index, and it is your job to “convert” your data into Document objects and store them to the Index.
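To make the "data in, results out" idea concrete, here is a minimal sketch of an index written with the Java standard library only. It is not Lucene's API: the class name ToyIndex, its methods, and the crude word splitting are all assumptions made for illustration. It only shows the shape of the workflow: convert a record into index entries, then look terms up.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Toy stand-in for a search index (hypothetical; NOT Lucene's API). */
final class ToyIndex {
    // Stored documents, addressed by their position in this list.
    private final List<Map<String, String>> documents = new ArrayList<>();
    // Inverted index: "field:term" -> ids of the documents containing that term.
    private final Map<String, List<Integer>> postings = new HashMap<>();

    /** Stores a record (field name -> value) and indexes each of its words. */
    int add(Map<String, String> fields) {
        int docId = documents.size();
        documents.add(fields);
        for (Map.Entry<String, String> field : fields.entrySet()) {
            for (String term : field.getValue().toLowerCase(Locale.ROOT).split("\\W+")) {
                if (term.isEmpty()) continue;
                List<Integer> ids =
                        postings.computeIfAbsent(field.getKey() + ":" + term, k -> new ArrayList<>());
                // Record each document at most once per term.
                if (ids.isEmpty() || ids.get(ids.size() - 1) != docId) ids.add(docId);
            }
        }
        return docId;
    }

    /** Returns the stored documents whose given field contains the term. */
    List<Map<String, String>> search(String field, String term) {
        String key = field + ":" + term.toLowerCase(Locale.ROOT);
        List<Map<String, String>> hits = new ArrayList<>();
        for (int docId : postings.getOrDefault(key, Collections.emptyList())) {
            hits.add(documents.get(docId));
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        Map<String, String> hotel = new HashMap<>();
        hotel.put("id", "Hotel-1345");
        hotel.put("description", "A beautiful hotel");
        index.add(hotel);
        for (Map<String, String> hit : index.search("description", "beautiful")) {
            System.out.println(hit.get("id")); // prints Hotel-1345
        }
    }
}
```

A real Lucene index does far more (analysis, scoring, on-disk segments), but the same two steps remain: add documents, then query.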

Adding a Document/object to Index

Now you need to index your documents or business objects. To index an object, you use the Lucene Document class, to which you add the fields that you want indexed. As we briefly mentioned before, a Lucene Document is basically a container for a set of indexed fields. This is best illustrated by an example:

Document doc = new Document();
doc.add(new StringField("id", "Hotel-1345", Field.Store.YES));
doc.add(new TextField("description", "A beautiful hotel", Field.Store.YES));

In the above example, we add two fields, “id” and “description”, with the respective values “Hotel-1345” and “A beautiful hotel” to the document.

More precisely, to add a field to a document, you create a new instance of the Field class, which can be either a StringField or a TextField (the difference between the two will be explained shortly). A field object takes the following three parameters:

    • Field name: This is the name of the field. In the above example, they are “id” and “description”.
    • Field value: This is the value of the field. In the above example, they are “Hotel-1345” and “A beautiful hotel”. A value can be a String, as in our example. A TextField can also take a Reader if the object to be indexed is a file; a field built from a Reader is indexed but never stored.
    • Storage flag: The third parameter specifies whether the actual value of the field needs to be stored in the Lucene index or can be discarded after it is indexed. Storing the value is useful if you need it later, for example to display it in the search result list or to look up a row in a database table. If the value must be stored, use Field.Store.YES. If you don’t need to store the value, use Field.Store.NO. Older Lucene versions also offered Field.Store.COMPRESS, but it was removed in Lucene 3.0 and is not available with StringField and TextField.


StringField vs TextField: In the above example, the “id” field contains the ID of the hotel, which is a single atomic value. In contrast, the “description” field contains English text, which should be parsed (or “tokenized”) into a set of words for indexing. Use StringField for a field with an atomic value that should not be tokenized. Use TextField for a field that needs to be tokenized into a set of words. (lucene-tutorial)
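The difference can be sketched without Lucene itself. The snippet below is a toy illustration, not Lucene code: the class FieldTerms and both methods are hypothetical, and the word-splitting rule is only a crude stand-in for what a real analyzer does. It shows which search terms each kind of field would produce for the two values used above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Toy illustration of atomic vs tokenized field values (hypothetical; not Lucene code). */
final class FieldTerms {
    private FieldTerms() {}

    /** StringField-like behavior: the whole value is a single term, kept verbatim. */
    static List<String> atomic(String value) {
        return Collections.singletonList(value);
    }

    /** TextField-like behavior: the value is split into lower-cased words. */
    static List<String> tokenized(String value) {
        List<String> terms = new ArrayList<>();
        // Split on anything that is not a letter or a digit.
        for (String word : value.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!word.isEmpty()) terms.add(word);
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(atomic("Hotel-1345"));           // [Hotel-1345]
        System.out.println(tokenized("A beautiful hotel")); // [a, beautiful, hotel]
        System.out.println(tokenized("Hotel-1345"));        // [hotel, 1345]
    }
}
```

The last line shows why an ID belongs in an atomic field: once tokenized, “Hotel-1345” would no longer be searchable as one exact value.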

When should I use Lucene then?

If you need to embed search functionality into a desktop application for example, Lucene is the more appropriate choice.

If you have very customized requirements that need low-level access to the Lucene API classes, Solr may be more of a hindrance than a help, since it is an extra layer of indirection.

What is Solr?

Apache Solr is a web application built around Lucene with all kinds of goodies.

It adds functionality such as:

  • XML/HTTP and JSON APIs
  • Hit highlighting
  • Faceted Search and Filtering
  • Geospatial Search
  • Fast Incremental Updates and Index Replication
  • Caching
  • Web administration interface, etc.

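Unlike Lucene, Solr is a web application. Up to Solr 4.x it was distributed as a WAR that could be deployed in any servlet container, e.g. Jetty, Tomcat, Resin, etc.; since Solr 5.0 it runs as a standalone server with a bundled Jetty.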

Solr can be installed and used by non-programmers. Lucene cannot.

Some extra links on the differences between these search tools, and on search problems in Razuna, which uses Lucene in its digital asset management:

http://forums.razuna.org/topic/razuna-search-not-working

https://help.razuna.com/t/razuna-1-7-search-isnt-working-at-all/410/5

https://help.razuna.com/t/other-language-analyzer-of-lucene/772/9

http://www.sitefinity.com/blogs/laurent-poulain’s-blog/2015/10/12/troubleshooting-lucene-search-issues

http://www.lucenetutorial.com/lucene-vs-solr.html

http://stackoverflow.com/questions/4638671/search-engine-lucene-vs-database-search

http://stackoverflow.com/questions/15704644/difference-between-solr-and-lucene

https://www.linkedin.com/pulse/what-core-differences-among-lucenesolr-elasticsearch-nizam-muhammad

https://doc.sitecore.net/sitecore_experience_platform/setting_up__maintaining/search_and_indexing/indexing/using_solr_or_lucene

http://www.lucenetutorial.com/sample-apps/textfileindexer-java.html

https://lingpipe-blog.com/2014/03/08/lucene-4-essentials-for-text-search-and-indexing/

http://stackoverflow.com/questions/9066347/lucene-multi-word-phrases-as-search-terms

A session is a way to store information (in variables) to be used across multiple pages.

Unlike a cookie, the information is not stored on the user’s computer.

What is a PHP Session?

When you work with an application, you open it, make some changes, and then close it. This is much like a session. The computer knows who you are. It knows when you start the application and when you end. But on the internet there is one problem: the web server does not know who you are or what you do, because HTTP is a stateless protocol.

Session variables solve this problem by storing user information to be used across multiple pages (e.g. username, favorite color, etc). By default, session variables last until the user closes the browser.

Finally, Session variables hold information about one single user, and are available to all pages in one application.

Tip: If you need permanent storage, you may want to store the data in a database.


Start a PHP Session

A session is started with the session_start() function.

Session variables are set with the PHP global variable: $_SESSION.

Now, let’s create a new page called “demo_session1.php”. In this page, we start a new PHP session and set some session variables:

Example

<?php
// Start the session
session_start();
?>
<!DOCTYPE html>
<html>
<body>

<?php
// Set session variables
$_SESSION["favcolor"] = "green";
$_SESSION["favanimal"] = "cat";
echo "Session variables are set.";
?>

</body>
</html>

Get PHP Session Variable Values

Next, we create another page called “demo_session2.php”. From this page, we will access the session information we set on the first page (“demo_session1.php”).

Notice that session variables are not passed individually to each new page; instead, they are retrieved from the session we open at the beginning of each page (session_start()).

Also notice that all session variable values are stored in the global $_SESSION variable:

Example

<?php
session_start();
?>
<!DOCTYPE html>
<html>
<body>

<?php
// Echo session variables that were set on previous page
echo "Favorite color is " . $_SESSION["favcolor"] . ".<br>";
echo "Favorite animal is " . $_SESSION["favanimal"] . ".";
?>

</body>
</html>

Another way to show all the session variable values for a user session is to run the following code:

Example

<?php
session_start();
?>
<!DOCTYPE html>
<html>
<body>

<?php
print_r($_SESSION);
?>

</body>
</html>

How does it work? How does it know it’s me?

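Most sessions store a session ID on the user’s computer, usually in a cookie (named PHPSESSID by default), that looks something like this: 765487cf34ert8dede5a562e4f3a7e12. The browser sends this ID back with each request. When a session is opened on another page, PHP looks for the ID in the request: if it matches an existing session, that session is reopened; if not, a new session is started. Only the ID lives on the user’s computer; the session variables themselves stay on the server.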
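The lookup itself is language-independent. As a rough sketch (in Java, like the Lucene examples earlier on this page, and not PHP's actual implementation), a server-side store could work as follows. The class ToySessionStore and its methods are hypothetical names chosen for illustration.

```java
import java.security.SecureRandom;
import java.util.HashMap;
import java.util.Map;

/** Toy server-side session store (hypothetical; illustrates the lookup only). */
final class ToySessionStore {
    private static final SecureRandom RANDOM = new SecureRandom();
    // Session ID -> that visitor's session variables, kept on the server.
    private final Map<String, Map<String, String>> sessions = new HashMap<>();

    /** Returns the session ID to use: the one sent by the browser if known, else a new one. */
    String open(String idFromBrowser) {
        if (idFromBrowser != null && sessions.containsKey(idFromBrowser)) {
            return idFromBrowser; // match: reuse the existing session
        }
        String newId = newId();   // no match: start a new session
        sessions.put(newId, new HashMap<>());
        return newId;
    }

    /** Returns the variables of an open session, or null if the ID is unknown. */
    Map<String, String> variables(String sessionId) {
        return sessions.get(sessionId);
    }

    /** Generates a random 32-character hexadecimal ID. */
    private static String newId() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        StringBuilder hex = new StringBuilder();
        for (byte b : bytes) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }
}
```

The point of the design is that the browser only ever holds the random ID; everything stored under it stays in the server-side map.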


Modify a PHP Session Variable

To change a session variable, just overwrite it:

Example

<?php
session_start();
?>
<!DOCTYPE html>
<html>
<body>

<?php
// to change a session variable, just overwrite it
$_SESSION["favcolor"] = "yellow";
print_r($_SESSION);
?>

</body>
</html>

Destroy a PHP Session

To remove all global session variables and destroy the session, use session_unset() and session_destroy():

Example

<?php
session_start();
?>
<!DOCTYPE html>
<html>
<body>

<?php
// remove all session variables
session_unset();

// destroy the session
session_destroy();
?>

</body>
</html>

Introduction

Since PHP 5.5, the original MySQL extension is deprecated and raises E_DEPRECATED notices when connecting to a database; it was removed entirely in PHP 7.0. Instead, we can use the MySQLi extension or the PDO_MySQL extension.

If, like me, you have sites that use the MySQL extension, here are some small examples for switching from MySQL to MySQLi (which I find easier to use on my little creations).

Database connection
Previously, with the MySQL extension, connecting to the database looked like this:

PHP code:

// connect to MySQL
$conn = mysql_connect($host, $user, $passwd);

// select the database
mysql_select_db('mabase', $conn);

Now, with MySQLi :

<?php

// Connection variables
$host = "localhost"; // MySQL host name, e.g. localhost
$user = "user1"; // MySQL user, e.g. root (if you are on a local server)
$password = ""; // MySQL user password (leave empty if no password is set for this user)
$database = "zzz"; // MySQL database name

// Connect to the MySQL database
$link = mysqli_connect($host, $user, $password, $database);

// Check connection
if (mysqli_connect_errno())
{
    echo "Failed to connect to MySQL: " . mysqli_connect_error();
}

?>

 

 

Queries

Migrating to MySQLi Procedural Methods

MySQLi procedural methods take a parameter that references either a link object or a result object. We have seen the link object above: it is the value returned by mysqli_connect. The result object is similar to the MySQL result returned from a query, for example.

Many of the MySQL functions have very similar procedural counterparts in MySQLi, and migrating them is as simple as adding the i to mysql and adding the link or result as the first parameter (or moving it there). Remember that MySQLi requires the link for those functions that reference a link. In the following list, each MySQL function is followed by the MySQLi procedural function that replaces it.

mysql_affected_rows -> mysqli_affected_rows($link)
mysql_close -> mysqli_close($link)
mysql_data_seek -> mysqli_data_seek($result, $offset)
mysql_errno -> mysqli_errno($link)
mysql_error -> mysqli_error($link)
mysql_fetch_array -> mysqli_fetch_array($result, $type)
mysql_fetch_assoc -> mysqli_fetch_assoc($result)
mysql_fetch_lengths -> mysqli_fetch_lengths($result)
mysql_fetch_object -> mysqli_fetch_object($result, $class, $params)
mysql_fetch_row -> mysqli_fetch_row($result)
mysql_field_seek -> mysqli_field_seek($result, $number)
mysql_free_result -> mysqli_free_result($result)
mysql_get_client_info -> mysqli_get_client_info($link)
mysql_get_host_info -> mysqli_get_host_info($link)
mysql_get_proto_info -> mysqli_get_proto_info($link)
mysql_get_server_info -> mysqli_get_server_info($link)
mysql_info -> mysqli_info($link)
mysql_insert_id -> mysqli_insert_id($link)
mysql_num_rows -> mysqli_num_rows($result)
mysql_ping -> mysqli_ping($link)
mysql_query -> mysqli_query($link, $query)
mysql_real_escape_string -> mysqli_real_escape_string($link, $string)
mysql_select_db -> mysqli_select_db($link, $database)
mysql_set_charset -> mysqli_set_charset($link, $charset)
mysql_stat -> mysqli_stat($link)
mysql_thread_id -> mysqli_thread_id($link)