Full-Text Search

List of Contents

The example is actual text search using keyword index. The text search can be performed at a lows cost, which is incomparable with the LIKE syntax of a general database, since it searches for a specific string pattern at a specific time using "reverse index". The keyword index can be used for variable strings, varchar and text type. It is important that the search terms and the search target terms must match exactly. Machbase does not perform morphological analysis and uses keywords based on special characters.

Syntax:

SELECT column_name(s)
FROM table_name
WHERE column_name SEARCH pattern;

Example:

Mach> CREATE TABLE search_table (id INTEGER, name VARCHAR(20));
Created successfully.

Mach> CREATE KEYWORD INDEX idx_SEARCH ON SEARCH_table (name);
Created successfully.

Mach> INSERT INTO search_table VALUES(1, 'time flys');
1 row(s) inserted.

Mach> INSERT INTO search_table VALUES(1, 'time runs');
1 row(s) inserted.

Mach> SELECT * FROM search_table WHERE name SEARCH 'time' OR name SEARCH 'runs2' ;
ID          NAME
-------------------------------------
1           time runs
1           time flys
[2] row(s) selected.

Mach> SELECT * FROM search_table WHERE name SEARCH 'time' AND name SEARCH 'runs2' ;
ID          NAME
-------------------------------------
[0] row(s) selected.

Mach> SELECT * FROM search_table WHERE name SEARCH 'flys' OR name SEARCH 'runs2' ;
ID          NAME
-------------------------------------
1           time flys
[1] row(s) selected.

​Machbase allows you to search for variable length strings stored in UTF-8 in both ASCII and multilingual characters(encoded), where the most significant bit is 1. However, since multilingual characters do not recognize the morpheme of the word, Machbase supports search using 2-gram technique.

Syntax:

SELECT column_name(s)
FROM table_name
WHERE column_name SEARCH pattern;

Example:

Mach> CREATE table multi_table (message varchar(100));
Created successfully.

Mach> CREATE KEYWORD INDEX idx_multi ON multi_table(message);
Created successfully.

Mach> INSERT INTO multi_table VALUES("Machbase is the combination of ideal solutions");
1 row(s) inserted.

Mach> INSERT INTO multi_table VALUES("Machbase is a columnar DBMS");
1 row(s) inserted.

Mach> INSERT INTO multi_table VALUES("Machbaseは理想的なソリューションの組み合わせです");
1 row(s) inserted.

Mach> INSERT INTO multi_table VALUES("Machbaseは円柱状のDBMSです");
1 row(s) inserted.

Mach>  SELECT * from multi_table WHERE message SEARCH 'Machbase DBMS';
MESSAGE
------------------------------------------------------------------------------------
Machbaseは円柱状のDBMSです
Machbase is a columnar DBMS
[2] row(s) selected.

Mach> SELECT * from multi_table WHERE message SEARCH 'DBMS is';
MESSAGE
------------------------------------------------------------------------------------
Machbase is a columnar DBMS
[1] row(s) selected.

Mach> SELECT * from multi_table WHERE message SEARCH 'DBMS' OR message SEARCH 'ideal';
MESSAGE
------------------------------------------------------------------------------------
Machbaseは円柱状のDBMSです
Machbase is a columnar DBMS
Machbase is the combination of ideal solutions
[3] row(s) selected.

Mach> SELECT * from multi_table WHERE message SEARCH '組み合わせ';
MESSAGE
------------------------------------------------------------------------------------
Machbaseは理想的なソリューションの組み合わせです
[1] row(s) selected.
Elapsed time: 0.001
Mach> SELECT * from multi_table WHERE message SEARCH '円柱';
MESSAGE
------------------------------------------------------------------------------------
Machbaseは円柱状のDBMSです
[1] row(s) selected.

If the input data is "tax calculation", three words are saved. In other words, "tax", "calculation", and "-ion" are saved in the dictionary. Thus, users can have adequate results when they search a word as "tax" or "tax calculation". Basically, the Machbase's search method is an AND operation, so even when searching over three characters, the result is relatively accurate.

For example, consider the input record, "computer utilization guide" and search term is "computer". The dictionary contains "compu-","-puter", "utilization", "gui-", and "-ide", and search term are in the form of "compu-" AND "-puter", hence the corresponding record is successfully retrieved.

ESEARCH

The ESEARCH statement is a search keyword that enables extended searches on ASCII text. For this extension, search for the desired pattern is performed using the % character. In the LIKE operation, if a leading% is present, all records must be checked, but the advantage of ESEARCH is that it can quickly find the word. This feature can be very useful when looking for part of an English string (an error string or code).

Syntax:

SELECT column_name(s)
FROM table_name
WHERE column_name ESEARCH pattern;

Example:

Mach> CREATE TABLE esearch_table(id INTEGER, name VARCHAR(20), data VARCHAR(40));
Created successfully.

Mach> CREATE KEYWORD index idx1 ON esearch_table(name);
Created successfully.

Mach> CREATE KEYWORD index idx2 ON esearch_table(data);
Created successfully.

Mach> INSERT INTO esearch_table VALUES(1, 'machbase', 'Real-time search technology');
1 row(s) inserted.

Mach> INSERT INTO esearch_table VALUES(2, 'mach2flux', 'Real-time data compression');
1 row(s) inserted.

Mach> INSERT INTO esearch_table VALUES(3, 'DB MS', 'Memory cache technology');
1 row(s) inserted.

Mach> INSERT INTO esearch_table VALUES(4, 'ファ ッションアドバイザー、', 'errors');
1 row(s) inserted.

Mach> INSERT INTO esearch_table VALUES(5, '인피 니 플럭스', 'socket232');
1 row(s) inserted.

Mach> SELECT * FROM esearch_table where name ESEARCH '%mach';
ID          NAME                  DATA
--------------------------------------------------------------------------------
1           machbase            Real-time search technology
[1] row(s) selected.
Elapsed time: 0.001
Mach> SELECT * FROM esearch_table where data ESEARCH '%echn%';
ID          NAME                  DATA
--------------------------------------------------------------------------------
3           DB MS                 Memory cache technology
1           machbase            Real-time search technology
[2] row(s) selected.

Mach> SELECT * FROM esearch_table where name ESEARCH '%피니%럭스';
ID          NAME                  DATA
--------------------------------------------------------------------------------
[0] row(s) selected.

Mach> SELECT * FROM esearch_table where data ESEARCH '%232';
ID          NAME                  DATA
--------------------------------------------------------------------------------
5           인피 니 플럭스  socket232
[1] row(s) selected.

REGEXP

The REGEXP statement is used to perform searches on data using regular expressions. In general, patterns of particular columns are filtered using regular expressions. One thing to keep in mind is that you can not use indexes when you use the REGEXP clause, so you must lower the overall search cost by putting index conditions on other columns in order to reduce the overall search scope. When you want to check a specific pattern, use index by SEARCH or ESEARCH, and use REGEXP again in a state where the total number of data is small, thereby helping to improve system overall efficiency.

Example:

Mach> CREATE TABLE regexp_table(id INTEGER, name VARCHAR(20), data VARCHAR(40));
Created successfully.

Mach> INSERT INTO regexp_table VALUES(1, 'machbase', 'Real-time search technology');
1 row(s) inserted.

Mach> INSERT INTO regexp_table VALUES(2, 'mach2flux', 'Real-time data compression');
1 row(s) inserted.

Mach> INSERT INTO regexp_table VALUES(3, 'DBMS', 'Memory cache technology');
1 row(s) inserted.

Mach> INSERT INTO regexp_table VALUES(4, 'ファ ッショ', 'errors');
1 row(s) inserted.

Mach> INSERT INTO regexp_table VALUES(5, '인피 니 플럭스', 'socket232');
1 row(s) inserted.

Mach> SELECT * FROM regexp_table WHERE name REGEXP 'mach';
ID          NAME                  DATA
--------------------------------------------------------------------------------
2           mach2base           Real-time data compression
1           machbase            Real-time search technology
[2] row(s) selected.

Mach> SELECT * FROM regexp_table WHERE data REGEXP 'mach[1]';
ID          NAME                  DATA
--------------------------------------------------------------------------------
[0] row(s) selected.

Mach> SELECT * FROM regexp_table WHERE data REGEXP '[A-Za-z]';
ID          NAME                  DATA
--------------------------------------------------------------------------------
5           인피 니 플럭스  socket232
4           ファ ッショ      errors
3           DBMS                  Memory cache technology
2           mach2base           Real-time data compression
1           machbase            Real-time search technology
[5] row(s) selected.

LIKE

​You can use LIKE statement just like SQL LIKE operator. Machbase even supports Korean, Chinese, and Japanese as well.

Syntax:

SELECT column_name(s)
FROM table_name
WHERE column_name LIKE pattern;

Example:

Mach> CREATE TABLE lik_table (id INTEGER, name VARCHAR(20), data VARCHAR(40));
Created successfully.

Mach> INSERT INTO lik_table VALUES(1, 'machbase', 'Real-time search technology');
1 row(s) inserted.

Mach> INSERT INTO lik_table VALUES(2, 'mach2flux', 'Real-time data compression');
1 row(s) inserted.

Mach> INSERT INTO lik_table VALUES(3, 'DBMS', 'Memory cache technology');
1 row(s) inserted.

Mach> INSERT INTO lik_table VALUES(4, 'ファ ッションアドバイザー、', 'errors');
1 row(s) inserted.

Mach> INSERT INTO lik_table VALUES(5, '인피 니 플럭스', 'socket232');
1 row(s) inserted.

Mach> SELECT * FROM lik_table WHERE name LIKE 'mach%';
ID          NAME                  DATA
--------------------------------------------------------------------------------
2           mach2base           Real-time data compression
1           machbase            Real-time search technology
[2] row(s) selected.

Mach> SELECT * FROM lik_table WHERE name LIKE '%니%';
ID          NAME                  DATA
--------------------------------------------------------------------------------
5           인피 니 플럭스  socket232
[1] row(s) selected.

Mach> SELECT * FROM lik_table WHERE data LIKE '%technology';
ID          NAME                  DATA
--------------------------------------------------------------------------------
3           DBMS                  Memory cache technology
1           machbase            Real-time search technology
[2] row(s) selected.

The example is actual text search using keyword index. The text search can be performed at a lows cost, which is incomparable with the LIKE syntax of a general database, since it searches for a specific string pattern at a specific time using "reverse index". The keyword index can be used for variable strings, varchar and text type. It is important that the search terms and the search target terms must match exactly. Machbase does not perform morphological analysis and uses keywords based on special characters.

results matching ""

    No results matching ""