9.13. 文字檢索函式及運算子
Table 9.40,Table 9.41andTable 9.42summarize the functions and operators that are provided for full text searching. SeeChapter 12for a detailed explanation ofPostgreSQL's text search facility.
Table 9.40. Text Search Operators
@@
boolean
tsvector
matchestsquery
?
to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat')
t
@@@
boolean
deprecated synonym for@@
to_tsvector('fat cats ate rats') @@@ to_tsquery('cat & rat')
t
`
`
tsvector
concatenatetsvector
s
`'a:1 b:2'::tsvector
'c:1 d:2 b:3'::tsvector`
'a':1 'b':2,5 'c':3 'd':4
&&
tsquery
ANDtsquery
s together
`'fat
rat'::tsquery && 'cat'::tsquery`
`( 'fat'
'rat' ) & 'cat'`
`
`
tsquery
ORtsquery
s together
`'fat
rat'::tsquery
'cat'::tsquery`
`( 'fat'
'rat' )
'cat'`
!!
tsquery
negate atsquery
!! 'cat'::tsquery
!'cat'
<->
tsquery
tsquery
followed bytsquery
to_tsquery('fat') <-> to_tsquery('rat')
'fat' <-> 'rat'
@>
boolean
tsquery
contains another ?
'cat'::tsquery @> 'cat & rat'::tsquery
f
<@
boolean
tsquery
is contained in ?
'cat'::tsquery <@ 'cat & rat'::tsquery
t
Note
Thetsquery
containment operators consider only the lexemes listed in the two queries, ignoring the combining operators.
In addition to the operators shown in the table, the ordinary B-tree comparison operators (=
,<
, etc) are defined for typestsvector
andtsquery
. These are not very useful for text searching but allow, for example, unique indexes to be built on columns of these types.
Table 9.41. Text Search Functions
array_to_tsvector(text[]
)
tsvector
convert array of lexemes totsvector
array_to_tsvector('{fat,cat,rat}'::text[])
'cat' 'fat' 'rat'
get_current_ts_config()
regconfig
get default text search configuration
get_current_ts_config()
english
length(tsvector
)
integer
number of lexemes intsvector
length('fat:2,4 cat:3 rat:5A'::tsvector)
3
numnode(tsquery
)
integer
number of lexemes plus operators intsquery
`numnode('(fat & rat)
cat'::tsquery)`
5
plainto_tsquery([configregconfig
,]querytext
)
tsquery
producetsquery
ignoring punctuation
plainto_tsquery('english', 'The Fat Rats')
'fat' & 'rat'
phraseto_tsquery([configregconfig
,]querytext
)
tsquery
producetsquery
that searches for a phrase, ignoring punctuation
phraseto_tsquery('english', 'The Fat Rats')
'fat' <-> 'rat'
querytree(querytsquery
)
text
get indexable part of atsquery
querytree('foo & ! bar'::tsquery)
'foo'
setweight(vectortsvector
,weight"char"
)
tsvector
assignweight
_to each element ofvector
_
setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')
'cat':3A 'fat':2A,4A 'rat':5A
setweight(vectortsvector
,weight"char"
,lexemestext[]
)
tsvector
assignweight
_to elements ofvector
that are listed inlexemes
_
setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}')
'cat':3A 'fat':2,4 'rat':5A
strip(tsvector
)
tsvector
remove positions and weights fromtsvector
strip('fat:2,4 cat:3 rat:5A'::tsvector)
'cat' 'fat' 'rat'
to_tsquery([configregconfig
,]querytext
)
tsquery
normalize words and convert totsquery
to_tsquery('english', 'The & Fat & Rats')
'fat' & 'rat'
to_tsvector([configregconfig
,]documenttext
)
tsvector
reduce document text totsvector
to_tsvector('english', 'The Fat Rats')
'fat':2 'rat':3
to_tsvector([configregconfig
,]documentjson(b)
)
tsvector
reduce each string value in the document to atsvector
, and then concatentate those in document order to produce a singletsvector
to_tsvector('english', '{"a": "The Fat Rats"}'::json)
'fat':2 'rat':3
ts_delete(vectortsvector
,lexemetext
)
tsvector
remove givenlexeme
_fromvector
_
ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')
'cat':3 'rat':5A
ts_delete(vectortsvector
,lexemestext[]
)
tsvector
remove any occurrence of lexemes inlexemes
_fromvector
_
ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])
'cat':3
ts_filter(vectortsvector
,weights"char"[]
)
tsvector
select only elements with givenweights
_fromvector
_
ts_filter('fat:2,4 cat:3b rat:5A'::tsvector, '{a,b}')
'cat':3B 'rat':5A
ts_headline([configregconfig
,]documenttext
,querytsquery
[,optionstext
])
text
display a query match
ts_headline('x y z', 'z'::tsquery)
x y <b>z</b>
ts_headline([configregconfig
,]documentjson(b)
,querytsquery
[,optionstext
])
text
display a query match
ts_headline('{"a":"x y z"}'::json, 'z'::tsquery)
{"a":"x y <b>z</b>"}
ts_rank([weightsfloat4[]
,]vectortsvector
,querytsquery
[,normalizationinteger
])
float4
rank document for query
ts_rank(textsearch, query)
0.818
ts_rank_cd([weightsfloat4[]
,]vectortsvector
,querytsquery
[,normalizationinteger
])
float4
rank document for query using cover density
ts_rank_cd('{0.1, 0.2, 0.4, 1.0}', textsearch, query)
2.01317
ts_rewrite(querytsquery
,targettsquery
,substitutetsquery
)
tsquery
replacetarget
_withsubstitute
_within query
`ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo
bar'::tsquery)`
`'b' & ( 'foo'
'bar' )`
ts_rewrite(querytsquery
,selecttext
)
tsquery
replace using targets and substitutes from aSELECT
command
SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')
`'b' & ( 'foo'
'bar' )`
tsquery_phrase(query1tsquery
,query2tsquery
)
tsquery
make query that searches forquery1
_followed byquery2
_(same as<->
operator)
tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))
'fat' <-> 'cat'
tsquery_phrase(query1tsquery
,query2tsquery
,distanceinteger
)
tsquery
make query that searches forquery1
_followed byquery2
at distancedistance
_
tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)
'fat' <10> 'cat'
tsvector_to_array(tsvector
)
text[]
converttsvector
to array of lexemes
tsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector)
{cat,fat,rat}
tsvector_update_trigger()
trigger
trigger function for automatictsvector
column update
CREATE TRIGGER ... tsvector_update_trigger(tsvcol, 'pg_catalog.swedish', title, body)
tsvector_update_trigger_column()
trigger
trigger function for automatictsvector
column update
CREATE TRIGGER ... tsvector_update_trigger_column(tsvcol, configcol, title, body)
unnest(tsvector
, OUTlexemetext
, OUTpositionssmallint[]
, OUTweightstext
)
setof record
expand a tsvector to a set of rows
unnest('fat:2,4 cat:3 rat:5A'::tsvector)
(cat,{3},{D}) ...
Note
All the text search functions that accept an optionalregconfig
argument will use the configuration specified bydefault_text_search_configwhen that argument is omitted.
The functions inTable 9.42are listed separately because they are not usually used in everyday text searching operations. They are helpful for development and debugging of new text search configurations.
Table 9.42. Text Search Debugging Functions
ts_debug([configregconfig
,]documenttext
, OUTaliastext
, OUTdescriptiontext
, OUTtokentext
, OUTdictionariesregdictionary[]
, OUTdictionaryregdictionary
, OUTlexemestext[]
)
setof record
test a configuration
ts_debug('english', 'The Brightest supernovaes')
(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ...
ts_lexize(dictregdictionary
,tokentext
)
text[]
test a dictionary
ts_lexize('english_stem', 'stars')
{star}
ts_parse(parser_nametext
,documenttext
, OUTtokidinteger
, OUTtokentext
)
setof record
test a parser
ts_parse('default', 'foo - bar')
(1,foo) ...
ts_parse(parser_oidoid
,documenttext
, OUTtokidinteger
, OUTtokentext
)
setof record
test a parser
ts_parse(3722, 'foo - bar')
(1,foo) ...
ts_token_type(parser_nametext
, OUTtokidinteger
, OUTaliastext
, OUTdescriptiontext
)
setof record
get token types defined by parser
ts_token_type('default')
(1,asciiword,"Word, all ASCII") ...
ts_token_type(parser_oidoid
, OUTtokidinteger
, OUTaliastext
, OUTdescriptiontext
)
setof record
get token types defined by parser
ts_token_type(3722)
(1,asciiword,"Word, all ASCII") ...
ts_stat(sqlquerytext
, [weightstext
,] OUTwordtext
, OUTndocinteger
, OUTnentryinteger
)
setof record
get statistics of atsvector
column
ts_stat('SELECT vector from apod')
(foo,10,15) ...