Tuesday, March 20, 2012

add words to FTS dictionary

we are medical company I need to a some words to the dictionary that SQL
containstable uses how can I do it? These are not noise words example the 2
letters SP is a word we use over and over also HiCNR
--
cindyYou don't add words to the SQL FTS "dictionary". You might want to replace
your noise word list with an empty space and then rebuild your catalog. SQL
FTS builds its own "dictionary" based on your content.
Hilary Cotter
Director of Text Mining and Database Strategy
RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
This posting is my own and doesn't necessarily represent RelevantNoise's
positions, strategies or opinions.
Looking for a SQL Server replication book?
http://www.nwsu.com/0974973602.html
Looking for a FAQ on Indexing Services/SQL FTS
http://www.indexserverfaq.com
"cindy" <cmello@.nospam.nospam> wrote in message
news:20D3663B-C12B-4282-9A9A-BFCFD5B6ABC3@.microsoft.com...
> we are medical company I need to a some words to the dictionary that SQL
> containstable uses how can I do it? These are not noise words example the
> 2
> letters SP is a word we use over and over also HiCNR
> --
> cindy|||for example SP or HiCNR are words in this industry. We have customized sql
to do a boolean search with containstable in stored procedure. A search for
"SP" and "HiCNR" returns correctly matching words in document content. Thes
e
documents are actually rows in a table where Sharepoint stores the content o
f
its document libraries. So will "Magnet" and "heat" for example return as
expected. The problem is with the use of AND NOT. For example "Magnet" and
not "heat" will not bring back content with magnet and without heat whereas
"Magnet" and not "SP" seems to have no effect in eliminating the "SP" word.
I am thinking if I could use a "custom" dictionary have SQL see "SP" as a
word. But then maybe I am not understand the technology. When FTS does a
return is it doing like a split string or REGEX word matching, just saying o
k
here is a word as defined by the word breaker ie there is a space on either
side of "SP" and it is not in the noise word list?
--
cindy
"Hilary Cotter" wrote:

> You don't add words to the SQL FTS "dictionary". You might want to replace
> your noise word list with an empty space and then rebuild your catalog. SQ
L
> FTS builds its own "dictionary" based on your content.
> --
> Hilary Cotter
> Director of Text Mining and Database Strategy
> RelevantNOISE.Com - Dedicated to mining blogs for business intelligence.
> This posting is my own and doesn't necessarily represent RelevantNoise's
> positions, strategies or opinions.
> Looking for a SQL Server replication book?
> http://www.nwsu.com/0974973602.html
> Looking for a FAQ on Indexing Services/SQL FTS
> http://www.indexserverfaq.com
>
> "cindy" <cmello@.nospam.nospam> wrote in message
> news:20D3663B-C12B-4282-9A9A-BFCFD5B6ABC3@.microsoft.com...
>
>|||Hello Cindy,
Based on my scope, SP shall be the same as other word and there is no
internal dictionary and it is built from word breaker.
I was not able to reproduce the issue. The following is the statements
===================================
create table documentTable(ftkey int not null, document ntext)
create unique index DTftkey_idx on documentTable(ftKey)
exec sp_fulltext_column 'documentTable', 'document', 'add'
exec sp_fulltext_table 'documentTable', 'start_change_tracking'
exec sp_fulltext_table 'documentTable', 'start_background_updateindex'
exec sp_fulltext_catalog 'myfulltext','start_full'
insert documenttable values (10, 'words letters SP Magnet')
insert documenttable values (11, 'HiCNR Magnet')
SELECT * FROM documenttable WHERE CONTAINS(document, 'Magnet AND NOT SP')
SELECT * FROM documenttable WHERE CONTAINS(document, 'Magnet')
====================================
I got only 1 record for the first select query.
Best Regards,
Peter Yang
MCSE2000/2003, MCSA, MCDBA
Microsoft Online Partner Support
When responding to posts, please "Reply to Group" via your newsreader so
that others may learn and benefit from your issue.
========================================
=============
This posting is provided "AS IS" with no warranties, and confers no rights.

No comments:

Post a Comment