Sunday, February 19, 2012

help me get rid of the noise words please!

Hi,
I am a programmer with no SQL Server DBA experience, and I have set up a
full-text index on two columns in the same table. The index works fine
except when a noise word is used in a search. My problem is that I have been
unable to remove the noise words. I have searched the drive and deleted
everything from all of the noise.enu files, so they are now zero length. I
then went to a command prompt and typed "net stop mssearch" followed by "net
start mssearch". After restarting mssearch, I repopulated my indexes by
going to the table that has the indexes and running the edit full-text
indexing wizard. I'm sure there is a better way, but on completion the
wizard drops the index and rebuilds it. I also have a scheduled repopulation
that runs every morning. When I open this job it has the following code:
use [central] exec sp_fulltext_table N'[dbo].[Product]', N'start_full'
This has been running successfully for quite a while.
What am I doing incorrectly that I can't get rid of the noise words?
Your help would be greatly appreciated!
[code generated from ASP page]
select * from Product
where contains(Desc1, 'dvd AND r') or contains(Desc2, 'dvd AND r')
[System Summary]
Item                       Value
OS Name                    Microsoft Windows 2000 Advanced Server
Version                    5.0.2195 Service Pack 4 Build 2195
OS Manufacturer            Microsoft Corporation
System Manufacturer        Hewlett-Packard
System Model               HP NetServer
System Type                X86-based PC
Processor                  x86 Family 6 Model 8 Stepping 10 GenuineIntel ~933 Mhz
BIOS Version               10/15/01
Windows Directory          C:\WINNT
System Directory           C:\WINNT\system32
Boot Device                \Device\Harddisk0\Partition2
Locale                     United States
Time Zone                  Eastern Standard Time
Total Physical Memory      1,310,188 KB
Available Physical Memory  30,864 KB
Total Virtual Memory       4,435,520 KB
Available Virtual Memory   1,987,148 KB
Page File Space            3,125,332 KB
Page File                  C:\pagefile.sys
You have several options:
1) Use a FREETEXT query, but this may return results that are too fuzzy for
you.
2) Strip the noise words out with a client-side script - search this
newsgroup for searchpage1.htm.
3) Parse them out with T-SQL on the server, possibly in a UDF.
4) Write an extended stored procedure to parse them out.
5) I'm wondering whether you edited the correct noise file. noise.eng is for
British English; noise.enu is for American English. Change the noise word
lists you find in
C:\Program Files\Microsoft SQL Server\MSSQL\FTDATA\SQLServer\Config
You will have to stop and start the mssearch service for these changes to
take effect.
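
As a rough illustration of option 3, the sketch below strips noise words from a space-separated search string and joins what is left with AND. It is only a sketch under stated assumptions: the table dbo.NoiseWord and the function dbo.fn_StripNoiseWords are hypothetical names made up for this example, the noise list is whatever you choose to load into that table, and the code sticks to constructs available in SQL Server 2000 (no string-splitting built-ins).

```sql
-- Hypothetical lookup table of words to strip; SQL Server does not create this.
CREATE TABLE dbo.NoiseWord (Word varchar(64) NOT NULL PRIMARY KEY)
GO
-- Sample rows only; load whatever list matches your noise file.
INSERT INTO dbo.NoiseWord (Word) VALUES ('r')
INSERT INTO dbo.NoiseWord (Word) VALUES ('the')
GO
-- Hypothetical helper: returns the remaining terms as "term1" AND "term2" ...
CREATE FUNCTION dbo.fn_StripNoiseWords (@terms varchar(1000))
RETURNS varchar(2000)
AS
BEGIN
    DECLARE @result varchar(2000), @word varchar(1000), @pos int
    SET @result = ''
    -- Drop embedded double quotes so each term can be safely re-quoted below.
    SET @terms = LTRIM(RTRIM(REPLACE(@terms, '"', ''))) + ' '
    SET @pos = CHARINDEX(' ', @terms)
    WHILE @pos > 0
    BEGIN
        SET @word = LEFT(@terms, @pos - 1)
        -- Keep the word only if it is non-empty and not in the noise list.
        IF @word <> '' AND NOT EXISTS (SELECT * FROM dbo.NoiseWord WHERE Word = @word)
        BEGIN
            IF @result <> '' SET @result = @result + ' AND '
            SET @result = @result + '"' + @word + '"'
        END
        -- Move past the word just handled and look for the next separator.
        SET @terms = SUBSTRING(@terms, @pos + 1, 1000)
        SET @pos = CHARINDEX(' ', @terms)
    END
    RETURN @result
END
GO
-- Usage: with the sample rows above, 'dvd r' is reduced to "dvd".
DECLARE @search varchar(2000)
SET @search = dbo.fn_StripNoiseWords('dvd r')
IF @search <> ''   -- CONTAINS rejects an empty search condition
    SELECT * FROM Product
    WHERE CONTAINS(Desc1, @search) OR CONTAINS(Desc2, @search)
```

Note that this only avoids the noise-word error by dropping the term; it does not make a word such as "r" searchable.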
"woodysapsucker" <woody@.rohland.org> wrote in message
news:OMbPJgoEEHA.3344@.tk2msftngp13.phx.gbl...
Thanks Hilary,
I am still trying to use the noise words in my queries. I would like the
query to search on "dvd r", not just "dvd". That is why I've been trying to
empty out the noise word files.
I am in the US, but I also emptied the noise.eng files, just in case, and
reran the indexing.
Any other suggestions?
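
For reference, the rebuild and repopulation described in this thread can probably be run from Query Analyzer instead of the wizard. The following is a minimal sketch, assuming the full-text catalog is named ProductCatalog; that name is a placeholder, since the thread never states the real one. It uses the SQL Server 2000 system procedures sp_help_fulltext_tables and sp_fulltext_catalog and the FULLTEXTCATALOGPROPERTY function.

```sql
USE [central]
GO
-- List the tables registered for full-text indexing and the catalog each uses.
EXEC sp_help_fulltext_tables
GO
-- 'ProductCatalog' is a placeholder; substitute the catalog name reported above.
-- 'rebuild' deletes and recreates the catalog files; it does not repopulate them.
EXEC sp_fulltext_catalog N'ProductCatalog', N'rebuild'
-- Start a full population so the index is rebuilt with the current noise files.
EXEC sp_fulltext_catalog N'ProductCatalog', N'start_full'
GO
-- 0 means idle (population finished); 1 means a full population is in progress.
SELECT FULLTEXTCATALOGPROPERTY(N'ProductCatalog', 'PopulateStatus') AS PopulateStatus
```

Population runs asynchronously, so the status query may need to be repeated until it returns 0 before the search is retested.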
"Hilary Cotter" <hilaryk@.att.net> wrote in message
news:eYFPX1oEEHA.688@.tk2msftngp13.phx.gbl...
