Monday, January 26, 2009

SharePoint Search bug

Scenario:
I was putting together a small demo of search scopes and wanted to create a new scope that restricts results to PDF files only.

So I enabled the File Extension property for Use in Scopes, created a new scope, and added a rule to include only File Extension = pdf. When I searched for a file I had uploaded two days earlier, I couldn't find it.

Bug :

Try 1:
a) Uploaded a file 'ITTP_SharePoint_Contents.pdf' ( underscores )
b) Search with Keyword : Sharepoint
c) File is not in the result set.

Try 2:
a) Uploaded a file 'ITTP SharePoint Contents.pdf' ( no underscores but spaces this time )
b) Search with Keyword : Sharepoint
c) File appears right at the top of the result set.

Try 3:
a) Uploaded a file 'ITTPSharePointContents.pdf' ( no spaces this time )
b) Search with Keyword : Sharepoint
c) File is not in the result set.

I tried all these options in Google and I got the file in all three cases, which is what I was expecting.

Solution:
I will certainly call it a bug, and I recommend that clients avoid underscores and use spaces in file names if they want to search by file name. I know each space is encoded as %20 in the URL, which means three characters per space, but that's the way it is.
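As a rough illustration of that recommendation, here is a minimal Python sketch that swaps underscores for spaces and shows what the renamed file looks like once the spaces are URL-encoded. The helper name `search_friendly_name` is made up for this example and is not part of SharePoint.

```python
from urllib.parse import quote


def search_friendly_name(file_name: str) -> str:
    """Replace underscores with spaces so each word stands alone in the name."""
    return file_name.replace("_", " ")


original = "ITTP_SharePoint_Contents.pdf"
renamed = search_friendly_name(original)

print(renamed)         # ITTP SharePoint Contents.pdf
print(quote(renamed))  # ITTP%20SharePoint%20Contents.pdf
```

Each space costs two extra characters in the encoded URL, which is the trade-off mentioned above.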

Update:
I got a chance to talk to my colleague Sahil about this, and he mentioned that this was done intentionally in SharePoint Search to improve performance. Also, since SharePoint search basically uses SQL Full Text, wildcard search is limited, as we have seen above.
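To picture why whole-word matching gives the three results above, here is a toy model in Python. It assumes the file name is split into words on spaces only and that the keyword must equal a whole word. That is an assumption made purely for illustration, not SharePoint's actual word breaker, and the names `tokenize` and `matches` are hypothetical.

```python
def tokenize(file_name: str) -> list[str]:
    """Split a file name into lowercase words on whitespace only (assumed behavior)."""
    stem = file_name.rsplit(".", 1)[0]  # drop the extension
    return stem.lower().split()


def matches(file_name: str, keyword: str) -> bool:
    """Whole-word match: the keyword must equal one of the words exactly."""
    return keyword.lower() in tokenize(file_name)


for name in (
    "ITTP_SharePoint_Contents.pdf",
    "ITTP SharePoint Contents.pdf",
    "ITTPSharePointContents.pdf",
):
    print(name, "->", matches(name, "Sharepoint"))
```

Under this assumption only the name with spaces produces a separate word "sharepoint", so only Try 2 is found.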

You can certainly try modifying the search web part, or write a custom search web part, so that it rewrites the query keyword,
i.e. SharePoint --->> %SharePoint%, to force a wildcard search.

That way you will get results in all three scenarios, but performance might become an issue.
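The keyword rewrite can be sketched in Python as follows. This is only a simulation of the idea, not web part code: `to_wildcard` and `like_match` are hypothetical helpers, and `%` is treated as "any run of characters" in the style of a SQL LIKE pattern.

```python
import re


def to_wildcard(keyword: str) -> str:
    """Wrap the keyword in % on both sides, e.g. SharePoint -> %SharePoint%."""
    return f"%{keyword.strip('%')}%"


def like_match(pattern: str, text: str) -> bool:
    """Case-insensitive match where % stands for any run of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, text, flags=re.IGNORECASE | re.DOTALL) is not None


pattern = to_wildcard("SharePoint")
for name in (
    "ITTP_SharePoint_Contents.pdf",
    "ITTP SharePoint Contents.pdf",
    "ITTPSharePointContents.pdf",
):
    print(name, "->", like_match(pattern, name))
```

All three names match here because the keyword only has to appear somewhere inside the name. The cost is that a pattern starting with a wildcard generally has to be checked against every name rather than looked up by word, which fits the performance concern.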

:-(

4 comments:

AMU,  April 11, 2012 at 6:05 AM  

Sandeep,

Good article, which explained why I'm having so many problems searching for files with underscores and spaces. This "feature" in SharePoint is so disappointing. I went to a SharePoint 2010 developer workshop given by Sahil Malik (excellent!!) but never got a chance to bring this up with him. I was wondering if you had heard anything else regarding any fixes for this problem.

Thanks,
Andy

AMU,  April 11, 2012 at 6:12 AM  

Sandeep,

Good article explaining this disappointing "feature" in SharePoint. I attended a SharePoint workshop at DevWeek 2012 given by Sahil Malik, but didn't get the chance to bring this up. Do you know if Microsoft has changed this functionality or plans to in the future?

Thanks,
Andy

Sandeep K Nahta April 11, 2012 at 6:18 AM  

Unfortunately, there is still no answer to this issue. One workaround you can surely try is to write a custom handler that stores the original name of the file in some hidden property or in the title, and renames such files when they are uploaded again :)

Anonymous,  March 8, 2013 at 11:42 AM  

I faced the same issue. I had multiple document libraries with all kinds of files in them. It appeared that advanced search on file name was not working for one particular document library only. After a lot of time spent on configuration and crawl settings, I realised that it was all because of the underscores in the names. I had underscores in the document library name (LIBRAY_1) as well as in the file name (ABC_1_2012), and search by file name was not working. Amazingly, everything worked just fine in another environment with the same data. Anyway, I replaced the underscores with spaces and it started working again.