Wednesday, May 25, 2011

Using the default property extractors in FAST and how they work with the search refinements

Property extraction in FAST Search Server 2010 for SharePoint is a process that extracts information from the textual content of a crawled item and stores it in a number of crawled properties. You then map these crawled properties to SharePoint managed properties and use them in your search: in the search refiners, in the search results, and even as input for sorting the results.

FAST has a number of standard extractors shipped with the product – all of these use a generic dictionary to recognize the different terms within a crawled document:

  • Companies – activated by default and present in the search refiners on the search result page
  • Locations – activated by default but not present in the search refiners on the search result page
  • Person names – not activated by default. You will need to modify OptionalProcessing.xml, which you can find in FASTSearch\etc\config_data\DocumentProcessor\ . Switch <processor name="personnameextraction" active="no"/> to <processor name="personnameextraction" active="yes"/>. Afterwards, reset the processor servers from the FAST command line with psctrl reset (for reference, take a look at the psctrl.exe reference on TechNet), and then run a full crawl.
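The switch from active="no" to active="yes" described above can also be scripted. The following is a minimal sketch in Python, assuming the file consists of a root element containing <processor> entries; the root element name optionalprocessing, the second sample processor and the helper name activate_processor are illustrative assumptions, not taken from the product. It works on a string so you can try it without touching a real installation.

```python
import xml.etree.ElementTree as ET


def activate_processor(xml_text, name="personnameextraction"):
    """Return xml_text with every <processor name=...> set to active="yes"."""
    root = ET.fromstring(xml_text)
    matches = [p for p in root.iter("processor") if p.get("name") == name]
    if not matches:
        raise ValueError("no <processor> element named %r found" % name)
    for processor in matches:
        processor.set("active", "yes")
    return ET.tostring(root, encoding="unicode")


# Assumed sample content: the root element name and the second
# processor entry are hypothetical and only illustrate the shape.
SAMPLE = """<optionalprocessing>
  <processor name="personnameextraction" active="no"/>
  <processor name="someotherprocessor" active="no"/>
</optionalprocessing>"""

updated = activate_processor(SAMPLE)
print(updated)
```

ElementTree drops XML comments when it re-serializes, so keep a backup of the original file if you adapt this to the real OptionalProcessing.xml. The processor server reset and the full crawl are still needed afterwards.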

Although these extractors use a generic dictionary, you still have some control over how they work by defining include and exclude lists of items in the FAST Search administration screens; this is nicely explained in the article Manage property extraction. You can of course also use PowerShell to fill these include/exclude lists, as explained in the blog post Property extraction in FS4SP.

If you want to use information from the person names and locations extractors in the search refinements, you will need to modify the refinement definitions and add the following.

For the people refiner:

<Category Title="People" Description="Use this filter to restrict results by people" Type="Microsoft.Office.Server.Search.WebControls.ManagedPropertyFilterGenerator" MetadataThreshold="1" NumberOfFiltersToDisplay="4" MaxNumberOfFilters="200" ShowMoreLink="True" MappedProperty="personnames" MoreLinkText="show more" LessLinkText="show fewer" ShowCounts="Count" />

For the location refiner:

<Category Title="Location" Description="Use this filter to restrict results by location" Type="Microsoft.Office.Server.Search.WebControls.ManagedPropertyFilterGenerator" MetadataThreshold="1" NumberOfFiltersToDisplay="4" MaxNumberOfFilters="20" ShowMoreLink="True" MappedProperty="locations" MoreLinkText="show more" LessLinkText="show fewer" ShowCounts="Count" />
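If you maintain the refinement definitions in a file, adding the two categories above can be sketched in Python as follows. This is only an illustration: the FilterCategories root element and the add_refiner helper are assumptions for the example, while the attribute values are copied from the two Category elements above.

```python
import xml.etree.ElementTree as ET

FILTER_TYPE = ("Microsoft.Office.Server.Search.WebControls."
               "ManagedPropertyFilterGenerator")


def add_refiner(xml_text, title, description, mapped_property, max_filters):
    """Append a Category for mapped_property unless one already exists."""
    root = ET.fromstring(xml_text)
    for category in root.iter("Category"):
        if category.get("MappedProperty") == mapped_property:
            return xml_text  # already configured, leave the input untouched
    ET.SubElement(root, "Category", {
        "Title": title,
        "Description": description,
        "Type": FILTER_TYPE,
        "MetadataThreshold": "1",
        "NumberOfFiltersToDisplay": "4",
        "MaxNumberOfFilters": str(max_filters),
        "ShowMoreLink": "True",
        "MappedProperty": mapped_property,
        "MoreLinkText": "show more",
        "LessLinkText": "show fewer",
        "ShowCounts": "Count",
    })
    return ET.tostring(root, encoding="unicode")


# Assumed starting point: an empty definition with a hypothetical root name.
config = "<FilterCategories></FilterCategories>"
config = add_refiner(config, "People",
                     "Use this filter to restrict results by people",
                     "personnames", 200)
config = add_refiner(config, "Location",
                     "Use this filter to restrict results by location",
                     "locations", 20)
print(config)
```

Checking MappedProperty before appending keeps the helper safe to run more than once against the same definition.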
