[Sigia-l] RE: Thunderstone Search Appliance [was: RE: Google Appliance]

Gabriel gabriel at graphnical.com
Sun Nov 6 22:12:42 EST 2005


Hello Denise,

Sorry for the late reply :)


>> A couple of quick questions about Thunderstone -- getting beyond the 
>> interface to the real underlying search architecture:


I will answer these to the best of my knowledge--however, I encourage you to
contact Thunderstone directly for any additional questions or
clarifications...they are quite knowledgeable :)

The project in which I used Thunderstone was concluded almost a year
ago...so, I may be a bit sketchy on the details.


>> can you build parametric indexes with Thunderstone and do fielded 
>> searching, or do you just have the same huge full text index that the 
>> Google appliance delivers?


Yes, you can build parametric indexes and do fielded searching, and
meta-searching as well.

>> are there separate loader and crawler programs which would allow me to
>> include metadata from formal repositories and a web content management
>> system? 

Not sure what you mean by 'include' metadata...are you referring to tagging
the pages? At the very least, I am sure there is some unique URI (or similar
key) that can be used to combine the search results with the tagging
(metadata) collection. That is, crawl the content and its metadata into two
separate collections, then add a link from each content result to a metadata
search that returns the correct metadata for that specific content. Also,
not sure what you mean by 'loader'. The crawler is separate, and the entire
process is pretty well modularized...
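
To make the two-collection idea above concrete, here is a minimal sketch in
plain Python. It is not Thunderstone's API; the function name, the field
names ("url", "title", "subject") and the sample records are all
hypothetical. It only shows how hits from a content collection might be
matched to records from a separate metadata collection through a shared key
such as the URL.

```python
# Hypothetical sketch, not Thunderstone's API: join content hits with
# records from a separate metadata collection using a shared key (the URL).

def attach_metadata(content_hits, metadata_records, key="url"):
    """Return copies of the content hits with a 'metadata' entry added."""
    # Index the metadata collection by the shared key for quick lookup.
    by_key = {record[key]: record for record in metadata_records}
    merged = []
    for hit in content_hits:
        combined = dict(hit)
        # None when the metadata collection has nothing for this key.
        combined["metadata"] = by_key.get(hit[key])
        merged.append(combined)
    return merged


# Made-up sample data for illustration only.
content_hits = [
    {"url": "http://example.com/a", "title": "Page A"},
    {"url": "http://example.com/b", "title": "Page B"},
]
metadata_records = [
    {"url": "http://example.com/a", "subject": "search", "owner": "library"},
]

results = attach_metadata(content_hits, metadata_records)
```

In a real deployment the lookup would more likely be a second search against
the metadata collection rather than an in-memory dictionary, but the join key
plays the same role.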

>> what access do you have to integrate classification schemes and thesauri
>> into the query transformation? 

My project did not require extensive custom classification or vocabulary
work. The 'content' index was built to handle that as an inherent feature,
so the two went hand in hand. However, I do believe it ships with default
'thesauri' which can be overloaded, and there is some type of category
management...
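
As a rough illustration of what putting a thesaurus into the query
transformation can mean, here is a small generic Python sketch. It does not
reflect how Thunderstone implements it; the function names, the thesaurus
contents and the boolean query syntax are assumptions made for the example.
Each query term is expanded with its listed synonyms before the query is
sent to the engine.

```python
# Hypothetical sketch of thesaurus-based query expansion. The thesaurus
# entries and the AND/OR query syntax are assumptions, not a real engine's.

def expand_query(query, thesaurus):
    """Return one list of alternatives (term plus synonyms) per query term."""
    expanded = []
    for term in query.lower().split():
        synonyms = [s for s in thesaurus.get(term, []) if s != term]
        expanded.append([term] + synonyms)
    return expanded


def to_boolean(expanded):
    """Render the alternatives as a boolean query string."""
    groups = []
    for alternatives in expanded:
        # Quote multi-word synonyms so they are treated as phrases.
        quoted = ['"%s"' % a if " " in a else a for a in alternatives]
        if len(quoted) > 1:
            groups.append("(" + " OR ".join(quoted) + ")")
        else:
            groups.append(quoted[0])
    return " AND ".join(groups)


# Made-up thesaurus for illustration only.
thesaurus = {"car": ["automobile", "motor vehicle"]}
query_string = to_boolean(expand_query("car repair", thesaurus))
```

A controlled vocabulary or classification scheme could be plugged in the
same way, by swapping the dictionary for a lookup against the scheme.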

>> can you define the relevancy algorithm (Google's is hard coded into the 
>> index architecture and you cannot change it) can you sort results and 
>> narrow your search based on parameters (relates to the parametric index 
>> question above)?

Everything that is included in their appliance is technically a separate
application (for the most part, see their site). I asked the same
question--the answer went sort of like this:  sure, you can tweak it. 

>> is it possible to build in a security classification sensitivity (ie will
>> the loader program and the matching algorithms allow me to designate who
>> can/cannot see which content)?

That's interesting...once again, my project did not require formal security
as it was meant to be completely public --- that said, looking back I see no
reason why that could not be accomplished...basically, you can provide the
box with credentials for certain collections --- since the content in those
collections should be protected, the index will contain links to content
requiring credentials to view...then it would be a question of securing the
protected collection search interface...unless you do not mind providing
summaries of the protected content... 
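
One way to picture securing the protected collection search interface is a
filter applied to results before they are shown. The Python below is a
generic sketch under stated assumptions, not a description of the appliance:
the collection names, group names and the access-control mapping are
hypothetical. Hits from a protected collection are dropped entirely for
users without access, so that no summaries leak.

```python
# Hypothetical sketch of filtering search results by collection-level
# access rules. Collection and group names are invented for illustration.

def visible_hits(hits, user_groups, collection_acl):
    """Keep only the hits whose collection the user is allowed to see."""
    visible = []
    for hit in hits:
        allowed_groups = collection_acl.get(hit["collection"])
        # A collection with no ACL entry is treated as public.
        if allowed_groups is None or user_groups & allowed_groups:
            visible.append(hit)
    return visible


# Made-up data for illustration only.
collection_acl = {"hr-docs": {"hr", "management"}}
hits = [
    {"collection": "public-site", "title": "Opening hours"},
    {"collection": "hr-docs", "title": "Salary bands"},
]

public_view = visible_hits(hits, set(), collection_acl)
hr_view = visible_hits(hits, {"hr"}, collection_acl)
```

Filtering by whole collection matches the per-collection credentials
described above; per-document rules would need the access information
stored with each indexed record.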


>> Thanks for considering my questions.   We're building an enterprise 
>> search and actually find that we cannot use some crawlers/full-text 
>> search engines because they do not support parametric search, do not 
>> respect security, and ignore our knowledge organization structures.

>> Best regards,
>> Denise


No problem :) I hope the above helps a bit...


>||;)

gabriel kent
www.futureprogress.net





