The SemLab semantic search platform is built on our Vicore™ data processing platform, which also underpins, among other products, our leading semantic news analysis solution ViewerPro™.
The purpose of semantic search is to improve on standard keyword-based search technologies by adding domain-dependent, a priori knowledge about the semantics of the concepts contained in the documents being analysed. This a priori knowledge is held in ontologies: formal descriptions of the concepts that exist in the application domain. In general, ontologies play a pivotal role in a semantically enabled technology and have a profound effect on its overall quality. For a search platform, the common quality measures precision and recall depend largely on the quality of the ontologies used.
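As a reminder of what these two measures express, here is a minimal sketch for a single query. The function name and the document IDs are hypothetical and serve only as an illustration; they are not part of the SemLab platform.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for the results of one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)  # relevant documents that were actually returned
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical document IDs, for illustration only.
p, r = precision_recall(
    retrieved=["d1", "d2", "d3", "d4"],
    relevant=["d1", "d3", "d5"],
)
```

Precision is the share of returned documents that are relevant; recall is the share of relevant documents that were returned. A richer ontology would typically be expected to raise recall, while a more precise one should help precision.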
SemLab’s semantic platform treats ontologies as the key to a successful semantic analysis system and therefore supports ontology maintenance as a core technology. This includes both the expression of expert knowledge and (supervised) automatic ontology learning from domain-specific documents. In addition, the SemLab platform fully supports existing base ontologies and offers its own default domain ontologies for many areas of financial information management, exactly as they are used every day by the ViewerPro semantic news analysis platform.
The SemLab semantic platform is in essence a data pipelining platform: data sources are fed in at the start of the line, and a series of relatively independent operations is then performed on the data. The results of these operations are stored in various databases.
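The pipelining idea can be sketched in a few lines. This is a generic illustration under our own assumptions, not the Vicore implementation: the stage functions `normalise` and `tokenise` are hypothetical, and a plain dictionary stands in for a database.

```python
from typing import Callable, Dict, Iterable, List

# A stage takes a record and returns an enriched copy of it.
Stage = Callable[[dict], dict]


def run_pipeline(documents: Iterable[dict], stages: List[Stage],
                 store: Dict[str, dict]) -> Dict[str, dict]:
    """Feed each document through the stages in order and store the result."""
    for doc in documents:
        record = dict(doc)
        for stage in stages:
            record = stage(record)
        store[record["id"]] = record  # stand-in for writing to a database
    return store


def normalise(record: dict) -> dict:
    """Hypothetical stage: lower-case the text."""
    return {**record, "text": record["text"].lower()}


def tokenise(record: dict) -> dict:
    """Hypothetical stage: split the text on whitespace."""
    return {**record, "tokens": record["text"].split()}


store = run_pipeline(
    [{"id": "n1", "text": "Bank raises RATES"}],
    [normalise, tokenise],
    {},
)
```

Because each stage only reads a record and returns a new one, stages stay relatively independent and can be added, removed or reordered without touching the others.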
In the case of semantic search, the data sources consist of text documents (e.g. reports, news messages, e-mails) in various formats, and the final result is a semantic index that models the occurrence of ontology terms in the documents. When a user presents the platform with a search query, the query is treated as if it were another document and is likewise expressed in terms of the ontology. Since both the document corpus and the user’s query are expressed within the same domain of discourse, they can be compared and ranked by similarity. Finally, using carefully designed threshold values, the resulting document sections are presented to the user in a GUI or made available to other applications through an API. In addition to retrieving documents, users can also query the domain ontology directly. In this way the semantic search platform also serves as a question answering tool, enabling users to quickly retrieve corporate domain knowledge.
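The matching step described above can be illustrated with a small sketch. Everything in it is an assumption made for the example: the toy ontology, the whitespace tokenisation, cosine similarity as the similarity measure and the threshold of 0.3 are not taken from the SemLab platform.

```python
import math
from collections import Counter

# Hypothetical toy ontology: concept -> surface terms that signal it.
ONTOLOGY = {
    "InterestRate": {"rate", "rates", "interest"},
    "CentralBank": {"ecb", "fed", "central"},
    "Merger": {"merger", "acquisition", "takeover"},
}


def to_concepts(text):
    """Express a text as counts of the ontology concepts it mentions."""
    counts = Counter()
    for token in text.lower().split():
        for concept, terms in ONTOLOGY.items():
            if token in terms:
                counts[concept] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two concept-count vectors."""
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(query, index, threshold=0.3):
    """Treat the query like a document, score it against the index, rank."""
    q = to_concepts(query)
    scored = [(doc_id, cosine(q, vec)) for doc_id, vec in index.items()]
    kept = [(doc_id, score) for doc_id, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Hypothetical corpus, for illustration only.
corpus = {
    "d1": "ECB keeps interest rates unchanged",
    "d2": "Takeover talks end without merger",
    "d3": "Fed signals rate cut",
}
index = {doc_id: to_concepts(text) for doc_id, text in corpus.items()}
results = search("central bank interest rate", index)
```

In this sketch the query shares no keyword with "Fed signals rate cut" other than "rate", yet the document still ranks highly because both map to the same concepts. The merger story scores zero and falls below the threshold.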