Abstract: Lexical search, as deployed in most search systems, often fails to retrieve relevant entries and leaves users frustrated. It performs especially poorly as the data collection grows and users issue more complex queries.
In recent years, pre-trained transformer networks have led to a revolution in search: it is now possible to get 10x better search results with minimal effort, without requiring training or external information such as click rates.
One method that has proven particularly helpful is semantic search: documents and queries are mapped to a shared semantic vector space. This vector space does not just encode the words used in a document; it encodes the document's meaning. This makes it possible to bridge the lexical gap caused by, e.g., synonyms (United States and U.S.) or spelling variations and mistakes.
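As a rough illustration of the idea, the sketch below ranks documents by cosine similarity between a query vector and document vectors. The three-dimensional vectors are hand-made, hypothetical stand-ins for the embeddings a pre-trained model would produce, and the function names are chosen for this example only.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_search(query_vec, doc_vecs, top_k=2):
    """Return the top_k (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Assumption: toy vectors invented for illustration. A real system would
# obtain them from a pre-trained embedding model.
TOY_DOC_VECTORS = {
    "U.S. election results": [0.9, 0.1, 0.0],
    "American politics explained": [0.8, 0.3, 0.1],
    "Pasta recipes": [0.0, 0.2, 0.9],
}
# Assumption: stands in for the embedding of a query like "United States vote".
TOY_QUERY_VECTOR = [0.85, 0.2, 0.05]

results = semantic_search(TOY_QUERY_VECTOR, TOY_DOC_VECTORS)
for doc_id, score in results:
    print(f"{score:.3f}  {doc_id}")
```

Note that the query shares no exact term with "U.S. election results", yet that document can still rank highly because the comparison happens in vector space rather than on the words themselves.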
Semantic search is especially helpful for multilingual and cross-lingual search. Previously, you had to spend a lot of time tuning lexical search systems for each language individually so that they could handle synonyms, spelling variations, spelling mistakes, and so on. With semantic search, this becomes much simpler: within minutes, you get a system that works amazingly well on 100+ languages.
Join me for this talk to get an introduction to semantic search and to learn how it can simplify your search setup while delivering much better search results.
Bio: Nils Reimers is an expert on search relevance using pre-trained transformer networks. In 2018, he authored and open-sourced the sentence-transformers library, which is the most popular framework for building semantic search applications. Recently, he joined cohere.ai as Director of Machine Learning, where he leads the Search-as-a-Service team in developing new state-of-the-art neural search models and making them broadly accessible as API endpoints.
Director of Machine Learning | cohere.ai