Thursday, January 24, 2008

Latent Semantic Indexing

Latent Semantic Indexing (LSI) is a process of extracting related words or information from your website content. It is a very interesting topic in Information Retrieval (IR) systems. The top search engines are said to work on the basis of latent semantic indexing.

Lexical indexing is based entirely on lexical analysis. Lexical analysis processes an input sequence of characters and produces as output a sequence of symbols called lexical tokens. A lexical analyzer is divided into two stages: the first is known as the scanner and the second as the evaluator. Latent Semantic Indexing depends on these two stages.

LSI-based search engine optimization is much more complex than normal search engine optimization. The search engine ranking for a particular website has to pass through several processes in LSI-based search engine optimization. These processes take into account the occurrence of a keyword in a document, its close relationship with the other words of the document, and the overall flavor of your website content.
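As a rough illustration of the scanner and evaluator stages mentioned above, here is a minimal Python sketch. The token types (NUMBER, WORD), the regular expression, and the function names are assumptions made for this example only; they are not taken from any particular search engine.

```python
import re

# Hypothetical lexeme pattern: runs of digits or runs of letters.
LEXEME_PATTERN = re.compile(r"\d+|[A-Za-z]+")

def scan(text):
    """Scanner stage: split the character sequence into raw lexemes."""
    return LEXEME_PATTERN.findall(text)

def evaluate(lexemes):
    """Evaluator stage: attach a type and a value to each lexeme."""
    tokens = []
    for lexeme in lexemes:
        if lexeme.isdigit():
            tokens.append(("NUMBER", int(lexeme)))
        else:
            tokens.append(("WORD", lexeme.lower()))
    return tokens

tokens = evaluate(scan("Top 10 search engines"))
print(tokens)
```

The scanner only decides where one lexeme ends and the next begins; the evaluator decides what each lexeme means (here, a number or a lower-cased word).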

If you search a database indexed with latent semantic indexing (LSI), the search engine looks at the similarity values it has calculated for every synonymous word and returns the website that best fits the query. This works because latent semantic indexing does not require exactly matching words to rank results.
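To make the "no exact match required" idea concrete, here is a small sketch in Python. It assumes the textbook form of LSI: build a term-document matrix, keep only its strongest "concept" directions (the top singular vectors), and compare the query with each document in that reduced space. The four-document corpus, the function names, and the choice of k = 2 are all made up for illustration, and the concept directions are approximated with a simple subspace iteration so that the example needs only the standard library. A real system would more likely call a numerical library's SVD routine, and no claim is made that any search engine works this way.

```python
import math
import random

# Hypothetical toy corpus: document 1 never uses the word "car".
DOCS = [
    "car engine wheel",
    "automobile engine wheel",
    "flower garden petal",
    "flower garden soil",
]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def term_vector(text, vocab):
    """Count how often each vocabulary term occurs in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in vocab]

def concept_basis(doc_vectors, k, iters=100, seed=0):
    """Approximate the top-k left singular vectors of the term-document
    matrix with orthogonal (subspace) iteration and Gram-Schmidt."""
    n_terms = len(doc_vectors[0])
    rng = random.Random(seed)
    basis = [[rng.gauss(0, 1) for _ in range(n_terms)] for _ in range(k)]
    for _ in range(iters):
        new_basis = []
        for vec in basis:
            # Multiply by A * A^T, where the columns of A are the documents.
            weights = [dot(doc, vec) for doc in doc_vectors]
            w = [sum(wt * doc[t] for wt, doc in zip(weights, doc_vectors))
                 for t in range(n_terms)]
            # Remove components along vectors already accepted this round.
            for u in new_basis:
                proj = dot(w, u)
                w = [wi - proj * ui for wi, ui in zip(w, u)]
            norm = math.sqrt(dot(w, w))
            new_basis.append([wi / norm for wi in w])
        basis = new_basis
    return basis

def to_concepts(vec, basis):
    """Project a term-count vector onto the concept directions."""
    return [dot(u, vec) for u in basis]

def cosine(a, b):
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom else 0.0

def lsi_scores(query, docs=DOCS, k=2):
    """Return one similarity score per document for the query."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    doc_vectors = [term_vector(d, vocab) for d in docs]
    basis = concept_basis(doc_vectors, k)
    q = to_concepts(term_vector(query, vocab), basis)
    return [cosine(q, to_concepts(d, basis)) for d in doc_vectors]

scores = lsi_scores("car")
for text, score in zip(DOCS, scores):
    print(f"{score:.2f}  {text}")
```

In this toy corpus the query "car" should score the "automobile engine wheel" document about as highly as the document that actually contains "car", while the two gardening documents score near zero. A plain word match would have found only the first document.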

For LSI-based search engine optimization we go through the following processes:

Categorization of the documents
Contextual Explanation from Lexically Similar Words
Conceptual Comparison
Cross-Lingual Text Analysis
Content Relationship Discovery
Document Summarization
Taxonomy Generation

Google Sandbox Effect

In the age of fair competition you may find it hard to believe that a search engine may hinder the appearance of a new website.

Yet this is what is currently believed to be happening to a growing number of websites today. Some programmers see Google as reluctant to rank newer websites until they have proven their viability by existing for more than "x" months. Thus the term "Sandbox Effect" refers to the idea that all new websites have their rankings placed in a holding tank until a time deemed appropriate, after which ranking can commence.

However, the website itself is not hindered as much as the links reciprocated by other sites. Newly created links are put on a "probationary" status until they pick up rank from more mature sites or are placed directly by an ad campaign. The idea behind the hindrance is to prevent a new website from ranking too quickly. The usual holding period seems to be between 90 and 120 days before a site starts obtaining rank from reciprocal links or backlinks.

Some advise asking the companies you plan to exchange links with to add your link to their website first. This may help grandfather your site in, reducing the waiting time associated with "new" websites. People have reported a PageRank of 0 when first signing up and a much stronger PageRank of 7 after 4 months. Why the delay? If people realized how easy it was to get a high ranking, would that take away from the credibility of the engine? It depends on whom you ask, but it does seem to be happening frequently to newer subscribers. Do not stop building backlinks; your rank will eventually appear.