2/27/12 | 11:30:00 AM
Labels: search quality
This month we have many improvements to celebrate. With 40 changes reported, that marks a new record for our monthly series on search quality. Most of the updates rolled out earlier this month, and a handful are actually rolling out today and tomorrow. We continue to improve many of our systems, including related searches, sitelinks, autocomplete, UI elements, indexing, synonyms, SafeSearch and more. Each individual change is subtle and important, and over time they add up to a radically improved search engine.
Here’s the list for February:
- More coverage for related searches. [launch codename “Fuzhou”] This launch brings in a new data source to help generate the “Searches related to” section, increasing coverage significantly so the feature will appear for more queries. This section contains search queries that can help you refine what you’re searching for.
- Tweak to categorizer for expanded sitelinks. [launch codename “Snippy”, project codename “Megasitelinks”] This improvement adjusts a signal we use to try and identify duplicate snippets. We were applying a categorizer that wasn’t performing well for our expanded sitelinks, so we’ve stopped applying the categorizer in those cases. The result is more relevant sitelinks.
- Less duplication in expanded sitelinks. [launch codename “thanksgiving”, project codename “Megasitelinks”] We’ve adjusted signals to reduce duplication in the snippets for expanded sitelinks. Now we generate relevant snippets based more on the page content and less on the query.
- More consistent thumbnail sizes on results page. We’ve adjusted the thumbnail size for most image content appearing on the results page, providing a more consistent experience across result types, and also across mobile and tablet. The new sizes apply to rich snippet results for recipes and applications, movie posters, shopping results, book results, news results and more.
- More locally relevant predictions in YouTube. [project codename “Suggest”] We’ve improved the ranking for predictions in YouTube to provide more locally relevant queries. For example, for the query [lady gaga in ] performed on the US version of YouTube, we might predict [lady gaga in times square], but for the same search performed on the Indian version of YouTube, we might predict [lady gaga in India].
- More accurate detection of official pages. [launch codename “WRE”] We’ve made an adjustment to how we detect official pages to make more accurate identifications. The result is that many pages that were previously misidentified as official will no longer be.
- Refreshed per-URL country information. [Launch codename “longdew”, project codename “country-id data refresh”] We updated the country associations for URLs to use more recent data.
- Expand the size of our images index in Universal Search. [launch codename “terra”, project codename “Images Universal”] We launched a change to expand the corpus of results for which we show images in Universal Search. This is especially helpful to give more relevant images on a larger set of searches.
- Minor tuning of autocomplete policy algorithms. [project codename “Suggest”] We have a narrow set of policies for autocomplete for offensive and inappropriate terms. This improvement continues to refine the algorithms we use to implement these policies.
- “Site:” query update [launch codename “Semicolon”, project codename “Dice”] This change improves the ranking for queries using the “site:” operator by increasing the diversity of results.
- Improved detection for SafeSearch in Image Search. [launch codename "Michandro", project codename “SafeSearch”] This change improves our signals for detecting adult content in Image Search, aligning the signals more closely with the signals we use for our other search results.
- Interval based history tracking for indexing. [project codename “Intervals”] This improvement changes the signals we use in document tracking algorithms.
- Improvements to foreign language synonyms. [launch codename “floating context synonyms”, project codename “Synonyms”] This change applies an improvement we previously launched for English to all other languages. The net impact is that you’ll more often find relevant pages that include synonyms for your query terms.
- Disabling two old fresh query classifiers. [launch codename “Mango”, project codename “Freshness”] As search evolves and new signals and classifiers are applied to rank search results, sometimes old algorithms get outdated. This improvement disables two old classifiers related to query freshness.
- More organized search results for Google Korea. [launch codename “smoothieking”, project codename “Sokoban4”] This significant improvement to search in Korea better organizes the search results into sections for news, blogs and homepages.
- Fresher images. [launch codename “tumeric”] We’ve adjusted our signals for surfacing fresh images. Now we can more often surface fresh images when they appear on the web.
- Update to the Google bar. [project codename “Kennedy”] We continue to iterate in our efforts to deliver a beautifully simple experience across Google products, and as part of that this month we made further adjustments to the Google bar. The biggest change is that we’ve replaced the drop-down Google menu in the November redesign with a consistent and expanded set of links running across the top of the page.
- Adding three new languages to classifier related to error pages. [launch codename "PNI", project codename "Soft404"] We have signals designed to detect crypto 404 pages (also known as “soft 404s”), pages that return valid text to a browser but the text only contain error messages, such as “Page not found.” It’s rare that a user will be looking for such a page, so it’s important we be able to detect them. This change extends a particular classifier to Portuguese, Dutch and Italian.
- Improvements to travel-related searches. [launch codename “nesehorn”] We’ve made improvements to triggering for a variety of flight-related search queries. These changes improve the user experience for our Flight Search feature with users getting more accurate flight results.
- Data refresh for related searches signal. [launch codename “Chicago”, project codename “Related Search”] One of the many signals we look at to generate the “Searches related to” section is the queries users type in succession. If users very often search for [apple] right after [banana], that’s a sign the two might be related. This update refreshes the model we use to generate these refinements, leading to more relevant queries to try.
- International launch of shopping rich snippets. [project codename “rich snippets”] Shopping rich snippets help you more quickly identify which sites are likely to have the most relevant product for your needs, highlighting product prices, availability, ratings and review counts. This month we expanded shopping rich snippets globally (they were previously only available in the US, Japan and Germany).
- Improvements to Korean spelling. This launch improves spelling corrections when the user performs a Korean query in the wrong keyboard mode (also known as an "IME", or input method editor). Specifically, this change helps users who mistakenly enter Hangul queries in Latin mode or vice-versa.
- Improvements to freshness. [launch codename “iotfreshweb”, project codename “Freshness”] We’ve applied new signals which help us surface fresh content in our results even more quickly than before.
- Web History in 20 new countries. With Web History, you can browse and search over your search history and webpages you've visited. You will also get personalized search results that are more relevant to you, based on what you’ve searched for and which sites you’ve visited in the past. In order to deliver more relevant and personalized search results, we’ve launched Web History in Malaysia, Pakistan, Philippines, Morocco, Belarus, Kazakhstan, Estonia, Kuwait, Iraq, Sri Lanka, Tunisia, Nigeria, Lebanon, Luxembourg, Bosnia and Herzegowina, Azerbaijan, Jamaica, Trinidad and Tobago, Republic of Moldova, and Ghana. Web History is turned on only for people who have a Google Account and previously enabled Web History.
- Improved snippets for video channels. Some search results are links to channels with many different videos, whether on mtv.com, Hulu or YouTube. We’ve had a feature for a while now that displays snippets for these results including direct links to the videos in the channel, and this improvement increases quality and expands coverage of these rich “decorated” snippets. We’ve also made some improvements to our backends used to generate the snippets.
- Improvements to ranking for local search results. [launch codename “Venice”] This improvement improves the triggering of Local Universal results by relying more on the ranking of our main search results as a signal.
- Improvements to English spell correction. [launch codename “Kamehameha”] This change improves spelling correction quality in English, especially for rare queries, by making one of our scoring functions more accurate.
- Improvements to coverage of News Universal. [launch codename “final destination”] We’ve fixed a bug that caused News Universal results not to appear in cases when our testing indicates they’d be very useful.
- Consolidation of signals for spiking topics. [launch codename “news deserving score”, project codename “Freshness”] We use a number of signals to detect when a new topic is spiking in popularity. This change consolidates some of the signals so we can rely on signals we can compute in realtime, rather than signals that need to be processed offline. This eliminates redundancy in our systems and helps to ensure we can continue to detect spiking topics as quickly as possible.
- Better triggering for Turkish weather search feature. [launch codename “hava”] We’ve tuned the signals we use to decide when to present Turkish users with the weather search feature. The result is that we’re able to provide our users with the weather forecast right on the results page with more frequency and accuracy.
- Visual refresh to account settings page. We completed a visual refresh of the account settings page, making the page more consistent with the rest of our constantly evolving design.
- Panda update. This launch refreshes data in the Panda system, making it more accurate and more sensitive to recent changes on the web.
- Link evaluation. We often use characteristics of links to help us figure out the topic of a linked page. We have changed the way in which we evaluate links; in particular, we are turning off a method of link analysis that we used for several years. We often rearchitect or turn off parts of our scoring in order to keep our system maintainable, clean and understandable.
- SafeSearch update. We have updated how we deal with adult content, making it more accurate and robust. Now, irrelevant adult content is less likely to show up for many queries.
- Spam update. In the process of investigating some potential spam, we found and fixed some weaknesses in our spam protections.
- Improved local results. We launched a new system to find results from a user’s city more reliably. Now we’re better able to detect when both queries and documents are local to the user.
And here are a few more changes we’ve already blogged about separately:
- Flight Search on mobile
- Improved health searches
- Better related searches for images
- Upcoming concert dates