All designers love ligatures, but vanilla web typography just sucks. I heard of SmartyPants, which solves many of those issues, bringing sexy quotes, gorgeous ampersands and all. It’s just perfect, visually.
But I’m worried about SEO. Let’s say a web page contains the word ﬁnally (set with the ﬁ ligature). Are search engines capable of indexing that word and returning the page when someone searches for finally (without the ligature)?
If your server dishes out pages with ligatures (like smartypants does), search engines are inconsistent. Bing currently doesn’t index the ligatures right. I’d say in general, it’s asking for trouble. Since search engines change, there’s a method below you can use to test how search engines you’re interested in index ligatures.
More detail on the first case (pages dished out with ligatures in the code)…
…it shouldn’t make any difference to a well-set-up search engine.
First it helps to understand the difference between glyphs and characters. A ligature ﬁ is one glyph that stands for two characters, f and i. How software treats it is up to that software and depends on context and the task at hand. You’ll see from examples in that linked question that when you copy and paste glyphs, what gets pasted will vary: sometimes the glyph itself is pasted, sometimes the glyph is treated as its associated characters and f and i are pasted.
Any well-made automatic text processor that is interested in text semantics (search engines, spell check, screen readers…) should treat a glyph as identical to the characters it stands for, and should treat ﬁnally as identical to finally, because that’s the textual meaning of the glyph.
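As a rough illustration of what "treating a glyph as the characters it stands for" can look like in practice, here is a minimal Python sketch using Unicode compatibility normalization (NFKC), which folds the ﬁ ligature code point (U+FB01) into a plain f and i. It is only one way a text processor might do this, not a description of how any particular search engine works, and the `fold_ligatures` helper name is made up for this example.

```python
import unicodedata


def fold_ligatures(text: str) -> str:
    """Fold compatibility characters (such as the fi ligature) into plain letters.

    Hypothetical helper for illustration. Note that NFKC also folds other
    compatibility forms, e.g. full-width letters and superscript digits.
    """
    return unicodedata.normalize("NFKC", text)


with_ligature = "\ufb01nally"  # "ﬁnally", starting with U+FB01 LATIN SMALL LIGATURE FI
plain = "finally"

print(with_ligature == plain)                  # False: the raw strings differ
print(fold_ligatures(with_ligature) == plain)  # True: they match once folded
```

An indexer that applies this kind of folding to both the page text and the query would return the same results for either spelling.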
Not everything is well-made…
Here’s an easy way to test search engines. Here’s a line of text from that other question:
Copy the ligature ﬁ from Illustrator to this input box
If we take the non-ligature version of that sentence and search for it in double quotes ("Copy the ligature fi from Illustrator to this input box"):
- …if a search engine treats ligature glyphs as matches for the characters they represent, it will find that page (and, when it’s indexed, this one)
- …if a search engine treats ligature glyphs as different to the characters they represent, it’ll find nothing until this page is indexed; then it’ll find only this page, and searches with the ligature version will find that page.
- …if a search engine freaks out completely at the sight of glyphs like ligatures, it’ll find nothing, not even this page, and searches with the ligature version will also find nothing.
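The three outcomes above can be sketched as a small decision function. This is only an illustration: the `classify_engine` name and its two boolean inputs are hypothetical, and you would fill them in by hand after running the two quoted searches (plain and ligature) in the engine you are testing.

```python
def classify_engine(plain_finds_ligature_page: bool,
                    ligature_finds_ligature_page: bool) -> str:
    """Map the results of the two quoted searches to the three behaviours above.

    plain_finds_ligature_page: the non-ligature query returns the page
        that contains the ligature version of the sentence.
    ligature_finds_ligature_page: the ligature query returns that same page.
    """
    if plain_finds_ligature_page:
        return "type 1: ligatures match the characters they represent"
    if ligature_finds_ligature_page:
        return "type 2: ligatures are indexed as different characters"
    return "type 3: text containing ligatures is not found at all"


# Example: the plain search misses the page, the ligature search finds it.
print(classify_engine(False, True))
```

A "type 3" result can also simply mean the page has not been indexed yet, so it is worth repeating the test later before drawing conclusions.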
Some quick results for the world’s top 5 search engines (links are to search results):
- Google: Good (type 1). (Despite the comment below, it copes fine with both Unicode and HTML entity formatting.)
- Bing: Fail (type 2).
- Yahoo: Fail (type 2) (turns out Yahoo is “Powered by Bing”)
- Yandex (Russian): Good (type 1)
- Baidu (Chinese): erm, no graphicdesign.stackexchange.com pages seem to appear in Baidu searches at all… maybe we’re banned there…?!