The Inside Track on Firefox Development.
June 21, 2006
SpellCheck
I was just now talking to Brett, who's been working on Spell Check for the past few weeks. One of the difficulties Firefox 2.0 faces is that there's no compatibly licensed pan-language dictionary. We're using MySpell now, which is LGPL for English only, I think. From what I'm told, we'd really like to use ASpell, because their dictionaries are way better, but because they are GPL, we can't. IANAL, so I can't give all the reasoning behind any of this.
What this means anyway is that we have a dictionary for English now that marks "Firefox" as a misspelling, and our experience for users in other languages is that they have to manually go and download a dictionary. That sucks.
It'd be great if ASpell were also available under a compatible license, or if some other compatibly licensed dictionary of similar quality were to emerge.
Posted by ben at June 21, 2006 10:04 AM
Comments
One problem I had with Firefox 2.0's spellcheck is that if you have a really large textbox (like, editing a large article on Wikipedia), Firefox will hang for a bit so it can underline stuff in red. I know there's probably no way to avoid this, but for the average Joe it's "crashing" and something should probably be done about it.
Posted by: Harrison at June 21, 2006 10:59 AM
There is hunspell, available under MPL/GPL/LGPL, which is regarded as one of the best spellcheckers.
http://hunspell.sourceforge.net/
See also https://bugzilla.mozilla.org/show_bug.cgi?id=319778
Posted by: marcoos at June 21, 2006 11:19 AM
I've seen the same problem as Harrison, and it's even worse on older systems.
If there are more than X words, could there be some resumable 'timeouts' to fake an async call?
As for the current dictionary, why not bundle a modified version that includes 'Firefox'? Surely under the LGPL we can do that ;).
I do agree, a tri-licensed dictionary would be great, but I guess that is the downfall of the tri-license.
Posted by: Jed at June 21, 2006 11:21 AM
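Jed's idea of resumable "timeouts" can be sketched as a generator that checks a fixed number of words per step and hands control back between steps. This is only an illustration of the chunking technique, not Firefox's actual spellcheck code; the function name `check_in_chunks`, the set-based dictionary, and the chunk size are all assumptions made for the example.

```python
from typing import Iterator, List, Set


def check_in_chunks(
    words: List[str], dictionary: Set[str], chunk_size: int = 500
) -> Iterator[List[str]]:
    """Yield the misspellings found in each successive chunk of words.

    Hypothetical helper: a real editor would schedule the next step from a
    timer or idle callback so the UI stays responsive between chunks.
    """
    for start in range(0, len(words), chunk_size):
        chunk = words[start:start + chunk_size]
        # Only this chunk is examined before control returns to the caller.
        yield [w for w in chunk if w.lower() not in dictionary]


# Toy data, made up for the example.
dictionary = {"the", "quick", "brown", "fox"}
words = "the quikc brown fox teh".split()

misspelled: List[str] = []
for batch in check_in_chunks(words, dictionary, chunk_size=2):
    # Between batches the application could redraw underlines or handle input.
    misspelled.extend(batch)
```

The total work is the same as checking everything at once; the difference is that no single step runs long enough to look like a hang.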
That's frustrating news about the ASpell license. What about the Mac version? Is there any chance of just interfacing into the native OSX dictionary for cross-application goodness? Or are we looking at ‘Password Manager vs. Keychain’-esque duplication?
Posted by: Ben Ward at June 21, 2006 11:24 AM
Since it's all open-source, what prevents us from adding Firefox to the English dictionary we ship?
Posted by: Jesse Ruderman at June 21, 2006 12:03 PM
IMHO, this is the problem with the tri-license. Everything *must* be compatible with *all three licenses.* Because they use the stock LGPL, GNOME and GTK have been chosen as a base by numerous other projects (Nokia 770, anyone?). I know Mozilla's not just going to suddenly switch, but asking every contributor to relicense every single piece of code we use under three separate licenses (including the MPL, which some in the free software community have problems with) seems difficult.
Posted by: LinkTiger at June 21, 2006 12:42 PM
Question.
Why not adopt the ASpell ones?
It's not like Firefox cannot co-exist with GPL code, the restriction there is that if someone wants to use Firefox under a license that is not GPL they cannot use the dictionaries.
Why is that such a problem?
Unless MoCo wants to spring off yet another project I don't see another alternative.
Posted by: Jed at June 21, 2006 2:01 PM
Do any of the commenters here have anything against hunspell (the spellchecker I linked above)? It's good, it's free, it supports more complex grammar and has the same license as Mozilla, so it's totally compatible...
Posted by: marcoos at June 21, 2006 2:25 PM
I'm using Thunderbird 1.5.0.4 and it has a spell checker, how does it handle this problem?
The biggest problem with TB's spellcheck is it doesn't suggest "a lot" for "alot" and "voilà" for "voila", leading to "viola, allot is happening" even when people check spelling :-)
BTW, I couldn't sign in here with my TypeKey identity. "The site you are signing into requests that you provide your email address. ..." I do so, I get "The site you're trying to comment on has not signed up for this feature. Please inform the site owner."
Posted by: skierpage at June 21, 2006 2:28 PM
Debian has a bunch of dictionaries for various languages (packages named aspell-<language code> and myspell-<language code>). Some of them do meet your licensing needs. Their copyright information is linked from the respective package pages.
Posted by: silence is foo at June 21, 2006 2:35 PM
Will there be a site which links to the multiple dictionaries for other languages? And is there any way of using Microsoft Office's dictionary? Someone better develop a Te Reo dictionary!
Posted by: Greg at June 21, 2006 2:40 PM
The Mozilla Suite/Seamonkey's had built-in spellchecking for a few years now, and I'm sure that dictionaries were included when they used to ship localised versions?
Otherwise, Seamonkey currently has a "Download More" languages option in the spellcheck window that takes you to http://dictionaries.mozdev.org/installation.html where it's a one-click install for your own language. With Firefox's incremental updates users would only need to download the dictionary once (unlike Seamonkey's all-in-one update that blows away your dictionary with each new version).
Posted by: JB at June 21, 2006 3:14 PM
Would it be possible to just hook into the system OS X one for the Mac version? I'd hate to have a separate dictionary for the web, which is the reason it was made a system thing.
Posted by: Francisco Tolmasky at June 21, 2006 6:19 PM
Jesse beat me to the question: what prevents us from adding Firefox in? Either contributing the patch upstream, or "forking" (though too insignificant to call it a fork).
Posted by: Robert Accettura at June 21, 2006 6:30 PM
Why is everybody ignoring the hunspell guy above? Sounds like an interesting alternative for me.
Posted by: Doh at June 21, 2006 8:52 PM
Firefox 3 will be able to use the Mac OS X dictionaries like Camino currently does (Camino trunk only this week; branch next week). Accessing the system dictionaries requires some bits of Cocoa and apparently Carbon apps (or Mozilla Carbon apps) don't like it when Cocoa is initialized inside them ;)
The Hunspell engine that marcoos pointed out seems very promising in terms of supporting a large percentage of the localized builds Mozilla.* apps ship and in the willingness of the developer to address size and perf issues that people have already run across. Hopefully there's truly serious consideration being given to that engine, rather than just leading the developer on....
Posted by: Smokey Ardisson at June 21, 2006 9:11 PM
The dictionary in 2.0a3 contains words like "hes" and "thats"; I'm sure these are incorrect, and I can't find them in other dictionaries. "a lot" is not suggested as a replacement for "alot".
Great feature though! 2.0 users are gonna love it.
Posted by: Tom B. at June 21, 2006 11:09 PM
RE: Hes and thats
adding the s to the end of the word implies that something belongs to that.
The ships deck. The deck belongs to the ship.
So hes is incorrect, I think it should become 'his', his shoes, shoes that belong to him. Or he has shoes (without the s)
monk.e.boy
Posted by: monk.e.boy at June 21, 2006 11:46 PM
My understanding from Axel was that we are switching over to hunspell. It looks like there needs to be a bit more coordination between the different people working on this problem. :-)
The dictionary licensing issue is one that we've had for years. A few clarifications:
- Licensing problems merely mean we can't ship the dictionaries bundled; it doesn't stop us offering them for separate download, which is what we've been doing for ages.
- Many dictionaries are GPLed. It's really hard to work out what that means, but we interpret it as "The will of the dictionary author is that the dictionary should only be used in GPLed works". Firefox doesn't ship under the GPL, so we can't ship such dictionaries.
- Some dictionaries are under the LGPL. We could ship these legally, although it would be a change in mozilla.org policy, which up to now has been that we only ship things in the default build which are compatible with all three of our licences.
- There has been an effort to get some dictionaries relicensed, with varying success. Here's a list of currently known dictionaries. If anyone has something to add, please do.
If anyone has any more dictionary licensing questions, feel free to email me.
Posted by: Gerv at June 22, 2006 12:28 AM
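Gerv's clarifications above amount to a small decision rule, which can be written out as a sketch. This is a simplification for illustration, not legal advice and not mozilla.org's actual process: the `Dictionary` record, the `compatible_with` field (standing in for a human reviewer's judgment), and the example entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet

# The three licenses the default build ships under, per the policy described above.
PROJECT_LICENSES: FrozenSet[str] = frozenset({"MPL", "GPL", "LGPL"})


@dataclass(frozen=True)
class Dictionary:
    name: str
    # Licenses a reviewer has judged this dictionary compatible with (assumed input).
    compatible_with: FrozenSet[str]


def distribution(d: Dictionary) -> str:
    """Return how a dictionary may be offered under the stated policy."""
    if PROJECT_LICENSES <= d.compatible_with:
        return "bundle"
    # Not compatible with all three: it can still be offered separately.
    return "separate download"


# Made-up entries, not real dictionaries.
gpl_only = Dictionary("example-gpl", frozenset({"GPL"}))
lgpl = Dictionary("example-lgpl", frozenset({"GPL", "LGPL"}))
tri = Dictionary("example-tri", frozenset({"MPL", "GPL", "LGPL"}))

results = {d.name: distribution(d) for d in (gpl_only, lgpl, tri)}
```

Under this rule the LGPL example lands in "separate download" purely because of the all-three policy, which is the case Gerv notes could legally go the other way if the policy changed.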
So, you believe it normal that Aspell and probably Linux itself should change its licensing to accommodate a money-making organization making millions that it doesn't return to developers?
Get real! The geek world doesn't revolve around your HQ building.
Posted by: mofo at June 22, 2006 4:21 AM
monk.e.boy
That's the first I've heard of that rule, but I am not an expert grammarian.
I'd write > or > or > because, for possessive, you use an 'apostrophe s' most of the time. You can use an s without an apostrophe when you are NOT referring to the possessive (when you don't mean, "the deck belongs to the ship".
> or >: here I am using two ways of writing the plural of ship deck (as opposed to when we want to refer to a particular deck on a ship, or a particular ship's deck). I also illustrate the difference in using a descriptive plural: the captains table -- referring to a type of table, and the possessive -- the captain's table: the particular table belonging to the captain of the ship in question.
As to the "hes and shes" that appear in the dictionary: I have no idea. I take it one is the plural of he and one is the plural of she, where he and she are used as nouns, but I know of no particular idioms that are not utter slang where they could be used in the way I used captain in the above explanation. The use of he and she as slangy nouns or adjectives must be rare in proper speech, maybe it's like a more slangy "his and hers".
Posted by: pm martin at June 22, 2006 7:43 AM
sorry, that post is incomprehensible due to the blog's software destroying things in double angle brackets, my preferred way of quotation. BOO! . i shouldn't have posted without previewing I guess. Oh yea, and the back button doesn't work :mad:. Stupid computers.
Sorry for the spam, and Use your imagination.
--
... I'd write the ship's deck or the ship's decks or the ships' decks for possessive ...
--
... "the man ordered three ships deck from the pirate catalogue on the captain's chair", "the man ordered three ship decks from the pirate catalogue on the captains chair". Here I am using two ways of writing the plural of ship deck ...
Posted by: pm martin at June 22, 2006 7:54 AM
ASpell is great. I use it all the time in Opera. :)
Posted by: d3bruts1d at June 22, 2006 11:49 AM
Quoting Kevin Atkinson
http://www.mail-archive.com/aspell-user@gnu.org/msg00773.html
--snip--
Aspell is under the LGPL as it clearly says in the manual.
Each dictionary is under its own copyright. The English dictionary is under a much weaker copyright than the LGPL. Some others are LGPL while many are GPL. I do not control the copyright of the dictionaries and accept anything which meets the FSF definition of "free".
--snap--
So, where is the problem?
Posted by: AC at June 23, 2006 5:08 AM
So, you believe it normal that Aspell and probably Linux itself should change its licensing to accommodate a money-making organization
Why not? We changed ours, at great effort, to accommodate existing and potential GPL projects that wanted to use Mozilla code. But I don't think, in fact, that anyone is asking them to change, merely bemoaning the current tangled mess. No blame, it is what it is for historical reasons.
that it doesn't return to developers?
Have you personally contributed more than you think you've gotten in return? Most Mozilla developers seem OK with the way the Foundation and Corporation are handling this, though of course we're keeping a close eye on it.
Posted by: Daniel Veditz at June 23, 2006 2:02 PM
Why not use enchant?
It is under LGPL.
More information:
http://www.abisource.com/projects/enchant/
Posted by: david Flechl at June 24, 2006 2:53 PM
Aspell gives good results for English and for languages where phonetic tables have been created. A *big* con is that it does not support compound word rules as hunspell does. Hunspell is for many languages the only engine that is really fulfilling the needs. Furthermore Hunspell is the spell checker which is most actively developed atm.
Posted by: Bjoern at June 24, 2006 4:21 PM
long live the maple leaf!!!!!
Posted by: Alexei at June 26, 2006 10:19 PM
If the spelling was an extension would that make licensing easier? Or could differently-licensed dictionaries be added as extensions?
Posted by: Monkey at July 1, 2006 3:14 AM
Why does Firefox need its own spellchecker? Wouldn't it be better to use only an API like Enchant? Then Firefox could use whatever spellchecker is available via Enchant. Enchant supports the following backends:
Enchant is capable of having multiple backends loaded at once. Currently, Enchant has 6 backends:
* Aspell/Pspell (intends to replace Ispell)
* Ispell (old as sin, could be interpreted as a de facto standard)
* MySpell/Hunspell (an OOo project, also used by Mozilla)
* Uspell (primarily Yiddish, Hebrew, and Eastern European languages - hosted in AbiWord's CVS under the module "uspell")
* Hspell (Hebrew)
* AppleSpell (Mac OSX)
(look at http://www.abisource.com/projects/enchant/)
Enchant is currently licensed under the LGPL license
Posted by: oskar at July 2, 2006 3:57 AM
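The multi-backend arrangement oskar describes can be sketched as a broker that asks each loaded backend whether it has a dictionary for a language and uses the first one that does. The code below is a generic illustration of that pattern and deliberately does not use Enchant's real API; `Backend`, `WordListBackend`, `Broker`, and the word lists are hypothetical names invented for the example.

```python
from typing import Dict, List, Optional, Protocol, Set


class Backend(Protocol):
    """What the broker needs from any spellchecking engine (assumed interface)."""

    name: str

    def supports(self, lang: str) -> bool: ...

    def check(self, lang: str, word: str) -> bool: ...


class WordListBackend:
    """Toy backend backed by in-memory word sets, standing in for a real engine."""

    def __init__(self, name: str, wordlists: Dict[str, Set[str]]) -> None:
        self.name = name
        self._wordlists = wordlists

    def supports(self, lang: str) -> bool:
        return lang in self._wordlists

    def check(self, lang: str, word: str) -> bool:
        return word.lower() in self._wordlists[lang]


class Broker:
    """Holds several backends and routes each request to the first capable one."""

    def __init__(self, backends: List[Backend]) -> None:
        self._backends = backends

    def backend_for(self, lang: str) -> Optional[Backend]:
        for backend in self._backends:
            if backend.supports(lang):
                return backend
        return None

    def check(self, lang: str, word: str) -> bool:
        backend = self.backend_for(lang)
        if backend is None:
            raise LookupError(f"no backend provides a dictionary for {lang!r}")
        return backend.check(lang, word)


# Made-up backends and word lists for the example.
broker = Broker([
    WordListBackend("backend-a", {"en_US": {"hello", "world"}}),
    WordListBackend("backend-b", {"en_US": {"hello"}, "he_IL": {"shalom"}}),
])
```

With this setup `broker.check("en_US", "world")` is answered by the first backend, while `he_IL` requests fall through to the second; the application never needs to know which engine did the work.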
Does anyone have any idea why Firefox has started opening up a different size than when I closed it? For instance, I will have the size maximized, when I reopened it is shrunk down. I have Sage installed and I will put the little icon up next to my home button, when I reopen Firefox it's gone. I use Update Notifier. I place the update notifier icon next to the button for going to the FF home page, when I reopen FF, it's gone.
I used not to have this problem but now all of a sudden this has started happening. Any suggestions?? Thanks!!
Posted by: jon at July 21, 2006 5:51 AM
©1997-2006 Ben Goodger. All Rights Reserved.
Opinions expressed here are my own, and not those of any organization that I may be affiliated with.
