I’m not certain what DDG actually uses (wasn’t it Bing?), but in my experience from the last couple of months it ignores quotes substantially more eagerly than Google does. For this particular term, a little bit of domain knowledge helps: even without quotes, brst becchi, brst formalism, brst quantization or perhaps bv brst will get you reasonable results. (I could swear Google corrected brst quantization to best quantization a year ago, but apparently not anymore.) Searching for stuff in the context of BRST is still somewhat unpleasant, though.
I... don’t think anything particularly surprising is happening here, except for quotes being apparently ignored? I’ve had it explained to me that a rare word is essentially indistinguishable from a popular misspelling by NLP techniques as they currently exist, except by feeding the machine a massive dictionary (and perhaps not even then). BRST is a thing that you essentially can’t even define satisfactorily without at the very least four years of university-level physics (going by the conventional broad approach—the most direct possible road can of course be shorter if not necessarily more illuminating). “Best” is a very popular word both generally and in searches, and the R key is next to E on a Latin keyboard. If you are a perfect probabilistic reasoner with only these facts for context (and especially if you ignore case), I can very well believe that your best possible course of action is to assume a typo.
How to permit overriding that decision (and indeed how to recognize you’ve actually made one worth worrying about without massive human input—e.g. Russian adjectives can have more than 20 distinct forms, can be made up on the spot by following productive word-formation processes, and you don’t want to learn all of the world’s languages!) is simply a very difficult problem for what is probably a marginal benefit in the grand scheme of things.
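To make the "perfect probabilistic reasoner" argument concrete, here is a toy noisy-channel sketch. Everything in it is an assumption made up for illustration: the word counts, the typo rate, the tiny adjacent-key map, and the function names. It says nothing about how any real search engine works; it only shows why "assume a typo" wins when a very common word sits one key away from a very rare one.

```python
# Toy noisy-channel spelling model. All numbers below are invented.
WORD_COUNTS = {"best": 1_000_000, "brst": 50}  # hypothetical corpus frequencies
TYPO_RATE = 0.01  # assumed chance of a single adjacent-key slip in a word

# Partial map of neighbouring keys on a QWERTY layout (only what we need here).
ADJACENT = {"e": "wrsd", "r": "etdf"}


def one_key_slip(intended, typed):
    """True if `typed` is `intended` with exactly one adjacent-key substitution."""
    if len(intended) != len(typed):
        return False
    diffs = [(a, b) for a, b in zip(intended, typed) if a != b]
    return len(diffs) == 1 and diffs[0][1] in ADJACENT.get(diffs[0][0], "")


def likelihood(typed, intended):
    """P(typed | intended) under the crude one-slip error model."""
    if typed == intended:
        return 1.0 - TYPO_RATE
    if one_key_slip(intended, typed):
        return TYPO_RATE
    return 0.0


def most_likely_intent(typed):
    """Pick the word maximizing prior(word) * P(typed | word)."""
    total = sum(WORD_COUNTS.values())
    scores = {
        word: (count / total) * likelihood(typed, word)
        for word, count in WORD_COUNTS.items()
    }
    return max(scores, key=scores.get)


print(most_likely_intent("brst"))  # prints: best
```

With these made-up numbers the prior for "best" is so large that even a 1% slip rate outweighs the exact match on "brst". Honouring quotes would amount to bypassing `most_likely_intent` entirely for the quoted term.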
In English, maybe; in Russian, I frequently find myself reaching for the nonexistent “morphology but not synonyms” operator (as the same noun phrase can take a different form depending on whether it is the subject or the object of a verb, or even on which verb it is the object of); even German should have the same problem AFAIU, if a bit milder. I don’t dare think about how speakers of agglutinative languages (Finnish, Turkish, Malayalam) suffer.
(DDG docs do say it supports +... and even +"...", but I can’t seem to get them to do what I want.)
Ah, OK. I don’t know anything about Russian. This is a hard problem. I think the solution is something like what you suggest: more operators allowing different transformations. Even in English, I would like a "you may pluralize but nothing else" operator.
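For what it's worth, a "you may pluralize but nothing else" operator could be sketched roughly like this. The function names are hypothetical and the pluralization rules are deliberately naive (no irregular plurals); the point is only that the expansion is a small, closed set of variants rather than an open-ended "did you mean".

```python
def plural_variants(term):
    """Return the term plus one naively formed English plural."""
    if term.endswith(("s", "x", "z", "ch", "sh")):
        plural = term + "es"
    elif term.endswith("y") and len(term) > 1 and term[-2] not in "aeiou":
        plural = term[:-1] + "ies"
    else:
        plural = term + "s"
    return {term, plural}


def matches(query_term, document_words):
    """True if the document contains the term or its plural, and nothing looser."""
    wanted = plural_variants(query_term.lower())
    return bool(wanted & {word.lower() for word in document_words})


print(matches("box", ["Two", "boxes"]))           # prints: True
print(matches("brst", ["best", "quantization"]))  # prints: False
```

The second call is the interesting one: under this operator "brst" can never silently turn into "best", because the only permitted rewrite is adding a plural ending.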
Well, it’s not that alien: it (along with the other East Slavic languages, Ukrainian and Belarusian) is mostly a run-of-the-mill European language (unlike Finnish, Estonian or Hungarian), except that it didn’t lose the Indo-European noun case system as most did, but instead developed even more cases. That is, where English or French would differentiate the roles of different arguments of a verb by prepositions or implicitly by position, Russian (like German and Latin) has a special axis of noun forms called “case” which it uses for that (and also prepositions, which now require a certain case as well; a noun form can’t not have a case, just as it can’t not have a number).
There are six of them (nominative [subject], genitive [belonging, part, absence, “of”], dative [indirect object, recipient, “to”], accusative [direct object], instrumental [device, means, “by”], prepositional [what the hell even is this]), so you have (cases) × (numbers) = 6 × 2 = 12 noun forms. Adjectives agree with their noun in case, number and gender, but (unlike in Romance languages) plurals don’t have gender, so you have (cases) × (numbers and genders) = 6 × (3 + 1) = 24 adjective forms.
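The same count, spelled out as a quick enumeration in case the arithmetic is easier to follow that way. This is only a sketch of the paradigm grid as described above; the variable names are mine, and it counts combinations of grammatical categories, nothing more.

```python
from itertools import product

CASES = ["nominative", "genitive", "dative",
         "accusative", "instrumental", "prepositional"]
NUMBERS = ["singular", "plural"]
GENDERS = ["masculine", "feminine", "neuter"]

# Nouns: every case combined with every number.
noun_slots = list(product(CASES, NUMBERS))

# Adjectives: gender is only distinguished in the singular,
# so there are 3 singular columns plus 1 genderless plural column.
adjective_columns = [("singular", g) for g in GENDERS] + [("plural", None)]
adjective_slots = list(product(CASES, adjective_columns))

print(len(noun_slots), len(adjective_slots))  # prints: 12 24
```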
None of this would be particularly problematic, except these forms work like French or Spanish verbs: they are fusional (case, number and gender are all expressed by a single fused ending, not by separate orthogonal ones) and highly convoluted, with a lot of irregularities. And nouns and adjectives are usually more important for a web search than verbs.
I just dislike hitting these margins so much.