Amazon Alexa, Apple Siri, Facebook M, Google Assistant, and Microsoft Cortana: all the big companies have joined the intelligent personal assistant (or virtual assistant) race. Breakthrough? Toy? Yet another kind of interface for search engines? One more attempt to introduce a natural language interface to an even broader audience?
I say it's something in the middle. It is quite an endeavor to introduce semantic technologies, technologies that try to understand users, into everyday life. Will they succeed? There are serious doubts, because virtual assistants mostly work with standard applications and standard search areas, so they understand only predefined domains rather than everything. What about third-party applications, user-created data, and other areas of searching?
Each virtual assistant is optimized for the specific area its creator cares about (e.g., for phones, it's voice messages, emails, meetings, etc.). Theoretically, third-party applications can use them, too, through APIs, but the problem is that these interfaces are different for each assistant and are sometimes proprietary. Could you imagine if each browser worked with its own type of hypertext?
Well, to some degree we could, as browsers used to have their own specific HTML tags, but all competitors finally agreed that it is better to support standards. Why? Because it was a nightmare for developers whose sites looked different in each browser, for browser teams that were constantly asked why this or that page was broken, and, of course, for users, who were disappointed with deformed sites. The same can be expected for virtual assistants until the current situation changes.
Do virtual assistants add anything to what we can already do in a browser with search queries? Judging by the results, they do not create additional value for searches. Moreover, they inherit the problems of search engines. Advertisements and tutorials for intelligent personal assistants mostly focus on standard search areas like maps, directions, weather, stock exchange, entertainment, sports, restaurants, etc. Why? Because these are popular areas for searching and, in part, search engines are optimized for queries there. These areas are quite formalized, with standard fields and values. Therefore, we have a lot of sites that search other sites in exactly these domains. Is search ideal for them? No, because, for example, a "hotel room for two near SomeCity center new year" query won't lead you directly to reservations but gives the same results as a "SomeCity hotels" query (further manual filtering required).
20 Problems of Modern Computer Semantics
If an application is designed to accept natural language and return results from files and other applications, then, evidently, it needs an interface between natural language and applications. In what form? Is it possible at all to match applications, which are restricted by assumptions, with natural language, which is full of ambiguities? This is not the only problem. Others are deeply rooted in the computer industry itself and have historical reasons (though some of those reasons no longer apply). In part, our interaction with computers is too machine-centric, which can be explained by the restricted abilities of computers many years ago (but not now). Additionally, we are accustomed to the way we interact with information and don't want to change our habits.
That all being said, here are 20 problems with modern computer semantics. In the second part of this series, I'll discuss ways to tackle these problems.
Only one classification is used at a time. Classification is grouping things by similarity, but such grouping is arbitrary because we can have an almost infinite number of similarity criteria (for example, Galaxy -> Solar System -> Jupiter, Celestial bodies -> Gas giants -> Jupiter, or Astronomical objects -> Planets -> Jupiter).
Classification paths can be confused. We are forced to use file paths and GUI paths (chains of GUI controls and the actions on them), but because the choice is arbitrary, we can forget which similarity criterion was used before.
Classification, abstraction, specification, inclusiveness, and relevance are mixed. We can have a path like SolarSystem/planets/Jupiter/chemistry/reports, in which (a) the Solar System has planets, (b) planet is the class of Jupiter, (c) Jupiter relates to or has chemistry, and (d) reports is rather a summary (abstraction) of the directory content.
Duplicate classification occurs. When we copy a file from one place to another, its path (classification) is not always kept, so we have to re-classify it in the new place.
Identification is not meaningful. File names are sometimes encoded (like jup_meas_2015) or arbitrary (like New Document5.txt). This happens in particular because users are not motivated to give meaningful names.
Abstractions and specifications are misused or not used. Though this is rarely stressed, any name should express the aggregated meaning of the content. Because users are not motivated to maintain this, name and content drift out of sync.
Duplicate identification occurs. Very often, an internal identification (like a title in a text file) is duplicated by an external identification (the file name).
Scopes are usually static. A directory scope includes the files that are copied or created there, and those files are usually not present in other scopes (which have different similarity criteria that also match the given information).
Context is usually interpreted as the ability to resolve synonyms (like "my phone") or as a dependency on personal data. However, we can interpret it more widely, as disambiguation of whether a given word belongs to the current scope or not (which includes both synonyms and personal data as modifiers of a scope).
Semantics is not explicit (for users). For example, hypertext has semantic tags (like em instead of i), but even this tag does not explain why these words are emphasized. Another example is relevancy. It is frequently used now, but without knowing the relevancy criteria, it is hard to build on it. The fact that the Sun is relevant to nuclear fusion tells us little about the character of this relevancy. Moreover, semantics is not exposed to users, as modern semantic approaches assume that meaning will be extracted by intelligent agents (even though no algorithm is able to understand natural language fully and no algorithm is able to summarize content).
Users operate with computer entities. For example, we usually operate with a planet file, a planet database, or a planet window, though in real life we deal with just planets, not with a planet book or a planet model.
Meaning is enclosed in computer entities. Often, meaning is not exposed by the UI, binary data, or applications, and it is hard to link it between different applications. We can do this only through written instructions, data export, or APIs, but even so, meaning remains scattered (as we cut its links with other information).
Meaning is not considered for user-created data. For example, you can create a one-time report that's not worthy of writing an application about, but which is nonetheless important.
The granularity of meaning is often restricted by the boundaries of computer entities. However, generally speaking, some meaning scopes may include different parts of several files, and some files may include several meaning scopes. Thus, an article about Jupiter may include sections on atmosphere and exploration, which can be considered separately from the overall meaning scope of Jupiter.
There is no difference between global search and local search. A global search would be "Find me something anywhere" and a local search would be "Find me what I lost here."
There are no manual tools for disambiguation and other meaning handling.
There are no automatic tools for disambiguation and other meaning handling.
Mostly, only the top-down approach is used in semantics. It forces you to define domain entities and rules to the full extent, which is possible only with restrictions and assumptions. Any domain definition can be expanded almost infinitely. Say, if you have a "name origin" field in the planet table, then you may need to link the planet domain with mythology, then with folklore, then with geography, then with biology, then with zoology, etc.
Communication is not considered to be a part of semantics, though it is about information and information update exchange.
Natural language questions are not used widely. Virtual assistants could change this tendency, but users are not motivated to ask full questions because search engines return similar results for both <word> and What is <word> queries, so users may prefer the brief variant.
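Two of the problems above, the single classification path and the missing distinction between global and local search, can be made a bit more concrete with a small sketch. The code below is only an illustration under my own assumptions: the Item class, the classify and search helpers, and the "Reports"/"Chemistry" path are hypothetical names invented for this example, not part of any assistant or file system API. It lets one entity keep several classification paths at once, and treats a search scope as a path prefix, where an empty scope means "anywhere."

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Item:
    """A piece of information that may sit under several classification paths."""
    name: str
    paths: set[tuple[str, ...]] = field(default_factory=set)


def classify(item: Item, *path: str) -> None:
    """Record one more classification path without discarding earlier ones."""
    item.paths.add(tuple(path))


def search(items: list[Item], term: str, scope: tuple[str, ...] = ()) -> list[str]:
    """Return names containing `term`.

    An empty scope is a global search. With a non-empty scope, an item
    matches only if one of its paths starts with that scope (local search).
    """
    term = term.lower()
    return [
        item.name
        for item in items
        if term in item.name.lower()
        and (not scope or any(path[: len(scope)] == scope for path in item.paths))
    ]


# The three Jupiter paths come from the article; the report path is hypothetical.
jupiter = Item("Jupiter")
classify(jupiter, "Galaxy", "Solar System")
classify(jupiter, "Celestial bodies", "Gas giants")
classify(jupiter, "Astronomical objects", "Planets")

report = Item("Jupiter chemistry report")
classify(report, "Reports", "Chemistry")

items = [jupiter, report]

# Global search: "find me something anywhere".
print(search(items, "jupiter"))
# Local search: "find me what I lost here", limited to one scope.
print(search(items, "jupiter", scope=("Celestial bodies",)))
```

The global query returns both items, while the scoped query returns only Jupiter itself. Copying an item to a new place would not force a re-classification here, because the earlier paths are kept alongside the new one.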
Stay tuned for the second part of this series, in which we will discuss 20 tips for dealing with these problems.