Semantic Web: How the Internet Could Understand Content

The invention of the Web and its rapid global spread is a historically unique achievement. First introduced in 1990, the WWW has become humanity’s greatest source of information and interaction, and it continues to grow exponentially. From a few dozen websites in 1991, there are now almost two billion, hosting around 100 billion web documents, and the number of web documents doubles roughly every six months. The Web thus threatens to become a victim of its own success: who can keep track of so much information? Powerful search engines help to structure the Web’s informational space and make it manageable; without them, WWW users would be hopelessly overwhelmed. But even search engines only “see” part of the Web. The “Deep Web”, i.e. the part of the WWW that search engines cannot index, contains another unmanageable number of websites and documents. The same goes for the “Dark Web”, which can only be accessed through special anonymization software.

But even in the area that traditional search engines can “see”, they lack the ability to organize and structure search results. The big question is therefore how to process the information the Web provides us in such a way that everyone, with their own individual requirements, can take full advantage of it. What is needed is an “intelligent” Web that offers each user individual content and precisely the information that concerns them, a version of the Web often referred to as “Web 3.0” or the “Semantic Web”.
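The core idea behind the Semantic Web is to attach machine-readable meaning to content, typically as subject–predicate–object statements (RDF triples). The following minimal Python sketch illustrates the principle only; the facts and predicate names used here are hypothetical examples, not part of the article or any real vocabulary.

```python
# Illustrative sketch of the Semantic Web idea: knowledge stored as
# subject-predicate-object triples that a machine can query by meaning.
# All names below are made-up examples.

triples = [
    ("Berlin", "is_a", "City"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "is_a", "Country"),
]

def query(subject=None, predicate=None, obj=None):
    """Return all stored triples matching the given (partial) pattern."""
    return [t for t in triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)]

# With explicit semantics, the machine can answer a question about
# meaning rather than merely matching character strings:
print(query(predicate="capital_of"))  # [('Berlin', 'capital_of', 'Germany')]
```

Real Semantic Web stacks express the same idea with standardized formats (RDF, OWL) and query languages (SPARQL) instead of ad-hoc Python tuples.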

What intelligent technology is behind the term TCP/IP? How do videos get onto the Web? And why does the Internet appear to us as a whole, when it is made up of billions of different computers? Computer science professor Christoph Meinel examines this and more every three weeks in his behind-the-scenes look at the World Wide Web.
All episodes are available here: »Meinels Web Tutorial«

Why is it so difficult to create such a smart Web? Computers “see”, “hear” and “think” differently from humans, and this applies to all media: text, photos, music and video. They do not “understand” what texts, images and music mean. To a computer, all media are just specifically structured sequences of zeros and ones. It can recognize how words are made up of letters, count how often words appear in texts, and tell pixels in pictures apart, but it does not immediately grasp whether a picture shows a cat or a politician, whether a bit sequence representing sounds encodes a harmonious piece of music, or whether a binary-coded video sequence represents a cinematic masterpiece.

When we as humans are confronted with such media information, we rely on deep empirical and contextual knowledge that helps us grasp the (semantic) meaning of the information presented and classify it correctly. We can identify an advertisement on a newspaper page at a glance and distinguish it from a substantive article about the current state of the corona pandemic. We recognize politicians acting in words and pictures and can easily distinguish political news from travelogues or poetic musings, even when everything is presented to us only as text or images.
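The gap described above can be made concrete with a small sketch: a few lines of Python suffice to count words in a text, but nothing in that computation tells the machine what any word means. The sample sentence is an illustrative stand-in, not taken from the article.

```python
# Illustrative sketch: what a computer "sees" in text without semantics.
# It can split a character sequence into words and count them, but the
# counts carry no understanding of what the text is about.
from collections import Counter

text = "the web grows and the web changes"
words = text.split()
counts = Counter(words)

print(counts["web"])  # 2 -- the frequency is easily computable ...
# ... but nothing here tells the machine what "web" actually means.
```

This is exactly the kind of purely syntactic processing the paragraph refers to: statistics over symbols, without access to their meaning.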
