Current crawling and search approaches are based on tags. A tag is a non-hierarchical keyword or expression assigned to a piece of information. When crawling is based on tags, the resulting metadata can include homonyms (the same tag used with different meanings) and synonyms (multiple tags for the same concept), which may lead to inappropriate connections between items and inefficient searches for information about a subject.
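The homonym and synonym problem described above can be illustrated with a minimal sketch. The tagged items, the concept labels, and the function names `find_homonyms` and `find_synonyms` are hypothetical and serve only to show how a flat tag index can mix meanings. They are not taken from the implemented system.

```python
from collections import defaultdict

# Hypothetical tagged items: (item id, tag, concept the author had in mind).
# In real crawled metadata the intended concept is not available; it is
# included here only to make the ambiguity visible.
TAGGED_ITEMS = [
    ("t1", "apple", "Apple Inc."),
    ("t2", "apple", "apple (fruit)"),
    ("t3", "nyc", "New York City"),
    ("t4", "bigapple", "New York City"),
]


def find_homonyms(items):
    """Return tags that are used for more than one concept."""
    concepts_by_tag = defaultdict(set)
    for _, tag, concept in items:
        concepts_by_tag[tag].add(concept)
    return {tag: sorted(c) for tag, c in concepts_by_tag.items() if len(c) > 1}


def find_synonyms(items):
    """Return concepts that are described by more than one tag."""
    tags_by_concept = defaultdict(set)
    for _, tag, concept in items:
        tags_by_concept[concept].add(tag)
    return {c: sorted(t) for c, t in tags_by_concept.items() if len(t) > 1}


if __name__ == "__main__":
    print(find_homonyms(TAGGED_ITEMS))
    print(find_synonyms(TAGGED_ITEMS))
```

A search for the tag `apple` would return both `t1` and `t2`, while a search for `nyc` would miss `t4`, which is the kind of inappropriate connection and inefficient search the paragraph refers to.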
Ontologies contain formal descriptions of the concepts and relationships that model a given domain. An ontology defines the knowledge and properties of a domain in a form that machines can read and interpret.
We apply ontological knowledge to the text of tweets and messages on a user's wall in order to classify them into different categories.
In contrast to conventional approaches to text categorization, which enhance known methods with ontological knowledge, we propose to use ontological knowledge directly for text categorization. The novelty of our method is that it does not rely on training a categorizer: no training set is needed, and the knowledge in the ontology is leveraged directly. Thus, instead of learning the distinguishing features of each category from documents in a training set, we extract specific knowledge about the domain or category of interest from the ontology and use it to categorize tweets and user comments.
Our categorization approach concentrates on the named entities and relationships recognized in the text of a tweet or wall message, and performs the categorization according to the taxonomy of the ontology in use. In the implemented approach, the ontology effectively becomes the classifier. A classification category in the system is defined as an ontology fragment and can be seen as a context of interest (derived from the tags for each category) for categorization.
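The idea of using an ontology fragment as the classifier, without any training set, can be sketched as follows. This is a simplified illustration under stated assumptions, not the implemented system: the fragments in `ONTOLOGY_FRAGMENTS`, the scoring weights, and the function names are hypothetical, and named entity recognition is reduced to exact token matching against entity labels.

```python
import re

# Hypothetical ontology fragments, one per category. Each fragment maps an
# entity label to the labels of entities it is related to in the ontology.
ONTOLOGY_FRAGMENTS = {
    "sports": {
        "football": {"goal", "league"},
        "tennis": {"wimbledon", "serve"},
    },
    "technology": {
        "smartphone": {"android", "battery"},
        "laptop": {"keyboard", "battery"},
    },
}


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def score_fragment(tokens, fragment):
    """Score a text against one fragment (the weights are an assumption).

    A recognized entity counts 2; each related entity that also appears
    in the text counts 1, as evidence for a relationship in the fragment.
    """
    score = 0
    for entity, related in fragment.items():
        if entity in tokens:
            score += 2
            score += len(related & tokens)
    return score


def categorize(text, fragments=ONTOLOGY_FRAGMENTS):
    """Return the best matching category, or None if nothing matches."""
    tokens = tokenize(text)
    scores = {cat: score_fragment(tokens, frag) for cat, frag in fragments.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


if __name__ == "__main__":
    print(categorize("Great goal in the football league tonight"))
    print(categorize("My laptop battery died again"))
```

No category-specific features are learned here. Changing a category only requires editing its fragment, which reflects the design choice of letting the ontology act as the classifier.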