Comparison of ranking algorithms with dataspace

With increasing digitization, the amount of homogeneous, unstructured, semi-structured, structured, or heterogeneous data being created and stored is exploding; this data is collectively called a “Dataspace”. Data is generated from various heterogeneous sources such as digital images, audio, video, online transactions, online social media, sensor nodes, and click streams, across domains including retail, medicine, healthcare, energy, and day-to-day utilities. In businesses, industries, institutions, and organizations, individuals add to the data volume through technical reports, seminar reports, research papers, dissertations, theses, etc. For instance, 30 billion web pages are accessible on the World Wide Web. With the enormous number of pages that exist today, search engines play a significant role in the current Internet of Things (IoT).

With billions of web pages accessible on the web, a user query entered into a search engine may return thousands of web pages, so it becomes extremely important to rank these results in such a way that the most “related”, “important”, or “authorized” pages are displayed first. This job of prioritizing the results is performed by ranking algorithms, and different search engines use different schemes for ranking the results. Ranking can also be applied to heterogeneous data to retrieve information from the Dataspace. The aim of this paper is to describe the Dataspace and to present a survey of ranking algorithms together with their comparison. The comparison is based on parameters such as the main technique used, methodology, input parameters, relevancy, quality of results, importance and limitations, search engines, and time complexity of the algorithms. We also explain how ranking can be used in a Dataspace, along with the challenges of information retrieval from heterogeneous data or from the Dataspace.