When you switch off your internet-connected computer, whether laptop, desktop or smartphone, any tasks that you initiated also stop, or at least pause in hibernating memory. Resume the machine and the task might continue where it left off, or it might require a restart. It’s all very frustrating, time-consuming and wasteful. But what if your networked device could initiate a task in the cloud, such as a long-haul search, a calculation or a database analysis? You could switch off your device (to save battery life) and the task would still be running elsewhere. Resume your device at a later time and you would either see the results you were after or get a status update on how long the task has left to run.
This issue is annoying for everyday users accessing disparate data sources, such as MP3 files, videos, appointments and address books, but even more so for scientists who have large computer-controlled experiments and sensors to run and data sets to process on vast high-speed networks of computers known as Grids. The software running such tasks does not always recover well when the “client” computer that initiated the task is disconnected.
In an ideal world, distributed databases, whether for everyday users or scientists, really do hold the promise of interconnecting mobile devices, workstations and servers and allowing them to share data and computational resources on an enormous scale. But the very dynamic nature of mobile and wide-area networks means that the sources and resources are in constant flux, even if packets of information are shared redundantly (as with the BitTorrent system used to distribute open source software…oh, and illicit music and video files). The answer might be to generate a virtual catalog of the sources and resources that adapts to changes as devices move on- and off-line when users connect and disconnect their laptops, mobiles and other computers.
Now, Oliver Moreno-Puello and Manuel Rodriguez-Martinez of the Department of Electrical and Computer Engineering at the University of Puerto Rico, in Mayagüez, think they have developed a decentralized framework for managing the necessary hidden information, the metadata, that could allow such a catalog to be constructed and allow tasks and data queries to be initiated by one device, picked up by others, and completed at the end-user device. It avoids reliance on the so-called MobileIP system, which has yet to emerge, and instead builds on the peer-to-peer (P2P) paradigm of information distribution, well known to BitTorrent users and “file sharers” but with countless legitimate applications in computing.
“Our approach is based on a peer-to-peer catalogue management organization, using consistent hashing as the mechanism to locate metadata objects,” the team explains. “Our framework makes the system more scalable since there is no central metadata repository and metadata can be found through an efficient search mechanism. It also provides efficient mechanisms to handle the arrival and departure of hosts in the system.”
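To give a flavor of what consistent hashing means here, the following is a minimal Python sketch of the general technique, not the team's implementation. The class name `HashRing`, the peer names and the choice of SHA-1 are all assumptions made for illustration. Hosts and metadata keys are hashed onto the same ring, and each key belongs to the first host found clockwise from it, so any peer can work out where a metadata object lives without asking a central repository.

```python
import bisect
import hashlib


def ring_position(name: str) -> int:
    """Hash a host name or metadata key to a point on the ring.

    SHA-1 is an arbitrary choice for this sketch, not taken from the paper.
    """
    return int.from_bytes(hashlib.sha1(name.encode("utf-8")).digest(), "big")


class HashRing:
    """Hypothetical catalog ring: each key is owned by the next host clockwise."""

    def __init__(self):
        self._positions = []  # sorted ring positions of the current hosts
        self._hosts = {}      # ring position -> host name

    def add_host(self, host: str) -> None:
        """A host arrives: place it on the ring."""
        pos = ring_position(host)
        if pos not in self._hosts:
            bisect.insort(self._positions, pos)
            self._hosts[pos] = host

    def remove_host(self, host: str) -> None:
        """A host departs: take it off the ring."""
        pos = ring_position(host)
        if self._hosts.get(pos) == host:
            self._positions.remove(pos)
            del self._hosts[pos]

    def locate(self, key: str) -> str:
        """Return the host responsible for a metadata key."""
        if not self._positions:
            raise LookupError("no hosts on the ring")
        # First host position strictly after the key, wrapping around at the end.
        idx = bisect.bisect_right(self._positions, ring_position(key))
        return self._hosts[self._positions[idx % len(self._positions)]]
```

The point of the design is that when a host leaves, only the keys it owned move to a neighbor, and when a host joins, it takes over only some of the keys from one neighbor. Everything else stays put, which is what makes arrivals and departures cheap to handle.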
The team has built their prototype solution using the open source NetTraveler middleware system. Essentially, NetTraveler can start a query, continue running it even when the “client” device posing the query leaves, and then deliver the query results when the client returns to the system. “NetTraveler is designed to support efficient data management over dynamic wide-area networks where data sources go offline, change location, have limited power capabilities and form ad hoc federations of sites,” the team says. Early tests show a 99.9% success rate on queries, the team adds. Availability beats centralized and partitioned catalog approaches and is close to the ideal of a fully replicated catalog system. Such replication is common for file sharing on private BitTorrent tracker networks but not so much on public networks.
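The "start a query, walk away, collect the answer later" pattern can be sketched in a few lines of Python. This is a single-process stand-in under stated assumptions, not NetTraveler's API: the `QueryBroker` class, its `submit` and `collect` methods and the ticket scheme are all hypothetical, and a thread pool plays the part of the peers that keep working while the client is offline.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor


class QueryBroker:
    """Hypothetical stand-in for a peer that runs queries on a client's behalf."""

    def __init__(self, workers: int = 4):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._pending = {}  # ticket -> Future holding the eventual result

    def submit(self, query, *args) -> str:
        """Start a query and hand back a ticket the client can keep while offline."""
        ticket = uuid.uuid4().hex
        self._pending[ticket] = self._pool.submit(query, *args)
        return ticket

    def collect(self, ticket: str, wait: bool = False):
        """Called when the client returns: report status and, if ready, the result."""
        future = self._pending[ticket]
        if not wait and not future.done():
            return "running", None
        result = future.result()  # blocks only if wait=True and still running
        del self._pending[ticket]
        return "done", result
```

A client would call `submit`, hold on to the ticket, disconnect, and later call `collect` with the same ticket to get either a status update or the finished result.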
There are three keys to the success of such a system. First, P2P systems are “self-organizing”: the nodes (user devices) organize themselves into a network through a discovery process, so the network can adapt to arrivals, departures and failures. Secondly, P2P involves symmetric communication: all peers are just that, equals within the system, and all can act as either client or server. Thirdly, P2P has decentralized control: there is no hierarchy, no central controller telling any of the nodes how to behave. Because of this, the system is incredibly robust to those arrivals, departures and failures, provided there is a large enough number of nodes within the network to begin with.
Oliver Moreno-Puello & Manuel Rodriguez-Martinez (2011). Catalogue manager for metadata dissemination in the NetTraveler middleware system. Int. J. Intelligent Information and Database Systems, 5 (3), 271-295
© 2011, City Connect News.
About the Author: David Bradley has worked in science communication for more than twenty years. After reading chemistry at university, he worked and travelled in the USA, did a stint in a QA/QC lab and then took on a role as a technical editor for the Royal Society of Chemistry. Then, following an extended trip to Australia, he returned and began contributing as a freelance to the likes of New Scientist and various trade magazines. He has been growing his portfolio and has constructed the Sciencebase Science News and the Sciencetext technology website. He also runs the SciScoop Science Forum, which is open to guest contributors on scientific topics.