Distributed Computing refers to processing across a distributed architecture, where several participating entities with processing power are spread across a network. This field fascinates me a lot. One of my projects, titled 'Pebble Grid Computing Framework', is based on this concept: a framework that provides the tools, daemons and libraries needed to implement a grid.
This interest grew out of distributed systems. Another project, titled 'FileSync', is based on the same idea: files and directories on cluster or grid nodes can be synchronized through a GUI application without any explicit mount commands. Other sub-areas here are Cloud Computing, Grid Computing, web applications and web services. Cloud Computing refers to the computing power of the WWW; the processing power resides at the server. Rich data-processing web applications serving requests form the core of the cloud. Take examples like Gmail, Google Docs, etc.: all of these form the cloud. The future of Cloud Computing will result in lightweight, faster operating systems. As most of the processing shifts to the cloud, the machine's responsibility will be just to provide access to cloud services. One such operating system in the news this decade is ChromeOS from Google.
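FileSync's actual protocol isn't described above, so purely as an illustration of how mount-free synchronization can work: one common approach is to compare per-file checksums against an index received from the remote node and transfer only the files that differ. A minimal sketch in Python (the function names and the index format are my assumptions, not FileSync's real design):

```python
import hashlib
import os

def file_digest(path):
    """Return the SHA-256 digest of a file's contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            h.update(chunk)
    return h.hexdigest()

def files_to_sync(local_dir, remote_index):
    """Compare local files against a remote node's {relative_path: digest}
    index and return the relative paths that need to be transferred."""
    out = []
    for root, _dirs, names in os.walk(local_dir):
        for name in names:
            path = os.path.join(root, name)
            rel = os.path.relpath(path, local_dir)
            if remote_index.get(rel) != file_digest(path):
                out.append(rel)
    return sorted(out)
```

Only the differing files would then be copied over the network, which is what lets the GUI avoid mounting the remote filesystem at all.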
Grid Computing refers to combining the computing power of several resources distributed across several administrative domains. Usually, when an I/O-bound process is executing, most of the CPU cycles are wasted. Grid Computing aims to capture those idle CPU cycles on remote machines and put them to useful work; this is also called CPU cycle scavenging. Popular projects using the immense power of the grid include SETI@home, Einstein@Home, etc. The objective of my project was to provide a framework giving users the features needed to implement a Department-Cluster-Global grid.
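Pebble's real scheduling is daemon-based and not shown here; as a toy illustration of the scavenging idea, a coordinator can split a job into independent work units and hand each one to whichever worker is idle. In this sketch local threads stand in for remote grid nodes (all names are hypothetical):

```python
import queue
import threading

def scavenge(work_units, process, n_workers=4):
    """Distribute independent work units across idle workers and
    collect the results in order. Threads stand in for grid nodes."""
    tasks = queue.Queue()
    for i, unit in enumerate(work_units):
        tasks.put((i, unit))
    results = [None] * len(work_units)

    def worker():
        while True:
            try:
                i, unit = tasks.get_nowait()
            except queue.Empty:
                return              # no work left: the node goes back to idle
            results[i] = process(unit)  # a node donates its spare cycles here

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

The shared queue is what makes the scheme self-balancing: a fast node simply pulls more units than a slow one, which is exactly the behaviour cycle scavenging relies on.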
Crawling the Web
Web crawling is one of the most fascinating areas to work in, and I love learning new approaches to crawling. One of my research projects was based on this: my aim was to compare a crawler's performance when a semaphore is used among the crawler threads. Crawling is all about aggregating information from the web in one place. But it is actually not a single place, or more precisely, not a single server. Here too, concepts of distributed systems apply: several distributed entities are involved in serving our search and other related queries.
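The performance comparison itself isn't reproduced here, but the mechanism is easy to sketch: a `threading.Semaphore` caps how many crawler threads may fetch at once, so extra threads block instead of flooding the network. A minimal sketch with a stubbed fetch function (the names and the concurrency limit are illustrative, not the project's actual values):

```python
import threading

MAX_CONCURRENT = 3
fetch_slots = threading.Semaphore(MAX_CONCURRENT)

def crawl(url, fetch, results, lock):
    """Fetch one URL, holding a semaphore slot for the duration."""
    with fetch_slots:      # blocks while MAX_CONCURRENT fetches are in flight
        page = fetch(url)
    with lock:             # the results dict is shared across threads
        results[url] = page

def crawl_all(urls, fetch):
    """Crawl all URLs with one thread each, semaphore-limited."""
    results = {}
    lock = threading.Lock()
    threads = [threading.Thread(target=crawl, args=(u, fetch, results, lock))
               for u in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Varying `MAX_CONCURRENT` and timing `crawl_all` against an unthrottled version is one simple way to run the kind of comparison described above.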
The Pebble grid computing framework accepts jobs of a divide-and-conquer nature, so it supports C/C++ web-crawling scripts that can be executed on remote execution resources. This way, the Pebble grid computing framework makes crawling faster and more organized.
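How Pebble actually splits a crawl is not specified here; one natural divide-and-conquer shape, under that assumption, is to partition the seed URL list into independent chunks, one per execution resource, crawl each chunk separately, and merge the results. A sketch (hypothetical helper, round-robin for simplicity):

```python
def partition(urls, n_nodes):
    """Split a URL list into n_nodes roughly equal, independent chunks,
    one chunk per remote execution resource."""
    chunks = [[] for _ in range(n_nodes)]
    for i, url in enumerate(urls):
        chunks[i % n_nodes].append(url)
    return chunks
```

Because the chunks share no state, each one can be shipped to a different node as a self-contained job, which is what makes the crawl fit the framework's divide-and-conquer model.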