Change detection and notification (CDN) refers to the automatic detection of changes made to World Wide Web pages and the notification of interested users by email or other means. Whereas search engines are designed to find web pages, CDN systems are designed to monitor changes to web pages. Before change detection and notification existed, users had to check for web page changes manually, either by revisiting the website or by periodically searching again. Efficient and effective change detection and notification is hampered by the fact that most servers do not accurately track content changes through Last-Modified or ETag headers.
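The header problem can be made concrete with a small sketch. It records the ETag and Last-Modified validators of a response alongside a hash of the body, so that the verdict of the headers can be compared with the verdict of the content itself. The names `Snapshot`, `take_snapshot`, `headers_say_changed` and `content_changed` are hypothetical, chosen only for this illustration, and plain dictionaries stand in for real HTTP responses.

```python
import hashlib
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Snapshot:
    """What a monitor remembers about one fetch of a page (hypothetical type)."""
    etag: Optional[str]
    last_modified: Optional[str]
    body_hash: str


def take_snapshot(headers: Mapping[str, str], body: bytes) -> Snapshot:
    # HTTP header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    return Snapshot(
        etag=lowered.get("etag"),
        last_modified=lowered.get("last-modified"),
        body_hash=hashlib.sha256(body).hexdigest(),
    )


def headers_say_changed(old: Snapshot, new: Snapshot) -> Optional[bool]:
    """Return the headers' verdict, or None if no validator is present in both."""
    if old.etag is not None and new.etag is not None:
        return old.etag != new.etag
    if old.last_modified is not None and new.last_modified is not None:
        return old.last_modified != new.last_modified
    return None


def content_changed(old: Snapshot, new: Snapshot) -> bool:
    """Compare the bodies themselves, ignoring whatever the headers claim."""
    return old.body_hash != new.body_hash
```

When a server reuses the same ETag for a modified body, `headers_say_changed` returns `False` while `content_changed` returns `True`. When a server sends no validators at all, the header check returns `None`, and a monitor has to fall back to comparing content.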
History
In 1996, NetMind developed the first change detection and notification tool, known as Mind-it, which ran for six years. It was followed by new services such as ChangeDetection (1999), ChangeDetect (2002) and Google Alerts (2003). Historically, change polling has been done either by a server that sends email notifications or by a desktop program that alerts the user to a change. Change alerts can also be delivered directly to mobile devices and through push notifications, webhooks and HTTP callbacks for application integration.
Monitoring options vary by service or product and range from monitoring a single web page at a time to monitoring an entire website. What is actually monitored also varies by service or product, with the possibilities including text, links, documents, scripts, images and screenshots.
With the exception of Google's patent applications related to Google Alerts, intellectual property activity by change detection and notification vendors is minimal. No vendor has successfully secured exclusive rights to change detection and notification technology through patents or other legal means. This has resulted in significant functional overlap between products and services.
Architectural Approach
Change detection and notification services can be categorized by the software architecture they use. Two main approaches can be distinguished:
Server-based
A server polls content, tracks changes and logs data, and sends alerts in the form of email notifications, webhooks or RSS. Typically, the user manages the configuration through an associated website. Some services also have a mobile app that connects to a cloud server and delivers alerts to the mobile device.
Client-based
A local client application with a graphical user interface polls content, tracks changes and logs data.
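One polling pass of such a server can be sketched as follows. This is a minimal illustration, not the design of any particular service: `poll_once` is a hypothetical name, and the `fetch` and `notify` callables are placeholders for whatever HTTP client and alert channel (email, webhook, RSS) a real system would use.

```python
import hashlib
from typing import Callable, Dict, Iterable, List


def poll_once(
    urls: Iterable[str],
    fetch: Callable[[str], bytes],
    notify: Callable[[str], None],
    seen: Dict[str, str],
) -> List[str]:
    """Fetch every URL once, notify about changed ones, and return them.

    `seen` maps each URL to the hash recorded on the previous pass and is
    updated in place, so it acts as the log of tracked state.
    """
    changed = []
    for url in urls:
        digest = hashlib.sha256(fetch(url)).hexdigest()
        previous = seen.get(url)
        seen[url] = digest  # record the latest state for the next pass
        # The first fetch only establishes a baseline, so it is not reported.
        if previous is not None and previous != digest:
            notify(url)
            changed.append(url)
    return changed
```

A service would call `poll_once` on a schedule and persist `seen` between runs. Passing `fetch` and `notify` as parameters keeps the change-tracking logic independent of the transport, which also makes it easy to exercise with in-memory stand-ins.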
Considerations
Some web pages change on every load because they include ads or feeds. This can trigger false positives in change detection, since users are usually interested only in changes to the main content. Several approaches exist to overcome this problem.
- Difference metric. Compute a metric of the difference between two versions of a page (for example from the total size of the changes, the changes in the HTML file, or the changes in the DOM tree) and ignore changes below some threshold. The threshold can be set by the user or estimated automatically by comparing several early versions of the page.
- Content extraction. For popular sites, or sites running popular software, the content can be effectively separated from the chaff by selecting DOM sub-trees, for example using XPath. Another typical method is to use regular expressions to extract only the text that the user is interested in.
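Both approaches can be combined, as the sketch below suggests. It extracts the region of interest with a regular expression and then applies a threshold to a similarity-based difference metric. The function names, the default threshold of 0.05 and the choice of `difflib.SequenceMatcher` as the metric are assumptions made for this illustration. A regular expression is used instead of XPath only to stay within the Python standard library.

```python
import difflib
import re
from typing import Optional


def extract_content(html: str, pattern: str) -> str:
    """Return the first capture group of `pattern`, or the whole page if no match."""
    match = re.search(pattern, html, flags=re.DOTALL)
    return match.group(1) if match else html


def change_ratio(old: str, new: str) -> float:
    """Difference metric in [0, 1]: 0.0 means identical, 1.0 means nothing in common."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return 1.0 - matcher.ratio()


def is_significant_change(
    old_html: str,
    new_html: str,
    threshold: float = 0.05,  # assumed default; tune per page
    pattern: Optional[str] = None,
) -> bool:
    """Report a change only if the (optionally extracted) content differs enough."""
    if pattern is not None:
        old_html = extract_content(old_html, pattern)
        new_html = extract_content(new_html, pattern)
    return change_ratio(old_html, new_html) > threshold
```

With a pattern such as `r'<div id="main">(.*?)</div>'`, a rotated advertisement elsewhere on the page leaves the extracted text identical and is ignored, while an edit inside the main block is reported. Regular expressions are fragile on nested markup, so a real monitor would more likely select the sub-tree with an HTML parser.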
Source of the article : Wikipedia