Children's Internet Protection Act (CIPA) Ruling eBook

United States District Court for the Eastern District of Pennsylvania
This eBook from the Gutenberg Project consists of approximately 196 pages of information about Children's Internet Protection Act (CIPA) Ruling.

their links downward to bring back the pages to which they link (and the pages to which those pages link, and so on, but usually down only a few levels).  This spidering software uses the same type of technology that commercial Web search engines use.  While useful in expanding the number of relevant URLs, the ability to retrieve additional pages through this approach is limited by the architectural feature of the Web that page-to-page links tend to converge rather than diverge.  That means that the more pages from which one spiders downward through links, the smaller the proportion of new sites one will uncover; if spidering the links of 1000 sites retrieved through a search engine or Web directory turns up 500 additional distinct adult sites, spidering an additional 1000 sites may turn up, for example, only 250 additional distinct sites, and the proportion of new sites uncovered will continue to diminish as more pages are spidered.  These limitations on the technology used to harvest a set of URLs for review will necessarily lead to substantial underblocking of material with respect to both the category definitions employed by filtering software companies and CIPA’s definitions of visual depictions that are obscene, child pornography, or harmful to minors.

2.  The “Winnowing” or Categorization Phase

Once the URLs have been harvested, some filtering software companies use automated key word analysis tools to evaluate the content and/or features of Web sites or pages accessed via a particular URL and to tentatively prioritize or categorize them.  This process may be characterized as “winnowing” the harvested URLs.  Automated systems currently used by filtering software vendors to prioritize, and to categorize or tentatively categorize the content and/or features of a Web site or page accessed via a particular URL operate by means of (1) simple key word searching, and (2) the use of statistical algorithms that rely on the frequency and structure of various linguistic features in a Web page’s text.  The automated systems used to categorize pages do not include image recognition technology.  All of the filtering companies deposed in the case also employ human review of some or all collected Web pages at some point during the process of categorizing Web pages.  As with the harvesting process, each technique employed in the winnowing process is subject to limitations that can result in both overblocking and underblocking.
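
The two automated techniques named above, simple key word searching and statistical weighting of textual features, can be illustrated with a minimal Python sketch.  It is not any vendor's actual system: the word list, the weights, the threshold, and the function names (keyword_flag, weighted_score, tentative_category) are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical key word list and per-word weights (illustrative only;
# not taken from the ruling or from any filtering product).
KEYWORDS = {"xxx", "explicit"}
WEIGHTS = {"xxx": 2.0, "explicit": 1.0, "health": -1.5, "education": -1.0}


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_flag(text):
    """Simple key word search: flag the page if any listed word appears."""
    return any(tok in KEYWORDS for tok in tokenize(text))


def weighted_score(text):
    """Sum the per-word weights and divide by page length in tokens."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0  # a page with no text (e.g. images only) scores zero
    return sum(WEIGHTS.get(tok, 0.0) for tok in tokens) / len(tokens)


def tentative_category(text, threshold=0.1):
    """Tentatively label a page; the threshold is an assumed cutoff."""
    return "adult" if weighted_score(text) > threshold else "uncategorized"


if __name__ == "__main__":
    pages = [
        "A health education page about explicit consent",
        "xxx explicit xxx",
        "",  # stands in for an image-only page with no text to analyze
    ]
    for page in pages:
        print(repr(page), keyword_flag(page), tentative_category(page))
```

In this toy setup the health education page is flagged by the key word search even though its weighted score falls below the threshold, which is one way overblocking can arise, while the page with no text is missed by both methods because neither examines images, which is one way underblocking can arise.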

First, simple key-word-based filters are subject to the obvious limitation that no string of words can identify all sites that contain sexually explicit content, and most strings of words are likely to appear in Web sites that are not properly classified as containing sexually explicit content.  As noted above, filtering software companies also use more sophisticated automated classification systems for the statistical classification of texts.  These systems assign weights to words or other textual features and use algorithms to determine

Copyrights
Project Gutenberg
Children's Internet Protection Act (CIPA) Ruling from Project Gutenberg. Public domain.