This tradeoff between overblocking and underblocking also applies not just to automated classification systems, but also to filters that use only human review. Given the approximately two billion pages that exist on the Web, the 1.5 million new pages that are added daily, and the rate at which content on existing pages changes, if a filtering company blocks only those Web pages that have been reviewed by humans, it will be impossible, as a practical matter, to avoid vast amounts of underblocking. Techniques used by human reviewers such as blocking at the Ip address level, domain name level, or directory level reduce the rates of underblocking, but necessarily increase the rates of overblocking, as discussed above. To use a simple example, it would be easy to design a filter intended to block sexually explicit speech that completely avoids overblocking. Such a filter would have only a single sexually explicit Web site on its control list, which could be re-reviewed daily to ensure that its content does not change. While there would be no overblocking problem with such a filter, such a filter would have a severe underblocking problem, as it would fail to block all the sexually explicit speech on the Web other than the one site on its control list. Similarly, it would also be easy to design a filter intended to block sexually explicit speech that completely avoids underblocking. Such a filter would operate by permitting users to view only a single Web site, e.g., the Sesame Street Web site. While there would be no underblocking problem with such a filter, it would have a severe overblocking problem, as it would block access to millions of non-sexually explicit sites on the Web other than the Sesame Street site.


