This Article highlights some of the critical distinctions between small data surveillance and big data cybersurveillance as methods of intelligence gathering. Specifically, in the intelligence context, it appears that “collect-it-all” tools in a big data world can now potentially facilitate the intelligence community's construction of individuals' digital avatars. The digital avatar can be understood as a virtual representation of our digital selves and may serve as a potential proxy for an actual person. This construction may be enabled through processes such as the data fusion of biometric and biographic data, or the digital data fusion of the 24/7 surveillance of the body and the 360° surveillance of the biography. Further, data science logic and reasoning, and big data policy rationales, appear to be driving the expansion of these emerging methods. Consequently, I suggest that an inquiry into the scientific validity of the data science that informs big data cybersurveillance and mass dataveillance is appropriate. As a topic of academic inquiry, I therefore argue in favor of a science-driven approach to the interrogation of rapidly evolving bulk metadata and mass data surveillance methods that increasingly rely upon data science and big data's algorithmic, analytic, and integrative tools. In Daubert v. Merrell Dow Pharmaceuticals, Inc., the Supreme Court required scientific validity determinations prior to the introduction of scientific expert testimony or evidence at trial. I conclude that, to the extent that covert intelligence gathering relies upon data science, a Daubert-type inquiry is helpful in conceptualizing the analytical structure necessary for the assessment and oversight of these emerging mass surveillance methods.