Last modified by editors on 2022/06/17 03:35

<
From version < 12.2 >
edited by editors
on 2022/06/16 00:10
To version < 12.3 >
edited by editors
on 2022/06/16 00:11
>
Change comment: There is no comment for this version

Summary

Details

Page properties
Content
... ... @@ -87,8 +87,7 @@
87 87  In the context of this session, Programming anthropology: coding and culture in the age of AI,  while we have not attempted an ethnography of the methods we have used, or the processes of their application, we have been very mindful of how our choice of methods might influence or bias future research carried out on the infrastructure we are developing. We must consider the impact of these massive additions to eHRAF, particularly the impact on researchers, how and what they can research, and the bias we could be introducing to research outcomes.
88 88  
89 89  One obvious problem with data science and NLP methods is that these are either based directly on statistical methods, and each thus is associated with a probability of error, often 10% or more, or are based on non-deterministic ‘deep learning’ through adaptations of ML, including neural networks, which are more or less opaque with respect to their internal operation - the algorithm is known, the details of operation are known, but the validity of outcomes can only be accessed through practice. Only the outcomes are explicable, and again, these imbue substantial margins of error in application, 10-20%, which is regarded as ‘good-enough’ by much of the ML community. Even if we also accept these results as good-enough, we have only made a limited gain, as we have a limited ability to relate the outcomes to the individual constituents that, combined with the algorithm, contribute to the outcomes. 
90 -
91 -This is not as limiting as it might seem at first consideration. The ignorance is, in principle, specific to demonstrating the detail of evolving relationships, and thus the inability to fully understand how a history of interactions leads from the original context to the outcomes of algorithmic application. But at the same time, the algorithm is available, the properties of constituents known, and knowledge of the underlying logic that motivates the algorithm and specified constituent properties is available. Armed with the latter elements, one gains an understanding of the process, if not an absolute understanding. Less than ideal, but not dissimilar to the limits of ethnological precision of an expert ethnographer. The limits of analytic processes, and the error rates associated with these, can be documented and applied to use.
90 +\\This is not as limiting as it might seem at first consideration. The ignorance is, in principle, specific to demonstrating the detail of evolving relationships, and thus the inability to fully understand how a history of interactions leads from the original context to the outcomes of algorithmic application. But at the same time, the algorithm is available, the properties of constituents known, and knowledge of the underlying logic that motivates the algorithm and specified constituent properties is available. Armed with the latter elements, one gains an understanding of the process, if not an absolute understanding. Less than ideal, but not dissimilar to the limits of ethnological precision of an expert ethnographer. The limits of analytic processes, and the error rates associated with these, can be documented and applied to use.
92 92  
93 93  From a developer’s, and an anthropologist’s, point of reference, it is important that we provide constant reference to these limitations, given the propensity of researchers to accept the outcomes of such procedures as definitive, or are over-suspicious. But just as imperative is the need to provide similar guidance for the content of the ethnographies themselves. Over time ethnological conceptions and perceptions have changed, been added or eliminated. Focal interests and themes change. Objectives change. Biases change. Early sources were often produced by missionaries, administrators, or tourists. Even then, eHRAF’s coverage of a society might range from the 16th  century to the 21st century, covering a range of different perspectives from many different roles. This guidance does not take the form of specific guidance on specific documents, but rather of tools and procedures to assist in evaluating content against other ethnographic sources, and identifying critical assumptions in the text.
94 94  
... ... @@ -96,7 +96,7 @@
96 96  
97 97  So we will a) expand the metadata for the eHRAF database using NLP, ML and a range of more conventional textual analyses, b) expose the contents of the eHRAF database to data mining and analysis by researchers through an API, c) develop services that leverage the API to process results, and d) provide a range of approaches to leveraging the API to suit researchers with different levels of technical skill, including web applications, JupyterLab workflow templates, and direct API access.
98 98  
99 -By exposing data metadata, computer assisted text analysis methods and data management tools, with guided means to leverage these through interactive web applications and JupyterLab templates together with interactive exemplars and training materials, we will expand capacity to advance secondary comparative, cross-cultural, and other ethnographic research and extend this capacity to a much wider constituency of researchers. These services will make practical the inclusion of cross-cultural analysis in research whose primary purpose is quite removed from this track, such as consideration of cross-cultural approaches to common or new problems.
98 +By exposing data, metadata, computer assisted text analysis methods and data management tools, with guided means to leverage these through interactive web applications and JupyterLab templates together with interactive exemplars and training materials, we will expand capacity to advance secondary comparative, cross-cultural, and other ethnographic research and extend this capacity to a much wider constituency of researchers. These services will make practical the inclusion of cross-cultural analysis in research whose primary purpose is quite removed from this track, such as consideration of cross-cultural approaches to common or new problems.
100 100  
101 101  = References =
102 102  
... ... @@ -110,7 +110,6 @@
110 110  
111 111  
112 112  
113 -
114 114  
115 115  )))
116 116  
Copyright 1949-2023 Human Relations Area Files, Yale University
23.0