Classifying Functional Areas at the Roman Villa of Mascherone (Manfredonia, Italy): Modeling Sparse Surface Survey Count Data using a Bayesian Zero-Inflated Negative Binomial Model
Ragno, Roberto (Conceptualization)
Goffredo, Roberto (Writing – Original Draft Preparation)
Piepoli, Luciano (Writing – Original Draft Preparation)
2025-01-01
Abstract
Statistical analysis of archaeological surface survey datasets is difficult because such datasets are often characterized by excess zero counts and overdispersion. This paper introduces a Bayesian Zero-Inflated Negative Binomial (ZINB) model that classifies survey grid units into user-defined functional areas based on artifact distributions. The approach was applied to the Roman maritime villa of Mascherone, located near the city of Siponto (northern Apulia, Italy), which was surveyed using a total sampling strategy. After filtering, 52 squares (20 × 20 m) containing 21 distinct artifact types were analyzed to identify three hypothesized functional areas: residential, storage, and craft. The model explicitly accounts for both structural and sampling-derived zeros in the dataset while also handling overdispersion. It also provides probabilistic classifications with quantified uncertainty for each square. Results indicate a residential core consistent with legacy aerial evidence, while the storage and craft zones remain less certain because few diagnostic indicators are available. The approach addresses zero-inflation in survey datasets and offers a scalable framework for broader archaeological and landscape analyses.
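To make the idea of a zero-inflated negative binomial concrete, the following is a minimal sketch rather than the authors' implementation. The abstract does not give the model specification, so everything here is an assumption made for illustration: the mean/dispersion parameterization (`mu`, `r`), the zero-inflation probability `pi`, the helper names, and the plug-in classification rule. A ZINB distribution mixes a point mass at zero (structural zeros) with a negative binomial (which produces sampling zeros and overdispersed counts). Given fixed, hypothetical parameters for each functional area, Bayes' rule turns the artifact counts of one square into a probability for each area.

```python
import math


def nb_log_pmf(k, mu, r):
    """Log-probability of count k under a negative binomial with
    mean mu > 0 and dispersion r > 0 (variance = mu + mu**2 / r)."""
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(r / (r + mu))
            + k * math.log(mu / (r + mu)))


def zinb_pmf(k, pi, mu, r):
    """Probability of count k under a zero-inflated negative binomial.
    pi is the probability of a structural zero (assumed parameter)."""
    nb = math.exp(nb_log_pmf(k, mu, r))
    if k == 0:
        # A zero is either structural or a sampling zero from the NB part.
        return pi + (1.0 - pi) * nb
    return (1.0 - pi) * nb


def area_posterior(counts, priors, params):
    """Posterior probability of each functional area for one square.

    counts: observed count per artifact type
    priors: prior probability per area
    params: params[area][artifact_type] = (pi, mu, r), hypothetical values
    Artifact types are treated as independent given the area (an assumption).
    """
    log_post = []
    for prior, type_params in zip(priors, params):
        total = math.log(prior)
        for k, (pi, mu, r) in zip(counts, type_params):
            total += math.log(zinb_pmf(k, pi, mu, r))
        log_post.append(total)
    # Normalize in log space for numerical stability.
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    norm = sum(weights)
    return [w / norm for w in weights]


# Two hypothetical areas and two hypothetical artifact types.
example_params = [
    [(0.2, 4.0, 1.0), (0.5, 0.5, 1.0)],  # area 0: type 0 is common
    [(0.5, 0.5, 1.0), (0.2, 4.0, 1.0)],  # area 1: type 1 is common
]
example_posterior = area_posterior([5, 0], [0.5, 0.5], example_params)
```

In a fully Bayesian treatment the parameters would themselves carry prior distributions and be estimated from the data, so the per-square probabilities would also reflect parameter uncertainty. The sketch fixes them only to show how the zero-inflated likelihood enters the classification.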


