Open Voices

About the project

Open Voices is a Linked Open Data project aimed at analyzing the socio-cultural dynamics that influence violence against women in Italy. To approach this complex issue, we have established a few fundamental parameters:

Year span: 2018, providing a focused and detailed analysis based on data from a single year to ensure consistency and clarity in interpreting the results.
Gender: The study includes data on both women and men, as well as aggregated totals, to provide a comprehensive understanding of the phenomenon and examine differences and commonalities across genders.
Victims: The analysis specifically targets victims who reach out to the national helpline 1522 and categorizes the data by geographic regions across Italy.
Factors of interest: The study examines public opinions on gender roles, the acceptability of intimate partner violence, causes of violence, and victim demographics.

This project aims to inform policies, raise awareness, and provide key insights to foster a more informed and empathetic societal response to gender-based violence.

To present these findings in a clear and accessible manner, different graphical visualizations have been used to showcase the data.

The full documentation of the project is available for download here.

Main RQ: How do geographical and cultural factors across different regions of Italy influence attitudes toward gender-based violence?

Find out the results!

Scenario

The perception and occurrence of violence against women in Italy are shaped by demographic factors, including geographical location, which influence societal stereotypes and the acceptability of violence. A data-driven approach that integrates population statistics, societal attitudes, and reports of violence can pinpoint intervention opportunities to reduce violence and challenge harmful stereotypes.

This project aims to raise public awareness about violence against women, promote gender equality, and enhance support services for victims. The documentation is based on statistical data from ISTAT, the Ministry of Justice, and the Ministry of Security, with a particular focus on public opinions regarding gender roles and the utilization of the anti-violence hotline 1522.

This research explores the relationship between regional stereotypes, public perceptions, and the prevalence of sexual and domestic violence against women in Italy. It investigates how stereotypes about women and sexual violence differ across regions and how they relate to reporting rates and societal attitudes. The study also examines regional variations in awareness and education about consent, alongside socio-cultural factors that contribute to victim-blaming and the normalization of abusive behaviors in intimate relationships.

The final aim is to uncover patterns in the prevalence and reporting of violence, offering insights into the complex interaction between cultural norms and legal outcomes across Italy’s diverse regions.

RESEARCH QUESTIONS

How do geographical and cultural factors across different regions of Italy influence attitudes toward sexual violence?
What are the prevailing stereotypes about women and sexual violence across different regions in Italy, and how do these stereotypes influence public perception, legal responses, and the prevalence of violence?
How does the level of awareness and education about consent and sexual violence vary across Italian regions, and what impact does this have on reporting rates?
What socio-cultural factors contribute to victim-blaming and the normalization of abusive behaviors in intimate relationships in various geographical areas of Italy, and how do these factors shape public attitudes and response systems?
To what extent do regional differences in gender roles and expectations influence the prevalence and reporting of domestic and sexual violence against women across Italy?
Which territories exhibit the highest levels of gender stereotypes, and how do these correlate with data on sexual and domestic violence? Are there geographical areas where stereotypes persist despite lower reported rates of violence?

Results

Datasets, including source and mashup ones, used to support the investigative analysis

Total victims in the Italian population who turned to the 1522 anti-violence and stalking helpline in 2018

Female victims in the Italian population who turned to the 1522 anti-violence and stalking helpline in 2018

Male victims in the Italian population who turned to the 1522 anti-violence and stalking helpline in 2018

Source and mashup datasets

The project uses 10 datasets in total, counting both source and mashup datasets.

The 5 source datasets have been downloaded in .csv format from the database belonging to Istat, the Italian National Institute of Statistics:

I.Stat: a data warehouse organised by theme, presented in multidimensional tables and with a wide range of standard metadata

The source datasets underwent a cleaning phase during which duplicates and irrelevant entries were identified and removed. Additionally, missing data was supplemented when necessary to ensure the datasets were complete and better suited for streamlined management and analysis. This process laid the groundwork for accurate and meaningful research.

Following the cleanup, we moved to the mashup phase, where we integrated and harmonized the processed datasets to create the final mashup datasets. These datasets were specifically tailored to address our research questions, ensuring they provide the necessary depth and context for our analyses.
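The integration step can be pictured as a join on the region name. The sketch below is only an illustration of that idea in Python, not the project's actual KNIME workflow: the column names (`region`, `population`, `victims`) and all figures are invented placeholders, and the real I.Stat exports may be structured differently.

```python
import csv
import io

# Invented miniature extracts for illustration only; these are NOT real
# I.Stat figures, and the real exports may use different column names.
POPULATION_CSV = """region,population
Molise,300000
Basilicata,560000
Umbria,880000
"""

VICTIMS_CSV = """region,victims
Molise,30
Basilicata,84
"""


def read_by_region(text, value_column):
    """Return {region: numeric value} from CSV text that has a 'region' column."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["region"].strip(): float(row[value_column]) for row in reader}


def mash_up(population, victims):
    """Inner-join two region-keyed tables and add a per-inhabitant victim rate."""
    rows = []
    # Only regions present in BOTH tables survive the join.
    for region in sorted(population.keys() & victims.keys()):
        rows.append({
            "region": region,
            "population": population[region],
            "victims": victims[region],
            "victims_rate": victims[region] / population[region],
        })
    return rows


mashup_rows = mash_up(
    read_by_region(POPULATION_CSV, "population"),
    read_by_region(VICTIMS_CSV, "victims"),
)
for row in mashup_rows:
    print(row)
```

An inner join silently drops regions that lack a value in one of the inputs (Umbria in this toy example), which is one reason missing values need to be handled before the mashup.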

Detailed documentation on the data processing steps, including the cleaning procedures, integration strategies, and the methodologies employed to create the mashup datasets, is available in the GitHub repository of this project.

D1 - Population 2018

ID: D1
Provenience: I.Stat
Format: .csv, .xlsx, .xml, .px
Metadata: Provided
URI: 2018Population
License: CC BY 3.0

D2 - Opinions about sexual violence

ID: D2
Provenience: I.Stat
Format: .csv, .xlsx, .xml, .px
Metadata: Provided
URI: OpinionsViolenceGeoAreas
License: CC BY 3.0

D3 - Acceptability of intimate partner violence

ID: D3
Provenience: I.Stat
Format: .csv, .xlsx, .xml, .px
Metadata: Provided
URI: OpinionsPartnerGeoAreas
License: CC BY 3.0

D4 - Victims turning to 1522

ID: D4
Provenience: I.Stat
Format: .csv, .xlsx, .xml, .px
Metadata: Provided
URI: Victims
License: CC BY 3.0

D5 - Indication of some causes of intimate partner violence

ID: D5
Provenience: I.Stat
Format: .csv, .xlsx, .xml, .px
Metadata: Provided
URI: ViolenceCauses
License: CC BY 3.0

MD5 - K-MEANS CLUSTERS

ID: MD5
Creation date: 1 January 2025
Format: .csv
Metadata: Provided
URI: MD5
License: CC BY 4.0
Download: MD5

MD6 - Victims causes indication

ID: MD6
Creation date: 1 January 2025
Format: .csv
Metadata: Provided
URI: MD6
License: CC BY 4.0
Download: MD6

MD7 - Geographical Distribution of Female Victims

ID: MD7
Creation date: 1 January 2025
Format: .csv
Metadata: Provided
URI: MD7
License: CC BY 4.0
Download: MD7

MD8 - Geographical Distribution of Males Victims

ID: MD8
Creation date: 1 January 2025
Format: .csv
Metadata: Provided
URI: MD8
License: CC BY 4.0
Download: MD8

MD9 - Geographical Distribution of Total Victims

ID: MD9
Creation date: 1 January 2025
Format: .csv
Metadata: Provided
URI: MD9
License: CC BY 4.0
Download: MD9

Analyses

The five source datasets have been analysed from four perspectives.

Quality analysis

In alignment with the National Guidelines for Enhancing Public Information Assets, we evaluated data quality based on four key factors: accuracy, completeness, coherence, and timeliness.

Legal Analysis

The legal analysis aims to evaluate potential risks and imbalances related to the long-term sustainability of data generation and dissemination, focusing on areas such as privacy, intellectual property rights, licensing for release, restrictions on public access, economic factors, and other temporal considerations.

Ethical Analysis

The ethical analysis assesses the datasets through the lens of the Data Ethics Principles and Guidelines, focusing on human-centered approaches, transparency, responsibility, and the safeguarding of individual data.

Technical Analysis

This section examines the metadata provided by Istat and other relevant information about the datasets, including format, provenance, and Internationalized Resource Identifiers (IRI). Additionally, it involves creating RDF assertions and metadata for the mashup datasets, along with an evaluation of the project’s alignment with the FAIR principles.

Quality analysis

In line with the National Guidelines ("Linee guida nazionali per la valorizzazione del patrimonio informativo pubblico"), developed under the Data & Analytics Framework project by AgID and the Digital Transformation Team, we conducted a thorough quality assessment of our datasets to ensure their reliability and suitability for their intended purposes. Specifically, there are four main factors to look for when analysing data quality:

Accuracy (syntactic and semantic): Verifying that the data and its attributes accurately reflect the real-world values or events they represent.
Coherence: Ensuring that the data is consistent and free from contradictions when compared to other related datasets within the administrative context.
Completeness: Assessing whether the datasets provide exhaustive values and fully account for all related entities (sources) that contribute to the defined procedures.
Timeliness (or promptness of updating): Confirming that the data is up to date and corresponds to the relevant timeframes for the associated processes.
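Two of these factors lend themselves to simple automated checks. The following is a minimal sketch, assuming each dataset has been loaded as a list of dictionaries with string values; the function names, column names, and sample rows are hypothetical and do not reproduce the project's actual assessment procedure.

```python
def is_filled(value):
    """A cell counts as filled when it is neither None nor blank."""
    return value is not None and str(value).strip() != ""


def is_valid_count(value):
    """Syntactic check: the cell parses as a non-negative number."""
    try:
        return float(value) >= 0
    except (TypeError, ValueError):
        return False


def completeness(rows, columns):
    """Share of cells (rows x columns) that are filled in."""
    total = len(rows) * len(columns)
    if total == 0:
        return 0.0
    filled = sum(1 for row in rows for col in columns if is_filled(row.get(col)))
    return filled / total


def syntactic_accuracy(rows, numeric_column):
    """Share of filled cells in a numeric column that parse as valid counts."""
    values = [row.get(numeric_column) for row in rows]
    filled = [v for v in values if is_filled(v)]
    if not filled:
        return 0.0
    return sum(1 for v in filled if is_valid_count(v)) / len(filled)


# Invented sample rows, for illustration only.
sample = [
    {"region": "Molise", "victims": "30"},
    {"region": "Basilicata", "victims": ""},
    {"region": "Umbria", "victims": "n.d."},
]

print(completeness(sample, ["region", "victims"]))
print(syntactic_accuracy(sample, "victims"))
```

Coherence and timeliness depend on comparisons with other datasets and with publication dates, so they are harder to reduce to a single per-file score.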

The results of this analysis are summarized in a table highlighting the overall quality of each dataset and identifying any areas requiring improvement.

ID Dataset	Accuracy	Coherence	Completeness	Timeliness
D1 - Population 2018
D2 - Opinions about sexual violence
D3 - Acceptability of intimate partner violence
D4 - Victims turning to 1522			*
D5 - Indication of some causes of intimate partner violence

* In the victims dataset, the number of victims of gender-based violence for the region Trentino-Alto Adige was missing. To estimate these values (total, males and females), the victim rate per inhabitant of the region Friuli-Venezia Giulia was used, as the two regions were considered similar in terms of population density.
The same method was used to estimate missing values of male and female victims for Trentino and male victims in Basilicata, comparing them with the rate of Molise, another region with similar demographic characteristics.
Victim rate formula (by region):
$$\text{Victim rate (per region)} = \frac{\text{Victims value}}{\text{Population value}}$$
Formula for estimating missing values:
$$\text{Missing victims value estimate} = \text{Victim rate} \times \text{Population value (of region with missing data)}$$
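The two formulas above translate directly into code. The sketch below is illustrative only: the function names are hypothetical and the figures in the usage example are invented round numbers, not the actual values for any Italian region.

```python
def victims_rate(victims, population):
    """Victim rate of a region: victims value divided by population value."""
    if population <= 0:
        raise ValueError("population must be positive")
    return victims / population


def estimate_missing_victims(reference_victims, reference_population, target_population):
    """Estimate a missing victims value from a demographically similar region.

    The rate observed in the reference region is applied to the population
    of the region whose value is missing.
    """
    rate = victims_rate(reference_victims, reference_population)
    return rate * target_population


# Invented figures: a reference region with 120 victims out of 1,200,000
# inhabitants, and a target region with 1,000,000 inhabitants.
estimate = estimate_missing_victims(120, 1_200_000, 1_000_000)
print(round(estimate))
```

Because the estimate inherits the reference region's rate, it assumes the two regions behave alike; it fills a gap for visualization purposes and should not be read as an observed value.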

Legal analysis

The legal analysis of the source datasets is essential to ensure the long-term sustainability of both the data production process and the publication of datasets, while also guaranteeing a balanced service that aligns with public responsibilities and respects individual rights.

This analysis was carried out using a reference checklist consisting of a series of binary questions on the following topics: privacy issues, IPR policy, licenses, limitations on public access, economic conditions, and temporal aspects.

To check:	D1	D2	D3	D4	D5
Is the dataset free of any personal data as defined in the Regulation (EU) 2016/679?
Is the dataset free of any indirect personal data that could be used for identifying the natural person?
Is the dataset free of any particular personal data (art. 9 GDPR)?
Is the dataset free of any information that combined with common data available in the web, could identify the person?
Is the dataset free of any information related to human rights (e.g., refugees, witness protection, etc.)?
Did you use a tool for calculating the range of the risk of deanonymization?	Not needed	Not needed	Not needed	Not needed	Not needed
Are you using geolocalization capabilities?
Did you check that the open data platform respects all the privacy regulations (registration of the end-user, profiling, cookies, analytics, etc.)?
Do you know who, in your open data platform, is the Controller and Processor of the privacy data of the system?
Have you checked the privacy regulation of the country where the datasets are physically stored?
Do you have non-personal data?

To check:	D1	D2	D3	D4	D5
Have you created and generated the dataset?
Are you the owner of the dataset?
Are you sure not to use third party data without the proper authorization and license?
Have you checked if there are any limitations in your national legal system for releasing some kind of datasets with open license?

To check:	D1	D2	D3	D4	D5
Did you release the dataset with an open data license?
Did you include the clause: "In any case the dataset can't be used for re-identifying the person"?
Did you release the API (in case you have it) with an open source license?
Have you checked that the open data/API platform license regime is in compliance with your IPR policy?

To check:	D1	D2	D3	D4	D5
Did you check that the dataset concerns your institutional competences, scope and finality?
Did you check the limitations for the publication stated by your national legislation or by the EU directives?
Did you check if there are some limitations connected to the international relations, public security or national defence?
Did you check if there are some limitations concerning the public interest?
Did you check the international law limitations?
Did you check the INSPIRE law limitations for the spatial data?

To check:	D1	D2	D3	D4	D5
Did you check that the dataset could be released for free?
Did you check if there are some agreements with some other partners in order to release the dataset with a reasonable price?	Not needed	Not needed	Not needed	Not needed	Not needed
Did you check if the open data platform terms of service include a clause of “non liability agreement” regarding the dataset and API provided?
In case you decide to release the dataset to a reasonable price did you check if the limitation imposed by the new directive 2019/1024/EU are respected?	Not needed	Not needed	Not needed	Not needed	Not needed
In case you decide to release the dataset to a reasonable price did you check the e-Commerce directive and regulation?	Not needed	Not needed	Not needed	Not needed	Not needed

To check:	D1	D2	D3	D4	D5
Do you have a temporary policy for updating the dataset?
Do you have some mechanism for informing the end-user that the dataset is updated at a given time to avoid mis-usage and so potential risk of damage?
Did you check if the dataset for some reason cannot be indexed by the search engines (e.g., Google, Yahoo, etc.)?
In case of personal data, do you have a reasonable technical mechanism for collecting request of deletion (e.g., right to be forgotten)?	Not needed	Not needed	Not needed	Not needed	Not needed

Publication license

A key aspect of the legal analysis is determining the appropriate publication license for the newly created mashup datasets. This decision must be compatible with the licenses of the source datasets, which, in our case, were all released under CC BY 3.0.
To guide this process, we used the Licensing Assistant tool provided by the European Commission. After evaluating the options, we opted to publish all mashup datasets under the CC BY 4.0 license.

The table below summarizes the original licenses of the source datasets and the final publication license applied to the mashup datasets:

ID	Dataset	Original licenses	Final license
MD6	Victims causes indication	CC BY 3.0, CC BY 3.0, CC BY 3.0	CC BY 4.0
MD7	Geographical Distribution of Female Victims	CC BY 3.0, CC BY 3.0, CC BY 3.0	CC BY 4.0
MD8	Geographical Distribution of Males Victims	CC BY 3.0, CC BY 3.0, CC BY 3.0	CC BY 4.0
MD9	Geographical Distribution of Total Victims	CC BY 3.0, CC BY 3.0, CC BY 3.0	CC BY 4.0

Ethical analysis

For the ethical analysis of our project’s data, we applied the Data Ethics Principles and Guidelines and the ODI Project's detailed framework for assessing the ethical aspects of our data processing.

Both the source and mashup datasets for our project are exclusively derived from the Italian National Institute of Statistics (ISTAT). Therefore, we first focused on evaluating the fairness of ISTAT's data collection and management practices. Following that, we established clear ethical guidelines to ensure responsible handling of the datasets throughout our project's lifecycle.

Data Ethics Principles

The "Open Voices" project aims to analyse the socio-cultural dynamics that influence violence against women in Italy. An ethical approach to data management is therefore fundamental to ensure that the rights of victims are respected, that no stereotypes or discrimination are fostered, and that the analysis contributes positively to raising awareness of such a sensitive issue.
Human being at the center: ISTAT’s policy is deeply aligned with both ethical standards and legislative principles. The organization prioritizes the dissemination of statistical information to promote awareness of Italy’s social and economic conditions. It also strives to enhance public decision-making by providing clear, accessible statistical data. In addition, ISTAT conducts research to continually refine statistical methodologies and improve Italy's statistical literacy.
Equality: In the context of the "Open Voices" project, the main objective is to analyze the socio-cultural dynamics that influence violence against women in Italy. The data collected therefore concern very sensitive issues such as gender violence and its perception within society. ISTAT provides data on key equality issues, such as gender discrimination and domestic violence.
Transparency: ISTAT ensures transparency in data management by providing comprehensive documentation. This documentation covers the data collection methods, clarifies the use of specific terms and definitions, and outlines policies and licenses that safeguard against misinterpretation of the data.
Accountability: ISTAT's quality assurance procedures align with European frameworks, specifically the European Statistics Code of Practice. This adherence strengthens both accountability and governance within the national statistical system and aligns it with European standards.
Individual data protection: ISTAT ensures that its datasets are anonymized in compliance with legal requirements. The organization’s practices adhere to strict confidentiality standards, ensuring that the privacy of respondents is always respected. Their approach to handling sensitive data complies with European data protection laws (e.g., Regulation (EU) 2016/679 and Legislative Decree No. 322/1989).

Ethical concerns and their management

Although ISTAT adheres to ethical principles in data collection and management, the team paid particular attention to the ethical handling of source information, given the sensitivity of the project's subject matter, gender-based violence. Data on sexual violence, the acceptability of partner violence, and access to support services (such as 1522) are sensitive, and the ethical concerns raised by their treatment were addressed in the following ways:

Data Integrity and Privacy: To ensure data integrity and privacy, the values from the source datasets were aggregated and presented as percentages, avoiding any correlation with real individuals.
Protection of Vulnerable Groups: Certain sensitive data were intentionally omitted to avoid the risk of discriminatory behavior. The project aimed to protect women experiencing violence, minimizing the risk of stigmatization.
Avoiding Generalizations and Misinterpretations: The objective of the project was to identify potential patterns in the dynamics of gender-based violence, not to make inferences or generalizations. In our results and conclusions documentation, we emphasize that any observed patterns in the data should not be generalized due to inconsistencies in the data and the absence of other potentially relevant socio-economic factors.
All relevant documentation regarding the data processing for the creation of mashup datasets and visualizations is provided in our GitHub repository.

Technical analysis

Source datasets

All source datasets have been assessed following the metadata model established by Agenzia per l'Italia Digitale (AGID), which categorizes metadata quality into four levels. This classification is based on two key factors: the strength of the data-metadata relationship and the level of detail provided.

Syntactic Quality: Ensures correct formatting and structural validity of metadata according to schemas and data models used.
Semantic Quality: Verifies the coherence and meaningfulness of metadata, ensuring alignment with controlled vocabularies or taxonomies.
Completeness: Assesses whether all mandatory and optional metadata elements are present to avoid missing critical information.
Consistency: Ensures logical consistency within and across datasets, verifying relationships and adherence to rules.

Note: Further details and reconstructed metadata for the source datasets are available in the metadata analysis table below.

ID	Provenience	Format	Metadata	URI	License
D1	I.Stat	.csv, .xlsx, .xml, .px	Level 4: An SDMX structured file is downloadable with a strong data-metadata bond and a datum-level detail of description. They are machine readable.	2018Population	CC BY 3.0
D2	I.Stat	.csv, .xlsx, .xml, .px	Level 4: An SDMX-structured file is available for download, featuring a strong connection between data and metadata, with detailed descriptions at the datum level. These files are machine-readable. Level 2: Additional metadata, offering clear information about sources and methodologies, is provided on a separate webpage, accessible via a sidebar menu.	OpinionsViolenceGeoAreas	CC BY 3.0
D3	I.Stat	.csv, .xlsx, .xml, .px	Level 4: An SDMX-structured file is available for download, featuring a strong connection between data and metadata, with detailed descriptions at the datum level. These files are machine-readable. Level 2: Additional metadata, offering clear information about sources and methodologies, is provided on a separate webpage, accessible via a sidebar menu.	OpinionsPartnerGeoAreas	CC BY 3.0
D4	I.Stat	.csv, .xlsx, .xml, .px	Level 4: An SDMX structured file is downloadable with a strong data-metadata bond and a datum-level detail of description. They are machine readable.	Victims	CC BY 3.0
D5	I.Stat	.csv, .xlsx, .xml, .px	Level 4: An SDMX structured file is downloadable with a strong data-metadata bond and a datum-level detail of description. They are machine readable.	ViolenceCauses	CC BY 3.0

RDF Metadata Assertion of the Datasets

All generated mashup datasets have been documented with metadata that follow DCAT Version 3 (August 2024), the latest version of the standard. This choice reflects our commitment to using up-to-date standards for metadata representation, ensuring flexibility and compatibility with modern technological frameworks.
We also referred to DCAT-AP_IT for alignment with the Italian national guidelines, which are based on the earlier DCAT v1.0 (2016) and impose stricter constraints. Combining the two maintains interoperability within the national ecosystem while benefiting from the enhanced features of DCAT Version 3. In this way, our datasets align with the directives of the Italian Agency for Digitalization (AGID), ensuring their integration with the broader public sector information (PSI) heritage, while also meeting the evolving needs of metadata management on an international scale.

Semantic Enrichment of Datasets

The metadata for the source datasets were primarily derived from the original data sources. When metadata was incomplete or unavailable, additional information was inferred and supplemented following the same principles applied to the mashup datasets. For instance, themes were assigned to source datasets based on recognized European Authority standards. To enhance the semantic description of datasets, we have utilized several key ontologies, including DCAT, DCTERMS, PROV, FOAF, ADMS, SKOS, CC, and DCAT-AP IT. These ontologies provide a structured framework for describing datasets and their metadata, ensuring interoperability and adherence to Linked Open Data (LOD) standards. This approach facilitates the discoverability, understandability, and reuse of datasets across diverse contexts.

The DCAT (Data Catalog Vocabulary) ontology plays a central role, serving to describe datasets and data catalogs and publish them on the Web. Properties such as dcat3:dataset associate datasets with a catalog, while dcat3:theme categorizes datasets by their topics, and dcat3:distribution specifies the available data formats. For example, datasets can be linked to themes like population statistics or societal issues using dcat3:theme. Learn more about DCAT Version 3 at W3C DCAT.

DCTERMS (Dublin Core Terms) is used to describe general metadata, including dcterms:title and dcterms:description for dataset titles and descriptions, and dcterms:accessRights to indicate accessibility. This inclusion ensures the use of a widely recognized standard for metadata description. More information about DCTERMS can be found at Dublin Core Terms.

ADMS (Asset Description Metadata Schema) is included to complement DCAT by describing assets like datasets and services, particularly in government and public administration contexts. ADMS plays an important role in the asset management landscape. For further details, consult the ADMS Specification.

CC (Creative Commons) provides a vocabulary for licensing datasets, such as cc:license, ensuring clear and standardized license attribution. Visit Creative Commons specifications for further insights.

Finally, DCAT-AP IT (DCAT Application Profile for Italy) is referenced through dcatapit: to align with Italian government open data standards. It extends DCAT to meet specific national requirements. For more information, check out DCAT-AP IT.
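To make the role of these vocabularies concrete, the sketch below renders a minimal Turtle description of one mashup dataset using only the Python standard library. It is an illustration under stated assumptions, not the project's published RDF: the base URI `https://example.org/openvoices/` and the download URL are placeholders, the conventional `dcat:` prefix is used where the text above writes `dcat3:`, and only a handful of properties are shown.

```python
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dcterms": "http://purl.org/dc/terms/",
}

# Hypothetical base URI; the project's real identifiers may differ.
BASE = "https://example.org/openvoices/"


def escape_literal(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def dataset_to_turtle(identifier, title, theme_uri, license_uri, download_url):
    """Render a dcat:Dataset with one CSV dcat:Distribution as Turtle text."""
    dataset = f"<{BASE}{identifier}>"
    distribution = f"<{BASE}{identifier}/csv>"
    lines = [f"@prefix {prefix}: <{uri}> ." for prefix, uri in PREFIXES.items()]
    lines += [
        "",
        f"{dataset} a dcat:Dataset ;",
        f'    dcterms:identifier "{escape_literal(identifier)}" ;',
        f'    dcterms:title "{escape_literal(title)}"@en ;',
        f"    dcat:theme <{theme_uri}> ;",
        f"    dcat:distribution {distribution} .",
        "",
        f"{distribution} a dcat:Distribution ;",
        f"    dcterms:license <{license_uri}> ;",
        "    dcat:mediaType <https://www.iana.org/assignments/media-types/text/csv> ;",
        f"    dcat:downloadURL <{download_url}> .",
    ]
    return "\n".join(lines) + "\n"


turtle = dataset_to_turtle(
    identifier="MD7",
    title="MD7 - Geographical Distribution of Female Victims",
    # EU data-theme authority entry for "Population and society".
    theme_uri="http://publications.europa.eu/resource/authority/data-theme/SOCI",
    license_uri="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://example.org/openvoices/MD7.csv",  # placeholder
)
print(turtle)
```

In practice a library such as rdflib would usually build and validate the graph; plain string templating is used here only to keep the example dependency-free.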

Title	Open Voices OADE Project - Datasets Catalog
Identifier	OpenVoicesCatalog
Description	Catalog containing the mashup datasets for the project Open Voices
Publisher	open voices
Issued	10/01/2025
Modified	10/01/2025
Datasets	MD6, MD7, MD8, MD9
Homepage	Open Voices
Language	English
Theme Taxonomy	European vocabulary for Data theme
License	CC BY 4.0
RDF assertion of metadata	Download RDF

Dataset	Title	Description	Theme	Subject	Accrual periodicity	Rights holder	Creator	Publisher	Distribution	Language	Derived from	License	RDF assertion of the metadata
D1	Population estimates 2018	Intercensal population estimates by age, sex and citizenship on 1st January	Population and society	2816 demography and population	Never	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	IstatDati database Endpoint	Italian, English	Censimento della popolazione (31.12.2018) Censimenti della popolazione 2001 e 2011 flussi demografici (nascite, decessi, migrazioni, acquisizioni della cittadinanza) registrati tra i censimenti 2001 e 2018	CC BY 3.0	Download Informative note on population reconstruction 2002-2018 (IT) Download methodological note on population reconstruction 2002-2018 (IT) Download SDMX structured data from IstatData endpoint
D2	Opinions about sexual violence	Opinions about sexual violence - geographical areas	Violence	2836 social protection, 2816 demography and population	Annual	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	Istat	Italian, English	Gender role stereotypes and the social image of violence	CC BY 3.0	Additional metadata
D3	Acceptability of intimate partner violence	Acceptability of intimate partner violence - geographical areas	Violence	2836 social protection, 2816 demography and population	Annual	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	I.Stat	Italian, English	Gender role stereotypes and the social image of violence	CC BY 3.0	additional Metadata
D4	Victims - sex, age class, geographical areas	Victims turning to 1522 (anti-violence and stalking number)	Violence	2836 social protection, 2816 demography and population	Annual	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	Istat	Italian, English		CC BY 3.0	additional Metadata
D5	Indication of some causes of intimate partner violence	Indication of some causes of intimate partner violence - geographical areas	Violence	2836 social protection, 2816 demography and population	Annual	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	Istituto Nazionale di Statistica	I.Stat	Italian, English	Gender role stereotypes and the social image of violence	CC BY 3.0	additional Metadata

Mashup Dataset	Series	Title	Identifier	Description	Theme	Subject	Keywords	Issued	Modified	Accrual periodicity	Rights holder	Creator	Publisher	Distribution	Language	Derived from	License	RDF assertion of the metadata
MD6	MD6	MD6 - Victims causes indication	MD6	Mashup dataset with an analysis that explores the relationship between the population, victims of violence and the causes of such violence.	Population and society	2836 social protection, 2816 demography and population	social protection, gender violence, victims of violence, causes of violence, italy	10/01/2025	10/01/2025	Never	OpenVoices	Lucrezia Pograri, Chiara Martina	OpenVoices	CSV distribution of Open Voices MD1 for the year 2017	English	D1, D3	CC BY 4.0	Download RDF
MD7	MD7	MD7 - Geographical Distribution of Female Victims	MD7	Mashup dataset with exploration of the geographical distribution of female victims of violence	Population and society	2836 social protection, 2816 demography and population	social protection, violence against womans, victims of violence, gender violence, italy	10/01/2025	10/01/2025	Never	OpenVoices	Lucrezia Pograri, Chiara Martina	OpenVoices	CSV distribution of Open Voices MD1 for the year 2018	English	D2, D3	CC BY 4.0	Download RDF
MD8	MD8	MD8 - Geographical Distribution of Males Victims	MD8	Mashup dataset with exploration of the geographical distribution of males victims of violence	Population and society	2836 social protection, 2816 demography and population	social protection, gender violence, victims of violence, italy	10/01/2025	10/01/2025	Never	OpenVoices	Lucrezia Pograri, Chiara Martina	OpenVoices	CSV distribution of Open Voices MD1 for the year 2019	English	D2, D3	CC BY 4.0	Download RDF
MD9	MD9	MD9 - Geographical Distribution of Total Victims	MD9	Mashup dataset with exploration of the geographical distribution of total victims of violence	Population and society	2836 social protection, 2816 demography and population	social protection, violence against womans, victims of violence, gender violence, italy	10/01/2025	10/01/2025	Never	OpenVoices	Lucrezia Pograri, Chiara Martina	OpenVoices	CSV distribution of Open Voices MD2 for the year 2017	English	D1, D4, D5, D6	CC BY 4.0	Download RDF

FAIR principles

During the project development, we aimed to align our efforts with the FAIR principles established by the GO FAIR Initiative.
These principles, developed by a consortium of scientists and organizations, provide guidelines to ensure that digital assets are Findable, Accessible, Interoperable, and Reusable, with a strong emphasis on machine-actionability.

The table below utilizes the FAIR Principles overview from GO FAIR as a checklist to systematically evaluate our project’s adherence to these guidelines.

Findability

To check:
(Meta)data are assigned a globally unique and persistent identifier
Data are described with rich metadata (defined by R1 below)
Metadata clearly and explicitly include the identifier of the data they describe
(Meta)data are registered or indexed in a searchable resource

Accessibility

To check:
(Meta)data are retrievable by their identifier using a standardised communications protocol
The communication protocol is open, free, and universally implementable
The communication protocol allows for an authentication and authorisation procedure, where necessary
Metadata are accessible, even when the data are no longer available

Interoperability

To check:
(Meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
(Meta)data use vocabularies that follow FAIR principles
(Meta)data include qualified references to other (meta)data

Reusability

To check:
(Meta)data are richly described with a plurality of accurate and relevant attributes
(Meta)data are released with a clear and accessible data usage license
(Meta)data are associated with detailed provenance
(Meta)data meet domain-relevant community standards

Preprocessing of data

The source datasets were processed using KNIME software, a powerful platform for data analytics and workflow automation. Various operations were carried out to clean, preprocess, and mash up the data, ensuring that it was structured and ready for analysis. These operations included data cleansing steps such as handling missing values, standardizing formats, and removing duplicates, as well as combining multiple data sources to create the mashup dataset.
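The project performed these steps in KNIME, not in code. Purely as an illustration of the same kind of operations (dropping rows with missing values, standardizing region labels, removing duplicates, and joining two sources), here is a minimal Python sketch. The function names, field names, and sample rows are hypothetical placeholders and do not come from the source datasets or the KNIME workflow.

```python
def normalize_region(name):
    """Standardize a region label: trim, collapse whitespace, ignore case."""
    return " ".join(name.split()).casefold()


def clean_rows(rows, value_field):
    """Drop rows with missing values and keep the first row seen per region."""
    cleaned = {}
    for row in rows:
        region = normalize_region(row.get("region") or "")
        raw = row.get(value_field)
        if not region or raw in (None, ""):
            continue  # missing value: skip the row
        if region in cleaned:
            continue  # duplicate region: keep the first occurrence
        cleaned[region] = float(raw)
    return cleaned


def mash_up(population_rows, victim_rows):
    """Join the two cleaned tables on the standardized region key."""
    population = clean_rows(population_rows, "population")
    victims = clean_rows(victim_rows, "victims")
    return [
        {"region": region, "population": population[region], "victims": victims[region]}
        for region in sorted(population.keys() & victims.keys())
    ]


# Placeholder rows, not figures from the source datasets.
population_rows = [
    {"region": "Lazio ", "population": "1000"},
    {"region": "Molise", "population": "200"},
    {"region": "molise", "population": "200"},  # duplicate after standardizing
]
victim_rows = [
    {"region": "lazio", "victims": "20"},
    {"region": "Molise", "victims": ""},  # missing value, dropped
]

merged = mash_up(population_rows, victim_rows)
print(merged)
```

Only regions present in both cleaned tables survive the join, which mirrors an inner join; a different join type may suit other mashups.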

The workflow of the project, which outlines each step of the data transformation process, is available for consultation in the image below, providing full transparency and enabling replication of the analysis.

Workflow KNIME

Additionally, the full KNIME workflow can be downloaded here, allowing users to access the cleaned and integrated data for their own analysis or research purposes.

Visualizations

Various graphic representations of the data facilitate a deeper exploration of the project’s topic, allowing for the analysis of potential correlations between key factors across different regions. Together, these visualizations present the data in an engaging and accessible manner, supporting a comprehensive analysis of the topic.

The visualizations were created with two libraries: Leaflet.js for the map and Plotly.js for the bar charts and bubble chart, ensuring dynamic and visually appealing representations. The choice of tools and the design approach were guided by the Seven Laws of Information by Moody and Walsh (1999), applied here as seven principles of clarity, simplicity, and effectiveness in data visualization:

Show the data: Each visualization focuses on clearly displaying the data without unnecessary decoration. For example, the map highlights regional differences, while bar and bubble charts emphasize trends and distributions.
Maximize data-ink ratio: Non-essential elements, such as excessive gridlines and redundant labels, have been minimized to ensure that the "data ink" – the part of the visualization that represents the actual data – is maximized.
Avoid distortion: Careful attention has been paid to scaling and proportions to avoid misleading interpretations. For instance, axes are clearly labeled and not truncated to ensure trends are accurately represented.
Be consistent: Consistent color schemes, fonts, and layout across all visualizations ensure clarity and make it easier for viewers to compare data.
Use a clear layout: The layout of each visualization is designed to guide the viewer’s eye to the most important elements, such as highlighting key regions on the map or grouping related data in the charts.
Make it interactive: Interactivity has been incorporated into the visualizations, allowing users to zoom in on the map, hover over data points for additional details, and filter results in the charts for a more tailored exploration of the data.
Provide context: Each visualization includes labels, legends, and explanations to ensure the data is properly contextualized, reducing the risk of misinterpretation.

Below are the types of visualizations used.

Bar Charts

This type of visualization provides an immediate, clear comparison of values for each of the variables, offering a straightforward way to assess differences across regions.

The following three charts present datasets on opinions about gender roles and violence against women among adults in Italy, based on the 2018 survey on gender role stereotypes and the social image of violence. The survey, conducted by Istat in partnership with the Equal Opportunities Department, explores cultural models and factors influencing attitudes toward violence. The charts examine the spread of gender stereotypes (D2), the acceptability of violence (D3), its causes (D5), and attitudes toward sexual violence.

Opinions about sexual violence

This chart explores the point of view of Italian adults on gender roles and sexual violence, broken down by region. The bar graph shows the degree of agreement with various stereotypes. The data show how different regions of Italy respond to these beliefs, highlighting cultural and social variations in perceptions of violence against women.

Results: The data reveal wide regional differences in perceptions of sexual violence in Italy. Regions such as Tuscany and Lombardy show greater awareness and strong disagreement with misconceptions about sexual violence, such as the idea that women can provoke violence through their clothing or that violence within a marriage does not count as violence. In contrast, in regions such as Calabria, Puglia, or Liguria, more permissive and stereotyped views persist, with significant percentages of people justifying or minimizing sexual violence. These findings suggest the importance of targeted educational interventions to raise public awareness and narrow the gap in regional perceptions, in order to promote a culture of respect and awareness about gender violence.

Acceptability of intimate partner violence

This chart examines the acceptability of intimate partner violence in different regions of Italy. The bar graph shows the acceptability of various violent behaviors. The data show the percentage of people who consider these behaviors acceptable or not, broken down by region, contributing to an understanding of cultural differences in opinion about domestic violence.

Results: The differences between regions indicate significant variation in the perceived acceptability of violence. Regions such as Lazio, Sardinia, and Piedmont show clear opposition to intimate partner violence, while in regions like Abruzzo and Basilicata there are still worrying margins of tolerance. In some regions, conditional or partial acceptance of these behaviors suggests that gender stereotypes and cultural norms that are permissive toward control or violence may persist.

Indication of some causes of intimate partner violence

This chart explores the perceived causes of violence in intimate relationships in different regions of Italy. The data show how often each factor is identified as a cause of violence, contributing to a deeper understanding of the cultural and social roots of this phenomenon.

Results: The data show a marked regional disparity in the factors associated with gender-based violence in Italy. Regions such as Friuli Venezia Giulia and Emilia Romagna show the highest rates in several categories, such as the perception of women as property and the need to feel superior to one's partner, indicating that gender stereotypes remain deeply rooted. Factors such as drug or alcohol abuse, prevalent in Sardinia, and childhood experiences of domestic violence, widespread in Basilicata, underline the importance of addressing the root causes of violence, including addiction and intergenerational transmission. Although some regions, such as Piedmont, show slight opposition to violence, the presence of tolerant attitudes in other areas highlights the need for targeted interventions. Finally, it is crucial to promote cultural change through educational campaigns, strengthen psychological support, and address addictions in order to reduce regional disparities and prevent gender violence in a sustainable and effective way.

Choropleth Map

A choropleth map is an effective way to display regional variations in our data, highlighting trends and correlations across geographical areas. Victim counts are shown as absolute numbers, while regional clusters are derived by normalizing the counts by population and applying the k-means algorithm to the rate of victims per 100,000 inhabitants. The map makes it possible to observe general trends in population and victimization rates. Higher rates are seen in regions like Lazio, Campania, and Abruzzo, while the lowest rate, possibly linked to lower population density, is found in Molise. Other regions, such as Valle d'Aosta and Trentino Alto Adige, also report lower victimization rates, with most regions falling between these extremes.
The k-means algorithm has been used to divide the Italian geographical areas into three clusters, based on the victim rate:

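Low intensity cluster (0-11 victims per 100,000 inhabitants)
Medium intensity cluster (12-16 victims per 100,000 inhabitants)
High intensity cluster (17 or more victims per 100,000 inhabitants)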
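The normalization and clustering described above can be sketched as follows. This is a minimal, dependency-free illustration of one-dimensional k-means (Lloyd's algorithm) applied to rates per 100,000 inhabitants, not the project's actual KNIME configuration. The function names and the "Region A" to "Region D" values are hypothetical placeholders; only the Lazio and Molise rates are the figures reported on this page.

```python
def rate_per_100k(victims, population):
    """Normalize a victim count by population."""
    return victims * 100_000 / population


def kmeans_1d(values, k=3, iterations=100):
    """Plain Lloyd's algorithm on one-dimensional data (requires k >= 2)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted values.
    centroids = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for value in values:
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            groups[nearest].append(value)
        # Recompute each centroid as the mean of its group (keep it if empty).
        updated = [
            sum(group) / len(group) if group else centroids[i]
            for i, group in enumerate(groups)
        ]
        if updated == centroids:
            break  # assignments are stable
        centroids = updated
    return centroids


def assign_cluster(value, centroids, labels=("low", "medium", "high")):
    """Label a value by its nearest centroid, ranked from lowest to highest."""
    ranked = sorted(range(len(centroids)), key=lambda i: centroids[i])
    nearest = min(range(len(centroids)), key=lambda i: abs(value - centroids[i]))
    return labels[ranked.index(nearest)]


# Lazio and Molise are the rates reported on this page; the rest are placeholders.
rates = {
    "Lazio": 21.34,
    "Molise": 7.90,
    "Region A": 9.0,
    "Region B": 13.5,
    "Region C": 14.5,
    "Region D": 18.0,
}

centroids = kmeans_1d(list(rates.values()), k=3)
clusters = {region: assign_cluster(rate, centroids) for region, rate in rates.items()}
print(clusters)
```

Because the cluster boundaries depend on the input values, thresholds obtained from placeholder data will not necessarily match the ranges listed above.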

Victims of violence by geographical areas

Victims per 100k inhabitants

Highest rate: Lazio, 21.34 per 100k inhabitants
Lowest rate: Molise, 7.90 per 100k inhabitants

Bubble chart

The chart compares various causes of violence across Italian territories, revealing significant regional differences. Key causes such as "considering women to be property" and "abuse of drugs or alcohol" show strong correlations with higher victimization rates, especially in regions like Lombardy and Friuli Venezia Giulia. Causes like "religious reasons" have a lower impact, while societal factors like the need for men to feel superior to their partners remain consistent across many areas. The analysis highlights the complex regional patterns of violence, emphasizing the need for targeted interventions based on specific territorial dynamics.
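One way to quantify the kind of relationship described above is a Pearson correlation between the share of respondents citing a cause and the regional victim rate. The sketch below is illustrative only: the page does not state how the correlations were assessed, the function name is hypothetical, and the input numbers are placeholders, not survey results.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariance / sqrt(spread_x * spread_y)


# Placeholder values: percentage citing a cause, and victims per 100k, per region.
share_citing_cause = [10.0, 20.0, 30.0, 40.0]
victim_rate = [8.0, 12.0, 16.0, 20.0]

r = pearson(share_citing_cause, victim_rate)
print(round(r, 3))
```

A coefficient near 1 or -1 indicates a strong linear association across regions; with only a couple of dozen regions it should be read as descriptive, not as evidence of causation.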

RDF Serialization of Datasets Catalog

Below is an overview of the datasets used and produced within this project, presented in RDF/Turtle format. For more detailed information on the RDF metadata assertions, please refer to the Technical Analysis section of the website or to the project documentation.
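As a rough illustration of what such a catalog entry can look like, the sketch below renders one dataset description as Turtle using the DCAT and Dublin Core Terms vocabularies. It is not the project's actual serialization: the helper names and the `https://example.org/openvoices/` base URI are hypothetical, and the property selection is an assumption. The title, description, keywords, and license are taken from the MD7 row of the metadata table.

```python
def literal(text, lang="en"):
    """Format a Turtle string literal with a language tag, escaping quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def to_turtle(dataset):
    """Render one dataset description as a dcat:Dataset in Turtle."""
    keywords = ", ".join(literal(keyword) for keyword in dataset["keywords"])
    return "\n".join([
        "@prefix dcat: <http://www.w3.org/ns/dcat#> .",
        "@prefix dcterms: <http://purl.org/dc/terms/> .",
        "",
        f"<{dataset['uri']}> a dcat:Dataset ;",
        f"    dcterms:title {literal(dataset['title'])} ;",
        f"    dcterms:description {literal(dataset['description'])} ;",
        f"    dcat:keyword {keywords} ;",
        f"    dcterms:license <{dataset['license']}> .",
    ])


md7 = {
    # Hypothetical URI: the real identifiers are defined in the project catalog.
    "uri": "https://example.org/openvoices/MD7",
    "title": "MD7 - Geographical Distribution of Female Victims",
    "description": (
        "Mashup dataset with exploration of the geographical distribution "
        "of female victims of violence"
    ),
    "keywords": ["social protection", "victims of violence", "gender violence", "italy"],
    "license": "https://creativecommons.org/licenses/by/4.0/",
}

turtle = to_turtle(md7)
print(turtle)
```

For real catalogs, an RDF library that validates terms and handles escaping would be a safer choice than string formatting.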

Sustainability

The datasets used in this project are sourced from the Italian National Institute of Statistics (Istat), which manages them across its databases. However, as Istat is transitioning its content to the IstatData platform, the URIs referenced in this project may become outdated.

Open Voices is the final project developed for the Open Access and Digital Ethics course (a.y. 2024/2025) within the Digital Humanities and Digital Knowledge Master's Degree (University of Bologna). As such, it is not actively maintained and will not be updated in the future.

This project adheres to the open data sustainability principles aligned with the United Nations' Sustainable Development Goals (SDGs), which address social, environmental, and economic challenges. Our aim is to ensure that data within this project remains accessible, reliable, and actionable by following these key principles:

Data Quality and Integrity: Maintaining accuracy, reliability, and trustworthiness.
Data Disaggregation: Breaking data into detailed units for improved analysis, particularly for marginalized groups.
Data Transparency and Openness: Providing accessible, understandable data to foster trust and accountability.
Data Usability and Curation: Organizing and documenting data to facilitate reuse.
Data Protection and Privacy: Safeguarding sensitive information while promoting openness.
Data Governance and Independence: Ensuring transparent and accountable management practices free from external influence.
Data Rights: Protecting ownership and access rights.

Team & Statement of responsibility

Lucrezia Pograri

Project ideation — Data retrieval — KNIME preprocessing — Mashup datasets — Visualizations — RDF metadata assertion — Website development

Chiara Martina

Project ideation — Data retrieval — KNIME preprocessing — Mashup datasets — Analyses — Visualizations — Website development

Licenses and credits

Images and icons

All the icons are taken from ICONS8. They are available for unrestricted commercial and noncommercial use without permission or fee (CC0).

Source Datasets

Creative Commons Attribution 3.0 Unported (CC BY 3.0)

Mashup Datasets

Creative Commons Attribution 4.0 International (CC BY 4.0)

Software used

Web template

This website is built on the HTML5 template "Vesperr" by BootstrapMade and released under the MIT license.
