Getting Started
Welcome to the Getting Started guide for metaprivBIDS. This Python-based tool lets a user calculate a variety of data privacy metrics on tabular data through a graphical user interface.
Installation
The metaprivBIDS software runs on multiple platforms (e.g. Linux, macOS, Windows) that have a Python 3.7 installation.
It is recommended (but not required) to first create a virtual environment. This can be done with venv or, if installing pygraphviz fails (as sometimes happens), with conda.
python -m venv metapriv
source metapriv/bin/activate
or
conda config --add pkgs_dirs ~/conda_pkgs
conda create --name venv -c conda-forge "python>=3.7" graphviz pygraphviz r-base r-sdcMicro rpy2
conda activate venv
You can then install metaprivBIDS by cloning the git repository.
git clone https://github.com/CPernet/metaprivBIDS.git
To run the program, make sure all dependencies listed in pyproject.toml are available in a Python 3.7 environment. This can be done by running:
cd metaprivBIDS
pip install -e .
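To confirm that the editable install worked, a quick check from Python can help. The sketch below uses only the standard library; the helper name check_environment is hypothetical, and the package name metaprivBIDS is assumed from the import statement used later in this guide.

```python
import importlib.util
import sys


def check_environment(package="metaprivBIDS", min_version=(3, 7)):
    """Report whether the interpreter is new enough and the package is importable.

    `check_environment` is an illustrative helper, not part of metaprivBIDS.
    """
    python_ok = sys.version_info[:2] >= min_version
    # find_spec returns None when a top-level package cannot be located
    package_found = importlib.util.find_spec(package) is not None
    return {"python_ok": python_ok, "package_found": package_found}


if __name__ == "__main__":
    print(check_environment())
```

If package_found is False, the most likely causes are that the virtual environment is not activated or that pip install -e . was run outside the cloned metaprivBIDS folder.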
Basic Usage
Once installed, you can run the program from any directory using the terminal or command prompt; there is no need to navigate to the program’s installation folder.
metaprivBIDS
This starts the program.
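If the shell reports that the command cannot be found, it may help to check whether it is on the PATH of the active environment. A minimal sketch using the standard library follows; find_command is a hypothetical helper name, and the command name is taken from the line above.

```python
import shutil


def find_command(name="metaprivBIDS"):
    """Return the full path of the command if it is on PATH, otherwise None.

    `find_command` is an illustrative helper, not part of metaprivBIDS.
    """
    return shutil.which(name)


if __name__ == "__main__":
    path = find_command()
    if path is None:
        print("metaprivBIDS not found on PATH; is the environment activated?")
    else:
        print("metaprivBIDS found at", path)
```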
Command-Line Execution
After following the installation guide, the metrics in metaprivBIDS can also be called directly from Python through an import statement, without using the GUI.
For example:
from metaprivBIDS.metaprivBIDS.corelogic.metapriv_corelogic import metaprivBIDS_core_logic
metapriv = metaprivBIDS_core_logic()
# Load the data
data_info = metapriv.load_data('metaprivBIDS/Use_Case_Data/adult_mini.csv')
# Inspect {column, unique value count, column type}
data = data_info["data"]
print("Column Types:",'\n')
print(data_info["column_types"],'\n')
# Select Quasi-Identifiers
selected_columns = ["age", "education", "marital-status", "occupation", "relationship", "sex", "salary-class"]
results_k_global = metapriv.find_lowest_unique_columns(data, selected_columns)
print('Find Influential Columns:','\n')
print(results_k_global)
# Compute Personal Information Factor
pif_value, cig_df = metapriv.compute_cig(data, selected_columns)
print("PIF Value:", pif_value)
print("CIG DataFrame:")
print(cig_df)
# Run SUDA2 computation
results_suda = metapriv.compute_suda2(data, selected_columns, sample_fraction=0.3, missing_value=-999)
# Access results
data_with_scores = results_suda["data_with_scores"]
attribute_contributions = results_suda["attribute_contributions"]
attribute_level_contributions = results_suda["attribute_level_contributions"]
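The choice of quasi-identifiers in selected_columns drives every metric in the example above. To illustrate why, the sketch below counts how many rows share each combination of quasi-identifier values, which is the general idea behind k-anonymity. It is not the metaprivBIDS implementation: the function names and the toy records are hypothetical, and only the standard library is used.

```python
from collections import Counter


def equivalence_class_sizes(rows, quasi_identifiers):
    """Count how many rows share each combination of quasi-identifier values."""
    return Counter(tuple(row[col] for col in quasi_identifiers) for row in rows)


def k_anonymity(rows, quasi_identifiers):
    """Smallest group size; k = 1 means at least one row is unique."""
    return min(equivalence_class_sizes(rows, quasi_identifiers).values())


def unique_rows(rows, quasi_identifiers):
    """Rows whose quasi-identifier combination occurs exactly once."""
    sizes = equivalence_class_sizes(rows, quasi_identifiers)
    return [
        row for row in rows
        if sizes[tuple(row[col] for col in quasi_identifiers)] == 1
    ]


# Hypothetical toy records, not taken from adult_mini.csv
rows = [
    {"age": 39, "sex": "Male", "education": "Bachelors"},
    {"age": 39, "sex": "Male", "education": "Bachelors"},
    {"age": 50, "sex": "Female", "education": "Bachelors"},
    {"age": 50, "sex": "Female", "education": "HS-grad"},
]

print(k_anonymity(rows, ["sex"]))                     # 2
print(k_anonymity(rows, ["sex", "education"]))        # 1
print(len(unique_rows(rows, ["sex", "education"])))   # 2
```

Adding a column to the quasi-identifier set can only split groups further, so k never increases as more columns are selected. This is why it is worth examining which columns contribute most to uniqueness.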
Next Steps
Explore the Examples for an interactive tutorial on how to navigate the graphical user interface of metaprivBIDS.