pycanon.anonymity package
Module contents
Module with functions that calculate the following anonymity properties of a dataset: k-anonymity, (alpha,k)-anonymity, l-diversity, entropy l-diversity, recursive (c,l)-diversity, basic beta-likeness, enhanced beta-likeness, t-closeness and delta-disclosure privacy.
- pycanon.anonymity.alpha_k_anonymity(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True) -> Tuple[float, int]
Calculate alpha and k for (alpha,k)-anonymity.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
alpha and k values for (alpha,k)-anonymity.
- Return type:
alpha is a float, k is an int.
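To make the two returned values concrete, the following standard-library sketch computes them by hand for a single sensitive attribute. It assumes the usual definition (k is the size of the smallest equivalence class, alpha is the largest relative frequency of any sensitive value within a class). The helper name `alpha_k_sketch`, the toy records and the column names are hypothetical, and this is not pycanon's implementation.

```python
from collections import Counter, defaultdict

# Toy records; column names and values are illustrative only.
rows = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "asthma"},
]


def alpha_k_sketch(rows, quasi_ident, sens_att):
    """Hypothetical helper: (alpha, k) for one sensitive attribute."""
    # Group sensitive values by equivalence class (identical QI values).
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    # k: size of the smallest equivalence class.
    k = min(len(values) for values in classes.values())
    # alpha: largest relative frequency of a sensitive value in any class.
    alpha = max(
        max(Counter(values).values()) / len(values)
        for values in classes.values()
    )
    return alpha, k


alpha, k = alpha_k_sketch(rows, ["age", "zip"], "disease")
print(alpha, k)
```

Here the second class contains "flu" in two of its three records, so alpha is 2/3, and the smallest class has two records, so k is 2.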
- pycanon.anonymity.basic_beta_likeness(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True) -> float
Calculate beta for basic beta-likeness.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
beta value for basic beta-likeness.
- Return type:
float.
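As a rough illustration of what beta measures, the sketch below assumes the common definition of basic beta-likeness: for every sensitive value whose frequency q inside an equivalence class exceeds its frequency p in the whole table, the relative gain (q - p) / p must not exceed beta. The helper `basic_beta_sketch` and the toy data are hypothetical, and pycanon's own computation may differ in details.

```python
from collections import Counter, defaultdict

# Toy records; column names and values are illustrative only.
rows = [
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "cancer"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "asthma"},
]


def basic_beta_sketch(rows, quasi_ident, sens_att):
    """Hypothetical helper: smallest beta satisfied by the table."""
    n = len(rows)
    overall = Counter(row[sens_att] for row in rows)
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    beta = 0.0
    for values in classes.values():
        for value, count in Counter(values).items():
            p = overall[value] / n      # frequency in the whole table
            q = count / len(values)     # frequency inside the class
            if q > p:                   # only positive information gain counts
                beta = max(beta, (q - p) / p)
    return beta


beta = basic_beta_sketch(rows, ["age"], "disease")
print(beta)
```

In this toy table "cancer" makes up 20% of all records but 50% of the first class, so the largest relative gain is (0.5 - 0.2) / 0.2, approximately 1.5.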
- pycanon.anonymity.delta_disclosure(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True) -> float
Calculate delta for delta-disclosure privacy.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
delta value for delta-disclosure privacy.
- Return type:
float.
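The following sketch shows one way to read the delta value, assuming the usual definition in which |log(q / p)| must stay below delta, with q the frequency of a sensitive value inside an equivalence class and p its frequency in the whole table. As a simplifying assumption, only values that actually occur in a class are considered, because the log ratio is unbounded for absent values. The helper `delta_sketch` and the toy data are hypothetical; this is not pycanon's code.

```python
import math
from collections import Counter, defaultdict

# Toy records; column names and values are illustrative only.
rows = [
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "cancer"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "asthma"},
]


def delta_sketch(rows, quasi_ident, sens_att):
    """Hypothetical helper: largest |log(q / p)| over classes and values."""
    n = len(rows)
    overall = Counter(row[sens_att] for row in rows)
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    delta = 0.0
    for values in classes.values():
        # Assumption: iterate only over values present in this class.
        for value, count in Counter(values).items():
            p = overall[value] / n
            q = count / len(values)
            delta = max(delta, abs(math.log(q / p)))
    return delta


delta = delta_sketch(rows, ["age"], "disease")
print(delta)
```

For this data the largest ratio comes from "cancer" in the first class (0.5 against 0.2 overall), giving a delta of log(2.5) under the natural logarithm.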
- pycanon.anonymity.enhanced_beta_likeness(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True) -> float
Calculate beta for enhanced beta-likeness.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
beta value for enhanced beta-likeness.
- Return type:
float.
- pycanon.anonymity.entropy_l_diversity(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True) -> float
Calculate l for entropy l-diversity.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
l value for entropy l-diversity.
- Return type:
float.
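A float result is natural here because, under the standard definition, a table is entropy l-diverse when the entropy of the sensitive values in every equivalence class is at least log(l). The largest such l is therefore the exponential of the smallest class entropy. The sketch below works this out with the standard library only; `entropy_l_sketch` and the toy data are hypothetical, and pycanon may post-process the value differently.

```python
import math
from collections import Counter, defaultdict

# Toy records; column names and values are illustrative only.
rows = [
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "cancer"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "asthma"},
]


def entropy_l_sketch(rows, quasi_ident, sens_att):
    """Hypothetical helper: exp of the smallest per-class entropy."""
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    entropies = []
    for values in classes.values():
        size = len(values)
        # Shannon entropy (natural log) of the sensitive values in the class.
        entropy = -sum(
            (count / size) * math.log(count / size)
            for count in Counter(values).values()
        )
        entropies.append(entropy)
    return math.exp(min(entropies))


l_value = entropy_l_sketch(rows, ["age"], "disease")
print(round(l_value, 4))
```

The first class has two equally frequent values (entropy log 2, so l would be 2), while the second class is skewed towards "flu" and limits the result to a value slightly below 2.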
- pycanon.anonymity.k_anonymity(data: DataFrame, quasi_ident: List | ndarray) -> int
Calculate k for k-anonymity.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers.
- Returns:
k value for k-anonymity.
- Return type:
int.
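The k value is the size of the smallest equivalence class, that is, the smallest group of records sharing the same quasi-identifier values. A minimal standard-library sketch of that definition follows; the helper `k_anonymity_sketch`, the records and the column names are hypothetical and only illustrate the idea, not pycanon's implementation.

```python
from collections import Counter

# Toy records; column names and values are illustrative only.
rows = [
    {"age": "30-39", "zip": "130**", "disease": "flu"},
    {"age": "30-39", "zip": "130**", "disease": "cancer"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "flu"},
    {"age": "40-49", "zip": "148**", "disease": "asthma"},
]


def k_anonymity_sketch(rows, quasi_ident):
    """Hypothetical helper: size of the smallest equivalence class."""
    # Count how many records share each combination of QI values.
    sizes = Counter(tuple(row[q] for q in quasi_ident) for row in rows)
    return min(sizes.values())


k = k_anonymity_sketch(rows, ["age", "zip"])
print(k)
```

With a pandas dataframe holding the same columns, the corresponding library call would follow the signature above: `pycanon.anonymity.k_anonymity(data, ["age", "zip"])`.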
- pycanon.anonymity.l_diversity(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True) -> int
Calculate l for l-diversity.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
l value for l-diversity.
- Return type:
int.
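For a single sensitive attribute, distinct l-diversity can be read as the smallest number of different sensitive values found in any equivalence class. The sketch below assumes that definition and uses only the standard library; `l_diversity_sketch` and the toy data are hypothetical, and the handling of several sensitive attributes (the `gen` parameter) is not modelled.

```python
from collections import defaultdict

# Toy records; column names and values are illustrative only.
rows = [
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "cancer"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "flu"},
]


def l_diversity_sketch(rows, quasi_ident, sens_att):
    """Hypothetical helper: fewest distinct sensitive values in a class."""
    distinct = defaultdict(set)
    for row in rows:
        distinct[tuple(row[q] for q in quasi_ident)].add(row[sens_att])
    return min(len(values) for values in distinct.values())


l_value = l_diversity_sketch(rows, ["age"], "disease")
print(l_value)
```

Every record in the second class has the same disease, so l is 1 even though that class satisfies k-anonymity with k = 3; this is the homogeneity problem that l-diversity is meant to expose.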
- pycanon.anonymity.recursive_c_l_diversity(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, imp=False, gen=True) -> Tuple[float, int]
Calculate c and l for recursive (c,l)-diversity.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
c and l values for recursive (c,l)-diversity.
- Return type:
c is a float, l is an int.
- pycanon.anonymity.t_closeness(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True) -> float
Calculate t for t-closeness.
- Parameters:
data (pandas DataFrame) – dataframe with the data under study.
quasi_ident (list of strings) – names of the dataframe columns that are quasi-identifiers (QI).
sens_att (list of strings) – names of the dataframe columns that are sensitive attributes (SA).
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple SA; if False, the set of QI is updated for each SA.
- Returns:
t value for t-closeness.
- Return type:
float.
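The sketch below illustrates the t value for the simplest case only: a single categorical sensitive attribute with equal ground distance between values, where the Earth Mover's Distance between a class distribution and the overall distribution reduces to half the sum of absolute differences. Numerical attributes normally use an ordered distance instead, which is not covered here. The helper `t_closeness_sketch` and the toy data are hypothetical and do not reproduce pycanon's implementation.

```python
from collections import Counter, defaultdict

# Toy records; column names and values are illustrative only.
rows = [
    {"age": "30-39", "disease": "flu"},
    {"age": "30-39", "disease": "cancer"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "flu"},
    {"age": "40-49", "disease": "asthma"},
]


def t_closeness_sketch(rows, quasi_ident, sens_att):
    """Hypothetical helper: largest class-to-table distance (categorical)."""
    n = len(rows)
    overall = Counter(row[sens_att] for row in rows)
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    t = 0.0
    for values in classes.values():
        local = Counter(values)  # missing values count as 0
        # Equal-distance EMD: half the L1 distance between distributions.
        distance = 0.5 * sum(
            abs(local[value] / len(values) - overall[value] / n)
            for value in overall
        )
        t = max(t, distance)
    return t


t = t_closeness_sketch(rows, ["age"], "disease")
print(round(t, 4))
```

The first class deviates most from the overall distribution (it has no "asthma" and over-represents "cancer"), so it determines the reported t of about 0.3.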