pycanon.anonymity package
Subpackages
Module contents
Functions that calculate the anonymity properties of a dataset:
k-anonymity, (alpha,k)-anonymity, l-diversity, entropy l-diversity, recursive (c,l)-diversity, basic beta-likeness, enhanced beta-likeness, t-closeness and delta-disclosure privacy.
- pycanon.anonymity.alpha_k_anonymity(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], gen=True) → Tuple[float, int]
Calculate alpha and k for (alpha,k)-anonymity.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
alpha and k values for (alpha,k)-anonymity.
- Return type:
alpha is a float, k is an int.
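To make the two returned values concrete, here is a minimal plain-Python sketch of one common reading of (alpha,k)-anonymity: k is the size of the smallest equivalence class, and alpha is the largest relative frequency of any sensitive value inside a class. It is not pycanon's implementation. The helper name `alpha_k_sketch`, the list-of-dicts input and the toy rows are assumptions made for illustration, and only a single sensitive attribute is handled.

```python
from collections import Counter, defaultdict


def alpha_k_sketch(rows, quasi_ident, sens_att):
    """Return (alpha, k) for one sensitive attribute (illustrative only)."""
    # Group sensitive values by equivalence class (rows sharing QI values).
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    # k: size of the smallest equivalence class.
    k = min(len(values) for values in classes.values())
    # alpha: largest relative frequency of a sensitive value within a class.
    alpha = max(
        count / len(values)
        for values in classes.values()
        for count in Counter(values).values()
    )
    return alpha, k


# Hypothetical toy data: two equivalence classes of sizes 2 and 3.
rows = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
]
alpha, k = alpha_k_sketch(rows, ["age", "zip"], "disease")
print(alpha, k)
```

In the toy data the second class holds "flu" in two of its three rows, so alpha is 2/3, and the smallest class has two rows, so k is 2.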
- pycanon.anonymity.basic_beta_likeness(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], gen=True) → float
Calculate beta for basic beta-likeness.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
beta value for basic beta-likeness.
- Return type:
float.
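As an illustration of what beta measures, the sketch below follows the usual definition of basic beta-likeness: the largest relative increase of a sensitive value's frequency inside an equivalence class compared with its frequency in the whole table. It is an assumption-laden sketch, not the library's code. `basic_beta_sketch`, the list-of-dicts input and the toy rows are hypothetical, and a single sensitive attribute is assumed.

```python
from collections import Counter, defaultdict


def basic_beta_sketch(rows, quasi_ident, sens_att):
    """Return the largest relative frequency gain (illustrative only)."""
    n = len(rows)
    table_counts = Counter(row[sens_att] for row in rows)
    # Group sensitive values by equivalence class (rows sharing QI values).
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    beta = 0.0
    for values in classes.values():
        for value, count in Counter(values).items():
            p_class = count / len(values)      # frequency inside the class
            p_table = table_counts[value] / n  # frequency in the whole table
            # Only positive gains (class frequency above table frequency) count.
            if p_class > p_table:
                beta = max(beta, (p_class - p_table) / p_table)
    return beta


# Hypothetical toy data: two equivalence classes of sizes 2 and 3.
rows = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
]
beta = basic_beta_sketch(rows, ["age", "zip"], "disease")
print(beta)
```

Here "cancer" makes up 20% of the table but 50% of the first class, a relative gain of 1.5, which is the largest gain in the toy data.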
- pycanon.anonymity.delta_disclosure(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], gen=True) → float
Calculate delta for delta-disclosure privacy.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
delta value for delta-disclosure privacy.
- Return type:
float.
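The following sketch shows one way to read delta: the largest absolute log-ratio between a sensitive value's frequency in an equivalence class and its frequency in the whole table. It is a simplified illustration, not pycanon's implementation. `delta_sketch`, the list-of-dicts input and the toy rows are hypothetical, a single sensitive attribute is assumed, and values that do not occur in a class are skipped (an assumption, since the log-ratio is undefined for them).

```python
import math
from collections import Counter, defaultdict


def delta_sketch(rows, quasi_ident, sens_att):
    """Return max |ln(p_class / p_table)| over classes and values."""
    n = len(rows)
    table_counts = Counter(row[sens_att] for row in rows)
    # Group sensitive values by equivalence class (rows sharing QI values).
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    delta = 0.0
    for values in classes.values():
        # Assumption: only values present in the class are considered.
        for value, count in Counter(values).items():
            p_class = count / len(values)
            p_table = table_counts[value] / n
            delta = max(delta, abs(math.log(p_class / p_table)))
    return delta


# Hypothetical toy data: two equivalence classes of sizes 2 and 3.
rows = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
]
delta = delta_sketch(rows, ["age", "zip"], "disease")
print(delta)
```

For the toy rows the extreme case is "cancer" in the first class (0.5 against 0.2 in the table), giving delta = ln(2.5).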
- pycanon.anonymity.enhanced_beta_likeness(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], gen=True) → float
Calculate beta for enhanced beta-likeness.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
beta value for enhanced beta-likeness.
- Return type:
float.
- pycanon.anonymity.entropy_l_diversity(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], gen=True) → float
Calculate l for entropy l-diversity.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
l value for entropy l-diversity.
- Return type:
float.
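A table satisfies entropy l-diversity when the entropy of the sensitive values in every equivalence class is at least ln(l), so the largest l that holds is the exponential of the smallest class entropy. The sketch below computes that quantity in plain Python. It is illustrative only: `entropy_l_sketch`, the list-of-dicts input and the toy rows are hypothetical, a single sensitive attribute is assumed, and the library may round or post-process its result differently.

```python
import math
from collections import Counter, defaultdict


def entropy_l_sketch(rows, quasi_ident, sens_att):
    """Return exp(minimum class entropy) for one sensitive attribute."""
    # Group sensitive values by equivalence class (rows sharing QI values).
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    entropies = []
    for values in classes.values():
        size = len(values)
        # Shannon entropy (natural log) of the sensitive values in the class.
        entropy = -sum(
            (count / size) * math.log(count / size)
            for count in Counter(values).values()
        )
        entropies.append(entropy)
    return math.exp(min(entropies))


# Hypothetical toy data: two equivalence classes of sizes 2 and 3.
rows = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
]
l_entropy = entropy_l_sketch(rows, ["age", "zip"], "disease")
print(l_entropy)
```

The first class is perfectly balanced (entropy ln 2), while the second class is skewed towards "flu", so it determines the result, which is a little below 2.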
- pycanon.anonymity.k_anonymity(data: DataFrame, quasi_ident: Union[List, ndarray]) → int
Calculate k for k-anonymity.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
- Returns:
k value for k-anonymity.
- Return type:
int.
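For intuition about the returned k, a minimal plain-Python sketch of the definition follows: k is the size of the smallest equivalence class, where an equivalence class is the set of rows that share the same quasi-identifier values. This is not pycanon's implementation. `k_anonymity_sketch`, the list-of-dicts input and the toy rows are hypothetical.

```python
from collections import Counter


def k_anonymity_sketch(rows, quasi_ident):
    """Return the size of the smallest equivalence class."""
    # Count how many rows share each combination of quasi-identifier values.
    class_sizes = Counter(tuple(row[q] for q in quasi_ident) for row in rows)
    return min(class_sizes.values())


# Hypothetical toy data: two equivalence classes of sizes 2 and 3.
rows = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
]
k = k_anonymity_sketch(rows, ["age", "zip"])
print(k)  # 2: the smallest class contains two rows
```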
- pycanon.anonymity.l_diversity(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], gen=True) → int
Calculate l for l-diversity.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
l value for l-diversity.
- Return type:
int.
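The returned l can be read as the smallest number of distinct sensitive values found in any equivalence class. The sketch below shows that reading in plain Python. It is illustrative rather than the library's code: `l_diversity_sketch`, the list-of-dicts input and the toy rows are hypothetical, and only a single sensitive attribute is handled.

```python
from collections import defaultdict


def l_diversity_sketch(rows, quasi_ident, sens_att):
    """Return the minimum count of distinct sensitive values per class."""
    # Collect the distinct sensitive values seen in each equivalence class.
    values_per_class = defaultdict(set)
    for row in rows:
        key = tuple(row[q] for q in quasi_ident)
        values_per_class[key].add(row[sens_att])
    return min(len(values) for values in values_per_class.values())


# Hypothetical toy data: two equivalence classes of sizes 2 and 3.
rows = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
]
l_distinct = l_diversity_sketch(rows, ["age", "zip"], "disease")
print(l_distinct)  # 2: each class contains two distinct diseases
```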
- pycanon.anonymity.recursive_c_l_diversity(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], imp=False, gen=True) → Tuple[float, int]
Calculate c and l for recursive (c,l)-diversity.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
c and l values for recursive (c,l)-diversity.
- Return type:
c is a float, l is an int.
- pycanon.anonymity.t_closeness(data: DataFrame, quasi_ident: Union[List, ndarray], sens_att: Union[List, ndarray], gen=True) → float
Calculate t for t-closeness.
- Parameters:
data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are sensitive attributes.
gen (boolean) – defaults to True. If True, the calculation is generalized to the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
- Returns:
t value for t-closeness.
- Return type:
float.
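To illustrate what t expresses, the sketch below measures, for each equivalence class, the distance between the class distribution of a sensitive attribute and its distribution in the whole table, and returns the largest distance. It assumes a categorical attribute with equal ground distance between values, in which case the Earth Mover's Distance reduces to half the sum of absolute differences. Numerical attributes usually need an ordered distance instead, and the library may handle them differently. `t_closeness_sketch`, the list-of-dicts input and the toy rows are hypothetical.

```python
from collections import Counter, defaultdict


def t_closeness_sketch(rows, quasi_ident, sens_att):
    """Return the largest class-to-table distance (categorical case only)."""
    n = len(rows)
    table_counts = Counter(row[sens_att] for row in rows)
    # Group sensitive values by equivalence class (rows sharing QI values).
    classes = defaultdict(list)
    for row in rows:
        classes[tuple(row[q] for q in quasi_ident)].append(row[sens_att])
    t = 0.0
    for values in classes.values():
        class_counts = Counter(values)
        # Equal-distance EMD: half the L1 distance between the distributions.
        distance = 0.5 * sum(
            abs(class_counts.get(value, 0) / len(values) - table_counts[value] / n)
            for value in table_counts
        )
        t = max(t, distance)
    return t


# Hypothetical toy data: two equivalence classes of sizes 2 and 3.
rows = [
    {"age": "30-39", "zip": "476**", "disease": "flu"},
    {"age": "30-39", "zip": "476**", "disease": "cancer"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "flu"},
    {"age": "40-49", "zip": "479**", "disease": "asthma"},
]
t = t_closeness_sketch(rows, ["age", "zip"], "disease")
print(t)
```

In the toy data the first class is the furthest from the table distribution (distance 0.3, against 0.2 for the second class), so t is 0.3 up to floating-point rounding.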