pycanon.report package

Submodules

pycanon.report.base module

Get report values for all privacy models.

pycanon.report.base.get_report_values(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → Tuple[int, Tuple[float, int], int, float, Tuple[Any, int], float, float, float, float]

Generate a report with the parameters obtained for each anonymity check.

Parameters:

data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.
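The return annotation lists nine positional values but not what each one means. The sketch below attaches labels to them so the result is easier to read. The label names and their order are an assumption, inferred from the shape of the tuple and from the privacy models pyCANON checks; this page does not state them, so verify them against the library source before relying on them. `label_report` and `labelled_report` are hypothetical helpers, not part of pyCANON.

```python
from typing import Any, Dict, Sequence

# ASSUMPTION: names and order are inferred from the return annotation
# (int, (float, int), int, float, (Any, int), float, float, float, float).
ASSUMED_LABELS = (
    "k_anonymity",              # int
    "alpha_k_anonymity",        # (float, int)
    "l_diversity",              # int
    "entropy_l_diversity",      # float
    "recursive_c_l_diversity",  # (Any, int)
    "basic_beta_likeness",      # float
    "enhanced_beta_likeness",   # float
    "delta_disclosure",         # float
    "t_closeness",              # float
)


def label_report(values: Sequence[Any]) -> Dict[str, Any]:
    """Pair each positional report value with its assumed label."""
    if len(values) != len(ASSUMED_LABELS):
        raise ValueError(
            f"expected {len(ASSUMED_LABELS)} values, got {len(values)}"
        )
    return dict(zip(ASSUMED_LABELS, values))


def labelled_report(data, quasi_ident, sens_att, gen=True) -> Dict[str, Any]:
    """Call get_report_values and label its result."""
    # Imported here so label_report stays usable without pyCANON installed.
    from pycanon.report import get_report_values

    return label_report(get_report_values(data, quasi_ident, sens_att, gen=gen))
```

Returning a dictionary keeps the caller from depending on tuple positions, at the cost of trusting the assumed labels.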

pycanon.report.json module

Get report values as JSON for all privacy models.

pycanon.report.json.get_json_report(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → str

Generate a report (JSON) with the parameters obtained for each anonymity check.

Parameters:

data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.
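Since `get_json_report` returns a string, it can be handed to the standard `json` module. The following is a minimal sketch, assuming the string is valid JSON; the helper name `pretty_report` is hypothetical, and no keys are named because this page does not describe the layout of the report.

```python
import json


def pretty_report(report: str) -> str:
    """Re-serialize a JSON report string with indentation and sorted keys."""
    parsed = json.loads(report)  # raises json.JSONDecodeError on invalid input
    return json.dumps(parsed, indent=2, sort_keys=True)
```

For instance, `print(pretty_report(get_json_report(data, quasi_ident, sens_att)))` would print the report in a readable layout.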

pycanon.report.pdf module

Get report values as PDF file for all privacy models.

pycanon.report.pdf.get_pdf_report(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True, file_pdf='report.pdf') → None

Generate the PDF report with the parameters obtained for each anonymity check.

Parameters:

data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.
file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘report.pdf’.
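The parameters above carry two implicit requirements: every name in `quasi_ident` and `sens_att` must be a column of `data`, and `file_pdf` must end in `.pdf`. A minimal sketch of checking both before generating the report follows. `check_report_args` and `checked_pdf_report` are hypothetical helpers, and whether pyCANON performs similar checks itself is not stated on this page.

```python
from typing import Iterable, List


def check_report_args(
    columns: Iterable[str],
    quasi_ident: Iterable[str],
    sens_att: Iterable[str],
    file_pdf: str = "report.pdf",
) -> List[str]:
    """Return a list of problems found in the arguments (empty if none)."""
    known = set(columns)
    problems = []
    for name in list(quasi_ident) + list(sens_att):
        if name not in known:
            problems.append(f"column not in dataframe: {name!r}")
    if not file_pdf.lower().endswith(".pdf"):
        problems.append(f"file_pdf should end in .pdf: {file_pdf!r}")
    return problems


def checked_pdf_report(data, quasi_ident, sens_att, gen=True,
                       file_pdf="report.pdf") -> None:
    """Validate the arguments, then write the PDF report."""
    problems = check_report_args(data.columns, quasi_ident, sens_att, file_pdf)
    if problems:
        raise ValueError("; ".join(problems))
    # Imported here so check_report_args stays usable without pyCANON installed.
    from pycanon.report import get_pdf_report

    get_pdf_report(data, quasi_ident, sens_att, gen=gen, file_pdf=file_pdf)
```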

pycanon.report.pdf_utility_report module

Get utility report values.

pycanon.report.pdf_utility_report.get_pdf_utility_report(data_raw: DataFrame, data_anon: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, sup=True, gen=True, file_pdf='utility_report.pdf') → None

Generate the PDF report both with the utility and anonymity checks.

Parameters:

data_raw (pandas dataframe) – dataframe with the raw data under study.
data_anon (pandas dataframe) – dataframe with the anonymized data.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
sup (boolean) – defaults to True. If True, suppression has been applied to the original dataset (some records may have been deleted).
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.
file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘utility_report.pdf’.

pycanon.report.pdf_utility_report.get_utility_report_values(data_raw: DataFrame, data_anon: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, sup=True) → Tuple[float, float, float, dict]

Generate a report with the parameters obtained for each utility metric.

Parameters:

data_raw (pandas dataframe) – dataframe with the raw data under study.
data_anon (pandas dataframe) – dataframe with the anonymized data.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
sup (boolean) – defaults to True. If True, suppression has been applied to the original dataset (some records may have been deleted).
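The `sup` flag tells the utility functions whether records were deleted during anonymization. One way to choose its value, sketched below, is to compare row counts. This heuristic is an assumption of the example, not something pyCANON is documented to do, and `infer_sup` and `utility_values` are hypothetical helpers. The four returned values are left unnamed because this page gives only their types.

```python
from typing import Tuple


def infer_sup(n_raw: int, n_anon: int) -> bool:
    """Guess whether suppression was applied.

    Returns True when the anonymized data has fewer rows than the raw data.
    """
    if n_raw < 0 or n_anon < 0:
        raise ValueError("row counts must be non-negative")
    return n_anon < n_raw


def utility_values(data_raw, data_anon, quasi_ident,
                   sens_att) -> Tuple[float, float, float, dict]:
    """Compute the utility report values with an inferred `sup` flag."""
    # Imported here so infer_sup stays usable without pyCANON installed.
    from pycanon.report.pdf_utility_report import get_utility_report_values

    sup = infer_sup(len(data_raw), len(data_anon))
    return get_utility_report_values(
        data_raw, data_anon, quasi_ident, sens_att, sup=sup
    )
```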

Module contents

Generate reports with the scores of all privacy models.

pycanon.report.get_json_report(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → str

Generate a report (JSON) with the parameters obtained for each anonymity check.

Parameters:

data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.

pycanon.report.get_pdf_report(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True, file_pdf='report.pdf') → None

Generate the PDF report with the parameters obtained for each anonymity check.

Parameters:

data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.
file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘report.pdf’.

pycanon.report.get_pdf_utility_report(data_raw: DataFrame, data_anon: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, sup=True, gen=True, file_pdf='utility_report.pdf') → None

Generate the PDF report both with the utility and anonymity checks.

Parameters:

data_raw (pandas dataframe) – dataframe with the raw data under study.
data_anon (pandas dataframe) – dataframe with the anonymized data.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
sup (boolean) – defaults to True. If True, suppression has been applied to the original dataset (some records may have been deleted).
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.
file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘utility_report.pdf’.

pycanon.report.get_report_values(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → Tuple[int, Tuple[float, int], int, float, Tuple[Any, int], float, float, float, float]

Generate a report with the parameters obtained for each anonymity check.

Parameters:

data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.

pycanon.report.print_report(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → None

Print a report with the parameters obtained for each anonymity check.

Parameters:

data (pandas dataframe) – dataframe with the data under study.
quasi_ident (list of strings) – list with the names of the dataframe columns that are quasi-identifiers.
sens_att (list of strings) – list with the names of the dataframe columns that are the sensitive attributes.
gen (boolean) – defaults to True. If True, the check is generalized to the case of multiple sensitive attributes; if False, the set of quasi-identifiers is updated for each sensitive attribute.
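The `gen` description is terse about what "updated for each SA" means. One plausible reading, sketched below, is that with `gen=False` each sensitive attribute is checked in turn while the remaining sensitive attributes are treated as additional quasi-identifiers. This interpretation is an assumption, not something this page confirms, and `qi_sets_per_sa` is a hypothetical helper that only illustrates the idea; it is not part of pyCANON.

```python
from typing import Dict, List, Sequence


def qi_sets_per_sa(quasi_ident: Sequence[str],
                   sens_att: Sequence[str]) -> Dict[str, List[str]]:
    """Build the quasi-identifier set used for each sensitive attribute.

    ASSUMED reading of gen=False: for a given sensitive attribute, every
    other sensitive attribute is appended to the quasi-identifiers.
    """
    return {
        sa: list(quasi_ident) + [other for other in sens_att if other != sa]
        for sa in sens_att
    }
```

With a single sensitive attribute the quasi-identifier set is unchanged, so under this reading the two settings of `gen` would only differ when `sens_att` has more than one entry.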