pycanon.report package

Submodules

pycanon.report.base module

Get report values for all privacy models.

pycanon.report.base.get_report_values(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → Tuple[int, Tuple[float, int], int, float, Tuple[Any, int], float, float, float, float]

Generate a report with the parameters obtained for each anonymity check.

Parameters:
  • data (pandas dataframe) – dataframe with the data under study.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
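A minimal usage sketch, assuming pandas and pycanon are installed. The toy table and its column names are invented for illustration. The only fact taken from the signature above is that the result is a tuple with nine entries; the meaning of each position is not spelled out here, so the sketch does not name them.

```python
# Toy table (hypothetical data): two quasi-identifiers and one sensitive attribute.
columns = {
    "age": ["20-30", "20-30", "30-40", "30-40"],
    "zip": ["280**", "280**", "390**", "390**"],
    "disease": ["flu", "cold", "flu", "cold"],
}
quasi_ident = ["age", "zip"]
sens_att = ["disease"]

try:
    import pandas as pd
    from pycanon.report.base import get_report_values
except ImportError:
    # pandas or pycanon is missing, so there is nothing to compute.
    values = None
else:
    data = pd.DataFrame(columns)
    values = get_report_values(data, quasi_ident, sens_att, gen=True)
    # Per the signature, the result is a tuple with nine entries.
    print(len(values), values)
```

The import guard only keeps the sketch runnable in environments without the library; in a project that depends on pycanon it can be dropped.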

pycanon.report.json module

Get report values as JSON for all privacy models.

pycanon.report.json.get_json_report(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → str

Generate a report (JSON) with the parameters obtained for each anonymity check.

Parameters:
  • data (pandas dataframe) – dataframe with the data under study.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
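Because the function returns a string, the report can be decoded with the standard json module. The sketch below assumes pandas and pycanon are installed and that the returned string is valid JSON; the toy data is hypothetical, and the structure of the decoded object is not assumed, only printed.

```python
import json

# Toy table (hypothetical data).
columns = {
    "age": ["20-30", "20-30", "30-40", "30-40"],
    "zip": ["280**", "280**", "390**", "390**"],
    "disease": ["flu", "cold", "flu", "cold"],
}

try:
    import pandas as pd
    from pycanon.report.json import get_json_report
except ImportError:
    # pandas or pycanon is missing, so no report is produced.
    report = None
else:
    report_json = get_json_report(
        pd.DataFrame(columns), ["age", "zip"], ["disease"], gen=True
    )
    # Decode the string and pretty-print whatever structure it contains.
    report = json.loads(report_json)
    print(json.dumps(report, indent=2))
```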

pycanon.report.pdf module

Get report values as a PDF file for all privacy models.

pycanon.report.pdf.get_pdf_report(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True, file_pdf='report.pdf') → None

Generate the PDF report with the parameters obtained for each anonymity check.

Parameters:
  • data (pandas dataframe) – dataframe with the data under study.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.

  • file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘report.pdf’.

pycanon.report.pdf_utility_report module

Get utility report values.

pycanon.report.pdf_utility_report.get_pdf_utility_report(data_raw: DataFrame, data_anon: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, sup=True, gen=True, file_pdf='utility_report.pdf') → None

Generate the PDF report with both the utility and anonymity checks.

Parameters:
  • data_raw (pandas dataframe) – dataframe with the raw data under study.

  • data_anon (pandas dataframe) – dataframe with the anonymized data.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • sup (boolean) – defaults to True. If True, suppression has been applied to the original dataset (some records may have been deleted).

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.

  • file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘utility_report.pdf’.

pycanon.report.pdf_utility_report.get_utility_report_values(data_raw: DataFrame, data_anon: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, sup=True) → Tuple[float, float, float, dict]

Generate a report with the parameters obtained for each utility metric.

Parameters:
  • data_raw (pandas dataframe) – dataframe with the raw data under study.

  • data_anon (pandas dataframe) – dataframe with the anonymized data.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • sup (boolean) – defaults to True. If True, suppression has been applied to the original dataset (some records may have been deleted).
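A minimal usage sketch, assuming pandas and pycanon are installed. Both toy tables are hypothetical: the anonymized one generalizes the quasi-identifiers of the raw one without deleting any record, so sup is set to False. Per the signature, the result is a four-element tuple (three floats and a dict); the sketch does not assume which metric sits at which position.

```python
# Raw toy table (hypothetical data).
raw_columns = {
    "age": [23, 27, 34, 38],
    "zip": ["28001", "28002", "39001", "39005"],
    "disease": ["flu", "cold", "flu", "cold"],
}
# Anonymized counterpart: same records, generalized quasi-identifiers.
anon_columns = {
    "age": ["20-30", "20-30", "30-40", "30-40"],
    "zip": ["280**", "280**", "390**", "390**"],
    "disease": ["flu", "cold", "flu", "cold"],
}
quasi_ident = ["age", "zip"]
sens_att = ["disease"]

try:
    import pandas as pd
    from pycanon.report.pdf_utility_report import get_utility_report_values
except ImportError:
    # pandas, pycanon, or an optional dependency of the module is missing.
    metrics = None
else:
    metrics = get_utility_report_values(
        pd.DataFrame(raw_columns),
        pd.DataFrame(anon_columns),
        quasi_ident,
        sens_att,
        sup=False,  # no records were suppressed in this toy example
    )
    print(metrics)
```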

Module contents

Generate reports with the scores of all privacy models.

pycanon.report.get_json_report(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → str

Generate a report (JSON) with the parameters obtained for each anonymity check.

Parameters:
  • data (pandas dataframe) – dataframe with the data under study.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.

pycanon.report.get_pdf_report(data: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, gen=True, file_pdf='report.pdf') → None

Generate the PDF report with the parameters obtained for each anonymity check.

Parameters:
  • data (pandas dataframe) – dataframe with the data under study.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.

  • file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘report.pdf’.

pycanon.report.get_pdf_utility_report(data_raw: DataFrame, data_anon: DataFrame, quasi_ident: List | ndarray, sens_att: List | ndarray, sup=True, gen=True, file_pdf='utility_report.pdf') → None

Generate the PDF report with both the utility and anonymity checks.

Parameters:
  • data_raw (pandas dataframe) – dataframe with the raw data under study.

  • data_anon (pandas dataframe) – dataframe with the anonymized data.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • sup (boolean) – defaults to True. If True, suppression has been applied to the original dataset (some records may have been deleted).

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.

  • file_pdf (string with extension .pdf) – name of the PDF file with the report. Defaults to ‘utility_report.pdf’.

pycanon.report.get_report_values(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → Tuple[int, Tuple[float, int], int, float, Tuple[Any, int], float, float, float, float]

Generate a report with the parameters obtained for each anonymity check.

Parameters:
  • data (pandas dataframe) – dataframe with the data under study.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.
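The gen=False behaviour can be pictured with a small pure-Python sketch. It rests on an assumption about what "the set of QI is updated for each SA" means: when one sensitive attribute is being checked, the remaining sensitive attributes are treated as additional quasi-identifiers. The helper name qi_per_sensitive_attribute and the column names are hypothetical, and this is an illustration of the idea rather than pycanon's own code.

```python
from typing import Dict, List


def qi_per_sensitive_attribute(
    quasi_ident: List[str], sens_att: List[str]
) -> Dict[str, List[str]]:
    """Map each SA to the QI set used while checking it (assumed behaviour).

    For every sensitive attribute, the other sensitive attributes are
    appended to the original quasi-identifiers.
    """
    return {
        sa: list(quasi_ident) + [other for other in sens_att if other != sa]
        for sa in sens_att
    }


updated = qi_per_sensitive_attribute(["age", "zip"], ["disease", "salary"])
print(updated)
# {'disease': ['age', 'zip', 'salary'], 'salary': ['age', 'zip', 'disease']}
```

With a single sensitive attribute the mapping leaves the quasi-identifiers unchanged, so under this reading the two settings of gen would only differ when sens_att has more than one entry.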

pycanon.report.print_report(data: DataFrame, quasi_ident: list, sens_att: list, gen=True) → None

Print a report with the parameters obtained for each anonymity check.

Parameters:
  • data (pandas dataframe) – dataframe with the data under study.

  • quasi_ident (list of strings) – list with the names of the columns of the dataframe that are quasi-identifiers.

  • sens_att (list of strings) – list with the names of the columns of the dataframe that are the sensitive attributes.

  • gen (boolean) – defaults to True. If True, the check is generalized for the case of multiple sensitive attributes (SA); if False, the set of quasi-identifiers (QI) is updated for each SA.