Getting started with Facets
In this section, we will install Facets in Python, using Jupyter Notebook on Google Colaboratory.
We will then retrieve the training and testing datasets. Finally, we will read the data files.
The data files are the training and testing datasets from Chapter 1, Explaining Artificial Intelligence with Python. This way, we are in a situation in which we know the subject and can analyze the data without having to spend time understanding what it means.
Let's first install Facets on Google Colaboratory.
Installing Facets on Google Colaboratory
Open Facets.ipynb. The first cell contains the installation command:
# @title Install the facets-overview pip package.
!pip install facets-overview
The installation may be lost when the virtual machine (VM) is restarted; if so, running the cell will install Facets again. If Facets is already installed, the following message is displayed:
Requirement already satisfied:
The program will now retrieve the datasets.
Retrieving the datasets
The program retrieves the datasets from GitHub or Google Drive.
To import the data from GitHub, set the import option to repository = "github".
To read the data from Google Drive, set the option to repository = "google".
In this section, we will activate GitHub and import the data:
# @title Importing data {display-mode: "form"}
# Set repository to "github" (default) to read the data from GitHub
# Set repository to "google" to read the data from Google
import os
from google.colab import drive
# Set repository to "github" to read the data from GitHub
# Set repository to "google" to read the data from Google
repository = "github"
if repository == "github":
    !curl -L https://raw.githubusercontent.com/PacktPublishing/Hands-On-Explainable-AI-XAI-with-Python/master/Chapter03/DLH_train.csv --output "DLH_train.csv"
    !curl -L https://raw.githubusercontent.com/PacktPublishing/Hands-On-Explainable-AI-XAI-with-Python/master/Chapter03/DLH_test.csv --output "DLH_test.csv"
The data is now accessible to our runtime. We will set the path for each file:
# Setting the path for each file
dtrain = "/content/DLH_train.csv"
dtest = "/content/DLH_test.csv"
print(dtrain, dtest)
You can perform the same actions with Google Drive:
if repository == "google":
    # Mounting the drive. If it is not mounted, a prompt
    # will provide instructions
    drive.mount('/content/drive')
    # Setting the path for each file
    dtrain = '/content/drive/My Drive/XAI/Chapter03/DLH_Train.csv'
    dtest = '/content/drive/My Drive/XAI/Chapter03/DLH_Test.csv'
    print(dtrain, dtest)
We have installed Facets and can access the files. We will now read the files.
Reading the data files
In this section, we will use pandas to read the data files and load them into DataFrames.
We will first import pandas and define the features:
# Loading Denis Rothman research training and testing data
# into DataFrames
import pandas as pd
features = ["colored_sputum", "cough", "fever", "headache", "days",
"france", "chicago", "class"]
The data files contain no headers, so we will use our features array to define the column names for the training data:
train_data = pd.read_csv(dtrain, names=features, sep=r'\s*,\s*',
engine='python', na_values="?")
The program then reads the testing data file into a DataFrame:
test_data = pd.read_csv(dtest, names=features, sep=r'\s*,\s*',
skiprows=[0], engine='python', na_values="?")
Having read the data into DataFrames, we can now implement feature statistics for our datasets.