Python Data Analysis
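1. To select a single column, write the column name in square brackets after the DataFrame object's name, much as you would select an element from an array.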
    import pandas as pd

    # Load the exam scores from the Excel workbook
    df = pd.read_excel(r"D:\PythonTestFile\exam_new.xlsx")
    print(df, '\n')
    # Select the "Name" column with square brackets
    print(df['Name'])
Output:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2       Bob    male       99       84    89
    3    Olivia  female       86       87    44
    4      Jeff    male       48       87    65
    5      Liam    male       55       88    69
    6    Sophia  female       90       66    96
    7  Isabella  female       66       85    55

    0        Noah
    1        Emma
    2         Bob
    3      Olivia
    4        Jeff
    5        Liam
    6      Sophia
    7    Isabella
    Name: Name, dtype: object
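To try this without the Excel file, the same table can be rebuilt in memory. The sketch below assumes the values shown in the output above; the variable name `names` is only illustrative.

```python
import pandas as pd

# Rebuild the example table in memory (values copied from the output above),
# so the snippet runs without exam_new.xlsx.
df = pd.DataFrame({
    "Name": ["Noah", "Emma", "Bob", "Olivia", "Jeff", "Liam", "Sophia", "Isabella"],
    "Sex": ["male", "female", "male", "female", "male", "male", "female", "female"],
    "Chinese": [90, 56, 99, 86, 48, 55, 90, 66],
    "English": [50, 56, 84, 87, 87, 88, 66, 85],
    "Math": [66, 55, 89, 44, 65, 69, 96, 55],
})

# Square brackets with a single column name return that column as a Series.
names = df["Name"]
print(type(names).__name__)  # Series
print(names.tolist())
```

The result is a one-dimensional Series that keeps the row index of the original table, which is why the printed column ends with the `Name: Name, dtype: object` line.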
2. Alternatively, you can select a column with dot (.) notation, written as "object_name.column_name" after the DataFrame object's name.
    import pandas as pd

    # Load the exam scores from the Excel workbook
    df = pd.read_excel(r"D:\PythonTestFile\exam_new.xlsx")
    print(df, '\n')
    # Select the "Name" column with dot notation
    print(df.Name)
Output:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2       Bob    male       99       84    89
    3    Olivia  female       86       87    44
    4      Jeff    male       48       87    65
    5      Liam    male       55       88    69
    6    Sophia  female       90       66    96
    7  Isabella  female       66       85    55

    0        Noah
    1        Emma
    2         Bob
    3      Olivia
    4        Jeff
    5        Liam
    6      Sophia
    7    Isabella
    Name: Name, dtype: object
These two ways of selecting data are called ordinary indexing.
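The two forms return the same column, but dot access has limits worth knowing: it only works when the column name is a valid Python identifier and does not collide with an existing DataFrame attribute or method. A minimal sketch, using a small hypothetical table (the columns `Total Score` and `count` are made up for illustration and are not part of the exam file):

```python
import pandas as pd

# Hypothetical table with two awkward column names.
df = pd.DataFrame({
    "Name": ["Noah", "Emma"],
    "Total Score": [206, 167],
    "count": [3, 3],
})

# For an ordinary name, both forms give the same Series.
print(df.Name.equals(df["Name"]))  # True

# A name containing a space cannot follow a dot; square brackets still work.
print(df["Total Score"].tolist())

# `count` is also a DataFrame method, so df.count is that method, not the column.
print(callable(df.count))  # True
print(df["count"].tolist())
```

Square brackets are therefore the safer default when column names come from external files.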
3. You can also select a column by its position, which is called positional indexing. This uses the iloc indexer:
    import pandas as pd

    # Load the exam scores from the Excel workbook
    df = pd.read_excel(r"D:\PythonTestFile\exam_new.xlsx")
    print(df, '\n')
    # All rows, column at position 0
    print(df.iloc[:, 0])
Output:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2       Bob    male       99       84    89
    3    Olivia  female       86       87    44
    4      Jeff    male       48       87    65
    5      Liam    male       55       88    69
    6    Sophia  female       90       66    96
    7  Isabella  female       66       85    55

    0        Noah
    1        Emma
    2         Bob
    3      Olivia
    4        Jeff
    5        Liam
    6      Sophia
    7    Isabella
    Name: Name, dtype: object
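The square brackets after iloc take two arguments separated by a comma. The first argument gives the positions of the rows to select; the second gives the positions of the columns. A bare colon as the first argument selects all rows.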
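The roles of the two arguments can be seen by varying them one at a time. The sketch below uses a three-row excerpt of the exam table built in memory; the variable names are illustrative.

```python
import pandas as pd

# Three-row excerpt of the exam table, built in memory for illustration.
df = pd.DataFrame({
    "Name": ["Noah", "Emma", "Bob"],
    "Sex": ["male", "female", "male"],
    "Chinese": [90, 56, 99],
})

# All rows, column at position 0: the whole "Name" column.
first_col = df.iloc[:, 0]
print(first_col.tolist())

# Row at position 1, all columns: Emma's record as a Series.
second_row = df.iloc[1, :]
print(second_row.tolist())

# Row at position 2, column at position 2: a single value.
cell = df.iloc[2, 2]
print(cell)  # 99
```

Positions count from 0 in both directions, regardless of the row labels or column names.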
To select several columns with ordinary indexing, pass the column names as a list. With positional indexing, pass a list of column positions instead.
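    import pandas as pd

    # Load the exam scores from the Excel workbook
    df = pd.read_excel(r"D:\PythonTestFile\exam_new.xlsx")
    print(df, '\n')
    # Select two columns by name
    print(df[['Name', 'Chinese']], '\n')
    # Select the same two columns by position
    print(df.iloc[:, [0, 2]])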
Output:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2       Bob    male       99       84    89
    3    Olivia  female       86       87    44
    4      Jeff    male       48       87    65
    5      Liam    male       55       88    69
    6    Sophia  female       90       66    96
    7  Isabella  female       66       85    55

           Name  Chinese
    0      Noah       90
    1      Emma       56
    2       Bob       99
    3    Olivia       86
    4      Jeff       48
    5      Liam       55
    6    Sophia       90
    7  Isabella       66

           Name  Chinese
    0      Noah       90
    1      Emma       56
    2       Bob       99
    3    Olivia       86
    4      Jeff       48
    5      Liam       55
    6    Sophia       90
    7  Isabella       66
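As a quick check that the two multi-column forms agree, the following sketch compares them on a small in-memory excerpt of the table. The names `by_name` and `by_position` are illustrative.

```python
import pandas as pd

# Three-row excerpt of the exam table, built in memory for illustration.
df = pd.DataFrame({
    "Name": ["Noah", "Emma", "Bob"],
    "Sex": ["male", "female", "male"],
    "Chinese": [90, 56, 99],
})

by_name = df[["Name", "Chinese"]]    # list of column names
by_position = df.iloc[:, [0, 2]]     # list of column positions

# Selecting with a list returns a DataFrame, not a Series.
print(type(by_name).__name__)        # DataFrame
print(by_name.equals(by_position))   # True

# The order of the list sets the order of the columns in the result.
print(df[["Chinese", "Name"]].columns.tolist())
```

Note the double brackets in `df[["Name", "Chinese"]]`: the outer pair is the indexing operator and the inner pair is the Python list of names.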
When you need several consecutive columns, listing every column name becomes clumsy. Instead, pass a range-like argument to iloc. This is called slice indexing.
    import pandas as pd

    # Load the exam scores from the Excel workbook
    df = pd.read_excel(r"D:\PythonTestFile\exam_new.xlsx")
    print(df, '\n')
    # All rows, columns at positions 0, 1 and 2
    print(df.iloc[:, 0:3])
Output:

           Name     Sex  Chinese  English  Math
    0      Noah    male       90       50    66
    1      Emma  female       56       56    55
    2       Bob    male       99       84    89
    3    Olivia  female       86       87    44
    4      Jeff    male       48       87    65
    5      Liam    male       55       88    69
    6    Sophia  female       90       66    96
    7  Isabella  female       66       85    55

           Name     Sex  Chinese
    0      Noah    male       90
    1      Emma  female       56
    2       Bob    male       99
    3    Olivia  female       86
    4      Jeff    male       48
    5      Liam    male       55
    6    Sophia  female       90
    7  Isabella  female       66
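0:3 selects the columns at positions 0 through 2 (the 1st through 3rd columns) and excludes the column at position 3. In other words, the interval is closed on the left and open on the right.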
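The half-open behaviour matches ordinary Python list slicing. A short sketch on a two-row excerpt of the table (built in memory; the name `sliced` is illustrative) makes the boundaries explicit:

```python
import pandas as pd

# Two-row excerpt of the exam table with all five columns.
df = pd.DataFrame({
    "Name": ["Noah", "Emma"],
    "Sex": ["male", "female"],
    "Chinese": [90, 56],
    "English": [50, 56],
    "Math": [66, 55],
})

# 0:3 keeps positions 0, 1 and 2; position 3 ("English") is left out.
sliced = df.iloc[:, 0:3]
print(sliced.columns.tolist())

# The slice is equivalent to listing those positions explicitly.
print(sliced.equals(df.iloc[:, [0, 1, 2]]))  # True

# Omitting an endpoint extends the slice to that edge of the table.
print(df.iloc[:, 2:].columns.tolist())
```

Because the end position is excluded, the number of selected columns is simply end minus start, here 3 - 0 = 3.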
Copyright © 2017-Now pnotes.cn. All Rights Reserved.