Pandas: Find Rows Where Column/Field Is Null
I did some experimenting with a dataset I've been playing around with to find any columns/fields that have null values in them. Learn how I did it!
Join the DZone community and get the full member experience.
Join For FreeIn my continued playing around with the Kaggle house prices dataset, I wanted to find any columns/fields that have null values in them.
If we want to get a count of the number of null fields by column we can use the following code, adapted from Poonam Ligade’s kernel.
Prerequisites
import pandas as pd
Count the Null Columns
train = pd.read_csv("train.csv")
null_columns=train.columns[train.isnull().any()]
train[null_columns].isnull().sum()
LotFrontage 259
Alley 1369
MasVnrType 8
MasVnrArea 8
BsmtQual 37
BsmtCond 37
BsmtExposure 38
BsmtFinType1 37
BsmtFinType2 38
Electrical 1
FireplaceQu 690
GarageType 81
GarageYrBlt 81
GarageFinish 81
GarageQual 81
GarageCond 81
PoolQC 1453
Fence 1179
MiscFeature 1406
dtype: int64
So there are lots of different columns containing null values. What if we want to find the solitary row which has "Electrical" as null?
Single Column Is Null
print(train[train["Electrical"].isnull()][null_columns])
LotFrontage Alley MasVnrType MasVnrArea BsmtQual BsmtCond BsmtExposure \
1379 73.0 NaN None 0.0 Gd TA No
BsmtFinType1 BsmtFinType2 Electrical FireplaceQu GarageType GarageYrBlt \
1379 Unf Unf NaN NaN BuiltIn 2007.0
GarageFinish GarageQual GarageCond PoolQC Fence MiscFeature
1379 Fin TA TA NaN NaN NaN
And what if we want to return every row that contains at least one null value? That’s not too difficult – it’s just a combination of the code in the previous two sections.
All Null Columns
print(train[train.isnull().any(axis=1)][null_columns].head())
LotFrontage Alley MasVnrType MasVnrArea BsmtQual BsmtCond BsmtExposure \
0 65.0 NaN BrkFace 196.0 Gd TA No
1 80.0 NaN None 0.0 Gd TA Gd
2 68.0 NaN BrkFace 162.0 Gd TA Mn
3 60.0 NaN None 0.0 TA Gd No
4 84.0 NaN BrkFace 350.0 Gd TA Av
BsmtFinType1 BsmtFinType2 Electrical FireplaceQu GarageType GarageYrBlt \
0 GLQ Unf SBrkr NaN Attchd 2003.0
1 ALQ Unf SBrkr TA Attchd 1976.0
2 GLQ Unf SBrkr TA Attchd 2001.0
3 ALQ Unf SBrkr Gd Detchd 1998.0
4 GLQ Unf SBrkr TA Attchd 2000.0
GarageFinish GarageQual GarageCond PoolQC Fence MiscFeature
0 RFn TA TA NaN NaN NaN
1 RFn TA TA NaN NaN NaN
2 RFn TA TA NaN NaN NaN
3 Unf TA TA NaN NaN NaN
4 RFn TA TA NaN NaN NaN
And that's it!
If you liked this post, here are some more great posts by Mark Needham on Pandas:
Published at DZone with permission of Mark Needham, DZone MVB. See the original article here.
Opinions expressed by DZone contributors are their own.
Comments