Python And R Data science skills: 50 dealing with missing data in Numpy

Monday 5 February 2018

50 dealing with missing data in Numpy

In [1]:
import numpy as np
import pandas as pd
In [2]:
dist1={'A':[1,2,np.nan],'B':[5,np.nan,np.nan],'C':[1,2,3]}
In [3]:
dist1
Out[3]:
{'A': [1, 2, nan], 'B': [5, nan, nan], 'C': [1, 2, 3]}
In [4]:
df = pd.DataFrame(dist1)
In [5]:
df
Out[5]:
     A    B  C
0  1.0  5.0  1
1  2.0  NaN  2
2  NaN  NaN  3
In [7]:
# Drop every column that contains at least one NaN
df.dropna(axis=1)
Out[7]:
   C
0  1
1  2
2  3
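For comparison, here is a minimal sketch of the row-wise case: with the default `axis=0`, `dropna` removes rows instead of columns. The frame is rebuilt so the snippet runs on its own, and the name `rows_kept` is only illustrative.

```python
import numpy as np
import pandas as pd

# Same data as above, rebuilt so the snippet is self-contained.
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# axis=0 (the default) drops every row that contains at least one NaN.
rows_kept = df.dropna()
print(rows_kept)
```

Only the first row has no missing values, so it is the only one that survives.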
In [12]:
# Keep only the columns that have at least 2 non-NaN values
df.dropna(thresh=2,axis=1)
Out[12]:
     A  C
0  1.0  1
1  2.0  2
2  NaN  3
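To see why column B is dropped and A is kept, it can help to count the non-missing values per column first. This is a small sketch on the same data; `non_missing` and `kept` are names chosen here for illustration.

```python
import numpy as np
import pandas as pd

# Same data as above, rebuilt so the snippet is self-contained.
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# Count the non-NaN entries in each column.
non_missing = df.notna().sum()
print(non_missing)

# thresh=2 keeps a column only if it has at least 2 non-NaN values,
# so B (with a single value) is removed while A and C stay.
kept = df.dropna(thresh=2, axis=1)
print(kept)
```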
In [14]:
# Replace every NaN with 8 (returns a new DataFrame; df itself is unchanged)
df.fillna(value=8)
Out[14]:
     A    B  C
0  1.0  5.0  1
1  2.0  8.0  2
2  8.0  8.0  3
In [15]:
df
Out[15]:
     A    B  C
0  1.0  5.0  1
1  2.0  NaN  2
2  NaN  NaN  3
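As the output above shows, `df` still contains its NaN values after the `fillna` call, because the method returns a new object. One way to keep the filled result is to assign it to a variable, sketched below; the name `filled` is an assumption made for this example.

```python
import numpy as np
import pandas as pd

# Same data as above, rebuilt so the snippet is self-contained.
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# fillna returns a new DataFrame, so store the result to keep it.
filled = df.fillna(value=8)

# Count the remaining NaN values in the original and in the filled copy.
print(df.isna().sum().sum())
print(filled.isna().sum().sum())
```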
In [16]:
df['A']
Out[16]:
0    1.0
1    2.0
2    NaN
Name: A, dtype: float64
In [17]:
# Fill the NaN in column A with the mean of its non-missing values: (1 + 2) / 2 = 1.5
df['A'].fillna(value=df['A'].mean())
Out[17]:
0    1.0
1    2.0
2    1.5
Name: A, dtype: float64
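The same idea can be applied to all columns at once. A possible sketch: `df.mean()` returns one mean per column, and passing that Series to `fillna` fills each column with its own mean. The name `filled_means` is illustrative only.

```python
import numpy as np
import pandas as pd

# Same data as above, rebuilt so the snippet is self-contained.
df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# df.mean() skips NaN by default and returns a Series indexed by column name.
column_means = df.mean()

# fillna matches the Series index against the column names,
# so each column is filled with its own mean.
filled_means = df.fillna(column_means)
print(filled_means)
```

Whether the mean is a sensible replacement depends on the data; it is used here only because the post uses it for column A.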
