/ Python And R Data science skills: 50 dealing with missing data in Numpy

Monday, 5 February 2018

50 dealing with missing data in Numpy

In [1]:
import numpy as np
import pandas as pd
In [2]:
dist1 = {'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]}  # columns A and B contain missing values (NaN)
In [3]:
dist1
Out[3]:
{'A': [1, 2, nan], 'B': [5, nan, nan], 'C': [1, 2, 3]}
In [4]:
df = pd.DataFrame(dist1)
In [5]:
df
Out[5]:
     A    B  C
0  1.0  5.0  1
1  2.0  NaN  2
2  NaN  NaN  3
In [7]:
df.dropna(axis=1)  # drop every column that contains at least one NaN
Out[7]:
   C
0  1
1  2
2  3
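
For comparison, a minimal sketch of how the `axis` argument changes what `dropna` removes. It rebuilds the same DataFrame so it runs on its own; the names `rows_kept` and `cols_kept` are only illustrative and do not appear in the original notebook.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# axis=0 (the default) drops every ROW that contains at least one NaN
rows_kept = df.dropna()

# axis=1 drops every COLUMN that contains at least one NaN
cols_kept = df.dropna(axis=1)

print(rows_kept)   # only row 0 has no missing values
print(cols_kept)   # only column C has no missing values
```

With this data, dropping rows keeps a single row and dropping columns keeps a single column, so which axis is appropriate depends on whether rows or columns are more expensive to lose.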
In [12]:
df.dropna(thresh=2, axis=1)  # keep only columns that have at least 2 non-NaN values
Out[12]:
     A  C
0  1.0  1
1  2.0  2
2  NaN  3
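
The `thresh` result above is easier to predict after counting the non-missing values per column. The sketch below assumes the same data as the notebook; `non_missing` and `kept` are hypothetical names used only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# Count the non-NaN entries in each column
non_missing = df.notna().sum()
print(non_missing)

# thresh=2 keeps a column only if it has at least 2 non-NaN values,
# so B (a single non-NaN value) is dropped while A and C survive
kept = df.dropna(thresh=2, axis=1)
print(kept.columns.tolist())
```

Note that `thresh` counts values that are present, not values that are missing, which is easy to get backwards.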
In [14]:
df.fillna(value=8)  # returns a new DataFrame with every NaN replaced by 8; df itself is not modified
Out[14]:
     A    B  C
0  1.0  5.0  1
1  2.0  8.0  2
2  8.0  8.0  3
In [15]:
df
Out[15]:
     A    B  C
0  1.0  5.0  1
1  2.0  NaN  2
2  NaN  NaN  3
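
As the unchanged `df` shows, `fillna` returns a new object rather than editing the original. A small sketch of keeping the filled result, assuming the same data; the name `filled` is hypothetical and not part of the original notebook.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# fillna returns a new DataFrame; assign it to a name to keep the result
filled = df.fillna(value=8)

# Total number of missing values before and after
print(df.isna().sum().sum())      # the original still has its NaNs
print(filled.isna().sum().sum())  # the filled copy has none
```

Assigning back to the same name (`df = df.fillna(value=8)`) would overwrite the original instead of keeping both versions.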
In [16]:
df['A']
Out[16]:
0    1.0
1    2.0
2    NaN
Name: A, dtype: float64
In [17]:
df['A'].fillna(value=df['A'].mean())  # replace NaN in column A with the column mean (mean() skips NaN by default)
Out[17]:
0    1.0
1    2.0
2    1.5
Name: A, dtype: float64
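
The same idea can be applied to every column at once. This is a sketch rather than part of the original notebook: it assumes all columns are numeric and that mean imputation is acceptable for the data, and `col_means` and `filled` are illustrative names.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, np.nan], 'B': [5, np.nan, np.nan], 'C': [1, 2, 3]})

# Per-column means, computed over the non-NaN values only
col_means = df.mean(numeric_only=True)

# Passing a Series to fillna fills each column with the value
# stored under that column's label
filled = df.fillna(col_means)
print(filled)
```

Column B has only one observed value, so both of its gaps are filled with that value; mean imputation on such a sparse column carries very little information.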