Tuesday, May 21, 2013

Python for Data Analysis (study notes): Chapter 2

Assume a DataFrame with rows 1, 2, 3, ..., n and columns A, B, C, ..., Z (so it has n records with 26 fields).
Example 1:
a, find the 10 most common values in A
   pandas: value_counts()
b, find the 10 most common values in A, broken down by sub-info from B
   numpy: where(B.str.contains('keyword'), 'keyword', 'no keyword')
   pandas: groupby(), size(), unstack()
           .sum(1), argsort(), take()
   other funcs: dropna(), fillna(0), notnull()
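
A rough sketch of how the pieces of Example 1 might fit together. The toy DataFrame, its values, and the variable names (top_a, label, counts, order, top_counts) are made up for illustration; only the columns A and B and the 'keyword' string come from the notes above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; only the column names A and B come from the notes.
df = pd.DataFrame({
    "A": ["x", "y", "x", "z", "x", "y", None],
    "B": ["has keyword", "none", "keyword here", "plain", None, "keyword", "plain"],
})

# (a) 10 most common values in A (value_counts sorts descending and skips NaN)
top_a = df["A"].value_counts().head(10)

# (b) drop rows where B is missing, then label each row by whether B contains the keyword
sub = df[df["B"].notnull()]
label = np.where(sub["B"].str.contains("keyword"), "keyword", "no keyword")

# Count rows per (A, label) pair and spread the labels into columns;
# combinations that never occur come back as NaN, so fill them with 0.
counts = sub.groupby(["A", label]).size().unstack().fillna(0)

# Sort the rows by their total count (ascending) and keep the last 10,
# i.e. the 10 values of A with the largest totals.
order = counts.sum(axis=1).to_numpy().argsort()
top_counts = counts.take(order)[-10:]

print(top_a)
print(top_counts)
```

Here argsort() is called on the underlying NumPy array rather than on the Series; that is just a choice made for this sketch, and the resulting row order is the same idea as .sum(1).argsort() followed by take().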

Example 2:
a, find the mean of B grouped by A and C
   pandas: pivot_table(B, rows=A, cols=C, aggfunc='mean')
   (newer pandas versions name these arguments index= and columns= instead of rows= and cols=)
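
A small sketch of Example 2, assuming a recent pandas where the pivot_table arguments are called index= and columns=. The toy data and the names mean_b and same are invented for illustration.

```python
import pandas as pd

# Hypothetical toy data; only the column names A, B, C come from the notes.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y", "x"],
    "B": [1.0, 3.0, 5.0, 7.0, 10.0],
    "C": ["m", "m", "m", "n", "n"],
})

# Mean of B for every (A, C) pair: A values become rows, C values become columns.
mean_b = df.pivot_table(values="B", index="A", columns="C", aggfunc="mean")

# The same table built by hand with groupby + unstack, for comparison.
same = df.groupby(["A", "C"])["B"].mean().unstack()

print(mean_b)
```

The groupby version is only there to show that pivot_table is, roughly, a shortcut for group, aggregate, and unstack.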
--------------to be continued---------------
 

