Assume a DataFrame has rows 1, 2, 3, ..., n and columns A, B, C, ..., Z (so it has n records with 26 fields).
Example 1:
a, find the 10 most common values in A
   pandas: value_counts() (sorted by count, descending), then head(10)
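
A minimal sketch of step a, assuming a small made-up DataFrame (the data and the names df and top10 are illustrative only, not from the original notes):

```python
import pandas as pd

# Hypothetical sample data; only column A matters for this step.
df = pd.DataFrame({"A": ["x", "y", "x", "z", "x", "y"]})

# value_counts() returns counts sorted in descending order,
# so the first 10 entries are the 10 most common values.
top10 = df["A"].value_counts().head(10)
print(top10)
```

With fewer than 10 distinct values, head(10) simply returns all of them.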
b, find the 10 most common values in A, broken down by sub-info from B
   numpy: where(B.str.contains('keyword'), 'keyword', 'no keyword')
   pandas: groupby(), size(), unstack()
           .sum(1), argsort(), take()
   other funcs: dropna(), fillna(0), notnull()
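
The functions listed for step b could be chained roughly as follows. This is only a sketch under assumptions: the sample data, the helper column B_flag, and the variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with one missing value in B.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y", "y", "z"],
    "B": ["has keyword", "plain", "keyword here", "keyword", None, "plain"],
})

# Keep only rows where B is present (same effect as dropna(subset=["B"])).
clean = df[df["B"].notnull()].copy()

# Label each row by whether B contains the keyword.
clean["B_flag"] = np.where(
    clean["B"].str.contains("keyword"), "keyword", "no keyword"
)

# Count rows per (A, B_flag) pair, then pivot B_flag into columns.
# Missing combinations come out as NaN, so fill them with 0.
counts = clean.groupby(["A", "B_flag"]).size().unstack().fillna(0)

# Total per value of A, then the positions that sort the totals ascending.
totals = counts.sum(axis=1)
order = totals.to_numpy().argsort(kind="stable")

# Take the last 10 positions: the 10 largest totals, smallest first.
top10 = counts.take(order[-10:])
print(top10)
```

The result has one row per value of A and one column per label, so it shows both the overall ranking and the split by keyword.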
Example 2:
a, find the mean of B grouped by A and C
   pandas: pivot_table(B, index = A, columns = C, aggfunc = 'mean')
      (older pandas versions spelled these arguments rows = A, cols = C)
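
A small sketch of the pivot table in Example 2, assuming made-up data (the values and the names df, table, and grouped are illustrative only):

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "A": ["x", "x", "y", "y"],
    "C": ["p", "q", "p", "p"],
    "B": [1.0, 3.0, 2.0, 4.0],
})

# Mean of B for each (A, C) pair: A values become rows, C values become columns.
table = df.pivot_table(values="B", index="A", columns="C", aggfunc="mean")
print(table)

# The same numbers via groupby, for comparison.
grouped = df.groupby(["A", "C"])["B"].mean().unstack()
print(grouped)
```

Combinations that never occur in the data (here A = y with C = q) show up as NaN.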
--------------to be continued---------------
  