[Python Data Analysis] Buying New Stocks When the Limit-Up Streak Breaks: What Are the Odds of a Profit?
Next, assume the position is held for at most 10 days after buying on the day the limit-up streak breaks. For each stock we therefore take only the 10 days of data that follow the break, which keeps the data uniform and makes later processing easier. If a stock has fewer than 10 days of data, it is skipped.

    df=df.tail(11)['close']

    >>> import tushare as ts
    >>> df=ts.get_hist_data('603737')
    >>>
    >>> df=df[['open','close','p_change']]
    >>> start_date=df[df['p_change']<9.8].tail(1).index[0]
    >>> df=df[df.index>=start_date]
    >>> df=df.tail(11)['close']
    >>> df
    date
    2016-07-15    100.90
    2016-07-14     99.73
    2016-07-13     98.87
    2016-07-12    100.51
    2016-07-11     99.38
    2016-07-08    110.47
    2016-07-07    113.71
    2016-07-06    112.75
    2016-07-05    115.63
    2016-07-04    110.46
    2016-07-01    111.82
    Name: close, dtype: float64

The previous step gives us the closing prices for the break day and the ten days after it. We now convert the data to a numpy.array.

    close_array=df.values

    >>> close_array=df.values
    >>> close_array
    array([ 100.9 , 99.73, 98.87, 100.51, 99.38, 110.47, 113.71, 112.75, 115.63, 110.46, 111.82])

The next processing step: if the closing price on a trading day after the break is higher than the closing price on the break day, set that entry of the array to 1, otherwise to 0.

    import tushare as ts
    import numpy as np

    df=ts.get_hist_data('603737')
    df=df[['open','close','p_change']]
    # rows are listed newest first, so tail(1) picks the earliest day
    # whose gain is below the 9.8% threshold
    start_date=df[df['p_change']<9.8].tail(1).index[0]
    df=df[df.index>=start_date]
    # keep 11 closing prices
    df=df.tail(11)['close']
    close_array=df.values
    # flag each entry by comparing it with the entry at position 0
    for i in range(1,11):
        if close_array[i]>close_array[0]:
            close_array[i]=1
        else:
            close_array[i]=0
    close_array[0]=0
    print (close_array)

Output:

    localhost:~ shengtianhe$ python find.py
    [ 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1.]
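One detail worth checking in the flagging step is row order. The series printed above is listed newest first, so position 0 holds the 2016-07-15 close and the break day (2016-07-01) sits in the last position. The following is a minimal sketch, assuming the break-day close is the intended baseline: it sorts the series into chronological order and builds the flags with one vectorized comparison instead of a loop. The prices are copied from the output above so that no network call is needed; `profit_flags` is a hypothetical helper name, not part of tushare or pandas.

```python
import pandas as pd

# Closing prices copied from the session output above (newest first).
closes = pd.Series(
    [100.90, 99.73, 98.87, 100.51, 99.38, 110.47,
     113.71, 112.75, 115.63, 110.46, 111.82],
    index=["2016-07-15", "2016-07-14", "2016-07-13", "2016-07-12",
           "2016-07-11", "2016-07-08", "2016-07-07", "2016-07-06",
           "2016-07-05", "2016-07-04", "2016-07-01"],
    name="close",
)


def profit_flags(series):
    """Return 0/1 flags in chronological order.

    Position 0 is the earliest day and serves as the baseline; every
    later position is 1.0 if its close is above the baseline, else 0.0.
    """
    ordered = series.sort_index()      # ISO date strings sort chronologically
    values = ordered.to_numpy()
    # The baseline compared with itself is False, so position 0 is 0.0.
    return (values > values[0]).astype(float)


flags = profit_flags(closes)
print(flags)
```

With these eleven prices, only 2016-07-05 through 2016-07-07 close above the 2016-07-01 close of 111.82, so only those three positions are flagged.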
With one stock's data organized, the next step is to pull the data for all new stocks. This uses numpy.concatenate to merge arrays. The code also handles a few small details that took some searching to figure out, such as how to check that a return value is not None:

    import tushare as ts
    import numpy as np
    import pandas as pd

    # new stocks issued after 2016-06-01; only the stock code is needed
    df=ts.new_stocks()
    df=df[df['issue_date']>'2016-06-01']
    stock_code=df['code'].values

    df_matrix = None
    for code in stock_code:
        detail=ts.get_hist_data(code)
        # skip stocks for which no data came back
        if detail is None:
            continue
        detail=detail[['open','close','p_change']]
        # days whose gain is below the 9.8% threshold
        hasbreak=detail[detail['p_change']<9.8]
        if hasbreak.size==0:
            continue
        start_date=hasbreak.tail(1).index[0]
        hasbreak=hasbreak[hasbreak.index>=start_date]
        hasbreak=hasbreak.tail(11)['close']
        # skip stocks with fewer than 11 closing prices
        if hasbreak.size <11:
            continue
        close_array=hasbreak.values
        # flag each entry by comparing it with the entry at position 0
        for day in range(1,hasbreak.size):
            if close_array[day]>close_array[0]:
                close_array[day]=1
            else:
                close_array[day]=0
        close_array[0]=0
        # one row per stock
        df_matrix_thisRound=pd.DataFrame(close_array)
        df_matrix_thisRound=df_matrix_thisRound.T
        if df_matrix is None:
            df_matrix=df_matrix_thisRound
        else:
            df_matrix=np.concatenate((df_matrix,df_matrix_thisRound.values))
    print(df_matrix)

The result of the run:

    localhost:~ shengtianhe$ python proData.py
    [Getting data:]########[[ 0. 1. 1. 0. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 0. 1. 0. 0. 1. 1. 1. 0. 0.]
     [ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 0. 0. 0.]
     [ 0. 1. 1. 1. 1. 0. 1. 0. 1. 1. 1.]
     [ 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 0. 0. 0. 0. 0.]
     [ 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 0. 0. 1. 1. 1. 1.]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 1. 0. 0. 0. 0. 0. 0. 0. 1. 1.]
     [ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 0. 0.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 0. 0. 0. 0. 1. 1. 1. 1. 1.]
     [ 0. 1. 0. 0. 0. 0. 0. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 0. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 0. 1. 0. 0. 1. 1. 1. 1. 0.]
     [ 0. 1. 1. 1. 1. 1. 1. 0. 0. 0. 0.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 0. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 0. 1. 1.]
     [ 0. 1. 1. 1. 1. 0. 0. 0. 0. 0. 1.]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 0. 1. 1. 1. 1. 1. 0. 0. 0. 0.]
     [ 0. 1. 1. 0. 0. 1. 1. 0. 0. 0. 0.]
     [ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 0. 0. 1. 1. 1. 1. 0. 1.]
     [ 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 0. 0. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 0. 0. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0.]
     [ 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 0. 0. 0. 1. 1. 1. 1. 1. 0.]
     [ 0. 1. 0. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 1. 1. 0. 1. 1. 1. 1. 1. 1. 1.]
     [ 0. 0. 0. 0. 0. 1. 1. 1. 1. 1. 1.]]

-----

Step 4: computing the overall probability of a profit on each day.

This step is relatively simple: take the mean of each column. Because every entry is 0 or 1, the column mean is the fraction of stocks flagged as profitable on that day, which serves as the estimated probability of a profit.

http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.DataFrame.mean.html

Add the following code:

    mean=df_matrix.mean(0)
    print(mean)
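The stacking and column-mean steps can be tried on a small sample without downloading anything. The sketch below is only an illustration under stated assumptions: the three rows are copied from the first three rows of the matrix printed above, the rows are collected in a list and stacked once with `np.vstack` (an alternative to calling `np.concatenate` inside the loop), and `daily_win_rate` is an illustrative variable name.

```python
import numpy as np

# First three rows of the matrix printed above, one row per stock.
rows = [
    np.array([0., 1., 1., 0., 1., 1., 1., 1., 1., 1., 1.]),
    np.array([0., 1., 0., 1., 0., 0., 1., 1., 1., 0., 0.]),
    np.array([0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.]),
]

# Stack the per-stock rows into one 2-D array: shape (stocks, days).
matrix = np.vstack(rows)

# Mean over axis 0 averages down each column, giving one value per day:
# the fraction of stocks whose flag is 1 on that day.
daily_win_rate = matrix.mean(axis=0)

print(matrix.shape)
print(np.round(daily_win_rate, 2))
```

Column 0 is always 0 because it is the baseline day itself, so only the remaining ten columns carry information. With just three rows the values are far too coarse to read as probabilities; the sample only shows the mechanics.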