Preface
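The values in the data span a huge range, anywhere from 10 to 10000. If I fed all of it straight into the decision tree, I might as well give up and just watch the computer freeze! So I preprocess the raw continuous data into a more compact set of attributes first, to see whether that produces a new and better decision tree!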
Main text
Enough talk! Straight to the code! Otherwise it's just too painful. It took me quite a while to get this working!
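    #include <fstream>
    #include <iostream>
    #include <string>
    using namespace std;

    int main() {
        float attr[34];
        ifstream in("/Users/zhangzhaobo/Documents/Graduation-Design/Mydata.txt");
        ofstream out("/Users/zhangzhaobo/Documents/Graduation-Design/Data/New_Data.txt");

        // Read the 34 column names from the header row.
        string line[34];
        for (int i = 0; i < 34; ++i) {
            in >> line[i];
        }

        // Write the new 24-column header.
        out << "Diff_X" << "\t" << "Diff_Y" << "\t";
        for (int i = 4; i < 8; ++i) {
            out << line[i] << "\t";
        }
        out << "Diff_Luminosity\t";
        out << line[10] << "\t";
        out << "TypeouOfSteel\t";
        for (int i = 13; i < 27; ++i) {
            out << line[i] << "\t";
        }
        out << "Fault";
        out << endl;

        // Convert one 34-value record per pass. The loop header is missing from
        // the listing; stopping at the first incomplete record is an assumption.
        while (true) {
            for (int i = 0; i < 34; ++i) {
                in >> attr[i];
            }
            if (!in) break;
            float X_dis = attr[1] - attr[0];
            float Y_dis = attr[3] - attr[2];
            float Luminosity_dis = attr[9] - attr[8];
            float TypeOfSteel = attr[11];
            out << X_dis << "\t" << Y_dis << "\t";
            for (int i = 4; i < 8; ++i) {
                out << attr[i] << "\t";
            }
            out << Luminosity_dis << "\t";
            out << attr[10] << "\t";
            out << TypeOfSteel << "\t";
            for (int i = 13; i < 27; ++i) {
                out << attr[i] << "\t";
            }
            // Fold the seven fault flags (columns 27..33) into one integer.
            int Fault = 0;
            for (int i = 0; i < 7; ++i) {
                Fault = (Fault + static_cast<int>(attr[i + 27])) * 2;
            }
            out << Fault << endl;
        }
        return 0;
    }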
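The fault loop in the program is the least obvious step, so here it is pulled out on its own as a rough sketch. `encodeFault` uses the same recurrence as the program, `Fault = (Fault + flag) * 2`. The function names and the inverse `decodeFault` are my own additions for illustration and are not in the original program. It also assumes that exactly one of the seven flags is set per record.

```cpp
#include <array>
#include <cstddef>

// Fold seven 0/1 fault flags into one integer with the recurrence
// fault = (fault + flag) * 2. A single set flag at position k (0-based)
// therefore yields 2^(7 - k): 128, 64, 32, 16, 8, 4, 2.
int encodeFault(const std::array<int, 7>& flags) {
    int fault = 0;
    for (std::size_t i = 0; i < flags.size(); ++i) {
        fault = (fault + flags[i]) * 2;
    }
    return fault;
}

// Hypothetical inverse (not in the original program): return the position
// of the set flag for an encoded label, or -1 if the label is not one of
// the seven single-flag values.
int decodeFault(int fault) {
    for (int k = 0; k < 7; ++k) {
        if (fault == (1 << (7 - k))) {
            return k;
        }
    }
    return -1;
}
```

A record whose first flag is set encodes to 128, which is the value in the Fault column of the sample output. If a record ever had two flags set, the sum would still be unique, but `decodeFault` as written would return -1.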
The improved attributes are:
    Diff_X Diff_Y Pixels_Areas X_Perimeter Y_Perimeter Sum_of_Luminosity Diff_Luminosity Length_of_Conveyer TypeouOfSteel Steel_Plate_Thickness Edges_Index Empty_Index Square_Index Outside_X_Index Edges_X_Index Edges_Y_Index Outside_Global_Index LogOfAreas Log_X_Index Log_Y_Index Orientation_Index Luminosity_Index SigmoidOfAreas Fault
    8 44 267 17 44 24220 32 1687 1 80 0.0498 0.2415 0.1818 0.0047 0.4706 1 1 2.4265 0.9031 1.6435 0.8182 -0.2913 0.5822 128
    6 29 108 10 30 11397 39 1687 1 80 0.7647 0.3793 0.2069 0.0036 0.6 0.9667 1 2.0334 0.7782 1.4624 0.7931 -0.1756 0.2984 128
I even wrote a separate little C++ program just to inspect this!
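    #include <iostream>
    #include <string>
    using namespace std;

    int main() {
        // 72 tokens: 24 column names, then two data rows of 24 values each.
        string line[72];
        for (int i = 0; i < 72; ++i) {
            cin >> line[i];
        }
        for (int i = 0; i < 24; ++i) {
            cout << "[->" << i << ": " << line[i] << " --> " << line[i + 24]
                 << " --> " << line[i + 48] << endl;
        }
        return 0;
    }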
The result actually looks pretty nice!?
    [->0: Diff_X --> 8 --> 6
    [->1: Diff_Y --> 44 --> 29
    [->2: Pixels_Areas --> 267 --> 108
    [->3: X_Perimeter --> 17 --> 10
    [->4: Y_Perimeter --> 44 --> 30
    [->5: Sum_of_Luminosity --> 24220 --> 11397
    [->6: Diff_Luminosity --> 32 --> 39
    [->7: Length_of_Conveyer --> 1687 --> 1687
    [->8: TypeouOfSteel --> 1 --> 1
    [->9: Steel_Plate_Thickness --> 80 --> 80
    [->10: Edges_Index --> 0.0498 --> 0.7647
    [->11: Empty_Index --> 0.2415 --> 0.3793
    [->12: Square_Index --> 0.1818 --> 0.2069
    [->13: Outside_X_Index --> 0.0047 --> 0.0036
    [->14: Edges_X_Index --> 0.4706 --> 0.6
    [->15: Edges_Y_Index --> 1 --> 0.9667
    [->16: Outside_Global_Index --> 1 --> 1
    [->17: LogOfAreas --> 2.4265 --> 2.0334
    [->18: Log_X_Index --> 0.9031 --> 0.7782
    [->19: Log_Y_Index --> 1.6435 --> 1.4624
    [->20: Orientation_Index --> 0.8182 --> 0.7931
    [->21: Luminosity_Index --> -0.2913 --> -0.1756
    [->22: SigmoidOfAreas --> 0.5822 --> 0.2984
    [->23: Fault --> 128 --> 128
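The viewer hard-codes 72 tokens, that is, one header row plus exactly two data rows of 24 columns. If you want to look at a different number of rows or columns, something like the following might work. This is only a sketch: `readTable` and `transposeView` are names I made up, and it assumes the input is whitespace-separated with one record per line and the header on the first line.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Split whitespace-separated table text into rows of tokens,
// one row per non-empty input line.
std::vector<std::vector<std::string>> readTable(std::istream& in) {
    std::vector<std::vector<std::string>> rows;
    std::string line;
    while (std::getline(in, line)) {
        std::istringstream fields(line);
        std::vector<std::string> row;
        std::string token;
        while (fields >> token) {
            row.push_back(token);
        }
        if (!row.empty()) {
            rows.push_back(row);
        }
    }
    return rows;
}

// Format the table column by column as "[->i: name --> v1 --> v2 ...".
// The first row is treated as the header. A data row that is too short
// to have a given column is simply left out of that column's line.
std::string transposeView(const std::vector<std::vector<std::string>>& rows) {
    std::ostringstream out;
    if (rows.empty()) {
        return out.str();
    }
    for (std::size_t col = 0; col < rows[0].size(); ++col) {
        out << "[->" << col << ": " << rows[0][col];
        for (std::size_t r = 1; r < rows.size(); ++r) {
            if (col < rows[r].size()) {
                out << " --> " << rows[r][col];
            }
        }
        out << "\n";
    }
    return out.str();
}
```

To use it the same way as the original viewer, include `<iostream>` and call `std::cout << transposeView(readTable(std::cin));` from `main`, then pipe the first few lines of New_Data.txt into it.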