Extracting decision rules from GradientBoostingClassifier

Problem description

I have gone through the questions below:

How to extract decision rules of GradientBoostingClassifier

How to extract the decision rules from scikit-learn decision-tree?

However, neither of them solves my problem. Here is my query:

I need to build a model in Python using GradientBoostingClassifier and implement this model on the SAS platform. To do this, I need to extract the decision rules from the GradientBoostingClassifier.

Below is what I have tried so far:

Build the model on the IRIS data:

# import the most common dataset
from io import StringIO

import pydotplus
from IPython.display import Image
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.tree import export_graphviz

X, y = load_iris(return_X_y=True)
# there are 150 observations and 4 features
print(X.shape) # (150, 4)
# let's build a small model = 5 trees with depth no more than 3
model = GradientBoostingClassifier(n_estimators=5, max_depth=3, learning_rate=1.0)
model.fit(X, y==2) # predict class 2 vs rest, for simplicity
# we can access individual trees
trees = model.estimators_.ravel()

def plot_tree(clf):
    # export the tree to DOT text, then render it as a PNG
    dot_data = StringIO()
    export_graphviz(clf, out_file=dot_data, node_ids=True,
                    filled=True, rounded=True,
                    special_characters=True)
    graph = pydotplus.graph_from_dot_data(dot_data.getvalue())
    return Image(graph.create_png())

# now we can plot the first tree
plot_tree(trees[0])

After plotting the graph, I checked the graph source for the first tree and wrote it to a text file using the code below:

with open(r"C:\Users\XXXX\Desktop\Python\input_tree.txt", "w") as wrt:
    wrt.write(export_graphviz(trees[0], out_file=None, node_ids=True,
                filled=True, rounded=True,
                special_characters=True))

Below is the output file:

digraph Tree {
node [shape=box, style="filled, rounded", color="black", fontname=helvetica] ;
edge [fontname=helvetica] ;
0 [label=<node &#35;0<br/>X<SUB>3</SUB> &le; 1.75<br/>friedman_mse = 0.222<br/>samples = 150<br/>value = 0.0>, fillcolor="#e5813955"] ;
1 [label=<node &#35;1<br/>X<SUB>2</SUB> &le; 4.95<br/>friedman_mse = 0.046<br/>samples = 104<br/>value = -0.285>, fillcolor="#e5813945"] ;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
2 [label=<node &#35;2<br/>X<SUB>3</SUB> &le; 1.65<br/>friedman_mse = 0.01<br/>samples = 98<br/>value = -0.323>, fillcolor="#e5813943"] ;
1 -> 2 ;
3 [label=<node &#35;3<br/>friedman_mse = 0.0<br/>samples = 97<br/>value = -1.5>, fillcolor="#e5813900"] ;
2 -> 3 ;
4 [label=<node &#35;4<br/>friedman_mse = -0.0<br/>samples = 1<br/>value = 3.0>, fillcolor="#e58139ff"] ;
2 -> 4 ;
5 [label=<node &#35;5<br/>X<SUB>3</SUB> &le; 1.55<br/>friedman_mse = 0.222<br/>samples = 6<br/>value = 0.333>, fillcolor="#e5813968"] ;
1 -> 5 ;
6 [label=<node &#35;6<br/>friedman_mse = 0.0<br/>samples = 3<br/>value = 3.0>, fillcolor="#e58139ff"] ;
5 -> 6 ;
7 [label=<node &#35;7<br/>friedman_mse = 0.222<br/>samples = 3<br/>value = 0.0>, fillcolor="#e5813955"] ;
5 -> 7 ;
8 [label=<node &#35;8<br/>X<SUB>2</SUB> &le; 4.85<br/>friedman_mse = 0.021<br/>samples = 46<br/>value = 0.645>, fillcolor="#e581397a"] ;
0 -> 8 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
9 [label=<node &#35;9<br/>X<SUB>1</SUB> &le; 3.1<br/>friedman_mse = 0.222<br/>samples = 3<br/>value = 0.333>, fillcolor="#e5813968"] ;
8 -> 9 ;
10 [label=<node &#35;10<br/>friedman_mse = 0.0<br/>samples = 2<br/>value = 3.0>, fillcolor="#e58139ff"] ;
9 -> 10 ;
11 [label=<node &#35;11<br/>friedman_mse = -0.0<br/>samples = 1<br/>value = -1.5>, fillcolor="#e5813900"] ;
9 -> 11 ;
12 [label=<node &#35;12<br/>friedman_mse = -0.0<br/>samples = 43<br/>value = 3.0>, fillcolor="#e58139ff"] ;
8 -> 12 ;
}

To extract the decision rules from the output file, I tried the Python regex code below to translate it into SAS code:

import re
with open(r"C:\Users\XXXX\Desktop\Python\input_tree.txt") as f:
    with open(r"C:\Users\XXXX\Desktop\Python\output.txt", "w") as f1:
        result0 = 'value = 0;'
        f1.write(result0)
        for line in f:
            result1 = re.sub(r'^(\d+)\s+.*<br/>([A-Z]+)<SUB>(\d+)</SUB>\s+(.+?)([-\d.]+)<br/>friedman_mse.*;$',r"if \2\3 \4 \5 then do;",line)
            result2 = re.sub(r'^(\d+).*(?!SUB).*(value\s+=)\s([-\d.]+).*;$',r"\2 value + \3; end;",result1)
            result3 = re.sub(r'^(\d+\s+->\s+\d+\s+);$',r'\1',result2)
            result4 = re.sub(r'^digraph.+|^node.+|^edge.+','',result3)
            result5 = re.sub(r'&(\w{2});',r'\1',result4)
            result6 = re.sub(r'}','end;',result5)
            f1.write(result6)

Below is the SAS output from the above code:

value = 0;
if X3 le  1.75 then do;
if X2 le  4.95 then do;
0 -> 1 [labeldistance=2.5, labelangle=45, headlabel="True"] ;
if X3 le  1.65 then do;
1 -> 2 
value = value + -1.5; end;
2 -> 3 
value = value + 3.0; end;
2 -> 4 
if X3 le  1.55 then do;
1 -> 5 
value = value + 3.0; end;
5 -> 6 
value = value + 0.0; end;
5 -> 7 
if X2 le  4.85 then do;
0 -> 8 [labeldistance=2.5, labelangle=-45, headlabel="False"] ;
if X1 le  3.1 then do;
8 -> 9 
value = value + 3.0; end;
9 -> 10 
value = value + -1.5; end;
9 -> 11 
value = value + 3.0; end;
8 -> 12 
end;

As you can see, a piece is missing from the output file: I am not able to open and close the do-end blocks properly. For this I need to make use of the node numbers, but I have failed to do so because I cannot find any pattern in them.

Could anyone please help me with this?

Apart from this, can I not extract children_left, children_right and the threshold values as with DecisionTreeClassifier, as mentioned in the second link above? I have successfully extracted each tree of the GBM:

trees = model.estimators_.ravel()

but I did not find any useful function to extract the values and rules of each tree. Please help if I can use the graphviz object in a way similar to DecisionTreeClassifier.

Any other method that solves my problem would also be welcome.

Recommended answer

There is no need to use the graphviz export to access the decision tree data. model.estimators_ contains all the individual estimators that the model consists of. In the case of a GradientBoostingClassifier, this is a 2D numpy array with shape (n_estimators, n_classes) (a single column for binary classification), and each item is a DecisionTreeRegressor.
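Since the goal is to re-implement the model elsewhere, it helps to confirm how the per-tree outputs combine. The sketch below assumes the binary IRIS model from the question. It checks that the raw score from decision_function equals a constant offset plus learning_rate times the sum of the individual tree predictions. The offset is recovered empirically rather than read from a scikit-learn attribute, and the variable names (tree_sum, offset, proba) are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

X, y = load_iris(return_X_y=True)
model = GradientBoostingClassifier(n_estimators=5, max_depth=3,
                                   learning_rate=1.0, random_state=0)
model.fit(X, y == 2)

# Binary problem: estimators_ has one column, so ravel() gives one tree per stage.
trees = model.estimators_.ravel()

# Sum of the regression-tree outputs, scaled by the learning rate.
tree_sum = model.learning_rate * sum(t.predict(X) for t in trees)

# Whatever remains of the raw score is the contribution of the initial estimator.
raw = model.decision_function(X)
offset = raw - tree_sum

# Probability of the positive class via the logistic function (binary log-loss).
proba = 1.0 / (1.0 + np.exp(-(offset[0] + tree_sum)))
```

If the offset is the same for every row, a port to another platform only needs that one constant, the learning rate, and the leaf value each tree assigns to a sample.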

Each decision tree has an attribute tree_, and the scikit-learn example "Understanding the decision tree structure" shows how to get the nodes, thresholds and children out of that object.
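As a rough illustration of what those arrays encode, the sketch below walks a single tree by hand and compares the result with the tree's own predict method. The helper predict_one is hypothetical (not part of scikit-learn), and the model settings are assumptions chosen to resemble the question's example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

def predict_one(tree, x):
    """Walk one fitted tree_ object from the root to a leaf for sample x."""
    # scikit-learn casts inputs to float32 before comparing with thresholds.
    x = np.asarray(x, dtype=np.float32)
    node = 0  # the root is always node 0
    # A leaf is marked by children_left == children_right == -1.
    while tree.children_left[node] != -1:
        if x[tree.feature[node]] <= tree.threshold[node]:
            node = tree.children_left[node]
        else:
            node = tree.children_right[node]
    # For a regression tree, value has shape (n_nodes, 1, 1).
    return float(tree.value[node, 0, 0])

X, y = load_iris(return_X_y=True)
model = GradientBoostingClassifier(n_estimators=5, max_depth=3, random_state=0)
model.fit(X, y == 2)

first_tree = model.estimators_[0, 0]
manual = np.array([predict_one(first_tree.tree_, row) for row in X])
```

The same left-if-less-or-equal, right-otherwise walk is what any hand-written translation of the tree has to reproduce.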


# Dump the node arrays of every tree in a small 3-class ensemble
import numpy
import pandas
from sklearn.ensemble import GradientBoostingClassifier

est = GradientBoostingClassifier(n_estimators=4)
numpy.random.seed(1)
est.fit(numpy.random.random((100, 3)), numpy.random.choice([0, 1, 2], size=(100,)))
print('shape', est.estimators_.shape)

# estimators_ is indexed as [stage, class]
n_estimators, n_classes = est.estimators_.shape
for t in range(n_estimators):
    for c in range(n_classes):
        dtree = est.estimators_[t, c]
        print("class={}, tree={}: {}".format(c, t, dtree.tree_))

        # -1 in child_left/child_right marks a leaf; feature and threshold are -2 there
        rules = pandas.DataFrame({
            'child_left': dtree.tree_.children_left,
            'child_right': dtree.tree_.children_right,
            'feature': dtree.tree_.feature,
            'threshold': dtree.tree_.threshold,
        })
        print(rules)

This outputs something like the following for each tree:

class=0, tree=0: <sklearn.tree._tree.Tree object at 0x7f18a697f370>
   child_left  child_right  feature  threshold
0           1            2        0   0.020702
1          -1           -1       -2  -2.000000
2           3            6        1   0.879058
3           4            5        1   0.543716
4          -1           -1       -2  -2.000000
5          -1           -1       -2  -2.000000
6           7            8        0   0.292586
7          -1           -1       -2  -2.000000
8          -1           -1       -2  -2.000000
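With the child indices available, the do-end nesting that the regex approach could not recover follows directly from a recursive walk. The sketch below is one possible way to emit SAS-style if/else blocks from these arrays. The helper tree_to_sas and the X0..X3 variable names are hypothetical, the model settings mirror the question's binary IRIS example, and the generated text has not been run in SAS.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import GradientBoostingClassifier

TREE_LEAF = -1  # value scikit-learn stores in children_left/right for leaves

def tree_to_sas(tree, names, scale=1.0, node=0, depth=0):
    """Return SAS-style lines for the subtree rooted at `node` (hypothetical helper)."""
    pad = "  " * depth
    if tree.children_left[node] == TREE_LEAF:
        # Leaf: add this tree's (scaled) contribution to the running score.
        leaf = scale * float(tree.value[node, 0, 0])
        return [f"{pad}value = value + ({leaf!r});"]
    cond = f"{names[tree.feature[node]]} <= {float(tree.threshold[node])!r}"
    lines = [f"{pad}if {cond} then do;"]
    lines += tree_to_sas(tree, names, scale, tree.children_left[node], depth + 1)
    lines += [f"{pad}end;", f"{pad}else do;"]
    lines += tree_to_sas(tree, names, scale, tree.children_right[node], depth + 1)
    lines += [f"{pad}end;"]
    return lines

X, y = load_iris(return_X_y=True)
model = GradientBoostingClassifier(n_estimators=5, max_depth=3,
                                   learning_rate=1.0, random_state=0)
model.fit(X, y == 2)

names = [f"X{i}" for i in range(X.shape[1])]  # same X0..X3 naming as the DOT export
sas_lines = ["value = 0;"]
for dtree in model.estimators_.ravel():
    sas_lines += tree_to_sas(dtree.tree_, names, scale=model.learning_rate)
print("\n".join(sas_lines))
```

Every "then do;" and "else do;" is closed by the recursion itself, so no node-number bookkeeping is needed. The running total covers only the trees: the constant contribution of the model's initial estimator is not included and would have to be added separately, followed by a logistic transform if a probability is wanted.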
