Problem description
When I try to use the svmlight Python package with data I have already converted to svmlight format, I get an error. It should be pretty basic, but I don't understand what's happening. Here's the code:
import svmlight
training_data = open('thedata', "w")
model=svmlight.learn(training_data, type='classification', verbosity=0)
I have also tried:
training_data = numpy.load('thedata')
and
training_data = __import__('thedata')
Recommended answer
One obvious problem is that you are truncating your data file when you open it, because you specify write mode "w". This means there will be no data to read.
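The truncation is easy to confirm in isolation. The following is a minimal sketch that uses a temporary file as a stand-in for 'thedata' (the file contents and variable names are made up for illustration): opening an existing file with "w" empties it immediately, while "r" leaves it intact.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in for the real data file; the single line is invented sample data.
    path = os.path.join(tmp, "thedata")
    with open(path, "w") as f:
        f.write("1 1:0.5 3:1.2\n")
    size_before = os.path.getsize(path)

    # Opening in write mode truncates the file, even though nothing is written.
    open(path, "w").close()
    size_after_w = os.path.getsize(path)

    # Recreate the file, then open it in read mode: the contents survive.
    with open(path, "w") as f:
        f.write("1 1:0.5 3:1.2\n")
    with open(path, "r") as f:
        contents = f.read()

print(size_before > 0, size_after_w, repr(contents))
```

Note that even in read mode, open() only returns a file object, not data in the structure svmlight.learn() expects.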
Anyway, if your data file is like the one in this example, you don't need to read it like that: it is a Python file, so you need to import it. This should work:
import svmlight
from data import train0 as training_data # assuming your data file is named data.py
# or you could use __import__()
#training_data = __import__('data').train0
model = svmlight.learn(training_data, type='classification', verbosity=0)
You might want to compare your data against that of the example.
Edit after the data file format was clarified
The input file needs to be parsed into a list of tuples like this:
[(target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
(target, [(feature_1, value_1), (feature_2, value_2), ... (feature_n, value_n)]),
...
]
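As a concrete illustration of that structure, here is a small sketch with made-up values, plus a hypothetical helper (check_training_data is not part of the svmlight package) that verifies the shape before the data is handed to svmlight.learn(). It assumes feature ids are integers and targets and values are numeric, as the layout above suggests.

```python
def check_training_data(data):
    """Hypothetical helper: raise TypeError unless data looks like
    [(target, [(feature, value), ...]), ...]."""
    if not isinstance(data, list):
        raise TypeError("expected a list of (target, features) tuples, got %s"
                        % type(data).__name__)
    for index, (target, features) in enumerate(data):
        if not isinstance(target, (int, float)):
            raise TypeError("example %d: target must be numeric" % index)
        for feature, value in features:
            # Assumption: feature ids are ints and values are numeric.
            if not isinstance(feature, int) or not isinstance(value, (int, float)):
                raise TypeError("example %d: bad pair (%r, %r)" % (index, feature, value))
    return True


# Invented sample data: two examples with sparse features.
training_data = [
    (1, [(1, 0.5), (3, 1.2)]),
    (-1, [(2, 0.7), (3, 0.1)]),
]
check_training_data(training_data)
```

A check like this would reject a file object or a file name, which is what the code in the question passes.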
The svmlight package does not appear to support reading from a file in the SVM file format, and it has no parsing functions, so the parsing has to be implemented in Python. SVM files look like this:
<target> <feature>:<value> <feature>:<value> ... <feature>:<value> # <info>
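To make the mapping concrete, the following sketch walks a single line through the conversion step by step. The sample line is invented for illustration, and integer feature ids are assumed.

```python
# One line in the SVM file format (made-up sample data).
line = "1 1:0.5 3:1.2 # first example"

body = line.split("#")[0].strip()   # drop the trailing "# <info>" comment
fields = body.split()               # ['1', '1:0.5', '3:1.2']
target = float(fields[0])           # the first field is the target

# Every remaining field is "<feature>:<value>".
features = []
for field in fields[1:]:
    feature, value = field.split(":")
    features.append((int(feature), float(value)))

example = (target, features)
print(example)  # (1.0, [(1, 0.5), (3, 1.2)])
```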
Here is a parser that converts the file format into the structure required by the svmlight package:
def svm_parse(filename):
    """Yield (target, [(feature, value), ...]) tuples from an SVM-format file."""

    def _convert(t):
        """Convert feature and value to appropriate types"""
        return (int(t[0]), float(t[1]))

    with open(filename) as f:
        for line in f:
            line = line.split('#')[0].strip()  # remove any comment
            if not line:  # skip blank and comment-only lines
                continue
            data = line.split()
            target = float(data[0])
            features = [_convert(feature.split(':')) for feature in data[1:]]
            yield (target, features)
You can use it like this:
import svmlight
training_data = list(svm_parse('thedata'))
model=svmlight.learn(training_data, type='classification', verbosity=0)