Rexer Analytics Data Mining Survey

Since I am at KDD (Knowledge Discovery and Data Mining) right now, and have just sat down for a chat with Karl Rexer, I thought it fitting to post the summary Karl shared of his recent data mining survey:

2007 HIGHLIGHTS:

·   27-item survey of data miners, conducted on-line in early 2007

·   314 responses from individuals in 35 countries

·   Regression, decision trees and cluster analysis were the most commonly used algorithms (mean number of algorithms used: 6.8)

·   The top challenges data miners report are dirty data, data access, and explaining data mining to others

·   SPSS, SPSS Clementine, and SAS are the three most frequently utilized tools (mean number of tools used: 4.5)

·   There is increasing interest in the Oracle Data Mining tool, and decreasing interest in C4.5/C5.0/See5   

·   The primary factors data miners consider when selecting an analytic tool are: 1) the dependability and stability of software, 2) the ability to handle large data sets, and 3) data manipulation capabilities

·   The findings vary somewhat depending on the domain in which the data miner works, the tools used, geography, and several other dimensions
