Data Mining: Concepts and Techniques
Slides for Textbook Chapter 4

Zhou Hongfang
Department of Computer Science and Engineering, Xi'an University of Technology
2024/11/18

Chapter 4: Data Mining Primitives, Languages, and System Architectures

- Data mining primitives: What defines a data mining task?
- A data mining query language
- Design graphical user interfaces based on a data mining query language
- Architecture of data mining systems
- Summary

Why Data Mining Primitives and Languages?

- Finding all the patterns in a database autonomously is unrealistic, because the patterns could be too many but uninteresting
- Data mining should be an interactive process
  - The user directs what is to be mined
- Users must be provided with a set of primitives to communicate with the data mining system
- Incorporating these primitives in a data mining query language provides
  - More flexible user interaction
  - A foundation for the design of graphical user interfaces
  - Standardization of the data mining industry and practice

What Defines a Data Mining Task?

- Task-relevant data
- Type of knowledge to be mined
- Background knowledge
- Pattern interestingness measurements
- Visualization of discovered patterns

Task-Relevant Data

- Database or data warehouse name
- Database tables or data warehouse cubes
- Condition for data selection
- Relevant attributes or dimensions
- Data grouping criteria

Types of Knowledge to Be Mined

- Characterization
- Discrimination
- Association
- Classification/prediction
- Clustering
- Outlier analysis
- Other data mining tasks

Background Knowledge: Concept Hierarchies

- Schema hierarchy
  - E.g., street < city < province_or_state < country
- Set-grouping hierarchy
  - E.g., 20-39 = young, 40-59 = middle_aged
- Operation-derived hierarchy
  - Email address: dmbook@cs.sfu.ca
  - login-name < department < university < country
- Rule-based hierarchy
  - low_profit_margin(X) <= price(X, P1) and cost(X, P2) and (P1 - P2) < $50

Measurements of Pattern Interestingness

- Simplicity
  - E.g., (association) rule length, (decision) tree size
- Certainty
  - E.g., confidence, P(A|B) = #(A and B) / #(B), classification reliability or accuracy, certainty factor, rule strength, rule quality, discriminating weight, etc.
- Utility
  - Potential usefulness, e.g., support (association), noise threshold (description)
- Novelty
  - Not previously known, surprising (used to remove redundant rules)

Visualization of Discovered Patterns

- Different backgrounds/usages may require different forms of representation
  - E.g., rules, tables, crosstabs, pie/bar charts, etc.
- Concept hierarchy is also important
  - Discovered knowledge might be more understandable when represented at a high level of abstraction
  - Interactive drill up/down, pivoting, slicing and dicing provide different perspectives on the data
- Different kinds of knowledge require different representations: association, classification, clustering, etc.
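
As a rough illustration of the background-knowledge and interestingness primitives listed in these slides, the following Python sketch encodes the set-grouping hierarchy (20-39 = young, 40-59 = middle_aged), the rule-based low_profit_margin rule, and the confidence measure P(A|B) = #(A and B) / #(B). The function names, the sample transactions, and the definition of support as the fraction of transactions containing an itemset are assumptions made for this example; they are not taken from the slides.

```python
from typing import List, Optional, Set


def age_group(age: int) -> Optional[str]:
    """Set-grouping hierarchy from the slides: 20-39 = young, 40-59 = middle_aged."""
    if 20 <= age <= 39:
        return "young"
    if 40 <= age <= 59:
        return "middle_aged"
    return None  # the slides define no group outside 20-59


def low_profit_margin(price: float, cost: float, threshold: float = 50.0) -> bool:
    """Rule-based hierarchy: low margin when (price - cost) is below $50."""
    return (price - cost) < threshold


def count(transactions: List[Set[str]], itemset: Set[str]) -> int:
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def support(transactions: List[Set[str]], itemset: Set[str]) -> float:
    """Assumed definition: fraction of all transactions containing `itemset`."""
    return count(transactions, itemset) / len(transactions) if transactions else 0.0


def confidence(transactions: List[Set[str]], a: Set[str], b: Set[str]) -> float:
    """Confidence as written on the slide: P(A|B) = #(A and B) / #(B)."""
    n_b = count(transactions, b)
    if n_b == 0:
        return 0.0  # no transaction contains B, so the ratio is undefined; report 0
    return count(transactions, a | b) / n_b


# Hypothetical market-basket data, invented for this example.
TRANSACTIONS = [
    {"milk", "bread"},
    {"milk"},
    {"bread", "butter"},
    {"milk", "bread", "butter"},
]

if __name__ == "__main__":
    print(age_group(25), age_group(45))
    print(low_profit_margin(price=120.0, cost=80.0))
    print(support(TRANSACTIONS, {"milk", "bread"}))
    print(confidence(TRANSACTIONS, a={"bread"}, b={"milk"}))
```

In a real system, thresholds on such measures would be supplied by the user as part of the mining task so that uninteresting patterns can be filtered out.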

A Data Mining Query Language (DMQL)

- Motivation
  - A DMQL can support ad hoc and interactive data mining
  - By providing a standardized language like SQL
    - The hope is to achieve an effect similar to the one SQL has had on relational databases
    - Foundation for system development and evolution
    - Facilitates information exchange, technology transfer, commercialization, and wide acceptance
- Design
  - DMQL is designed with the primitives described earlier
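
To make the idea of a query built from the five primitives more concrete, here is a minimal Python sketch of a task specification. It is not DMQL syntax: the `MiningTask` class, its field names, the `render` layout, and the sample values are all hypothetical and chosen only to mirror the primitives (task-relevant data, knowledge type, background knowledge, interestingness thresholds, visualization).

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Knowledge types named on the slides (spelling of identifiers is an assumption).
KNOWLEDGE_TYPES = (
    "characterization", "discrimination", "association",
    "classification", "prediction", "clustering", "outlier_analysis",
)


@dataclass
class MiningTask:
    # 1. Task-relevant data
    database: str
    tables: List[str]
    condition: str
    attributes: List[str]
    group_by: List[str] = field(default_factory=list)
    # 2. Type of knowledge to be mined
    knowledge_type: str = "characterization"
    # 3. Background knowledge: hierarchy name -> levels, lowest first
    hierarchies: Dict[str, List[str]] = field(default_factory=dict)
    # 4. Pattern interestingness thresholds, e.g. {"support": 0.05}
    thresholds: Dict[str, float] = field(default_factory=dict)
    # 5. Visualization of discovered patterns
    display_as: str = "table"

    def __post_init__(self) -> None:
        if self.knowledge_type not in KNOWLEDGE_TYPES:
            raise ValueError(f"unknown knowledge type: {self.knowledge_type}")

    def render(self) -> str:
        """Plain-text summary of the task (illustrative layout, not DMQL)."""
        lines = [
            f"database: {self.database}",
            f"tables: {', '.join(self.tables)}",
            f"where: {self.condition}",
            f"attributes: {', '.join(self.attributes)}",
        ]
        if self.group_by:
            lines.append(f"group by: {', '.join(self.group_by)}")
        lines.append(f"mine: {self.knowledge_type}")
        for name, levels in self.hierarchies.items():
            lines.append(f"hierarchy {name}: {' < '.join(levels)}")
        for measure, value in self.thresholds.items():
            lines.append(f"threshold {measure}: {value}")
        lines.append(f"display as: {self.display_as}")
        return "\n".join(lines)


if __name__ == "__main__":
    task = MiningTask(
        database="sales_db",  # hypothetical database name
        tables=["customer", "purchases"],
        condition="purchases.amount > 100",
        attributes=["customer.age", "customer.address"],
        knowledge_type="association",
        hierarchies={"location": ["street", "city", "province_or_state", "country"]},
        thresholds={"support": 0.05, "confidence": 0.7},
    )
    print(task.render())
```

Keeping every primitive as an explicit field is one possible design; it lets a user interface or a query parser fill in the same structure.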

Designing Graphical User Interfaces Based on a Data Mining Query Language

What tasks should be considered in the design of GUIs based on a data mining query language?

- Data collection and data mining query composition
- Presentation of discovered patterns
- Hierarchy specification and manipulation
- Manipulation of data mining primitives
- Interactive multilevel mining
- Other miscellaneous information
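
Interactive multilevel mining and hierarchy manipulation can be pictured as re-aggregating the same records at different levels of a concept hierarchy. The sketch below rolls hypothetical sales records up along the schema hierarchy street < city < province_or_state < country from the earlier slide; the `roll_up` function and the sample data are assumptions for illustration, not part of any GUI described in the slides.

```python
from collections import Counter
from typing import Dict, List

# Schema hierarchy from the slides, lowest level first.
LEVELS = ["street", "city", "province_or_state", "country"]

# Hypothetical records: one location per sale.
SALES: List[Dict[str, str]] = [
    {"street": "Main St", "city": "Vancouver", "province_or_state": "BC", "country": "Canada"},
    {"street": "Oak St", "city": "Vancouver", "province_or_state": "BC", "country": "Canada"},
    {"street": "King St", "city": "Toronto", "province_or_state": "Ontario", "country": "Canada"},
    {"street": "Pine St", "city": "Seattle", "province_or_state": "Washington", "country": "USA"},
]


def roll_up(records: List[Dict[str, str]], level: str) -> Counter:
    """Count records grouped at `level`.

    Each group key is the path from `level` up to the top of the hierarchy,
    so two cities with the same name in different countries stay separate.
    Raises ValueError if `level` is not in LEVELS.
    """
    keys = LEVELS[LEVELS.index(level):]
    return Counter(tuple(record[k] for k in keys) for record in records)


if __name__ == "__main__":
    # Drill up: city level, then country level.
    for level in ("city", "country"):
        print(level, dict(roll_up(SALES, level)))
```

Calling `roll_up` with a lower level corresponds to drilling down, and with a higher level to drilling up.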