INFORMS Analytics Body of Knowledge

John Wiley and Sons Ltd

12/2018

400

Paperback

English

9781119483212

642

Description not available.
Preface
List of Contributors

1 Introduction to Analytics
Philip T. Keenan, Jonathan H. Owen, and Kathryn Schumacher
  1.1 Introduction
  1.2 Conceptual Framework
    1.2.1 Data-Centric Analytics
    1.2.2 Decision-Centric Analytics
    1.2.3 Combining Data- and Decision-Centric Approaches
  1.3 Categories of Analytics
    1.3.1 Descriptive Analytics
      Data Modeling
      Reporting
      Visualization
      Software
    1.3.2 Predictive Analytics
      Data Mining and Pattern Recognition
      Predictive Modeling, Simulation, and Forecasting
      Leveraging Expertise
    1.3.3 Prescriptive Analytics
  1.4 Analytics Within Organizations
    1.4.1 Projects
    1.4.2 Communicating Analytics
    1.4.3 Organizational Capability
  1.5 Ethical Implications
  1.6 The Changing World of Analytics
  1.7 Conclusion
  References

2 Getting Started with Analytics
Karl G. Kempf
  2.1 Introduction
  2.2 Five Manageable Tasks
    2.2.1 Task 1: Selecting the Target Problem
    2.2.2 Task 2: Assemble the Team
      Executive Sponsor
      Project Manager
      Domain Expert
      IT Expert
      Data Scientist
      Stakeholders
    2.2.3 Task 3: Prepare the Data
    2.2.4 Task 4: Selecting Analytics Tools
      Analytical Specificity or Breadth
      Access to Data
      Execution Performance
      Visualization Capability
      Data Scientist Skillset
      Vendor Pricing
      Team Budget
      Sharing and Collaboration
    2.2.5 Task 5: Execute
  2.3 Real Examples
    Case 1: Sensor Data and High-Velocity Analytics to Save Operating Costs
    Case 2: Social Media and High-Velocity Analytics for Quick Response to Customers
    Case 3: Sensor Data and High-Velocity Analytics to Save Maintenance Costs
    Case 4: Using Old Data and Analytics to Detect New Fraudulent Claims
    Case 5: Using Old and New Data Plus Analytics to Decrease Crime
    Case 6: Collecting the Data and Applying the Analytics Is the Business
  References
  Further Reading: Papers
  Further Reading: Books

3 The Analytics Team
Thomas H. Davenport
  3.1 Introduction
  3.2 Skills Necessary for Analytics
    3.2.1 More Advanced or Recent Analytical and Data Science Skills
    3.2.2 The Larger Team
  3.3 Managing Analytical Talent
    3.3.1 Developing Talent
    3.3.2 Working with the HR Organization
  3.4 Organizing Analytics
    3.4.1 Goals of a Particular Analytics Organization
    3.4.2 Basic Models for Organizing Analytics
    3.4.3 Coordination Approaches
      Program Management Office
      Federation
      Community
      Matrix
      Rotation
      Assigned Customers
      What Model Fits Your Business?
    3.4.4 Organizational Structures for Specific Analytics Strategies and Scenarios
    3.4.5 Analytical Leadership and the Chief Analytics Officer
  3.5 To Where Should Analytical Functions Report?
      Information Technology
      Strategy
      Shared Services
      Finance
      Marketing or Other Specific Function
      Product Development
    3.5.1 Building an Analytical Ecosystem
    3.5.2 Developing the Analytical Organization over Time
  References

4 The Data
Brian T. Downs
  4.1 Introduction
  4.2 Data Collection
    4.2.1 Data Types
    4.2.2 Data Discovery
  4.3 Data Preparation
  4.4 Data Modeling
    4.4.1 Relational Databases
    4.4.2 Nonrelational Databases
  4.5 Data Management

5 Solution Methodologies
Mary E. Helander
  5.1 Introduction
    5.1.1 What Exactly Do We Mean by "Solution," "Problem," and "Methodology?"
    5.1.2 It's All About the Problem
    5.1.3 Solutions versus Products
    5.1.4 How This Chapter Is Organized
    5.1.5 The "Descriptive-Predictive-Prescriptive" Analytics Paradigm
    5.1.6 The Goals of This Chapter
  5.2 Macro-Solution Methodologies for the Analytics Practitioner
    5.2.1 The Scientific Research Methodology
    5.2.2 The Operations Research Project Methodology
    5.2.3 The Cross-Industry Standard Process for Data Mining (CRISP-DM) Methodology
    5.2.4 Software Engineering-Related Solution Methodologies
    5.2.5 Summary of Macro-Methodologies
  5.3 Micro-Solution Methodologies for the Analytics Practitioner
    5.3.1 Micro-Solution Methodology Preliminaries
    5.3.2 Micro-Solution Methodology Description Framework
    5.3.3 Group I: Micro-Solution Methodologies for Exploration and Discovery
      Group I: Problems of Interest
      Group I: Relevant Models
      Group I: Data Considerations
      Group I: Solution Techniques
      Group I: Relationship to Macro-Methodologies
      Group I: Takeaways
    5.3.4 Group II: Micro-Solution Methodologies Using Models Where Techniques to Find Solutions Are Independent of Data
      Group II: Problems of Interest
      Group II: Relevant Models
      Group II: Data Considerations
      Group II: Solution Techniques
      Group II: Relationship to Macro-Methodologies
      Group II: Takeaways
    5.3.5 Group III: Micro-Solution Methodologies Using Models Where Techniques to Find Solutions Are Dependent on Data
      Group III: Problems of Interest
      Group III: Relevant Models
      Group III: Data Considerations
      Group III: Solution Techniques
      Group III: Relationship to Macro-Methodologies
      Group III: Takeaways
    5.3.6 Micro-Methodology Summary
  5.4 General Methodology-Related Considerations
    5.4.1 Planning an Analytics Project
    5.4.2 Software and Tool Selection
    5.4.3 Visualization
    5.4.4 Fields with Related Methodologies
  5.5 Summary and Conclusions
    5.5.1 "Ding Dong, the Scientific Method Is Dead!"
    5.5.2 "Methodology Cramps My Analytics Style"
    5.5.3 "There Is Only One Way to Solve This"
    5.5.4 Perceived Success Is More Important Than the Right Answer
  5.6 Acknowledgments
  References

6 Modeling
Gerald G. Brown
  6.1 Introduction
  6.2 When are Models Appropriate
    6.2.1 What Is the Problem with This System?
    6.2.2 Is This Problem Important?
    6.2.3 How Will This Problem Be Solved Without a New Model?
    6.2.4 What Modeling Technique Will Be Used?
    6.2.5 How Will We Know When We Have Succeeded?
      Who Are the System Operator Stakeholders?
  6.3 Types of Models
    6.3.1 Descriptive Models
    6.3.2 Predictive Models
    6.3.3 Prescriptive Models
  6.4 Models Can Also Be Characterized by Whether They Are Deterministic or Stochastic (Random)
  6.5 Counting
  6.6 Probability
  6.7 Probability Perspectives and Subject Matter Experts
  6.8 Subject Matter Experts
  6.9 Statistics
    6.9.1 A Random Sample
    6.9.2 Descriptive Statistics
    6.9.3 Parameter Estimation with a Confidence Interval
    6.9.4 Regression
  6.10 Inferential Statistics
  6.11 A Stochastic Process
  6.12 Digital Simulation
    6.12.1 Static versus Dynamic Simulations
  6.13 Mathematical Optimization
  6.14 Measurement Units
  6.15 Critical Path Method
  6.16 Portfolio Optimization Case Study Solved By a Variety of Methods
    6.16.1 Linear Program
    6.16.2 Heuristic
    6.16.3 Assessing Our Progress
    6.16.4 Relaxations and Bounds
    6.16.5 Are We Finished Yet?
  6.17 Game Theory
  6.18 Decision Theory
  6.19 Susceptible, Exposed, Infected, Recovered (SEIR) Epidemiology
  6.20 Search Theory
  6.21 Lanchester Models of Warfare
  6.22 Hughes' Salvo Model of Combat
  6.23 Single-Use Models
  6.24 The Principle of Optimality and Dynamic Programming
  6.25 Stack-Based Enumeration
    6.25.1 Data Structures
    6.25.2 Discussion
    6.25.3 Generating Permutations and Combinations
  6.26 Traveling Salesman Problem: Another Case Study in Alternate Solution Methods
  6.27 Model Documentation, Management, and Performance
    6.27.1 Model Formulation
    6.27.2 Choice of Implementation Language
    6.27.3 Supervised versus Automated Models
    6.27.4 Model Fidelity
    6.27.5 Sensitivity Analysis
    6.27.6 With Different Methods
    6.27.7 With Different Variables
    6.27.8 Stability
    6.27.9 Reliability
    6.27.10 Scalability
    6.27.11 Extensibility
  6.28 Rules for Data Use
    6.28.1 Proprietary Data
    6.28.2 Licensed Data
    6.28.3 Personally Identifiable Information
    6.28.4 Protected Critical Infrastructure Information System (PCIIMS)
    6.28.5 Institutional Review Board (IRB)
    6.28.6 Department of Defense and Department of Energy Classification
    6.28.7 Law Enforcement Data
    6.28.8 Copyright and Trademark
    6.28.9 Paraphrased and Plagiarized
    6.28.10 Displays of Model Outputs
    6.28.11 Data Integrity
    6.28.12 Multiple Data Evolutions
  6.29 Data Interpolation and Extrapolation
  6.30 Model Verification and Validation
    6.30.1 Verifying
    6.30.2 Validating
    6.30.3 Comparing Models
    6.30.4 Sample Data
    6.30.5 Data Diagnostics
    6.30.6 Data Vintage and Provenance
  6.31 Communicate with Stakeholders
    6.31.1 Training
    6.31.2 Report Writers
    6.31.3 Standard Form Model Statement
    6.31.4 Persistence and Monotonicity: Examples of Realistic Model Restrictions
    6.31.5 Model Solutions Require a Lot of Polish and Refinement Before They Can Directly Influence Policy
    6.31.6 Model Obsolescence and Model-Advised Thumb Rules
  6.32 Software
  6.33 Where to Go from Here
  6.34 Acknowledgments
  References

7 Machine Learning
Samuel H. Huddleston and Gerald G. Brown
  7.1 Introduction
  7.2 Supervised, Unsupervised, and Reinforcement Learning
  7.3 Model Development, Selection, and Deployment for Supervised Learning
    7.3.1 Goals and Guiding Principles in Machine Learning
    7.3.2 Algorithmic Modeling Overview
    7.3.3 Data Acquisition and Cleaning
    7.3.4 Feature Engineering
    7.3.5 Modeling Overview
    7.3.6 Model Fitting (Training) and Feature Selection
    7.3.7 Model (Algorithm) Selection
    7.3.8 Model Performance Assessment
    7.3.9 Model Implementation
  7.4 Model Fitting, Model Error, and the Bias-Variance Trade-Off
    7.4.1 Components of (Regression) Model Error
    7.4.2 Model Fitting: Balancing Bias and Variance
  7.5 Predictive Performance Evaluation
    7.5.1 Regression Performance Evaluation
    7.5.2 Classification Performance Evaluation
    7.5.3 Performance Evaluation for Time-Dependent Data
  7.6 An Overview of Supervised Learning Algorithms
    7.6.1 k-Nearest Neighbors (KNN)
    7.6.2 Extensions to Regression
    7.6.3 Classification and Regression Trees
    7.6.4 Time Series Forecasting
    7.6.5 Support Vector Machines
    7.6.6 Artificial Neural Networks
    7.6.7 Ensemble Methods
  7.7 Unsupervised Learning Algorithms
    7.7.1 Kernel Density Estimation
    7.7.2 Association Rule Mining
    7.7.3 Clustering Methods
    7.7.4 Principal Components Analysis (PCA)
    7.7.5 Bag-of-Words and Vector Space Models
  7.8 Conclusion
  7.9 Acknowledgments
  References

8 Deployment and Life Cycle Management
Arnie Greenland
  8.1 Introduction
  8.2 The Analytics Methodology: Understanding the Critical Steps in Deployment and Life Cycle Management
    8.2.1 CRISP-DM Phase 1: Business Understanding
    8.2.2 JTA Domain I, Task 1: Obtain or Receive Problem Statement and Usability
    8.2.3 JTA Domain I, Task 2: Identify Stakeholders
    8.2.4 JTA Domain I, Task 3: Determine if the Problem Is Amenable to an Analytics Solution
    8.2.5 JTA Domain I, Task 4: Refine the Problem Statement and Delineate Constraints
    8.2.6 JTA Domain I, Task 5: Define an Initial Set of Business Benefits
    8.2.7 JTA Domain I, Task 6: Obtain Stakeholder Agreement on the Business Statement
    8.2.8 JTA Domain II, Task 1: Reformulate the Problem Statement as an Analytics Problem
    8.2.9 JTA Domain II, Task 2: Develop a Proposed Set of Drivers and Relationships to Outputs
    8.2.10 JTA Domain II, Task 3: State the Set of Assumptions Related to the Problem
    8.2.11 JTA Domain II, Task 4: Define the Key Metrics of Success
    8.2.12 JTA Domain II, Task 5: Obtain Stakeholder Agreement
    8.2.13 CRISP-DM Phases 2 and 3: Data Understanding and Data Preparation
    8.2.14 JTA Domain III, Task 1: Identify and Prioritize Data Needs and Sources
    8.2.15 JTA Domain III, Task 2: Acquire Data
    8.2.16 JTA Domain III, Task 3: Harmonize, Rescale, Clean, and Share Data
    8.2.17 JTA Domain III, Task 4: Identify Relationships in the Data
    8.2.18 JTA Domain III, Task 5: Document and Report Finding
    8.2.19 JTA Domain III, Task 6: Refine the Business and Analytics Problem Statements
    8.2.20 CRISP-DM Phase 4: Modeling
    8.2.21 CRISP-DM Phase 5: Evaluation
    8.2.22 CRISP-DM Phase 6: Deployment
    8.2.23 Deployment of the Analytics Model (Up to Delivery)
    8.2.24 Post-deployment Activities (Domain VI: Model Life Cycle Management)
  8.3 Overarching Issues of Life Cycle Management
    8.3.1 Documentation
    8.3.2 Communication
    8.3.3 Testing
    8.3.4 Metrics

9 The Blossoming Analytics Talent Pool: An Overview of the Analytics Ecosystem
Ramesh Sharda and Pankush Kalgotra
  9.1 Introduction
  9.2 Analytics Industry Ecosystem
    9.2.1 Data Generation Infrastructure Providers
    9.2.2 Data Management Infrastructure Providers
    9.2.3 Data Warehouse Providers
    9.2.4 Middleware Providers
    9.2.5 Data Service Providers
    9.2.6 Analytics-Focused Software Developers
      Reporting/Descriptive Analytics
      Predictive Analytics
      Prescriptive Analytics
    9.2.7 Application Developers: Industry-Specific or General
    9.2.8 Analytics Industry Analysts and Influencers
    9.2.9 Academic Institutions and Certification Agencies
    9.2.10 Regulators and Policy Makers
    9.2.11 Analytics User Organizations
  9.3 Conclusions
  References

Appendix: Writing and Teaching Analytics with Cases
James J. Cochran

Index
This title is classified under the following subject(s):
analytics; INFORMS; ABOK; statistics; analytics for statistics; analytics for operations research; analytics for economics; analytics for computer science; analytics operations; analytics for industrial engineering; math and analytics; analytics for mathematics; big data; analytics and big data; data science; analytics and machine learning; machine learning; applied statistics; statistical analytics; understanding analytics; applying analytics; guide to analytics; Analytics Body of Knowledge; the Institute for Operations Research and the Management Sciences; INFORMS ABOK